Item 44135965

EGreg • 7 days ago

Wow. I used to correspond with Leslie Lamport years ago (about his Buridan's Principle papers, etc.)

Today I went to his website and discovered a lot about TLA+ and PlusCal. He still maintains it: https://lamport.azurewebsites.net/tla/peterson.html?back-lin...

I must say ... it would make total sense for a guy like that, who brought mathematics to programming and was an OG of concurrent systems, to create a systems design language that's used at AWS and other industrial places that need to serve people.

I wish more people who build distributed systems would use what he made. Proving correctness is crucial in large systems.

lopatin • 7 days ago

And just a tip for who may be intersted: Claude Opus with Extended Thinking seems to be very good at converting existing code into TLA+ specs.

I've found multiple bugs for personal Rust projects like this (A Snake game that allowed a snake to do a 180 degree turn), and have verified some small core C++ components at work with it as well (a queue that has certain properties around locking and liveness).

I tried other models but kept getting issues with syntax and spec logic with anything else besides Opus.

2 replies

dosnem • 7 days ago

I’ve always envisioned tla and other formal methods as specific to distributed systems and never needed to understand it. How is it used for a snake game? Also how is the TLA+ spec determined from the code? Won’t it implicitly model incorrect bugs as correct behaviour since it’s an existing state in the system? Also when using TLA from the start, can it be applied to implementations? Or is it only for catching bugs during design? Therefore I’m assuming implementations still need to match the design exactly or else you would still get subtle bugs? Sorry for all the questions I’ve never actually learned formal methods but have always been interested.

3 replies

lopatin • 7 days ago

Here's how it caught my Snake bug: My snake representation is a vector of key points (head, turns, tail). A snake in a straight line, of length 3, facing right can look like this: [(0,0), (2,0)]. When a Snake moves (a single function called "step_forward"), the Snake representation is compressed by my code: If the last 2 points are the same, remove the last one. So if this snake changes direction to "left", then the new snake representation would be [(1, 1), (1, 1)] and compressed to [(1, 1)] before existing out of step_forward.

Here's how the bug was caught: It should be impossible for the Snake representation to be < 2 points. So I told Opus to model the behavior of my snake, and also to write a TLA+ invariant that the snake length should never be under 2. TLA+ then basically simulates it and finds the exact sequence of steps "turns" that cause that invariant to not hold. In this case it was quite trivial, I never thought to prevent a Snake from making turns that are not 90 degrees.

Jtsummers • 7 days ago

It's targeted at distributed systems, but it can be used to model any system over time. I've used it for distributed systems, but also for embedded systems with a peculiar piece of hardware that (seemed, and we found was) to be misbehaving. I modeled the hardware and its spec in TLA+, then made changes to the behavior description to see if it broke any expected invariants (it did, in precisely the way we saw with the real hardware). The TLA+ model also helped me develop better reproducible test cases for that hardware compared to what we were doing before.

skydhash • 7 days ago

I'm not an expert but my current understanding is that code execution is always a state transition to the next state. So what you do is fully specify each state and the relation between them. How the transition actually happens is your code and it's not that important. What's important is that the relations does not conflict to each other. It's a supercharged type system.

> Also how is TLA+ spec determined from the code?

You start from the initial state, which is always known (or is fixed). Then you model the invariants for each lines.

> Won’t it implicitly model incorrect bugs as correct behaviour since it’s an existing state in the system

Invariants will conflicts with each other in this case.

> Also when using TLA from the start, can it be applied to implementations?

Yes, by fully following the specs and handling possible incorrect states that may happens in practice. If your initial state in the TLA+ specs says that it only includes natural numbers between 1 and 5, you add assertions in your implementation (or throw exceptions) that check that as the Int type in many type systems is not a full guarantee for that constraint. Even more work when using a dynamic language.

k__ • 7 days ago

It seems like the new DeepSeek performs at a similar level as Opus 4. At least to preliminary Aider benchmarks.

skydhash • 7 days ago

> Proving correctness is crucial in large systems.

It could be good in smaller, but critical and widely used utilities like SSH and terminals.

1 reply

oblio • 7 days ago

Yeah, basically all the coreutils plus all the common extras (rsync, ssh, etc) could use stuff like this.

1 reply

rthnbgrredf • 7 days ago

It should be feasible to rewrite the coreitils like ls, cd and cp in Lean 4 together with Cursor within days. Rsync and ssh are more complex though.

1 reply

oblio • 7 days ago

Your first claim is actually a very solid test for AI. We should start seeing a lot more AI powered OSS projects or at least contributions if AI truly is as good as they say. Heck, OSS should accelerate exponentially since contributions should become very easy.

belter • 7 days ago

>> Proving correctness is crucial in large systems.

You can't do that...

The model checker says the specification satisfies the properties you wrote within the finite state space you explored...

1 reply

amw-zero • 7 days ago

You can write proofs in TLA+ and many other formalisms. You don’t need to ever use a model checker. The proofs hold for an infinite number of infinite-length executions. We are definitely not limited to finite behaviors.