The author appears to have a serious misconception about Lean, which is surprising since he seems to be quite knowledgeable in the area.
Specifically, the author seems to be under the impression that Lean retains proof objects and that the final proof to be checked is one massive proof object with all definitions unfolded: "these massive terms are unnecessary, but are kept anyway" (TFA). This couldn't be further from the truth. Lean implements exactly the same optimization the author cherishes in LCF; metaphorically, "The steps of a proof would be performed but not recorded, like a mathematics lecturer using a small blackboard who rubs out earlier parts of proofs to make space for later ones" (quoted in the blog post linked from TFA). Once a `theorem` (as opposed to a `def`) has been checked in Lean4, its proof object is no longer used. This is not merely an optimization but a critical part of the language: theorems are opaque. If the proof term is retained at all (and I'm not sure it is), that is only for the sake of user observability in the interactive mode; the kernel does not and cannot care what the proof object was.
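To make the opacity concrete, here is a minimal Lean4 sketch (the particular statement is arbitrary):

```lean
-- Two theorems with the same statement but different proof scripts.
theorem t1 : 2 + 2 = 4 := rfl
theorem t2 : 2 + 2 = 4 := by decide

-- Propositions are definitionally proof-irrelevant in Lean4, so this checks:
-- nothing downstream can ever depend on *which* proof term a theorem
-- elaborated to, which is exactly what frees the kernel to throw it away.
example : t1 = t2 := rfl
```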
A proof object in dependent type theory is just the term that inhabits a type. So are you saying the Lean implementation can construct proofs without constructing such a term?
No, I'm saying it is checked and then discarded. (Or at least, discarded by the kernel; presumably it ends up somewhere in the frontend's tactic cache.) That matches the metaphor perfectly: "rubs out earlier parts of proofs to make space for later ones".
The shared misconception seems to be the belief that because _conceptually_ the theory implemented by Lean builds up a massive proof term, _operationally_ the Lean kernel must also be doing that. This does not follow. (Even the concept is not quite right, since Lean4 is not perfectly referentially transparent in the presence of quotients.)
Yeah. I guess the abstract type approach saves some memory, but it's a constant factor thing, not an asymptotic improvement. The comment that Lean wastes "tens of megabytes" seems telling: it seems like something that would be a critical advantage in the 1980s and 1990s, when Paulson first fought these battles, but maybe less important today...
To be fair, Lean wastes and leaks memory like a sieve, but almost all of that is in the frontend. It has nothing to do with the kernel or with the chosen theorem-proving approach.
It is more a conceptual thing. In LCF, proofs and terms are different things, and that is how it should be, in my opinion. This Curry-Howard confusion is unnecessary, but many people don't realise that; they think it is the only way to do math on a computer. You can still store proofs in LCF if you want, and use them; just as in Lean, you might be able to optimise them away.
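For anyone who hasn't seen the LCF idea: roughly, theorems form an abstract type whose only constructors are the inference rules, so a theorem records its conclusion and nothing about how it was derived. A toy sketch, in Lean4 syntax for consistency with the rest of the thread (real LCF kernels are ML modules; `Form`, `Thm`, and `mp` are made-up names):

```lean
-- Hypothetical toy object logic: just variables and implication.
inductive Form where
  | var : String → Form
  | imp : Form → Form → Form
deriving DecidableEq, Repr

-- In a real LCF kernel this constructor would be hidden behind a module
-- boundary, so only the trusted rules below could ever build a `Thm`.
structure Thm where
  concl : Form

-- Modus ponens: checks the side condition, then keeps only the conclusion.
-- The "proof" is the history of rule applications, and it is never stored.
def mp (ab a : Thm) : Option Thm :=
  match ab.concl with
  | .imp p q => if p = a.concl then some ⟨q⟩ else none
  | _        => none
```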
You have done no more to show an actual distinction in the approach than TFA and its linked blog post... It sounds like a naming thing to me. On one side we name the term/program as a term and see it as something checked by the kernel, and on the other you name the term/program as a program and see it as something executed by the runtime. What's the difference?
There is indeed no difference if your dependent-typing approach uses reflection (where the checked term is actually a program that is logically proven to yield a correct proof when executed, most commonly by running a decision procedure), but that's not a common approach.
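To be fair, Lean does ship one mainstream instance of that reflective style: the `decide` tactic. Its elaborated proof is essentially `of_decide_eq_true rfl`, so the kernel evaluates a decision procedure and checks only that it returned `true`, rather than checking a proof term proportional to the computation. For example:

```lean
-- The kernel checks this by running the `Decidable` instance for the
-- proposition and observing that it computes to `true`.
theorem cubes : 3 ^ 3 + 4 ^ 3 + 5 ^ 3 = 6 ^ 3 := by decide
```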
The difference is that a term is not (necessarily) a program. Also, checking is not executing. It's like saying riding a horse is the same as eating a fish. Really just a naming thing; what's the difference?
You're drawing an equivalence between the wrong pair of things. I'm not saying that term=program; I'm saying that the type checker, qua `term -> context -> decision`, bound to a particular term, is a program `context -> decision`, and the other approach is also a program `context -> decision`. I guess it's defunctionalization, not "nothing", but a next-door neighbor of "nothing".
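In types, with hypothetical stand-in definitions (none of these names are real Lean kernel API):

```lean
-- Hypothetical stand-ins for the kernel's types.
structure Term where repr : String
structure Ctx where decls : List String

-- The checker as a function; the body is a stub for the sketch.
def check (_t : Term) (_ctx : Ctx) : Bool := true

-- Partially applying the checker to one fixed term yields a `Ctx → Bool`
-- program: the same shape as the "other approach" run by a runtime.
def asProgram (t : Term) : Ctx → Bool := check t
```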
Deceitful omission from TFA: the warning lights which the firefighter admits seeing override any tower authorization. Omitting this leaves a reader with several incorrect impressions:
INCORRECT: The truck was cleared to cross the runway.
CORRECT: The truck was forbidden to cross the runway due to the active conflict lights.
INCORRECT: Airport safety systems failed.
CORRECT: The only safety system relevant to the crash is the one that would actually have alerted the truck not to enter the runway, not the one that would alert the controller after the incursion had already happened (after all, the controller realized immediately what had happened). That system was, by the firefighter's own admission, working perfectly.
INCORRECT: The fault lies mostly with the controller.
CORRECT: While the controller made a serious error that contributed significantly to the accident, the proximate cause was the truck driver's failure to obey the mandatory direction given by the fully functioning runway safety system.
Zig is just doing vtable-based effect programming. This is the way to go for far more than async, but it also needs aggressive compiler optimization to avoid actual runtime dispatch.
Well... effect programming using vtables. I think this is an emerging paradigm, but it is still very early, so it's difficult to define precisely.
My primary inspiration for the concept is theorem-proving languages like Lean, in which typeclasses ("interfaces" in OOP terminology) are implemented as structures passed down as arguments ("vtables" in OOP terminology), separately from any receivers (values of the type implementing the interface, which need not actually exist in Lean). Typeclasses (and interfaces) are an effect, albeit a simple and limited one. Lean can't express effects in their generality due to totality requirements, but the same mechanism would work perfectly well for effects too. As for the "vtable" aspect: the primary distinction of implementing typeclasses via exposed vtable passing is that the language does not in any way limit the programmer to zero or one implementation of a typeclass per receiver type(s) (cf. orphan rules in Rust, cf. compiling effect systems to witness-passing, etc.).
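A sketch of that mechanism in Lean itself, with the vtable made explicit as a plain structure argument instead of the `class`/instance machinery (`Monoid'` and `fold` are illustrative names, not library API):

```lean
-- The "interface": just a record of operations, i.e. an explicit vtable.
structure Monoid' (α : Type) where
  unit : α
  op   : α → α → α

-- Generic code takes the vtable as an ordinary argument; no receiver needed.
def fold {α} (m : Monoid' α) : List α → α
  | []      => m.unit
  | x :: xs => m.op x (fold m xs)

-- Nothing limits us to one implementation per type: both "instances" for
-- Nat coexist, and the call site picks one explicitly.
def addNat : Monoid' Nat := ⟨0, (· + ·)⟩
def mulNat : Monoid' Nat := ⟨1, (· * ·)⟩

#eval fold addNat [1, 2, 3, 4]  -- 10
#eval fold mulNat [1, 2, 3, 4]  -- 24
```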
You have to throw the context away at that point. I've experienced the same thing and I found that even when I apparently talk Claude into the better version it will silently include as many aspects of the quick fix as it thinks it can get away with.
I've been using pi.dev since December. The only significant change to the harness in that time which affects my usage is the availability of parallel tool calls. Yet Claude models have become unusable in the past month for many of the reasons observed here. Conclusion: it's not the harness.
I tend to agree about the legacy workarounds being actively harmful though. I tried out Zed agent for a while and I was SHOCKED at how bad its edit tool is compared to the search-and-replace tool in pi. I didn't find a single frontier model capable of using it reliably. By forking, it completely decouples models' thinking from their edits and then erases the evidence from their context. Agents ended up believing that a less capable subagent was making editing mistakes.
Meaning Msft Principal is below L5? I got the same feedback from one of my friends who works at Google. She said the quality of former MSFT engineers now working at Google was noticeably lower.
I mean imputed prestige within the organization. Being an L5 is nothing; it's the promote-or-fire cutoff at Google AFAIK. But being a Principal is slightly more than nothing; it's two levels above the promote-or-fire cutoff.
I mean, _now_, sure, I'd assume Microsoft Principals should be hired around L4 at Google. But that's just due to a temporary imbalance in the decline of legacy organizations. Give it a few years and it will even out, and msft 64 will sit in the middle of the L5 range like levels.fyi claims.
L5 hasn't been the promote-or-fire cutoff at Google for perhaps a decade. L4 is the new L5, mostly because Google would have to pay L5s more, and it has been terrified of personnel costs for years.
But even so, an L5 at Google is basically a nobody as far as prestige or convincing other people to adopt your plan goes. Even L6 is basically just an expert across several mostly local teams. L7 is where the prestige gets going.