The staccato framing is right. It's also describing the default posture, not the settled one.
That loop (open a session, ask a question, redirect, switch) happens before the spec discipline is tight enough to trust the result without watching. When you've actually delegated well, you're not context-switching constantly between sessions. You hand over a detailed spec, wait for the agent to finish, and evaluate. The watching gets compressed to the handoff and the return, not the whole run.
Most engineers haven't gotten there, because getting there is hard. The spec has to be good enough that the agent can hit an ambiguity without needing to ask, fill a gap without needing your correction, and still produce something worth using. That's a different kind of writing than what most engineers were doing before. It takes the kind of attention that's hard to find when you're also redirecting three other sessions.
The flow state question is the more interesting one. It doesn't disappear. It moves. The continuous-arc attention used to live at the implementation layer. Now it's available at the spec-writing layer. Most people haven't redesigned their work to find it there. The new deep-work mode runs from a high-level goal through a narrow, implementation-ready spec through a validation pass. That can be a continuous arc if you've structured the work to allow it. It just doesn't look like the old one, and the muscle memory for it doesn't exist yet.
Whether that's actually flow or just a different kind of concentration is a real question. The engineers I know (myself included) who've found something like the old rhythm have moved most of their time and effort upstream in the process. The ones still in the redirect loop are usually starting from stories that carry the same information they always did, run through an LLM to make them look more tightly defined.
All of these systems that try to solve "the memory problem" seem to fail to justify inserting either a layer of complexity with multiple moving pieces, or an outright black box. What is it that makes these systems worth the cost? What do they do that provides significantly more value than a structured directory of markdown files, a tuned grep search, and the model you are already using to synthesize the results? If you want to kick it up a notch, abstract the mechanism into a sub-agent to avoid context pollution. I have yet to find a memory system that clearly articulates how it is worth the overhead compared to the simple solution described.
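To make the "simple solution" concrete, here is a minimal sketch of the grep-over-markdown idea: a directory of notes, a regex search, and snippets (not whole files) returned for the model to synthesize. The function name `search_notes` and the demo note contents are my own illustration, not from the original comment.

```python
import re
import tempfile
from pathlib import Path

def search_notes(notes_dir: Path, query: str, context: int = 2) -> list[dict]:
    """Grep-style search over a directory of markdown notes.

    Returns candidate hits with a few lines of surrounding context,
    which is what you'd hand to the model instead of whole files.
    """
    pattern = re.compile(query, re.IGNORECASE)
    hits = []
    for path in sorted(notes_dir.rglob("*.md")):
        lines = path.read_text().splitlines()
        for i, line in enumerate(lines):
            if pattern.search(line):
                lo, hi = max(0, i - context), min(len(lines), i + context + 1)
                hits.append({
                    "file": path.name,
                    "line": i + 1,                       # 1-indexed hit line
                    "snippet": "\n".join(lines[lo:hi]),  # small context window
                })
    return hits

# Tiny demo: two notes, only one matches the query.
with tempfile.TemporaryDirectory() as d:
    notes = Path(d)
    (notes / "auth.md").write_text("# Auth\nWe rotate JWT keys weekly.\n")
    (notes / "deploy.md").write_text("# Deploy\nBlue/green via the CI job.\n")
    results = search_notes(notes, "jwt")
    print(results[0]["file"], results[0]["line"])  # → auth.md 2
```

In practice you'd swap the pure-Python scan for ripgrep and let the model (or a sub-agent, as suggested above) decide which snippets are worth expanding.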
Actually, Karpathy solves this with a RAG system and an LLM wiki, but for a consumer app that's a huge cost burden. Every time you grep or run a full search against the DB or vector store, you pay for bandwidth; as a bootstrapper I can't afford that, even with a BaaS, where they actually bill you upfront for traffic. I understand your point, but if a model needs to fully read every .md to make a point, you'll bloat the context window. Well, I'm not an ML researcher and I'm learning as well, but I don't think this is ideal for a consumer app. The fair point is that I do want something like an LLM wiki in my app, maybe once I make some $.
It should never read every full file. It should be grepping to find candidates, then reading chunks of the file around the hits to see if they are genuinely relevant to whatever you are trying to gather context for. If the chunk surrounding a grep hit appears to be relevant, then it can pull in a larger portion, or the entire file, if appropriate.
I agree with that, but this way it's still bloat. If you're coding with AI, you're aware that every time a model reads 100 lines and doesn't find what it needs to modify, you bloat the context. I use Copilot these days (until June, lol), and it has a context window meter; every time the model reads a file to make a change, I assure you the window moves from, say, 8% to 12% (on GPT's 400k tokens). That's about 16k tokens of reads for something like a 10-line change, so I know about chunking, but this is how it works every time. You can also check how Claude Code introduced tool-step deletion months ago to unbloat the context window. Thank you for the advice, Patrick :)
This, at a high level, is the right approach. Taking the concept to the next level is all in the details: tiered planning, enforced context boundaries, and feedback loops are a few key components. Applying the same engineering disciplines to your system/harness that you applied before AI was a thing is the fundamental unlock. The hard part is applying these disciplines in a simple and effective way, while avoiding unnecessary complexity.
If it becomes a tangled web of interconnected components where it's hard to follow, or to anticipate, how changes in one part will propagate, you end up in the same mess every experienced engineer has seen in their career in "that app". The legacy one that sits at the core of the system, riddled with bad neighborhoods that people are afraid to venture into. I suspect people will end up making the same basic mistakes that were made in "that app". Lots of features and functionality will be added out of unbridled enthusiasm, and will yield a lot of value quickly. At some point it will become so complex that simple changes start causing all sorts of unanticipated downstream effects, and untangling those creates new unintended ones. The classic spiral of unwinding one mess into another, which eventually leads people to conclude that the system just needs to be rewritten. Time will tell.
I too worry about the aspects that using AI is replacing in my thought process. I've built a sophisticated enough system to where agents can go out and determine the changes that need to be made for entire features and pretty much nail it out of the box. Everything is laid out in high detail during the planning phase. The implementation phase of actually writing the code is almost always unremarkable.
I have found myself going out and actually reading code less and less over the past year. I would be lying if I said there are not fairly regular moments where I question the comfort level I have reached with the system I have built. I've seen it work with such a high accuracy and success rate so many times that my instinct at this point is to not question it. I keep waiting for this to really bite me in the ass somehow, but it just keeps not happening. Sure, there have been minor issues that slipped through the cracks and caused me to backtrack, but that is nothing new. The difference is that in the previous way of working, I had painstakingly written that code and had a much more personal relationship with it; the code was the problem. Now, whenever that happens, I go back to the system and figure out why it didn't get the answer right on its own, or why it didn't surface the issue in the plan before implementation.
I've seen it happen multiple times. Engineering degrees are no different from the vast majority of degrees: if you are good at the read-and-regurgitate cycle, you can make it through. Not only can you make it through, you can do it with a very respectable GPA. Graduates come out with a large dictionary of keywords in their arsenal, but no idea how to put them into practice. Some are able to put them into practice and tie it all together. As they see practical examples of those keywords in the real world, it starts falling like dominoes, and at an accelerating rate. For others, it never goes much beyond keywords. The dominoes fall, but slowly, and they stop falling for extended periods. Not many mature engineering organizations can tolerate that sort of progression rate. Those engineers usually don't last very long at any one place, until they find a company where they can blend into the background thanks to a combination of company culture and low-complexity systems.