braebo's comments | Hacker News

No type system is as strong as TypeScript — certainly not Kotlin.

Give Scala a try :)

Packing people into tiny spaces like sardines should be illegal.

You can easily persist agent memories in a markdown file though.
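
A minimal sketch of what that can look like, assuming a Node/Bun-style runtime (the file name and entry format here are made up):

  // Append a timestamped memory entry to a markdown file that the agent
  // is instructed to read at session start. AGENT_MEMORY.md is hypothetical.
  import { appendFile } from "node:fs/promises";

  async function remember(note: string): Promise<void> {
    await appendFile("AGENT_MEMORY.md", `- ${new Date().toISOString()}: ${note}\n`);
  }

  await remember("Production deploys require manual approval.");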

And the guy in Memento had tattoos of key information. That didn’t mean he no longer had memory loss.

Pretty good metaphor.

Limited space to work with, highly context-dependent, and likely to get confused as you cover more surface area.


Yup, and the agent will happily ignore any and all markdown files, and will say "oops, it was in the memory, will not do it again", and will do it again.

Humans actually learn. And if they don't, they are fired.


To me it sounds like a tooling problem. OP seems to be trying to use probabilistic text systems as if they enforce rules, but rule enforcement should really live outside the model. My sense is that there was a failure to verify the agent's intent.

The tooling that invokes the model should really define some kind of guardrails. I feel like there's an analogy to be had here with the difference between an untyped program and a typed program. The typed program has external guardrails that get checked by an external system (the compiler's type checker).
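
To make the analogy concrete, here's a rough sketch in TypeScript; the action shape, the allowed paths, and callModel are all hypothetical stand-ins, not any real harness's API:

  // The model proposes a structured action; a deterministic validator
  // (the "type checker") sits outside the model and rejects bad actions
  // before anything runs.
  type Action = { kind: "read" | "write"; path: string };

  // Stand-in for whatever completion API the harness uses.
  async function callModel(prompt: string): Promise<string> {
    return JSON.stringify({ kind: "write", path: "src/index.ts" });
  }

  const ALLOWED = [/^src\//, /^docs\//];

  function validate(action: Action): Action {
    if (action.kind === "write" && !ALLOWED.some((re) => re.test(action.path))) {
      throw new Error(`guardrail: write outside allowed paths: ${action.path}`);
    }
    return action;
  }

  const action = validate(JSON.parse(await callModel("...")) as Action);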


What tooling? It's a probabilistic text generator that runs in a black box on the provider's server. What tooling will have which guardrails to make sure that these scattered markdown files are properly injected and used in the text generation?

That's the million-dollar question. Maybe have systems of agents that all validate each other's work? Maybe something needs to be done at the harness level? I don't suppose we could realistically expect 100% accuracy, but if we take 100% to be the upper limit, we could build systems that get us closer to that ideal.

This is faith in magic. "There's some magic way to make a probabilistic text generator running in the cloud never miss local files"

No no, that’s not what I’m saying. The fact that the data is stored in files is incidental. It could be in a database, in a knowledge graph, or derived from some other data. Regardless of where it is, something should know to include it in the context, but only when it’s relevant.

So for instance you could start by trying to classify the prompt in some way. If you use an LLM for this, you might need to get it to return a machine-parsable data format. Then your harness can pattern match on the classification and use it to enrich the prompt with additional context. The challenge would be in determining how exactly you want to go about this, balancing tradeoffs such as accuracy, cost, time, etc.

For the classification step you might begin with something like "Determine whether the following prompt is a QUESTION or a STATEMENT. Respond using only one of the two words. Prompt: $PROMPT"

You could have multiple back-and-forths like this; at each round you gain more information about the prompt, and you can use that information to determine further classifications and/or context to include.
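
As a rough sketch of one round (the label set, the memory file, and callModel are all made up for illustration):

  import { readFile } from "node:fs/promises";

  // Stand-in for any completion API.
  async function callModel(prompt: string): Promise<string> {
    return "QUESTION";
  }

  // Classify the prompt, then pattern match on the label to decide what
  // extra context to inject before the real completion call.
  async function enrich(prompt: string): Promise<string> {
    const label = (await callModel(
      "Determine whether the following prompt is a QUESTION or a STATEMENT. " +
      "Respond using only one of the two words. Prompt: " + prompt
    )).trim();

    if (label === "QUESTION") {
      const memory = await readFile("AGENT_MEMORY.md", "utf8");
      return memory + "\n\n" + prompt;
    }
    return prompt; // unrecognized label: fall back to the raw prompt
  }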


> Regardless of where it is, something should know to include it in the context,

Magic. You're talking about magic. You keep reiterating the same faith that "There's some magic way to make a probabilistic text generator running in the cloud never miss local files", where "files" means "files, knowledge graphs, databases, etc.".

It doesn't matter how the data is stored. You can't know when to include something relevant in the context, because the whole thing, context included, is running in the cloud. You are not in the driver's seat. Literally anything you include locally in the prompt can and will be ignored.


I’m not following. If I run an agent on Ollama locally, it’s not in the cloud. I don’t see what the cloud has to do with the argument.

As to your other point, that anything you include in the prompt can and will be ignored: yes, I agree. You could draw an analogy to how a teacher assigns an in-class reading assignment and follows it up with a reading comprehension quiz. If your mind wanders during the reading, you may find that you fail the quiz, because “anything you include in the prompt can and will be ignored”. The quiz result therefore serves as an evaluation.
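
Sketching the quiz idea in code (everything here, including callModel, is hypothetical):

  // After injecting the memory file, quiz the model on a fact from it; a
  // wrong answer suggests the context was ignored, and the harness can
  // retry or escalate rather than trust the session.
  async function callModel(prompt: string): Promise<string> {
    return "approval"; // stand-in for any completion API
  }

  async function contextWasUsed(memory: string, q: string, expected: string) {
    const answer = await callModel(memory + "\n\nAnswer in one word: " + q);
    return answer.trim().toLowerCase().includes(expected.toLowerCase());
  }

  const ok = await contextWasUsed(
    "- Deploys require manual approval.",
    "What do deploys require?",
    "approval"
  );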


Which it will start ignoring after two or three messages in the session.

And you'll blow the context over time and send the LLM to the sanatorium. It can't fit everything the way the human brain can.

If a junior fucks up production, that will carry extraordinary weight: they appreciate the severity, feel the social shame, and will have nightmares about it. If you write some negative prompt to "not destroy production", then you also need to define some sort of watertight memory-weighting system that doesn't exist, and specify it in great detail. Otherwise the LLM will treat that command as only as important as the last negative prompt you typed in, or ignore it when it conflicts with a more recent command.


> And you'll blow the context over time and send the LLM to the sanatorium. It can't fit everything the way the human brain can.

The LLM did have this capability at training time, but weights are frozen at inference time. This is a big weakness in current transformer architectures.


That's not learning.

Which open model has the same performance as Opus 4.7?


They don't have to be at parity today.

If the frontier models reach a point of barely any noticeable improvement, the trade-off changes.

You do not need a perfect substitute if you are getting it for free...

People will factor in future expectations about the development of open source vs frontier models. Why do you think OAI and Anthropic are pushing hard on marketing? It's for this reason. They want to get contractual commitments that firms have to honour whilst open source closes the gap.


The person they were responding to said "Open models have the same performance on coding tasks now." AFAIK this is bullshit, but I'd love to be corrected if I'm wrong.


Claude Code Desktop is as close as I can see them getting, as it seems the big bet is that the IDE is on its way out as models improve.


That’s the LSP, not the runtime. Bun runs TypeScript very fast. It’s a fantastic language and ecosystem.


I’ve just checked FFI in Bun and it’s marked as experimental. There are great libraries in the C/C++ world, and FFI is kinda table stakes for using them.
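
For reference, here's roughly what bun:ffi usage looks like; the library name and the add symbol are hypothetical:

  // Load a native library and call a C function through Bun's FFI.
  import { dlopen, FFIType, suffix } from "bun:ffi";

  const lib = dlopen(`libfastmath.${suffix}`, {
    add: { args: [FFIType.i32, FFIType.i32], returns: FFIType.i32 },
  });

  console.log(lib.symbols.add(2, 3)); // 5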


Nowhere did I say "runtime".

Even with Bun, it's because of Zig, not TypeScript, and that only proves my point even more.


You're right, we should just not use any interpreted/scripting languages because they're not as fast as compiled ones.

Why does a CLI tool that just wraps APIs need this native performance?


Svelte for eliminating countless categories of complexity introduced by React.


We could use LLMs to scan extension source code and list all of the behavior not disclosed on the extension's page, such as adware or geolocation tracking. Then another LLM, running locally, could disable it and warn you with a message explaining the situation.
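
A rough sketch of the scanning half, assuming a local model served by Ollama (the file path, model name, and prompt are made up):

  // Ask a local model to flag undisclosed behaviors in extension source,
  // via Ollama's /api/generate endpoint.
  import { readFile } from "node:fs/promises";

  const source = await readFile("extension/background.js", "utf8");

  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    body: JSON.stringify({
      model: "llama3",
      stream: false,
      prompt:
        "List any behaviors in this extension source that look like " +
        "tracking, adware, or undisclosed network calls:\n\n" + source,
    }),
  });

  const { response } = await res.json();
  console.log(response);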


Same, but it’s certainly too slow.


Avoiding this with Opus has been trivial in my experience.


Even with Sonnet or "lower" models like Kimi it's trivial. The only thing I still find with AI-generated code is some degree of overengineering.

