More

civvv · 2026-04-26T18:19:56 1777227596

This is likely true. I think model quality has stagnated and that its likely a non-trivial task to find a new improvement vector. Scaling the width of the model (which has been the driving force behind the speed of improvement thus far) seems to have reached its limit.

It will be interesting to see the implications of this. Tooling can only do so much in the long term.

mxwsn · 2026-04-26T18:26:35 1777227995

How do you know that width scaling has been the driving force of improvement?

civvv · 2026-04-27T10:50:58 1777287058

I am no insider and have never even tried to build an LLM, so I can only guess. But the general sentiment seems to be that this is the case. If you are interested, I would recommend you read the MIT paper "Superposition Yields Robust Neural Scaling" [0]. It confirms an interesting trend: models represent more features/concepts than they have clean independent dimensions, so features overlap. Increasing model dimension reduces this geometric interference, which lowers loss in a predictable way, but with diminishing returns.

This has, in my opinion, likely been the primary vector in getting better models thus far, but MIT mathematically proves that it yields diminishing returns for each new dimension added. It will get more and more expensive and the cost-return will or probably already has made it infeasible.

Ilya appear to support sentiment this as well. [1]

[0] - https://openreview.net/forum?id=knPz7gtjPW [1] - https://www.businessinsider.com/openai-cofounder-ilya-sutske...

waterTanuki · 2026-04-26T23:19:10 1777245550

I mean, it's not exactly a PhD level question. One can infer from the extreme demand of GPUs and DRAM + new data center construction that all the providers are banking on width.

svnt · 2026-04-27T03:54:15 1777262055

No? That could just be fomo, actual adoption, or a number of other things.

civvv · 2026-04-21T22:03:56 1776809036

Do you understand how LLM's work and that they are always behind in their knowledge? Unless Claude does a network call to check its own website, it will give you outdated information. Its a prediction model, its not magic.

civvv · 2026-04-21T21:53:13 1776808393

If you scroll down, you can clearly see that the Pro plan has an "x" on Claude Code now.

civvv · 2026-04-21T21:50:07 1776808207

Does this mean that for enterprises using per-seat pricing, only the $100 premium seat gets access to claude code?

vict7 · 2026-04-21T22:18:25 1776809905

Team plan shows “Claude code” in a main bullet point still. Which would indicate it is part of the team plan regardless if it has premium seats or not.

But it seems this is all in a state of flux.

And there’s the lovely asterisk at the bottom:

> Prices and plans are subject to change at Anthropic's discretion.

conception · 2026-04-22T01:20:12 1776820812

Enterprise doesn’t have premium. Just api usage.

Business accounts are like max 6x accounts.

civvv · 2026-04-19T22:58:46 1776639526

You’re generalizing too much here. One of the biggest problems with LLM’s today is in-fact that they are not at the level being advertised. This is not solely a case of regulation standing in the way of a «revolution».

civvv · 2026-04-18T09:24:37 1776504277

Literally a skill issue?

delbronski · 2026-04-18T14:51:16 1776523876

Yes. A skill I have not mastered in 20 years. And I’ve yet to meet a person who has. If you are out here writing perfectly looking code 100% of the time that everyone else including you can perfectly understand a year later, then hats off to you.

But in my long career even the smartest most experienced software engineers I’ve met m write their share of crazy abstractions from hell.

civvv · 2026-03-25T13:24:38 1774445078

This one was pretty fun. Had zero expectations, but left pleasantly surprised.

https://opper.ai/ai-roundtable/questions/94e19d86-cc0

civvv · 2026-03-25T11:01:29 1774436489

Fun little toy, tried to ask it some post-modern philosophy questions and they all mostly agreed with the statements of the philosopher, until the debate where Opus 4.6 managed to change their opinion to a resounding "maybe", pretty much every single time. It seems like the "better" frontier models often take a more grounded stance from the beginning, and even manage to influence the other models.

Here is an example: https://opper.ai/ai-roundtable/questions/79e6cdd4-515

Another fun debate: https://opper.ai/ai-roundtable/questions/81ee56e9-60f

felix089 · 2026-03-25T11:16:52 1774437412

Yea Opus 4.6 is the one that changes opinions the most from what I've seen. Also the maybes or the are you 100% certain framings trigger most models to default to maybe / no. https://opper.ai/ai-roundtable/questions/can-you-be-100-cert... - Or as Shane puts it, Nobody's saying he IS a lizard. They're saying the universe doesn't hand out 100% certificates.

civvv · 2026-03-25T08:14:16 1774426456

That entirely depends on what you are buying. If you’re in need of a lawyer to keep you out of the bottom bunk, I’d happily spend a lot more for a little better.

civvv · 2026-03-17T19:36:22 1773776182

Go touch some grass, please