Hi Mike! Long time - super nice to see your name in my HN feed.
I’ll fight you on profit. The major labs are super profitable. If you replace “profitable today” with “cashflow positive today” then I think you’re correct - they are clearly not cashflow positive today. However, they are absolutely profitable, and confusing the two is dangerous.
Consider a series of companies, let’s call these companies “Claude 1, Inc”, “Claude 2, Inc”, “Claude 3, Inc”, “Claude 4, Inc”.
In each company let’s keep track of the following:
* The pro-rata hardware and energy costs the company used during training. So, for instance, if a cluster is going to “last” 5 years, we used it for 2, and the cluster cost $1 billion to build, provision, and pay for 5 years of energy usage, we would charge $400mm ($200mm per year for two years).
* The R&D expenses like salary and so on
* The inference costs of every use of that company’s model.
* The revenue acquired in exchange for use of that model.
I propose, first, that I haven’t hidden any costs or double counted any revenue - this is a full, fair accounting of the costs and likewise of the revenue earned. Second, if you go to the end of a company’s final period, “profitability” equals “cashflow”, so we can talk about either without talking past each other. Third, if you add up all the costs and expenses of Claude 1 through 4, Inc., you’d have the full P&L of Anthropic, up to any training done on Claude 5.
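As a sanity check on this framing, the per-model accounting can be sketched in a few lines. Every dollar figure below is invented for illustration (only the $1B/5-year cluster example comes from above); these are not actual Anthropic numbers.

```python
# Fully loaded P&L for one hypothetical "Claude N, Inc" model-company.
# All figures are illustrative placeholders.

def pro_rata_training_cost(cluster_cost, cluster_life_years, years_used):
    """Charge the model its share of the cluster's hardware + energy cost."""
    return cluster_cost * years_used / cluster_life_years

def model_profit(training_cost, rnd_cost, inference_cost, revenue):
    """Profit = revenue minus all fully loaded costs."""
    return revenue - (training_cost + rnd_cost + inference_cost)

# The $1B cluster example: 5-year life, used for 2 years of training.
training = pro_rata_training_cost(1_000_000_000, 5, 2)
print(training)  # 400000000.0, i.e. $400mm

# Invented R&D, inference-cost, and revenue figures for one model's life.
print(model_profit(training, 150e6, 250e6, 1200e6))  # 400000000.0
```

The point of the exercise is only that nothing is hidden: training is amortized pro-rata, and R&D, inference, and revenue are all attributed to the one model.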
I will now repeat a statement made publicly and repeatedly by Dario (and Sam in a slightly more cagey way): every single one of those “companies” (fully loaded models) has turned a profit so far. Put another way, it has, repeatedly, been a very good financial decision to train a model, and then sell inference of that model.
Why are the frontier companies spending cash? Simple - as each new model comes out, it’s quickly apparent that the new model will pay, and so increased training costs are incurred before that model has ended its useful life. Due to scaling activity, each new run costs some multiple of the prior run. Combining the overlap and the scale up, these companies are cashflow negative. But they aren’t doing it in some weird race to spend a dollar to make $0.50. They’re spending a dollar to make like $6 a year for a year or two.
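To make the overlap-plus-scale-up dynamic concrete, here’s a toy cash-flow timeline. All numbers are invented: assume each model pays back 2x its training cost over a 2-year life, each new run costs 3x the last, and the next run starts before the previous model stops earning.

```python
# Toy cash-flow model: every individual model is profitable (2x payback),
# yet the company is cash-flow negative for years while scaling.
train_cost = [1, 3, 9, 27]     # training spend per model (arbitrary units, 3x scale-up)
payback_multiple = 2.0         # lifetime revenue / training cost
life = 2                       # years a model earns after training

years = len(train_cost) + life
cash_by_year = [0.0] * years
for i, cost in enumerate(train_cost):
    cash_by_year[i] -= cost                    # training spend in year i
    for y in range(i + 1, i + 1 + life):       # revenue over the next `life` years
        cash_by_year[y] += cost * payback_multiple / life

for y, c in enumerate(cash_by_year):
    print(f"year {y}: net cash {c:+.1f}")
# Years 0-3 are all negative; the cumulative total is strongly positive.
```

Under these assumptions the company burns cash in every year it trains a bigger model, even though every single training run is a good investment on its own.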
Seen this way, most of the ‘bubble’ (and implied massive crash) forecasts don’t have much basis in reality from my perspective.
Frontier lab models are fucking great earners: 60%+ inference margins. (Public statements by said CEOs. Lateral proof: similar-sized open models are available for inference at 1/8 to 1/10 the price on OpenRouter; ergo, closed-model margins are high.) These earnings are real dollars, hard cash. Maybe the datacenters are in a bubble? After all, there’s a lot of debt getting laid on to do datacenter buildouts.
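The OpenRouter comparison implies a bound on closed-model margins. The prices below are hypothetical placeholders, not real quotes; only the 1/8 ratio comes from the argument above.

```python
# If a similar-sized open model serves profitably at 1/8 the closed price,
# then serving cost is at most 1/8 of the closed price.
closed_price = 8.0   # $/M tokens, hypothetical closed-model price
open_price = 1.0     # $/M tokens, comparable open model at 1/8 the price
implied_margin = 1 - open_price / closed_price
print(f"{implied_margin:.1%}")  # 87.5%
```

Since the open host presumably has its own markup, true serving cost sits below `open_price`, making this a conservative lower bound - comfortably consistent with the 60%+ figure.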
Datacenter companies and hyperscalers are making money providing hosting to these frontier labs. CoreWeave (former ETH miner!) and others are posting 70% profit margins against debt costs under 8%. These profits are again in hard dollars from the labs. So, maybe the hardware providers are in a bubble?
Nvidia is making 70%+ margins, consistently beating every earnings call, and is spending like $6bn a quarter on R&D against $40+bn in share buybacks (paid in cash). They are moving super fast, and they could spend another 7x their current R&D before going cashflow negative. So, maybe the foundries are in a bubble?
TSMC is showing 66% margins (a record high), and cutting Apple’s allocation to the point where there are research warnings about it. Maybe the EUV lithography companies are in a bubble?
ASML is practically the most generous company in this chain, showing only a 34% operating margin this year while providing the only machines that can make the chips TSMC and the others are selling.
This is all very real. To my eyes the possible negative financial outcomes that seem plausible are:
1 - scaling laws stop working (and/or models get ‘good enough’) and all of a sudden the new hotness we just spent our entire last 5 years’ revenue on isn’t any better.
2 - There’s some major exogenous shift in demand for tokens and datacenter utilization drops radically, leading to credit defaults.
For either of these to bite, they’d have to be industry-wide before they were a problem, and demand would have to fall to less than 1/6 or so of current forecasts before they caused a cascading financial problem. Until then we’d just see CoreWeave breaking even, reworking its debt covenants, spending less on unused power, and paying less for the power it does use (overcapacity = lower prices), etc.
This is SUPER long already, but to close, I think it’s reasonable and interesting to talk about those scenarios - how likely is it that scaling stops working or that people are okay with what we’ve got (that is, token value stops increasing in a compute-unitized environment)? How likely is it that people stop buying tokens at all even if their utility is stable or growing?
Agreed we’re in a temporary transitional phase right now, but I think it’s a transition to a radically new business model and economic order more than a prelude to a giant debt-leveraged crash, Wile E. Coyote style.
I know the argument that each individual model is profitable, as made by Anthropic, but I have a hard time believing it. It isn't congruous with what we actually see: losses seem to be driven by systematic underpricing.
An obvious example of this is Sora, recently killed because it generated $2.1M in lifetime revenue and reportedly cost anywhere between $1M and $15M per day in compute alone - and that's excluding training costs. It's quite hard to find any historical example of a product being subsidized to that degree. So when people say things like "inference is profitable" it's clear that they're handwaving away some details, because that was one case where inference was not only unprofitable but comically so. Maybe they mean it's profitable for specific workloads or for specific companies, but it's more likely that the argument is too general to encompass such details.
OK but what about pure text inference?
We know that workload is hopelessly unprofitable too. Anthropic just told us a few days ago: "When we launched Max a year ago it didn't include Claude Code, Cowork didn't exist, and agents that run for hours weren't a thing. Max was designed for heavy chat usage, that's it".
Claude Code already existed at that time, they just didn't anticipate people using it with Max, apparently? Odd decision. But "heavy chat usage" apparently costs at least $200/month, and that's assuming the Max plan wasn't already a loss leader at launch.
If we go back a year ago, we can find people making the same claim that inference is profitable. But now Anthropic are openly saying they mispriced a plan that costs hundreds of dollars a month because for coding workflows, it's too cheap. We knew this already, people had been pointing out the huge API/subscription price discrepancy for a long time, but it always led to these same debates about profitable inference.
So what kind of workload are people talking about when they say "inference" is profitable? It's not consumer video. It's not the subscription plans. Do they mean pure LLM API serving? If so and API tokens are profitable by some metric, so what? It counts for nothing in a bankruptcy court - spending all your profit from one SKU on subsidizing another isn't a justification for voiding your debts.
But there's another issue with this narrative that individual models are profitable. If true, it'd mean the losses these labs make in any given year are driven entirely by the cost of training the next model. That implies training costs are scaling so fast that they not only wipe out an otherwise great business but drive it deeply into the red - and that this problem has got massively worse over time. Training costs probably have gone up a lot, but have they really scaled superlinearly? The last I heard, RLVR now consumes about the same compute budget as pretraining, but that would only 2x costs, whereas to make the "each model is super profitable" claim work, training would have to be far more than 2x more expensive than before. And if true, how come the frontier models appear to be only a small way ahead of models trained by heavily compute-starved Chinese labs operating on a fraction of the budget? Where is all that money going?
I think Sora backs my point, actually - it didn’t pay and they killed it. As OpenAI aims at becoming the Facebook of AI (a working model with a permanent free tier), they are trying stuff out, and ultimately cutting products that won’t pay. The numbers are big, but so are the customer base and budgets - $15mm/day in costs is less than $0.02/day against 900m weekly active users. So I see a search for product and differentiation there, not economically weird behavior.
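That per-user arithmetic is easy to check, using the figures quoted upthread (which are themselves press-reported estimates, not confirmed numbers):

```python
# Sora's reported compute burn, spread over OpenAI's reported user base.
daily_compute_cost = 15_000_000       # $/day, top of the range quoted upthread
weekly_active_users = 900_000_000     # OpenAI's reported WAU figure
cost_per_user_per_day = daily_compute_cost / weekly_active_users
print(f"${cost_per_user_per_day:.3f} per user per day")  # $0.017
```

Note the denominator is OpenAI's whole user base, not Sora's users specifically - which is exactly the "loss leader spread across a huge platform" framing.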
Anthropic has been 10x-ing revenue yearly for a while now; the recent outages and cutbacks are a sign they bet too early that growth would slow - there’s more paying demand than they can fulfill right now, and they are scrambling to buy datacenter capacity at premium rates.
I haven’t done the math recently, but I bet they’ve been spending right in the neighborhood of what you estimate: 2-3x the last model’s revenue per leap. If you think you’ll get 10x the usage, and make $6 on your $3, that feels good.
As to the open models from China - a few things. It’s politically advantageous to state you don’t have top-tier Nvidia hardware, so there’s an incentive to understate compute. Also, you’ll note these open-weights models bench well but never quite seem to keep up, and run on a time lag; the popular explanation is data exfiltration from frontier labs, along with limited budgets inspiring efficiency-oriented innovation.
Anyway - I don’t see signs that either company is cash strapped. I see signs they are racing to take market share with a product that is clearly like 1/6 to 1/8 the price to serve.