Hi Mike! Long time - super nice to see your name in my HN feed.
I’ll fight you on profit. The major labs are super profitable. If you replace “profitable today” with “cashflow positive today” then I think you’re correct - they are clearly not cashflow positive today. However, they are absolutely profitable, and confusing the two is dangerous.
Consider a series of companies, let’s call these companies “Claude 1, Inc”, “Claude 2, Inc”, “Claude 3, Inc”, “Claude 4, Inc”.
In each company let’s keep track of the following:
* The pro-rata hardware and energy costs the company used during training. So, for instance, if a cluster is going to “last” 5 years, we used it for 2, and the cluster cost $1 billion to build, provision, and pay for 5 years of energy usage, we would charge $400mm ($200mm per year for two years).
* The R&D expenses like salary and so on
* The inference costs of every use of that company’s model.
* The revenue acquired in exchange for use of that model.
I propose, first, that I haven’t hidden any costs or double counted any revenue - this is a full, fair accounting of the costs and likewise of the revenue earned. Second, if you go to the end of a company’s final period, “profitability” equals “cashflow”, so we can talk about either without talking past each other. Third, if you add up all the costs and expenses of Claude 1 through 4, Inc., you’d have the full P&L of Anthropic, up to any training done on Claude 5.
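As a sanity check on this framing, the per-model accounting can be sketched in a few lines. Every dollar figure below is invented for illustration (only the $1B/5-year cluster example comes from above); these are not actual Anthropic numbers.

```python
# Fully loaded P&L for one hypothetical "Claude N, Inc" model-company.
# All figures are illustrative placeholders.

def pro_rata_training_cost(cluster_cost, cluster_life_years, years_used):
    """Charge the model its share of the cluster's hardware + energy cost."""
    return cluster_cost * years_used / cluster_life_years

def model_profit(training_cost, rnd_cost, inference_cost, revenue):
    """Profit = revenue minus all fully loaded costs."""
    return revenue - (training_cost + rnd_cost + inference_cost)

# The $1B cluster example: 5-year life, used for 2 years of training.
training = pro_rata_training_cost(1_000_000_000, 5, 2)
print(training)  # 400000000.0, i.e. $400mm

# Invented R&D, inference-cost, and revenue figures for one model's life.
print(model_profit(training, 150e6, 250e6, 1200e6))  # 400000000.0
```

The point of the exercise is only that nothing is hidden: training is amortized pro-rata, and R&D, inference, and revenue are all attributed to the one model.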
I will now repeat a statement made publicly and repeatedly by Dario (and Sam in a slightly more cagey way): every single one of those “companies” (fully loaded models) has turned a profit so far. Put another way, it has, repeatedly, been a very good financial decision to train a model, and then sell inference of that model.
Why are the frontier companies spending cash? Simple - as each new model comes out, it’s quickly apparent that the new model will pay, and so increased training costs are incurred before that model has ended its useful life. Due to scaling activity, each new run costs some multiple of the prior run. Combining the overlap and the scale up, these companies are cashflow negative. But they aren’t doing it in some weird race to spend a dollar to make $0.50. They’re spending a dollar to make like $6 a year for a year or two.
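To make the overlap-plus-scale-up dynamic concrete, here’s a toy cash-flow timeline. All numbers are invented: assume each model pays back 2x its training cost over a 2-year life, each new run costs 3x the last, and the next run starts before the previous model stops earning.

```python
# Toy cash-flow model: every individual model is profitable (2x payback),
# yet the company is cash-flow negative for years while scaling.
train_cost = [1, 3, 9, 27]     # training spend per model (arbitrary units, 3x scale-up)
payback_multiple = 2.0         # lifetime revenue / training cost
life = 2                       # years a model earns after training

years = len(train_cost) + life
cash_by_year = [0.0] * years
for i, cost in enumerate(train_cost):
    cash_by_year[i] -= cost                    # training spend in year i
    for y in range(i + 1, i + 1 + life):       # revenue over the next `life` years
        cash_by_year[y] += cost * payback_multiple / life

for y, c in enumerate(cash_by_year):
    print(f"year {y}: net cash {c:+.1f}")
# Years 0-3 are all negative; the cumulative total is strongly positive.
```

Under these assumptions the company burns cash in every year it trains a bigger model, even though every single training run is a good investment on its own.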
Seen this way, most of the ‘bubble’ (and implied massive crash) forecasts don’t have much basis in reality from my perspective.
Frontier lab models are fucking great earners: 60%+ inference margins. (Public statements by said CEOs. Lateral proof: similar-sized open models are available for inference at 1/8 to 1/10 the price on OpenRouter; ergo, closed-model margins are high.) These earnings are real dollars, hard cash. Maybe the datacenters are in a bubble? After all, there’s a lot of debt getting laid on to do datacenter buildouts.
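The OpenRouter comparison implies a bound on closed-model margins. The prices below are hypothetical placeholders, not real quotes; only the 1/8 ratio comes from the argument above.

```python
# If a similar-sized open model serves profitably at 1/8 the closed price,
# then serving cost is at most 1/8 of the closed price.
closed_price = 8.0   # $/M tokens, hypothetical closed-model price
open_price = 1.0     # $/M tokens, comparable open model at 1/8 the price
implied_margin = 1 - open_price / closed_price
print(f"{implied_margin:.1%}")  # 87.5%
```

Since the open host presumably has its own markup, true serving cost sits below `open_price`, making this a conservative lower bound - comfortably consistent with the 60%+ figure.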
Datacenter companies and hyperscalers are making money providing hosting to these frontier labs. CoreWeave (former ETH miner!) and others are posting 70% profit margins against debt costs under 8%. These profits are again in hard dollars from the labs. So, maybe the hardware providers are in a bubble?
Nvidia is making 70%+ margins, consistently beating every earnings call, and is spending like $6bn a quarter on R&D against $40+bn in share buybacks (paid in cash). They are moving super fast, and they could spend another 7x their current R&D before going cashflow negative. So, maybe the foundries are in a bubble?
TSMC is showing 66% margins (a record high), and cutting Apple’s allocation to the point where there are research warnings about it. Maybe the EUV lithography companies are in a bubble?
ASML is practically the most generous company in this chain, showing only a 34% operating margin this year while providing the only machines that can make the chips TSMC and the others are selling.
This is all very real. To my eyes the possible negative financial outcomes that seem plausible are:
1 - scaling laws stop working (and/or models get ‘good enough’) and all of a sudden the new hotness we just spent our entire last 5 years’ revenue on isn’t any better.
2 - There’s some major exogenous shift in demand for tokens and datacenter utilization drops radically, leading to credit defaults.
For either of these to bite, they’d have to be industry-wide before they were a problem, and demand would have to fall to less than 1/6 or so of current forecasts before they caused a cascading financial problem. Until then we’d just see CoreWeave breaking even, reworking its debt covenants, spending less on unused power, and paying less for the power it does use (overcapacity = lower prices), etc.
This is SUPER long already, but to close, I think it’s reasonable and interesting to talk about those scenarios - how likely is it that scaling stops working or that people are okay with what we’ve got (that is, token value stops increasing in a compute-unitized environment)? How likely is it that people stop buying tokens at all even if their utility is stable or growing?
Agreed we’re in a temporary transitional phase right now, but I think it’s a transition to a radically new business model and economic order more than a prelude to a giant debt-leveraged crash, Wile E. Coyote style.
I know the argument that each individual model is profitable, as made by Anthropic, but I have a hard time believing it. It isn't congruous with what we actually see: losses seem to be driven by systematic underpricing.
An obvious example of this is Sora, recently killed because it generated $2.1M in lifetime revenue and reportedly cost anywhere between $1M and $15M per day in compute alone - and that's excluding training costs. It's quite hard to find any historical example of a product being subsidized to that degree. So when people say things like "inference is profitable" it's clear that they're handwaving away some details, because that was one case where inference was not only unprofitable but comically so. Maybe they mean it's profitable for specific workloads or for specific companies, but it's more likely that the argument is too general to encompass such details.
OK but what about pure text inference?
We know that workload is hopelessly unprofitable too. Anthropic just told us a few days ago: "When we launched Max a year ago it didn't include Claude Code, Cowork didn't exist, and agents that run for hours weren't a thing. Max was designed for heavy chat usage, that's it".
Claude Code already existed at that time, they just didn't anticipate people using it with Max, apparently? Odd decision. But "heavy chat usage" apparently costs at least $200/month, and that's assuming the Max plan wasn't already a loss leader at launch.
If we go back a year ago, we can find people making the same claim that inference is profitable. But now Anthropic are openly saying they mispriced a plan that costs hundreds of dollars a month because for coding workflows, it's too cheap. We knew this already, people had been pointing out the huge API/subscription price discrepancy for a long time, but it always led to these same debates about profitable inference.
So what kind of workload are people talking about when they say "inference" is profitable? It's not consumer video. It's not the subscription plans. Do they mean pure LLM API serving? If so and API tokens are profitable by some metric, so what? It counts for nothing in a bankruptcy court - spending all your profit from one SKU on subsidizing another isn't a justification for voiding your debts.
But there's another issue with this narrative that individual models are profitable. If true, it'd mean the losses these labs make in any given year are driven entirely by the cost of training the next model. That implies training costs are scaling so fast that they not only wipe out an otherwise great business but drive it deeply into the red - and that this problem has got massively worse over time. Training costs probably have gone up a lot, but have they really scaled superlinearly? The last I heard, RLVR now consumes about the same compute budget as pretraining, but that would only 2x costs, whereas to make the "each model is super profitable" claim work, training would have to be far more than 2x more expensive than before. And if true, how come the frontier models appear to be only a small way ahead of models trained by heavily compute-starved Chinese labs operating on a fraction of the budget? Where is all that money going?
I think Sora backs my point, actually - it didn’t pay and they killed it. As OpenAI aims at becoming the Facebook of AI (a working model with a permanent free tier), they are trying stuff out, and ultimately cutting products that won’t pay. The numbers are big, but so are the customer base and budgets - $15mm/day in costs is less than $0.02/day against 900m weekly active users. So I see a search for product and differentiation there, not economically weird behavior.
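That per-user arithmetic is easy to check, using the figures quoted upthread (which are themselves press-reported estimates, not confirmed numbers):

```python
# Sora's reported compute burn, spread over OpenAI's reported user base.
daily_compute_cost = 15_000_000       # $/day, top of the range quoted upthread
weekly_active_users = 900_000_000     # OpenAI's reported WAU figure
cost_per_user_per_day = daily_compute_cost / weekly_active_users
print(f"${cost_per_user_per_day:.3f} per user per day")  # $0.017
```

Note the denominator is OpenAI's whole user base, not Sora's users specifically - which is exactly the "loss leader spread across a huge platform" framing.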
Anthropic has been 10x-ing revenue yearly for a while now; the recent outages and cutbacks are a sign they bet too early that growth would slow - there’s more paying demand than they can fulfill right now, and they are scrambling to buy datacenter capacity at premium rates.
I haven’t done the math recently, but I bet they’ve been spending right in the neighborhood of what you estimate: 2-3x the last model’s revenue per leap. If you think you’ll get 10x the usage, and make $6 on your $3, that feels good.
As to the open models from China - a few things. It’s politically advantageous to state you don’t have top-tier Nvidia hardware, so there’s an incentive to understate compute. Also, you’ll note these open-weights models bench well but never quite seem to keep up, and run on a time lag; the popular explanation is data exfiltration from frontier labs, along with limited budgets inspiring efficiency-oriented innovation.
Anyway - I don’t see signs that either company is cash strapped. I see signs they are racing to take market share with a product that is clearly like 1/6 to 1/8 the price to serve.