More

QuadrupleA · 2026-06-25T13:52:34 1782395554

Point not mentioned: it just doesn't work that well! It lies, it reverts to the mean on every topic, it wastes the reader's time. It's toxically positive, sycophantic. It's such a good mimic that it's insidiously hard to distinguish its bullshit or fake work from real work.

QuadrupleA · 2026-06-23T02:02:41 1782180161

Always amazed how spoiled we are with modern hardware! The 8087 was $500 in today's dollars, and delivered around 50 kFLOPS of performance (0.00005 GFLOPS).

A cheap mobile phone CPU+GPU costs the manufacturer maybe $20, and typically does 50 GFLOPS on the CPU and 500+ on the GPU. So 10 million times the performance for 1/25th the cost.

Humbling too how "worthless" all the incredible ingenuity of the 8087 circuits and die designs now is, although I'm sure many of those innovations live on in modern chips.

QuadrupleA · 2026-06-14T19:26:41 1781465201

This hype cycle is unique in that the tech writes its own hype.

overfeed · 2026-06-14T22:18:54 1781475534

The amount of pro-AI AI bots on chat forums is insane.

QuadrupleA · 2026-05-15T20:13:18 1778875998

The effort required to refute bullshit is an order of magnitude more than to create it.

QuadrupleA · 2026-05-12T01:21:43 1778548903

Because AI creates unmaintainable messes in any language, and ergonomic ones help humans clean up.

janfoeh · 2026-05-12T01:30:21 1778549421

Never mind cleaning up, you also have to understand the language just to judge and review the LLMs output. How else are you to separate good design and implementation from a bad one?

QuadrupleA · 2026-05-11T04:36:05 1778474165

Not sure how excited I feel about visiting your website and having it auto-download a 8GB model with GPT-3.5 level hallucinations, and then probably crash because I only have 6GB of VRAM. My dad won't be able to use it, or anyone else without a bleeding edge device. On a powerful enough "neural engine" device the battery will be drained quickly, while the heatsink burns a hole in my lap.

dgb23 · 2026-05-11T08:19:21 1778487561

Local could also mean self hosted.

The obvious optimization for the case presented would be to generate all the summaries on a server instead of in the client. Then the totally used compute would scale with the number of articles instead of number of users.

QuadrupleA · 2026-05-11T04:21:29 1778473289

This is just emotional rhetoric. Pretty much any app in the last 20 years has depended on a server somewhere, or a cloud provider. Like an AI provider, they can go down, they can turn off if you don't pay your bill, etc.

And local inference requires fairly beefy hardware, that is FAR from ubiquitous across today's userbases. Local models are also still far dumber than what frontier labs can serve.

Weird that this is getting such a tidal wave of upvotes.

QuadrupleA · 2026-04-24T20:39:21 1777063161

Exactly double the cost of GPT 5.4 - $5 per MTok input, $0.50 cached, $30 output.

All the AI players definitely seem to be trying to claw more money out of their users at the moment.

languid-photic · 2026-04-24T20:57:38 1777064258

It's 2x/token, but for default reasoning we've found GPT-5.5 uses fewer tokens overall, so net cheaper on median. [1]

(Note, that stops being true at higher reasoning levels, where our observed total cost goes up ~2-3x.)

[1] https://x.com/voratiq/status/2047737190323769488?s=20

guilamu · 2026-04-24T20:41:49 1777063309

https://openrouter.ai/openai/gpt-5.5-pro

30/180 usd on Openrouter. Did I miss something?

languid-photic · 2026-04-24T21:00:05 1777064405

I think that's Pro. Regular 5.5 is 2x regular 5.4.

QuadrupleA · 2026-04-18T18:25:48 1776536748

Definitely seems like AI money got tight the last month or two - that the free beer is running out and enshittification has begun.

QuadrupleA · 2026-04-18T18:23:38 1776536618

One thing I don't see often mentioned - OpenAI API's auto token caching approach results in MASSIVE cost savings on agent stuff. Anthropic's deliberate caching is a pain in comparison. Wish they'd just keep the KV cache hot for 60 seconds or so, so we don't have to pay the input costs over and over again, for every growing conversation turn.