Hacker News | QuadrupleA's comments

Point not mentioned: it just doesn't work that well! It lies, it reverts to the mean on every topic, it wastes the reader's time. It's toxically positive, sycophantic. It's such a good mimic that it's insidiously hard to distinguish its bullshit or fake work from real work.

Always amazed how spoiled we are with modern hardware! The 8087 was $500 in today's dollars, and delivered around 50 kFLOPS of performance (0.00005 GFLOPS).

A cheap mobile phone CPU+GPU costs the manufacturer maybe $20, and typically does 50 GFLOPS on the CPU and 500+ on the GPU. So 10 million times the performance for 1/25th the cost.
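For anyone who wants to check the arithmetic, here is a quick sketch using only the rough figures quoted above (the constant names are mine, and the numbers are ballpark estimates, not measurements):

```python
# All figures are the rough estimates quoted in the comment above.
I8087_FLOPS = 50e3        # ~50 kFLOPS for the 8087
I8087_COST_USD = 500.0    # price in today's dollars
PHONE_CPU_FLOPS = 50e9    # ~50 GFLOPS on a cheap phone CPU
PHONE_GPU_FLOPS = 500e9   # ~500 GFLOPS on the phone GPU
PHONE_COST_USD = 20.0     # estimated cost to the manufacturer

cpu_speedup = PHONE_CPU_FLOPS / I8087_FLOPS    # 1 million times
gpu_speedup = PHONE_GPU_FLOPS / I8087_FLOPS    # 10 million times
cost_ratio = I8087_COST_USD / PHONE_COST_USD   # 25, i.e. 1/25th the cost

# Performance per dollar combines both improvements.
flops_per_dollar_gain = gpu_speedup * cost_ratio

print(f"CPU speedup: {cpu_speedup:,.0f}x")
print(f"GPU speedup: {gpu_speedup:,.0f}x")
print(f"Cost ratio: {cost_ratio:.0f}x cheaper")
print(f"FLOPS per dollar (GPU): {flops_per_dollar_gain:,.0f}x")
```

Note the 10 million figure is the GPU number; on the CPU alone it's closer to 1 million times.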

Humbling too how "worthless" all the incredible ingenuity of the 8087 circuits and die designs now is, although I'm sure many of those innovations live on in modern chips.


This hype cycle is unique in that the tech writes its own hype.

The number of pro-AI AI bots on chat forums is insane.

The effort required to refute bullshit is an order of magnitude more than to create it.


Because AI creates unmaintainable messes in any language, and ergonomic ones help humans clean up.


Never mind cleaning up, you also have to understand the language just to judge and review the LLM's output. How else are you to separate a good design and implementation from a bad one?


Not sure how excited I feel about visiting your website and having it auto-download an 8GB model with GPT-3.5 level hallucinations, and then probably crash because I only have 6GB of VRAM. My dad won't be able to use it, or anyone else without a bleeding edge device. On a powerful enough "neural engine" device the battery will be drained quickly, while the heatsink burns a hole in my lap.


Local could also mean self hosted.

The obvious optimization for the case presented would be to generate all the summaries on a server instead of in the client. Then the total compute used would scale with the number of articles instead of the number of users.
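A toy model of that scaling argument, assuming one inference run per summary and that server-side summaries are generated once and then served to everyone. The function names and the traffic numbers are made up for illustration:

```python
def client_side_runs(num_users: int, articles_read_per_user: int) -> int:
    """Every user's device summarizes every article that user opens."""
    return num_users * articles_read_per_user


def server_side_runs(num_articles: int) -> int:
    """Each article is summarized once on the server, then cached."""
    return num_articles


# Hypothetical site: 100k users, 20 articles read each, 500 articles total.
client_runs = client_side_runs(num_users=100_000, articles_read_per_user=20)
server_runs = server_side_runs(num_articles=500)

print(f"client-side inference runs: {client_runs:,}")
print(f"server-side inference runs: {server_runs:,}")
print(f"ratio: {client_runs // server_runs:,}x")
```

With those invented numbers the client-side approach does 4,000 times as many inference runs, and the gap widens as the user count grows.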


This is just emotional rhetoric. Pretty much any app in the last 20 years has depended on a server somewhere, or a cloud provider. Like an AI provider, they can go down, they can turn off if you don't pay your bill, etc.

And local inference requires fairly beefy hardware that is FAR from ubiquitous across today's userbases. Local models are also still far dumber than what frontier labs can serve.

Weird that this is getting such a tidal wave of upvotes.


Exactly double the cost of GPT 5.4 - $5 per MTok input, $0.50 cached, $30 output.

All the AI players definitely seem to be trying to claw more money out of their users at the moment.


It's 2x/token, but for default reasoning we've found GPT-5.5 uses fewer tokens overall, so net cheaper on median. [1]

(Note, that stops being true at higher reasoning levels, where our observed total cost goes up ~2-3x.)
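The break-even logic can be sketched as follows. This is a simplification that treats cost as price per token times tokens used, ignoring the input/output/cached split; the token ratios below are hypothetical, not the measured ones from the link:

```python
def relative_cost(price_multiplier: float, token_ratio: float) -> float:
    """Cost of the new model relative to the old one.

    price_multiplier: new price per token / old price per token
    token_ratio:      tokens the new model uses / tokens the old model uses
    """
    return price_multiplier * token_ratio


# At 2x the price per token, the new model is cheaper only if it uses
# fewer than half the tokens.
cheaper = relative_cost(2.0, 0.4)     # below 1.0: net cheaper
break_even = relative_cost(2.0, 0.5)  # exactly 1.0: same cost
pricier = relative_cost(2.0, 1.5)     # 3.0: three times the cost

print(cheaper, break_even, pricier)
```

So the "2-3x at higher reasoning levels" observation would correspond to token usage that is the same or up to 1.5x higher, on top of the doubled price.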

[1] https://x.com/voratiq/status/2047737190323769488?s=20


https://openrouter.ai/openai/gpt-5.5-pro

$30/$180 on OpenRouter. Did I miss something?


I think that's Pro. Regular 5.5 is 2x regular 5.4.


Definitely seems like AI money got tight the last month or two - that the free beer is running out and enshittification has begun.


One thing I don't often see mentioned - the OpenAI API's automatic token caching results in MASSIVE cost savings on agent stuff. Anthropic's deliberate caching is a pain in comparison. Wish they'd just keep the KV cache hot for 60 seconds or so, so we don't have to pay the input costs over and over again, for every growing conversation turn.
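To put rough numbers on why this matters for agents, here is a simplified sketch. It assumes the $5 / $0.50 per MTok input and cached-input prices quoted earlier in this thread, that every turn re-sends the whole conversation so far, that the cache always hits on the previous prefix, and that there is no cache-write surcharge. Real billing is messier; the function name and the turn/token counts are made up:

```python
INPUT_USD_PER_MTOK = 5.00    # uncached input price quoted in this thread
CACHED_USD_PER_MTOK = 0.50   # cached input price quoted in this thread


def conversation_input_cost(turns: int, tokens_per_turn: int, cached: bool) -> float:
    """Total input cost in USD when each turn re-sends the full history."""
    total = 0.0
    for turn in range(1, turns + 1):
        prefix_tokens = (turn - 1) * tokens_per_turn  # history already sent before
        new_tokens = tokens_per_turn                  # content added this turn
        prefix_rate = CACHED_USD_PER_MTOK if cached else INPUT_USD_PER_MTOK
        total += (prefix_tokens * prefix_rate + new_tokens * INPUT_USD_PER_MTOK) / 1e6
    return total


# Hypothetical agent session: 50 turns, 2,000 new tokens per turn.
uncached = conversation_input_cost(turns=50, tokens_per_turn=2_000, cached=False)
with_cache = conversation_input_cost(turns=50, tokens_per_turn=2_000, cached=True)

print(f"no caching:   ${uncached:.2f}")
print(f"with caching: ${with_cache:.2f}")
```

Under those assumptions the session costs about $12.75 in input tokens without caching and about $1.73 with it, because the re-sent history grows quadratically with the number of turns while the new content only grows linearly.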

