Hacker News | mrandish's comments

> the models are improving at a rapid rate. If asked ~3 years ago where the state of the models are today, it would sound like sci-fi

Absolutely true, many things will continue to improve in significant ways. However, if we look at the modern history of rapid disruptions driven by technology (a side interest of mine), persistent patterns emerge. Similar to avalanches or flash floods, such periods of very rapid disruption are often triggered by one or more significant breakthroughs in certain technologies. Early rates of change tend to be fast and furious but eventually begin to taper as recently unlocked low-hanging fruit is harvested and those racing through newly found terrain encounter all-new significant barriers and points of friction. Early in such periods, extrapolating the recent extraordinary rates of change forward has poor predictive power. Sudden extreme bursts tend to regress back toward the long-term trend line.

Arguably, the current disruption in LLMs can be traced to post ~2010 research slowly building to the 2017 transformer paper and the adjacent work it quickly inspired. So today is, arguably, mid or late-ish in the LLM rapid burst phase. The rate of fundamental, broad-based breakthroughs lifting all LLM applications has clearly slowed with many of the most impactful recent discoveries being in scaling, optimization, tuning and productization toward specific domains. That doesn't mean there can't be another transformer breakthrough tomorrow but, historically, black swans rarely travel in flocks.


> The rate of fundamental, broad-based breakthroughs lifting all LLM applications has clearly slowed with many of the most impactful recent discoveries being in scaling, optimization, tuning and productization toward specific domains.

To me it definitely feels like it's still accelerating, with the most impactful recent discovery being RL training reasoning models (late '24, early '25).

There's an interesting article called "sigmoids won't save you" https://www.astralcodexten.com/p/the-sigmoids-wont-save-you which argues that (unless you have privileged information) you should always assume a process will continue about as long as it’s continued already. (Lindy's Law)

With that in mind the current disruption should last another 10-15 years (assuming it started in '10 or '17.)
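The arithmetic behind that estimate can be sketched in a few lines of Python. This is only an illustration of the "it will last about as long again as it already has" heuristic described above; the function names and the `ASSUMED_NOW` value of 2025 are my own assumptions, not anything from the article, and the answer shifts with whatever year is taken as "now".

```python
def lindy_remaining_years(start_year: int, current_year: int) -> int:
    """Lindy-style estimate: expect the process to last as long again as it already has."""
    if current_year < start_year:
        raise ValueError("current_year must not be earlier than start_year")
    return current_year - start_year


def lindy_expected_end(start_year: int, current_year: int) -> int:
    """Year the process would end under the same heuristic."""
    return current_year + lindy_remaining_years(start_year, current_year)


if __name__ == "__main__":
    ASSUMED_NOW = 2025  # assumption for illustration, not stated in the thread
    for start in (2010, 2017):
        remaining = lindy_remaining_years(start, ASSUMED_NOW)
        end = lindy_expected_end(start, ASSUMED_NOW)
        print(f"start {start}: ~{remaining} more years, ending around {end}")
```

With 2025 as the assumed present, a 2010 start gives about 15 more years and a 2017 start gives about 8, so the low end comes out a little under the range quoted above.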


This is of course true in general. But the question is not "how will this evolve" but how will we deal with the rapid changes in the industry? I suspect a long-term K-shaped salary curve, even worse than today, with the lower 80th-90th percentile of salaries bottoming out such that many have to exit the industry to make ends meet. You can laugh and blame them for not saving as much as they should, but that's still a fairly horrifying prospect for most of us.

I think a _lot_ about stock trading as a profession vs algorithmic trading. It was brutal - suicides, many pivoting out to doing car dealership-style work. Probably a 1/10 or 1/20 survival rate every couple of years, with almost all of it concentrated in a very painful five-year period.


I would ask for references for the suicide claims, so others can assess the impact themselves. That's a very serious claim to make without any proof, especially to a group of people who may very well be going through the same thing. I am not saying it did not happen, only that providing references is the right thing to do.

And it was the dumbest and least valuable stock traders that exited the industry. The industry is alive and well today.

Phew, for a second there I felt bad for them!

That really depends on how you define alive and well. There are still stocks and there are still traders, but the market valuations are obscene and it sure appears that there must be collusion or corruption driving the industry to jam massive IPOs into every index and 401k they can find as fast as possible to facilitate an exit.

Progress happens in a series of S-curves. While your observation is correct that advances occur initially rapidly then taper off, the next step tends to arrive sooner than the previous, and with greater magnitude [1]. Tim Urban's article from 2015 has a great explanation of this phenomenon [2].

[1]: https://ourworldindata.org/technology-long-run

[2]: https://waitbutwhy.com/2015/01/artificial-intelligence-revol...


> The rate of fundamental, broad-based breakthroughs lifting all LLM applications has clearly slowed with many of the most impactful recent discoveries being in scaling, optimization, tuning and productization toward specific domains.

What this means is that the disruption across industries has not even truly begun, because it's not the generic chatbot models that are going to kill labor, it's all the domain-specific applications that leverage those models to perform work that was performed by humans.


Of the posts I've seen by senior devs who assess recent events and end up roughly here:

> I'm still employed and I see myself employed for a foreseeable future. But I don't know what to think about the long-term ... Maybe I should consider transforming my woodworking hobby into a profession.

This one is notable for having all the clues pointing to why that's not the end-state this is headed toward, and yet... still not quite seeing it.

> I have no domain expertise that another Sr. engineer steering an LLM cannot match.

It's clear he's developed a significant competence in "steering an LLM" but the depth and value of that aren't apparent yet. After ~70 years, software development is now in the early stages of its first tectonic disruption. In the moment, these kinds of tech disruptions mostly appear to be displacing jobs but, historically, we understand the displacement is one part of a larger shift that's vertically compressing roles, functions and labor value. One steam shovel doesn't just displace dozens of pick-axe swinging diggers, it changes the roles, functions and competencies required across the entire supervision and management stack of "make tunnel through mountain" from the crew bosses and site managers to the tunnel engineers and business owners.

The author seems to be successfully navigating this shift but is still mid-disruption, so he and his management aren't yet able to see all the new competencies required or appreciate their value because it's all so new and still evolving. The rapid shock of agentic coding LLMs is especially disorienting because it's the first dramatic disruption in the field.

> review the code and steer the robot.

Historically, it's not surprising those few words are bearing so much weight and unappreciated value. Steam power was a similar shock to every field which relied on earth-moving and shaping. The big machines were quickly deployed, but it took quite a while for all the disruptions to both new and existing roles, functions and necessary competencies to be understood and appropriately valued. I imagine some top pick-axe swingers who'd graduated to being crew bosses and site foremen ended up driving or directing early steam shovels. In the first months they probably had little appreciation for the tremendous amounts of tacit new knowledge and practical expertise they accrued while keeping the steel beasts working. They were too busy being both amazed at the sheer power and frustrated by the constant scalding burns, tip-overs, blown boilers, landslides (too much weight, too little support) and cave-ins (dug too much tunnel, too fast with too little scaffolding), etc.

A big difference in the analogy is the first 100,000 steam shovels weren't sold at ~1/10th their actual cost and simultaneously delivered to job sites worldwide in six months. Software engineering is also unlike earth-moving and tunnel digging, in that the full costs and consequences aren't as visible or immediate as cave-ins and avalanches. The prices of 'steel beasts' are already going vertical with no end in sight and, over the next 18 months, I suspect "management" is about to gain a more viscerally accurate appreciation of the catastrophic costs of digging 'too much tunnel, too fast' absent the close supervision of highly skilled experts in directing all that newfound power constructively and not destructively. Between the skyrocketing full cost of operation and the consequences of poorly managed, non-expert execution - we'll start to see the broad outlines of the new equilibrium take shape.

In the steam era it took over a decade for the ecosystem to understand how to even draw a new org chart accurately, label the boxes and appropriately value proven competency where it mattered. The faster the disruption, the longer it can take for all the pieces to rebalance and stabilize around a new equilibrium. Today, the author doesn't know all that he already knows and doesn't yet have the visibility to see how the new domain competencies he's rapidly accruing are creating a different kind of role that could be even higher value.


This is no doubt a useful production tool when one needs to create the visible artifacts of VHS video for some motivated story reason but as someone who worked in the tech side of broadcast video production starting near the end of the analog era and through the transition to digital, I'm not generally a fan of doing so outside those specific contexts.

The reason is that full bandwidth 6 MHz analog composite or component video could look wonderful. If you ever have the chance to see a 2-inch quad VTR playing a master tape on a broadcast quality monitor, please do. I suspect you'll be shocked at how good it looks, even to modern eyes. Yes, the absolute resolution is lower, but the magic of those analog broadcast standards was how gracefully they fit so much image into 6 MHz of bandwidth. Conversely, VHS tape recording was the absolute worst, most compromised form of that. At the time, it was the best that could be done at consumer prices. But no one ever thought it was remotely good quality in any sense other than perhaps "better than nothing", and even that was hardly unanimous.

There's something about full bandwidth broadcast quality analog composite video that can be genuinely aesthetically pleasing, even compared to digital HDTV. Sadly, very few consumers ever got to see it in its pure, unadulterated form. Even live broadcasts, after being sent up a transmitter tower and down an aerial antenna, were a decimated form of the original signal at the head end (although leagues better than VHS). Yes, modern digital IS better in almost all ways, but in a few ways there was, and still is, something uniquely 'good' about that analog head-end video signal. I won't say 'better' because that's an aesthetic and stylistic judgement but definitely 'good'. Whereas, there's literally nothing good about VHS. At no point ever did a 1980s video creator look on their equipment shelf, see a VHS camcorder next to... literally any other camera or recording system, and say "I'll take the VHS today because it's the better tool for this job."

There's one context where I'm a huge proponent of recreating our analog past and that's when viewing 1980s and early 90s computer or game console graphics created to be displayed on 15 kHz analog composite video displays. That's when analog CRT emulation via GPU pixel shaders should always be used. The square, razor-sharp, hard-edged pixels of such content as seen on modern digital flat screens are an inaccurate distortion of the past because no one in that past, like the people involved in the creation or consumption of that media, ever saw square pixels like that. The only displays we had then were CRTs, and images made for 15 kHz analog CRTs look not only different but much better on the displays they were designed on and for (or a good simulation of those displays).


Was anyone else fished in by the title and disappointed? After some broad introductory discussion of RSI, the article was almost entirely about LLM coding. While there are some metrics for unattended agentic coding, it doesn't discuss "When AI builds itself" (beyond 'not now') or any progress specifically toward actual recursive self-improvement. I'm very interested in any empirical evidence of meaningful progress in RSI, so... this felt deceptively titled.

To me, unattended agentic coding is not RSI, in the same way a self-reloading "Unattended 3D printer" is not at all a "3D printer that recursively prints complete 3D printers in which each generation is significantly faster and more advanced than the last." The "unattended" part is obviously necessary but hardly sufficient. The article tacitly assumes LLM progress to be something like 1: Unattended agentic coding, 2: AGI, 3: RSI. I suspect that third step should be labeled "not to scale."

I'm increasingly convinced that actual Full Foom RSI (FF-RSI) is on a radically different scale than the first two. Just leaving it unaddressed is like assuming: Step 1: Manned space station, Step 2: Manned Mars base, Step 3: Manned Alpha Centauri base, are "just logical next steps." FF-RSI requires sustaining superlinear, recursively amplifying cognitive returns along a specific directed path - and we currently have no empirical evidence that such returns can exist for artificial OR biological intelligences. Large collectives of the smartest humans alive (Bell Labs, IAS, etc) haven't just failed to get anywhere close to reliably sustaining that, we can't even reliably predict non-recursive, single occurrences or even imagine any way all 8B humans could fully mobilize to predictably achieve non-recursive, single occurrences.

The only prior we have for open‑ended intelligence improvement is biological evolution which shows extremely slow and unreliable sublinear returns at best. And even if unbounded, recursive self‑improvement is physically possible, it may be practically unachievable due to asymptotic economic, resource and other barriers in the same way approaching light speed requires exponentially more energy. I think it's plausible, and maybe probable, that AIs achieve true super-human intelligence in a decade and yet still won't achieve FF-RSI for centuries, if ever. To me, absent compelling evidence to the contrary, that's the reasonable Null Hypothesis. Even if you feel that's too pessimistic, it seems reasonable to expect any serious discussion of "Progress Toward RSI" to first discuss why it might even be plausible that 1: Miles, 2: AU (Astronomical Units), and 3: Light Years belong on the same scale, instead of just assuming it like the meme's empty "Step 3. .... " before moving on to "Step 4. Profit!" (or "IPO!" but very, very responsibly).


> "A caveat: Lines of code is an imperfect measure"

I'm pleased they at least included this. However, they address the caveat by 'rounding down' the estimated multiple of the gain. I'm not sure that is the correct adjustment, especially once we understand the range isn't limited to positive numbers.

There's strong evidence the range of code productivity denominated in "lines of code" should include negative numbers, especially in the highest-quality sphere. Perhaps the earliest and most legendary example: https://www.folklore.org/Negative_2000_Lines_Of_Code.html


Exactly this. Just this week an engineer who seems to purely vibe everything submitted a +700ish LoC fix for what seemed like a pretty simple issue. Moreover it was a perf issue, which in my experience is not usually best fixed by adding more stuff.

Today, I merged my fix, net -381 LoC.

I'm using them too of course, they read and type and hunt for bugs and test faster than I can. But I'm using them as my tool, not being a tool using them.


> But I'm using them as my tool, not being a tool using them.

Keep believing that


Do you find it impossible to use LLMs productively without giving over your brain wholesale to them?

AFAIK, the only correlation with LoC that's got solid evidence is this: the number of bugs correlates with LoC.

Yep, this is exactly what I thought of too... If you believe negative lines of code is the goal, then they've gotten 8x _worse_!

Lmao I bloody love that.

Fully reproducing the motion clarity of a CRT on a modern display also requires 480 Hz.

(https://blurbusters.com/crt-simulation-in-a-gpu-shader-looks...)


> I think CRTs would have been capable of HDR

Very likely. CRT technology, from phosphors to screen masks to deflection yokes, was highly evolved but there was still a lot of headroom for more performance and new innovation. Some CRT tubes were capable of driving much higher brightness than their controllers ever allowed.

It's unfortunate that CRT manufacturing wound down entirely after ~2010. While the size, weight and huge glass volumes were impractical for mass-market consumer media devices, CRTs also had unique capabilities modern display tech still can't match. With all the current interest in retro CRTs, I actually looked into what it would take to do small runs of ultra high-end HD CRTs for collectors, almost on an artisanal boutique basis. Unfortunately, it looks like the upstream manufacturing chain of many component elements also collapsed because there were no other applications for them. So, the start-up costs to make the first one would be pretty huge.


I would buy a brand new CRT from a boutique manufacturer, even for a pretty absurd price. Whether enough people would is a tough question to answer though.

Check out the rest of Copetti's site. He's got similar posts on almost all the other consoles. So much gold.

Yep, I was totally nerd-sniped by the image. I've never seen an engineer draw a whiteboard diagram anywhere near that detailed and tidy. No acronyms, consistent title case, descenders on a baseline - everything about it is wrong. It's so counter to reality, I seriously wondered if it was a joke.

The Nano Banana team should be pissed Google PR is distributing such a terrible photo. The poses are stilted, expressions frozen, even the eye-lines are off. Why couldn't they just use a Google Pixel phone to snap a photo of real Google engineers in a real Google office and upload it to Google Photos? Not Google enough?


I have seen such detailed and tidy whiteboard diagrams, but the catch is that they never occur in active discussion. It doesn't make room for scribbling, and stopping a discussion for 5-10 minutes to draw slowly and nicely doesn't make sense...

True, no one can understand my whiteboard drawings the next day, not even I.

Perfect operational security!

I think the logic follows: if we are already staging a scene just for PR purposes, as was usually done, then why not generate it using AI?

I actually didn't full-stop on it because I thought it was AI. For the first few seconds I thought it was a staged photo. I was nerd-sniped because it was staged so badly.

This is why we need smart glasses recording everything you see 24/7 to gather relevant real training data.

While he reads books in his job, what he's actually paid for is quickly synthesizing what he's read into actionable judgements assessing whether (and in what ways) those books have potential to be adapted into commercial film scripts. His assessments are ~10 to ~20 pages, and while being free-form to some extent, still follow fairly evolved standards for format, structure, criteria and terminology.
