Hacker News | drivebyhooting's comments

I’m interested in learning more about your theory that these models can be trained more cheaply. Is anyone doing it from scratch, rather than adversarial distillation?

It is a lot cheaper to train a 27b model such as qwen3.6, which you can even vibe code or agentic code with, than it is to train a 1t+ parameter model. It runs on a single commodity GPU, for goodness' sake.

It's not a theory. These smaller models that are coming out are huge advances for the field.

I can't comment on companies' training practices. That would be proprietary stuff, I guess. I think the claims that the advances being made are due to distillation alone are completely unfair. The advances are not just data.


It almost doesn’t matter if it’s trained using adversarial distillation - if it’s nearly as good, and one-hundredth the cost, the choice is obvious.

Several times the speed of sound? That is meaningless when there is no media for the sound waves. I think a better unit might be furlongs per fortnight.

From TFA:

> 2.43 kilometers a second, or 1.51 miles a second, or 5,400 miles an hour, or 8,700 kilometers an hour.

> There is, of course, no air and no sound on the Moon, so a "Mach number" doesn't really make sense. But if there were air, the speed would be about Mach 7, seven times the speed of sound.


"If there were air". Air at which temperature though? The speed of sound, and hence what Mach numbers mean, depends on the temperature of the air. The temperature air would have at the moon's surface? By day or by night? Or the air at Earth's surface? Or at some other altitude?
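
For what it's worth, the temperature dependence is easy to put numbers on. A rough sketch, assuming dry air behaves as an ideal gas with a constant heat-capacity ratio, and taking roughly 100 K and 400 K as illustrative lunar night and day surface temperatures (those two values and every name in the code are my assumptions, not from TFA):

```python
import math

GAMMA = 1.4            # heat-capacity ratio of dry air (assumed constant)
R = 8.314462618        # universal gas constant, J/(mol*K)
M_AIR = 0.0289647      # molar mass of dry air, kg/mol

def speed_of_sound(temp_k: float) -> float:
    """Ideal-gas speed of sound in dry air, in m/s, at temperature temp_k (kelvin)."""
    return math.sqrt(GAMMA * R * temp_k / M_AIR)

def mach(speed_m_s: float, temp_k: float) -> float:
    """Mach number of speed_m_s relative to sound in air at temp_k."""
    return speed_m_s / speed_of_sound(temp_k)

IMPACT_SPEED = 2430.0  # m/s, the 2.43 km/s quoted from TFA

# Illustrative temperatures only; the lunar values are rough assumptions.
for label, temp_k in [
    ("Earth surface, 15 C", 288.15),
    ("lunar day (assumed)", 400.0),
    ("lunar night (assumed)", 100.0),
]:
    print(f"{label}: a = {speed_of_sound(temp_k):.0f} m/s, Mach {mach(IMPACT_SPEED, temp_k):.1f}")
```

Under these assumptions the "Mach 7" figure only matches air at about Earth-surface temperature. The same speed works out to roughly Mach 6 in 400 K air and roughly Mach 12 in 100 K air.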

How many Machs is the earth moving?

What about giraffe lengths per second?

"several times the speed of sound" is obviously just meant to mean really fast to earthlings in relation to their speed of sound.

It’s not a constant on earth either. They should have used km/h instead for a relatable number.

Well, there is a speed of sound on the moon. Sound does travel through the regolith. If you were standing on the moon you would indeed "hear" this impact as the sound moved up through your feet. It would sound/feel like standing beside a subwoofer.

I wouldn't call that sound but you can call whatever you want whatever you want, I suppose

Transverse pressure waves?

I still call it sound when I hear things under water. Gas isn't special.

When someone knocks on your door, you still say you heard the sound, even though the pressure wave was transmitted through the solid door material (before then being transmitted to the gas in the room). Likewise, we still file a noise complaint when the neighbor is throwing a raging party, even though we are feeling the bass as much as we are hearing it.


And many old dogs, who may be totally deaf, can still hear/feel someone knocking on the front door. You don't need ears to detect vibrations.

Like putting your head on a railway track to hear a train? Sound is sound.

To me, sound is a particular element of human perception. Not all pressure waves are going to be perceived as sound.

Didn’t Anthropic vibe code all of those integrations? If AI coding is as useful and successful as it is touted to be, then those integrations should be no moat at all.

But when will RF engineering pay 500k (common mid level SWE)?

SWE are paid that because the industry makes so much money off advertising, and it marks the market for everything else.

It's more business model than skillset, because RF engineering is, in many ways, so much more technically challenging.

People who care about pay should mostly be thinking about how their potential employers make money. Do they have fat variable margins? Is there volume? Do I have the opportunity to impact those margins in some way? If you do, there's a good chance you can make good money, regardless of the actual technical challenge at hand.

For a lot of RF engineering, the answers are generally no, at least enough such that the general market isn't getting set at a high clearing rate.


Hardware engineers can get paid that, although it’s rarer. That said, there’s also a much broader base of hardware engineers than just the Bay Area… so cost of living is a lot lower, therefore salaries don’t need to be as sky high to compensate.

The difference is that RF engineers still have a salary 5 years later, so that should probably be averaged in.


4 out of the 5 FAANGs hire RF engineers so… now?

Imagine vibe coding your core consumer application and associated backend…

Oh wait, I don’t have to imagine. That’s what Anthropic does. A nice preview of what is in store for those who choose to turn off their brains and turn on their AI agents.


I wish WhatsApp would get nationalized. I absolutely hate having to use it.

Given how inefficient Meta et al are, why do they pay so much more than the nimbler smaller companies? (Rhetorical question, I already know the answer: monopoly and regulatory capture)

Of course those engineers would rather have more meaningful work if it came with similar compensation and work life balance.


Hard to motivate people to work on things that destroy society. Money helps.

Want to see how motivated Meta employees are? Watch how fast their offices clear out at 5pm on the dot.


What do you think is an appropriate time for most employees to end their workday?

I am a terrible person to ask. My employers get their money's worth from me: I genuinely like my work and regularly work more than 8hrs a day. I also work in a field with others who, with some exceptions, do the same, so it's strange for me to see "normal people" clock out on the dot.

Have you considered that people can both like their work, and like other things at the same time?

Meta offices are pretty full at 5pm lmao. In fact they are still decently full at 7pm after dinner at 6. Baffles me why people just make up random crap in areas they clearly know less than nothing about.

Because you have to pay people more to do boring or evil work vs meaningful or exciting work

In my experience the pay was never close enough for meaning and ethics to play a role in the decision.

Cool exciting and meaningful science job: 200k

Big Tech surveillance capitalism job: 800k (at the low end)

The calculus has only been about affording housing and providing for the family.


800k at the low end? Big tech pays well, but that sort of comp is reserved for very senior folks.

Where do I get this cool exciting and meaningful science job paying $200k?

This is my experience too. I actually briefly took the cool exciting climate change related science job and then realized that I couldn’t actually support my family’s lifestyle on $160k so I left and went back to surveillance capitalism. I do feel guilt about that decision, but I like to imagine I’ll be able to go back to working on interesting and ethical things after my kids are out of the house.

Seems the pay is very different and thus is absolutely playing a role in the decision?

I’m curious how does it track with class sizes and tracks (remedial, regular, honors, “gifted”).

The level is so low in my local elementary school that the single track math class is still doing addition within 20 for first grade.

The funny thing is that the state standards actually measure 2 and 3 digit addition for K and 1st graders, and proficiency at that level would be p75 for a 1st grader. So why is the actual class teaching level at p<50?


The standards in New York for first grade are adding and subtracting within 20. Which state are you thinking of with higher standards?

https://www.nysed.gov/sites/default/files/programs/curriculu...


It’s in California.

My child took the “CASTL” test as part of IEP assessment. According to the result percentiles, a 6 year old who can do 2 digit vertical addition is at the 70th percentile of 6 year olds.

How come in kindergarten the curriculum is counting to 100 and addition within 10, while the assessment test shows double digit addition is well within the norm?

BTW my child didn’t learn any arithmetic from school.

EDIT: In case I wasn’t clear: there is a huge discrepancy between the standardized tests and the actual curriculum “standards”.


> tracks (remedial, regular, honors, “gifted”)

The decrease is consistent across performance levels which should be a pretty decent proxy for tracks.


Oh yes, thanks for reminding me. I’m going to cash out the 401(k).

You’ll pay massive penalties on that. Another option is options (heh), but I’m not finance-literate enough to know how to pull it off.

There are only penalties if you withdraw from the 401k. Most 401k plans have some kind of money market fund, bond fund, or similar.

You can just reallocate away from an index fund.

I’ve made my peace with the “massive penalties”. I benefited from employer match in the past. I want the money now, not when I retire.

You gotta do what you think is best, but I hope for future you's sake you decide to not pull the money out. Or if you do you have other retirement plans.

I'm trying to help my parents now that they're at retirement age and am seeing first hand what not planning for your future looks like. They hit retirement with nothing but a small social security check every month. Not even enough to cover rent in most places.

I don't know how much you have in your 401k, but it will be worth literally hundreds of thousands more if you pull it out when you retire. You aren't just paying the penalties now, you're paying for potentially decades of compounding.
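
To put rough numbers on that, here is a back-of-the-envelope sketch. Every input is a made-up assumption for illustration (a $100k balance, a 7% average annual return, 30 years to retirement, the 10% early-withdrawal penalty, and a 24% marginal income tax rate), and it ignores inflation and whatever tax is owed on withdrawals in retirement:

```python
def future_value(balance: float, annual_return: float, years: int) -> float:
    """Balance after compounding once per year at a fixed annual return."""
    return balance * (1 + annual_return) ** years

def cash_out_now(balance: float, penalty: float, income_tax: float) -> float:
    """Amount left after an early-withdrawal penalty plus income tax on the whole balance."""
    return balance * (1 - penalty - income_tax)

# All of these inputs are hypothetical.
BALANCE = 100_000.0
ANNUAL_RETURN = 0.07
YEARS = 30
PENALTY = 0.10       # early-withdrawal penalty (assumed to apply)
INCOME_TAX = 0.24    # assumed marginal rate

left_alone = future_value(BALANCE, ANNUAL_RETURN, YEARS)
in_hand = cash_out_now(BALANCE, PENALTY, INCOME_TAX)

print(f"Left invested for {YEARS} years: ${left_alone:,.0f}")
print(f"Cashed out today after penalty and tax: ${in_hand:,.0f}")
```

With these particular inputs the untouched balance grows to about 7.6 times its starting value, while cashing out leaves about two thirds of it in hand today. Different return or tax assumptions move the numbers a lot, but the gap is mostly the compounding, not the penalty.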


Retirement plan is rappelling accident before dotage.

Well can't argue with that lol

But if by some tragedy you don't die young, your older self is gonna be pissed at younger you for costing him hundreds of thousands of dollars.


You could just buy deep out-of-the-money SP500 puts expiring in 1+ year. That way you would be "insured" against the bubble popping.

The thing is, every dollar you spend on insurance is a dollar (and its interest) you lose. Furthermore, we don't know when it will pop. 1 year? 5 years?

The more reasonable solution is probably to gradually reduce exposure to US markets by selling SP500 shares and moving into Europe and emerging markets ETFs. No need to cash out the 401k.


You should backtest this strategy over the last 20 years before you make serious decisions off of the vibe from internet comments

20 years is not enough.

If you just look at the past 20 years, the US has had exceptional returns compared to the rest of the world.

The thing is, historically, high PE ratios like what we're seeing in the US do not correlate with short term returns that are as high. Expected future returns decrease as the PE ratios go up in a pretty linear fashion.

https://am.jpmorgan.com/us/en/asset-management/institutional...


Why 20 years? Just because we know, post hoc, that the USA outperformed other places in the last 20 years in no way means the next 20 years will be the same.

If you want a different point to backtest from, try Japan in the 80s and early 90s


What's the point of backtesting? Does backtesting say anything about the future?

The point of backtesting is to allow you to do what you want to do with a veneer of being data driven.

Repetition rather than novelty is good for learning.

Sure, and she gets that, but at some point she completely memorizes the stories. She also asks if we can get new books at the store, but they don't make 'em that fast.

Isn’t that also a valuable life lesson that some topics/resources are scarce and at some point you need to do something else?

Sure, and she already got that lesson when there literally weren't more. Then she got another lesson: we can just make our own. In fact that may be one of the most important lessons to learn: you have agency, and you can use the tools you have as accelerants to better yourself, further increasing your agency.
