
I mean, Google already has MuZero, which I'm willing to bet has evolved quite a bit in private, because if anything is going to get us closer to actual AI, it's that.

Realistically, one can build an AI capable of reasoning (i.e. recurrent loops with branches) using very basic models that fit on a 3090, with a multi-agent configuration along the lines of https://github.com/gastownhall/gastown. Nobody has done it yet because we don't know how many agents are required or what the prompts for those look like.
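A minimal sketch of what such a recurrent loop with branches could look like. Everything here is hypothetical: the agent roles, the prompts, and the `call_model` stub (which would be replaced by an actual inference call to a small local model) are illustrations, not the linked repo's design.

```python
# Hypothetical two-agent loop: a "worker" proposes an answer and a
# "critic" decides whether to loop again. call_model() is a stub
# standing in for a real local-model inference call.

def call_model(system_prompt: str, user_input: str) -> str:
    # Placeholder: a real version would query a model running locally.
    return f"[{system_prompt}] {user_input}"

def reason(task: str, max_iters: int = 3) -> str:
    answer = call_model("worker: propose a solution", task)
    for _ in range(max_iters):
        verdict = call_model("critic: reply REVISE or ACCEPT", answer)
        if "ACCEPT" in verdict:
            break  # the critic is satisfied; stop iterating
        # branch: feed the critique back into the worker for revision
        answer = call_model("worker: revise given critique", verdict)
    return answer
```

The open question the comment raises is exactly what goes in those system prompts and how many such roles are needed.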

The fundamental philosophical problem is whether that configuration is possible to arrive at through training, or whether AI agents have to go through equivalent "evolution epochs" in a simulated environment to be able to do all that. Because in the case of those prompts and models, they have to be information agnostic.


Because in order to exploit this, you have to have direct access to the computer, either through a malicious USB device or by exploiting some supply chain or a known piece of software that will be willingly or automatically installed. Furthermore, you need to be able to run essentially arbitrary terminal commands, which is a huge breach of isolation in that software.

If an attacker manages to do all that, it's already bad news for you. Escalation to root is the least of your worries at that point.

Like someone else below posted, https://xkcd.com/1200/

People need to understand what the vulnerability actually is before freaking out about it.


You are assuming that LPE only applies to the user that holds all the sensitive stuff. But it also applies to users created specifically for isolation. Without LPE they would not have access to anything important even if they were compromised.

So a threat actor buys access to a managed kubernetes service, or other linux-based shared hosting platform, and now they have access to the computer.

Hell, GitHub Actions would do.


Is there any service that relies on Linux user separation or containers to separate different user accounts? I’m pretty sure you’re not supposed to do that and the proper way is to run different instances in virtual machines.

Basically every shared webhost that uses cPanel works like this. The security mechanism they use is called CageFS (https://cloudlinux.com/getting-started-with-cloudlinux-os/41...), which makes it so users can't see other users, but it's not like a VM or something.

Right, you're not supposed to do that...

Qwen is still better than Gemma, though. Also, you can tune it more for different tasks, which means you can prioritize thinking and accuracy versus inference speed.

Qwen is better at some things (code, in particular), but Gemma has better prose and better vision. At least, it feels that way to me.

Gemma is also just way faster. I don't wanna wait 10 min to get a 5-10% better answer (and sometimes an actually worse answer).

Best is to use your own model router atm, depending on the task.
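A task-based router can be as simple as a lookup keyed by a cheap heuristic. The model names and the `classify` rules below are made up for illustration; a real router would match whatever models you actually run locally.

```python
# Hypothetical per-task model router: pick a different local model for
# code, vision, and plain chat instead of one model for everything.

ROUTES = {
    "code": "qwen3-coder",   # thread consensus: Qwen is stronger at code
    "vision": "gemma-vl",    # thread consensus: Gemma has better vision
    "chat": "gemma-small",   # smallest/fastest for quick answers
}

def classify(prompt: str) -> str:
    # Crude illustrative heuristic; a real router might use a tiny
    # classifier model instead of string matching.
    if "```" in prompt or "def " in prompt:
        return "code"
    if prompt.startswith("image:"):
        return "vision"
    return "chat"

def route(prompt: str) -> str:
    return ROUTES[classify(prompt)]
```

The design choice is the usual one: a dumb heuristic is instant and predictable, while routing with a classifier model costs an extra inference per request.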


I'm pretty sure Qwen is faster? The MoE version of Qwen is 3B active, while Gemma 4 is 4B active. Similarly, the dense Qwen is 27B while Gemma is 31B. All else being equal (though I know all else isn't equal), Qwen should be faster in both cases. I haven't actually measured with any precision, but on my AMD hardware (Strix Halo or dual Radeon Pro V620) they seem quite similar in both cases...both MoE models are fast enough for interactive use, both dense models are notably smarter but much slower, long time to first response and single-digit tokens per second once it starts talking.
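The "fewer active params should mean faster" intuition can be checked with a back-of-envelope calculation: single-stream decode on local hardware is roughly memory-bandwidth-bound, so the ceiling is bandwidth divided by the bytes of weights touched per token. The numbers below (Q4 ≈ 0.5 bytes/param, ~256 GB/s for a Strix Halo-class machine) are illustrative assumptions, not measurements.

```python
# Back-of-envelope decode-speed ceiling: tokens/s ~ memory bandwidth
# divided by bytes of active weights read per token. Assumes a ~Q4
# quant (0.5 bytes/param); real speeds will be lower than the ceiling.

def est_tokens_per_sec(active_params_b: float, bw_gb_s: float,
                       bytes_per_param: float = 0.5) -> float:
    active_bytes_gb = active_params_b * bytes_per_param
    return bw_gb_s / active_bytes_gb

# Assumed ~256 GB/s bandwidth: 3B-active MoE vs 27B dense
moe = est_tokens_per_sec(3, 256)     # ceiling around 170 tok/s
dense = est_tokens_per_sec(27, 256)  # ceiling around 19 tok/s
```

This matches the observation in the comment: both MoE models feel interactive while both dense models drop to single-digit tokens per second, and a 3B-vs-4B active difference is small enough to be washed out by everything else that isn't equal.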

qwen-3.6 is really interesting. The dense 27B model is pretty slow for me whereas the sparse 31B is blazingly fast but it also needs to be since it's so chatty. It produces pages and pages of stream of consciousness stuff. 27B does this to a lesser extent but slow enough that I can actually read it whereas 31B just blasts by.

I haven't yet compared either to Gemma 4. I tried it the day after it came out with the patched llama.cpp that added support for it, but I couldn't make tool calling work, so it was kind of useless. I should try again to see if things have changed, but judging by what people say, qwen-3.6 seems stronger for coding anyway.


I had the same experience with 31B. Runs well on 4090 too!

I'm using both incessantly and having a great time.

Qwen without thinking is just as fast. I have 4 parameter settings based on the recommendations. If you have a hard coding problem, the thinking coding mode works well but takes a while to arrive at an answer. If you want faster turnaround time, instruct mode works without thinking.

Genuine question: how do you tune it?

I thought "fine-tuning" meant training it on additional data to add additional facts / knowledge? I might be mistaking your use of the word "tune", though :)


Parameter settings are here. https://huggingface.co/Qwen/Qwen3.6-35B-A3B

Most clients that support ollama support passing extra body options where you can set those.
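As an illustration of "passing extra body options": Ollama's chat endpoint accepts an `options` object for sampler settings. The model tag and the specific values below are assumptions (they mirror commonly recommended Qwen thinking-mode settings); check the model card linked above for the actual recommendations.

```python
# Example request body for Ollama's /api/chat endpoint. The "options"
# object carries sampler settings; values here are illustrative and
# should be taken from the model card, not this sketch.
import json

body = {
    "model": "qwen3.6",  # hypothetical local model tag
    "messages": [{"role": "user", "content": "Explain MoE briefly."}],
    "options": {
        "temperature": 0.6,
        "top_p": 0.95,
        "top_k": 20,
    },
}
payload = json.dumps(body)  # what a client sends as the request body
```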


You can fine-tune relatively easily in Unsloth Studio.

It’s a heck of a lot faster too.

Yes I would just go with qwen.

I found that Gemma 4:26b makes way more mistakes compared to Qwen and Gemma 3. Gemma 3 27B QAT was my go-to for some time, as it was quite fast. Qwen is still king for a balance of accuracy and inference speed.

Gemma:31b was more accurate, but the speed was horrendous.


You don't need HDMI out, just the ability to do screenshots, which is easy to script.

Arguably though, browser automation gets you 95% of the way there for most things.


Many systems won't allow the end user to install any software (e.g. work issued laptops), but you can plug in HDMI and USB.

I had a Casio that was multi-color, because I thought it was cooler. The display was nice; the functionality sucked.

I had a Casio as well because, IIRC, it was the only thing the shop had. Eventually I had to also get a TI because it allowed using imaginary numbers in a matrix operation. Not that that was used in more than one course after all. But I grew to like it and even had an emulator for a long time on my first smart phone.

But yeah, the Casio was definitely more friendly and polished in UI, but dumber. You could only use "wizard" type things and pseudo-GUI clickies, while the TI was crude and text-heavy but let you enter just about anything anywhere and seemed more symbol- and language-oriented. Which one was nicer in use? I guess it would depend on how much of that language you could memorize. Or browse a cheat sheet for.


>Around the same time, Andrej Karpathy (OpenAI cofounder, former Tesla AI lead) told the No Priors podcast he was in a “state of psychosis” over AI agents. He said he hadn’t written a line of code since December. He described tasks that used to take a weekend now finishing in 30 minutes with zero human intervention. Karpathy is a literal genius and one of the most technically accomplished people in the industry. He built a WhatsApp bot called “Dobby the House Elf” to control his home systems (though that naming leans more towards genius than psychosis).

Ah yes, the same guy who said implementing lidar with cameras is hard (like Kalman filters aren't a thing). The same guy who spoke positively about Musk's engineering talents AFTER he went crazy. That genius...

Basically, I feel like if you are suffering from this kind of psychosis, it's because your talent is measured by how much stuff you have memorized and how much of it you can type on a keyboard in a given timeframe. And now that LLMs are doing that for you, you feel worthless.

I remember when I first started learning Python, having been in Java/C++ land. It felt like a hack. You could just pip install stuff, import it, dynamically hack things around if you needed to, and make stuff work in much less time. I wrote tools that let me write other tools quicker. For example, back before you could ask LLMs to write code, you basically had to google stuff and search for examples. So one of the first things I wrote was essentially a web-page-to-API converter. Now I had a tool that programmatically let me pull content from the web, including things like code samples.

I then wrote a tool to search documentation and GitHub, pull things that were styled as code using my previous tool, and put them into OpenSearch, so when I had a question about something, I could search for a function in OpenSearch and see examples.

Etc., and so on.
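The core of the "pull things styled as code" tool described above can be sketched with just the standard library. This is a simplified reconstruction, not the author's actual code; fetching the page and the OpenSearch indexing step are omitted.

```python
# Sketch: extract <pre>/<code> blocks from an HTML docs page so the
# snippets can later be indexed and searched. Stdlib-only; network
# fetch and the OpenSearch upload are left out.
from html.parser import HTMLParser

class CodeExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_code = 0      # nesting depth inside <pre>/<code>
        self.snippets = []

    def handle_starttag(self, tag, attrs):
        if tag in ("pre", "code"):
            self.in_code += 1
            if self.in_code == 1:
                self.snippets.append("")  # start a new snippet

    def handle_endtag(self, tag):
        if tag in ("pre", "code") and self.in_code:
            self.in_code -= 1

    def handle_data(self, data):
        if self.in_code:
            self.snippets[-1] += data  # accumulate snippet text

page = "<p>docs</p><pre><code>print('hi')</code></pre>"
parser = CodeExtractor()
parser.feed(page)
samples = [s for s in parser.snippets if s.strip()]
```

Each entry in `samples` would then become one searchable document in the index.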

Agents these days have replaced a lot of the manual work. But complex tasks with decision making, repeat loops, and unknown unknowns are still something that agents can't reliably do. Anyone can put together a UI with agents very quickly. But if you leave a lot to the agents and don't specify how you want the code written, you are going to get boxed into code that will quickly degrade performance, introduce edge-case bugs, and so on. Sure, you can have LLMs fix all that, but doing it automatically is something nobody has done yet.

The real skill in the future is going to be writing agentic programs to work on features for you, instead of working on features yourself. You invest time up front to do this and spend minimal time maintaining it. Much in the same way that you invested time into writing OOP code with clean separation into packages and classes, and build systems with verification, all so that anyone can come in, write code, and have a safe way of testing and committing changes.


The problem is that there is no incentive, because if you take the average set of behaviors of a human, nowhere in that set is the willingness to go against the grain to do what is morally right versus what is currently "socially acceptable".

For example, take the stand against Tesla: when you go buy one right now, you really don't feel any sort of general animosity from people, even though it's morally not the right thing to do.


Can we stop pretending that it was Covid, and not the felon pedophile and his cronies in charge of the country? You can see on the plot that the shit started in 2016.


Sure, because the world was just great before 2016. The orange idiot is just the culmination of decades of decline, not a random blip in American history.


I mean, in a roundabout way you are right in your second sentence, except it wasn't decades of decline; on the contrary, it was a decade of positive growth. The world was pretty good prior to 2016. By all accounts, the economy was doing well, tech was happening, cool things were being done.

Most of the actual important issues were solved or on the way to being solved, so people slowly started to make the trivial problems seem way grander than they are. Hedonic adaptation is part of human nature, and the cycle has been seen in history many times in many civilizations.

Meanwhile, ironically, in societies where there is significant hardship every day, whether it's going out and farming, having to work harder for your meal at home, or dealing with adverse weather, you tend to see way more inclusion and cohesion between humans, because they never get a chance to get accustomed to a good life.


I agree but like most tragedies, it wasn't the event, it was the reaction. Trump did very little in his 1st term (especially in comparison to now), yet extremist/politically addicted people lost their minds constantly. It was their radicalization and increased extremism that caused most of the harm. And as most of their real life social circle pulled back from their extremism they got deeper into their social media bubble. And they still haven't come back and I don't expect them to for some time.


Trump might've been "subdued" in his 1st term, but social media was already at its breaking point even before he sat in the White House the first time. Remember the cesspool that was /r/TheDonald for example, the 4chan psyop factory, the pepe the frog memes, Steve Bannon, etc.

Trump is a product of the idiocy of the American electorate. He's also a product of the forces that have worked for many, many years to have a guy like him run the country. Trump is what you eventually get after the Reagans, the Nixons, the George Wallaces have sown the seeds.


What plot? All the plots in the article either (1) show the change for the worse happening in 2020 or later or (2) are explicitly comparing "before 2020" with "after 2020".

(I do agree that Mr Trump is a shockingly bad president in oh so many ways. But the malaise being described here doesn't seem to have started in 2016. Not every bad thing is his fault.)


Trump is a terrible president and person. Unfortunately, Trump derangement syndrome is also a real thing. We are a country full of fools of one persuasion or another.


[flagged]


Is this comment supposed to make me think Trump derangement syndrome is a fake diagnosis? Because all I've gathered is that you seem utterly broken by the presence of this man.


[flagged]


Personal attacks are against the site guidelines. If that's how you are going to talk to people, don't do it here.


Are you an admin? If so, just ban me bro, spare me the pity party.

People like you are part of the problem. Look at my post history: I actually have plenty of comments on technical matters. Meanwhile, the person I'm replying to is all politics shitposting. So get off your high horse, my dude.


> If they pull it off, and it's looking like they will,

I really wonder about this psychological effect where non-technical people champion people like Musk so hard without any basis for doing so. Is it some sort of wanting to belong to some ideology that makes you just make shit up in your head about how Starship is a success, despite many indicators of it clearly being a stupid idea born from Musk's ketamine episodes?

For the record, Starship's engines are the equivalent of taking a Toyota Corolla and making it run on nitrous continuously, on the verge of self-destructing. You may be able to do technology demonstrations here and there, but making it work reliably for actual missions is much, much harder.


"Starship's engines", also known as Raptors, are among the most extensively tested rocket engines in the world.

More than 600 of them (across all variants) have been produced and tested on McGregor test stands and in flight, with relatively few explosions, and for some of the test stand explosions, we do not know whether they were deliberate overload tests.

Raptor is about the least problematic of all the crucial Starship components. They had a lot more problems with ullage and the gorilla in the room is the heat shield. It must be very reliable and at the same time quick to check and fix. The time to fix the heat shield will be the critical component of the total turnaround time.

Personally, I don't believe in 1 hour turnaround. 1-2 days just might be plausible.


Is that why you see several engines not functioning during flight tests?

The issue is that the controllers have to maintain a crazy unstable balance of the mixture to keep these things running, due to the dual preburner cycle and cryogenic storage, such that any unforeseen circumstance can lead to failure.

In the days of Falcon, when SpaceX was attracting people willing to disrupt the industry, I would have been in the "it's possible" camp. Nowadays, with everything going on, I would place a large chunk of money on them never getting it reliable enough. They haven't even gotten it to orbit yet, despite the massive experience they have.


The rockets were designed to be tolerant of a couple of the engines not functioning.

This is actually good engineering, as perfect reliability is very expensive.
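The redundancy argument can be put in numbers with a simple binomial model. The 1% per-engine failure rate is an assumed illustrative figure, not a real Raptor statistic, and the independence assumption is a simplification.

```python
# Rough illustration of engine-out tolerance: with 33 engines and
# tolerance for 2 failures, the vehicle is lost only if 3+ engines
# fail. Assumes independent failures and a made-up per-engine rate.
from math import comb

def p_vehicle_loss(n: int, tolerated: int, p: float) -> float:
    # P(more than `tolerated` of n engines fail)
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(tolerated + 1, n + 1))

single = 0.01                        # assumed 1% per-engine failure rate
loss = p_vehicle_loss(33, 2, single)  # roughly 0.4% under these numbers
```

Under these assumed numbers, the vehicle-loss probability comes out lower than a single engine's failure rate, which is the point: modest tolerance beats chasing near-perfect per-engine reliability.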


>This is actually good engineering,

Lol.

First it's "engines are reliable", then it's "well, actually they are not, but Starship can function with a few out".

The Elon simps never cease to amaze me.

An engine should not go out by itself. Some catastrophic event can take out an engine, and if you design your rocket to fly with a few out, that's fine. But if the engines are so unstable that they fizzle out on their own, that's a huge risk, because it means any unplanned event can cause more engines to fizzle out, leading to loss of vehicle.


By far the most common reason for engines to go out is not having enough fuel, which is mostly caused by faults in ullage, which I mentioned in my GP comment as a significant problem. Once you've spent most of your fuel, relighting the engines is not easy.

Ullage is a plumbing matter, though. External to the engine itself and its intrinsic (un)reliability.

On the ascent burn, where fuel is plentiful, Raptor flameouts have become way less frequent over time. The # of engines failing on the Super Heavy during the ascent burn is 0/33, 1/33 and 0/33 during the last three IFTs. Not bad for a test vehicle.


The full-flow cycle of the Raptors means they do two preburns, one oxidizer-rich and one fuel-rich, to spin the turbines that generate the required pressure ratio. If a small part of this fails for whatever reason, or is out of sync, the engine either doesn't start or goes kaboom.

Ascent isn't the problem; it's the relighting of them on the way down.


I agree, relighting is tricky, not just because of the limitations you describe, but because there is a lot less cumulative experience with relighting engines than with simply lighting them at launch. We (as the entire humanity) have seen tens of thousands of rocket motors lit on ascent, orders of magnitude more than those relit.

Raptors don't seem to be prone to kabooms during flight, probably because the command unit switches them off in case of any suspicion.


The Saturn V's had a big problem with "pogoing" which they never solved. The Soviet moon rocket apparently was abandoned because of unstable coupling among the grid of engines that powered it.

SpaceX must have a handle on these problems, I haven't heard anything about it.


> An engine should not go out by itself.

Airliners have twin engines and are designed to be able to fly on one. Why? Because engines fail now and then. Every flight-critical part on an airliner has a redundant workaround, sometimes two.

Redundancy is a cost-effective way to successfully deal with unreliable parts.

When I go dirt bike riding in the wilderness, I only ride with a buddy. I wear protective gear. I carry a phone, water, and a minimal survival kit. I also text a photo of the trail map to a friend before hitting the trail.

Small things, but potential life savers.


Yeah, and if you look at all the jet engines produced, the number of failures is extremely tiny compared to total flight hours.

Not the same for raptor engines.


The ratio was different in the late 50s when jet engines were new. Experience counts.

Bringing together the money and people to make this stuff happen is the basis. That’s the most impressive part. Debatably the only truly impressive part.

There’s no ideology. You can watch a really big rocket take off every month or two and watch a smaller rocket take off every couple days. I’m sure there are better designs out there… on drawing boards.


It's not a video game where you put enough resources into "science" and stuff just works.

There are fundamentals at play that Musk certainly doesn't understand, and it's ridiculous to think that he would be smart enough to account for them.


> There are fundamentals at play that Musk certainly doesn't understand

Examples?


Musk thinks that if you crash things enough until it works the first time, the problem is solved. Which is fine for something like Falcon, but as loads get bigger and heavier, you start running into performance margins (for example, the Raptor engines are required to generate just enough thrust to lift it). And just because it works the first time doesn't mean there is enough margin in the system to survive an unforeseen external event that those narrow margins can't account for.


Musk's technology ventures have been incredibly successful. I wouldn't bet against him. In fact, I've bet on him.


Pre-2018 Musk, before he destroyed his mind with ketamine? Sure. Apart from a few very autistic events, I would have also bet on him.

But now, you wanna tell me the person who thought the Cybertruck was a good product is going to solve the very complex problem of a reusable heavy launch vehicle? Much less go to Mars? OK.

