2xRTX5080 would be awesome. You'd only be able to run a q6, which is already pretty good, but more importantly you'd be able to use P2P and run Blackwell at full speed, which I can't.
I don't know, I've been using Mythos this week quite sceptically and I found it to be incredibly dumb. For instance, I gave it a dialogue between 3 people and it was constantly mixing up who said what to whom, which looked like early Gemini behaviour. But the latest Opus does that too. It would also make nonsensical inferences about the papers I gave it and only correct itself when I pointed out what it had got wrong. If that is what the US government fears... maybe the fear is that someone follows the dumb things the model suggests.
It feels like it's mostly just tuned to up its level of capability on long horizon tasks - stop context rot and keep persisting at all costs until a goal is done.
The base intelligence does not feel much greater to me.
This is a ridiculous thing to test on it. Other models are trained on that kind of thing, use those instead.
Fable was designed for _really_ hard software engineering problems. Possibly large, but especially hard. For those tasks, you feel the difference immediately.
No it wasn't, Fable is a general purpose model for use in regular chat, analysis, as well as coding.
And yes, the parent poster is accurate, Fable is just as prone to moronic mistakes as Opus was. Stop being so AI-pilled.
Codex is still a better model, and yes, for the hardest engineering problems. I use Claude for UI/GUIs and Codex for all my backend, because I have 20 years of experience of actual hard engineering, and I can see that Codex writes cleaner code and is far more steerable.
Bad engineers think Claude is better because it writes more lines of code and is more "proactive", but more lines of code don't make a better system.
> Fable is a general purpose model for use in regular chat, analysis, as well as coding
This is a forum filled with experts. Putting marketing aside, in a forum like this, it is most useful to assess models according to the toughest problems in the domain they were specifically refined on. For DeepSeek, that's math. For Claude, that's programming. Gemini and ChatGPT are generalist. Yes, you can use every model for anything you like. But Fable is a bit special, it's very expensive, and very clearly designed for particular types of tasks.
> Fable is just as prone to moronic mistakes as Opus was.
"Just as" is up for debate, but yes, all models are capable of moronic mistakes. That's not helpful information though.
> Codex is still a better model
You're comparing agentic workflows, which rely on a lot more than just the underlying model. It sounds like you're using it like a precision instrument, which is great! It's very different from my use cases though, and from the ones that Fable seems to excel at. I'm using it for scientific computing, and you really, really want it to one-shot a solution. It's either the right algorithm for the task, or the wrong one. So for the hardest problems, it needs to successfully implement a solution in effectively one shot. I use Codex too, but it's often too careless for the delicate tasks. If it gets it wrong, it is really hard to steer it back. You have to start from scratch.
> Bad engineers think Claude is better because it writes more lines of code and is more "proactive".
I think you missed the mark on this one. I'm not really an engineer, but I have as much experience as you do in my job. A solution to my problems comprises few lines of code. Fable actually gets it right, first time, every time (so far), but this is with a very long prompt and a bunch of attachments. No other model has done this for me. Not shilling for Anthropic, just impressed. This isn't particularly subjective for me; it is quantitatively measurable.
Don't assume everyone using AI is going to have the same experience you have, or the same types of use cases. And please don't assume that because others have different experiences that it makes them "bad".
Also, Claude has always been mediocre at creative tasks. For your line of work, I would have already recommended Codex hands down.
I tested it on that too. A problem I usually give a model as a test is to optimise an already well-optimised function that performs certain calculations. I give it a reference to the CPU instruction set, how instructions can be paired to take advantage of the superscalar execution pipeline, etc. In that test too it fell on its face, producing code that was demonstrably slower and had an extra bug.
Interesting, thanks for sharing. That is something I would have expected it to do well on, unless it tripped the internal rerouting. My experience on computational geometry problems has been universally positive (virtually flawless), and falling back to Opus has been a huge and frustrating step back. Opus has been frequently making errors and regressions; Fable never made a single one.
Kimi works great in their CLI, but their CLI has a number of workarounds for quirks of their models, including detecting when the model gets into a loop, and reverting to a checkpoint but letting the model compose a "message" to its past self (search their CLI for "BackToTheFuture"...) It doesn't work so well in a harness that doesn't take those quirks into account.
Composer is really good, but just like any Chinese model it needs a good plan. It's cheap and fast; in 1 month of Pro I used the equivalent of $500 in API credit for it.
I had this phone when it was released. I really loved it. But the thing I remember most was using it as a fidget toy. Just opening and closing it. So satisfying.
Security by ineffective obscurity is worthless but it’s clearly a continuum and not a buzzword that wins the conversation.
For example, if I had a 128-bit port number that I randomly rotated my service on, you’d be hard pressed to find my service unless I told you the port. That is still obscurity, but clearly closer to a password. IPv4 addresses and 16-bit port numbers are not, because that is a relatively small space compared with the resources needed to map it out quickly (i.e. equivalent to a weak password, and also not suitable for public-facing services that need that connection). And obviously relying on this kind of stuff exclusively isn’t wise, but it is valuable as an additional barrier an attacker has to overcome, and it raises the cost of the attack.
I’ll put the Anarchist Cookbook out there [1] as an example, a book even the original author changed his mind on. Without easy recipes, doing all the things in that book requires you to work to gain that knowledge, and that process shapes you into someone who understands and appreciates the consequences of that knowledge and that it’s wise to be careful who you share it with. As it is, there are reasonable links between the book and all kinds of mass violence that was more easily perpetrated. Would those people still have been violent? Possibly? Would there have been as much damage? Possibly less.
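To put rough numbers on that continuum, here is a back-of-the-envelope sketch in Python. It assumes a scanner that probes each candidate value once at a fixed rate and, on average, finds the target after covering half the space. The rate of one million probes per second is a made-up figure for illustration, the 128-bit "port" is the hypothetical from the comment above (real TCP/UDP ports are 16 bits), and rotation of the port during the scan is ignored.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def expected_probes(bits: int) -> int:
    """Average number of probes to hit one uniformly random value in a
    2**bits space when scanning without repeats (about half the space)."""
    return 2 ** (bits - 1)


def expected_seconds(bits: int, probes_per_second: float) -> float:
    """Expected scan time in seconds at a fixed probe rate."""
    return expected_probes(bits) / probes_per_second


# Assumed scanner speed; purely illustrative, not a measured figure.
RATE = 1_000_000

if __name__ == "__main__":
    for bits in (16, 128):
        secs = expected_seconds(bits, RATE)
        years = secs / SECONDS_PER_YEAR
        print(f"{bits}-bit space: ~{secs:.3g} s (~{years:.3g} years)")
```

Under these assumptions the 16-bit space falls in well under a second, while the 128-bit space is far out of reach of any scan, which is the "weak password" versus "closer to a password" distinction.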
So if China attacks Taiwan and NATO intervenes, how will Canada ensure that BYD does not remotely brick the charging infrastructure or make cars suddenly speed up and crash into oncoming traffic?
NATO cannot be called in to defend Taiwan. NATO article 6 makes this perfectly clear:
"For the purpose of Article 5, an armed attack on one or more of the Parties is deemed to include an armed attack: on the territory of any of the Parties in Europe or North America... [or] on the islands under the jurisdiction of any of the Parties in the North Atlantic area north of the Tropic of Cancer..."
The US may invoke ANZUS or treaties with Japan and SK.
If China attacks the US directly, such as attacks on US soil, that might change, but it is highly unlikely that NATO would ever get directly involved.
Same question, but for Tesla and the proposed US invasion of Canada.
(one of those things which the POTUS says that we're all told shouldn't be taken as serious or real, as if that wasn't a massive disqualification for him)
We already have the US threatening to invade for absolutely no reason while the American people stand by and keep arguing about the scandal of the day. I think that ship has sailed for Canadians. The US is now a very unreliable business partner, nothing else.
We can get fucked on both sides but will do business with those that don't want to destroy our economy.
It withholds it from good actors (they cannot use it to harden their code against bad actors) and assumes bad actors don't have access to such tools anyway.