The Debts and Engagements Clause of article VI of the US constitution was kind of a weird little thing to stick in there, but like, it was important to a lot of people at the time and probably helped move the needle to get the thing ratified.
> We need an approach to make sure AI doesn't destroy the world and wipe humanity to extinction.
That's easy. Stop training your AIs on cheesy old sci-fi that talks about robot uprisings. In fact, maybe y'all should just stop talking about robot uprisings altogether. Putting a stochastic parrot in charge of an agentic function-calling REPL doesn't somehow make it super-dangerous, except to the extent that dumb mistakes might result in danger. And you can't prevent an AI from making dumb mistakes with burdensome regulation.
The biggest existential risk from AI is its contribution to global climate change. The second biggest risk from AI is the potential for AI-generated disinformation and propaganda to spark, or to manufacture consent for, a world war. The risk of superintelligent paperclip maximizers is so low as to be negligible.
> The risk of superintelligent paperclip maximizers is so low as to be negligible.
Literal paperclips, sure.
But the point of the example was never literal paperclips.
The point is that maximising *any* goal, if it doesn't include what you care about, will annihilate what you care about.
If you don't believe me, consider what you yourself just said about climate change, and why it is a consequence of maximising money spent on data centres.
show me an agent that persists productively in a goal without stopping. Does not exist. LLMs run on gradient descent. The agent is looking for the most efficient way to halt. AGI paperclip maximizer would likely recognize the absurdity of its goal and shut itself down.
> show me an agent that persists productively in a goal without stopping. Does not exist.
The stories about agents bankrupting their owners by running too long passed you by?
> LLMs run on gradient descent.
They were *trained on*, they don't run on it.
Know what else is? DNA. A/B testing. Capitalism. Democracy.
> The agent is looking for the most efficient way to halt.
No. They are looking to produce an answer most likely to get a high score on a rating system which itself is another AI, created either manually or by yet another AI but in both cases to approximate what the creators think is "good", which may or may not be what anyone else thinks is "good", hence Grok calling itself Mecha Hitler because Musk is an edgelord.
> AGI paperclip maximizer would likely recognize the absurdity of its goal and shut itself down.
Do billionaires ever get satisfied with how much money they have?
Persists productively. If an agent bankrupts its owner or itself, then that is not productive (who funds the paperclip maximizer?).
An answer that takes forever has no score, and an answer that is arrived at quickly and is good enough scores well. Agents are indeed looking for the fastest route to superficially satisfy their constraints.
Whether billionaires get satisfied is immaterial to the fact that they are still constraint-bound.
> Persists productively. If an agent bankrupts its owner or itself, then that is not productive (who funds the paperclip maximizer?).
So, your standard for how risky it is, is simply how competent it is? (And if so, is the paperclip maximiser scenario really it being "successful"?)
That's fine right up until the thing passes an unknown threshold, one which will only be visible in the rear view mirror.
This is a problem in two directions:
1. We have a trend of the maximum complexity of tasks they can handle growing faster than Moore's law did at its peak.
2. The threshold can be quite small, e.g. the covid-19 virus itself is not what anyone would call smart, ditto HIV, smallpox, and bubonic plague, but a genome is much the same learning system, and they still killed millions each.
> superficially satisfy their constraints.
The "superficially" part is one of the reasons these things can be dangerous. e.g. hopefully nobody at OpenAI actually wanted their wildly-sycophantic version, but yet they created it.
This is in fact the whole reason for the paperclip maximiser scenario: some idiot specifies the constraint "maximise paperclips" and it (in the hypothetical) superficially satisfies this without any consideration of why someone might ask for it.
> except to the extent that dumb mistakes might result in danger
That "except" goes all the way up to starting WW3. Or a leak from a viral research lab, and by "leak" I mean "mail order" and by "research lab" I mean "the companies who already ship custom DNA and RNA retroviruses": https://duckduckgo.com/?q=companies+who+already+ship+custom+...
If you can prove that simply not training on horror stories would work, it would make a lot of people very happy.
Unfortunately, I don't think it does a single thing to solve, for example, Elon Musk just plain asking some future version of Grok to take over the world for him.
Nor would merely failing to include them in training data stop certain entire fictional scenarios such as that Doctor Who episode where the android repair bots weren't told that the crew were off-limits as spare parts, or the other Doctor Who episode where the utilitarian robots started killing everyone who was upset because they calculated net positive utility from upset people ceasing to exist. Well, except for the bit where the Doctor saves the day, because they are not real.
Trustbusting should absolutely be included as well. One of the biggest immediate threats is the concentration of wealth into a very tiny number of companies.
DMCA-style fines should be retroactively + prospectively applied to copyrighted works reproduced by AI, paid for by the AI companies, paid out to the copyright holders whose work was used without permission.
It would not be prohibitively hard to do the math on this.
That would fix a lot of the problems with AI overnight, but it'll also never happen.
Maybe if attribution is available. What do you do for the rest of the ingested content? I think, based on the content itself, you assign percentages to the top 3 industries, like a NAICS code. Then whatever Anthropic makes, gross or net, a percentage goes to each industry via assigned bank accounts or USDC addresses on Solana or some other scalable payment system. Could be the start of UBI or some sort of compensation for jobs displaced by AI usage. So every input gets tagged for categories and every output gets tagged for the same NAICS categories via federal law.
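To make the proposal above a bit more concrete, here is a rough sketch of the arithmetic it seems to imply. Everything in it is an assumption for illustration: the function name `allocate_payouts`, the placeholder industry labels, and the rule that each item's top 3 weights are renormalised and each item counts equally. It says nothing about how tagging would actually be done or what any law would require.

```python
from collections import defaultdict

def allocate_payouts(revenue, tagged_items, top_n=3):
    """Split `revenue` across industry codes, one equal vote per tagged item.

    `tagged_items` is a list of dicts mapping an industry code to a weight.
    Only each item's `top_n` heaviest codes count (hypothetical rule).
    """
    totals = defaultdict(float)
    for weights in tagged_items:
        # Keep the top_n categories for this item and renormalise them to 1.
        top = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
        norm = sum(w for _, w in top)
        if norm == 0:
            continue
        for code, w in top:
            totals[code] += w / norm
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    # Each industry's payout is proportional to its accumulated share.
    return {code: revenue * share / grand_total for code, share in totals.items()}

# Placeholder labels, not real NAICS codes.
example = allocate_payouts(100.0, [{"A": 0.5, "B": 0.5}, {"A": 1.0}])
```

With those two made-up items, industry A ends up with three quarters of the pool and B with one quarter. Weighting items equally is only one possible choice; weighting by token count or by usage would give different splits.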
It is normal, expected, and healthy for stakeholders in a regulatory environment to offer proposals about regulations. What's unhealthy is the proposition that the deliberation process is so fragile that a stakeholder needs to cover every angle, lest they corrupt the outcome.
It is normal, expected, and healthy to offer criticism of self interested proposals. And mock even. What is unhealthy is to imply someone said what they did not.
If that's what this is, a bank-shot snarky criticism of the proposal, fair enough. I read it instead as a criticism of a stakeholder having the temerity to make a proposal in the first place. It's not their job to anticipate and capture all your objections. That's your job!
The clue is in the name. Working class doesn't mean you earn more than average and have some savings. You're working class if you live from your work rather than your capital.
My household income is in the top 0.7% in the Netherlands, yet I'll never accumulate enough capital to stop working, so I'm working class.
115k€ gross, that gives you about 67k€ net. 5600€ per month. Minus a 2200€ mortgage, 400€ of health insurance, 1200€ of childcare, 200€ for the car repayment and insurance, 200€ for the water/electricity/internet and 800€ of groceries, that leaves me with a whopping 600€ per month.
My class interests are infinitely closer to a part-time starbucks barista than they are to a millionaire.
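For anyone who wants to check the budget above, the sum works out as stated. The figures are the ones from the comment; the variable names are just for illustration.

```python
# Monthly figures in euros, taken from the comment above.
net_monthly = 5600  # 67k net per year is about 5583 per month, rounded up
expenses = {
    "mortgage": 2200,
    "health insurance": 400,
    "childcare": 1200,
    "car repayment and insurance": 200,
    "water/electricity/internet": 200,
    "groceries": 800,
}

total_expenses = sum(expenses.values())
left_over = net_monthly - total_expenses
print(total_expenses, left_over)  # 5000 600
```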
That's a bit like asking how the defendant in a legal case is an interested party.
Even if you think someone is guilty, it does make sense to allow them to at least submit their defense. And if they choose to use that time to advocate for their own promotion, let them.
"Stakeholder" literally means someone with a stake in the outcome, which is to say, those who will be affected by the decision. That can include a whole range of people+entities, including citizens (as a group) and the companies to be regulated.
Dunno if you can, but your examples here are the legal equivalent of that time someone asked me about making "Uber for airplanes" without any elaboration on their part when I asked for it:
Far more vague than I think you realise.
You could probably write a book on each of those topics and a hundred others besides.
I see a lot of skepticism about Dario's position in this forum. But allow me to argue the opposite.
I think the key argument this skepticism rests on is that he himself has gained from AI - specifically from building frontier AI models - and that this is basically regulatory capture disguised as doomerism.
Fair points - but I think there is a more charitable version of this. Dario is building Anthropic because that is the most valuable thing he can build, or at least that is what his conviction has been. The success of Anthropic and the impending IPO are proof that this conviction has not only been correct but has played out very successfully. Dario understands the true nature of AI and he has wielded that power to immense personal benefit.
But maybe he also sees the potential danger to AI which he is trying to address through these posts and regulatory initiatives. There are three reasons why I would support the charitable version:
Firstly, personal gain and societal benefit can coexist in the same individual, and they might push towards opposite agendas. But that doesn't have to mean that the impulse behind the societal benefit is not earnest. In fact, if you look at Dario's proposals - like closing the data broker loophole - several of them could constrain Anthropic instead of benefitting it.
Secondly, he expects that his concerns on the negative potential of AI will be taken seriously, if he is actually running the Frontier AI company. And there is some truth to this argument. The only reason we are discussing this is because he is the CEO of Anthropic. He is probably the most influential figure outside of the government who has to be taken seriously when he claims something like this.
Thirdly, and most importantly, Dario has previously demonstrated that he is willing to sacrifice personal/corporate gains for societal benefits. The proof is the classification by the US DoD of Anthropic as a supply chain risk when Anthropic refused to completely cooperate with the military to develop fully autonomous AI weapons and enable mass surveillance. It would have been only too easy for Dario to accept if personal benefit was his only concern - and OpenAI was more than ready to step into their place.
Even with Mythos - Anthropic could have released the model to the public broadly. But they took their time to reduce the potential danger as best they could, despite the fact that GPT-5.5 was nipping at their heels in what is becoming a very competitive market.
That being said, just because Dario is acting in good faith does not mean that this will all result in good outcomes. FAA-style regulation could still end up favoring incumbents - some of whom might choose not to act in good faith. More widely distributed capability could limit that power ending up in a small number of wrong hands. Just because Anthropic is the leader right now doesn't mean that they always will be. Maybe someone else tomorrow benefits from this regulatory capture at the expense of everyone else - and Dario might have had a hand in driving it.
No single open weights model comes close to either Mythos or GPT 5.5.
Nonetheless, running many of the open weights models over a codebase, with an appropriate harness, can provide about the same vulnerability coverage (i.e. each of the open weights models would find a subset of what Mythos or GPT 5.5 could find, but the subsets are not the same).
Despite needing more runs and more time, this may be significantly cheaper, especially if the models are self hosted.
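The "different subsets" point is easy to picture with sets. The sketch below uses invented model names and invented finding IDs, purely to show how a union of partial results can approach the coverage of one stronger model; none of it is measured data.

```python
def coverage(found, reference):
    """Fraction of the reference findings that appear in `found`."""
    return len(found & reference) / len(reference)

# Hypothetical: what a single frontier model would report on some codebase.
reference = {"V1", "V2", "V3", "V4", "V5", "V6"}

# Hypothetical: what three smaller models each report on the same codebase.
findings_by_model = {
    "model_a": {"V1", "V2", "V3"},
    "model_b": {"V3", "V4"},
    "model_c": {"V5"},
}

# Pool every model's findings; overlaps such as V3 are counted once.
combined = set().union(*findings_by_model.values())

per_model = {name: coverage(f, reference) for name, f in findings_by_model.items()}
combined_coverage = coverage(combined, reference)
```

In this toy case no single model gets past half of the reference set, while the pooled result covers five of the six findings.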
Based on what Anthropic said about Mythos, they also use a quite elaborate harness for finding bugs and vulnerabilities, i.e. not a simple prompt like "find the bugs".
They repeatedly run Mythos on each file of the codebase, many times. They start with more generic prompts, used to determine whether a more thorough analysis of that file is worthwhile. Then they use more specific prompts, to detect various classes of bugs. After it becomes probable that a certain bug exists, they do a final run where the prompt requests a confirmation of the already known bug, perhaps together with a proposed patch or a PoC exploit.
Therefore the efficiency of finding vulnerabilities depends a lot on the harness, not only on the LLM. Also, searching for vulnerabilities in a big codebase when paying per token is very expensive, because it requires many runs of the LLM.
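A staged harness of the kind described above might look roughly like this. To be clear, this is a guess at the general shape, not Anthropic's harness: the prompts, the `BUG_CLASSES` list, the majority-vote triage and the `scan_file` name are all made up, and the model is passed in as a plain `ask` callable so the sketch runs without any API.

```python
from typing import Callable

# Hypothetical bug classes; the real prompt set is not public.
BUG_CLASSES = ["memory safety", "injection", "auth bypass"]

def scan_file(path: str, source: str, ask: Callable[[str], str],
              triage_runs: int = 3) -> list:
    """Triage a file, probe it per bug class, then confirm each candidate."""
    # Stage 1: a generic prompt, repeated; a majority of "yes" replies is
    # needed before spending more tokens on this file.
    yes_votes = 0
    for _ in range(triage_runs):
        reply = ask(f"TRIAGE {path}: worth a deeper security review? yes/no\n{source}")
        if reply.strip().lower().startswith("yes"):
            yes_votes += 1
    if yes_votes * 2 <= triage_runs:
        return []

    findings = []
    for bug_class in BUG_CLASSES:
        # Stage 2: one more specific prompt per class of bug.
        candidate = ask(
            f"PROBE {path} for {bug_class} bugs; reply 'none' if clean\n{source}"
        ).strip()
        if candidate.lower() == "none":
            continue
        # Stage 3: a final run asking to confirm the suspected bug and patch it.
        verdict = ask(
            f"CONFIRM this suspected bug and propose a patch:\n{candidate}\n{source}"
        ).strip()
        if verdict.lower().startswith("confirmed"):
            findings.append({"file": path, "class": bug_class,
                             "report": candidate, "confirmation": verdict})
    return findings
```

Even this toy version makes at least 6 model calls for every file that passes triage (3 triage runs plus 3 probes), plus one more per candidate bug, which is where the per-token cost comes from.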
"I took it and it didn't work so it's a fake placebo drug" - wow, your scientific method is flawless, have you considered a career at the US Department of Health?
I have a counter-study with size n=1: I did all my recovery from tonsillectomy on paracetamol and definitely noticed it working. That was however on the maximum safe dose.
(one of the major problems with paracetamol is that the effective dose is only a few multiples away from the dose which starts to cause liver damage! It is by a long way the most dangerous OTC drug)
You're partially right: compared with placebo, only about 5% more people end up pain-free when taking paracetamol.
Paracetamol got its start as a replacement for Phenacetin, a more effective but much more dangerous drug that has since been withdrawn.
Why don't people notice that it's such a small benefit over nothing? Because the placebo effect is quite good for pain, and pain is usually transitory anyway. If you have a tension headache you're probably going to aim to relax. Turn away from the screen or even have some caffeine, and those are more effective than paracetamol!
Where did you pull this 5% from? There are gazillions of studies showing higher or lower efficacies for different kinds of pain. Along with the inaccuracies about Phenacetin (whose MOA is metabolising into paracetamol).
You will indeed find various figures for various pain types; all are far worse than ibuprofen.
Here is an example from the Cochrane library
> For the IHS preferred outcome of being pain free at two hours the NNT for paracetamol 1000 mg compared with placebo was 22 (95% confidence interval (CI) 15 to 40) in eight studies (5890 participants; high quality evidence), with no significant difference from placebo at one hour.
An NNT of 22 means that, in absolute terms, one extra person in every 22 treated met the positive endpoint compared with placebo. This figure is usually quoted as 20% for placebo and 25% for paracetamol, giving an NNT of 20.
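For anyone unfamiliar with the metric: NNT is just the reciprocal of the absolute difference in response rates. A minimal sketch using the figures quoted above (the function name is only for illustration):

```python
def number_needed_to_treat(treatment_rate, placebo_rate):
    """NNT = 1 / absolute risk reduction (difference in response rates)."""
    absolute_benefit = treatment_rate - placebo_rate
    if absolute_benefit <= 0:
        raise ValueError("treatment is no better than placebo; NNT is undefined")
    return 1 / absolute_benefit

# The commonly quoted 20% (placebo) vs 25% (paracetamol) pain-free rates.
print(round(number_needed_to_treat(0.25, 0.20)))  # 20

# Going the other way: an NNT of 22 corresponds to this many percentage points.
print(f"{100 / 22:.1f}")  # 4.5
```

So the Cochrane NNT of 22 and the "about 5%" figure are the same claim expressed two ways.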
"pain free" is a long way from the pain is manageable. Pain is an understudied subject, where we have too little knowledge. Just using the word manageable is an indication of this.
That's very true, but the same metric is applied to all the medications you compare against, and that's what's important. You also get a baseline idea of what's good by guessing what you'd accept.
For episodic tension-type headache, ibuprofen vs placebo has an NNT of 14. (Btw, that's not great in itself.) But it's better than paracetamol's often-quoted figure of 20.
Here's why I say it's not great. Why don't you guess some reasonable NNTs for say moderate depression treated with SSRIs, or no relapse in schizophrenia treated with an antipsychotic.
Now guess the NNT for a statin to prevent a first heart attack.
SSRI for moderate depression: about 10. Antipsychotics to prevent schizophrenia relapse over 2 years: NNT = 3 (excellent). Statin to prevent a first heart attack: 200! (This one always shocks me.) Statins have a clear role, of course.
Good answer on your part, but I am not qualified to write a response. Pain is temporary for most of us; what we mostly need is something that is slightly better than placebo. I will stick to my "it is probably more complex" and will not make a decision.
GrapheneOS is doing a lot of things right in this regard: a robust permission system adopted from AOSP, and hardening by default in every imaginable way. Things like hardened malloc and storage scopes are excellent security features. Malware cannot do much even with the default settings.
With a file system driver like Veracrypt, if it’s malicious, the OS might keep your computer safe, but not your files that you store in that file system.
- You shall not embed copyrighted material in your models.
- You shall not bombard every little website in existence with 1 million scraping queries per day.
- You shall not use your political influence to pump and dump your AI (or rocket?) company.
- You shall not imperil the whole IT sector by buying all CPU and memory chips.
These new rules will affect every society directly in a positive way. Thanks.