Which would be more than 0% concerning if I'd ever heard, even once, an example of this happening with a query that shouldn't actually trigger it. Every example I've seen is either a query that should trigger it, or one so close that the false positive is understandable and the legitimate use case is of incredibly niche value anyway.
OP gave an example of reverse engineering, something that to the LLM looks identical to plain hacking. I am totally fine if the tiny fraction of people who want to reverse engineer their own systems can't use LLMs to do it, if in exchange top LLMs aren't helpful to the hordes of actual malicious actors who would love a superintelligence to aid their crimes.
No-brainer tradeoff, just like 100% of examples I've ever heard.
> Replace "suicide" with whatever the "AI Safety" obsession word is today
I don't know what those queries are, but the original OP made one and got a "strike", which is what spawned this thread.