
The post's framing is not great imo. A good injection doesn't just command that the rules be broken anymore. Most of the ones I've seen either try to slip a request through innocuously or present a scenario where it would be natural to ignore the rules. As we speak, countless people are letting strangers tailgate them into office buildings because they look like they belong or they're wearing a hi-vis vest. Those people were all given very explicit instructions not to do that. The LLM has it much harder too, being very stupid, easy to replay and experiment with, and viewing the world through the tiny, context-less peephole lens of a text stream.

