
The prompt injection thing is especially nasty for agents because they process untrusted input (web pages, emails, documents) and can take real actions. With a chatbot, prompt injection makes it say something dumb. With an agent that acts as you, a malicious payload hidden in an email could make it exfiltrate your contacts, reply on your behalf, whatever. You can't fix this in the model alone — you need an enforcement layer outside the model that limits what it can actually do regardless of what it thinks it should do. I'd bet Apple is working on exactly this and it's why they're taking their time.
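
To make the idea concrete, here's a minimal sketch of such an enforcement layer, assuming a simple tool-dispatch loop. Every name in it (ToolCall, POLICY, run_tool) is invented for illustration, not any real agent framework:

    # Sketch: a policy gate that sits between the model and its tools.
    # The policy lives outside the model, so no injected prompt can change it.
    from dataclasses import dataclass, field

    @dataclass
    class ToolCall:
        tool: str                      # e.g. "send_email", "read_page"
        args: dict = field(default_factory=dict)  # args proposed by the model

    # Hypothetical per-tool rules, defined by the app, not the model.
    POLICY = {
        "read_page":       {"allowed": True,  "needs_confirmation": False},
        "send_email":      {"allowed": True,  "needs_confirmation": True},
        "export_contacts": {"allowed": False, "needs_confirmation": True},
    }

    def run_tool(call: ToolCall) -> str:
        # Stub standing in for the real tool implementations.
        return f"ok: ran {call.tool}"

    def execute(call: ToolCall, confirm) -> str:
        # Every tool call the model proposes passes through this gate,
        # regardless of what the model "thinks" it should do.
        rule = POLICY.get(call.tool)
        if rule is None or not rule["allowed"]:
            return f"blocked: {call.tool} is not permitted"
        if rule["needs_confirmation"] and not confirm(call):
            return f"blocked: user declined {call.tool}"
        return run_tool(call)

    # Even if an injected payload convinces the model to exfiltrate
    # contacts, the gate refuses without ever consulting the model.
    print(execute(ToolCall("export_contacts"), confirm=lambda c: True))
    # -> blocked: export_contacts is not permitted

The point is that the gate only sees structured tool calls, never the untrusted text, so the injected payload has no channel to widen its own permissions.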