AI Sec Watch: A Security Intelligence Platform for AI Systems

Luu, T.J.

Designing AI agents to resist prompt injection

info, news, LLM-Specific

security, safety

Source: OpenAI Blog, March 11, 2026

Summary

AI agents that browse the web and take actions are vulnerable to prompt injection: instructions hidden in external content that manipulate the AI into taking unintended actions. These attacks increasingly rely on social engineering tactics rather than simple tricks. Because perfectly detecting malicious inputs is as hard as detecting lies, the most effective defense is to design AI systems with built-in limits on what agents can do, much as human customer service agents are restricted so that the damage stays limited if they are manipulated.
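To make the idea of built-in limits concrete, here is a minimal Python sketch of a policy check that runs outside the model, before any agent action is executed. It is an illustration only and is not taken from the OpenAI post: the names `ActionPolicy`, `PolicyViolation`, `authorize`, the action names, and the refund cap are all hypothetical.

```python
from dataclasses import dataclass


class PolicyViolation(Exception):
    """Raised when a requested agent action falls outside the policy."""


@dataclass(frozen=True)
class ActionPolicy:
    # Hypothetical policy fields, chosen for illustration.
    allowed_actions: frozenset      # actions the agent may ever take
    needs_confirmation: frozenset   # actions that require explicit user approval
    max_refund: float               # hard cap, analogous to a support agent's limit


def authorize(policy, action, args, user_confirmed=False):
    """Return True if the action is permitted, else raise PolicyViolation.

    The check never inspects the text the agent read, so an injected
    instruction cannot talk its way past these limits.
    """
    if action not in policy.allowed_actions:
        raise PolicyViolation(f"action not allowed: {action}")
    if action in policy.needs_confirmation and not user_confirmed:
        raise PolicyViolation(f"user confirmation required: {action}")
    if action == "issue_refund" and float(args.get("amount", 0)) > policy.max_refund:
        raise PolicyViolation("refund exceeds policy cap")
    return True


# Example policy (assumed values): reading is free, refunds are capped and confirmed.
EXAMPLE_POLICY = ActionPolicy(
    allowed_actions=frozenset({"read_page", "issue_refund"}),
    needs_confirmation=frozenset({"issue_refund"}),
    max_refund=50.0,
)
```

With this policy, `authorize(EXAMPLE_POLICY, "read_page", {})` succeeds, while a request to `send_email` or an unconfirmed or oversized refund raises `PolicyViolation`, regardless of how persuasive the injected content was. The design choice is that the limit bounds the damage even when detection fails.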