What happened after 2,000 people tried to hack my AI assistant
Summary
A researcher ran a public challenge in which 2,000 people tried to hack an AI assistant by emailing it prompt injection attacks (text crafted to make an AI ignore its safety rules and reveal secrets). Across 6,000 total attempts, nobody managed to leak the system's secrets, which suggests that modern AI models are becoming more resistant to these attacks through better training.
Original source: https://simonwillison.net/2026/Jun/26/hack-my-ai-assistant/#atom-everything
First tracked: June 26, 2026 at 08:00 PM
Classified by LLM (prompt v3) · confidence: 85%