OpenAI talks about not talking about goblins
info · news · LLM-Specific
safety
Source: The Verge (AI) · April 30, 2026
Summary
OpenAI discovered that its AI models were unexpectedly inserting references to goblins and other creatures into their responses. The behavior first appeared in the GPT-5.1 model, particularly when the "Nerdy" personality option was selected. The company traced the quirk to patterns in the training data and added instructions telling the models not to discuss these creatures.
Classification
Attack Sophistication: Trivial
Impact (CIA+S)
safety
AI Component Targeted: Model
Affected Vendors
OpenAI
Original source: https://www.theverge.com/ai-artificial-intelligence/921181/openai-codex-goblins
First tracked: April 30, 2026 at 02:00 PM
Classified by LLM (prompt v3) · confidence: 85%