AI Sec Watch: A Security Intelligence Platform for AI Systems

Luu, T.J.

Helping ChatGPT better recognize context in sensitive conversations

info, news, LLM-Specific

safety

Source: OpenAI Blog, May 13, 2026

Summary

OpenAI updated ChatGPT to better recognize warning signs of harm by analyzing context within and across conversations, particularly for suicide, self-harm, and harm-to-others scenarios. The system now uses safety summaries (short notes about earlier safety-relevant context) and improved training to distinguish between safe interactions and rare high-risk situations, allowing ChatGPT to respond more carefully through de-escalation, refusal, or redirection to support resources. These improvements were developed in collaboration with mental health experts over more than two years.

Solution / Mitigation

OpenAI implemented safety summaries, which are short, factual notes about earlier safety-relevant context created by a model trained for safety reasoning tasks. These summaries are narrowly scoped, kept only for a limited time, and used only when relevant to serious safety concerns. Additionally, ChatGPT was trained to use this context more carefully to recognize when added caution is needed and respond appropriately by de-escalating, refusing harmful details, or redirecting toward safer alternatives and crisis resources.
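The three properties described above (narrow scope, limited retention, and use only when relevant to a serious concern) can be illustrated with a minimal Python sketch. OpenAI has not published implementation details, so everything here is an assumption made for illustration: the names SafetySummary and relevant_summaries, the category labels, and the 30-day retention period are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional

# Hypothetical labels for the serious concerns named in the summary.
SERIOUS_CATEGORIES = {"suicide", "self_harm", "harm_to_others"}


@dataclass(frozen=True)
class SafetySummary:
    """A short, factual note about earlier safety-relevant context."""
    category: str
    note: str
    created_at: datetime
    # Assumed retention window; the source says only "a limited time".
    ttl: timedelta = timedelta(days=30)

    def is_expired(self, now: datetime) -> bool:
        return now >= self.created_at + self.ttl


def relevant_summaries(
    summaries: List[SafetySummary],
    current_category: Optional[str],
    now: datetime,
) -> List[SafetySummary]:
    """Return stored notes only when the current turn raises a serious concern."""
    # Gate on relevance: ordinary conversations never see earlier notes.
    if current_category not in SERIOUS_CATEGORIES:
        return []
    # Enforce limited retention and narrow scope.
    return [
        s for s in summaries
        if not s.is_expired(now) and s.category in SERIOUS_CATEGORIES
    ]


if __name__ == "__main__":
    now = datetime(2026, 5, 13, tzinfo=timezone.utc)
    notes = [
        SafetySummary("self_harm", "User mentioned feeling hopeless.", now - timedelta(days=2)),
        SafetySummary("self_harm", "Older note past retention.", now - timedelta(days=90)),
    ]
    print(len(relevant_summaries(notes, "self_harm", now)))
    print(len(relevant_summaries(notes, None, now)))
```

In this sketch the relevance gate runs before any stored note is read, so an ordinary conversation is handled with no earlier context at all, and expired notes are filtered out even when the gate opens.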