Helping developers build safer AI experiences for teens
Summary
A new set of prompt-based safety policies has been released to help developers protect teenagers using AI systems. Designed to work with gpt-oss-safeguard (an open-weight safety model that detects harmful content), the policies address common teen-specific risks such as graphic violence, sexual content, and dangerous challenges by converting safety goals into clear, operational rules that developers can apply consistently across their systems.
Solution / Mitigation
The source offers these prompt-based safety policies as the solution itself. According to the text, developers can use the policies directly with gpt-oss-safeguard and other reasoning models, both for real-time content filtering and for offline analysis. Because the policies are "structured as prompts that can be directly used," "developers can more easily integrate them into existing workflows, adapt them to their use cases, and iterate over time." The initial release covers six categories: graphic violent content, graphic sexual content, harmful body ideals and behaviors, dangerous activities and challenges, romantic or violent roleplay, and age-restricted goods and services.
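As a rough illustration of the workflow described above, the sketch below shows how a developer might pair one of the policy prompts with user content for classification. The placeholder policy text, the VIOLATION/NO_VIOLATION label format, and the helper names are assumptions for illustration, not the published policies' actual wording; in practice you would paste a released policy prompt in verbatim and send the messages to a locally served gpt-oss-safeguard model.

```python
# Hypothetical placeholder -- the real policy prompt would be pasted here
# verbatim from the published teen-safety policy set (assumption).
POLICY_DANGEROUS_CHALLENGES = """\
[paste the 'dangerous activities and challenges' policy prompt here]
Return exactly one label: VIOLATION or NO_VIOLATION.
"""


def build_messages(policy: str, content: str) -> list[dict]:
    """Pair a safety policy (system role) with the content to classify."""
    return [
        {"role": "system", "content": policy},
        {"role": "user", "content": content},
    ]


def parse_verdict(model_output: str) -> bool:
    """True if the model's reply flags the content as a policy violation."""
    return "VIOLATION" in model_output and "NO_VIOLATION" not in model_output


# These messages would then be sent to gpt-oss-safeguard via whatever
# OpenAI-compatible chat endpoint serves it locally (assumption), and the
# reply passed through parse_verdict() to drive a filtering decision.
messages = build_messages(POLICY_DANGEROUS_CHALLENGES, "example user post")
```

Keeping the policy text separate from the plumbing, as above, is what makes the "adapt them to their use cases, and iterate over time" point practical: swapping in a revised policy prompt requires no code changes.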
Classification
Affected Vendors
Related Issues
Original source: https://openai.com/index/teen-safety-policies-gpt-oss-safeguard
First tracked: March 24, 2026 at 08:00 PM
Classified by LLM (prompt v3) · confidence: 85%