Helping developers build safer AI experiences for teens
Summary
A new set of prompt-based safety policies has been released to help developers protect teenagers using AI systems. Designed to work with gpt-oss-safeguard (an open-weight safety model that detects harmful content), the policies address common teen-specific risks such as graphic violence, sexual content, and dangerous challenges by converting safety goals into clear, operational rules that developers can apply consistently across their systems.
Solution / Mitigation
The source offers these prompt-based safety policies as the solution itself. According to the text, developers can use the policies directly with gpt-oss-safeguard and other reasoning models, both for real-time content filtering and for offline analysis. Because the policies are "structured as prompts that can be directly used," "developers can more easily integrate them into existing workflows, adapt them to their use cases, and iterate over time." The initial release covers six categories: graphic violent content, graphic sexual content, harmful body ideals and behaviors, dangerous activities and challenges, romantic or violent roleplay, and age-restricted goods and services.
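As a rough illustration of the workflow described above, the sketch below shows how a developer might pair one of the policy prompts with user content for classification. The placeholder policy text, the VIOLATION/NO_VIOLATION label format, and the helper names are assumptions for illustration, not the published policies' actual wording; in practice you would paste a released policy prompt in verbatim and send the messages to a locally served gpt-oss-safeguard model.

```python
# Hypothetical placeholder -- the real policy prompt would be pasted here
# verbatim from the published teen-safety policy set (assumption).
POLICY_DANGEROUS_CHALLENGES = """\
[paste the 'dangerous activities and challenges' policy prompt here]
Return exactly one label: VIOLATION or NO_VIOLATION.
"""


def build_messages(policy: str, content: str) -> list[dict]:
    """Pair a safety policy (system role) with the content to classify."""
    return [
        {"role": "system", "content": policy},
        {"role": "user", "content": content},
    ]


def parse_verdict(model_output: str) -> bool:
    """True if the model's reply flags the content as a policy violation."""
    return "VIOLATION" in model_output and "NO_VIOLATION" not in model_output


# These messages would then be sent to gpt-oss-safeguard via whatever
# OpenAI-compatible chat endpoint serves it locally (assumption), and the
# reply passed through parse_verdict() to drive a filtering decision.
messages = build_messages(POLICY_DANGEROUS_CHALLENGES, "example user post")
```

Keeping the policy text separate from the plumbing, as above, is what makes the "adapt them to their use cases, and iterate over time" point practical: swapping in a revised policy prompt requires no code changes.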
Classification
Affected Vendors
Related Issues
Original source: https://openai.com/index/teen-safety-policies-gpt-oss-safeguard
First tracked: March 24, 2026 at 08:00 PM
Classified by LLM (prompt v3) · confidence: 85%