Trigger Without Trace: Toward Stealthy Backdoor Attack on Text-to-Image Diffusion Models
Summary
Researchers have developed a backdoor attack method called Trigger without Trace (TwT) that covertly compromises text-to-image diffusion models (AI systems that generate images from text descriptions) while evading detection. The method uses syntactic structures (grammar patterns) as hidden triggers. It also applies Kernel Maximum Mean Discrepancy (KMMD), a kernel-based measure of the distance between two statistical distributions, to make malicious samples closely match legitimate ones. The attack achieves a 97.5% success rate while bypassing three existing detection-based defenses.
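To make the distribution-matching idea concrete, the following is a minimal sketch of a generic kernel MMD estimate between two sets of feature vectors. It is not the paper's implementation: the function names `rbf_kernel` and `mmd_squared`, the Gaussian (RBF) kernel, the `bandwidth` value, and the use of plain NumPy arrays as "features" are all assumptions made for illustration, and the summary does not state which model features TwT actually compares.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between rows of a and rows of b.

    The kernel choice and bandwidth are illustrative assumptions.
    """
    # Pairwise squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD between samples x and y.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)]
    A value near zero suggests the two sample sets are hard to tell apart
    under this kernel.
    """
    k_xx = rbf_kernel(x, x, bandwidth).mean()
    k_yy = rbf_kernel(y, y, bandwidth).mean()
    k_xy = rbf_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    benign = rng.normal(size=(200, 4))             # hypothetical benign features
    matched = rng.normal(size=(200, 4))            # same distribution as benign
    shifted = rng.normal(loc=2.0, size=(200, 4))   # clearly different distribution
    print("benign vs matched:", mmd_squared(benign, matched))
    print("benign vs shifted:", mmd_squared(benign, shifted))
```

In this sketch, samples drawn from the same distribution give a much smaller value than samples from a shifted one. An attack that drives such a statistic toward zero between poisoned and clean samples would, under these assumptions, leave less of a statistical signature for a detector to find.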
Original source: http://ieeexplore.ieee.org/document/11527385
First tracked: June 4, 2026 at 08:03 PM