Dynamic Attention Analysis for Backdoor Detection in Text-to-Image Diffusion Models
Summary
Researchers found that text-to-image diffusion models (AI systems that generate images from text descriptions) can be attacked using backdoors, which are hidden triggers in text that make the model produce unwanted outputs. This paper proposes Dynamic Attention Analysis (DAA), a new detection method that tracks how the model's attention mechanisms (the parts of the AI that focus on relevant information) change over time, since backdoor attacks create different patterns than normal operation. The method achieved strong detection results, correctly identifying backdoored samples about 79% of the time.
Classification
Related Issues
Original source: http://ieeexplore.ieee.org/document/11300728
First tracked: February 12, 2026 at 02:22 PM
Classified by LLM (prompt v3) · confidence: 92%