AI Sec Watch: A Security Intelligence Platform for AI Systems

Luu, T.J.

Complementary Text-Guided Attention for Zero-Shot Adversarial Robustness

info, research, Peer-Reviewed

research, safety

Source: IEEE Xplore (Security & AI Journals)

March 2, 2026

Summary

CLIP and similar vision-language models (AI systems trained on paired images and text to understand both) are vulnerable to adversarial examples (carefully crafted image modifications designed to fool AI systems). The researchers propose two methods, TGA-ZSR and Comp-TGA, that use text-guided attention (the model's focus on image regions based on text descriptions) to make these models more robust, reporting accuracy improvements of 9.58% and 11.95%, respectively, when tested on adversarial examples.
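
To make the idea of text-guided attention concrete, here is a minimal sketch in plain Python. It is not the paper's TGA-ZSR or Comp-TGA implementation. It assumes that an image is represented as a list of per-patch embedding vectors, that the text description is a single embedding vector in the same space, and that attention is the softmax of the cosine similarity between each patch and the text. The function name `text_guided_attention` and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def text_guided_attention(patch_feats, text_feat):
    """Return one attention weight per image patch, guided by a text embedding.

    Assumption (not from the source): attention is the softmax over the
    cosine similarity between each patch embedding and the text embedding.
    All vectors are assumed to be non-zero and of equal length.
    """

    def normalise(vec):
        # Scale a vector to unit length so dot products become cosine similarity.
        norm = math.sqrt(sum(x * x for x in vec))
        return [x / norm for x in vec]

    text = normalise(text_feat)
    sims = [
        sum(p * t for p, t in zip(normalise(patch), text))
        for patch in patch_feats
    ]

    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(sims)
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: four 3-dimensional "patch" embeddings and one "text" embedding.
patches = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]
text = [0.0, 2.0, 0.0]

attn = text_guided_attention(patches, text)
most_attended = attn.index(max(attn))
```

In this toy input the second patch points in the same direction as the text embedding, so it receives the largest weight. An adversarial perturbation can shift such an attention map away from the regions the text describes, which is the kind of effect an attention-based defence would aim to limit.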