Are Large Vision-Language Models Robust to Adversarial Visual Transformations?
Summary
Large vision-language models (LVLMs, AI systems that understand both images and text) can be attacked with simple visual transformations, such as rotations or color changes, that fool them into giving wrong answers. The researchers found that composing multiple adversarial transformations makes these attacks more effective, and that the transformation parameters can be tuned with gradient approximation (a mathematical technique for estimating which parameter changes hurt the model most). The work highlights a previously overlooked safety risk: LVLMs are not reliably robust to these kinds of adversarial attacks (deliberate attempts to trick AI systems).
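To make the attack idea concrete, below is a minimal, hypothetical sketch (not the paper's code): it composes a few parametric transformations (rotation, brightness, contrast) and tunes their parameters by gradient ascent, where the gradient is approximated with finite differences over black-box queries. The `model_loss` function is an assumed placeholder for whatever score of the victim LVLM an attacker would maximize.

```python
import numpy as np
from PIL import Image, ImageEnhance


def apply_transforms(img: Image.Image, params: np.ndarray) -> Image.Image:
    """Compose simple transformations; params = [rotation_degrees, brightness, contrast]."""
    angle, brightness, contrast = params
    out = img.rotate(float(angle), resample=Image.BILINEAR)
    out = ImageEnhance.Brightness(out).enhance(float(brightness))
    out = ImageEnhance.Contrast(out).enhance(float(contrast))
    return out


def model_loss(img: Image.Image) -> float:
    """Placeholder (assumption): query the victim LVLM and return a loss to maximize,
    e.g. the negative log-probability of the correct answer."""
    raise NotImplementedError


def estimate_gradient(img: Image.Image, params: np.ndarray, eps: float = 1e-2) -> np.ndarray:
    """Approximate the gradient of the loss w.r.t. the transformation parameters
    with central finite differences (one query pair per parameter)."""
    grad = np.zeros_like(params)
    for i in range(len(params)):
        up, down = params.copy(), params.copy()
        up[i] += eps
        down[i] -= eps
        grad[i] = (model_loss(apply_transforms(img, up))
                   - model_loss(apply_transforms(img, down))) / (2 * eps)
    return grad


def attack(img: Image.Image, steps: int = 50, lr: float = 0.5):
    """Gradient *ascent* on the approximated gradient to find damaging parameters."""
    params = np.array([0.0, 1.0, 1.0])  # start from the identity transformation
    for _ in range(steps):
        params += lr * estimate_gradient(img, params)
    return apply_transforms(img, params), params
```

In this sketch the only attack knob is a three-dimensional parameter vector, which is what makes a query-based gradient approximation cheap; the paper's actual transformation set and optimizer may differ.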
Classification
Related Issues
Original source: http://ieeexplore.ieee.org/document/11421907
First tracked: March 16, 2026 at 04:14 PM
Classified by LLM (prompt v3) · confidence: 92%