NAP-Tuning: Neural Augmented Prompt Tuning for Adversarially Robust Vision-Language Models
Summary
Vision-Language Models (VLMs, AI systems that understand images and text together) such as CLIP are powerful but vulnerable to adversarial attacks (malicious inputs, especially images, crafted to fool AI systems). This research presents NAP-Tuning, a method that combines learnable text prompts with lightweight neural modules called TokenRefiners, which clean up adversarially distorted features inside the model's intermediate layers. The result is greater robustness to such attacks while preserving performance on clean inputs.
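The TokenRefiner idea can be illustrated with a minimal sketch: a small residual MLP applied to each token's feature vector inside a layer, nudging perturbed features back toward clean ones. All names, shapes, and the residual-MLP design here are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class TokenRefiner:
    """Hypothetical sketch: a residual bottleneck MLP that refines
    per-token features (shape and design are assumptions)."""
    def __init__(self, dim, hidden, rng):
        self.w1 = rng.normal(0.0, 0.02, (dim, hidden))
        self.w2 = rng.normal(0.0, 0.02, (hidden, dim))

    def __call__(self, tokens):
        # tokens: (num_tokens, dim). The residual connection keeps the
        # refined output close to the incoming representation.
        return tokens + relu(tokens @ self.w1) @ self.w2

dim, n_tokens = 16, 8
refiner = TokenRefiner(dim, hidden=4, rng=rng)

clean = rng.normal(size=(n_tokens, dim))
# Stand-in for an adversarial perturbation of the token features.
perturbed = clean + 0.1 * rng.normal(size=(n_tokens, dim))
refined = refiner(perturbed)

print(refined.shape)
```

In NAP-Tuning, modules like this would be inserted per layer and trained (alongside the learnable prompts) so the refined features of attacked inputs match those of clean inputs; the frozen backbone itself is left untouched.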
Classification
Affected Vendors
Related Issues
CVE-2022-29200: TensorFlow is an open source platform for machine learning. Prior to versions 2.9.0, 2.8.1, 2.7.2, and 2.6.4, the implem…
CVE-2025-33254: NVIDIA Triton Inference Server contains a vulnerability where an attacker may cause internal state corruption. A success…
Original source: http://ieeexplore.ieee.org/document/11368741
First tracked: May 7, 2026 at 08:03 PM
Classified by LLM (prompt v3) · confidence: 85%