Component-Specific Prompt Tuning for Deepfake Detection
Summary
Deepfake technology can create fake facial images that are hard to distinguish from real ones, posing risks to privacy and security. This paper proposes a new detection method using Visual Language Models (VLMs, AI systems that understand both images and text) combined with component-specific prompt tuning (customizing input instructions to focus on specific facial parts like eyes and nose). The approach recasts deepfake detection as a Visual Question Answering task, in which the model answers a question about whether a face is real or forged. A Q-Former module (a feature extraction component guided by instructions) helps the model identify forgery traces in local facial features. The authors report better accuracy than existing methods.
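To make the Visual Question Answering framing concrete, here is a minimal sketch of how per-component questions might be posed and combined. It is not the paper's implementation: the question wording, the `mouth` entry, the `vlm` callable, and the "any component" aggregation rule are all assumptions made for illustration, and the Q-Former and the prompt tuning itself are not modeled.

```python
from typing import Callable, Dict, List

# Hypothetical component list. The summary names eyes and nose as examples;
# "mouth" is added here only for illustration.
COMPONENTS: List[str] = ["eyes", "nose", "mouth"]


def build_question(component: str) -> str:
    """Phrase detection as a question about one facial component (assumed wording)."""
    return f"Is the {component} region of this face real or fake?"


def parse_answer(answer: str) -> bool:
    """Return True if the model's text answer begins with 'fake'."""
    return answer.strip().lower().startswith("fake")


def detect(image: object, vlm: Callable[[object, str], str]) -> Dict[str, bool]:
    """Ask one question per component and collect per-component verdicts.

    `vlm` is a stand-in for any model that maps (image, question) to text.
    """
    return {c: parse_answer(vlm(image, build_question(c))) for c in COMPONENTS}


def is_deepfake(verdicts: Dict[str, bool]) -> bool:
    """Assumed aggregation rule: flag the image if any component looks forged."""
    return any(verdicts.values())
```

With a stub in place of a real model, `detect(image, vlm)` returns a dictionary of per-component verdicts, which `is_deepfake` reduces to a single decision. Keeping the verdicts separate until the last step reflects the summary's emphasis on forgery traces in local facial features.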
Classification
Original source: http://ieeexplore.ieee.org/document/11456731
First tracked: April 2, 2026 at 08:03 PM
Classified by LLM (prompt v3) · confidence: 85%