CVE-2025-46560: vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.8.0 and prior to 0.8.5 are affected.
Summary
vLLM (a system for running large language models efficiently) versions 0.8.0 through 0.8.4 contain a performance vulnerability in how multimodal input (text, images, audio) is preprocessed. The affected code replaces placeholder tokens (special markers such as <|audio_|> that get expanded into runs of repeated tokens) using an algorithm with quadratic time complexity, meaning processing time grows with the square of the input size rather than linearly. An attacker can exploit this by sending specially crafted inputs that tie up the server, causing it to hang or become unresponsive (denial of service).
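To illustrate the class of bug (this is a generic sketch, not vLLM's actual code; the `PLACEHOLDER` token id and function names are hypothetical), compare a placeholder-expansion routine that rebuilds the whole token list at each match against a single-pass rewrite. The first is quadratic in the number of placeholders; the second is linear.

```python
PLACEHOLDER = -1  # hypothetical token id marking a multimodal placeholder

def expand_quadratic(tokens, expansion):
    """Naive expansion: each placeholder hit copies the entire list.

    With n tokens that are mostly placeholders, this does O(n) work
    per hit, O(n^2) overall -- the shape of bug behind CVE-2025-46560.
    """
    out = list(tokens)
    i = 0
    while i < len(out):
        if out[i] == PLACEHOLDER:
            # Slicing + concatenation rebuilds the whole list each time.
            out = out[:i] + list(expansion) + out[i + 1:]
            i += len(expansion)
        else:
            i += 1
    return out

def expand_linear(tokens, expansion):
    """Single pass with append-only output: O(n + total expansion)."""
    out = []
    for tok in tokens:
        if tok == PLACEHOLDER:
            out.extend(expansion)
        else:
            out.append(tok)
    return out
```

Both functions produce the same output; the difference only shows up as CPU time on adversarially large inputs, which is why the flaw is exploitable as a denial of service rather than a correctness bug.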
Solution / Mitigation
This issue has been patched in version 0.8.5.
Vulnerability Details
CVSS: 6.5 (Medium)
EPSS: 0.6%
Related Issues
CVE-2022-29200: TensorFlow is an open source platform for machine learning. Prior to versions 2.9.0, 2.8.1, 2.7.2, and 2.6.4, the implem
CVE-2021-29541: TensorFlow is an end-to-end open source platform for machine learning. An attacker can trigger a dereference of a null p
Original source: https://nvd.nist.gov/vuln/detail/CVE-2025-46560
First tracked: February 15, 2026 at 08:44 PM
Classified by LLM (prompt v3) · confidence: 95%