CVE-2025-46560: vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.8.0 and prior to 0.8.5 are affected.
Summary
vLLM (a system for running large language models efficiently) versions 0.8.0 through 0.8.4 contain a performance vulnerability in how multimodal input (text, images, audio) is preprocessed. The affected code replaces placeholder tokens (special markers such as <|audio_|> that get expanded into runs of repeated tokens) using an algorithm with quadratic time complexity, meaning processing time grows with the square of the input size rather than linearly. An attacker can exploit this by sending specially crafted inputs that tie up the server, causing it to hang or become unresponsive (denial of service).
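To illustrate the class of bug (this is a generic sketch, not vLLM's actual code; the `PLACEHOLDER` token id and function names are hypothetical), compare a placeholder-expansion routine that rebuilds the whole token list at each match against a single-pass rewrite. The first is quadratic in the number of placeholders; the second is linear.

```python
PLACEHOLDER = -1  # hypothetical token id marking a multimodal placeholder

def expand_quadratic(tokens, expansion):
    """Naive expansion: each placeholder hit copies the entire list.

    With n tokens that are mostly placeholders, this does O(n) work
    per hit, O(n^2) overall -- the shape of bug behind CVE-2025-46560.
    """
    out = list(tokens)
    i = 0
    while i < len(out):
        if out[i] == PLACEHOLDER:
            # Slicing + concatenation rebuilds the whole list each time.
            out = out[:i] + list(expansion) + out[i + 1:]
            i += len(expansion)
        else:
            i += 1
    return out

def expand_linear(tokens, expansion):
    """Single pass with append-only output: O(n + total expansion)."""
    out = []
    for tok in tokens:
        if tok == PLACEHOLDER:
            out.extend(expansion)
        else:
            out.append(tok)
    return out
```

Both functions produce the same output; the difference only shows up as CPU time on adversarially large inputs, which is why the flaw is exploitable as a denial of service rather than a correctness bug.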
Solution / Mitigation
This issue has been patched in version 0.8.5.
Vulnerability Details
CVSS: 6.5 (Medium)
EPSS: 0.6%
Related Issues
CVE-2022-29200: TensorFlow is an open source platform for machine learning. Prior to versions 2.9.0, 2.8.1, 2.7.2, and 2.6.4, the implem
CVE-2021-29541: TensorFlow is an end-to-end open source platform for machine learning. An attacker can trigger a dereference of a null p
Original source: https://nvd.nist.gov/vuln/detail/CVE-2025-46560
First tracked: February 15, 2026 at 08:44 PM
Classified by LLM (prompt v3) · confidence: 95%