CVE-2025-32444: vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.6.5 and prior to 0.8.5, when using the Mooncake integration, are vulnerable to remote code execution via pickle-based serialization over unsecured ZeroMQ sockets.
Summary
vLLM (a system for running AI models efficiently) versions 0.6.5 through 0.8.4 contain a critical vulnerability in the Mooncake integration. Attackers can execute arbitrary code remotely because the integration uses pickle (an unsafe method of converting data into a transmittable format) over unencrypted ZeroMQ sockets (communication channels) that listen on all network interfaces, making them potentially reachable from the internet.
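To illustrate the root cause, here is a minimal sketch (not vLLM's or Mooncake's actual code) of why unpickling bytes received from an untrusted peer hands the sender code execution: pickle invokes an object's __reduce__ method during deserialization, so a crafted payload can run an arbitrary command on the receiver.

```python
# Minimal sketch of pickle's deserialization hazard. This is an
# illustration of the vulnerability class, not vLLM's actual code.
import pickle


class Malicious:
    def __reduce__(self):
        import os
        # On unpickling, pickle calls os.system with this argument,
        # i.e. the sender's chosen command runs on the receiver.
        return (os.system, ("echo arbitrary code executed",))


payload = pickle.dumps(Malicious())

# A receiver that does this with bytes read off the network -- as the
# vulnerable integration did over an unencrypted ZeroMQ socket bound
# to all interfaces -- executes the attacker's command:
pickle.loads(payload)  # prints "arbitrary code executed"
```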
Solution / Mitigation
Update to vLLM version 0.8.5 or later, which has patched this vulnerability.
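As a quick check, a short script can compare the installed package version against the patched release. This sketch assumes vLLM was installed via pip and that the packaging library is available (it ships with most Python environments); it is a convenience check, not an official tool.

```python
# Hedged sketch: verify the installed vLLM version is at or above the
# patched release (0.8.5). Assumes a pip-installed vllm package.
from importlib.metadata import version

from packaging.version import Version

installed = Version(version("vllm"))
if installed < Version("0.8.5"):
    print(f"vLLM {installed} is affected by CVE-2025-32444; upgrade to >= 0.8.5")
else:
    print(f"vLLM {installed} includes the fix for CVE-2025-32444")
```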
Vulnerability Details
CVSS score: 10.0 (Critical)
EPSS: 2.5%
Original source: https://nvd.nist.gov/vuln/detail/CVE-2025-32444
First tracked: February 15, 2026 at 08:44 PM