Security vulnerabilities, privacy incidents, safety concerns, and policy updates affecting LLMs and AI agents.
vLLM (a system for running and serving large language models) versions 0.8.0 up to, but not including, 0.9.0 have a vulnerability where the /v1/chat/completions API endpoint does not properly validate the 'pattern' and 'type' fields when the tools feature is used. A single malformed request can crash the inference worker (the part that actually runs the model), which stays down until someone restarts it.
Fix: Update to version 0.9.0 or later, which fixes the issue.
NVD/CVE Database
CVE-2025-48943 is a Denial of Service vulnerability (a type of attack that crashes a system) in vLLM versions 0.8.0 through 0.8.x that causes the server to crash when given an invalid regex (a pattern used to match text). This happens specifically when using the structured output feature, which lets the AI format responses in a specific way.
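One general way to keep an invalid regex from crashing a server is to compile it inside a try/except block and reject the request on failure. The following is a minimal sketch, not vLLM's actual fix: `validate_pattern` and its length cap are hypothetical, and it uses Python's `re` dialect, which may differ from the dialect a structured-output backend accepts.

```python
import re


def validate_pattern(pattern: str, max_length: int = 1000) -> bool:
    """Return True if `pattern` compiles as a Python regex, False otherwise.

    Hypothetical helper: the length cap and the use of Python's `re`
    dialect are assumptions made for this sketch.
    """
    if len(pattern) > max_length:
        return False
    try:
        re.compile(pattern)
    except re.error:
        # Reject the request instead of letting the exception propagate.
        return False
    return True
```

A request handler could call this before handing the pattern to the model worker and return a client error when it yields `False`.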
vLLM (an inference and serving engine for large language models) versions 0.8.0 through 0.8.x have a vulnerability where sending an invalid JSON schema as a parameter to the /v1/completions API endpoint causes the server to crash. This happens because the application doesn't properly handle (catch) exceptions that occur when processing malformed input.
vLLM, a software system that runs and serves large language models, has a vulnerability in how it parses tool commands that can be exploited to crash or slow down the service. The problem comes from using an overly complex pattern-matching rule (regular expression with nested quantifiers, optional groups, and inner repetitions) that can cause the system to get stuck processing certain inputs, leading to severe performance problems.
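The effect of nested quantifiers can be illustrated with a toy pattern. This sketch does not reproduce vLLM's tool-parsing regex (the source does not give it); `RISKY`, `SAFER`, `MAX_INPUT_CHARS`, and `matches_safely` are illustrative names chosen here.

```python
import re

# Nested quantifiers: on a non-matching input such as "aaaa...b" the engine
# may try exponentially many ways to split the run of "a"s between the
# inner and outer repetition.
RISKY = re.compile(r"^(a+)+$")

# Accepts the same strings, but with a single quantifier there is only one
# way to match a run of "a"s, so a failed match is rejected in linear time.
SAFER = re.compile(r"^a+$")

MAX_INPUT_CHARS = 10_000  # assumed cap; pick a limit suited to the workload


def matches_safely(text: str) -> bool:
    """Length-check the input, then match with the non-nested pattern."""
    if len(text) > MAX_INPUT_CHARS:
        return False
    return SAFER.match(text) is not None
```

Rewriting the pattern removes the ambiguity that causes backtracking, while the length cap bounds the work even if some other pattern in the code path is still slow.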
Gradio is an open-source Python package for building machine learning demos and web applications. Before version 5.31.0, a vulnerability in its flagging feature let unauthenticated attackers copy any readable file from the server's filesystem, which could cause DoS (denial of service, where a system becomes unavailable) by copying massive files to fill up disk space, though attackers couldn't actually read the copied files.
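A common defence against copying or reading arbitrary files is to resolve the requested path and confirm it stays inside an allowed directory. The sketch below is a general illustration, not Gradio's patch. `resolve_within` is a hypothetical helper, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_within(base_dir: str, requested: str) -> Path:
    """Resolve `requested` under `base_dir`; raise if it escapes that directory."""
    base = Path(base_dir).resolve()
    # Joining with an absolute `requested` path discards `base`, and ".."
    # segments can climb out of it, so check the fully resolved result.
    target = (base / requested).resolve()
    if not target.is_relative_to(base):
        raise ValueError(f"path escapes {base}: {requested}")
    return target
```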
CVE-2025-48491 is a vulnerability in Project AI, a platform for creating AI agents, where a hardcoded API key (a secret credential stored directly in the code rather than kept separate) was exposed in versions before the pre-beta release. This means attackers could potentially find and misuse this key to access the system without proper authorization.
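The usual alternative to a hardcoded key is to read the credential from the environment (or a secrets manager) at startup. This is a minimal sketch; the variable name `PROJECT_AI_API_KEY` and the helper `load_api_key` are hypothetical and do not come from the Project AI codebase.

```python
import os


def load_api_key(env_var: str = "PROJECT_AI_API_KEY") -> str:
    """Read the API key from the environment instead of the source code."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail fast rather than falling back to a built-in default key.
        raise RuntimeError(f"environment variable {env_var} is not set")
    return key
```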
vLLM (a system for running large language models) versions 0.7.0 through 0.8.x have a bug in how they create hash values (fingerprints) for images. The hashing method only looks at the raw pixel data and ignores important image properties like width and height, so two different-sized images with the same pixels would create identical hash values. This can cause the system to incorrectly reuse cached results or expose data it shouldn't.
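The collision can be sketched with plain byte strings. The example below assumes SHA-256 and a simple `width x height` prefix for illustration only; the source does not say which hash function vLLM uses or how the fix encodes the image properties.

```python
import hashlib


def hash_pixels_only(pixels: bytes) -> str:
    """Hash only the raw pixel bytes (the flawed approach described above)."""
    return hashlib.sha256(pixels).hexdigest()


def hash_with_shape(pixels: bytes, width: int, height: int) -> str:
    """Mix the image dimensions into the hash along with the pixel bytes."""
    h = hashlib.sha256()
    h.update(f"{width}x{height}:".encode("ascii"))
    h.update(pixels)
    return h.hexdigest()


# The same 12 bytes could be a 4x3 or a 3x4 single-channel image.
pixels = bytes(range(12))
same = hash_pixels_only(pixels) == hash_pixels_only(pixels)
differ = hash_with_shape(pixels, 4, 3) != hash_with_shape(pixels, 3, 4)
```

With the pixel-only hash, both layouts map to one cache key. Including the shape gives each layout its own key.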
vLLM, an inference and serving engine for large language models, had a vulnerability in versions before 0.9.0 where timing differences in the PagedAttention mechanism (a feature that speeds up processing by reusing matching text chunks) were large enough that attackers could detect and exploit them. This type of attack is called a timing side-channel attack, where an attacker learns information by measuring how long operations take.
A vulnerability (CVE-2025-5320) was found in Gradio, a web framework for building AI demos, affecting versions up to 5.29.1. An attacker could manipulate the localhost_aliases parameter in the CORS Handler (the component that controls which websites can access the application) to gain elevated privileges, though executing this attack is difficult and requires remote access.
CVE-2025-5277 is a command injection vulnerability (a flaw where an attacker can trick a program into running unwanted commands) in aws-mcp-server, an MCP server (a software tool that helps AI systems interact with AWS cloud services). An attacker can craft a malicious prompt that, when accessed by an MCP client (a program that connects to the server), executes arbitrary commands on the host system, with a critical severity rating of 9.4.
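A typical mitigation for command injection is to avoid the shell entirely: split the request into an argument list, validate it, and run it without shell interpretation. The sketch below is not the aws-mcp-server patch; `ALLOWED_SERVICES` and `build_aws_argv` are hypothetical, and an allowlist like this does not by itself rule out dangerous arguments.

```python
import shlex

ALLOWED_SERVICES = {"s3", "ec2", "sts"}  # hypothetical allowlist


def build_aws_argv(command: str) -> list[str]:
    """Split a requested AWS CLI command into argv and validate it."""
    argv = shlex.split(command)
    if len(argv) < 2 or argv[0] != "aws":
        raise ValueError("only 'aws' commands are accepted")
    if argv[1] not in ALLOWED_SERVICES:
        raise ValueError(f"service not allowed: {argv[1]}")
    return argv
```

Passing the resulting list to `subprocess.run(argv, shell=False)` means characters such as `;` or `|` reach the `aws` program as literal arguments instead of being interpreted by a shell.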
vLLM versions 0.6.5 through 0.8.4 have a vulnerability when using `PyNcclPipe` (a tool for peer-to-peer communication between multiple computers running the AI model) with the V0 engine. The issue is that a network communication interface called `TCPStore` was listening on all network interfaces instead of just the private network interface specified by the `--kv-ip` parameter, potentially exposing the system to unauthorized access.
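The difference between listening everywhere and listening on one interface comes down to the address passed to `bind`. This sketch uses the standard `socket` module rather than `TCPStore`, and `open_listener` is a hypothetical helper, so it only illustrates the principle.

```python
import socket


def open_listener(bind_ip: str, port: int = 0) -> socket.socket:
    """Listen only on `bind_ip` rather than on every interface ("0.0.0.0")."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((bind_ip, port))  # port 0 lets the OS choose a free port
    sock.listen()
    return sock
```

Binding to a specific private address means hosts that can only reach the machine's other interfaces cannot connect to the port at all.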
Langroid, a Python framework for building AI applications, has a vulnerability in versions before 0.53.15 where the `LanceDocChatAgent` component uses pandas eval() (a function that executes Python code stored in strings) in an unsafe way, allowing attackers to run malicious commands on the host system. The vulnerability exists in the `compute_from_docs()` function, which processes user queries without proper protection.
Langroid, a Python framework for building LLM-powered applications, had a code injection vulnerability (CWE-94, a flaw where untrusted input can be executed as code) in its `TableChatAgent` component before version 0.53.15 because it used `pandas eval()` without proper safeguards. This could allow attackers to run arbitrary code if the application accepted untrusted user input.
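One way to reduce the risk of evaluating user-influenced expressions is to parse them first and allow only a small set of syntax nodes. The sketch below is not Langroid's sanitizer; `is_safe_expression` and the node allowlist are assumptions chosen for illustration, and a real deployment would still need to treat such evaluation as risky.

```python
import ast

# Node types permitted in an expression such as "price * qty > 100".
_ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.BoolOp, ast.Compare,
    ast.Name, ast.Load, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Mod, ast.USub,
    ast.And, ast.Or, ast.Not,
    ast.Eq, ast.NotEq, ast.Lt, ast.LtE, ast.Gt, ast.GtE,
)


def is_safe_expression(expr: str, allowed_names: set[str]) -> bool:
    """Return True only for arithmetic/comparison expressions over known names."""
    try:
        tree = ast.parse(expr, mode="eval")
    except (SyntaxError, ValueError):
        return False
    for node in ast.walk(tree):
        if not isinstance(node, _ALLOWED_NODES):
            return False  # rejects calls, attribute access, subscripts, etc.
        if isinstance(node, ast.Name) and node.id not in allowed_names:
            return False
    return True
```

Because function calls and attribute access are not on the allowlist, payloads such as `__import__('os').system(...)` are rejected before anything is evaluated.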
ChatGPT through March 30, 2025, renders SVG documents (scalable vector graphics, a type of image format) directly in web browsers instead of displaying them as plain text, which allows attackers to inject HTML (the code that structures web pages) and potentially trick users through phishing attacks.
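Displaying untrusted markup as text rather than rendering it generally means escaping it before it reaches the browser. A minimal sketch, assuming a hypothetical `render_as_text` helper (this is not how ChatGPT addresses the issue, which the source does not describe):

```python
import html


def render_as_text(untrusted_markup: str) -> str:
    """Escape markup so the browser shows it as text instead of rendering it."""
    return "<pre>" + html.escape(untrusted_markup) + "</pre>"
```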
A vulnerability in the `preprocess_string()` function of the huggingface/transformers library (version v4.48.3) allows a ReDoS attack (regular expression denial of service, where a poorly written pattern causes the computer to do exponential amounts of work). An attacker can send specially crafted input with many newline characters that makes the function use excessive CPU, potentially crashing the application.
CVE-2025-1975 is a vulnerability in Ollama server version 0.5.11 that allows an attacker to crash the server through a Denial of Service attack by sending specially crafted requests to the /api/pull endpoint (the function that downloads AI models). The vulnerability stems from improper validation of array index access (CWE-129, which means the program doesn't properly check if it's trying to access memory locations that don't exist), which happens when a malicious user customizes manifest content and spoofs a service.
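Validating an index before using it is the general remedy for CWE-129. The Python sketch below only illustrates the check; `get_checked` is a hypothetical helper and is unrelated to Ollama's code.

```python
def get_checked(items: list, index: int):
    """Return items[index] only when index is a valid, non-negative position."""
    if isinstance(index, bool) or not isinstance(index, int):
        raise TypeError("index must be an integer")
    # Python would silently accept negative indices, so check both bounds.
    if not 0 <= index < len(items):
        raise IndexError(f"index {index} out of range for {len(items)} items")
    return items[index]
```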
CVE-2025-4701 is a vulnerability in VITA-MLLM Freeze-Omni (versions up to 20250421) where improper input validation in the torch.load function of models/utils.py allows deserialization (converting stored data back into in-memory objects, which can run code embedded in the data) of untrusted data through a manipulated file path argument. This vulnerability has a CVSS score (a 0-10 rating of how severe a vulnerability is) of 4.8 (medium severity) and can be exploited locally by users with basic privileges.
CVE-2025-0649 is a bug in Google's TensorFlow Serving (a tool that runs machine learning models as a service) versions up to 2.18.0 where incorrect handling of JSON input can cause unbounded recursion (a program calling itself repeatedly without stopping), leading to server crashes. This vulnerability has a CVSS score (a 0-10 rating of how severe a vulnerability is) of 8.9, indicating high severity. The issue relates to out-of-bounds writes (writing data to unintended memory locations) and stack-based buffer overflow (overflowing a memory region meant for temporary data).
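Unbounded recursion on nested input is commonly avoided by measuring nesting depth iteratively and rejecting input that exceeds a limit before parsing. This sketch is not the TensorFlow Serving patch; `MAX_DEPTH`, `nesting_depth`, and `parse_bounded` are hypothetical names and the limit of 64 is an arbitrary assumption.

```python
import json

MAX_DEPTH = 64  # assumed limit; choose one suited to the expected payloads


def nesting_depth(text: str) -> int:
    """Maximum bracket nesting in a JSON text, ignoring brackets inside strings."""
    depth = max_depth = 0
    in_string = escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "[{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch in "]}":
            depth -= 1
    return max_depth


def parse_bounded(text: str):
    """Parse JSON only after an iterative (non-recursive) depth check."""
    if nesting_depth(text) > MAX_DEPTH:
        raise ValueError("JSON nesting too deep")
    return json.loads(text)
```

The depth scan uses a loop and two counters, so its own stack usage stays constant no matter how deeply the input is nested.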
CVE-2025-30165 is a vulnerability in vLLM (a system for running large language models) that affects multi-node deployments using the V0 engine. The vulnerability exists because vLLM deserializes (converts from storage format back into usable data) incoming network messages using pickle, an unsafe method that allows attackers to execute arbitrary code on secondary hosts. This could let an attacker compromise an entire vLLM deployment if they control the primary host or use network-level attacks like ARP cache poisoning (redirecting network traffic to a malicious server).
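Why pickle is unsafe, and one way to constrain it, can be sketched with the "restricting globals" pattern from the Python documentation. This is not vLLM's mitigation (the source lists none beyond network isolation); `NoGlobalsUnpickler` and `safe_loads` are illustrative names, and a data-only format such as JSON is generally a safer choice for network messages.

```python
import io
import pickle


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses to import any class or function by name."""

    def find_class(self, module, name):
        # Pickle payloads execute code by referencing callables such as
        # os.system; refusing every global lookup blocks that path.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def safe_loads(data: bytes):
    """Load a pickle that may contain only plain built-in containers and scalars."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()
```

Plain dictionaries, lists, strings, and numbers still load, while any payload that references an importable callable raises `UnpicklingError` instead of running it.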
CVE-2025-25014 is a prototype pollution vulnerability (a type of bug where an attacker modifies the basic template that objects are built from) in Kibana that allows attackers to execute arbitrary code (run commands they shouldn't be able to run) by sending specially crafted HTTP requests (malicious web requests) to machine learning and reporting endpoints. The vulnerability affects multiple versions of Kibana and was identified by Elastic.
Fix: Upgrade to version 0.9.0, which fixes the issue. A patch is available at https://github.com/vllm-project/vllm/commit/08bf7840780980c7568c573c70a6a8db94fd45ff.
Fix: Update to vLLM version 0.9.0 or later, which fixes the issue.
Fix: Update to version 0.9.0 or later, which contains a patch for the issue.
Fix: Update to Gradio version 5.31.0 or later, where this issue has been patched.
Fix: This issue has been patched in version 0.9.0.
Fix: Update vLLM to version 0.9.0 or later. The issue has been patched in version 0.9.0.
Fix: Update to vLLM version 0.8.5 or later. According to the source: "As of version 0.8.5, vLLM limits the `TCPStore` socket to the private interface as configured."
Fix: Upgrade to Langroid version 0.53.15 or later. The fix applies input sanitization (cleaning and filtering user input) to the affected function by default to block common attack vectors, and adds warnings about the risky behavior to the project documentation.
Fix: Upgrade to Langroid version 0.53.15 or later. According to the source, 'Langroid 0.53.15 sanitizes input to `TableChatAgent` by default to tackle the most common attack vectors, and added several warnings about the risky behavior in the project documentation.'
Fix: A patch is available at https://github.com/tensorflow/serving/commit/6cb013167d13f2ed3930aabb86dbc2c8c53f5adf (identified by Google Inc. as the official patch for this vulnerability).
Fix: The maintainers recommend that users ensure their environment is on a secure network. Additionally, the V0 engine has been off by default since v0.8.0, and the V1 engine is not affected by this issue.
Fix: A security update is available from Elastic for Kibana versions 8.17.6, 8.18.1, or 9.0.1, as referenced in the Elastic vendor advisory at https://discuss.elastic.co/t/kibana-8-17-6-8-18-1-or-9-0-1-security-update-esa-2025-07/377868.