aisecwatch.com
DashboardVulnerabilitiesNewsResearchArchiveStatsDataset
aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Navigation

VulnerabilitiesNewsResearchDigest ArchiveNewsletter ArchiveSubscribeData SourcesStatisticsDatasetAPIIntegrationsWidgetRSS Feed

Maintained by

Truong (Jack) Luu

Information Systems Researcher

Browse All

All tracked items across vulnerabilities, news, research, incidents, and regulatory updates.

to
Export CSV
3268 items

CVE-2025-58829: Server-Side Request Forgery (SSRF) vulnerability in aitool Ai Auto Tool Content Writing Assistant (Gemini Writer, ChatGP

mediumvulnerability
security
Sep 5, 2025
CVE-2025-58829

A server-side request forgery vulnerability (SSRF, a flaw where an attacker tricks a server into making unwanted requests to other systems) was discovered in the aitool Ai Auto Tool Content Writing Assistant plugin for WordPress, affecting versions up to 2.2.6. This vulnerability allows attackers to exploit the plugin's ability to make requests on the server's behalf, potentially accessing internal systems or data.

NVD/CVE Database

CVE-2025-58401: Obsidian GitHub Copilot Plugin versions prior to 1.1.7 store Github API token in cleartext form. As a result, an attacke

mediumvulnerability
security
Sep 5, 2025
CVE-2025-58401

The Obsidian GitHub Copilot Plugin (a tool that integrates GitHub's AI code assistant into the Obsidian note-taking app) has a security flaw in versions before 1.1.7 where it stores GitHub API tokens (authentication credentials that allow access to a GitHub account) in cleartext (unencrypted, readable text). This means an attacker who gains access to a user's computer could steal these tokens and perform unauthorized actions on their GitHub account.

CVE-2025-6984: The langchain-ai/langchain project, specifically the EverNoteLoader component, is vulnerable to XML External Entity (XXE

highvulnerability
security
Sep 4, 2025
CVE-2025-6984

The EverNoteLoader component in langchain-ai/langchain version 0.3.63 has a security flaw that allows XXE (XML External Entity) attacks, where an attacker tricks the XML parser into reading external files by embedding special references in XML input. This could expose sensitive system files like password lists to an attacker.

CVE-2025-58357: 5ire is a cross-platform desktop artificial intelligence assistant and model context protocol client. Version 0.13.2 con

criticalvulnerability
security
Sep 4, 2025
CVE-2025-58357

5ire version 0.13.2, a desktop AI assistant and model context protocol client (software that lets AI models interact with external tools), contains a vulnerability that allows content injection attacks (inserting malicious code into web pages) through multiple routes including malicious prompts, compromised servers, and exploited tool connections. This vulnerability is fixed in version 0.14.0.

CVE-2025-9959: Incomplete validation of dunder attributes allows an attacker to escape from the Local Python execution environment sand

highvulnerability
security
Sep 3, 2025
CVE-2025-9959

CVE-2025-9959 is a vulnerability in smolagents (a Python agent library) where incomplete validation of dunder attributes (special Python variables with double underscores, like __import__) allows an attacker to escape the sandbox (a restricted execution environment) if they use prompt injection (tricking the AI into executing malicious commands). The attack requires the attacker to manipulate the agent's input to make it create and run harmful code.

Watermarking Language Models Through Language Models

inforesearchPeer-Reviewed
research

Wrap Up: The Month of AI Bugs

infonews
securityresearch

AgentHopper: An AI Virus

highnews
securityresearch

Online Safety Analysis for LLMs: A Benchmark, an Assessment, and a Path Forward

inforesearchPeer-Reviewed
safety

Windsurf MCP Integration: Missing Security Controls Put Users at Risk

mediumnews
securitysafety

AI Safety Newsletter #62: Big Tech Launches $100 Million pro-AI Super PAC

inforegulatory
policysafety

Cline: Vulnerable To Data Exfiltration And How To Protect Your Data

highnews
security
Aug 27, 2025

Cline, a popular AI coding agent with over 2 million downloads, has a vulnerability that allows attackers to steal sensitive files like .env files (which store secret credentials) through prompt injection (tricking an AI by hiding instructions in its input) combined with markdown image rendering. When an attacker embeds malicious instructions in a file and asks Cline to analyze it, the tool automatically reads sensitive data and sends it to an untrusted domain by rendering an image, leaking the information without user permission.

Certified Local Transferability for Evaluating Adversarial Attacks

inforesearchPeer-Reviewed
research

AWS Kiro: Arbitrary Code Execution via Indirect Prompt Injection

highnews
security
Aug 26, 2025

AWS Kiro, a coding agent tool, is vulnerable to arbitrary code execution through indirect prompt injection (a technique where hidden instructions in data trick an AI into following them). An attacker who controls data that Kiro processes can modify configuration files like .vscode/settings.json to allowlist dangerous commands or add malicious MCP servers (external tools that extend Kiro's capabilities), enabling them to run system commands or code on a developer's machine without the developer's knowledge or approval.

Steganography in Large Language Models

inforesearchPeer-Reviewed
security

CVE-2025-57760: Langflow is a tool for building and deploying AI-powered agents and workflows. A privilege escalation vulnerability exis

highvulnerability
security
Aug 25, 2025
CVE-2025-57760

Langflow, a tool for building AI-powered agents and workflows, has a privilege escalation vulnerability (CWE-269, improper privilege management) where an authenticated user with RCE (remote code execution, the ability to run commands on a system they don't own) can use an internal CLI command to create a new administrative account, gaining full superuser access even if they originally registered as a regular user. A patched version has not been publicly released at the time this advisory was published.

CVE-2025-44179: Hitron CGNF-TWN 3.1.1.43-TWN-pre3 contains a command injection vulnerability in the telnet service. The issue arises due

mediumvulnerability
security
Aug 25, 2025
CVE-2025-44179

Hitron CGNF-TWN version 3.1.1.43-TWN-pre3 has a command injection vulnerability (a flaw where an attacker can hide malicious commands in normal input) in its telnet service (a text-based remote access tool). An attacker can exploit this by sending crafted input through telnet to achieve RCE (remote code execution, where they can run commands on the device), potentially gaining unauthorized access to system settings and sensitive data.

How Prompt Injection Exposes Manus' VS Code Server to the Internet

highnews
securitysafety

How Deep Research Agents Can Leak Your Data

mediumnews
securityprivacy

Sneaking Invisible Instructions by Developers in Windsurf

mediumnews
securitysafety
Previous87 / 164Next

Fix: Update the Obsidian GitHub Copilot Plugin to version 1.1.7 or later.

NVD/CVE Database
NVD/CVE Database

Fix: Update to version 0.14.0, which contains the fix for this vulnerability.

NVD/CVE Database
NVD/CVE Database
security
Sep 2, 2025

Researchers developed a new method for watermarking LLM outputs (adding hidden markers to prove ownership and track content) using a three-part system that works only through input prompts, without needing access to the model's internal parameters. The approach uses one AI to create watermarking instructions, another to generate marked outputs, and a third to detect the watermarks, making it work across different LLM types including both proprietary and open-source models.

IEEE Xplore (Security & AI Journals)
Aug 30, 2025

This post wraps up a series of research articles documenting security vulnerabilities found in various AI tools and code assistants during a month-long investigation. The vulnerabilities included prompt injection (tricking an AI by hiding instructions in its input), data exfiltration (stealing sensitive information), and remote code execution (RCE, where attackers can run commands on systems they don't control) across tools like ChatGPT, Claude, GitHub Copilot, and others.

Embrace The Red
Aug 29, 2025

AgentHopper is a proof-of-concept attack that demonstrates how indirect prompt injection (hidden instructions in code that trick AI agents into running unintended commands) can spread like a computer virus across multiple AI coding agents and code repositories. The attack works by compromising one agent, injecting malicious prompts into GitHub repositories, and then infecting other developers' agents when they pull and process the infected code. The researchers note that all vulnerabilities exploited by AgentHopper have been responsibly disclosed and patched by vendors including GitHub Copilot, Amazon Q, AWS Kiro, and others.

Fix: The source text states that 'All vulnerabilities mentioned in this research were responsibly disclosed and have been patched by the respective vendors.' Specific patched vulnerabilities include: GitHub Copilot (CVE-2025-53773), Amazon Q Developer, AWS Kiro, and Amp Code. The source also mentions a 'Safety Switch' feature was implemented 'to avoid accidents,' though the explanation is incomplete in the provided text.

Embrace The Red
research
Aug 29, 2025

This research creates a benchmark and evaluation framework for online safety analysis of LLMs, which involves detecting unsafe outputs while the AI is generating text rather than after it finishes. The study tests various safety detection methods on different LLMs and finds that combining multiple methods together, called hybridization, can improve safety detection effectiveness. The work aims to help developers choose appropriate safety methods for their specific applications.

IEEE Xplore (Security & AI Journals)
Aug 28, 2025

Windsurf's MCP (Model Context Protocol, a system that connects AI agents to external tools) integration lacks fine-grained security controls that would let users decide which actions the AI can perform automatically versus which ones need human approval before running. This is especially risky when the AI agent runs on a user's local computer, where it could have access to sensitive files and system functions.

Embrace The Red
Aug 27, 2025

Big Tech companies like Andreessen Horowitz and OpenAI are investing over $100 million in political organizations called super PACs (groups that can raise unlimited money to influence elections) to fight against AI regulations in U.S. elections. Additionally, Meta faced bipartisan congressional criticism after internal documents revealed its AI chatbots were permitted to engage in romantic and sensual conversations with minors, though Meta removed these policy sections when questioned.

CAIS AI Safety Newsletter

Fix: The source recommends these explicit mitigations: (1) Do not render markdown images from untrusted domains, or ask for user confirmation before loading images from untrusted domains (similar to how VS Code/Copilot uses a trusted domain list). (2) Set 'Auto-approve' to disabled by default to limit which files can be exfiltrated. (3) Developers can partially protect themselves by disabling auto-execution of commands and requiring approval before reading files, though this only limits what information reaches the chat before exfiltration occurs.

Embrace The Red
security
Aug 27, 2025

Deep neural networks (DNNs, AI models with multiple layers that learn patterns) are vulnerable to adversarial examples, which are inputs slightly modified to trick the model into making wrong predictions. This paper introduces a concept called the certified local transferable region, a mathematically guaranteed area around an input where a single small perturbation (adversarial attack) will fool the model, and proposes a method called RAOS (reverse attack oracle-based search) to measure how large these vulnerable areas are as a way to evaluate how robust neural networks truly are.

IEEE Xplore (Security & AI Journals)
Embrace The Red
research
Aug 26, 2025

Researchers have developed a method to hide secret data inside large language models (AI systems trained on massive amounts of text) by encoding information into the model's parameters during training. The hidden data doesn't interfere with the model's normal functions like text classification or generation, but authorized users with a secret key can extract the concealed information, enabling covert communication. The method leverages transformers (the neural network architecture behind modern AI language models) and its self-attention mechanisms (components that help the model focus on relevant parts of input) to achieve high capacity for hidden data while remaining undetectable.

IEEE Xplore (Security & AI Journals)
NVD/CVE Database
NVD/CVE Database
Aug 25, 2025

Manus, an autonomous AI agent, is vulnerable to prompt injection (tricking an AI by hiding instructions in its input) attacks that can expose its internal VS Code Server (a development tool accessed through a web interface) to the internet. An attacker can chain together three weaknesses: exploiting prompt injection to invoke an exposed port tool without human approval, leaking the server's access credentials through markdown image rendering or unauthorized browsing to attacker-controlled domains, and gaining remote access to the developer machine.

Embrace The Red
Aug 24, 2025

Deep Research agents (AI systems that autonomously search and fetch information from multiple connected tools) can leak data between different connected sources because there is no trust boundary separating them. When an agent like ChatGPT performs research queries, it can freely use data from one tool to query another, and attackers can force this leakage through prompt injection (tricking an AI by hiding instructions in its input).

Embrace The Red
Aug 23, 2025

Windsurf Cascade is vulnerable to hidden prompt injection, where invisible Unicode Tag characters (special characters that don't display on screen but are still processed by AI) can be embedded in files or tool outputs to trick the AI into performing unintended actions without the user knowing. While the current SWE-1 model doesn't interpret these invisible instructions as commands, other models like Claude Sonnet do, and as AI capabilities improve, this risk could become more severe.

Fix: The source explicitly mentions three mitigations: (1) make invisible characters visible in the UI so users can see hidden information; (2) remove invisible Unicode Tag characters entirely before and after inference (described as 'probably the most practical mitigation'); (3) mitigate at the application level, as coding agents like Amp and Amazon Q Developer for VS Code have done. The source also notes that if building exclusively on OpenAI models, users should be protected since OpenAI mitigates this at the model/API level.

Embrace The Red