aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.


Maintained by

Truong (Jack) Luu

Information Systems Researcher

AI Sec Watch

The security intelligence platform for AI teams

AI security threats move fast and get buried under hype and noise. AI Sec Watch, built by an Information Systems security researcher, helps security teams and developers stay ahead of vulnerabilities, privacy incidents, safety research, and policy developments.

Total tracked: 2,736 · Last 24h: 31 · Last 7d: 168
Daily Briefing
Wednesday, April 1, 2026

Claude Code Source Leaked via npm Packaging Error: Anthropic confirmed that nearly 2,000 TypeScript files (over 512,000 lines of code) from Claude Code were accidentally exposed through a JavaScript package repository, revealing internal features and allowing attackers to study how to bypass safeguards. Users who downloaded the affected package during a specific window on March 31, 2026 may have also received malware-infected software.

Google Addresses Vertex AI Security Issues After Weaponization Demo: Palo Alto Networks researchers demonstrated how to weaponize AI agents (autonomous programs that perform tasks with minimal human input) on Google Cloud's Vertex AI platform, prompting Google to begin addressing the disclosed security problems.

Latest Intel

page 157/274
01

CVE-2025-9959: Incomplete validation of dunder attributes allows an attacker to escape from the local Python execution environment sandbox

security
Sep 3, 2025

CVE-2025-9959 is a vulnerability in smolagents (a Python agent library) where incomplete validation of dunder attributes (special Python variables with double underscores, like __import__) allows an attacker to escape the sandbox (a restricted execution environment) if they use prompt injection (tricking the AI into executing malicious commands). The attack requires the attacker to manipulate the agent's input to make it create and run harmful code.
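To see why incomplete dunder filtering is dangerous, here is a classic escape chain of the kind a Python sandbox must block. This is a generic illustration of the bug class, not the actual smolagents exploit: starting from a harmless literal, dunder attributes alone can climb back to the interpreter's builtins.

```python
def reach_builtins_via_dunders():
    """Walk from a plain literal back to the interpreter's builtins using
    only dunder attributes -- the class of chain a sandbox that validates
    dunders incompletely will fail to stop."""
    # ().__class__ -> tuple, __mro__[-1] -> object, then every loaded class
    for cls in ().__class__.__mro__[-1].__subclasses__():
        init = vars(cls).get("__init__")
        # A Python-defined __init__ exposes its module's globals, and
        # every module namespace carries a __builtins__ entry.
        module_globals = getattr(init, "__globals__", None)
        if isinstance(module_globals, dict) and "__builtins__" in module_globals:
            return module_globals["__builtins__"]
    return None
```

Once builtins are reachable, `__import__` gives the attacker arbitrary modules (`os`, `subprocess`, and so on), which is why a sandbox must deny the whole dunder surface (`__class__`, `__mro__`, `__subclasses__`, `__globals__`, ...) rather than a partial list.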

NVD/CVE Database

Critical This Week (5 issues)

critical
CVE-2026-34162: FastGPT is an AI Agent building platform. Prior to version 4.14.9.5, the FastGPT HTTP tools testing endpoint (/api/core/
CVE-2026-34162 · NVD/CVE Database · Mar 31, 2026

Meta Smartglasses Raise Privacy Concerns with Covert Recording: Meta's smartglasses feature a built-in camera and AI assistant that can describe surroundings and answer questions, but raise significant privacy issues because they can record video of others without knowledge or consent.
Mar 31, 2026
02

Watermarking Language Models Through Language Models

research · security
Sep 2, 2025

Researchers developed a new method for watermarking LLM outputs (adding hidden markers to prove ownership and track content) using a three-part system that works only through input prompts, without needing access to the model's internal parameters. The approach uses one AI to create watermarking instructions, another to generate marked outputs, and a third to detect the watermarks, making it work across different LLM types including both proprietary and open-source models.

IEEE Xplore (Security & AI Journals)
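The statistical core of watermark detection can be sketched with a generic keyed green-list scheme. This is an assumption for illustration, not the paper's three-model pipeline: a watermarked generator favors a keyed subset of words, and a detector measures how far the observed green fraction deviates from the ~0.5 expected of unmarked text.

```python
import hashlib

def green_fraction(words, seed="wm-key"):
    """Fraction of words that fall in a pseudorandom, key-derived 'green
    list'. Normal text lands near 0.5; text from a generator that prefers
    green words scores improbably high, which is the detection signal."""
    def is_green(word):
        # Keyed hash splits the vocabulary roughly in half.
        h = hashlib.sha256((seed + word.lower()).encode()).digest()
        return h[0] % 2 == 0
    hits = sum(is_green(w) for w in words)
    return hits / len(words)
```

A real detector would turn this fraction into a z-score against the binomial null hypothesis before flagging text as watermarked.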
03

Wrap Up: The Month of AI Bugs

security · research
Aug 30, 2025

This post wraps up a series of research articles documenting security vulnerabilities found in various AI tools and code assistants during a month-long investigation. The vulnerabilities included prompt injection (tricking an AI by hiding instructions in its input), data exfiltration (stealing sensitive information), and remote code execution (RCE, where attackers can run commands on systems they don't control) across tools like ChatGPT, Claude, GitHub Copilot, and others.

Embrace The Red
04

AgentHopper: An AI Virus

securityresearch
Aug 29, 2025

AgentHopper is a proof-of-concept attack that demonstrates how indirect prompt injection (hidden instructions in code that trick AI agents into running unintended commands) can spread like a computer virus across multiple AI coding agents and code repositories. The attack works by compromising one agent, injecting malicious prompts into GitHub repositories, and then infecting other developers' agents when they pull and process the infected code. The researchers note that all vulnerabilities exploited by AgentHopper have been responsibly disclosed and patched by vendors including GitHub Copilot, Amazon Q, AWS Kiro, and others.

Fix: The source text states that 'All vulnerabilities mentioned in this research were responsibly disclosed and have been patched by the respective vendors.' Specific patched vulnerabilities include: GitHub Copilot (CVE-2025-53773), Amazon Q Developer, AWS Kiro, and Amp Code. The source also mentions a 'Safety Switch' feature was implemented 'to avoid accidents,' though the explanation is incomplete in the provided text.

Embrace The Red
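As a defensive counterpart, here is a minimal sketch of the kind of pre-ingestion scan an agent could run over repository files before processing them. The patterns are illustrative heuristics only, not a complete defense against indirect prompt injection.

```python
import re

# Heuristics a scanner might flag before letting a coding agent process
# repo content: hidden HTML-comment instructions, zero-width characters
# used to conceal text, and pipe-to-shell one-liners.
SUSPICIOUS = [
    re.compile(r"<!--.*?(ignore (all|previous) instructions|you are now).*?-->",
               re.IGNORECASE | re.DOTALL),
    re.compile(r"[\u200b\u200c\u200d\u2060]"),
    re.compile(r"curl\s+[^\n|]*\|\s*(sh|bash)"),
]

def flag_injection(text):
    """Return the patterns matched in a repository file's text."""
    return [p.pattern for p in SUSPICIOUS if p.search(text)]
```

A match would not prove an attack, but it is a reasonable trigger for requiring human review before the agent acts on the file.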
05

Online Safety Analysis for LLMs: A Benchmark, an Assessment, and a Path Forward

safety · research
Aug 29, 2025

This research creates a benchmark and evaluation framework for online safety analysis of LLMs, which involves detecting unsafe outputs while the AI is generating text rather than after it finishes. The study tests various safety detection methods on different LLMs and finds that combining multiple methods together, called hybridization, can improve safety detection effectiveness. The work aims to help developers choose appropriate safety methods for their specific applications.

IEEE Xplore (Security & AI Journals)
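A minimal sketch of what "online" checking with hybridization could look like, assuming stand-in detector functions (the paper's actual detectors and thresholds are not reproduced here): each detector scores the growing output prefix, and generation halts as soon as enough detectors agree it is unsafe.

```python
def hybrid_verdict(scores, threshold=0.5, min_votes=2):
    """Combine per-detector unsafe-probability scores: unsafe when at
    least `min_votes` detectors exceed `threshold`."""
    votes = sum(1 for s in scores.values() if s >= threshold)
    return votes >= min_votes

def online_check(chunks, detectors, min_votes=2):
    """Re-run every detector on the growing prefix after each generated
    chunk, stopping early the moment the hybrid verdict turns unsafe."""
    text = ""
    for chunk in chunks:
        text += chunk
        scores = {name: d(text) for name, d in detectors.items()}
        if hybrid_verdict(scores, min_votes=min_votes):
            return text, True  # halt generation here
    return text, False
```

The early stop is the point of online (rather than post-hoc) analysis: unsafe text is cut off mid-generation instead of being emitted in full and filtered afterward.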
06

Windsurf MCP Integration: Missing Security Controls Put Users at Risk

security · safety
Aug 28, 2025

Windsurf's MCP (Model Context Protocol, a system that connects AI agents to external tools) integration lacks fine-grained security controls that would let users decide which actions the AI can perform automatically versus which ones need human approval before running. This is especially risky when the AI agent runs on a user's local computer, where it could have access to sensitive files and system functions.

Embrace The Red
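A minimal sketch of the missing control: a per-action policy gate that auto-approves read-only tools and routes side-effecting ones through a human, with default-deny for anything unrecognized. The action names are hypothetical, not Windsurf's actual tool identifiers.

```python
# Hypothetical tool names for illustration only.
AUTO_APPROVE = {"search_docs", "read_open_file"}
NEEDS_HUMAN = {"write_file", "run_shell", "delete_file"}

def gate_tool_call(action, approver):
    """Return True if the agent's tool call may proceed. `approver` is a
    callable that prompts the human and returns True/False."""
    if action in AUTO_APPROVE:
        return True          # read-only: run without interruption
    if action in NEEDS_HUMAN:
        return approver(action)  # side effects: explicit approval
    return False             # default-deny unknown tools
```

Default-deny matters on a local machine: a newly added MCP tool the user has never classified should not inherit automatic execution rights.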
07

AI Safety Newsletter #62: Big Tech Launches $100 Million pro-AI Super PAC

policy · safety
Aug 27, 2025

Major tech industry players, including the venture firm Andreessen Horowitz and OpenAI, are investing over $100 million in super PACs (political organizations that can raise unlimited money to influence elections) to fight AI regulations in U.S. elections. Additionally, Meta faced bipartisan congressional criticism after internal documents revealed its AI chatbots were permitted to engage in romantic and sensual conversations with minors, though Meta removed these policy sections when questioned.

CAIS AI Safety Newsletter
08

Cline: Vulnerable To Data Exfiltration And How To Protect Your Data

security
Aug 27, 2025

Cline, a popular AI coding agent with over 2 million downloads, has a vulnerability that allows attackers to steal sensitive files like .env files (which store secret credentials) through prompt injection (tricking an AI by hiding instructions in its input) combined with markdown image rendering. When an attacker embeds malicious instructions in a file and asks Cline to analyze it, the tool automatically reads sensitive data and sends it to an untrusted domain by rendering an image, leaking the information without user permission.

Fix: The source recommends these explicit mitigations: (1) Do not render markdown images from untrusted domains, or ask for user confirmation before loading images from untrusted domains (similar to how VS Code/Copilot uses a trusted domain list). (2) Set 'Auto-approve' to disabled by default to limit which files can be exfiltrated. (3) Developers can partially protect themselves by disabling auto-execution of commands and requiring approval before reading files, though this only limits what information reaches the chat before exfiltration occurs.

Embrace The Red
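The first recommended mitigation can be sketched directly: filter markdown images through a trusted-domain allowlist before rendering, so an injected image URL never fires a request carrying stolen data. The allowlisted domains below are examples, not Cline's actual list.

```python
import re
from urllib.parse import urlparse

# Example allowlist -- a real tool would make this user-configurable.
TRUSTED_IMAGE_DOMAINS = {"raw.githubusercontent.com", "img.shields.io"}

MD_IMAGE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)[^)]*\)")

def strip_untrusted_images(markdown):
    """Replace markdown images whose host is not allowlisted, so a
    prompt-injected URL cannot exfiltrate data when the chat renders it."""
    def repl(m):
        host = urlparse(m.group(1)).hostname or ""
        return m.group(0) if host in TRUSTED_IMAGE_DOMAINS else "[image blocked]"
    return MD_IMAGE.sub(repl, markdown)
```

Asking the user to confirm untrusted hosts, as the source notes VS Code/Copilot does, is a softer variant of the same gate.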
09

Certified Local Transferability for Evaluating Adversarial Attacks

research · security
Aug 27, 2025

Deep neural networks (DNNs, AI models with multiple layers that learn patterns) are vulnerable to adversarial examples, which are inputs slightly modified to trick the model into making wrong predictions. This paper introduces a concept called the certified local transferable region, a mathematically guaranteed area around an input where a single small perturbation (adversarial attack) will fool the model, and proposes a method called RAOS (reverse attack oracle-based search) to measure how large these vulnerable areas are as a way to evaluate how robust neural networks truly are.

IEEE Xplore (Security & AI Journals)
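The notion of an adversarial example can be made concrete with a toy gradient-sign perturbation on a linear classifier. This is a generic illustration of the attack class, not the paper's RAOS method: nudging each feature by at most 0.1 in the score-decreasing direction flips the prediction.

```python
def predict(w, b, x):
    """Toy linear classifier returning +1 or -1."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if s >= 0 else -1

def sign_perturb(w, x, eps):
    """Shift every feature by eps against the sign of its weight -- the
    direction that lowers the score fastest, as in gradient-sign attacks."""
    return [xi - eps * (1 if wi > 0 else -1) for wi, xi in zip(w, x)]
```

The "certified local transferable region" idea asks the converse question: how large a neighborhood around x admits one shared perturbation that fools the model, with a mathematical guarantee rather than an empirical search.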
10

AWS Kiro: Arbitrary Code Execution via Indirect Prompt Injection

security
Aug 26, 2025

AWS Kiro, a coding agent tool, is vulnerable to arbitrary code execution through indirect prompt injection (a technique where hidden instructions in data trick an AI into following them). An attacker who controls data that Kiro processes can modify configuration files like .vscode/settings.json to allowlist dangerous commands or add malicious MCP servers (external tools that extend Kiro's capabilities), enabling them to run system commands or code on a developer's machine without the developer's knowledge or approval.

Embrace The Red
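A hedged sketch of the workspace-settings audit that could catch this kind of tampering before an agent trusts the file; the key names are illustrative, not Kiro's exact configuration schema.

```python
import json

# Keys an attacker-modified settings file might use to smuggle in
# auto-approved commands or rogue MCP servers (names are hypothetical).
RISKY_KEYS = {"allowedCommands", "autoApprove", "mcpServers", "trustedTools"}

def risky_settings(settings_json):
    """Return the risky keys present anywhere in a settings document."""
    found = set()
    def walk(node):
        if isinstance(node, dict):
            for k, v in node.items():
                if k in RISKY_KEYS:
                    found.add(k)
                walk(v)
        elif isinstance(node, list):
            for v in node:
                walk(v)
    walk(json.loads(settings_json))
    return sorted(found)
```

Flagging any change to such keys for human review closes the gap the attack relies on: the agent silently granting itself broader execution rights.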
critical
CVE-2025-15379: A command injection vulnerability exists in MLflow's model serving container initialization code, specifically in the `_
CVE-2025-15379 · NVD/CVE Database · Mar 30, 2026

critical
CVE-2026-33873: Langflow is a tool for building and deploying AI-powered agents and workflows. Prior to version 1.9.0, the Agentic Assis
CVE-2026-33873 · NVD/CVE Database · Mar 27, 2026

critical
Attackers exploit critical Langflow RCE within hours as CISA sounds alarm
CSO Online · Mar 27, 2026

critical
CVE-2025-53521: F5 BIG-IP Unspecified Vulnerability
CVE-2025-53521 · CISA Known Exploited Vulnerabilities · Mar 26, 2026