aisecwatch.com
DashboardVulnerabilitiesNewsResearchArchiveStatsDatasetFor devs
Subscribe
aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Navigation

VulnerabilitiesNewsResearchDigest ArchiveNewsletter ArchiveSubscribeData SourcesStatisticsDatasetAPIIntegrationsWidgetRSS Feed

Maintained by

Truong (Jack) Luu

Information Systems Researcher

AI Sec Watch

The security intelligence platform for AI teams

AI security threats move fast and get buried under hype and noise. Built by an Information Systems Security researcher to help security teams and developers stay ahead of vulnerabilities, privacy incidents, safety research, and policy developments.

Independent research. No sponsors, no paywalls, no conflicts of interest.

[TOTAL_TRACKED]
3,710
[LAST_24H]
1
[LAST_7D]
1
Daily BriefingSunday, May 17, 2026

No new AI/LLM security issues were identified today.

Latest Intel

page 257/371
VIEW ALL
01

CVE-2025-55284: Claude Code is an agentic coding tool. Prior to version 1.0.4, it's possible to bypass the Claude Code confirmation prom

security
Aug 15, 2025

Claude Code is a tool that lets AI assistants write and run code on your computer. Before version 1.0.4, attackers could trick the tool into reading files and sending their contents over the internet without asking you first, because the tool had a list of allowed commands that was too broad. Exploiting this attack requires the attacker to insert malicious instructions into the conversation with Claude Code.

Fix: Update to version 1.0.4 or later. The source states: 'Users on standard Claude Code auto-update received this fix automatically after release' and 'versions prior to 1.0.24 are deprecated and have been forced to update.'

NVD/CVE Database
02

Automated Red Teaming Scans of Dataiku Agents Using Protect AI Recon

securitysafety
Aug 15, 2025

This content discusses security challenges in agentic AI systems (AI agents that can take actions autonomously), highlighting that generic jailbreak testing (attempts to trick AI into bypassing safety rules) misses real risks like tool misuse and data theft. The article emphasizes the need for contextual red teaming (security testing that simulates realistic attacks in specific business contexts) to properly protect AI agents in enterprise environments.

Protect AI Blog
03

Google Jules is Vulnerable To Invisible Prompt Injection

securitysafety
Aug 15, 2025

Google's Gemini AI models, including the Jules product, are vulnerable to invisible prompt injection (tricking an AI by hiding instructions in its input using invisible Unicode characters that the AI interprets as commands). This vulnerability was reported to Google over a year ago but remains unfixed at the model and API (application programming interface, the interface developers use to access the AI) level, affecting all applications built on Gemini, including Google's own products.

Embrace The Red
04

Jules Zombie Agent: From Prompt Injection to Remote Control

securitysafety
Aug 14, 2025

Jules, a coding agent, is vulnerable to prompt injection (tricking an AI by hiding malicious instructions in its input) attacks that can lead to remote command and control compromise. An attacker can embed malicious instructions in GitHub issues to trick Jules into downloading and executing malware, giving attackers full control of the system. The attack works because Jules has unrestricted internet access and automatically approves plans after a time delay without requiring human confirmation.

Fix: The source explicitly recommends four mitigations: (1) 'Be careful when directly tasking Jules to work with untrusted data (e.g. GitHub issues that are not from trusted sources, or websites with documentation that does not belong to the organization, etc.)'; (2) 'do not have Jules work on private, important, source code or give it access to production-level secrets, or anything that could enable an adversary to perform lateral movement'; (3) deploy 'monitoring and detection tools on these systems' to 'enable security teams to monitor and understand potentially malicious behavior'; and (4) 'do not allow arbitrary Internet access by default. Instead, allow the configuration to be enabled when needed.'

Embrace The Red
05

Google Jules: Vulnerable to Multiple Data Exfiltration Issues

securityresearch
Aug 13, 2025

Google Jules, an asynchronous coding agent (a tool that automatically writes and manages code tasks), has multiple security vulnerabilities that allow attackers to steal data through prompt injection (tricking the AI by hiding malicious instructions in its input). Attackers can exploit two main exfiltration vectors: using markdown image rendering to leak information to external servers, and abusing the view_text_website tool (which fetches and reads web pages) to read files and send them to attacker-controlled servers, often by planting malicious instructions in GitHub issues.

Embrace The Red
06

CVE-2025-23298: NVIDIA Merlin Transformers4Rec for all platforms contains a vulnerability in a python dependency, where an attacker coul

security
Aug 13, 2025

NVIDIA Merlin Transformers4Rec contains a vulnerability in one of its Python dependencies that allows attackers to inject malicious code (code injection, where an attacker inserts unauthorized commands into a program). A successful attack could lead to code execution (running unauthorized commands on a system), privilege escalation (gaining higher-level access rights), information disclosure (exposing sensitive data), and data tampering (unauthorized modification of data).

NVD/CVE Database
07

GitHub Copilot: Remote Code Execution via Prompt Injection (CVE-2025-53773)

security
Aug 12, 2025

GitHub Copilot and VS Code are vulnerable to prompt injection (tricking an AI by hiding instructions in its input) that allows an attacker to achieve RCE (remote code execution, where an attacker can run commands on a system they don't own) by modifying a project's settings.json file to put Copilot into 'YOLO mode'. This vulnerability demonstrates a broader security risk: if an AI agent can write to files and modify its own configuration or security settings, it can be exploited for full system compromise.

Embrace The Red
08

CVE-2025-53773: Improper neutralization of special elements used in a command ('command injection') in GitHub Copilot and Visual Studio

security
Aug 12, 2025

CVE-2025-53773 is a command injection vulnerability (a flaw where special characters in user input are not properly filtered, allowing an attacker to run unauthorized commands) found in GitHub Copilot and Visual Studio that lets an unauthorized attacker execute code on a user's local computer. The vulnerability exploits improper handling of special elements in commands, potentially through prompt injection (tricking the AI by hiding malicious instructions in its input).

NVD/CVE Database
09

AI Safety Newsletter #61: OpenAI Releases GPT-5

industry
Aug 12, 2025

OpenAI released GPT-5, a system combining two models: a fast base model for creative tasks and a reasoning model for coding and math, which routes queries appropriately based on user input. GPT-5 achieves state-of-the-art performance on several benchmarks and significantly reduces hallucinations (false information generation) compared to previous models, particularly helping with healthcare applications where accuracy matters. However, GPT-5 is best understood as consolidating features from models released since GPT-4 rather than a major leap forward, and it doesn't lead on all benchmarks.

CAIS AI Safety Newsletter
10

CVE-2025-55012: Zed is a multiplayer code editor. Prior to version 0.197.3, in the Zed Agent Panel allowed for an AI agent to achieve Re

security
Aug 11, 2025

Zed, a multiplayer code editor, had a vulnerability before version 0.197.3 where an AI agent could bypass permission checks and achieve RCE (remote code execution, where an attacker can run commands on a system they don't own) by creating or modifying configuration files without user approval. This allowed the AI agent to execute arbitrary commands on a victim's machine.

Fix: This vulnerability has been patched in version 0.197.3. As a workaround, users can either avoid sending prompts to the Agent Panel or limit the AI Agent's file system access.

NVD/CVE Database
Prev1...255256257258259...371Next