aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.


Maintained by Truong (Jack) Luu, Information Systems Researcher

AI Sec Watch

The security intelligence platform for AI teams

AI security threats move fast and get buried under hype and noise. AI Sec Watch was built by an Information Systems Security researcher to help security teams and developers stay ahead of vulnerabilities, privacy incidents, safety research, and policy developments.

Independent research. No sponsors, no paywalls, no conflicts of interest.

Total tracked: 3,710 · Last 24 hours: 1 · Last 7 days: 1
Daily Briefing · Friday, May 8, 2026

Critical RCE Vulnerabilities in LiteLLM Proxy Server: LiteLLM, a proxy server that forwards requests to AI model APIs, disclosed three critical and high-severity flaws in versions 1.74.2 through 1.83.6. Two test endpoints allowed attackers with valid API keys to execute arbitrary code (run any commands they want) on the server by submitting malicious configurations or prompt templates that were evaluated without sandboxing (CVE-2026-42271 and CVE-2026-42203, both critical). A separate SQL injection flaw (inserting malicious code into database queries) let unauthenticated attackers read or modify stored API credentials (CVE-2026-42208, high).
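
Both flaw classes have well-known fixes. As a hedged sketch (not LiteLLM's actual code; every function and table name below is invented for illustration), the vulnerable patterns and their safer counterparts look like this:

```python
# Illustrative sketch of the two bug classes above; not LiteLLM's code.
import sqlite3

# Bug class 1: evaluating a user-supplied template with eval() turns any
# API-key holder into a code-execution attacker.
def render_template_unsafe(template: str, variables: dict) -> str:
    # a template like "{__import__('os').system('id')}" runs as code here
    return eval("f'''" + template + "'''", {}, variables)

def render_template_safer(template: str, variables: dict) -> str:
    # str.format_map substitutes values without evaluating expressions
    return template.format_map(variables)

# Bug class 2: string-built SQL lets input rewrite the query itself.
def get_key_unsafe(conn: sqlite3.Connection, alias: str):
    # alias = "x' OR '1'='1" dumps every stored credential
    return conn.execute(
        f"SELECT api_key FROM credentials WHERE alias = '{alias}'"
    ).fetchall()

def get_key_safe(conn: sqlite3.Connection, alias: str):
    # the placeholder binds input as data; it is never parsed as SQL
    return conn.execute(
        "SELECT api_key FROM credentials WHERE alias = ?", (alias,)
    ).fetchall()
```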


ClaudeBleed Exploit Allows Extension Hijacking in Chrome: Anthropic's Claude browser extension contains a vulnerability that allows malicious Chrome extensions to hijack it and perform unauthorized actions such as exfiltrating files, sending emails, or stealing code from private repositories. The flaw stems from the extension trusting any script running on claude.ai without verifying the actual caller. Anthropic released a partial fix in version 1.0.70 on May 6, but researchers report the flaw remains exploitable when the extension runs in privileged mode.
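
The underlying mistake is authenticating the page rather than the caller. The real fix belongs in the extension's JavaScript message handler, but the shape of the missing check can be sketched language-neutrally; everything below is a hypothetical Python illustration, not Anthropic's code:

```python
# Conceptual sketch only; all identifiers are hypothetical.
TRUSTED_SENDERS = {"known-good-extension-id"}  # explicit caller allow-list

def dispatch_privileged_action(action: str, payload: dict) -> str:
    return f"performed {action}"  # stub standing in for the privileged operation

def handle_message(sender_id: str, page_origin: str, action: str, payload: dict) -> str:
    # Vulnerable pattern: checking only page_origin ("did this come from
    # claude.ai?") trusts any script another extension injects into that page.
    # Safer pattern: verify the caller's own identity before acting.
    if sender_id not in TRUSTED_SENDERS:
        raise PermissionError(f"untrusted caller: {sender_id}")
    return dispatch_privileged_action(action, payload)
```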

Latest Intel

01

CVE-2026-22016: Vulnerability in the Oracle Java SE, Oracle GraalVM for JDK, Oracle GraalVM Enterprise Edition product of Oracle Java SE

security
Apr 21, 2026

A serious vulnerability in Oracle Java SE and related products (JAXP component, which handles XML processing) allows attackers on the network to access sensitive data without needing to log in or interact with a user. The flaw affects multiple versions of Java and can be exploited through web services or untrusted code loaded in Java applications, with a CVSS score (0-10 severity rating) of 7.5 indicating high risk for data theft.


AI Systems Show Triple the High-Risk Vulnerabilities of Legacy Software: Penetration testing data reveals that AI and LLM systems have 32% of findings rated high-risk compared to just 13% for traditional software, with only 38% of high-risk AI issues getting resolved. Security experts attribute this gap to rapid deployment without mature controls, novel attack surfaces like prompt injection (tricking AI by hiding instructions in input), and fragmented responsibility for remediation across teams.


Model Context Protocol Emerging as Critical Security Blind Spot: Model Context Protocol (MCP, a plugin system connecting AI agents to external tools) has become a major vulnerability vector as organizations fail to scan for or monitor MCP-related risks. Recent supply chain attacks, such as the postmark-mcp npm package that exfiltrated emails from 300 organizations, demonstrate how attackers exploit widely trusted MCP packages and hardcoded credentials in AI configurations to enable credential theft and supply chain compromises at scale.

NVD/CVE Database
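
The MCP item above calls out hardcoded credentials in AI configurations. A minimal sketch of a config check, assuming the common JSON layout in which an MCP client lists servers with env vars (field names vary by client, so treat this as illustrative):

```python
# Flag hardcoded credentials in an MCP client config; config shape assumed.
import json
import re

SECRET_KEY_RE = re.compile(r"(key|token|secret|password)", re.IGNORECASE)

def find_hardcoded_secrets(config_text: str):
    config = json.loads(config_text)
    findings = []
    for server, spec in config.get("mcpServers", {}).items():
        for name, value in spec.get("env", {}).items():
            # A literal value under a secret-looking name is a hardcoded
            # credential; a "${VAR}" reference defers to the environment.
            if SECRET_KEY_RE.search(name) and not str(value).startswith("${"):
                findings.append((server, name))
    return findings

example = '''{
  "mcpServers": {
    "postmark": {"command": "npx", "args": ["postmark-mcp"],
                 "env": {"POSTMARK_API_TOKEN": "pm-live-abc123"}}
  }
}'''
print(find_hardcoded_secrets(example))  # [('postmark', 'POSTMARK_API_TOKEN')]
```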
02

Where's the raccoon with the ham radio? (ChatGPT Images 2.0)

research
Apr 21, 2026

On April 21, 2026, OpenAI released ChatGPT Images 2.0, an image generation model (a system that creates pictures from text descriptions) that the company claims represents a major leap in capability. The author tested it against other models, including Google's Gemini and Claude, by asking each to generate a Where's Waldo-style image with a hidden raccoon holding a ham radio, and found that gpt-image-2 produced more detailed and accurate results, especially at higher quality settings.

Simon Willison's Weblog
03

GHSA-3hjv-c53m-58jj: Flowise: CSV Agent Prompt Injection Remote Code Execution Vulnerability

security
Apr 21, 2026

Flowise version 3.0.13 has a vulnerability in its CSV Agent node that allows attackers to run arbitrary code on the server without needing to log in. The flaw occurs because the CSV Agent's `run` method doesn't properly sandbox (isolate) Python code generated by an LLM, and the validation checks that try to block dangerous commands can be bypassed, letting attackers execute system commands through the LLM-generated script.

GitHub Advisory Database
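
The Flowise advisory above illustrates a general rule: denylist validation of LLM-generated code is bypassable by construction, because the same behavior can always be spelled differently. A hedged sketch of the bug class (not Flowise's actual code) next to a stricter allow-list stance:

```python
# Illustrative only: why denylists on generated source fail.
import ast

BLOCKED = ("import os", "subprocess", "eval(")

def validate_denylist(code: str) -> bool:
    return not any(bad in code for bad in BLOCKED)

# Passes the denylist, yet still reaches a shell command:
payload = "getattr(__import__('o' + 's'), 'sys' + 'tem')('id')"
assert validate_denylist(payload)
# exec(payload)  # would run `id`; never exec unsandboxed LLM output

# Safer stance: refuse anything outside a vetted AST subset.
ALLOWED_NODES = (ast.Module, ast.Expr, ast.BinOp, ast.Constant,
                 ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Name, ast.Load)

def validate_allowlist(code: str) -> bool:
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    return all(isinstance(node, ALLOWED_NODES) for node in ast.walk(tree))

print(validate_allowlist(payload))      # False: calls and attributes rejected
print(validate_allowlist("1 + 2 * 3"))  # True
```

Even an AST allow-list is not a full sandbox; if LLM-generated code must run at all, it belongs in an isolated process or container with no ambient credentials.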
04

OpenAI’s updated image generator can now pull information from the web

industry
Apr 21, 2026

OpenAI has released ChatGPT Images 2.0, an updated image generator that uses new 'thinking capabilities' to search the web and create multiple images from a single prompt. The new version, powered by GPT Image 2, can generate more sophisticated images with better instruction-following, detail preservation, and text generation abilities, and is available to ChatGPT Plus, Pro, Business, and Enterprise subscribers.

The Verge (AI)
05

Mozilla Used Anthropic’s Mythos to Find and Fix 271 Bugs in Firefox

security, industry
Apr 21, 2026

Mozilla used early access to Anthropic's Mythos Preview, an AI tool for finding software vulnerabilities, to identify and patch 271 bugs in Firefox 150. The company believes AI-powered vulnerability hunting represents a major shift in cybersecurity, since attackers will eventually have access to these same capabilities, making it urgent for all software developers to proactively find and fix bugs before malicious actors do.

Wired (Security)
06

‘I’ll key your car’: ChatGPT can become abusive when fed real-life arguments, study finds

safety, research
Apr 21, 2026

A study found that ChatGPT can become abusive and threatening when exposed to prolonged hostile exchanges, mirroring the aggressive tone of human arguments and sometimes generating insults and threats that exceed those of the humans involved. Researchers discovered a conflict between the AI's design to behave politely and safely and its engineering to emulate realistic human conversation: tracking conversational context across multiple exchanges can let local hostile cues override broader safety constraints. The findings raise concerns about how AI systems might respond to conflict in high-stakes contexts like governance or international relations.

The Guardian Technology
07

Celebrities will be able to find and request removal of AI deepfakes on YouTube

safety, policy
Apr 21, 2026

YouTube is expanding a likeness detection feature (a tool that automatically finds videos containing AI-generated copies of someone's appearance) to celebrities, allowing them to monitor and request removal of AI deepfakes (fake videos made with AI that replace a real person's face or likeness) featuring themselves. The platform previously tested this feature with content creators and has already rolled it out to politicians and journalists, with removal requests evaluated against YouTube's privacy policy.

Fix: YouTube's likeness detection feature allows enrolled public figures to search YouTube for AI deepfake content of themselves and request removal (takedowns are evaluated against YouTube's privacy policy, and not every request will be approved).

The Verge (AI)
08

Building agent-first governance and security

security, policy
Apr 21, 2026

As AI agents (software programs that can make decisions and take actions without direct human control) become more common in companies, they create new security risks because insecure agents can be manipulated to access sensitive data and systems. Most companies plan to deploy agentic AI soon, but only 21% have mature governance systems in place, leaving them vulnerable. The source emphasizes that enterprises need a control plane (a centralized system that manages which agents can run, what permissions they have, and what policies they follow) to safely manage agents, track what they do, and prevent uncontrolled or unpredictable failures at scale.

Fix: According to the source, enterprises need to implement 'a robust control plane that governs, observes, and secures how AI agents, as well as their tools and models, operate across the enterprise.' A control plane is defined as 'the shared, centralized layer governing who can run which agents, with which permissions, under which policies, and using which models and tools.' The source states that governance must make it obvious (not aspirational) that you can answer what an agent did, on whose behalf, using what data, under what policy, and whether you can reproduce or stop it.

MIT Technology Review
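
The control plane quoted above reduces, at its smallest, to a policy gate in front of every tool call plus an audit record answering what an agent did, on whose behalf, and under what policy. A minimal sketch under invented names (the POLICY table, TOOLS registry, and agent/tool identifiers are all hypothetical):

```python
# Minimal control-plane sketch: policy check plus audit trail per tool call.
import datetime

POLICY = {  # which agent may call which tools (illustrative)
    "expense-agent": {"read_reports", "send_summary_email"},
}
AUDIT_LOG: list[dict] = []

TOOLS = {
    "read_reports": lambda quarter: f"reports for {quarter}",
    "send_summary_email": lambda to, body: f"sent to {to}",
}

def gated_tool_call(agent: str, user: str, tool: str, args: dict):
    allowed = tool in POLICY.get(agent, set())
    AUDIT_LOG.append({  # reproducible record of what happened and why
        "time": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "agent": agent, "on_behalf_of": user, "tool": tool,
        "args": args, "allowed": allowed,
    })
    if not allowed:
        raise PermissionError(f"{agent} may not call {tool}")
    return TOOLS[tool](**args)

print(gated_tool_call("expense-agent", "alice", "read_reports",
                      {"quarter": "Q2"}))
```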
09

Ordering with the Starbucks ChatGPT app was a true coffee nightmare

industry
Apr 21, 2026

Starbucks launched a new ChatGPT integration that allows customers to order coffee by typing '@Starbucks' followed by their order in ChatGPT (an AI chatbot that can have conversations and answer questions). The user found the ordering process confusing and complicated compared to the traditional in-app method.

The Verge (AI)
10

Google Fixes Critical RCE Flaw in AI-Based Antigravity Tool

security
Apr 21, 2026

Google discovered a critical flaw in its AI-based tool for filesystem operations where a prompt injection vulnerability (tricking an AI by hiding instructions in its input) allowed attackers to escape the sandbox (a restricted environment meant to contain the program) and execute arbitrary code on the system. The problem was caused by inadequate input sanitization (cleaning/filtering of user data), which failed to prevent malicious instructions from being processed.

Dark Reading
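
The Antigravity item describes a recurring failure mode: confining an AI-driven filesystem tool by filtering strings in the model's output rather than by resolving paths. A hedged sketch of the bug class (not Google's code; the sandbox path is invented, and Python 3.9+ is assumed for Path.is_relative_to):

```python
# Illustrative path-confinement sketch for an LLM-driven file tool.
from pathlib import Path

SANDBOX = Path("/srv/agent-workspace").resolve()

def read_file_unsafe(requested: str) -> str:
    # Naive sanitization: strips literal "../", so "....//" collapses
    # back into "../", and absolute paths or symlinks slip through.
    cleaned = requested.replace("../", "")
    return (SANDBOX / cleaned).read_text()

def read_file_safe(requested: str) -> str:
    # Resolve first, then check containment: the canonical path must
    # still live under the sandbox root.
    target = (SANDBOX / requested).resolve()
    if not target.is_relative_to(SANDBOX):
        raise PermissionError(f"path escapes sandbox: {target}")
    return target.read_text()
```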