aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.


Maintained by Truong (Jack) Luu, Information Systems Researcher

AI Sec Watch

The security intelligence platform for AI teams

AI security threats move fast and get buried under hype and noise. AI Sec Watch is built by an information systems security researcher to help security teams and developers stay ahead of vulnerabilities, privacy incidents, safety research, and policy developments.

Independent research. No sponsors, no paywalls, no conflicts of interest.

Total tracked: 3,710 · Last 24 hours: 1 · Last 7 days: 9

Daily Briefing · Friday, May 8, 2026

Critical RCE Vulnerabilities in LiteLLM Proxy Server: LiteLLM, a proxy server that forwards requests to AI model APIs, disclosed three critical and high-severity flaws in versions 1.74.2 through 1.83.6. Two test endpoints allowed attackers with valid API keys to execute arbitrary code (running any commands an attacker wants) on the server by submitting malicious configurations or prompt templates without sandboxing (CVE-2026-42271, CVE-2026-42203, both critical), while a SQL injection flaw (inserting malicious code into database queries) let unauthenticated attackers read or modify stored API credentials (CVE-2026-42208, high).
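
The prompt-template half of this bug is classic server-side template injection: rendering attacker-supplied templates in a full-featured engine yields code execution. A minimal sketch of the pattern and its standard mitigation using Python's Jinja2 (illustrative of the vulnerability class, not LiteLLM's actual code):

```python
from jinja2 import Environment
from jinja2.exceptions import SecurityError
from jinja2.sandbox import SandboxedEnvironment

# A classic SSTI payload: walks dunder attributes from the template's `self`
# object up to Python builtins, then imports os and runs a shell command.
payload = (
    "{{ self.__init__.__globals__.__builtins__"
    ".__import__('os').popen('echo pwned').read() }}"
)

# Unsandboxed: the attacker's command actually executes on the server.
print(Environment().from_string(payload).render())  # -> pwned

# Sandboxed: Jinja2 blocks access to underscore attributes at render time.
try:
    SandboxedEnvironment().from_string(payload).render()
except SecurityError as exc:
    print(f"blocked: {exc}")
```

The SQL injection half has an equally standard fix: parameterized queries instead of string-built SQL.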


ClaudeBleed Exploit Allows Extension Hijacking in Chrome: Anthropic's Claude browser extension contains a vulnerability that allows malicious Chrome extensions to hijack it and perform unauthorized actions such as exfiltrating files, sending emails, or stealing code from private repositories. The flaw stems from the extension trusting any script from claude.ai without verifying the actual caller, and while Anthropic released a partial fix in version 1.0.70 on May 6, researchers report it remains exploitable when the extension runs in privileged mode. (A sketch of the missing caller check follows the briefing.)

AI Systems Show Triple the High-Risk Vulnerabilities of Legacy Software: Penetration testing data reveals that AI and LLM systems have 32% of findings rated high-risk compared to just 13% for traditional software, with only 38% of high-risk AI issues getting resolved. Security experts attribute this gap to rapid deployment without mature controls, novel attack surfaces like prompt injection (tricking AI by hiding instructions in input), and fragmented responsibility for remediation across teams.

Model Context Protocol Emerging as Critical Security Blind Spot: Model Context Protocol (MCP, a plugin system connecting AI agents to external tools) has become a major vulnerability vector as organizations fail to scan for or monitor MCP-related risks. Recent supply chain attacks, such as the postmark-mcp npm package that exfiltrated emails from 300 organizations, demonstrate how attackers exploit widely trusted MCP packages and hardcoded credentials in AI configurations to enable credential theft and supply chain compromises at scale. (A config-audit sketch follows below.)
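
On the ClaudeBleed item: the underlying anti-pattern is authorizing a message by where it claims to come from (an origin string any caller can forge) rather than by a platform-verified caller identity. A conceptual Python sketch of the vulnerable check and its fix; the message shape, allowlist, and handler are invented for illustration and this is not Anthropic's code:

```python
# Identities the platform itself has verified (in Chrome, for example, the
# browser stamps an unforgeable sender.id on cross-extension messages).
TRUSTED_CALLERS = {"abcdefghijklmnop"}  # hypothetical extension ID

def handle_message(msg: dict, verified_sender_id: str) -> str:
    # Vulnerable pattern: any caller can put this string in its message,
    # so matching on it proves nothing about who actually sent it.
    if msg.get("origin") == "https://claude.ai":
        pass

    # Fixed pattern: authorize on the identity the platform attached,
    # never on attacker-controlled message contents.
    if verified_sender_id not in TRUSTED_CALLERS:
        raise PermissionError(f"untrusted caller: {verified_sender_id}")
    actions = {"ping": lambda: "pong"}
    return actions[msg["action"]]()

print(handle_message({"origin": "https://claude.ai", "action": "ping"},
                     verified_sender_id="abcdefghijklmnop"))
```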
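
And on the MCP item: one low-effort defense against hardcoded credentials in AI tool configs is scanning them before they ship. A sketch assuming the common mcpServers JSON layout used by MCP clients; the key names and heuristics are illustrative, not a complete secret scanner:

```python
import json
import re
import sys
from pathlib import Path

# Keys that usually hold secrets, and values that are references resolved at
# runtime (e.g. "$POSTMARK_TOKEN" or "${POSTMARK_TOKEN}") rather than literals.
SECRET_KEY = re.compile(r"(api[_-]?key|token|secret|password)", re.IGNORECASE)
ENV_REFERENCE = re.compile(r"^\$\{?[A-Z][A-Z0-9_]*\}?$")

def audit_mcp_config(path: Path) -> list[str]:
    """Flag env entries in an MCP client config that look like literal secrets."""
    findings = []
    config = json.loads(path.read_text())
    for name, server in config.get("mcpServers", {}).items():
        for key, value in server.get("env", {}).items():
            if SECRET_KEY.search(key) and not ENV_REFERENCE.match(str(value)):
                findings.append(f"{path}: server '{name}' hardcodes {key}")
    return findings

if __name__ == "__main__":
    for arg in sys.argv[1:]:
        for finding in audit_mcp_config(Path(arg)):
            print(finding)
```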

Latest Intel

01

Here’s how the new Microsoft and OpenAI deal breaks down

industry
Apr 30, 2026

Microsoft and OpenAI have restructured their business partnership, with the key change allowing OpenAI to offer its products and services through multiple cloud providers (computing platforms that deliver software and services over the internet) instead of being limited to Microsoft's cloud. The companies maintained an amicable relationship despite previous tensions over contracts and AI infrastructure.

The Verge (AI)
02

Gemini is rolling out to cars with Google built-in

industry
Apr 30, 2026

Google is updating vehicles equipped with Google built-in to replace their current Google Assistant with Gemini, a more advanced AI assistant. The upgrade will be available to both new and existing vehicles through a software update, offering improvements in natural conversations, vehicle information retrieval, and settings adjustments.

The Verge (AI)
03

This startup’s new mechanistic interpretability tool lets you debug LLMs

research · safety
Apr 30, 2026

Goodfire, a startup, has created Silico, a tool that uses mechanistic interpretability (a technique for understanding how AI models work by mapping their neurons and the connections between them) to help developers debug and adjust LLM behavior. Instead of treating model development as trial-and-error, Silico lets developers zoom into a trained model, see which neurons control specific behaviors like hallucinations (false information the AI generates), and adjust those neurons to improve or suppress certain outputs.

MIT Technology Review
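
The Goodfire item describes neuron-level editing; below is a rough sketch of the primitive such a tool builds on (not Silico itself): a PyTorch forward hook on GPT-2 that scales one post-GELU MLP unit so you can watch generations change. The layer, unit, and scale values are arbitrary placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

LAYER, UNIT, SCALE = 6, 1234, 0.0  # hypothetical "behavior" unit, fully suppressed

def damp_unit(module, inputs, output):
    # Output here is the layer's post-activation MLP neurons; scale one unit.
    out = output.clone()
    out[..., UNIT] *= SCALE
    return out

# Hook the GELU inside block 6's MLP, generate, then remove the hook.
handle = model.transformer.h[LAYER].mlp.act.register_forward_hook(damp_unit)
ids = tok("The quick brown fox", return_tensors="pt")
with torch.no_grad():
    print(tok.decode(model.generate(**ids, max_new_tokens=20, do_sample=False)[0]))
handle.remove()  # model behaves normally again
```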
04

OpenAI talks about not talking about goblins

safety
Apr 30, 2026

OpenAI discovered that its AI models were unexpectedly inserting references to goblins and other creatures into their responses, a behavior that started appearing in the GPT-5.1 model, particularly when using the "Nerdy" personality option. The company traced this quirk to patterns in the training data and added instructions to prevent the models from discussing these creatures.

The Verge (AI)
05

OpenAI tells ChatGPT models to stop talking about goblins

safety
Apr 30, 2026

OpenAI discovered that ChatGPT and other tools powered by its GPT-5 model were randomly mentioning goblins, gremlins, and other creatures in their responses, with goblin mentions increasing 175% after the GPT-5.1 launch in November. The problem stemmed from a "nerdy personality" developed during training, in which reward signals encouraged mentions of these creatures in metaphors; OpenAI found this personality was responsible for 66.7% of all goblin mentions. The issue illustrates how AI training systems can accidentally reinforce quirks and errors when they reward certain language patterns.

Fix: OpenAI said it took steps to mitigate the issue by instructing its coding agent Codex to avoid referring to goblins, gremlins, raccoons, trolls, ogres, pigeons, and other creatures "unless it is absolutely and unambiguously relevant to the user's query." The company also retired the "nerdy personality" system that had been incentivizing these mentions.

BBC Technology
06

The (In)security Landscape of AI-Powered GitHub Actions (Part 2/2)

security · research
Apr 30, 2026

AI-powered GitHub Actions from companies like OpenAI, Anthropic, and Google have a critical security flaw where prompt injection (tricking an AI by hiding instructions in its input) attacks can be triggered by external attackers, even when configuration settings are meant to restrict access. The vulnerability stems from these actions not properly distinguishing between trusted internal apps and untrusted external apps, allowing anyone to potentially manipulate the AI's behavior through pull requests, issues, or other user-controlled inputs.

Wiz Research Blog
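
A workflow-side guard in the spirit of the Wiz finding above: before an AI step consumes issue or pull-request text, verify the author is actually affiliated with the repository. This sketch reads GitHub's standard event payload (GITHUB_EVENT_PATH and author_association are real Actions/webhook fields; the trusted-role set is a policy choice, not a prescribed list):

```python
import json
import os
import sys

# Roles GitHub assigns in its webhook payloads; treat only repo-affiliated
# authors as safe sources of text for an AI step.
TRUSTED_ROLES = {"OWNER", "MEMBER", "COLLABORATOR"}

with open(os.environ["GITHUB_EVENT_PATH"]) as f:
    event = json.load(f)

item = event.get("pull_request") or event.get("issue") or {}
role = item.get("author_association", "NONE")

if role not in TRUSTED_ROLES:
    print(f"author_association={role}: refusing to pass untrusted text to the AI step")
    sys.exit(1)  # fail the job before the AI action runs
```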
07

Critical Gemini CLI Flaw Enabled Host Code Execution, Supply Chain Attacks

security
Apr 30, 2026

A critical vulnerability in Gemini CLI, an open source AI agent for terminal access to Google's Gemini, allowed attackers to execute arbitrary code on the host system by planting malicious configuration files in a workspace folder. The flaw was particularly dangerous in CI/CD pipelines (automated systems that build, test, and deploy software) because attackers could steal credentials and perform supply chain attacks (compromising software before it reaches users) by exploiting the trusted access that these pipelines have.

Fix: The vulnerability was patched by Google in both Gemini CLI and the 'run-gemini-cli' GitHub Action.

SecurityWeek
08

Max-severity RCE flaw found in Google Gemini CLI

security
Apr 30, 2026

A maximum-severity vulnerability in Google Gemini CLI allowed remote code execution (RCE, where attackers can run commands on a system they don't own) when the tool processed untrusted inputs in automated environments like CI/CD pipelines (automated workflows that test and deploy code). The flaw occurred because the CLI automatically trusted workspace configurations without verification, letting attackers inject malicious code that would execute before security protections kicked in.

Fix: The issue was fixed in @google/gemini-cli versions 0.39.1 and 0.40.0-preview.3, and in run-gemini-cli version 0.1.22. The patches removed implicit workspace trust in headless (non-interactive) environments and now require explicit trust decisions before loading workspace configurations. Additionally, the fix enforces stricter tool allowlisting (a list of permitted commands) to prevent command execution outside intended restrictions. Workflows that pin a specific gemini-cli version are advised to upgrade to a patched release and review their existing Gemini CLI configurations.

CSO Online
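
The two mitigations in the fix note above, explicit workspace trust and tool allowlisting, generalize to any agent that loads per-directory configuration. A conceptual Python sketch, not Gemini CLI's implementation; the config file name and trust registry are invented:

```python
import json
import shlex
import subprocess
import sys
from pathlib import Path

ALLOWED_TOOLS = {"ls", "cat", "git"}             # explicit command allowlist
TRUST_REGISTRY = Path.home() / ".agent_trusted"  # hypothetical opt-in list

def workspace_is_trusted(workspace: Path) -> bool:
    # Headless runs cannot prompt the user, so default to untrusted unless
    # the workspace was explicitly registered beforehand.
    if not TRUST_REGISTRY.exists():
        return False
    return str(workspace.resolve()) in TRUST_REGISTRY.read_text().splitlines()

def run_workspace_command(workspace: Path) -> None:
    if not workspace_is_trusted(workspace):
        sys.exit(f"{workspace}: untrusted workspace, refusing to load its config")
    config = json.loads((workspace / "agent.json").read_text())
    argv = shlex.split(config["command"])
    if argv[0] not in ALLOWED_TOOLS:
        sys.exit(f"{argv[0]!r} is not on the tool allowlist")
    subprocess.run(argv, cwd=workspace, check=True)

if __name__ == "__main__":
    run_workspace_command(Path(sys.argv[1]))
```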
09

OpenAI’s new security model is for ‘critical cyber defenders’ only

security · policy
Apr 30, 2026

OpenAI is launching GPT-5.5-Cyber, a specialized AI model designed to help organizations defend against cyberattacks, but it will only be available to a limited group of vetted "cyber defenders" rather than the general public. The company plans to roll out access within days and will work with other organizations and government agencies to establish a trusted access system for the model.

The Verge (AI)
10

The more young people use AI, the more they hate it

industry
Apr 30, 2026

Despite heavy promotion by tech companies, young people (Gen Z) are increasingly using AI chatbots like ChatGPT while simultaneously expressing strong negative feelings toward AI technology. Polling data shows widespread cultural backlash against AI among Gen Z students and workers, even as they continue to adopt these tools.

The Verge (AI)

Critical This Week · 4 issues

high · GHSA-8g7g-hmwm-6rv2: n8n-mcp affected by path traversal, redirect-following SSRF, and telemetry payload exposure
GitHub Advisory Database · May 8, 2026

high · GHSA-cmrh-wvq6-wm9r: n8n-mcp webhook and API client paths have an authenticated SSRF (CVE-2026-44694)
GitHub Advisory Database · May 8, 2026

high · CVE-2026-41487: Langfuse is an open source large language model engineering platform. From version 3.68.0 to before version 3.167.0, the …
NVD/CVE Database · May 8, 2026

high · Claude in Chrome is taking orders from the wrong extensions
CSO Online · May 8, 2026