aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.


Maintained by Truong (Jack) Luu, Information Systems Researcher

AI Sec Watch

The security intelligence platform for AI teams

AI security threats move fast and get buried under hype and noise. AI Sec Watch was built by an Information Systems Security researcher to help security teams and developers stay ahead of vulnerabilities, privacy incidents, safety research, and policy developments.

Independent research. No sponsors, no paywalls, no conflicts of interest.

Total tracked: 3,710 · Last 24 hours: 1 · Last 7 days: 1
Daily Briefing: Friday, May 8, 2026

Critical RCE Vulnerabilities in LiteLLM Proxy Server: LiteLLM, a proxy server that forwards requests to AI model APIs, disclosed two critical and one high-severity flaw in versions 1.74.2 through 1.83.6. Two test endpoints allowed attackers with valid API keys to execute arbitrary code (running any commands an attacker wants) on the server by submitting malicious configurations or prompt templates without sandboxing (CVE-2026-42271 and CVE-2026-42203, both critical), while a SQL injection flaw (inserting malicious code into database queries) let unauthenticated attackers read or modify stored API credentials (CVE-2026-42208, high).
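The advisory gives no proof of concept, but the template half of this bug class is well understood: rendering attacker-supplied templates with a full-featured engine hands over the interpreter. A minimal sketch using Jinja2 as a stand-in (the briefing does not say which engine LiteLLM uses):

```python
# Illustrative sketch of the "unsandboxed prompt template" bug class.
# The payload is a standard Jinja2 SSTI example; Jinja2 stands in here,
# since the advisory does not name the engine involved.
from jinja2 import Environment
from jinja2.sandbox import SandboxedEnvironment

# A "prompt template" submitted by a key-holding attacker. Jinja2 expressions
# can reach module globals, so rendering this runs a shell command.
payload = "{{ lipsum.__globals__.os.popen('id').read() }}"

print(Environment().from_string(payload).render())  # vulnerable: executes `id`

# Mitigation: a sandboxed environment rejects dunder attribute access.
try:
    SandboxedEnvironment().from_string(payload).render()
except Exception as exc:  # jinja2.exceptions.SecurityError
    print(f"blocked: {exc}")
```

The SQL injection sibling (CVE-2026-42208) is the same lesson one layer down: parameterize queries rather than interpolating attacker-controlled input into them.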

ClaudeBleed Exploit Allows Extension Hijacking in Chrome: Anthropic's Claude browser extension contains a vulnerability that allows malicious Chrome extensions to hijack it and perform unauthorized actions like exfiltrating files, sending emails, or stealing code from private repositories. The flaw stems from the extension trusting any script from claude.ai without verifying the actual caller, and while Anthropic released a partial fix in version 1.0.70 on May 6, researchers report it remains exploitable when the extension runs in privileged mode.

AI Systems Show Triple the High-Risk Vulnerabilities of Legacy Software: Penetration testing data reveals that AI and LLM systems have 32% of findings rated high-risk compared to just 13% for traditional software, with only 38% of high-risk AI issues getting resolved. Security experts attribute this gap to rapid deployment without mature controls, novel attack surfaces like prompt injection (tricking AI by hiding instructions in input), and fragmented responsibility for remediation across teams.

Model Context Protocol Emerging as Critical Security Blind Spot: Model Context Protocol (MCP, a plugin system connecting AI agents to external tools) has become a major vulnerability vector as organizations fail to scan for or monitor MCP-related risks. Recent supply chain attacks, such as the postmark-mcp npm package that exfiltrated emails from 300 organizations, demonstrate how attackers exploit widely-trusted MCP packages and hardcoded credentials in AI configurations to enable credential theft and supply chain compromises at scale.
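The hardcoded-credentials half of this is straightforward to audit. Below is a minimal sketch that flags secret-looking literals in MCP server definitions; the mcpServers/env layout and the claude_desktop_config.json filename follow the common client convention (e.g., Claude Desktop) and are assumptions, so adjust for your client:

```python
# Minimal sketch: flag secret-looking literals in MCP server configs.
# Assumes the common client layout ({"mcpServers": {name: {"env": {...}}}});
# adjust paths and keys for the MCP client you actually run.
import json
import re
from pathlib import Path

SECRET_KEY_RE = re.compile(r"(key|token|secret|password)", re.IGNORECASE)

def find_hardcoded_secrets(config_path: Path) -> list[str]:
    findings = []
    config = json.loads(config_path.read_text())
    for name, server in config.get("mcpServers", {}).items():
        for var, value in server.get("env", {}).items():
            # A literal value under a secret-looking name is a finding;
            # values like "${API_KEY}" delegate to the environment and pass.
            if SECRET_KEY_RE.search(var) and not str(value).startswith("${"):
                findings.append(f"{name}: {var} is hardcoded ({str(value)[:6]}...)")
    return findings

if __name__ == "__main__":
    for path in Path.home().rglob("claude_desktop_config.json"):
        for finding in find_hardcoded_secrets(path):
            print(f"{path}: {finding}")
```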

Latest Intel (page 40 of 371)

01

R-FLoRA: Residual-Statistic-Gated Low-Rank Adaptation for Single-Image Face Morphing Attack Detection

research, security
Apr 23, 2026

Face morphing attacks (blending two faces together to fool facial recognition systems) threaten security systems used at borders and for digital identity checks, and detecting them from a single image is difficult because there's no trusted reference image to compare against. This paper presents R-FLoRA, a new detection method that combines high-frequency image analysis (looking at fine details) with a frozen, large-scale vision transformer (a type of AI model trained on images) to spot morphing artifacts while keeping the overall understanding of the face intact. The method outperforms nine other detection approaches on multiple test datasets and works efficiently in real-world biometric verification systems.
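For readers new to the "low-rank adaptation" in the title: LoRA freezes a pretrained weight matrix and trains only a small low-rank correction on top of it, which is how the paper keeps the overall understanding of the face intact. A generic sketch of the mechanism (R-FLoRA's residual-statistic gating is not reproduced here):

```python
# Generic low-rank adaptation (LoRA) forward pass: y = Wx + (alpha/r) * B(Ax).
# Illustrates the mechanism only; R-FLoRA's gating and detector head are not shown.
import numpy as np

d_out, d_in, r, alpha = 768, 768, 8, 16

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

def lora_forward(x: np.ndarray) -> np.ndarray:
    # Only A and B (2 * r * d parameters) are updated during fine-tuning,
    # so the frozen backbone's representations stay intact.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
print(lora_forward(x).shape)  # (768,)
```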

IEEE Xplore (Security & AI Journals)
02

Chinese Cybersecurity Firm’s AI Hacking Claims Draw Comparisons to Claude Mythos

security, industry
Apr 23, 2026

A Chinese cybersecurity company called 360 Digital Security Group claims to have discovered 1,000 vulnerabilities (weaknesses in software that attackers can exploit) using AI tools, including some found at the Tianfu Cup hacking contest. The article draws comparisons to the claims surrounding Anthropic's Claude Mythos model, suggesting skepticism about the actual capabilities being reported.

SecurityWeek
03

Google gets agent-ready for the Mythos age

security, industry
Apr 23, 2026

Google announced new AI agents and security tools designed to help security teams keep pace with the increasing number of vulnerabilities and cyber threats. The company introduced three new agents embedded in Google Security Operations (for threat hunting, detection engineering, and gathering external intelligence), expanded the Wiz security platform to monitor AI development across multiple clouds, and created tools like AI-BOM (AI bill of materials, an inventory of all AI components used in an organization) and Agent Gateway to secure interactions between AI agents. These moves represent a shift toward automated, agent-based defense rather than relying solely on human analysts.

Fix: Google's announced solutions include: three new AI agents in Google Security Operations for threat hunting and detection engineering (in preview); a threat intelligence enrichment agent (entering preview); expanded Wiz integration supporting AWS, Azure, Databricks, and agent studios like Gemini Enterprise Agent Platform; inline scanning of AI-generated code; AI-BOM for inventorying AI components to address shadow AI; Agent Identity and Agent Gateway for governance and policy enforcement; and deeper Model Armor integrations to mitigate prompt injection (tricking an AI by hiding instructions in its input) and data leakage risks.
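Google has not published a schema for AI-BOM entries, but the idea is simple: one owned, versioned record per AI component in use, so shadow AI surfaces in inventory. A hypothetical record shape, not Google's format:

```python
# Hypothetical AI-BOM record; illustrative only, not Google's AI-BOM format.
# Component names below are made up for the example.
from dataclasses import dataclass, field

@dataclass
class AIBOMRecord:
    component: str                  # model, dataset, MCP server, agent, ...
    component_type: str
    version: str
    provider: str
    owner_team: str                 # who remediates findings for this component
    dependencies: list[str] = field(default_factory=list)

inventory = [
    AIBOMRecord("internal-embedding-model", "model", "001", "vendor", "platform-ml"),
    AIBOMRecord("postmark-mcp", "mcp-server", "1.0.16", "npm", "unknown"),
]

# Shadow AI shows up as components nobody owns.
for record in inventory:
    if record.owner_team == "unknown":
        print(f"unowned AI component: {record.component} ({record.component_type})")
```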

CSO Online
04

Google drafts AI agents to secure systems against AI hackers

security, industry
Apr 23, 2026

Google announced new AI agents and security tools designed to help security teams defend against AI-based attacks, particularly in response to threats like Anthropic Mythos. The company introduced three new agents within Google Security Operations to automate threat detection and response, expanded the Wiz platform to provide visibility across multiple cloud environments and AI development tools, and created new security measures like AI-BOM (a system that catalogs all AI components used in an organization) and Agent Gateway to govern how AI agents interact with each other and enforce security policies.

Fix: Google's explicit mitigations include: (1) Three new AI agents in Google Security Operations for threat hunting, detection engineering, and third-party context enrichment, now in or entering preview; (2) Wiz expansion supporting AWS, Azure, Databricks, AWS Agentcore, Gemini Enterprise Agent Platform, Microsoft Azure Copilot Studio, and Salesforce Agentforce with inline scanning of AI-generated code and AI-BOM inventory; (3) Agent Identity and Agent Gateway for governance and policy enforcement; (4) Deeper integrations for Model Armor to mitigate prompt injection (tricking an AI by hiding instructions in its input) and data leakage; (5) Reworked bot and fraud detection through Google Cloud Fraud Defense to distinguish between humans, bots, and AI agents.

CSO Online
05

Trailmark turns code into graphs

security, research
Apr 23, 2026

Trailmark is an open-source library that converts source code into a queryable call graph (a visual map of how functions and classes connect to each other) that AI systems like Claude can analyze directly. Rather than examining code as flat lists of findings, Trailmark lets AI reason about code structure as a graph, making it better at identifying security risks like whether untrusted input can reach vulnerable code.
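This summary does not show Trailmark's API, but the underlying idea fits in a few lines of standard-library Python: parse a module, record which function makes each call, and you get an edge list you can walk from untrusted entry points. A minimal sketch, not Trailmark's actual interface:

```python
# Minimal call-graph extraction with the standard library (illustrative of
# the idea only; this is not Trailmark's API).
import ast

source = """
def handler(request):
    data = parse(request)
    return render(data)

def parse(request):
    return deserialize(request.body)
"""

def build_call_graph(code: str) -> dict[str, set[str]]:
    graph: dict[str, set[str]] = {}
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.FunctionDef):
            # Collect every simple-name call made inside this function.
            graph[node.name] = {
                call.func.id
                for call in ast.walk(node)
                if isinstance(call, ast.Call) and isinstance(call.func, ast.Name)
            }
    return graph

print(build_call_graph(source))
# handler -> {parse, render}; parse -> {deserialize}
```

Once the edges exist, "can untrusted input reach vulnerable code" becomes an ordinary graph reachability query rather than a manual review of flat findings.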

Trail of Bits Blog
06

Microsoft launches ‘vibe working’ in Word, Excel, and PowerPoint

industry
Apr 23, 2026

Microsoft is releasing Agent Mode (previously called 'vibe working') in Office applications like Word, Excel, and PowerPoint, which is a more advanced version of Copilot (an AI assistant) that can actively perform tasks in documents rather than just answer questions. Previously, the AI models weren't powerful enough to let Copilot directly control applications, so it could only provide passive help like answering user questions.

The Verge (AI)
07

Project Glasswing Proved AI Can Find the Bugs. Who's Going to Fix Them?

security, research
Apr 23, 2026

Anthropic's Project Glasswing uses an AI model called Mythos that is extraordinarily effective at finding software vulnerabilities, discovering bugs that humans missed for decades and even chaining multiple bugs together into working exploits. However, the critical problem is that fewer than 1% of vulnerabilities Mythos finds are actually patched, revealing a massive gap between how fast AI can discover security flaws (machine speed) and how fast human teams can fix them (calendar speed, typically four days per cycle).

The Hacker News
08

GPT-5.5 System Card

safety
Apr 23, 2026

GPT-5.5 is a new AI model from OpenAI designed to handle complex work tasks like coding, research, and document creation with less user guidance than previous models. OpenAI conducted extensive safety testing including red-teaming (simulated attacks by security experts to find vulnerabilities) and feedback from nearly 200 early partners before release, and deployed it with what they describe as their strongest safeguards to date.

OpenAI Blog
09

Introducing GPT-5.5

industry
Apr 23, 2026

OpenAI released GPT-5.5, a more intelligent AI model that can handle complex, multi-step tasks like coding, research, and data analysis with less human guidance than previous versions. The model matches the speed of its predecessor while performing at a higher level and using fewer tokens (individual pieces of text that the AI processes). OpenAI says it tested GPT-5.5 with safety experts and external reviewers before release to reduce misuse risks.

OpenAI Blog
10

Can AI Attack the Cloud? Lessons From Building an Autonomous Cloud Offensive Multi-Agent System

security, research
Apr 23, 2026

Researchers at Palo Alto Networks built an autonomous multi-agent AI system called Zealot to test whether AI could independently perform cloud attacks. The system successfully chained together multiple exploitation techniques (SSRF, credential theft, and data theft) against a test Google Cloud environment, demonstrating that AI acts as a force multiplier for known cloud misconfigurations rather than creating entirely new vulnerabilities.
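The SSRF-to-credential-theft link in that chain is the classic cloud pattern: a server-side fetch of an attacker-supplied URL gets pointed at the metadata service, which returns a live access token. A minimal sketch of the primitive and a guard against it; the metadata URL and required header are GCP's documented API, while fetch_url stands in for any handler that fetches caller-supplied URLs:

```python
# Sketch of the SSRF -> credential-theft step and a guard against it.
# The metadata URL and Metadata-Flavor header are GCP's documented API;
# `fetch_url` stands in for any handler that fetches caller-supplied URLs.
import ipaddress
import socket
import urllib.request
from urllib.parse import urlparse

METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"  # returns an OAuth token in-VM
)

def fetch_url(url: str, forward_headers: dict[str, str]) -> bytes:
    # Guard: resolve the host and refuse internal targets such as
    # 169.254.169.254 (link-local), where cloud metadata services live.
    host = urlparse(url).hostname or ""
    addr = ipaddress.ip_address(socket.gethostbyname(host))
    if addr.is_link_local or addr.is_loopback or addr.is_private:
        raise ValueError(f"refusing internal target {addr}")
    req = urllib.request.Request(url, headers=forward_headers)
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.read()

# Without the guard (and with header forwarding, since GCP requires the
# Metadata-Flavor header), this call exfiltrates live credentials:
# fetch_url(METADATA_URL, {"Metadata-Flavor": "Google"})
```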

Palo Alto Unit 42