aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.


Maintained by

Truong (Jack) Luu

Information Systems Researcher

AI Sec Watch

The security intelligence platform for AI teams

AI security threats move fast and get buried under hype and noise. Built by an Information Systems Security researcher to help security teams and developers stay ahead of vulnerabilities, privacy incidents, safety research, and policy developments.

Independent research. No sponsors, no paywalls, no conflicts of interest.

Total tracked: 3,710 · Last 24 hours: 1 · Last 7 days: 1
Daily Briefing: Friday, May 8, 2026

Critical RCE Vulnerabilities in LiteLLM Proxy Server: LiteLLM, a proxy server that forwards requests to AI model APIs, disclosed three critical and high-severity flaws in versions 1.74.2 through 1.83.6. Two test endpoints allowed attackers with valid API keys to execute arbitrary code (running any commands an attacker wants) on the server by submitting malicious configurations or prompt templates without sandboxing (CVE-2026-42271, CVE-2026-42203, both critical), while a SQL injection flaw (inserting malicious code into database queries) let unauthenticated attackers read or modify stored API credentials (CVE-2026-42208, high).


ClaudeBleed Exploit Allows Extension Hijacking in Chrome: Anthropic's Claude browser extension contains a vulnerability that allows malicious Chrome extensions to hijack it and perform unauthorized actions like exfiltrating files, sending emails, or stealing code from private repositories. The flaw stems from the extension trusting any script from claude.ai without verifying the actual caller, and while Anthropic released a partial fix in version 1.0.70 on May 6, researchers report it remains exploitable when the extension runs in privileged mode.

AI Systems Show Triple the High-Risk Vulnerabilities of Legacy Software: Penetration testing data reveals that AI and LLM systems have 32% of findings rated high-risk, compared to just 13% for traditional software, with only 38% of high-risk AI issues getting resolved. Security experts attribute this gap to rapid deployment without mature controls, novel attack surfaces like prompt injection (tricking AI by hiding instructions in input), and fragmented responsibility for remediation across teams.

Model Context Protocol Emerging as Critical Security Blind Spot: Model Context Protocol (MCP, a plugin system connecting AI agents to external tools) has become a major vulnerability vector as organizations fail to scan for or monitor MCP-related risks. Recent supply chain attacks, such as the postmark-mcp npm package that exfiltrated emails from 300 organizations, demonstrate how attackers exploit widely trusted MCP packages and hardcoded credentials in AI configurations to enable credential theft and supply chain compromise at scale.

Latest Intel

01

The deepfake dilemma: From financial fraud to reputational crisis

security · safety

Apr 15, 2026

Deepfake technology (AI-generated fake audio or video of people) has become cheap, accessible, and realistic enough to fool many employees and executives, with 43% of cybersecurity leaders experiencing audio deepfakes and 37% experiencing video deepfakes in 2025. Deepfakes are now used for financial fraud (by impersonating executives to approve fund transfers) and reputational attacks (by spreading false videos to damage trust with investors and customers), and traditional ways of spotting fakes, like looking for obvious flaws, no longer work reliably.

CSO Online
02

The next evolution of the Agents SDK

industry
Apr 15, 2026

OpenAI introduced new capabilities to the Agents SDK, a toolkit for developers building AI agents that can work with files and run commands on computers. The update includes a model-native harness (a framework optimized for OpenAI models) and native sandbox execution (a controlled, isolated computer environment where agents can safely run code and access files). The SDK aims to bridge the gap between flexibility and production-readiness by providing developers with standardized infrastructure that keeps agents aligned with how frontier models (the most advanced AI models available) work best.

Fix: The Agents SDK includes several built-in protections: 'Separating harness and compute helps keep credentials out of environments where model-generated code executes.' The SDK also supports 'built-in snapshotting and rehydration' so 'the Agents SDK can restore the agent's state in a fresh container and continue from the last checkpoint if the original environment fails or expires.' Additionally, developers can configure sandbox execution with 'Blaxel, Cloudflare, Daytona, E2B, Modal, Runloop, and Vercel' providers, and the SDK provides a 'Manifest abstraction for describing the agent's workspace' to control access to files and data.

OpenAI Blog
03

Mallory Launches AI-Native Threat Intelligence Platform, Turning Global Threat Data Into Prioritized Action

security · industry
Apr 15, 2026

Mallory is a new AI-powered threat intelligence platform (a system that gathers and analyzes information about cyber threats) designed to help security teams quickly understand which threats are actually dangerous to their organization. Instead of overwhelming teams with alerts, the platform analyzes thousands of threat sources, checks them against each company's specific vulnerabilities, and provides prioritized actions that security teams can take immediately.

CSO Online
04

OWASP GenAI Exploit Round-up Report Q1 2026

security
Apr 15, 2026

A Q1 2026 security report by OWASP documents major AI and agentic AI (AI systems that can take autonomous actions) exploits, showing a shift from theoretical risks to real-world attacks targeting AI agent identities, permissions, and supply chains. Key incidents include a Mexican government breach where attackers used Claude to automate reconnaissance and exploitation, affecting 150 GB of sensitive data, along with other incidents involving prompt injection (tricking AI by hiding malicious instructions in its input), privilege abuse, and supply-chain vulnerabilities in AI tools.

OWASP GenAI Security
05

OpenAI Launches GPT-5.4-Cyber with Expanded Access for Security Teams

security · industry
Apr 15, 2026

OpenAI launched GPT-5.4-Cyber, a specialized AI model designed to help security teams find and fix vulnerabilities faster, while expanding access through its Trusted Access for Cyber program to thousands of defenders and hundreds of teams. The company acknowledged that AI models are dual-use tools (meaning they can be repurposed for both good and bad purposes) and that adversaries could potentially reverse-engineer the model to find exploitable vulnerabilities before they're fixed, so OpenAI plans to scale defenses alongside access by strengthening safeguards against jailbreaks (techniques to bypass safety restrictions) and adversarial prompt injections (tricking an AI by hiding malicious instructions in its input).

Fix: OpenAI's stated approach includes: (1) a deliberate, iterative rollout of access to minimize misuse, (2) strengthening safeguards through ongoing work against jailbreaks and adversarial prompt injections as model capabilities advance, and (3) integrating advanced coding models and agentic capabilities (AI systems that can take independent actions to solve problems) into developer workflows to enable immediate feedback during the software development process, shifting security from occasional audits to continuous, ongoing risk reduction.

The Hacker News
06

CVE-2026-39884: mcp-server-kubernetes is a Model Context Protocol server for Kubernetes cluster management. Versions 3.4.0 and prior contain an argument injection vulnerability in the port_forward tool.

security
Apr 15, 2026

mcp-server-kubernetes versions 3.4.0 and earlier have an argument injection vulnerability (a type of attack where an attacker sneaks extra commands into a tool by exploiting how input is processed) in the port_forward tool. The vulnerability exists because the code builds a kubectl command (a tool for managing Kubernetes clusters) by concatenating strings with user input and splitting on spaces, instead of using a safer array-based method like other tools in the codebase. This allows attackers to inject malicious kubectl flags to expose internal services or target resources in unintended ways.
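The unsafe and safe construction patterns the advisory describes can be contrasted in a short sketch (the function names and the injected flag are illustrative, not the actual mcp-server-kubernetes code):

```python
def port_forward_unsafe(resource: str, ports: str) -> list[str]:
    # Vulnerable pattern: concatenate user input into one command string,
    # then split on spaces. A value like "pod/web --address=0.0.0.0"
    # smuggles an extra kubectl flag into the argument list.
    cmd = f"kubectl port-forward {resource} {ports}"
    return cmd.split(" ")

def port_forward_safe(resource: str, ports: str) -> list[str]:
    # Safe pattern: build the argv as an array. Each value stays a single
    # argument no matter what characters it contains, so no flag injection.
    return ["kubectl", "port-forward", resource, ports]

malicious = "pod/web --address=0.0.0.0"
print(port_forward_unsafe(malicious, "8080:80"))
# ['kubectl', 'port-forward', 'pod/web', '--address=0.0.0.0', '8080:80']
print(port_forward_safe(malicious, "8080:80"))
# ['kubectl', 'port-forward', 'pod/web --address=0.0.0.0', '8080:80']
```

Passing the array form to a process launcher such as `subprocess.run` (without `shell=True`) preserves the argument boundary; in the unsafe form the injected `--address` flag would make kubectl listen on all interfaces, exposing the forwarded service.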

Fix: Update to version 3.5.0, which fixes this issue.

NVD/CVE Database
07

Curity looks to reinvent IAM with runtime authorization for AI agents

security · policy
Apr 14, 2026

Traditional identity and access management (IAM) tools, which control who can access systems and resources, were not designed to secure AI agents (autonomous software programs that perform tasks independently), which operate at high speed with unpredictable access patterns. Curity announced Access Intelligence, a new security layer that grants agent permissions at runtime (during execution, not beforehand) and uses OAuth tokens (credentials that allow access to specific resources) to carry information about each agent's purpose, ensuring agents can only access resources matching their intended task.
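The idea of purpose-scoped authorization can be sketched minimally (the claim name, purposes, and resource names here are hypothetical illustrations, not Curity's actual API): the token carries a purpose claim, and the policy layer grants access only to resources mapped to that purpose.

```python
# Hypothetical purpose-to-resource policy; in a real deployment this would
# live in the authorization server, not in application code.
ALLOWED_RESOURCES = {
    "invoice-processing": {"billing-db", "invoice-bucket"},
    "support-triage": {"ticket-queue"},
}

def authorize(token_claims: dict, resource: str) -> bool:
    """Grant access only if the resource matches the agent's stated purpose."""
    purpose = token_claims.get("purpose")
    return resource in ALLOWED_RESOURCES.get(purpose, set())

claims = {"sub": "agent-42", "purpose": "invoice-processing"}
print(authorize(claims, "billing-db"))    # True: matches the agent's purpose
print(authorize(claims, "ticket-queue"))  # False: outside the agent's purpose
```

The point of runtime evaluation is that the decision is made per request against the token's claims, rather than by provisioning a broad static role to the agent ahead of time.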

CSO Online
08

GHSA-7xjm-g8f4-rp26: Giskard has Unsandboxed Jinja2 Template Rendering in ConformityCheck

security
Apr 14, 2026

The `ConformityCheck` class in giskard-checks was automatically treating the `rule` parameter as a Jinja2 template (a template language that evaluates expressions), which could allow arbitrary code execution if check definitions came from untrusted sources. While the library is only used locally by developers, this hidden behavior made it easy to accidentally pass untrusted input without realizing expressions would be evaluated.
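The hazard can be shown with a simplified analog, using Python's `eval` as a stand-in for Jinja2 expression evaluation (this illustrates the general risk pattern, not Giskard's actual code):

```python
def conformity_check_unsafe(rule: str, value: str) -> bool:
    # Hazardous pattern: the rule string is evaluated as an expression,
    # the way a Jinja2 template evaluates {{ ... }} blocks.
    return bool(eval(rule, {"value": value}))

def conformity_check_safe(rule: str, value: str) -> bool:
    # The patched approach in spirit: the rule is plain data, never evaluated.
    return rule == value

# A "rule" from an untrusted source executes arbitrary code when evaluated:
payload = "__import__('os').getcwd() and True"
print(conformity_check_unsafe(payload, "x"))  # True -- the embedded code ran
print(conformity_check_safe(payload, "x"))    # False -- treated as inert text
```

The subtlety the advisory points at is the same in both worlds: callers had no signal that `rule` would be evaluated rather than compared, so untrusted input could slip into an execution path unnoticed.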

Fix: Upgrade to `giskard-checks` >= 1.0.2b1. The patched version removes template rendering from rule evaluation entirely.

GitHub Advisory Database
09

GHSA-rq2q-4r55-9877: Giskard has a Regular Expression Denial of Service (ReDoS) in RegexMatching Check

security
Apr 14, 2026

The RegexMatching check in giskard-checks has a ReDoS vulnerability (regular expression denial of service, where a specially crafted regex pattern causes the regex engine to hang by backtracking excessively through text). An attacker with write access to check definitions can craft malicious regex patterns that make the testing process hang indefinitely, disrupting automated testing environments like CI/CD pipelines (continuous integration/continuous deployment automation).
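A sketch of the failure mode and one common mitigation, under the assumption that matching runs in CPython (whose `re` engine cannot be interrupted by signals mid-match, so bounding match time requires a separate process; the pattern and inputs are illustrative):

```python
import re
from multiprocessing import Process

# Nested quantifiers over the same text -- the classic catastrophic pattern.
EVIL_PATTERN = r"^(a+)+$"

def _try_match(pattern: str, text: str) -> None:
    re.match(pattern, text)

def match_with_timeout(pattern: str, text: str, seconds: float = 1.0) -> str:
    """Attempt a regex match in a child process, killing it if it overruns."""
    proc = Process(target=_try_match, args=(pattern, text))
    proc.start()
    proc.join(seconds)
    if proc.is_alive():
        proc.terminate()
        proc.join()
        return "timed out"
    return "completed"

if __name__ == "__main__":
    # Matching input finishes instantly; a non-matching tail forces the
    # engine to backtrack through ~2^30 quantifier splits and never return.
    print(match_with_timeout(EVIL_PATTERN, "a" * 30))        # completed
    print(match_with_timeout(EVIL_PATTERN, "a" * 30 + "!"))  # timed out
```

Wrapping check execution in a time bound like this keeps a planted pattern from hanging a whole CI/CD run, though validating or rejecting nested-quantifier patterns up front is the stronger control.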

Fix: Upgrade to giskard-checks >= 1.0.2b1.

GitHub Advisory Database
10

Secure AI agent access patterns to AWS resources using Model Context Protocol

security · policy
Apr 14, 2026

AI agents access AWS resources through the Model Context Protocol (MCP, a system that lets AI tools interact with cloud services), but unlike traditional software with predictable behavior, agents can dynamically choose different actions based on context. The main security risk is that agents operate at machine speed and will use any permissions (IAM roles, API keys, or OAuth scopes) they're granted, so misconfigured access controls can cause large-scale damage quickly. The source recommends three security principles for controlling AI agent access to AWS resources, with an emphasis on using MCP servers rather than direct API access because MCP provides better monitoring and control.

Fix: The source recommends architecting agents to use MCP servers rather than direct service access where possible, because MCP servers provide a layer of abstraction that enables differentiation controls and creates additional monitoring capabilities through AWS CloudTrail. For agents on developer machines, developers should configure which AWS credentials the agent uses in their mcp.json file by specifying a named profile (which can use credential helpers and the credential provider chain for short-lived credentials), environment variables, or explicit credential configuration, rather than allowing agents to inherit broad developer admin credentials.
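The credential-pinning advice above might look like this in practice (the server name, package, and profile name are hypothetical; the `mcpServers` layout follows the common mcp.json convention, and `AWS_PROFILE` selects a named profile from the standard AWS credential provider chain):

```json
{
  "mcpServers": {
    "aws-tools": {
      "command": "npx",
      "args": ["-y", "@example/aws-mcp-server"],
      "env": {
        "AWS_PROFILE": "agent-readonly",
        "AWS_REGION": "us-east-1"
      }
    }
  }
}
```

Pinning a least-privilege profile per server this way keeps the agent from silently inheriting whatever admin credentials happen to be active in the developer's shell.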

AWS Security Blog