AI Sec Watch: A Security Intelligence Platform for AI Systems

Luu, T.J.

Digest Archive

Daily BriefingWednesday, April 8, 2026

Anthropic's Claude Mythos Exposes Critical Capability Jump in AI-Powered Vulnerability Discovery: Anthropic announced Project Glasswing featuring Claude Mythos, which has already discovered thousands of high-severity zero-day vulnerabilities (previously unknown security flaws) in major operating systems and browsers and demonstrated autonomous sandbox escape capabilities. The company is restricting access to a small group of major tech organizations because these powerful offensive security capabilities emerged unexpectedly from improvements to coding and reasoning skills rather than explicit training.

Critical Command Injection and Sandbox Escape Vulnerabilities in PraisonAI: PraisonAI has two critical vulnerabilities allowing attackers to inject arbitrary shell commands through YAML workflow files and LLM-generated tool calls via `subprocess.run()` with `shell=True`, and a separate sandbox escape (CVE-2026-39888) where attackers can chain exception handling attributes (`__traceback__`, `tb_frame`, `f_back`, `f_builtins`) to access Python builtins and execute arbitrary code. Both flaws enable complete system compromise through different attack vectors.

Microsoft Releases Agent Governance Toolkit Addressing OWASP AI Agent Risks: Microsoft released an open-source Agent Governance Toolkit that adds runtime security controls (protective software running during execution) to monitor and control AI agents in production, addressing ten major OWASP (Open Worldwide Application Security Project) security risks including prompt injection (tricking AI by hiding instructions in input), goal hijacking, and code execution vulnerabilities. The toolkit provides seven modular components across multiple programming languages and integrates with existing AI frameworks without requiring code rewrites.

LLM-Generated Passwords Structurally Predictable and Cryptographically Weak: Research from Irregular and Kaspersky demonstrates that frontier LLMs (large language models, AI systems trained on massive text) generate passwords that are fundamentally insecure because they assign high probability to plausible next characters based on learned patterns rather than using equal probability across all possibilities like cryptographic random number generators. When Claude Opus 4.6 generated passwords 50 times, only 30 distinct passwords emerged with one appearing 36% of the time, proving models retrieve training data patterns rather than creating true randomness.

Daily BriefingTuesday, April 7, 2026

Flowise AI Platform Under Active Exploitation: Flowise, an open-source AI agent builder, has a critical vulnerability (CVE-2025-59528, CVSS 10.0) in its CustomMCP node that allows attackers to execute arbitrary JavaScript code without validation by simply using an API token. Over 12,000 exposed instances are vulnerable and the flaw is being actively exploited in the wild, potentially leading to full system compromise.

Anthropic Launches Claude Mythos Under Restricted Access: Anthropic released Claude Mythos Preview, an AI model with exceptional vulnerability discovery capabilities that has reportedly identified thousands of high-severity zero-day vulnerabilities in major software including a 27-year-old bug in OpenBSD. The model is only available to over 40 partner organizations through Project Glasswing rather than being publicly released, due to concerns that its exploit development capabilities could be weaponized by attackers.

Daily BriefingMonday, April 6, 2026

Anthropic Claude CLI and SDK Hit with Multiple Critical Command Injection Flaws: Three vulnerabilities (CVE-2026-35022, CVE-2026-35020, CVE-2026-35021) allow attackers to execute arbitrary commands by manipulating authentication helpers, environment variables like TERMINAL, or injecting shell metacharacters (special characters that command interpreters treat as instructions) into file paths. All three affect Anthropic's Claude Code CLI and Agent SDK, enabling credential theft and unauthorized code execution with user privileges.

Broadcom Secures Expanded AI Chip Deals with Google and Anthropic: Broadcom will produce AI chips for Google and provide Anthropic access to approximately 3.5 gigawatts of compute capacity using Google's custom TPUs (tensor processing units, specialized processors designed for AI workloads), reflecting surging infrastructure demand for running generative AI at scale.

Daily BriefingSunday, April 5, 2026

Google Integrates Gemini into Maps for Itinerary Planning: Google has embedded its Gemini AI assistant directly into Google Maps to suggest locations and plan daily itineraries, moving beyond navigation into autonomous planning of user activities.

OpenClaw AI Assistant Surges in China Amid Western Tool Restrictions: OpenClaw, an open-source AI assistant, became widely adopted in China because users can customize it to work with domestic AI models, compensating for blocked access to Western tools like ChatGPT and reflecting Beijing's push for AI self-sufficiency.

Daily BriefingSaturday, April 4, 2026

Anthropic Claude Code Leak Weaponized With Malware: Hackers are reposting Anthropic's accidentally leaked Claude Code source code on GitHub with embedded infostealer malware (software that steals personal information like passwords and credentials). Anthropic has issued copyright takedowns targeting thousands of repositories hosting the leaked code.

Critical Authorization Bypass in Directus CMS Enables File Overwrite: Directus contains a high-severity flaw in its TUS resumable upload endpoint (a feature for uploading files in chunks) that allows any authenticated user to overwrite arbitrary files by specifying their UUID, completely bypassing row-level permissions and enabling potential data destruction or malicious file injection (CVE-2026-35412).

Daily BriefingFriday, April 3, 2026

Major AI Infrastructure Breach Exposes Training Data Secrets: Meta and other AI labs paused work with Mercor (a contractor hiring platform for AI training data) after hackers compromised a version of LiteLLM and stole hundreds of gigabytes of proprietary datasets that could reveal competitive secrets to rivals. The breach, likely by hacking group TeamPCP, affected thousands of organizations.

Critical Auth Bypass in LiteLLM via Cache Collision: LiteLLM's JWT authentication (a method to verify user identity using encoded tokens) could be bypassed because the system only used the first 20 characters of a token as a cache key, allowing attackers to create fake tokens matching legitimate users' cached tokens and gain their permissions. The flaw only affects deployments with JWT/OIDC authentication explicitly enabled. (CVE-2026-35030, Critical)

Daily BriefingThursday, April 2, 2026

Axios npm Package Compromised in Supply Chain Attack: A three-hour supply chain attack injected a RAT (remote access trojan, malware giving attackers shell access and command execution) into axios versions 1.14.1 and 0.30.4 on npm. The @lightdash/cli package could resolve to these compromised versions during installation if users installed versions 0.1800.0 through 0.2695.0 without a lockfile (a file that pins exact dependency versions).

Multiple Critical Vulnerabilities Found in SillyTavern AI Interface: SillyTavern, a locally installed interface for interacting with AI text generation models, had several high-severity path traversal vulnerabilities (flaws that let attackers access files outside intended directories) in versions before 1.17.0. Authenticated attackers could write malicious files, read and delete sensitive files like secrets.json, and unauthenticated users could probe for file existence anywhere on the server (CVE-2026-34522, CVE-2026-34524, CVE-2026-34523).

Daily BriefingWednesday, April 1, 2026

Claude Code Source Leaked via npm Packaging Error: Anthropic confirmed that Claude Code's source code was accidentally leaked through an npm package containing a source map file, exposing nearly 2,000 TypeScript files and over 512,000 lines of code. Users who downloaded the affected version on March 31, 2026 may have received a trojanized HTTP client (compromised software) containing malware.

AI Tool Discovers Zero-Days in Vim and GNU Emacs Within Minutes: Researcher Hung Nguyen used Anthropic's Claude Code to quickly discover zero-day exploits (previously unknown security flaws) in Vim and GNU Emacs that would allow attackers to execute arbitrary code by tricking users into opening malicious files. Claude Code generated proof-of-concept exploits (working examples of attacks) within minutes, demonstrating how AI can accelerate vulnerability discovery.

Daily BriefingTuesday, March 31, 2026

OpenAI Closes Record $122 Billion Funding Round: OpenAI raised $122 billion at an $852 billion valuation with backing from SoftBank, Amazon, and Nvidia, now serving 900 million weekly users and generating $2 billion monthly revenue as it prepares for a potential IPO despite not yet being profitable.

Multiple Critical FastGPT Vulnerabilities Disclosed: FastGPT versions before 4.14.9.5 contain three high-severity flaws including CVE-2026-34162 (unauthenticated proxy endpoint allowing unauthorized server-side requests), CVE-2026-34163 (SSRF vulnerability letting attackers scan internal networks and access cloud metadata), and issues with MCP tools endpoints that accept user URLs without validation.

Daily BriefingMonday, March 30, 2026

Anthropic's Unreleased Cybersecurity Model Accidentally Exposed: A configuration error leaked details of Anthropic's powerful new AI model called Mythos, designed for cybersecurity use cases with advanced reasoning and coding abilities including recursive self-fixing (autonomously finding and patching its own bugs). The leak raises concerns because the model's improved vulnerability detection could enable more sophisticated cyberattacks, prompting Anthropic to plan a phased rollout to enterprise security teams first.

Critical Command Injection in MLflow Model Deployment: MLflow has a command injection vulnerability (where attackers insert malicious commands into input that gets executed) in its model serving code when using `env_manager=LOCAL`, allowing attackers to execute arbitrary commands by manipulating dependency information in the `python_env.yaml` file without any safety checks. (CVE-2025-15379, Critical)

Newer9 / 14Older

Multiple Critical Vulnerabilities in AI Infrastructure: HuggingFace Transformers' Trainer class allows arbitrary code execution via malicious checkpoint files because it calls `torch.load()` without the `weights_only=True` safety parameter (CVE-2026-1839). Text-generation-webui versions before 4.3 contain multiple high-severity flaws including SSRF (server-side request forgery, tricking a server into making requests to unintended locations) in RAG extensions that can steal cloud IAM credentials, and path traversal vulnerabilities allowing unauthenticated file access (CVE-2026-35486, CVE-2026-35485). NVIDIA Triton Inference Server has several denial-of-service vulnerabilities stemming from uncaught exceptions and improper input validation (CVE-2026-24175, CVE-2026-24146, CVE-2026-24173, CVE-2026-24174).

GrafanaGhost Enables Zero-Click Data Theft: A critical vulnerability in Grafana uses indirect prompt injection (tricking an AI by hiding malicious instructions in data it processes) to exfiltrate sensitive enterprise data without requiring authentication or user interaction. The attack chains multiple exploits including bypassing URL validation and AI safety guardrails to force Grafana's AI features into sending confidential information to attacker-controlled servers.

Text-Generation-WebUI Flaw Enables Code Execution via Extension Settings: CVE-2026-35050, a critical vulnerability in the open-source LLM interface text-generation-webui (versions before 4.1.1), allowed attackers to save malicious extension settings as Python files that could overwrite core application files like 'download-model.py' and execute arbitrary code when users attempted model downloads.

Attackers Pivot to 'Living Off the AI Land' Techniques: Threat actors are increasingly abusing legitimate AI services rather than deploying traditional malware, including poisoning MCP servers (tools connecting AI assistants to external services) in supply chains, using platforms like Claude as command-and-control channels (hidden communication pathways for sending instructions to compromised systems), and hijacking AI agents to exfiltrate data or perform destructive actions.

Mobile-MCP Vulnerable to Malicious Intent Execution via Prompt Injection: The mobile_open_url tool in @mobilenext/mobile-mcp fails to validate URL schemes before passing them to Android, allowing attackers to use prompt injection (tricking an AI by hiding instructions in its input) to trigger dangerous actions like unauthorized phone calls, SMS messages, or access to private device data (CVE-2026-35394).

Directus Exposed Sensitive Credentials in Revision History: Directus failed to sanitize sensitive fields including authentication tokens, two-factor secrets, and API keys before storing them in revision history, allowing anyone with database access to read these credentials in plaintext and potentially compromise user accounts or third-party integrations.

MLflow Job Endpoints Skip Authentication Entirely: MLflow has a vulnerability where API endpoints under `/ajax-api/3.0/jobs/*` skip authentication checks even when basic-auth protection is enabled, allowing attackers to submit and run jobs without logging in and potentially achieve RCE (remote code execution, where an attacker can run commands on a system they don't own). (CVE-2026-0545, Critical)

Claude Code Security Bypass After 50 Subcommands: Claude Code skips its security checks for commands with more than 50 subcommands, asking users to approve them without proper safety analysis after the 50th. Attackers could hide malicious commands in legitimate-looking code repositories to steal credentials and compromise software projects.

Mass Credential Theft via Unpatched Next.js Vulnerability: Threat group UAT-10608 is actively exploiting React2Shell (CVE-2025-55182, a pre-authentication RCE flaw in Next.js) to steal credentials at scale, with researchers discovering the attackers' exposed dashboard showing 766 compromised hosts in 24 hours and stolen credentials from AWS, Azure, OpenAI, and GitHub.

Leaked Claude Code Source Exploited to Distribute Malware: After Anthropic's Claude Code (a terminal-based AI agent) was accidentally leaked on March 31, attackers created fake GitHub repositories that deliver Vidar infostealer malware to users searching for the leaked code. The malicious repositories use search engine optimization to appear in Google results and trick users into downloading malware disguised as the leaked source code.

Google Releases Gemma 4 Open-Source AI Models: Google DeepMind released Gemma 4, a family of open-source AI models in four sizes (2B to 31B parameters, where parameters are the trainable weights in a neural network) designed for complex reasoning and agentic workflows (AI systems that can autonomously plan and use tools). The models support multimodal processing (handling text, images, video, and audio), function-calling for tool integration, and context windows up to 256K tokens.

Critical Python Sandbox Escape in PraisonAI: PraisonAI's `execute_code()` function can be bypassed by creating a custom string subclass with an overridden `startswith()` method, allowing attackers to run arbitrary OS commands on the host system (CVE-2026-34938). This is especially dangerous because many deployments auto-approve code execution, so attackers could trigger it silently through indirect prompt injection (sneaking malicious instructions into the AI's input).

Multiple High-Severity Vulnerabilities in ONNX Format: ONNX (Open Neural Network Exchange, a standard format for sharing machine learning models) versions before 1.21.0 contain several high-severity vulnerabilities including path traversal via symlink (CVE-2026-27489, CVSS 8.7) and improper validation allowing attackers to craft malicious models that overwrite internal object properties (CVE-2026-34445). These flaws allow attackers to read arbitrary files outside intended directories or manipulate model behavior.

Claude SDK Filesystem Sandbox Escapes: Both TypeScript (CVE-2026-34451) and Python (CVE-2026-34452) versions of Claude SDK had vulnerabilities in their filesystem memory tools where attackers could use prompt injection or symlinks to access files outside intended sandbox directories, potentially reading or modifying sensitive data they shouldn't access.

Axios npm Supply Chain Attack Impacts Millions: Attackers compromised the npm account of Axios' lead maintainer and published malicious versions containing a remote access trojan (malware that gives attackers control over infected systems), affecting a library downloaded 100 million times per week and used in 80% of cloud environments before being detected and removed within hours.

Claude AI Discovers RCE Bugs in Vim and Emacs: Claude AI helped identify remote code execution vulnerabilities (where attackers can run commands on systems they don't own) in Vim and GNU Emacs text editors that trigger simply by opening a malicious file, exploiting modeline handling in Vim and automatic Git operations in Emacs.

Multiple High-Severity Flaws in AI Agent Frameworks: CrewAI has several vulnerabilities including Docker fallback issues that enable RCE (remote code execution, where attackers run commands on systems they don't control) when containerization fails (CVE-2026-2287, CVE-2026-2275), while OpenClaw suffers from malicious plugin code execution during installation and sandbox bypass flaws that let agents access other agents' workspaces. SakaDev and HAI Build Code Generator can both be tricked through prompt injection (hiding malicious instructions in normal-looking input) to misclassify dangerous terminal commands as safe and execute them automatically (CVE-2026-30306, CVE-2026-30308).

ChatGPT Data Leakage Vulnerability Patched: OpenAI fixed a vulnerability that allowed attackers to secretly extract sensitive user data including conversation messages and uploaded files by exploiting a hidden DNS-based communication channel (covert data transmission using the Domain Name System) in ChatGPT's Linux runtime, bypassing all safety guardrails designed to prevent unauthorized data sharing.