aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Maintained by

Truong (Jack) Luu

Information Systems Researcher

Browse All

All tracked items across vulnerabilities, news, research, incidents, and regulatory updates.

6399 items

Anthropic was the Pentagon's choice for AI. Now it's banned and experts are worried

info news
policy industry
Mar 9, 2026

The U.S. Defense Department banned Anthropic's AI models after a review by Pentagon technology leadership, designating the company a supply chain risk (a classification historically reserved for foreign adversaries) and requiring defense contractors to certify they don't use its technology. The decision surprised many officials who considered Anthropic's models superior and had deployed them in classified military networks, and defense experts worry it sets a troubling precedent while removing a trusted AI vendor that military personnel relied on.

CNBC Technology

GHSA-v359-jj2v-j536: vLLM has SSRF Protection Bypass

medium vulnerability
security
Mar 9, 2026
CVE-2026-25960

vLLM has a bypass in its SSRF (server-side request forgery, where an attacker tricks a server into making requests to unintended targets) protection because the validation layer and the HTTP client parse URLs differently. The validation uses urllib3, which treats backslashes as literal characters, but the actual requests use aiohttp with yarl, which interprets backslashes as part of the userinfo section. An attacker can craft a URL like `https://httpbin.org\@evil.com/` that passes validation for httpbin.org but actually connects to evil.com.
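
The parser mismatch described above suggests a general defensive pattern: refuse any URL that contains characters or components that parsers are known to disagree on, before comparing the host against an allowlist. The following is a minimal sketch of that idea using only Python's standard `urllib.parse`. The function name `is_allowed_url` and the `allowed_hosts` parameter are hypothetical, and this is not vLLM's actual patch.

```python
from urllib.parse import urlsplit


def is_allowed_url(url: str, allowed_hosts: set[str]) -> bool:
    """Return True only if `url` is plain http(s) and targets an allowlisted host.

    Hypothetical helper for illustration; `allowed_hosts` is assumed to hold
    lowercase hostnames.
    """
    # Backslashes and whitespace are interpreted differently by different
    # URL parsers, so reject them outright instead of guessing.
    if "\\" in url or any(ch.isspace() for ch in url):
        return False
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    if parts.scheme not in ("http", "https"):
        return False
    # The userinfo section ("user:pass@host") is where the confusion in the
    # advisory arises, so refuse URLs that carry one.
    if parts.username is not None or parts.password is not None:
        return False
    host = parts.hostname  # lowercased by urlsplit
    return host is not None and host in allowed_hosts
```

With this check, the advisory's example URL `https://httpbin.org\@evil.com/` is rejected because it contains a backslash, while `https://httpbin.org/get` is accepted when `httpbin.org` is allowlisted. A stricter design would also validate with the same parser the HTTP client uses, so that both layers agree on the host.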

Anthropic sues US government for calling it a risk

info news
policy
Mar 9, 2026

Anthropic, an AI company, sued the US government after being labeled a 'supply chain risk' (a designation meaning a company's tools are considered unsafe for government use), which the company says was retaliation for its refusal to remove safety restrictions on military use of AI tools such as Claude. Anthropic argues the government's actions are unlawful and violate its free speech rights, and says it had been negotiating compromises with the Defense Department before the administration publicly criticized the company and directed all agencies to stop using its tools.

Anthropic launches code review tool to check flood of AI-generated code

info news
industry
Mar 9, 2026

Anthropic launched Code Review, an AI tool that automatically checks pull requests (code change submissions for review) to catch bugs and security issues before they enter the codebase. The tool integrates with GitHub, uses multiple AI agents working in parallel to analyze code from different angles, and provides step-by-step explanations of potential problems with color-coded severity levels to help developers prioritize fixes.

OpenAI to buy cybersecurity startup Promptfoo to better safeguard AI agents

info news
industry security

OpenAI acquires Promptfoo to secure its AI agents

info news
security industry

Anthropic is suing the Department of Defense

info news
policy safety

AI firm Anthropic sues US defense department over blacklisting

info news
policy
Mar 9, 2026

Anthropic, an AI company, is suing the US Department of Defense after being labeled a 'supply chain risk' (a designation meaning the government considers the company a potential threat to national security in government contracts). The lawsuit claims this blacklisting is unlawful and violates free speech rights, stemming from a dispute over Anthropic's safety measures designed to prevent the military from using its AI models for mass surveillance or fully autonomous weapons.

Anthropic sues Trump administration over Pentagon blacklist

info regulatory
policy
Mar 9, 2026

Anthropic, an AI company, sued the Trump administration after being blacklisted and designated a supply chain risk (a classification usually reserved for foreign threats), which prevents the Pentagon and its contractors from using the company's AI models. The lawsuit claims the blacklist is unlawful and is causing irreparable harm by canceling government contracts and jeopardizing hundreds of millions of dollars in business. The conflict arose from disagreement over how Anthropic's AI should be used, with the Department of Defense wanting unrestricted access while Anthropic wanted safeguards against fully autonomous weapons and domestic mass surveillance.

Anthropic sues Defense Department over supply chain risk designation

info regulatory
policy
Mar 9, 2026

Anthropic, a company that makes Claude (an AI assistant), is suing the Department of Defense after the agency labeled it a "supply chain risk," which prevents other companies and government agencies from using Anthropic's AI models. The conflict started because Anthropic refused to give the Pentagon unrestricted access to its technology, citing concerns about mass surveillance of Americans and fully autonomous weapons that make targeting decisions without human input. Anthropic argues the DOD's actions violate free speech protections in the Constitution.

X says you can block Grok from editing your photos

info news
safety
Mar 9, 2026

X has added a toggle in its iOS app that claims to block Grok (an AI chatbot) from editing your photos, but the feature has a major limitation. According to the fine print, it only prevents users from tagging @Grok in replies to your images on X, rather than actually stopping Grok from editing your photos.

The Download: murky AI surveillance laws, and the White House cracks down on defiant labs

info news
policy security

Evaluation of Phishing Attacks Targeting Local Systems Using an Attribute-Based Dataset and Machine Learning Methods

info research Peer-Reviewed
research

Practical Differential Fault Attacks on the GPRS Standard Ciphers

info research Peer-Reviewed
security

Defending PoW Blockchains Against Game-Theoretic DoS Attacks: A Rational Strategy Analysis

info research Peer-Reviewed
security

SeVoAuth: Secure Voiceprint Authentication With Hash-Based Feature Transformation

info research Peer-Reviewed
security

Cracks in Collaboration: Threat Models and Attacks on Multi-LLM Collaborative Systems

info research Peer-Reviewed
security

AGFPS: An Automated Gradient-Free Framework for Prompt Stealing

info research Peer-Reviewed
security

Robustness Over Time: Understanding Adversarial Examples’ Effectiveness on Longitudinal Versions of Large Language Models

info research Peer-Reviewed
security

Your Non-Transferable Learning is Fragile: Practical Breach of Protected Models

info research Peer-Reviewed
security
GitHub Advisory Database
BBC Technology

Fix: Anthropic's Code Review tool is the solution presented in the source. It integrates with GitHub and automatically analyzes pull requests, leaving comments on code explaining potential issues and suggested fixes. Engineering leads can enable it to run by default for all team members. The tool focuses on logical errors (not style issues), uses color-coded severity labels (red for highest severity, yellow for potential problems, purple for issues tied to preexisting code), and provides a light security analysis. Additional customized checks can be configured based on internal best practices, with deeper security analysis available through Claude Code Security.

TechCrunch
Mar 9, 2026

OpenAI is acquiring Promptfoo, a cybersecurity startup that provides tools to test and secure AI systems, particularly as AI agents (autonomous programs that can take actions) become more connected to real data and systems. Promptfoo's security tools will be integrated into OpenAI's Frontier platform, and OpenAI will continue supporting Promptfoo's open-source project that helps developers test different AI prompts and compare large language models (AI systems trained on massive amounts of text data).

CNBC Technology
Mar 9, 2026

OpenAI acquired Promptfoo, an AI security startup, to integrate its technology into OpenAI's enterprise platform for protecting AI agents from attacks. Promptfoo develops tools that help companies test security vulnerabilities in LLMs (large language models, the AI systems behind chatbots), addressing growing concerns that autonomous AI agents could be exploited to steal data or manipulate systems.

Fix: According to the source, Promptfoo's technology will be integrated into OpenAI Frontier to perform automated red-teaming (simulated attacks to find weaknesses), evaluate AI workflows for security concerns, and monitor activities for risks and compliance needs. OpenAI also stated it expects to continue building out Promptfoo's open source offering.

TechCrunch (Security)
Mar 9, 2026

Anthropic, a major AI company, is suing the US Department of Defense after being labeled a supply-chain risk (a company whose products or services might pose security threats if compromised). The lawsuit claims the Trump administration retaliated against Anthropic for refusing to remove safety restrictions on its AI systems, particularly regarding mass surveillance and fully autonomous weapons (systems that make lethal decisions without human involvement).

The Verge (AI)
The Guardian Technology
CNBC Technology
TechCrunch
The Verge (AI)
Mar 9, 2026

Current US laws have not kept pace with AI capabilities, creating legal ambiguity around whether the government can conduct mass surveillance on Americans using AI systems. A dispute between the Department of Defense and AI company Anthropic has exposed this gap, with the White House responding by issuing new guidelines requiring AI companies to allow 'any lawful' use of their models, though questions about what is actually lawful remain unanswered.

MIT Technology Review
security
Mar 9, 2026

Phishing is a form of social engineering (tricking people into revealing secrets by pretending to be trustworthy) that lures users to fake websites imitating real ones in order to steal sensitive information. Researchers built a new dataset with 31 attributes (measurable characteristics) derived from URLs and similarity features, then evaluated multiple machine learning algorithms (computer programs that learn patterns from data) on it. Logistic Regression achieved 96.40% accuracy at detecting phishing, suggesting the approach is practical for protecting local systems.

IEEE Xplore (Security & AI Journals)
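
To make the idea of URL-derived attributes in the phishing study above concrete, here is a minimal sketch of a feature extractor. The summary does not list the paper's 31 attributes, so the eight features below and the function name `url_features` are illustrative assumptions, not the authors' feature set. A vector like this is what a classifier such as Logistic Regression would take as input.

```python
import re
from urllib.parse import urlsplit


def url_features(url: str) -> dict[str, float]:
    """Compute a few simple numeric attributes from a URL.

    Illustrative only: these are common lexical URL features, not the
    31 attributes used in the paper.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "url_length": float(len(url)),
        "host_length": float(len(host)),
        "num_dots_in_host": float(host.count(".")),
        "num_hyphens_in_host": float(host.count("-")),
        "has_at_symbol": float("@" in url),
        "uses_https": float(parts.scheme == "https"),
        # Matches the dotted-quad shape only; octet ranges are not checked.
        "host_is_ipv4": float(bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host))),
        "num_digits": float(sum(ch.isdigit() for ch in url)),
    }
```

For example, `url_features("http://192.168.0.1/login")` sets `host_is_ipv4` to 1.0 and `uses_https` to 0.0, two signals often associated with suspicious links.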
Mar 9, 2026

Researchers demonstrated a practical differential fault attack (an exploit that deliberately introduces errors into a system to extract secrets) against GEA-1 and GEA-2, the stream ciphers (algorithms that encrypt data bit-by-bit) used to protect GPRS (General Packet Radio Service, a mobile data standard) communications between phones and base stations. By identifying the exact location where faults occur in the cipher, attackers can recover the 64-bit secret keys in about 16 minutes on a standard laptop. Many current phones still support these outdated ciphers, making them vulnerable.

IEEE Xplore (Security & AI Journals)
Mar 9, 2026

Game-theoretic DoS attacks (GDoS, attacks that exploit miners' financial incentives) can damage proof-of-work blockchains (like Bitcoin, which uses computational puzzles to secure transactions) even when attackers control less than 20% of the network's computing power. Rather than changing the blockchain protocol itself, researchers propose a cooperative defense where miners temporarily move their computing resources to larger mining pools during attacks to maintain their earnings and discourage attackers.

Fix: The source proposes a 'cooperative hash-power hopping mechanism in which miners temporarily reallocate hash power to larger pools when under attack to preserve expected payoffs and suppress attacker incentives.' Simulations show this strategy 'reduces attacker revenue gains by more than 20% and prevents throughput degradation across the entire attack range.' However, this is a theoretical proposal presented in a research paper, not an implemented or deployed mitigation in existing systems.

IEEE Xplore (Security & AI Journals)
research
Mar 9, 2026

SeVoAuth is a cloud-based voiceprint authentication system (a security method that recognizes users by their unique voice characteristics) designed to protect user privacy while defending against replay attacks (replaying a recorded voice), spoofing (faking a voice), and adversarial attacks (manipulating input to fool the system). The system stores a synthesized version of a user's voice in the cloud and uses hash functions (mathematical functions that transform data into fixed-size codes) to change the verification target at each login, making it difficult for attackers to reuse old voice recordings or previously crafted attack inputs.

IEEE Xplore (Security & AI Journals)
research
Mar 9, 2026

Multi-LLM collaborative systems (setups where multiple AI models work together on complex tasks) can be attacked through three new methods: Decision Poisoning Attack (injecting false instructions to manipulate system output), Indirect Echoleak Attack (extracting private information through model interactions), and Information Collision Attack (exploiting communication between models). While these collaborative systems offer flexibility and better reasoning, their internal communication channels create security and privacy vulnerabilities that attackers can exploit.

IEEE Xplore (Security & AI Journals)
research
Mar 9, 2026

AGFPS is a new attack method that steals system prompts (the hidden instructions that control how an LLM behaves) from deployed AI applications by using evolutionary optimization (a technique that mimics natural selection to find solutions) instead of gradient-based methods. The researchers demonstrated that their approach successfully extracted prompts 95.2% of the time and worked better than previous methods, highlighting serious security weaknesses in how LLMs are currently deployed.

IEEE Xplore (Security & AI Journals)
research
Mar 9, 2026

Researchers studied how well different versions of major LLMs (like GPT, Llama, and Qwen) resist adversarial attacks, which are inputs designed to trick AI systems into making mistakes, ignoring safety guidelines, or producing false information. They found that newer versions of these models don't always become more resistant to these attacks, and that simply making models larger doesn't guarantee better security.

IEEE Xplore (Security & AI Journals)
research
Mar 9, 2026

Researchers developed a new attack called Distribution Drift Learner (DDL) that can break through non-transferable learning (NTL, a method that prevents AI models from being adapted to new tasks to protect their intellectual property) by only observing the model's input and output responses. The attack works by manipulating how data is distributed across domains and reconstructing training samples, successfully increasing accuracy on protected models from 10% to 81%, exposing serious weaknesses in current model protection strategies.

IEEE Xplore (Security & AI Journals)