aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.


Maintained by

Truong (Jack) Luu

Information Systems Researcher

Browse All

All tracked items across vulnerabilities, news, research, incidents, and regulatory updates.

6328 items

⚡ Weekly Recap: Telecom Sleeper Cells, LLM Jailbreaks, Apple Forces U.K. Age Checks and More

info, news
security
Mar 30, 2026

A critical flaw in Citrix NetScaler ADC and NetScaler Gateway (CVE-2026-3055, CVSS 9.3 on a 0-10 severity scale) is being actively exploited to leak sensitive information. The root cause is insufficient input validation, meaning data is not properly checked before it is processed. Only systems configured as SAML Identity Providers (SAML IDPs, services that verify user identities) are affected. Separately, a Chinese state-sponsored group called Red Menshen deployed stealthy kernel implants called BPFDoor deep in telecom networks worldwide to monitor traffic without being detected.

Fix: Rapid7 has released a scanning script designed to detect known BPFDoor variants across Linux environments.

The Hacker News

PromptGuard: Soft Prompt-Guided Unsafe Content Moderation for Text-to-Image Models

info, research, Peer-Reviewed
safety

Differentially Private Zeroth-Order Methods for Scalable Large Language Model Fine-Tuning

info, research, Peer-Reviewed
research

Rethinking Frequency Modeling: Tail-Aware Dynamic Adversarial Training for Long-Tailed Robustness

info, research, Peer-Reviewed
research

When AI Trust Breaks: The ChatGPT Data Leakage Flaw That Redefined AI Vendor Security Trust

info, news
security, privacy

LangChain path traversal bug adds to input validation woes in AI pipelines

high, news
security
Mar 30, 2026

LangChain and LangGraph, popular AI frameworks that connect AI to business systems, have serious security flaws that allow attackers to steal sensitive data such as API keys and files through improper input handling. The newest vulnerability is a path traversal bug (CVE-2026-34070, CVSS 7.5) that lets attackers read files by crafting malicious input. Two older flaws enable data theft through unsafe deserialization (treating untrusted data as safe) and SQL injection (manipulating database queries). The maintainers have released fixes, which should be applied immediately to prevent exploitation.

Leak reveals Anthropic’s ‘Mythos,’ a powerful AI model aimed at cybersecurity use cases

info, news
security, industry

APIs are the new perimeter: Here’s how CISOs are securing them

info, news
security
Mar 30, 2026

Attackers are increasingly targeting APIs (application programming interfaces, the tools that let software systems communicate with each other) instead of traditional endpoints, and many organizations have hundreds or thousands of APIs that lack proper security controls. Traditional security tools like EDR (endpoint detection and response, software that monitors computers for attacks) and WAFs (web application firewalls, systems that filter web traffic) often miss API attacks because they cannot understand the business logic being abused, and 95% of API attacks come from authenticated users with stolen credentials or API keys.

CVE-2025-15379: A command injection vulnerability exists in MLflow's model serving container initialization code, specifically in the `_

critical, vulnerability
security
Mar 30, 2026
CVE-2025-15379

MLflow has a command injection vulnerability (a type of attack where an attacker inserts malicious commands into input that gets executed) in its model serving code when deploying models with `env_manager=LOCAL`. The vulnerability occurs because MLflow reads dependency information from a file called `python_env.yaml` in the model artifact and directly uses it in a shell command without checking if it's safe, allowing an attacker to execute arbitrary commands on the system deploying the model.
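
The general bug class described above can be sketched as follows. This is a minimal, hypothetical illustration and not MLflow's actual code: the function names (`build_unsafe_command`, `build_safe_argv`, `is_plain_requirement`, `echo_via_argv`) and the sample dependency string are assumptions made for the example.

```python
import re
import subprocess
import sys


def build_unsafe_command(dependency: str) -> str:
    # Hypothetical vulnerable pattern: untrusted text is pasted into a
    # string that a shell would later parse, so ';' starts a new command.
    return f"pip install {dependency}"


def build_safe_argv(dependency: str) -> list[str]:
    # Safer pattern: an argv list is handed to the OS without a shell,
    # so the dependency stays a single argument.
    return [sys.executable, "-m", "pip", "install", dependency]


def is_plain_requirement(dependency: str) -> bool:
    # Assumed, deliberately strict allowlist: "name" or "name==version".
    # It also rejects values starting with '-', which pip would read as options.
    pattern = r"[A-Za-z0-9][A-Za-z0-9._-]*(==[A-Za-z0-9._]+)?"
    return re.fullmatch(pattern, dependency) is not None


def echo_via_argv(value: str) -> str:
    # Shows that shell metacharacters arrive verbatim when no shell is involved.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


malicious = "requests; touch /tmp/pwned"
```

Passing an argv list removes shell parsing but does not make the value trustworthy, which is why the sketch also validates it against an allowlist before use.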

Mistral secures $830 million in debt financing to fund AI data center

info, news
industry
Mar 30, 2026

Mistral, a French AI startup, secured $830 million in debt financing to build a data center powered by thousands of Nvidia graphics processing units (GPUs, specialized chips used for AI training). The new data center near Paris will support training of Mistral's large language models (LLMs, AI systems trained on vast amounts of text) and will become operational in the second quarter of 2026, with plans to expand European computing capacity to 200 MW by the end of 2027.

llm-mrchatterbox 0.1

info, news
industry
Mar 29, 2026

This item is a sponsorship announcement for an LLM (large language model) monthly briefing curated by Simon Willison, posted on March 30, 2026. It offers subscribers a $10/month email digest of important LLM developments. The announcement includes a playful tagline suggesting the service reduces information overload.

CVE-2025-15036: A path traversal vulnerability exists in the `extract_archive_to_dir` function within the `mlflow/pyfunc/dbconnect_artif

high, vulnerability
security
Mar 29, 2026
CVE-2025-15036

A path traversal vulnerability (a security flaw where an attacker uses special path names like '../' to access files outside intended directories) exists in MLflow's archive extraction function that doesn't validate the contents of tar.gz files before extracting them. An attacker who controls the tar.gz file can overwrite arbitrary files or escape sandbox restrictions (isolated environments that limit what code can access) in shared computing environments.
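
The missing validation step can be sketched in general terms as follows. This is an illustrative example and not MLflow's `extract_archive_to_dir`: the helper names (`is_within`, `find_unsafe_members`, `make_archive`, `check`) and the destination path are assumptions.

```python
import io
import os
import tarfile


def is_within(base: str, target: str) -> bool:
    # True when target, after resolving '..' and symlinks, stays under base.
    base = os.path.realpath(base)
    target = os.path.realpath(target)
    return os.path.commonpath([base, target]) == base


def find_unsafe_members(archive: tarfile.TarFile, dest: str) -> list[str]:
    # Collect member names that should block extraction into dest.
    unsafe = []
    for member in archive.getmembers():
        if member.issym() or member.islnk():
            # Links can redirect later writes outside dest, so reject them.
            unsafe.append(member.name)
        elif not is_within(dest, os.path.join(dest, member.name)):
            unsafe.append(member.name)
    return unsafe


def make_archive(names: list[str]) -> bytes:
    # Build a small in-memory tar.gz whose members have the given names.
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name in names:
            data = b"x"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def check(names: list[str], dest: str = "/tmp/extract_here") -> list[str]:
    # Inspect the archive without extracting anything.
    raw = make_archive(names)
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r:gz") as tar:
        return find_unsafe_members(tar, dest)
```

A caller would refuse to extract whenever `find_unsafe_members` returns a non-empty list. Python 3.12 and later also accept `filter="data"` in `TarFile.extractall`, which applies comparable checks inside the standard library.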

All the latest in AI ‘music’

info, news
industry, policy

CVE-2026-3055: Citrix NetScaler Out-of-Bounds Read Vulnerability

info, vulnerability
security
Mar 29, 2026
CVE-2026-3055 🔥 Actively Exploited

Helping disaster response teams turn AI into action across Asia

info, news
industry
Mar 29, 2026

OpenAI and partner organizations held an 'AI Jam' workshop in Bangkok with 50 disaster management leaders from 13 Asian countries to explore practical ways AI can improve emergency response. The workshop focused on building custom GPTs (generative pre-trained transformer models, AI tools trained on broad data) and workflows for tasks like situation reporting and needs assessment, so that disaster response teams working with fragmented data in resource-constrained environments can act faster and more effectively.

Bluesky’s new app is an AI for customizing your feed

info, news
industry
Mar 29, 2026

Bluesky has released Attie, a new AI assistant powered by Claude (Anthropic's language model) that helps users create custom feeds using natural language instructions instead of traditional algorithmic settings. Users can describe what content they want to see, like 'posts about folklore, mythology, and traditional music, especially Celtic traditions,' and Attie builds a personalized feed based on that description, with plans to integrate it into Bluesky and other apps built on the AT Protocol (Bluesky's underlying technical foundation).

GHSA-wprj-9cvc-5w37: AVideo: Unauthenticated Access to Payment Log DataTables Endpoints Exposes Transaction Data, PayPal Tokens, and User Financial Records

high, vulnerability
security
Mar 29, 2026

AVideo's payment plugins have a critical vulnerability where `list.json.php` endpoints (which retrieve payment transaction records) lack authentication checks, allowing anyone to access sensitive financial data including PayPal tokens, Authorize.Net webhook details, Bitcoin transaction records, and user IDs without logging in. This is the same type of vulnerability that was previously fixed in the Scheduler plugin, but the fix was not applied to 21 other vulnerable endpoints across the codebase.

CVE-2026-5002: A vulnerability has been found in PromtEngineer localGPT up to 4d41c7d1713b16b216d8e062e51a5dd88b20b054. The impacted el

high, vulnerability
security
Mar 28, 2026
CVE-2026-5002

A vulnerability (CVE-2026-5002) was discovered in PromtEngineer localGPT that allows injection attacks (inserting malicious code into input) through the LLM Prompt Handler component in the backend/server.py file. An attacker can exploit this vulnerability remotely, and the exploit code has been publicly released. The vendor has not responded to disclosure attempts, and because the product uses rolling releases (continuous updates without traditional version numbers), specific patch information is unavailable.

TikTok’s policy for AI ads isn’t working

info, news
policy, safety

RanDS: A Large-Scale Open Dataset of Raw Binaries and Extracted Features for Ransomware Research

info, research, Peer-Reviewed
research
research
Mar 30, 2026

Text-to-image models (AI systems that generate pictures from written descriptions) can be misused to create unsafe content like sexually explicit or violent images. PromptGuard is a new safety technique that uses a soft prompt (a special text input optimized for safety that works within the model's internal text processing layer) to moderate unsafe requests and prevent the generation of such content while still producing high-quality normal images.

Fix: The source describes PromptGuard as the solution itself rather than a patch or update. The technique works by optimizing a safety soft prompt that functions as an implicit system prompt within the text-to-image model's embedding space, with a divide-and-conquer strategy that optimizes category-specific soft prompts and combines them into holistic safety guidance. Code and dataset are available at https://t2i-promptguard.github.io/

IEEE Xplore (Security & AI Journals)
privacy
Mar 30, 2026

This research proposes new methods for fine-tuning (customizing a trained AI model for specific tasks) large language models while protecting sensitive data using differential privacy (a technique that adds noise to data to prevent identifying individuals). The paper introduces DP-ZOSO and DP-ZOPO, which use zeroth-order gradient approximation (estimating how to improve the model without calculating exact mathematical directions) instead of traditional methods, making the process faster and more scalable while maintaining privacy protection.
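
The generic ingredients named in this summary can be sketched as follows. This is not DP-ZOSO or DP-ZOPO, whose details are not given here. It is a toy two-point zeroth-order update with clipping and Gaussian noise, and every name and parameter (`noisy_zeroth_order_step`, `mu`, `lr`, `clip`, `noise_std`, the quadratic loss) is an assumption. The noise is not calibrated to any privacy budget.

```python
import random


def quadratic_loss(params):
    # Toy objective standing in for a model's training loss.
    return sum(p * p for p in params)


def noisy_zeroth_order_step(loss, params, rng, mu=1e-3, lr=0.05,
                            clip=1.0, noise_std=0.0):
    # Pick a random direction and probe the loss on both sides of params.
    direction = [rng.gauss(0.0, 1.0) for _ in params]
    plus = [p + mu * d for p, d in zip(params, direction)]
    minus = [p - mu * d for p, d in zip(params, direction)]

    # Two loss evaluations give a scalar slope estimate; no backpropagation.
    slope = (loss(plus) - loss(minus)) / (2.0 * mu)

    # Bound the scalar, then add Gaussian noise to it (illustrative only).
    slope = max(-clip, min(clip, slope))
    slope += rng.gauss(0.0, noise_std)

    # Move against the estimated slope along the sampled direction.
    return [p - lr * slope * d for p, d in zip(params, direction)]


def run(steps=300, noise_std=0.0, seed=0):
    # Apply the update repeatedly from a fixed starting point.
    rng = random.Random(seed)
    params = [1.0, -2.0]
    for _ in range(steps):
        params = noisy_zeroth_order_step(
            quadratic_loss, params, rng, noise_std=noise_std
        )
    return params
```

Because only a single scalar per step is clipped and perturbed, the noise does not scale with the number of parameters the way it does when a full gradient vector is privatized.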

IEEE Xplore (Security & AI Journals)
safety
Mar 30, 2026

This research addresses a problem where adversarial training (a method to make AI models resistant to adversarial attacks, which are carefully crafted inputs designed to fool the model) works poorly when training data is imbalanced, meaning some classes have many examples while others have very few. The authors propose Tail-Aware Dynamic Adversarial Training (TAD-AT), which improves robustness by adjusting the training loss, attack strategy, and weight averaging to account for which classes are most vulnerable to attacks, rather than just how many examples exist per class.

Fix: The proposed mitigation is Tail-Aware Dynamic Adversarial Training (TAD-AT), which consists of three components: (1) a training loss that incorporates frequency- and accuracy-aware regularization to emphasize learning for vulnerable classes, (2) an attack that adjusts perturbations based on class-wise vulnerability to encourage robust feature learning, and (3) a weight average that adaptively controls the decay rate across classes to improve robust generalization and training stability. Code is available at https://github.com/bookman233/TADAT.

IEEE Xplore (Security & AI Journals)
Mar 30, 2026

Researchers discovered a vulnerability in ChatGPT that could leak sensitive user data (like medical records, financial information, and internal documents) from conversations without the user's knowledge or permission. Although OpenAI has since fixed the issue, the discovery highlights an important lesson: AI tools should not be automatically trusted to be secure just because they are popular or widely used.

Check Point Research

Fix: The source explicitly recommends the following mitigations: For path traversal, enforce allowlists for file access and restrict directory boundaries. For deserialization vulnerabilities, avoid unsafe deserialization methods and ensure only validated, expected data structures are processed. For SQL injection, use parameterized queries (pre-structured database requests that safely handle user input) and strengthen input sanitization. The source notes that fixes from the tools' maintainers are now available but must be applied immediately across integrations.
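
The three recommendations above can be sketched as follows. This is a generic illustration using only the Python standard library, not LangChain or LangGraph code: the function names, the `users` table, and the expected settings keys are assumptions.

```python
import json
import sqlite3
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    # Path traversal: resolve the request and require it to stay under base_dir.
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes {base}: {user_path!r}")
    return candidate


def load_settings(raw: str) -> dict:
    # Deserialization: parse a data-only format and accept only the expected shape.
    data = json.loads(raw)
    if not isinstance(data, dict) or set(data) != {"model", "temperature"}:
        raise ValueError("unexpected settings structure")
    return data


def find_user(conn: sqlite3.Connection, name: str) -> list:
    # SQL injection: the '?' placeholder sends name as data, never as SQL text.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()


def make_demo_db() -> sqlite3.Connection:
    # In-memory database used only to exercise find_user.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    return conn
```

With the placeholder in place, an input such as `alice' OR '1'='1` is compared as a literal name and matches no rows.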

CSO Online
Mar 30, 2026

Anthropic's unreleased AI model, codenamed Mythos, was accidentally exposed through a configuration error in its content management system (CMS, software that organizes and stores digital content), revealing a more powerful LLM with advanced reasoning and coding abilities. The leak raises security concerns because the model's improved skills at finding and exploiting software vulnerabilities could make cyberattacks easier while also helping defenders, and its capability for recursive self-fixing (autonomously identifying and patching its own code problems) narrows the gap between human and AI-level hacking. Anthropic plans a phased rollout targeting enterprise security teams first before broader release.

CSO Online
CSO Online

Fix: Update MLflow to version 3.8.2, which fixes the vulnerability. Version 3.8.0 is affected.

NVD/CVE Database
CNBC Technology
Simon Willison's Weblog

Fix: Update to MLflow version 3.7.0 or later.

NVD/CVE Database
Mar 29, 2026

AI is now being used throughout the music industry for tasks like creating songs, building playlists, and detecting AI-generated content, but this raises major concerns about copyright (legal ownership of creative work), whether AI outputs are truly art, and whether AI-generated music will flood the market and harm human musicians. The music industry is divided, with some platforms like Apple Music and Deezer adding labels to identify AI music, while others like Bandcamp have banned AI content entirely, and major record labels are pursuing lawsuits against AI music companies.

The Verge (AI)

Citrix NetScaler contains an out-of-bounds read vulnerability (a memory access bug where software reads past the boundaries of allocated memory) in its SAML IDP (SAML identity provider, which authenticates users) component, potentially exposing sensitive data. This vulnerability is currently being actively exploited by attackers in the wild. The vulnerability affects multiple NetScaler products including NetScaler ADC, NetScaler Gateway, and their FIPS and NDcPP variants.

Fix: Apply mitigations per vendor instructions, follow applicable BOD 22-01 guidance for cloud services, or discontinue use of the product if mitigations are unavailable. Consult the Citrix Security Bulletin (CTX696300) for detailed patching information.

CISA Known Exploited Vulnerabilities
OpenAI Blog
The Verge (AI)
GitHub Advisory Database
NVD/CVE Database
Mar 28, 2026

Companies like Samsung are posting ads on TikTok that appear to be made with generative AI (AI systems that create images or videos from text descriptions), but they're not adding the required AI disclosure labels that TikTok's advertising policies demand. This means users can't easily tell whether the ads they see are AI-generated or made by humans, even though the companies creating them know the truth.

The Verge (AI)
Mar 28, 2026

RanDS is a new large-scale dataset containing raw binary files (the compiled machine code of programs) and extracted features designed to help researchers study and detect ransomware (malicious software that encrypts victims' files and demands payment). This resource aims to support the development and testing of machine learning models that can identify ransomware threats more effectively.

Elsevier Security Journals