aisecwatch.com
DashboardVulnerabilitiesNewsResearchArchiveStatsDatasetFor devs
Subscribe
aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Navigation

VulnerabilitiesNewsResearchDigest ArchiveNewsletter ArchiveSubscribeData SourcesStatisticsDatasetAPIIntegrationsWidgetRSS Feed

Maintained by

Truong (Jack) Luu

Information Systems Researcher

AI Sec Watch

The security intelligence platform for AI teams

AI security threats move fast and get buried under hype and noise. Built by an Information Systems Security researcher to help security teams and developers stay ahead of vulnerabilities, privacy incidents, safety research, and policy developments.

Independent research. No sponsors, no paywalls, no conflicts of interest.

[TOTAL_TRACKED]
3,710
[LAST_24H]
1
[LAST_7D]
1
Daily BriefingSunday, May 17, 2026

No new AI/LLM security issues were identified today.

Latest Intel

page 216/371
VIEW ALL
01

langchain-core==1.2.9

security
Feb 5, 2026

LangChain-core version 1.2.9 includes several bug fixes and improvements, particularly adjusting how the software estimates token counts (the number of units of text an AI processes) when scaling them. The release also reverts a previous change to a hex color regex pattern (a rule for matching color codes) and adds testing improvements.

LangChain Security Releases
02

ChatGPT boss ridiculed for online 'tantrum' over rival's Super Bowl ad

industry
Feb 5, 2026

OpenAI CEO Sam Altman publicly criticized rival company Anthropic on social media for running satirical Super Bowl advertisements that mock the idea of ads in AI chatbots, calling Anthropic 'dishonest' and 'deceptive.' Social media users mocked Altman's lengthy response, comparing it to an emotional outburst, with one tech executive advising him to avoid responding to humor with lengthy written posts.

BBC Technology
03

The Buyer’s Guide to AI Usage Control

securitypolicy
Feb 5, 2026

Most organizations struggle with AI security because they lack visibility and control over where employees actually use AI tools, including shadow AI (unauthorized tools), browser extensions, and AI features embedded in everyday software. Traditional security tools weren't designed to monitor AI interactions at the moment they happen, creating a governance gap where AI adoption has far outpaced security controls. A new approach called AI Usage Control (AUC) is needed to govern real-time AI behavior by tracking who is using AI, through what tool, with what identity, and under what conditions, rather than just detecting data loss after the fact.

The Hacker News
04

What does the disappearance of a $100bn deal mean for the AI economy?

industry
Feb 5, 2026

A reported $100 billion deal between Nvidia (a chipmaker) and OpenAI (the company behind ChatGPT) appears to have collapsed. The deal was a circular arrangement, meaning Nvidia would give OpenAI money that would mostly be spent buying Nvidia's own chips, raising questions about how AI companies will fund their expensive expansion without this agreement.

The Guardian Technology
05

OpenAI Explains URL-Based Data Exfiltration Mitigations in New Paper

securityresearch
Feb 5, 2026

OpenAI published a paper describing new mitigations for URL-based data exfiltration (a technique where attackers trick AI agents into sending sensitive data to attacker-controlled websites by embedding malicious URLs in inputs). The issue was originally reported to OpenAI in 2023 but received little attention at the time, though Microsoft implemented a fix for the same vulnerability in Bing Chat.

Fix: Microsoft applied a fix via a Content-Security-Policy header (a security rule that controls which external resources a webpage can load) in May 2023 to generally prevent loading of images. OpenAI's specific mitigations are discussed in their new paper 'Preventing URL-Based Data Exfiltration in Language-Model Agents', but detailed mitigation methods are not described in this source text.

Embrace The Red
06

CVE-2025-62616: AutoGPT is a platform that allows users to create, deploy, and manage continuous artificial intelligence agents that aut

security
Feb 4, 2026

AutoGPT is a platform for creating and managing AI agents that automate workflows. Before version 0.6.34, the SendDiscordFileBlock feature had an SSRF vulnerability (server-side request forgery, where an attacker tricks the server into making unwanted requests to internal systems) because it didn't filter user-provided URLs before accessing them.

Fix: This issue has been patched in autogpt-platform-beta-v0.6.34. Users should update to this version or later.

NVD/CVE Database
07

Smart AI Policy Means Examining Its Real Harms and Benefits

policysafety
Feb 4, 2026

This article discusses both harms and benefits of AI technologies, arguing that policy should focus on the specific context and impact of each AI use rather than broadly promoting or banning AI. The text warns that AI can automate bias (perpetuating discrimination in decisions about housing, employment, and arrests), consume vast resources, and replace human judgment in high-stakes decisions, while acknowledging beneficial uses like helping scientists analyze data or improving accessibility for people with disabilities.

EFF Deeplinks Blog
08

CVE-2026-25475: OpenClaw is a personal AI assistant. Prior to version 2026.1.30, the isValidMedia() function in src/media/parse.ts allow

security
Feb 4, 2026

OpenClaw, a personal AI assistant, had a vulnerability in its isValidMedia() function (the code that checks if media files are safe to access) that allowed attackers to read any file on a system by using special file paths, potentially stealing sensitive data. This flaw was fixed in version 2026.1.30.

Fix: Update OpenClaw to version 2026.1.30 or later, as the issue has been patched in that version.

NVD/CVE Database
09

Microsoft Develops Scanner to Detect Backdoors in Open-Weight Large Language Models

securityresearch
Feb 4, 2026

Microsoft created a lightweight scanner that can detect backdoors (hidden malicious behaviors) in open-weight LLMs (large language models that have publicly available internal parameters) by identifying three distinctive signals: a specific attention pattern when trigger phrases are present, memorized poisoning data leakage, and activation by fuzzy triggers (partial variations of trigger phrases). The scanner works without needing to retrain the model or know the backdoor details in advance, though it only functions on open-weight models and works best on trigger-based backdoors.

Fix: Microsoft's scanner performs detection through a three-step process: it "first extracts memorized content from the model and then analyzes it to isolate salient substrings. Finally, it formalizes the three signatures above as loss functions, scoring suspicious substrings and returning a ranked list of trigger candidates." The tool works across common GPT-style models and requires access to the model files but no additional model training or prior knowledge of the backdoor behavior.

The Hacker News
10

Detecting backdoored language models at scale

securityresearch
Feb 4, 2026

Researchers have released new work on detecting backdoors (hidden malicious behaviors embedded in a model's weights during training) in open-weight language models to improve trust in AI systems. A backdoored model appears normal most of the time but changes behavior when triggered by a specific input, like a hidden phrase, making detection difficult. The research explores whether backdoored models show systematic differences from clean models and whether their trigger phrases can be reliably identified.

Microsoft Security Blog
Prev1...214215216217218...371Next