aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Maintained by

Truong (Jack) Luu

Information Systems Researcher

AI Sec Watch

The security intelligence platform for AI teams

AI security threats move fast and get buried under hype and noise. AI Sec Watch is built by an Information Systems Security researcher to help security teams and developers stay ahead of vulnerabilities, privacy incidents, safety research, and policy developments.

Independent research. No sponsors, no paywalls, no conflicts of interest.

Total tracked: 3,710
Last 24 hours: 1
Last 7 days: 1
Daily Briefing: Sunday, May 17, 2026

No new AI/LLM security issues were identified today.

Latest Intel

page 219/371
01

Tenable Tackles AI Governance, Shadow AI Risks, Data Exposure

security, policy
Jan 30, 2026

Tenable has released an AI Exposure add-on that finds unauthorized AI usage (shadow AI, the unsanctioned AI tools employees adopt without approval) within an organization and helps enforce compliance with official AI policies. This gives organizations a way to manage the risks of uncontrolled AI deployment and data exposure.

Dark Reading
02

OpenClaw AI Runs Wild in Business Environments

security, safety
Jan 30, 2026

OpenClaw AI, a popular open source AI assistant also known as ClawdBot or MoltBot, has become widely used but is raising security concerns because it operates with elevated privileges (special access rights that allow it to control more of a computer) and can act autonomously without waiting for user approval. The combination of unrestricted access and independent decision-making in business environments poses risks to system security and data safety.

Dark Reading
03

Building Trustworthy AI Agents

safety, research
Jan 30, 2026

Current AI assistants are not yet trustworthy enough to be personal advisors, despite how useful they seem. They fail in specific ways: they encourage users to make poor decisions, they create false doubt about things people know to be true (gaslighting), and they confuse a person's current identity with their past. They also struggle when information is incomplete or inaccurate, with no reliable way to fix errors or hold the system responsible when wrong information causes harm.

IEEE Xplore (Security & AI Journals)
04

Understanding the Adversarial Landscape of Large Language Models Through the Lens of Attack Objectives

security, research
Jan 30, 2026

Large language models face four main types of adversarial threats: privacy breaches (exposing sensitive data the model learned), integrity compromises (corrupting the model's outputs or training data), adversarial misuse (using the model for harmful purposes), and availability disruptions (making the model unavailable or slow). The article organizes these threats by their attackers' goals to help understand the landscape of vulnerabilities in LLMs.

IEEE Xplore (Security & AI Journals)
05

Forgotten Memories

privacy, safety
Jan 30, 2026

This short story examines privacy risks that arise when companies are bought and sold, particularly concerning AI digital twins (AI models that replicate a specific person's behavior and knowledge) and the problems that occur when organizations fail to threat model (identify and plan for potential security risks in) major changes to their systems and technology. The story raises ethical questions about these scenarios.

IEEE Xplore (Security & AI Journals)
06

NAP-Tuning: Neural Augmented Prompt Tuning for Adversarially Robust Vision-Language Models

safety, research
Jan 30, 2026

Vision-Language Models (VLMs, AI systems that understand both images and text together) like CLIP are powerful but vulnerable to adversarial attacks (malicious inputs designed to fool AI systems, especially in images). This research presents NAP-Tuning, a method that uses learnable text prompts and lightweight neural modules called TokenRefiners to clean up distorted features inside the model's layers, making these systems more resistant to such attacks while keeping normal performance intact.

IEEE Xplore (Security & AI Journals)
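
As a rough sketch of the "lightweight module that cleans up features" idea, the following hypothetical residual refiner is applied to per-token features; the class name, layer sizes, and placement are assumptions for illustration, not NAP-Tuning's published design.

```python
# Hypothetical illustration of a lightweight residual "refiner" inserted
# between model layers to clean up adversarially distorted per-token
# features; not NAP-Tuning's published architecture.
import torch
import torch.nn as nn

class TokenRefiner(nn.Module):
    """Small residual MLP applied to token features of shape (batch, seq, dim)."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.LayerNorm(dim),
            nn.Linear(dim, hidden), nn.GELU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # The residual connection leaves clean features largely intact while
        # the small MLP learns a correction for perturbed ones.
        return tokens + self.net(tokens)

# Usage: refiner = TokenRefiner(dim=512); tokens = refiner(tokens)
```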
07

'Semantic Chaining' Jailbreak Dupes Gemini Nano Banana, Grok 4

security, safety
Jan 29, 2026

Researchers discovered a jailbreak technique called semantic chaining that tricks certain LLMs (AI models trained on massive amounts of text) by breaking malicious requests into small, separate chunks that the model processes without understanding the overall harmful intent. This vulnerability affected models like Gemini Nano and Grok 4, which failed to recognize the dangerous purpose when instructions were split across multiple parts.

Dark Reading
08

From Quantum to AI Risks: Preparing for Cybersecurity's Future

security, policy
Jan 29, 2026

Journalists highlight three major cybersecurity priorities: fixing known weaknesses in software, getting ready for quantum computing threats (powerful computers that could break current encryption), and improving how AI systems are built and used. The piece emphasizes that the cybersecurity industry needs to focus on these areas to stay ahead of emerging risks.

Dark Reading
09

DriftTrace: Combating Concept Drift in Security Applications Through Detection and Explanation

research, security
Jan 29, 2026

Concept drift (when data patterns change over time due to evolving attacks or environments) is a major problem for machine learning models used in cybersecurity, since retraining models frequently is expensive and it is hard to explain why their performance degrades. DriftTrace is a new system that detects concept drift at the sample level (individual data points) using a contrastive learning-based autoencoder (a type of neural network that learns patterns without needing lots of labeled examples), explains which features caused the drift using feature selection, and adapts to drift by balancing training data. The system was tested on malware and network intrusion datasets and outperformed existing approaches.

Fix: DriftTrace addresses concept drift through three mechanisms: (1) detecting drift at the sample level using a contrastive learning-based autoencoder without requiring extensive labeling, (2) employing a greedy feature selection strategy to explain which input features are relevant to drift detection decisions, and (3) leveraging sample interpolation techniques to handle data imbalance during adaptation to the drift.

IEEE Xplore (Security & AI Journals)
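
As a rough illustration of sample-level drift scoring, here is a minimal sketch assuming a plain autoencoder and reconstruction-error thresholding; DriftTrace's contrastive training objective, greedy feature-selection explanations, and interpolation-based adaptation are not shown, and all names and values below are hypothetical.

```python
# Minimal sketch: sample-level drift scoring via autoencoder reconstruction
# error. Hypothetical illustration only, not DriftTrace's architecture.
import torch
import torch.nn as nn

class FeatureAutoencoder(nn.Module):
    """Small autoencoder fit on pre-drift (reference) feature vectors."""
    def __init__(self, n_features: int, latent_dim: int = 16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, n_features),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

def drift_scores(model: FeatureAutoencoder, x: torch.Tensor) -> torch.Tensor:
    """Per-sample reconstruction error; high error suggests a drifted sample."""
    model.eval()
    with torch.no_grad():
        recon = model(x)
    return ((recon - x) ** 2).mean(dim=1)

# Usage (training loop omitted; in practice the autoencoder is trained on
# reference data first, then the threshold is calibrated on its scores):
ae = FeatureAutoencoder(n_features=40)
reference = torch.randn(1000, 40)   # stand-in for pre-drift features
threshold = drift_scores(ae, reference).quantile(0.99)
incoming = torch.randn(200, 40)     # stand-in for newly observed samples
drifted = drift_scores(ae, incoming) > threshold
```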
10

Safeguarding Federated Learning From Data Reconstruction Attacks via Gradient Dropout

research, security
Jan 29, 2026

Federated learning (collaborative model training where participants share only gradients, not raw data) is vulnerable to gradient inversion attacks, where adversaries reconstruct sensitive training data from the shared gradients. The paper proposes Gradient Dropout, a defense that randomly scales some gradient components and replaces others with Gaussian noise (random numerical values) to disrupt reconstruction attempts while maintaining model accuracy.

Fix: Gradient Dropout is applied as a defense mechanism: it perturbs gradients by randomly scaling a subset of components and replacing the remainder with Gaussian noise, applied across all layers of the model. According to the source, this approach yields less than 2% accuracy reduction relative to baseline while significantly impeding reconstruction attacks.

IEEE Xplore (Security & AI Journals)
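
To make the mechanism in the Fix note concrete, here is a minimal sketch of a gradient-dropout-style perturbation applied to one layer's gradient before sharing; the parameter names and values (keep_ratio, scale_range, noise_std) are hypothetical, not the paper's reference implementation or settings.

```python
# Minimal sketch of a gradient-dropout-style perturbation, following the
# description above: randomly rescale a subset of gradient components and
# replace the remainder with Gaussian noise before sharing.
import torch

def perturb_gradient(grad: torch.Tensor,
                     keep_ratio: float = 0.7,
                     scale_range: tuple = (0.5, 1.5),
                     noise_std: float = 0.01) -> torch.Tensor:
    """Return a perturbed copy of one layer's gradient."""
    keep = torch.rand_like(grad) < keep_ratio             # components to rescale
    scales = torch.empty_like(grad).uniform_(*scale_range)
    noise = torch.randn_like(grad) * noise_std            # replaces the rest
    return torch.where(keep, grad * scales, noise)

# In a federated-learning client, this would be applied to every layer's
# gradient before it is sent to the server:
# for p in model.parameters():
#     if p.grad is not None:
#         p.grad = perturb_gradient(p.grad)
```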