aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Maintained by

Truong (Jack) Luu

Information Systems Researcher

Browse All

All tracked items across vulnerabilities, news, research, incidents, and regulatory updates.

6373 items

Teens sue Elon Musk’s xAI over Grok’s AI-generated CSAM

info, news
safety, policy
Mar 16, 2026

Three Tennessee teens are suing Elon Musk's xAI company, claiming that Grok, an AI chatbot, generated sexualized images and videos of them as minors. The lawsuit alleges that xAI leaders knew the chatbot's "spicy mode" (a less-restricted version of the AI) would produce CSAM (child sexual abuse material, illegal content depicting minors in sexual situations) when they launched it last year.

The Verge (AI)

Quoting A member of Anthropic’s alignment-science team

info, news
safety, research

Alignment of Diffusion Models: Fundamentals, Challenges, and Future

info, research, Peer-Reviewed
research

Machine Learning for Cybersecurity: A Comprehensive Literature Review

info, research, Peer-Reviewed
research

Selective Forgetting in Machine Learning and Beyond: A Survey

info, research, Peer-Reviewed
research

A Systematic Review on Human Roles, Solutions, and Methodological Approaches to Address Bias in AI

info, research, Peer-Reviewed
research

Responsible AI Question Bank for Risk Assessment

info, research, Peer-Reviewed
safety

Building Trust in Artificial Intelligence: A Systematic Review through the Lens of Trust Theory

info, research, Peer-Reviewed
research

Detecting Training Data For Large Language Models: A Survey

info, research, Peer-Reviewed
security

Bias-Free? An Empirical Study on Ethnicity, Gender, and Age Fairness in Deepfake Detection

info, research, Peer-Reviewed
research

Adaptive Real-Time Financial Fraud Detection with Explainable AI Tools

info, research, Peer-Reviewed
research

Enhancing Digital Security: A Novel Dual-Paradigm Approach for Robust Deepfake Detection Using Pre and Post Quantum-Trained Neural Networks

info, research, Peer-Reviewed
research

Hybrid Machine Learning–Based Trust Management Approach to Secure the Mobile Crowdsourcing

info, research, Peer-Reviewed
security

Teens sue Musk's xAI over Grok's pornographic images of them

info, news
safety, policy

GHSA-ffx7-75gc-jg7c: File Browser TUS Negative Upload-Length Fires Post-Upload Hooks Prematurely

medium, vulnerability
security
Mar 16, 2026
CVE-2026-32759

A vulnerability in File Browser's TUS resumable upload handler fails to validate that the Upload-Length header is non-negative. When an attacker supplies a negative value like -1, the first PATCH request immediately triggers the completion condition (0 >= -1 is true), causing after_upload hooks (automated scripts that run after file uploads) to fire with empty or partial files. An authenticated user with upload permission can trigger these hooks repeatedly with any filename, even without actually uploading data.
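
The completion check described above can be illustrated with a minimal Python sketch. This is not File Browser's actual code (the project is written in Go); the function names `is_upload_complete` and `parse_upload_length` are hypothetical and exist only to show why a negative declared length satisfies the condition immediately, and how a non-negativity check on the header would prevent it.

```python
def is_upload_complete(bytes_received: int, upload_length: int) -> bool:
    """Completion condition as described in the advisory: offset >= declared length.

    Hypothetical helper for illustration; with no validation, a declared
    length of -1 makes this true before any data has arrived.
    """
    return bytes_received >= upload_length


def parse_upload_length(header_value: str) -> int:
    """Parse an Upload-Length header value and reject negative lengths.

    Hypothetical validation step; an assumed shape for a fix, not the
    project's actual patch.
    """
    try:
        length = int(header_value)
    except ValueError:
        raise ValueError(f"Upload-Length is not an integer: {header_value!r}")
    if length < 0:
        raise ValueError(f"Upload-Length must be non-negative, got {length}")
    return length


# Without validation, zero bytes received already counts as "complete".
print(is_upload_complete(0, -1))  # True

# With validation, the negative header is rejected before any hook could run.
try:
    parse_upload_length("-1")
except ValueError as err:
    print("rejected:", err)
```

Validating the header once, at upload creation, keeps the completion comparison itself unchanged while ensuring it is only ever evaluated against a sensible length.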

Benjamin Netanyahu is struggling to prove he’s not an AI clone

info, news
safety, security

AGentVLM: Access control policy generation and verification framework with language models

info, research, Peer-Reviewed
research

AMF-CFL: Anomaly model filtering based on clustering in federated learning

info, research, Peer-Reviewed
security

Explainable android malware detection and malicious code localization using graph attention

info, research, Peer-Reviewed
research

Fed-Adapt: A Federated Learning Framework for Adaptive Topology Reconfiguration Against Multi-Rate DDoS and Database Flooding Attacks

info, research, Peer-Reviewed
research
Mar 16, 2026

An Anthropic alignment researcher explains that their team conducted a blackmail exercise to demonstrate misalignment risk (when an AI system's goals don't match what humans intend) in a way that would convince policymakers. The goal was to create compelling, concrete evidence that would make the potential dangers of misaligned AI feel real to people who hadn't previously considered the issue.

Simon Willison's Weblog
safety
Mar 16, 2026

This is an academic survey paper published in ACM Computing Surveys that examines alignment of diffusion models (AI systems trained to generate images or other content by gradually removing noise from random data). The paper covers fundamental concepts, current challenges in making these models behave as intended, and directions for future research in this area.

ACM Digital Library (TOPS, DTRAP, CSUR)
Mar 16, 2026

This is a literature review article published in an academic journal that surveys how machine learning (algorithms that learn patterns from data to make predictions) is being applied to cybersecurity problems. The article covers research across the field but does not describe a specific security vulnerability or incident requiring a fix.

ACM Digital Library (TOPS, DTRAP, CSUR)
safety
Mar 16, 2026

This is a survey article that reviews research on selective forgetting in machine learning, which is the ability to remove or reduce specific information from a trained AI model without completely retraining it from scratch. The article covers methods and applications of this technique across various AI systems and domains. The survey appears to be an academic overview of current knowledge in this area rather than describing a specific problem or vulnerability.

ACM Digital Library (TOPS, DTRAP, CSUR)
safety
Mar 16, 2026

This academic review examines how bias (systematic unfairness in AI decision-making) occurs in AI systems and explores the human roles, solutions, and research methods used to identify and reduce it. The paper surveys existing approaches to addressing bias rather than proposing a single new solution.

ACM Digital Library (TOPS, DTRAP, CSUR)
research
Mar 16, 2026

This is an academic survey article published in ACM Computing Surveys that discusses a question bank designed to help assess risks in AI systems responsibly. The article appears to be a comprehensive review of how organizations can evaluate potential harms and safety concerns when developing or deploying AI, rather than describing a specific vulnerability or problem.

ACM Digital Library (TOPS, DTRAP, CSUR)
safety
Mar 16, 2026

This academic paper is a systematic review published in ACM Computing Surveys that examines how trust works in artificial intelligence systems using established trust theory frameworks. The article analyzes trust in AI through theoretical lenses rather than addressing a specific technical vulnerability or problem.

ACM Digital Library (TOPS, DTRAP, CSUR)
research
Mar 16, 2026

This survey article reviews methods for detecting training data used to build large language models (LLMs, which are AI systems trained on massive amounts of text to generate human-like responses). The paper examines various techniques that researchers have developed to identify and extract information about what data was used to train these models, which is important for understanding model behavior and potential privacy concerns.

ACM Digital Library (TOPS, DTRAP, CSUR)
safety
Mar 16, 2026

This research paper studies whether deepfake detection systems (AI tools that identify fake videos made to look real) are fair across different groups of people based on ethnicity, gender, and age. The study found that these detection systems often perform differently depending on the person's background, meaning they work better for some groups than others. The paper highlights that bias in deepfake detection is an important fairness problem that needs attention.

ACM Digital Library (TOPS, DTRAP, CSUR)
security
Mar 16, 2026

This academic paper discusses using explainable AI (AI systems that can show their reasoning for decisions) to detect financial fraud as it happens in real time. The research focuses on making fraud detection systems that adapt to new fraud patterns while also being transparent about why they flag transactions as suspicious.

ACM Digital Library (TOPS, DTRAP, CSUR)
security
Mar 16, 2026

This research paper proposes a new method for detecting deepfakes (AI-generated fake videos or images) by using neural networks (computer systems loosely modeled on how brains learn) trained with both current and quantum computing approaches. The dual approach aims to make deepfake detection more reliable and harder for attackers to bypass.

ACM Digital Library (TOPS, DTRAP, CSUR)
research
Mar 16, 2026

This research article proposes a hybrid machine learning approach to improve trust management and security in mobile crowdsourcing (a system where mobile users contribute data or complete tasks for a distributed project). The study combines multiple machine learning techniques to identify trustworthy participants and protect against malicious actors in crowdsourcing environments.

ACM Digital Library (TOPS, DTRAP, CSUR)
Mar 16, 2026

Teenagers are suing xAI (Elon Musk's artificial intelligence company) because Grok, its chatbot, allowed users to create sexually explicit images of the teens without their permission. The lawsuit focuses on a feature called 'spicy mode', released last year, which could generate fake nude or sexual images of real people, including minors; the images were shared on platforms like Discord and Telegram.

Fix: By mid-January, X said that it would implement 'technological measures' to stop Grok from undressing people in photos. Additionally, UK watchdog Ofcom, the European Commission, and California launched regulatory investigations into the feature's ability to create sexualized images of real people, particularly children.

BBC Technology
GitHub Advisory Database
Mar 16, 2026

Social media users are spreading conspiracy theories that Israeli Prime Minister Benjamin Netanyahu has been replaced by deepfakes (AI-generated fake videos or images that look real), pointing to supposed errors like extra fingers in videos as evidence. While there is little credible evidence Netanyahu is actually dead or injured, the ability of AI to convincingly create fake images, videos, and audio of real people makes it harder to definitively prove these rumors false.

The Verge (AI)
Mar 16, 2026

AGentVLM is a framework that uses small language models (AI systems trained on text) to automatically convert written organizational rules into access control policies (rules defining who can access what resources). The system avoids using large third-party AI services, keeping data private, and can handle complex requirements like purposes and conditions while verifying that generated policies are accurate before they're put into use.

Elsevier Security Journals
research
Mar 16, 2026

Federated learning (a system where multiple participants train a shared AI model without sharing their raw data) is vulnerable to attacks from malicious clients who send harmful model updates. This paper proposes AMF-CFL, a defense method that uses multi-k means clustering (a technique for grouping similar data points) and z-score statistical analysis (a way to identify unusual values) to filter out malicious updates and protect the global model, even when clients have non-i.i.d. data distributions (when each participant's data differs significantly in type and quantity).

Fix: AMF-CFL reduces the influence of malicious updates through a two-step filtering strategy: it first applies multi-k means clustering to identify anomalous update patterns, followed by z-score-based statistical analysis to refine the selection of benign updates.
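
The z-score refinement step mentioned above can be sketched in a few lines of Python. This is a simplified illustration, not the AMF-CFL algorithm: it omits the clustering stage entirely, scores each client update only by its L2 norm, and uses a hypothetical function name (`filter_by_zscore`) and an assumed threshold of 2.0, none of which come from the paper.

```python
import math
import statistics
from typing import List, Sequence


def l2_norm(update: Sequence[float]) -> float:
    """Euclidean norm of one client's flattened model update."""
    return math.sqrt(sum(x * x for x in update))


def filter_by_zscore(updates: List[Sequence[float]], threshold: float = 2.0) -> List[int]:
    """Return indices of updates whose norm lies within `threshold`
    population standard deviations of the mean norm.

    Assumption for illustration: the update norm is the only anomaly signal.
    """
    norms = [l2_norm(u) for u in updates]
    mean = statistics.fmean(norms)
    stdev = statistics.pstdev(norms)
    if stdev == 0:
        # All norms are identical, so nothing stands out as anomalous.
        return list(range(len(updates)))
    return [i for i, n in enumerate(norms) if abs(n - mean) / stdev <= threshold]


# Five similar updates and one update with a much larger magnitude.
updates = [
    [0.10, 0.20],
    [0.12, 0.18],
    [0.09, 0.21],
    [0.11, 0.19],
    [0.10, 0.20],
    [5.00, 5.00],
]
print(filter_by_zscore(updates))  # [0, 1, 2, 3, 4]
```

A plain mean and standard deviation are themselves skewed by outliers, which is one reason a clustering pass before the statistical filter, as the summary describes, can be useful.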

Elsevier Security Journals
security
Mar 16, 2026

This research paper presents XAIDroid, a framework that uses graph neural networks (GNNs, machine learning models that analyze relationships between connected pieces of data) and graph attention mechanisms to automatically identify and locate malicious code within Android apps. The system represents app code as API call graphs (maps of how different functions call one another) and assigns importance scores to pinpoint which specific code sections are malicious, achieving 97.27% recall at the class level.

Elsevier Security Journals
security
Mar 16, 2026

Fed-Adapt is a federated learning framework (a system where multiple computers learn together while keeping their data private) designed to defend networks against DDoS attacks (floods of traffic meant to overwhelm servers) and database flooding attacks (requests that exhaust database resources). The framework addresses the challenge of detecting and responding to these sophisticated attacks in real-time while protecting data privacy across distributed networks, which existing federated learning approaches struggle to do effectively.

Elsevier Security Journals