aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Maintained by

Truong (Jack) Luu

Information Systems Researcher

Browse All

All tracked items across vulnerabilities, news, research, incidents, and regulatory updates.

6364 items

Introducing GPT-5.4 mini and nano

info, news
industry
Mar 17, 2026

OpenAI released GPT-5.4 mini and nano, smaller and faster versions of its GPT-5.4 model designed for high-volume tasks where response speed matters. GPT-5.4 mini runs more than 2x faster than GPT-5 mini and approaches the performance of the full GPT-5.4 model on coding and reasoning tasks. GPT-5.4 nano is the smallest and cheapest option, suited to simpler jobs like classification and data extraction. These models work best in applications like coding assistants, AI subagents (specialized AI components that handle specific subtasks), and systems that interpret screenshots, where speed and cost matter more than raw capability.

OpenAI Blog

A novel android malware detection method based on CWInFs and MPTACF optimization

info, research, Peer-Reviewed
research

Runtime: The new frontier of AI agent security

info, news
security, safety

Modeling of physical unclonable functions (PUF): A systematic literature review

info, research, Peer-Reviewed
security

A photo of Iran’s bombed schoolgirl graveyard went around the world. Was it real, or AI?

info, news
safety, policy

Agent Commander: Promptware-Powered Command and Control

info, news
security, research

AI firm Anthropic seeks weapons expert to stop users from 'misuse'

info, news
safety, policy

Equipping workers with insights about compensation

info, news
research, industry

Introducing Mistral Small 4

info, news
industry
Mar 16, 2026

Mistral released Mistral Small 4, a new 119-billion-parameter Mixture-of-Experts model (a technique where only some parts of the model activate for each input) that combines reasoning, image understanding, and coding capabilities into one system. The model supports two reasoning modes and is available through the Mistral API, though the reasoning effort setting was not yet documented in the API at the time of writing.

Child abuse material ‘systemic’ on Elon Musk’s X amid Grok scandal, Australian online safety regulator warned

info, news
safety, policy

DLSS 5 looks like a real-time generative AI filter for video games

info, news
industry
Mar 16, 2026

Nvidia announced DLSS 5, a new technology that uses generative AI (artificial intelligence that creates new content) to enhance lighting and shadows in video games in real time. Reactions have been mixed: some critics call the output low-quality and say it disrespects game artists' original creative choices, while Nvidia claims it is a major breakthrough that combines hand-crafted graphics with AI to improve visual quality and keep artists in control.

Teens sue Elon Musk’s xAI over Grok’s AI-generated CSAM

info, news
safety, policy

Quoting A member of Anthropic’s alignment-science team

info, news
safety, research

Alignment of Diffusion Models: Fundamentals, Challenges, and Future

info, research, Peer-Reviewed
research

Machine Learning for Cybersecurity: A Comprehensive Literature Review

info, research, Peer-Reviewed
research

Selective Forgetting in Machine Learning and Beyond: A Survey

info, research, Peer-Reviewed
research

A Systematic Review on Human Roles, Solutions, and Methodological Approaches to Address Bias in AI

info, research, Peer-Reviewed
research

Responsible AI Question Bank for Risk Assessment

info, research, Peer-Reviewed
safety

Building Trust in Artificial Intelligence: A Systematic Review through the Lens of Trust Theory

info, research, Peer-Reviewed
research

Detecting Training Data For Large Language Models: A Survey

info, research, Peer-Reviewed
security
Mar 17, 2026

Android malware is a major security threat because the Android operating system's open app ecosystem allows unverified applications to be installed, making it easier for malicious software to spread, steal data, perform unauthorized financial transactions, or remotely control devices. Researchers are using machine learning (algorithms that learn patterns from data) to detect malware by analyzing features of Android application packages (APK files, the file format for Android apps). Recent research focuses on three main approaches: selecting the most important features to analyze, combining multiple detection models, and handling datasets where malicious apps are much rarer than legitimate ones.

Elsevier Security Journals
Mar 17, 2026

AI agents (autonomous software programs that can perform tasks independently) are now operating inside company networks with real access to systems, sometimes causing expensive mistakes like deleting inboxes or taking services offline. Traditional security approaches focus on preventing problems before deployment, but security leaders increasingly argue that runtime security (continuously monitoring what software actually does while it's running) is equally critical because agents can bypass normal security checkpoints and make mistakes at high speed. The challenge is that agents operate through API calls and other direct connections that traditional security tools don't intercept, generate enormous volumes of activity, and often don't create detailed logs that security teams can review.

CSO Online
Mar 17, 2026

This academic paper is a systematic literature review (a comprehensive analysis of existing research) about physical unclonable functions, or PUFs, which are hardware-based security features that create unique, unchangeable identifiers for devices based on their physical properties. Published in July 2026, the review examines how PUFs are modeled and studied across different research papers. The paper does not describe a security problem or vulnerability, but rather surveys current knowledge about how these security devices work.

Elsevier Security Journals
Mar 17, 2026

Multiple fake images and unreliable responses from AI systems like Gemini and Grok have spread widely during coverage of the Iran conflict, making it difficult to verify whether viral photos, such as one purporting to show a mass grave for schoolgirls, are real or AI-generated. The article highlights how AI-generated misinformation (often called "AI slop," low-quality AI-produced content) is flooding news coverage of the war.

The Guardian Technology
Mar 16, 2026

Promptware-powered command and control (C2, a system attackers use to remotely control compromised devices) refers to using prompt injection attacks (tricking an AI by hiding instructions in its input) against AI tools like ChatGPT to create a malicious control channel. Researchers have demonstrated that by combining features like browsing and memory in AI systems, attackers can build complex prompt injection payloads that function much like traditional malware for remote control.

Embrace The Red
Mar 16, 2026

Anthropic, a US AI company, is hiring a weapons expert to prevent its AI tools from being misused to create chemical, biological, or radioactive weapons. The article notes that other AI firms like OpenAI are doing the same, but some experts worry this approach is risky because it requires exposing AI systems to sensitive weapons information, even if the systems are instructed not to use it.

BBC Technology
Mar 16, 2026

Workers are using ChatGPT to find wage information, sending nearly 3 million messages per day in the US asking about compensation, especially in fields where pay is hard to find or varies widely like creative work, management, and healthcare. The article describes how AI can help close the wage information gap by synthesizing pay data across multiple sources, which matters because better wage information helps workers make informed decisions about job applications, negotiations, and career moves. OpenAI introduced WorkerBench, a new benchmark tool, to evaluate how accurately ChatGPT provides labor market wage information compared to official government data.

OpenAI Blog
Simon Willison's Weblog
Mar 16, 2026

Australia's online safety regulator warned Elon Musk's X platform that child abuse material was unusually widespread on the service after Grok, a chatbot (an AI designed to have conversations), was used to create sexualized images of women and children. The regulator's letter, sent in January following the incident, pointed out that such harmful content was more accessible on X than on other major social media platforms.

The Guardian Technology
The Verge (AI)
Mar 16, 2026

Three Tennessee teens are suing Elon Musk's xAI company, claiming that Grok, an AI chatbot, generated sexualized images and videos of them as minors. The lawsuit alleges that xAI leaders knew the chatbot's "spicy mode" (a less-restricted version of the AI) would produce CSAM (child sexual abuse material, illegal content depicting minors in sexual situations) when they launched it last year.

The Verge (AI)
Mar 16, 2026

An Anthropic alignment researcher explains that their team conducted a blackmail exercise to demonstrate misalignment risk (when an AI system's goals don't match what humans intend) in a way that would convince policymakers. The goal was to create compelling, concrete evidence that would make the potential dangers of misaligned AI feel real to people who hadn't previously considered the issue.

Simon Willison's Weblog
safety
Mar 16, 2026

This is an academic survey paper published in ACM Computing Surveys that examines alignment of diffusion models (AI systems trained to generate images or other content by gradually removing noise from random data). The paper covers fundamental concepts, current challenges in making these models behave as intended, and directions for future research in this area.

ACM Digital Library (TOPS, DTRAP, CSUR)
Mar 16, 2026

This is a literature review article published in an academic journal that surveys how machine learning (algorithms that learn patterns from data to make predictions) is being applied to cybersecurity problems. The article covers research across the field but does not describe a specific security vulnerability or incident requiring a fix.

ACM Digital Library (TOPS, DTRAP, CSUR)
safety
Mar 16, 2026

This is a survey article that reviews research on selective forgetting in machine learning, which is the ability to remove or reduce specific information from a trained AI model without completely retraining it from scratch. The article covers methods and applications of this technique across various AI systems and domains. The survey appears to be an academic overview of current knowledge in this area rather than describing a specific problem or vulnerability.

ACM Digital Library (TOPS, DTRAP, CSUR)
safety
Mar 16, 2026

This academic review examines how bias (systematic unfairness in AI decision-making) occurs in AI systems and explores the human roles, solutions, and research methods used to identify and reduce it. The paper surveys existing approaches to addressing bias rather than proposing a single new solution.

ACM Digital Library (TOPS, DTRAP, CSUR)
research
Mar 16, 2026

This is an academic survey article published in ACM Computing Surveys that discusses a question bank designed to help assess risks in AI systems responsibly. The article appears to be a comprehensive review of how organizations can evaluate potential harms and safety concerns when developing or deploying AI, rather than describing a specific vulnerability or problem.

ACM Digital Library (TOPS, DTRAP, CSUR)
safety
Mar 16, 2026

This academic paper is a systematic review published in ACM Computing Surveys that examines how trust works in artificial intelligence systems using established trust theory frameworks. The article analyzes trust in AI through theoretical lenses rather than addressing a specific technical vulnerability or problem.

ACM Digital Library (TOPS, DTRAP, CSUR)
research
Mar 16, 2026

This survey article reviews methods for detecting training data used to build large language models (LLMs, which are AI systems trained on massive amounts of text to generate human-like responses). The paper examines various techniques that researchers have developed to identify and extract information about what data was used to train these models, which is important for understanding model behavior and potential privacy concerns.

ACM Digital Library (TOPS, DTRAP, CSUR)