aisecwatch.com
DashboardVulnerabilitiesNewsResearchArchiveStatsDataset
Subscribe
aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Navigation

VulnerabilitiesNewsResearchDigest ArchiveNewsletter ArchiveSubscribeData SourcesStatisticsDatasetAPIIntegrationsWidgetRSS Feed

Maintained by

Truong (Jack) Luu

Information Systems Researcher

Research

Academic papers, new techniques, benchmarks, and theoretical findings in AI/LLM security.

to
Export CSV
267 items

OWASP GenAI Exploit Round-up Report Q1 2026

highresearchIndustry
security
Apr 15, 2026

A Q1 2026 security report by OWASP documents major AI and agentic AI (AI systems that can take autonomous actions) exploits, showing a shift from theoretical risks to real-world attacks targeting AI agent identities, permissions, and supply chains. Key incidents include a Mexican government breach where attackers used Claude to automate reconnaissance and exploitation, affecting 150 GB of sensitive data, along with other incidents involving prompt injection (tricking AI by hiding malicious instructions in its input), privilege abuse, and supply-chain vulnerabilities in AI tools.

OWASP GenAI Security

Generalizability of Large Language Model-Based Agents: A Comprehensive Survey

inforesearchPeer-Reviewed
research

Cybersecurity in the quantum era: Assessing the impact of quantum computing on infrastructure

inforesearchPeer-Reviewed
security

An Encoding-Based Detection Approach for Stealthy FDI Attacks via Dimensional Transformation of Measurement Data

inforesearchPeer-Reviewed
security

Towards efficient malicious-secure multi-party private set union: Harnessing trusted execution environments

inforesearchPeer-Reviewed
security

A Formal Lens on Android Permissions System: Modeling, Verification, and Exploitation Using LLMs and Model Checking

inforesearchPeer-Reviewed
security

Exploring Visual Explanations for Defending Federated Learning against Poisoning Attacks: Enhancing LayerCAM with Autoencoders

inforesearchPeer-Reviewed
security

Enhancing website fingerprinting through combined data augmentation strategies

inforesearchPeer-Reviewed
security

ReSLC: Defending backdoor attacks on intelligent vulnerability detection via redundant semantic LLM compression

inforesearchPeer-Reviewed
security

Deep learning-based sequential detection of attacks on low-Latency network services

inforesearchPeer-Reviewed
research

XFaceMark: Explainable deep fake watermarking using YOLO, and random MRFO

inforesearchPeer-Reviewed
research

SBOMs into Agentic AIBOMs: Schema Extensions, Agentic Orchestration and Reproducibility Evaluation

inforesearchPeer-Reviewed
research

Adaptive Density Clustering for Data-Driven Password Mangling Rule Generation

inforesearchPeer-Reviewed
research

A Survey on Recent Advances in Conversational Data Generation

inforesearchPeer-Reviewed
research

AISM: Adversarial image steganography model for defending unauthorized recognition

inforesearchPeer-Reviewed
security

Erratum: Adversarial Machine Learning in IoT Security: A Comprehensive Survey

inforesearchPeer-Reviewed
research

Prompting Frameworks for Large Language Models: A Survey

inforesearchPeer-Reviewed
research

v5.5.0

inforesearchIndustry
security

RanDS: A Large-Scale Open Dataset of Raw Binaries and Extracted Features for Ransomware Research

inforesearchPeer-Reviewed
research

One Trigger, Multiple Victims: Clean-Label Neighborhood Backdoor Attacks on Graph Neural Networks

inforesearchPeer-Reviewed
security
1 / 14Next
Apr 14, 2026

This academic survey examines how well large language model-based agents (AI systems that use LLMs to make decisions and take actions) can generalize, meaning how effectively they perform on new tasks or situations they weren't specifically trained for. The paper reviews research across different domains to understand what factors help or limit an agent's ability to adapt and work reliably in unfamiliar contexts.

ACM Digital Library (TOPS, DTRAP, CSUR)
Apr 14, 2026

Quantum computing poses a major threat to current security systems because it can break traditional encryption methods that protect critical infrastructure and cloud services. This paper examines how quantum computing affects different layers of infrastructure (from applications to networks) and proposes moving toward quantum-resistant cryptography (encryption methods designed to withstand quantum computer attacks) as a protective strategy. The authors advocate for collaboration across sectors to develop and implement these new security approaches before quantum threats become critical.

Elsevier Security Journals
Apr 13, 2026

This research paper proposes a method to detect FDI attacks (false data injection, where attackers insert fake sensor readings into control systems) by using encoding techniques to transform measurement data into a different mathematical space. The approach aims to catch stealthy FDI attacks that are designed to evade traditional detection methods by disguising themselves as normal system behavior.

Elsevier Security Journals
Apr 11, 2026

This research paper, published in June 2026, explores how to make multi-party private set union (a process where multiple parties combine datasets while keeping their individual data secret) more efficient and secure against malicious attacks. The authors propose using trusted execution environments (TEEs, hardware that protects code and data even from the computer's owner) to achieve this goal. The paper aims to balance computational efficiency with strong security guarantees when multiple parties need to collaborate while protecting sensitive information.

Elsevier Security Journals
Apr 10, 2026

Researchers used LLMs (large language models, AI systems trained on vast text data) and model checking (a technique to verify if software behaves correctly by examining all possible states) to study Android's permission system, which controls what apps can access on your phone. The study involved modeling how this system works, checking if it's secure, and finding ways to exploit it using AI techniques.

ACM Digital Library (TOPS, DTRAP, CSUR)
research
Apr 10, 2026

This research paper examines how visual explanation techniques can help protect federated learning (a machine learning approach where multiple computers train a model together without sharing raw data) from poisoning attacks (attempts to corrupt the training data or model). The authors propose an enhanced version of LayerCAM (a method that visualizes which parts of an input an AI focuses on), combined with autoencoders (neural networks that compress and reconstruct data), to detect and defend against such attacks.

ACM Digital Library (TOPS, DTRAP, CSUR)
Apr 8, 2026

Researchers developed new data augmentation strategies (techniques for artificially expanding training datasets) to improve website fingerprinting, which is a method to identify which websites users visit by analyzing their network traffic patterns. The study, published in August 2026, demonstrates how combining multiple augmentation approaches can make these fingerprinting techniques more effective.

Elsevier Security Journals
research
Apr 8, 2026

This research paper describes a method called ReSLC that protects AI systems used to find software bugs from backdoor attacks, where attackers secretly embed malicious instructions into the AI's training process. The approach uses redundant semantic LLM compression (a technique that removes unnecessary information from large language models while keeping their core abilities) to make these hidden attacks harder to carry out. The work was published in July 2026 in the Journal of Information Security and Applications.

Elsevier Security Journals
security
Apr 8, 2026

This research paper presents a hybrid deep learning method using autoencoders (neural networks that learn to compress and reconstruct data) and transformers (AI models that process sequences of information) to detect a new type of attack called unresponsive ECN attacks on low-latency network services (systems designed to minimize delay in data transmission). The proposed method achieves over 90% accuracy in detecting these attacks while keeping false alarms below 0.01%, outperforming existing detection approaches by more than 10%.

Elsevier Security Journals
security
Apr 7, 2026

This paper presents XFaceMark, a method that uses YOLO (an object detection system that identifies items in images) and random MRFO (a nature-inspired optimization algorithm) to add watermarks to deepfakes (AI-generated fake videos or images) in a way that can be explained and understood. The approach aims to make deepfakes traceable while allowing researchers to understand how the watermarking process works.

Elsevier Security Journals
Apr 7, 2026

This academic paper discusses extending SBOMs (software bill of materials, which are detailed lists of all components and dependencies in software) to create AIBOMs that can describe agentic AI systems (AI systems that can take independent actions and make decisions). The paper proposes schema extensions, methods for coordinating multiple AI agents, and ways to evaluate whether AI systems produce consistent and reproducible results.

ACM Digital Library (TOPS, DTRAP, CSUR)
security
Apr 6, 2026

This research paper describes a method for automatically generating password mangling rules (transformations that modify passwords systematically) using adaptive density clustering (a technique that groups similar data points together based on how densely packed they are). The approach aims to improve password security by learning patterns from real password data to create more effective rules for testing password strength.

Elsevier Security Journals
Apr 4, 2026

This is a survey paper published in an academic journal that reviews recent progress in conversational data generation, which refers to techniques for creating dialogue datasets (collections of conversations) used to train and improve AI systems. The paper appears to be a comprehensive overview of advances in this field as of July 2026, but no specific technical findings, vulnerabilities, or security issues are described in the provided content.

ACM Digital Library (TOPS, DTRAP, CSUR)
research
Apr 3, 2026

Researchers have developed AISM (adversarial image steganography model, a technique that hides data inside images while making them resistant to AI recognition), a method for protecting images from being recognized by unauthorized AI systems. The approach uses adversarial techniques (methods that deliberately trick AI models by adding subtle, invisible changes to data) combined with steganography (the practice of hiding information within other data) to prevent unwanted AI analysis while keeping the images visually normal to humans. This work addresses privacy concerns where people want to prevent their images from being processed by AI systems without permission.

Elsevier Security Journals
Apr 2, 2026

This is an erratum (correction notice) for an academic survey paper about adversarial machine learning in IoT security (the practice of deliberately fooling AI systems used to protect internet-connected devices). The notice appears in ACM Computing Surveys journal, Volume 58, Issue 10, published in July 2026.

ACM Digital Library (TOPS, DTRAP, CSUR)
Apr 1, 2026

This is an academic survey paper that reviews different prompting frameworks, which are structured approaches to asking large language models (AI systems trained on huge amounts of text) questions or giving them instructions to complete tasks. The paper, published in a major computer science journal, catalogues and analyzes various methods researchers have developed to improve how effectively people interact with and get useful results from LLMs.

ACM Digital Library (TOPS, DTRAP, CSUR)
research
Mar 30, 2026

Version 5.5.0 adds new security techniques documenting threats to AI systems, including AI agent tool poisoning (when attackers corrupt tools that AI agents use), supply chain attacks, and cost harvesting (depleting computing resources through expensive queries). It also updates existing techniques and mitigations related to code signing and monitoring AI agent behavior.

MITRE ATLAS Releases
Mar 28, 2026

RanDS is a new large-scale dataset containing raw binary files (the compiled machine code of programs) and extracted features designed to help researchers study and detect ransomware (malicious software that encrypts victims' files and demands payment). This resource aims to support the development and testing of machine learning models that can identify ransomware threats more effectively.

Elsevier Security Journals
research
Mar 27, 2026

Researchers discovered a new backdoor attack (a security flaw where hidden malicious code is planted in training data) on Graph Neural Networks, or GNNs (AI models designed to understand interconnected data). The attack uses a single trigger node (a specially crafted fake data point) attached to a target node to trick the GNN into making wrong predictions not just on that node, but also on its immediate neighbors, while remaining stealthy and achieving over 95% success rates even against existing defenses.

IEEE Xplore (Security & AI Journals)