aisecwatch.com

Real-time AI security monitoring. Tracking AI-related vulnerabilities, safety and security incidents, privacy risks, research developments, and policy changes.

Maintained by

Truong (Jack) Luu

Information Systems Researcher

Research

Academic papers, new techniques, benchmarks, and theoretical findings in AI/LLM security.

690 items

Agentic AI in Healthcare: Opportunities, Challenges, and Future Directions

info, research, Peer-Reviewed
research
Jun 25, 2026

This academic survey examines agentic AI (AI systems that can independently plan and execute tasks to accomplish goals) in healthcare. It discusses the potential benefits of using such systems in medical settings as well as the technical, ethical, and practical obstacles that need to be addressed. The survey also gives an overview of current research directions for developing safer and more effective autonomous AI agents in healthcare applications.

ACM Digital Library (TOPS, DTRAP, CSUR)

Understanding Hallucinations in Large Visual and Language Models

info, research, Peer-Reviewed
research

Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey

info, research, Peer-Reviewed
security

Metrics for Privacy-Preserving Generative Models: A Comprehensive Survey

info, research, Peer-Reviewed
research

SALT: Semantic-guided adaptive latent space truncation sampling watermarking for diffusion models

info, research, Peer-Reviewed
security

Using AI to help physicians diagnose rare genetic diseases affecting children

info, research, Peer-Reviewed
research

A Hybrid Intrusion Detection Model for Cloud Security: Feature Selection, Classification, and Authentication Using TFSEA Framework

info, research, Peer-Reviewed
security

Rethinking ransomware defense in the age of generative AI

info, research, Peer-Reviewed
security

A Survey of Neural Network Robustness Assessment in Image Recognition

info, research, Peer-Reviewed
research

Uncovering robot joint-level controller actions from encrypted network traffic: Empirical attacks and information-theoretic bounds

info, research, Peer-Reviewed
security

Matching Comes First: Efficient Certificateless Lattice-Based Bilateral Access Control With On-Demand Matching

info, research, Peer-Reviewed
security

Chain Reaction: A Triple-Chain Architecture for Sensitive Thumbnail-Preserving Image Encryption

info, research, Peer-Reviewed
security

Enhanced privacy-preserving neural networks with fully homomorphic encryption: Optimized search and training

info, research, Peer-Reviewed
security

A lightweight pairing-free certificateless signcryption scheme with designated-verifier privacy for IoT in the standard model

info, research, Peer-Reviewed
security

Watermarking for Model Ownership Verification: Invisible at Deployment, Activated by Updates

info, research, Peer-Reviewed
security

With Power comes Responsibility: Attack Synthesis for Industrial Control Systems using Large Language Models

info, research, Peer-Reviewed
security

Understanding security risks in update mechanisms of computing systems

info, research, Peer-Reviewed
security

Hiding the trees in the forest: Building network covert channels with hash-based covert carrier filtering

info, research, Peer-Reviewed
security

FIT-Print: Toward False-Claim-Resistant Model Ownership Verification via Targeted Fingerprint

info, research, Peer-Reviewed
security

TAPGuard: A Semantic-Aware Graph Framework for TAP Rule Cascading Threat Detection

info, research, Peer-Reviewed
research
safety
Jun 25, 2026

This academic survey examines hallucinations in large visual and language models: cases where an AI system generates false or nonsensical information that appears plausible. The 36-page paper, published in ACM Computing Surveys in October 2026, gives a comprehensive overview of research on this problem, which affects both language models (AI systems trained on text) and multimodal models (AI systems that process both images and text).

ACM Digital Library (TOPS, DTRAP, CSUR)
research
Jun 24, 2026

This academic survey examines harmful fine-tuning attacks (methods where attackers modify an AI model's training process to make it behave dangerously) and the defenses designed to stop them. The paper reviews different types of attacks, how they work, and various protection strategies researchers have developed to keep large language models safe from this threat.

ACM Digital Library (TOPS, DTRAP, CSUR)
privacy
Jun 24, 2026

This academic survey paper examines metrics, or measurement methods, used to evaluate privacy-preserving generative models (AI systems that create new data while protecting personal information). The paper provides a comprehensive overview of different ways researchers measure how well these models protect privacy while still functioning effectively.

ACM Digital Library (TOPS, DTRAP, CSUR)
research
Jun 19, 2026

SALT is a watermarking technique for diffusion models (AI systems that generate images by gradually removing noise from random data) that uses semantic guidance and adaptive latent space truncation to embed hidden ownership marks. The method aims to protect diffusion models from unauthorized use while maintaining the quality of generated images. This research addresses the need for better ownership verification and copyright protection in generative AI systems.

Elsevier Security Journals
Jun 18, 2026

Researchers used OpenAI o3 Deep Research, an AI reasoning model, to re-analyze 376 previously unsolved rare genetic disease cases by connecting clinical data, genetic variants, and scientific literature into evidence-based explanations for human experts to review. After specialist evaluation and clinical confirmation, the AI-assisted workflow helped establish new diagnoses in 18 cases (4.8% additional diagnostic yield), with the model generating hypotheses rather than making medical decisions itself. This demonstrates how periodic AI-assisted reanalysis could help scale the process of solving rare disease cases as medical knowledge evolves.

OpenAI Blog
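
The 4.8% figure in the rare genetic disease summary above follows directly from the two counts it reports (18 new diagnoses out of 376 reanalyzed cases). A minimal Python sketch of that arithmetic; the function name `additional_yield` is hypothetical and used only for illustration:

```python
def additional_yield(new_diagnoses: int, cases_reanalyzed: int) -> float:
    """Return new diagnoses as a percentage of the reanalyzed cases."""
    if cases_reanalyzed <= 0:
        raise ValueError("cases_reanalyzed must be positive")
    return 100.0 * new_diagnoses / cases_reanalyzed


# Counts taken from the summary: 18 new diagnoses among 376 unsolved cases.
yield_pct = additional_yield(18, 376)
print(f"{yield_pct:.1f}%")  # prints 4.8%
```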
research
Jun 15, 2026

This research paper presents a new security framework called TFSEA that combines feature selection (choosing which data points matter most), classification (sorting data into categories), and authentication (verifying user identity) to detect unauthorized access attempts in cloud computing environments. The paper proposes using this hybrid approach to improve how well systems can identify and prevent intrusions in cloud infrastructure.

Elsevier Security Journals
research
Jun 14, 2026

This article examines how ransomware (malicious software that locks files and demands payment to unlock them) defense strategies need to change as generative AI (AI systems that create new content like text or code) becomes more common. The piece suggests that traditional security approaches may be less effective in an environment where AI is widely used.

Elsevier Security Journals
safety
Jun 14, 2026

This academic survey paper reviews methods for testing how well neural networks (AI systems trained to recognize patterns in data) perform when faced with unexpected or manipulated images. The paper examines various approaches researchers use to assess whether image recognition systems remain accurate and reliable under challenging conditions.

ACM Digital Library (TOPS, DTRAP, CSUR)
Jun 12, 2026

Researchers discovered that they can figure out what actions industrial robots are performing just by analyzing encrypted network traffic (data traveling across networks in scrambled form) without being able to read the actual messages. The study shows both practical attacks that successfully identified robot movements and theoretical limits on how much information can be extracted from this type of traffic. This reveals a security gap where encryption alone may not fully protect sensitive robot operations from being monitored.

Elsevier Security Journals
Jun 12, 2026

This paper presents a new cryptographic method called certificateless lattice-based matchmaking encryption (CLLME) designed to secure data sharing on cloud platforms while meeting regulations like GDPR. CLLME provides post-quantum security (protection against future quantum computers), allows both senders and receivers to control who can access data, and includes a filtering mechanism to avoid decrypting irrelevant encrypted files. The researchers proved the method is mathematically secure and showed it works efficiently in real-world scenarios.

IEEE Xplore (Security & AI Journals)
Jun 12, 2026

Thumbnail-preserving encryption (TPE, a method that keeps some visual information visible in encrypted images to balance usability and privacy) has a security weakness: existing approaches encrypt pixels, blocks, or channels separately, creating vulnerabilities. Researchers propose a new 'triple-chain architecture' that links encryption at three levels (pixels, blocks, and channels) so that any small change to an image causes completely different encryption results, making the system more secure while still maintaining TPE benefits.

IEEE Xplore (Security & AI Journals)
research
Jun 12, 2026

This research paper describes methods for making neural networks (AI models that learn patterns from data) more private by using fully homomorphic encryption (a type of encryption that lets computers perform calculations on encrypted data without decrypting it first). The work focuses on optimizing how these privacy-protecting neural networks search through and train on data while keeping information secure.

Elsevier Security Journals
Jun 12, 2026

This research paper proposes a new cryptographic method for securing communication in IoT (Internet of Things) devices that is lightweight and preserves privacy. The scheme uses certificateless signcryption (a technique that combines digital signatures for authentication with encryption for confidentiality, without requiring traditional certificates) and designated-verifier privacy (meaning only a chosen recipient can verify that a message is authentic), designed to work efficiently on resource-constrained IoT devices.

Elsevier Security Journals
research
Jun 11, 2026

This research paper describes a watermarking technique that allows AI model creators to prove they own their models without revealing the watermark during normal use. The watermark remains hidden when the model is deployed but becomes detectable when the model is updated, helping prevent unauthorized copying or theft of AI models.

ACM Digital Library (TOPS, DTRAP, CSUR)
research
Jun 11, 2026

Researchers demonstrated that large language models (AI systems trained on vast text data) can be used to generate attack strategies against industrial control systems (the computers that manage power plants, factories, and critical infrastructure). The study shows a concerning security risk where these powerful AI tools could be misused to help attackers plan harmful activities against systems that society depends on.

ACM Digital Library (TOPS, DTRAP, CSUR)
Jun 11, 2026

This academic publication examines security vulnerabilities in the mechanisms that deliver software updates to computers and systems. The article, published in June 2026, analyzes how attackers might exploit the update process itself to compromise systems, rather than targeting the software after it's already installed.

Elsevier Security Journals
Jun 11, 2026

Researchers describe a method for creating hidden communication channels within networks by using hash-based filtering to disguise data inside normal-looking network traffic. This technique, called a covert channel (a hidden path for sending information that shouldn't be detectable), could allow attackers to secretly send data through systems without being noticed by security tools.

Elsevier Security Journals
research
Jun 10, 2026

Existing model fingerprinting techniques (methods that create unique digital signatures to prove ownership of AI models) are vulnerable to false claim attacks, where attackers can fraudulently claim they own models they didn't create. This paper introduces FIT-Print, a targeted fingerprinting approach that uses optimization to create verifiable signatures resistant to these false claims, offering two specific methods (bit-wise FIT-ModelDiff and list-wise FIT-LIME) that achieved 100% success in preventing false ownership claims while maintaining accurate ownership verification.

Fix: The paper proposes FIT-Print, a targeted fingerprinting paradigm that 'actively counters false claim attacks' by leveraging 'optimization to transform the fingerprint into a verifiable, targeted signature.' Two specific black-box fingerprinting methods are introduced: 'bit-wise FIT-ModelDiff' which 'utilizes output distances' and 'list-wise FIT-LIME' which utilizes 'feature attributions as robust model signatures.' The framework demonstrated '100% defense success rate' against false claim attacks and '100% ownership verification rate.'

IEEE Xplore (Security & AI Journals)
safety
Jun 10, 2026

This research proposes TAPGuard, a framework for detecting cascading threats in Trigger-Action Programming (TAP, a system where one event automatically triggers another action, commonly used in smart home devices). The framework uses large language models (AI systems trained on text) to understand the semantic meaning (the actual intent and meaning, not just the structure) of automation rules and identifies two types of threats: explicit ones from direct device interactions and implicit ones from rules sharing environmental variables that shouldn't interact. TAPGuard performs better than existing methods at catching these dangerous rule combinations.

IEEE Xplore (Security & AI Journals)
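
The two threat types in the TAPGuard summary above can be illustrated with a small rule-graph check. This is a simplified sketch, not TAPGuard's method: the summary says TAPGuard uses large language models to interpret rule semantics, whereas the code below assumes each rule already comes with hand-written labels. The `Rule` fields, the example rules, and the reading of "implicit" as "one rule's action changes an environmental variable that another rule's trigger senses" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Rule:
    name: str
    trigger: str                            # event that fires the rule
    action: str                             # event the rule produces
    env_effects: frozenset = frozenset()    # environment variables the action changes (assumed label)
    env_senses: frozenset = frozenset()     # environment variables the trigger depends on (assumed label)


def explicit_cascades(rules):
    """Ordered pairs (a, b) where a's action event is exactly b's trigger event."""
    return [(a.name, b.name)
            for a, b in permutations(rules, 2)
            if a.action == b.trigger]


def implicit_cascades(rules):
    """Ordered pairs (a, b) linked only through a shared environment variable."""
    found = []
    for a, b in permutations(rules, 2):
        shared = a.env_effects & b.env_senses
        if shared and a.action != b.trigger:
            found.append((a.name, b.name, sorted(shared)))
    return found


# Hypothetical smart-home rules.
rules = [
    Rule("motion_heater", "motion.detected", "heater.on",
         env_effects=frozenset({"temperature"})),
    Rule("heater_notify", "heater.on", "notify.owner"),
    Rule("temp_window", "temperature.above_28", "window.open",
         env_senses=frozenset({"temperature"})),
]

print(explicit_cascades(rules))
print(implicit_cascades(rules))
```

In this example, the heater rule directly triggers the notification rule (an explicit chain), while the heater and window rules interact only through room temperature, which is the kind of indirect coupling a purely structural comparison of trigger and action names would miss.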