Academic papers, new techniques, benchmarks, and theoretical findings in AI/LLM security.
Cyber Threat Attribution (CTA) is the process of identifying who carried out a cyberattack by analyzing evidence from the attack. This paper introduces ThreatMAMBA, an AI framework that improves CTA by building knowledge graphs from threat intelligence data (IOCs, or indicators of compromise that identify malicious activity; TTPs, or the tactics, techniques, and procedures used by attackers; and temporal relationships) and using machine learning to identify attackers even in the early stages of ongoing attacks. The system showed significant improvements in accuracy at different stages of attack development, suggesting it can provide reliable attribution information quickly during real incidents.
This paper addresses privacy and security concerns in collaborative data analysis by proposing a new method for computing the Jaccard coefficient (a measure of similarity between two sets: the size of their intersection divided by the size of their union). The proposed protocol protects sensitive information such as the intersection and union cardinalities (counts of shared and combined elements) while maintaining high accuracy and computational efficiency, and cloud-assisted encryption can improve its performance by a further 25.5% to 30.4%.
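For orientation, the quantity that the protocol computes privately can be written in a few lines of plaintext Python. This is only the unprotected definition, not the paper's protocol; the function name, the sample sets, and the empty-set convention are assumptions made for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Plaintext Jaccard coefficient: |A ∩ B| / |A ∪ B|."""
    union = a | b
    if not union:
        # Assumed convention for this sketch: two empty sets count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical record identifiers held by two parties.
party_a = {"p01", "p02", "p03", "p04"}
party_b = {"p03", "p04", "p05"}

# 2 shared identifiers out of 5 distinct ones.
print(jaccard(party_a, party_b))  # 0.4
```

Note that this direct computation exposes both the intersection and union sizes, which is exactly the information the proposed protocol is described as hiding.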
This is a research survey published in ACM Computing Surveys that examines the limitations and problems of large language models (LLMs, which are AI systems trained on massive amounts of text data to generate human-like responses). The survey takes a data-driven approach to understand how LLM research has evolved as scientists discover and study these systems' weaknesses and constraints.
This is a systematic literature review, a type of research paper that surveys and analyzes existing studies on differential privacy (a mathematical technique that adds carefully measured noise to data to protect individual privacy) in machine learning. The review examines how researchers are applying differential privacy to train AI models while keeping personal information safe from being extracted or misused.
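The "carefully measured noise" idea can be illustrated with the textbook Laplace mechanism applied to a counting query. This is a minimal sketch of the general technique, not a method taken from the review; the function names, the sample data, and the choice of epsilon are assumptions.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Release a count with epsilon-differential privacy.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so Laplace noise of scale 1/epsilon suffices.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)
ages = [23, 35, 41, 52, 67, 29, 71]  # hypothetical records
print(private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng))
```

A smaller epsilon means a larger noise scale and therefore stronger privacy at the cost of accuracy.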
This academic survey paper categorizes and describes different privacy concerns and system designs in collaborative deep learning (machine learning where multiple parties train models together while keeping their data private). The paper creates a taxonomy, which is a systematic classification scheme, to help organize the various approaches and challenges in this field.
Researchers have developed BioGuard, a defense method that protects biometric classifiers (AI systems that identify people using fingerprints, faces, or iris scans) against model extraction attacks (where attackers try to steal or copy the AI model by repeatedly querying it). The method works without needing malicious sample data to train it, making it practical for real-world deployment.
This academic paper presents T3AT, a new cryptographic system for creating anonymous tokens (digital proof of eligibility that doesn't reveal who you are) that can be issued and verified by multiple parties working together, rather than requiring a single trusted authority. The system uses advanced mathematical techniques including threshold signatures (where multiple parties must cooperate to authorize something) and verifiable computation methods to ensure tokens cannot be transferred between users and cannot be forged, while maintaining privacy without needing trusted hardware or centralized control.
This research addresses a problem where AI models trained to identify radio transmitters (specific emitter identification, or SEI) fail when tested on different hardware receivers due to shortcut learning (when models rely on irrelevant patterns instead of genuine features). The authors propose MTL-SEI, a framework that uses adversarial training (a technique where two competing AI systems help each other improve) and multiple related learning tasks to teach models to ignore receiver-specific artifacts and focus on true transmitter fingerprints, achieving 88.50% accuracy on test data.
Malware often encrypts its network traffic (data sent over the internet) to hide its activities, making it hard to detect using traditional methods. Most existing detection systems need complete traffic data to work well, but this research presents DawnGuard, a new system that can identify encrypted malware traffic very early in an attack, when only a small amount of data is available, by using temporal graph learning (analyzing how multiple network connections relate to each other over time) and a Vision Transformer (a deep learning model that applies self-attention to image-like inputs). The system achieved 95.11% accuracy using just the first 20% of traffic data.
ESCM is a toolkit that uses homomorphic encryption (a technique that lets computers process encrypted data without decrypting it first) to let cloud servers perform calculations on data from multiple users who each have their own encryption key. The toolkit addresses security risks by using a distributed two-trapdoor cryptosystem with threshold decryption (a system where multiple servers must cooperate to decrypt data, so no single server can access the information alone), which protects against server collusion and outages.
This research paper examines macro-level collaborative leakage, which occurs when individually harmless pieces of data reveal sensitive information once they are combined. The authors conducted mathematical analyses to understand why this happens and found that the problem stems from how risk data (data that don't directly expose private information) correlate with sensitive information. While a Gaussian distribution (a common bell-curve statistical pattern) can help prevent this type of leakage, the paper concludes that this protection is limited and more comprehensive security mechanisms are needed.
This research proposes HeteroFed, a framework for federated learning (a distributed machine learning approach where multiple devices train a shared model without sending raw data to a central server) that addresses privacy and performance challenges in edge intelligence scenarios. The framework uses four main techniques: personalized model construction for different devices, dynamic gradient clipping (limiting how much model parameters can change), adaptive noise addition for privacy protection, and improved model aggregation to maintain accuracy despite privacy protections.
Fix: The source proposes HeteroFed as a solution framework with four mechanisms: (1) heterogeneous model construction, which enables personalized model training for different smart devices; (2) dynamic gradient clipping, which adjusts the magnitude of the gradients on models uploaded by devices; (3) adaptive noise addition, which customizes differential privacy (a mathematical technique that adds noise to protect individual data) protection according to each device model's convergence status; and (4) deviation-aware model aggregation, which mitigates the effects of noise perturbation on the aggregated model.
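Mechanisms (2) and (3) build on the standard clip-then-add-noise step used in differentially private training, which can be sketched as below. This is not HeteroFed's implementation: the summary does not give its rules for adapting the clipping bound or the noise level, so `clip_norm` and `noise_multiplier` are plain arguments here, and all names are hypothetical.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Rescale grad so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]


def privatize(grad, clip_norm, noise_multiplier, rng):
    """Clip the gradient, then add Gaussian noise to every coordinate.

    The noise standard deviation is noise_multiplier * clip_norm, so the
    noise is calibrated to the largest possible clipped contribution.
    """
    clipped = clip_gradient(grad, clip_norm)
    sigma = noise_multiplier * clip_norm
    return [g + rng.gauss(0.0, sigma) for g in clipped]


rng = random.Random(0)
device_update = [3.0, 4.0]  # hypothetical gradient with L2 norm 5
print(privatize(device_update, clip_norm=1.0, noise_multiplier=0.5, rng=rng))
```

A framework with dynamic clipping and adaptive noise would vary these two arguments per device or per round instead of fixing them, which is the part this sketch leaves open.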
IEEE Xplore (Security & AI Journals)
Researchers discovered that two widely used encryption schemes for secure database searches (m-ORE and om-ORE, which allow multiple parties to query encrypted data without revealing the queries or data) can be attacked by a malicious client and server working together to insert fake records into the database. The team developed a new scheme called MORES that fixes this vulnerability while also making searches about one-third faster and more efficient than the older schemes.
Fix: The source proposes MORES, described as 'the first multi-client ORE scheme that preserves range-query functionality while provably resisting arbitrarily malicious participants.' The text indicates MORES can serve as 'an immediate drop-in replacement for encrypted-database systems that demand both efficiency and robustness in adversarial environments,' but does not provide implementation details, version numbers, or step-by-step deployment instructions.
This paper introduces AuthRF, a security system that protects RF sensing models (AI systems that interpret radio frequency signals from WiFi or radar) by using user-specific digital "passports" embedded in the signal processing pipeline. Valid passports allow the model to work correctly, while invalid or fake ones distort the signal and degrade performance, preventing unauthorized use. The approach is designed to be proactive and work during runtime, addressing limitation…
Researchers developed DiffMI, a new attack that can recover people's facial identities from face recognition systems by reversing the embeddings (compressed numerical representations of faces). Unlike previous attacks, DiffMI doesn't require expensive training on specific targets and can work against unseen faces and new recognition models, achieving success rates between 84% and 93% against systems designed to resist such attacks.
This research proposes a new method for private set operations (PSO, techniques that let organizations securely compare or combine datasets without revealing private information) that reduces the computational burden on client devices. The approach uses secret sharing (splitting data into pieces so no single party can see the whole picture) to allow servers to do most of the work while clients can stay offline, making it practical for large-scale collaborative research across institutions like hospitals.
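The parenthetical on secret sharing can be made concrete with additive secret sharing, the simplest scheme of that family. This is a generic sketch, not the paper's PSO protocol; the modulus, the function names, and the three-server setup are assumptions.

```python
import secrets

MODULUS = 2**61 - 1  # assumed modulus for this sketch


def share(value: int, n_servers: int) -> list[int]:
    """Split value into n_servers random shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_servers - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares: list[int]) -> int:
    """Recover the secret; fewer than all of the shares reveal nothing."""
    return sum(shares) % MODULUS


# Two clients upload shares of their private values and can then go offline.
a_shares = share(120, 3)
b_shares = share(45, 3)

# Each server adds the two shares it holds, without seeing either value.
sum_shares = [(x + y) % MODULUS for x, y in zip(a_shares, b_shares)]

print(reconstruct(sum_shares))  # 165
```

The servers do the arithmetic on shares while the clients stay offline, which mirrors the division of labor described above in a much simpler setting.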
Voice biometric systems (technology that identifies people by their voice) are vulnerable to replay attacks (where an attacker plays back a recorded voice to fool the system), but there hasn't been enough realistic training data to build good defenses. This research created RIRplay, a simulated database that realistically mimics how replay attacks actually happen across different acoustic environments, which improved detection performance significantly when tested on real-world voice spoofing challenges.
Researchers developed AdvFor, a black-box attack method (a way to trick an AI system without seeing its internal workings) that can fool image forgery localization models, which are AI systems trained to detect where images have been edited or manipulated. The attack uses reinforcement learning (a technique where an AI learns by trial and error to maximize rewards) to craft minimal changes to images that make forgery detection fail, using only 7 queries per image, and the researchers tested it on multiple real-world models to show it works effectively.
Adversarial examples (inputs crafted to fool AI systems) are a serious security risk for deep neural networks (AI systems with many layers), especially in physical-world attacks like fooling object detection in surveillance cameras. This research proposes Adversarial Spectrum Defense (ASD), a defense method that uses spectral decomposition (breaking down data into different frequency components) via Discrete Wavelet Transform (a mathematical technique to analyze patterns at multiple scales) to detect and defend against patch-based and texture-based adversarial attacks, and shows it achieves better protection when combined with Adversarial Training (training the AI on attack examples to make it more robust).
Fix: The source proposes Adversarial Spectrum Defense (ASD), which 'leverages spectral decomposition via Discrete Wavelet Transform (DWT) to analyze adversarial patterns across multiple frequency scales' and 'by integrating this spectral analysis with the off-the-shelf Adversarial Training (AT) model, ASD provides a comprehensive defense strategy against both patch-based and texture-based adversarial attacks.' The paper reports that 'ASD+AT achieved state-of-the-art (SOTA) performance against various attacks, outperforming the APs of previous defense methods by 21.73%'.
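To show what a wavelet decomposition exposes, the sketch below applies one level of the Haar DWT to a single row of pixel values and measures the energy in the detail (high-frequency) band. It is a generic illustration of the transform, not ASD's pipeline; the Haar wavelet, the function names, and the sample rows are assumptions.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT for an even-length signal.

    Returns (approximation, detail): scaled pairwise sums and differences.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / s for a, b in pairs]
    detail = [(a - b) / s for a, b in pairs]
    return approx, detail


def detail_energy(signal):
    """Sum of squared detail coefficients, a crude high-frequency score."""
    _, detail = haar_dwt(signal)
    return sum(d * d for d in detail)


smooth_row = [10, 10, 10, 10, 10, 10, 10, 10]    # flat image region
patched_row = [10, 10, 10, 30, -10, 10, 10, 10]  # abrupt local perturbation

print(detail_energy(smooth_row), detail_energy(patched_row))
```

The flat row has no detail energy while the perturbed row concentrates energy in the detail band, which is the kind of frequency-scale separation a spectral defense can inspect.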
FinBot is an interactive training platform (CTF, or capture-the-flag exercise) created by OWASP to help developers and security professionals learn about risks in agentic AI systems (AI agents that can plan, act, and make decisions autonomously). It simulates a financial services application where users can practice identifying and defending against attacks like prompt injection (tricking an AI by hiding instructions in its input), tool misuse, data theft, and privilege escalation across multiple connected AI agents.