Academic papers, new techniques, benchmarks, and theoretical findings in AI/LLM security.
Federated learning (a method where multiple computers train an AI model together without sharing their raw data) is vulnerable to poisoning attacks, where malicious participants sabotage the shared model. This paper proposes SpecShield, a defense that proactively tests each participant's model using carefully crafted perturbations (small, intentional changes) and analyzes their responses using frequency-domain analysis (a mathematical technique that examines patterns at different scales) to distinguish malicious clients from honest ones.
Fix: The paper proposes SpecShield, which works by: (1) using the Fast Gradient Sign Method on the server side to actively probe client models through calibrated adversarial perturbations, (2) analyzing the resulting responses in the frequency domain using Discrete Wavelet Transform to uncover distinctive patterns between benign and malicious clients, and (3) deriving theoretical upper bounds on perturbation magnitudes to guarantee detection accuracy while preserving benign client performance.
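The probe-then-analyze steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the client model is a toy logistic regression, the helper names (`fgsm_probe`, `haar_dwt`, `detail_energy`) are hypothetical, and a single-level Haar wavelet stands in for whatever wavelet the authors use. The perturbation bound and the decision rule are not given in this summary, so none is implied here.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, x):
    """Toy client model (assumption): logistic regression without bias."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))

def sign(g):
    return (g > 0) - (g < 0)

def fgsm_probe(weights, x, y, eps):
    """FGSM: shift x by eps along the sign of the loss gradient w.r.t. x.

    For logistic regression with cross-entropy loss, dL/dx = (p - y) * w.
    """
    p = predict(weights, x)
    grad = [(p - y) * w for w in weights]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

def haar_dwt(signal):
    """Single-level Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / s for a, b in pairs]
    detail = [(a - b) / s for a, b in pairs]
    return approx, detail

def detail_energy(weights, probes, labels, eps):
    """Energy in the high-frequency band of a client's probe responses."""
    responses = [
        predict(weights, fgsm_probe(weights, x, y, eps))
        for x, y in zip(probes, labels)
    ]
    _, detail = haar_dwt(responses)
    return sum(d * d for d in detail)
```

A server could compare `detail_energy` across clients and flag outliers; how the paper actually separates benign from malicious spectra is not described in this summary.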
IEEE Xplore (Security & AI Journals)
Large Vision Language Models (VLMs, which are AI systems that process both images and text) are vulnerable to jailbreak attacks (attempts to trick the AI into ignoring its safety guidelines). VLM-Guard is a detection framework that identifies and monitors a small set of neurons (individual computational units, fewer than 0.2% of the total) that are linked to unsafe behavior, allowing it to catch jailbreak attempts without requiring model fine-tuning (adjusting the AI's internal parameters through additional training). The approach is lightweight and effective at detecting attacks while maintaining normal performance on safe inputs.
Fix: VLM-Guard detects jailbreak attacks by identifying critical neurons linked to unsafe behaviors through differential analysis of activation values. The framework monitors a compact set of just a few hundred neurons (less than 0.2% of total neurons) that are strongly correlated with harmful semantics. It operates as a training-free detector, meaning no parameter updates or model fine-tuning is required, making it suitable for practical deployment in safeguarding VLMs.
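The neuron-selection and monitoring idea can be illustrated with a small sketch. Everything here is an assumption made for illustration: the differential statistic is a plain difference of mean activations, the score is the mean activation over the selected neurons, and the function names and threshold are hypothetical. The summary does not say which layers, statistic, or threshold VLM-Guard uses.

```python
def select_critical_neurons(safe_acts, unsafe_acts, k):
    """Rank neurons by mean activation gap (unsafe minus safe); keep top k.

    safe_acts / unsafe_acts: lists of activation vectors, one per prompt.
    """
    n = len(safe_acts[0])

    def mean(rows, j):
        return sum(r[j] for r in rows) / len(rows)

    gaps = [mean(unsafe_acts, j) - mean(safe_acts, j) for j in range(n)]
    ranked = sorted(range(n), key=lambda j: gaps[j], reverse=True)
    return ranked[:k]

def jailbreak_score(acts, neurons):
    """Mean activation over the monitored neurons only."""
    return sum(acts[j] for j in neurons) / len(neurons)

def is_jailbreak(acts, neurons, threshold):
    """Training-free check: no model parameters are updated."""
    return jailbreak_score(acts, neurons) > threshold
```

Because only the selected indices are read at inference time, the cost of the check scales with `k` rather than with the full width of the model.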
IEEE Xplore (Security & AI Journals)
This research paper, published in September 2026, addresses how to find the shortest path between two points on encrypted graphs (networks where connections and data are hidden using cryptography) while keeping the query private. The work focuses on path-constrained queries, meaning the shortest route must follow specific rules or limitations, all without revealing the actual graph structure or what users are searching for.
CTISum is a new benchmark dataset designed to help train and test AI systems that automatically summarize cyber threat intelligence (CTI, which is information about security attacks and threats). The dataset provides examples of threat reports and their summaries, helping researchers develop better AI tools for quickly understanding large amounts of security information. This work addresses the challenge of processing the massive volume of threat data that security teams need to analyze.
Cloud computing systems face side-channel attacks (SCAs, where attackers exploit information leaked through shared physical resources such as the CPU cache to compromise co-located VMs) when multiple VMs run on the same physical machine. This paper proposes a framework that analyzes user behavior patterns to place VMs across servers, then continuously migrates them so that an attacker cannot keep a malicious VM next to a target VM long enough to launch an attack. The authors report up to a 25% reduction in attack risk along with improved resource efficiency.
Knowledge graph embedding (KGE, a technique that converts graph structures into numerical representations for AI tasks) systems can be attacked by deleting existing facts or adding false ones, which corrupts their learned representations. This paper introduces PathAttack, an attack method that uses reasoning paths (multi-hop connections between entities) to decide which pieces of data to attack without needing to know the targets in advance. It shows a 5% improvement in attack effectiveness on test datasets.
This research proposes a framework for authenticating people based on their faces while protecting their facial privacy in systems like smart building access. The system uses homomorphic encryption (a technique that lets computers perform calculations on encrypted data without decrypting it first) and a multi-party secure authentication process, where multiple parties verify identity together rather than relying on a single trusted server, to prevent privacy breaches and single points of failure.
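To make "calculations on encrypted data" concrete, here is a toy additively homomorphic example. It is a sketch under loud assumptions: the summary does not name the encryption scheme, so textbook Paillier is swapped in; the primes are tiny and fixed, which is insecure and for demonstration only; and the "face template" is a short integer vector. The function names are hypothetical.

```python
import math
import secrets

def keygen(p=1789, q=1861):
    """Toy Paillier keys from small fixed primes (insecure, demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """Enc(m) = (1 + m*n) * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return ((1 + m * n) % n2) * pow(r, n, n2) % n2

def decrypt(n, priv, c):
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n

def encrypted_dot(n, enc_template, probe):
    """Dot product of an encrypted template with a plaintext probe.

    Multiplying ciphertexts adds plaintexts; raising a ciphertext to b
    multiplies its plaintext by b. The result stays encrypted.
    """
    n2 = n * n
    acc = 1  # a trivial encryption of 0
    for c, b in zip(enc_template, probe):
        acc = acc * pow(c, b, n2) % n2
    return acc
```

In this sketch the party computing `encrypted_dot` never sees the enrolled template, and only the key holder can read the similarity score. The paper's multi-party protocol, which splits that trust across several parties, is not reproduced here.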
Researchers created SemBugger, a polymorphic backdoor attack (a type of hidden malicious code that can change its behavior) against semantic communication (SC, a system where AI learns shared knowledge to compress and transmit information efficiently). The attack uses variable-intensity triggers to poison training data and manipulate the system into producing different malicious outputs while appearing normal, but the researchers also developed a defense mechanism using controlled noise that can resist these attacks.
Fix: The source proposes a provably robust defense against SemBugger based on a controlled noise mechanism: it strategically adds noise to semantic communication inputs and comes with theoretical lower bounds on defense effectiveness. Experiments show that the defense effectively neutralizes SemBugger attacks.
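One common way to use input noise against a backdoor trigger is a randomized-smoothing-style majority vote, sketched below. This is a generic stand-in, not the paper's mechanism: the summary does not specify how the noise is controlled or how the bounds are derived. The `backdoored_decoder` is a hypothetical toy with a narrow trigger on one feature, and `sigma`, `n_samples`, and `seed` are illustrative parameters.

```python
import random
from collections import Counter

def backdoored_decoder(x):
    """Toy stand-in for a poisoned decoder (hypothetical)."""
    if abs(x[0] - 0.5) < 0.01:  # narrow trigger on feature 0
        return "attack"
    return "pos" if x[1] > 0 else "neg"

def smoothed_predict(model, x, sigma, n_samples=200, seed=0):
    """Majority vote over Gaussian-perturbed copies of the input."""
    rng = random.Random(seed)
    votes = Counter(
        model([xi + rng.gauss(0.0, sigma) for xi in x])
        for _ in range(n_samples)
    )
    return votes.most_common(1)[0][0]
```

With the trigger present, `backdoored_decoder([0.5, 1.0])` returns the malicious output, while `smoothed_predict(backdoored_decoder, [0.5, 1.0], sigma=0.1)` votes for the benign label because most noisy copies fall outside the trigger window.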
IEEE Xplore (Security & AI Journals)
Person re-identification (ReID) systems, which match images of the same person across different camera views, are vulnerable to a new attack called DSCA (diffusion-based semantic camouflage attack). Instead of changing individual pixels, DSCA uses a generative model to subtly alter high-level features like clothing color and texture to trick the system into matching an attacker with a target identity without needing access to the victim system. The researchers demonstrated this attack succeeds over 95% of the time and evades existing defenses, revealing important security gaps that developers should address.
AmbShield is a security method that uses ambient backscatter devices (AmBDs, which are passive devices that reflect wireless signals without needing their own power source) to protect wireless networks from eavesdropping. The system works by having these devices act as both friendly jammers that create interference to disrupt eavesdroppers and as passive relays that strengthen the signal for legitimate users, all without requiring extra power or complex deployment.
This research addresses privacy and data quality challenges in federated learning (FL, a technique where multiple computers train an AI model together without sharing raw data) for skeleton-based action recognition (identifying human movements from body joint positions). The authors propose Fed-C&E, a system that uses data condensation on client devices to reduce privacy risks, then expands the condensed data on a central server using techniques like a prototype-to-sequence similarity transformation matrix pool and feature expansion with second-order statistics to recover lost information and prevent overfitting.
This research proposes K-TCDP (K-Temporal Correlated Differential Privacy), a new method for training large language models privately using LoRA (a technique that adds small trainable adapters to a model). Standard privacy-preserving training adds random noise that degrades model quality, but K-TCDP uses strategically correlated noise over time so that noise added in early steps can be partially canceled out by noise in later steps, improving model performance while maintaining privacy guarantees.
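The cancellation effect can be seen in a small numerical sketch. The correlation structure below is an assumption chosen for simplicity (each step subtracts a fraction `lam` of the previous step's fresh noise); the summary does not describe K-TCDP's actual noise schedule, and calibrating `sigma` to meet a privacy budget is outside this sketch.

```python
import random

def correlated_noise(steps, sigma, lam, seed=0):
    """Return (fresh, noise) where noise[t] = fresh[t] - lam * fresh[t-1].

    fresh[t] is independent Gaussian noise; noise[t] is what would be
    added to the update at step t.
    """
    rng = random.Random(seed)
    fresh = [rng.gauss(0.0, sigma) for _ in range(steps)]
    noise = [fresh[0]] + [
        fresh[t] - lam * fresh[t - 1] for t in range(1, steps)
    ]
    return fresh, noise

def cumulative(values):
    """Running sum, i.e. the total noise absorbed by the model so far."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out
```

With `lam = 1.0` the running sum telescopes, so after any number of steps the accumulated noise equals only the latest fresh draw. With independent noise (`lam = 0.0`) the variance of the running sum grows linearly with the number of steps.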
BlockAthena is a new forensic framework designed to detect long-term crimes on blockchains (distributed ledgers that record transactions) by analyzing transaction patterns over extended periods. The system identifies criminal behavior by recognizing botnet-style activity (coordinated malicious networks) and APT tactics (advanced persistent threat methods used by sophisticated attackers), segmenting transaction data into meaningful chunks based on crime phases, and using graph analysis to spot suspicious patterns while using less computer memory than previous approaches.
FinBot is an interactive training platform (CTF, or capture-the-flag competition) created by OWASP to help builders and defenders understand how agentic AI systems (AI agents that plan, act, and make decisions in complex workflows) can fail and be attacked. It simulates a financial services application where users encounter real security risks like prompt injection (tricking an AI by hiding instructions in its input), tool misuse, data theft, and privilege escalation (gaining unauthorized higher-level access), with connections to industry security frameworks like the OWASP Top 10 for Agentic Applications.
High-frequency traders on decentralized applications (DApps, which are programs built on blockchains) are vulnerable to honeypots, which are traps created by attackers that use publicly visible transaction data to trick users into executing transactions that will fail. Researchers identified 636 honeypot incidents affecting 99 smart contracts (self-executing programs on blockchains) that caused over 25 million dollars in losses, and developed methods to detect these traps and analyze why transactions fail. The study proposes mitigation strategies based on understanding the causes of transaction reversions (when a transaction fails and is undone), though detailed implementation specifics are not provided in this summary.
Fix: The source says the researchers 'propose potential strategies to mitigate these security risks and validate them in a simulated environment,' but it does not describe the strategies or give implementation details.
IEEE Xplore (Security & AI Journals)
This research studies how to securely communicate using drones (autonomous aerial vehicles, or AAVs) that transmit data through free space using light beams (FSO, or free-space optical communication) rather than radio waves. The researchers analyze physical layer security (PLS, which means protecting data at the lowest level of how information is transmitted) when multiple users share the connection and an eavesdropper tries to intercept the signal, accounting for real-world challenges like atmospheric turbulence and pointing errors. They propose optimizing certain communication parameters to reduce the chance that secret information leaks to unauthorized listeners while maintaining good data transmission speeds.
Vision-language models (AI systems that process both images and text together) can leak private information from user-uploaded content, such as identifying people in photos or extracting sensitive text. This research examines privacy risks when users submit images and text to these models. The paper proposes privacy-preserving methods to protect user data while still allowing these AI systems to function effectively.
This survey paper examines algorithm debt in machine learning and deep learning systems, which refers to the long-term costs and problems that accumulate when developers use suboptimal algorithms or methods in AI projects. The paper defines what algorithm debt is, identifies warning signs called 'smells' that indicate its presence, and discusses future research directions. Understanding algorithm debt helps developers recognize when quick, temporary solutions in AI projects create technical problems that become harder and more expensive to fix later.
This paper describes a security problem in blockchain payment channels (like the Lightning Network, which allows faster transactions by bundling multiple payments together): malicious intermediate nodes can intercept funds by reading payment conditions sent in plaintext. The authors propose a solution using a new encryption method called CUAP-PRE (ciphertext unlinkable autonomous path proxy re-encryption, which encrypts payment instructions so intermediate nodes can't see or trace them) combined with an improved payment protocol that lets the final receiver control decryption rights in reverse order to unlock the funds.
Fix: The proposed solution is a secure off-chain payment protocol (SOCP) built on the new CUAP-PRE cryptographic primitive. According to the source, this protocol prevents malicious nodes by: (1) enabling the delegator to designate all trusted delegatees, (2) using ciphertext unlinkability to resist inference attacks and path tracing to ensure anonymity, and (3) implementing an enhanced multi-hop Hash Time-Lock Contract where the receiver at the end of the payment path can control decryption rights in a reversed multi-hop delegation manner to unlock the corresponding bitcoins on hold.
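For context, the plain multi-hop Hash Time-Lock Contract that the protocol builds on can be sketched as follows. This is the standard construction only, with hypothetical class and function names, no routing fees, and no CUAP-PRE layer. In this plain form every hop sees the same hash lock in the clear, which is the linkability the paper sets out to remove.

```python
import hashlib

class HTLC:
    """Funds locked until the SHA-256 preimage is shown before expiry."""

    def __init__(self, payer, payee, amount, hash_lock, expiry):
        self.payer = payer
        self.payee = payee
        self.amount = amount
        self.hash_lock = hash_lock
        self.expiry = expiry
        self.settled = False

    def claim(self, preimage, now):
        if self.settled or now >= self.expiry:
            return False
        if hashlib.sha256(preimage).digest() != self.hash_lock:
            return False
        self.settled = True
        return True

def lock_path(path, amount, hash_lock, base_expiry, delta):
    """Create one HTLC per hop; expiries shrink toward the receiver."""
    hops = len(path) - 1
    return [
        HTLC(path[i], path[i + 1], amount, hash_lock,
             base_expiry + (hops - 1 - i) * delta)
        for i in range(hops)
    ]

def settle_path(contracts, preimage, now):
    """Receiver reveals the preimage; each hop then claims upstream."""
    return [c.claim(preimage, now) for c in reversed(contracts)]
```

Settlement runs from the receiver back toward the sender, which mirrors the reverse-order unlocking described above. The decreasing expiries give each intermediate node time to claim from its predecessor after paying its successor.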
IEEE Xplore (Security & AI Journals)