Academic papers, new techniques, benchmarks, and theoretical findings in AI/LLM security.
This research proposes a method for AI systems to learn and understand the unique decision-making patterns of individual human operators in cyber defense roles, such as their risk tolerance and curiosity levels. Rather than trying to copy what operators do, the approach uses a kernel-based inverse learning framework (a mathematical technique to infer hidden traits from observed behavior) to build personalized models that can provide better guidance and support. The method was tested with 108 participants and showed it can accurately predict individual decision-making styles even with limited data, helping AI assistants adapt their support to different operators while maintaining mission safety.
This research proposes CTCV, a framework for verifying that data stored on edge nodes (computers positioned between users and distant servers for faster access) has not been corrupted or tampered with. The framework uses blockchain (a distributed ledger technology) so that edge nodes can check each other's data integrity without relying on a single trusted auditor. It counters collusion attacks (where multiple nodes work together to hide data corruption) through its verification design and by enforcing deadlines on how quickly nodes must respond.
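The general idea of a deadline-bound integrity check can be sketched as a nonce-based challenge and response. This is a generic illustration, not CTCV itself: the function names, the 0.5-second deadline, and the assumption that the verifier holds its own reference copy of the data are all hypothetical, and the blockchain and anti-collusion logic are not modeled.

```python
import hashlib
import hmac
import os


def make_challenge() -> bytes:
    """Return a fresh random nonce so that an old proof cannot be replayed."""
    return os.urandom(16)


def prove(data: bytes, nonce: bytes) -> str:
    """Compute a proof that binds the stored data to this specific challenge."""
    return hashlib.sha256(nonce + data).hexdigest()


def verify(reference: bytes, nonce: bytes, proof: str,
           elapsed_s: float, deadline_s: float = 0.5) -> bool:
    """Accept only a correct proof that arrived within the deadline.

    The deadline (a hypothetical value here) limits the time a node has
    to fetch the data from a colluding peer before answering.
    """
    if elapsed_s > deadline_s:
        return False
    expected = hashlib.sha256(nonce + reference).hexdigest()
    return hmac.compare_digest(expected, proof)
```

A verifier would time the round trip itself and pass the measured value as `elapsed_s`; a late answer is rejected even if the hash is correct.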
Researchers discovered a serious weakness in tools designed to detect third-party libraries (external code that apps use) in Android applications. They created LibPass, an attack method that generates adversarially modified versions of apps that cause these detection tools to miss dangerous or non-compliant libraries, with success rates of up to 99%. The study shows that current detection tools are not robust against deliberate evasion, which puts users at risk because unsafe libraries could remain hidden inside apps.
Smart grids (power distribution systems that communicate usage data electronically) currently use classical public-key cryptosystems (encryption methods based on mathematical problems that are hard to solve) to protect power consumption information, but quantum computing threatens to break these systems. This paper proposes QC-EAM, a new security model that uses quantum encryption and the quantum Fourier transform (the quantum analogue of the discrete Fourier transform) to protect smart grid communications, tested on IBM's quantum computing platform.
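For readers unfamiliar with the quantum Fourier transform, it maps a basis state |j> to an equal-weight superposition of all basis states |k> with phase exp(2*pi*i*j*k/N), where N is the number of basis states. The sketch below builds that matrix classically for a small number of qubits. It only illustrates the transform; it is not QC-EAM, and the function names are hypothetical.

```python
import cmath
import math


def qft_matrix(n_qubits):
    """Build the N x N quantum Fourier transform matrix, N = 2**n_qubits."""
    dim = 2 ** n_qubits
    omega = cmath.exp(2j * math.pi / dim)   # primitive N-th root of unity
    scale = 1 / math.sqrt(dim)              # normalization keeps the matrix unitary
    return [[scale * omega ** (row * col) for col in range(dim)]
            for row in range(dim)]


def apply_matrix(matrix, state):
    """Multiply a matrix by a state vector given as a list of amplitudes."""
    return [sum(m * s for m, s in zip(row, state)) for row in matrix]
```

Applying `qft_matrix(2)` to the basis state `[1, 0, 0, 0]` yields four equal amplitudes of magnitude 0.5, which is the uniform superposition expected for |0>.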
Advanced web bots like OpenWPM (a browser automation tool) can hide their identity and mimic human behavior, making them hard to detect and potentially enabling fraud or data theft. Researchers developed a detection system that analyzes four types of browsing behaviors (mouse movement, clicks, keystrokes, and scrolling) using machine learning classification models to identify these stealthy bots with 98.8% accuracy.
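As a rough illustration of what a behavioral feature might look like, the sketch below computes two simple mouse-movement statistics. The feature choice and function names are assumptions made for this example; the source does not list the features or models the researchers used, and bots that mimic humans would likely require much richer signals than these.

```python
import math


def path_straightness(points):
    """Ratio of straight-line displacement to travelled path length.

    A value of 1.0 means a perfectly straight path; smaller values mean
    the pointer wandered more.
    """
    path = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if path == 0:
        return 1.0
    return math.dist(points[0], points[-1]) / path


def speed_variance(points, timestamps):
    """Variance of pointer speed between consecutive samples."""
    speeds = []
    for i in range(len(points) - 1):
        dt = timestamps[i + 1] - timestamps[i]
        if dt > 0:
            speeds.append(math.dist(points[i], points[i + 1]) / dt)
    if not speeds:
        return 0.0
    mean = sum(speeds) / len(speeds)
    return sum((s - mean) ** 2 for s in speeds) / len(speeds)
```

Features such as these would be computed per session and passed to a classifier alongside click, keystroke, and scroll statistics.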
Model-based offline reinforcement learning (RL, where an AI learns to make decisions from a fixed dataset without interacting with a live environment) struggles because static data makes it hard to develop robust policies. This paper introduces MORAL, which uses adversarial data augmentation (a technique where competing AI models deliberately generate challenging training examples to improve robustness) to dynamically enrich training data and improve policy learning instead of using traditional fixed rollout methods.
Researchers have identified a new attack called user isolation poisoning (UIP) that targets decentralized federated learning (DFL, a system where multiple computers train AI models together without sending raw data to a central server). A malicious participant in DFL can use an adversarial message-passing graph neural network (a type of AI model that shares information between connected nodes) to strategically corrupt their model updates, which tricks the system into ignoring honest participants' contributions and reduces the overall accuracy of the shared model by up to 49.5%.
This research presents HIMT-NAS, an improved method for neural architecture search (NAS, the process of automatically designing neural network structures) that handles multiple tasks at once. The new approach tracks historical information about previous network designs across generations to reduce wasted search effort and adjusts how knowledge is shared between different tasks based on their similarity, addressing problems in existing multitask NAS methods.
AI-generated image forgeries, such as those produced by GANs (generative adversarial networks, AI models that synthesize images), are hard to detect reliably, especially when detectors face unseen forgery types or noisy images. The researchers attribute these failures to frequency bias (a tendency to rely on certain frequency-domain patterns in image data while ignoring others). They developed a frequency alignment method that removes the frequency-domain differences between real and fake images, and that can be used either to attack these detectors or to strengthen them.
Fix: The source proposes a two-step frequency alignment method to remove the frequency discrepancy between real and fake images. According to the text, this method 'can serve as a strong black-box attack against forgery detectors in the anti-forensic context or, conversely, as a universal defense to improve detector reliability in the forensic context.' The authors developed corresponding attack and defense implementations and demonstrated their effectiveness across twelve detectors, eight forgery models, and five evaluation metrics.
IEEE Xplore (Security & AI Journals)
This research addresses limitations in proxy re-encryption (a technique in which a proxy converts data encrypted for one user into data that another user can decrypt, without the proxy being able to read it) by proposing a new system called privacy-preserving proxy bilateral access control. The new system allows both the sender and the receiver to set rules about what data can be shared, while protecting the message from being read by unauthorized parties and from being altered or forged as it is forwarded through multiple nodes.
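To make the underlying primitive concrete, here is a toy sketch of classic ElGamal-style proxy re-encryption. It is not the proposed bilateral access control system and includes no access rules or forgery protection. The parameters are deliberately tiny and insecure, the function names are hypothetical, and in this simple variant the re-encryption key is derived from both secret keys.

```python
import secrets

# Toy parameters, far too small for real use: P = 2*Q + 1 with P and Q prime,
# and G generates the subgroup of order Q.
P, Q, G = 1019, 509, 4


def keygen():
    """Return (secret key, public key)."""
    sk = secrets.randbelow(Q - 1) + 1           # secret exponent in [1, Q-1]
    return sk, pow(G, sk, P)


def encrypt(pk, m):
    """Encrypt an integer m in [1, P-1] as (m * g^r, g^(a*r))."""
    r = secrets.randbelow(Q - 1) + 1
    return (m * pow(G, r, P)) % P, pow(pk, r, P)


def rekey(sk_from, sk_to):
    """Re-encryption key b/a mod Q, handed to the proxy."""
    return (sk_to * pow(sk_from, -1, Q)) % Q


def reencrypt(rk, ct):
    """Proxy step: turn g^(a*r) into g^(b*r) without learning m."""
    c1, c2 = ct
    return c1, pow(c2, rk, P)


def decrypt(sk, ct):
    """Recover m by stripping the g^r mask from the first component."""
    c1, c2 = ct
    g_r = pow(c2, pow(sk, -1, Q), P)
    return (c1 * pow(g_r, -1, P)) % P
```

After `reencrypt(rekey(sk_a, sk_b), ct)`, the second user's secret key decrypts a ciphertext that was originally encrypted under the first user's public key.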
Current password strength meters in IoT systems (internet-connected devices) incorrectly rate passwords as secure when they contain certain number patterns, leading users to choose passwords that are actually weak. Researchers discovered that numbers in passwords follow predictable semantic patterns (like common sequences or meaningful digit combinations), which attackers can exploit using improved PCFG (probabilistic context-free grammar) attacks, which guess passwords by learning common structures from leaked password databases. The study proposes updating password strength meters to account for these digit patterns when evaluating password security.
Fix: The source proposes "a feasible scheme to improve the password strength meter for IoT systems based on the high-frequency semantic characteristics of digit segments" but does not provide specific implementation details, code, or concrete steps in the text provided.
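Since the source gives no implementation details, the following is only a hypothetical sketch of how a meter might flag predictable digit segments. The categories (repeat, sequence, year, month-day date) and function names are assumptions chosen for illustration, not the taxonomy from the paper.

```python
import re


def classify_digit_segment(segment):
    """Assign a digit run to a coarse, hypothetical pattern category."""
    if len(segment) >= 3 and len(set(segment)) == 1:
        return "repeat"                          # e.g. 1111
    steps = {int(b) - int(a) for a, b in zip(segment, segment[1:])}
    if len(segment) >= 3 and steps in ({1}, {-1}):
        return "sequence"                        # e.g. 1234 or 4321
    if re.fullmatch(r"(19|20)\d{2}", segment):
        return "year"                            # e.g. 1998
    if re.fullmatch(r"(0[1-9]|1[0-2])(0[1-9]|[12]\d|3[01])", segment):
        return "date_mmdd"                       # e.g. 0704
    return "other"


def digit_segments(password):
    """Return every maximal digit run in the password with its category."""
    return [(s, classify_digit_segment(s)) for s in re.findall(r"\d+", password)]
```

A meter could lower its score whenever a segment falls into any category other than "other", because those segments add little guessing resistance.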
IEEE Xplore (Security & AI Journals)
This research proposes using generative AI (AI systems that can create new content) to automatically build multimedia knowledge graphs (MKGs, which are tools that organize data by showing how images, text, and other media relate to each other). The approach uses a quality index (QI, a computed score that measures how good generated images are) to evaluate synthetic images, reducing manual review work while keeping expert judgment for difficult or safety-critical decisions.
Vertical federated learning (VFL, a method where multiple parties train an AI model together by sharing features derived from their local data without sharing the raw data itself) can leak sensitive information through the shared features, making them vulnerable to attacks like reconstruction and inference (where attackers try to figure out or recreate the original data). FedFlex is a new framework that protects these shared features by combining VFL with differential privacy (DP, a technique that adds noise to data to hide individual information), first adding a fixed amount of noise and then automatically adjusting how features are shared to improve accuracy while maintaining privacy protection.
Fix: FedFlex addresses the problem through a two-step integration approach: first, it achieves generic protection by adding a task-agnostic amount of noise; subsequently, it adaptively adjusts the scale and distribution of the features to be shared in a trainable manner, thereby enhancing model accuracy under the added noise.
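The two steps can be pictured with a minimal sketch in which a party rescales its feature vector, bounds its norm, and adds Gaussian noise before sharing. This is an assumption-laden illustration rather than FedFlex's algorithm: the clipping step, the per-feature `scale` vector (which would be learned during training), and all names are hypothetical, and no privacy accounting is shown.

```python
import math
import random


def clip_l2(vec, max_norm):
    """Shrink the vector so that its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * factor for v in vec]


def share_features(features, scale, max_norm, sigma, rng):
    """Rescale, clip, then add fixed Gaussian noise to the shared features.

    `scale` stands in for trainable per-feature parameters; `sigma` is the
    fixed noise level chosen independently of the task.
    """
    scaled = [s * f for s, f in zip(scale, features)]
    clipped = clip_l2(scaled, max_norm)
    return [c + rng.gauss(0.0, sigma) for c in clipped]
```

For example, `share_features([0.2, 1.5], [1.0, 0.5], 1.0, 0.1, random.Random(0))` returns a noisy two-element vector that a party could send in place of its raw features.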
IEEE Xplore (Security & AI Journals)
Researchers developed a dual-locking security method for protecting trained neural networks by combining two techniques: a PIN (personal identification number)-based watermark embedded in the network's bias coefficients, and a cryptographic key that scrambles the network's internal index vectors. When locked without the correct key, the network becomes nearly non-functional (dropping accuracy below 10%), but unlocking with the right key fully restores its performance while keeping the ownership watermark hidden inside the model.
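The key-based scrambling half of this idea can be sketched as a permutation derived from a secret key. The example below is a hypothetical illustration only: it omits the watermark, the names are invented, and Python's `random.Random` is not a cryptographically secure generator, so a real scheme would need a proper keyed permutation.

```python
import hashlib
import random


def _permutation(key, n):
    """Derive a deterministic permutation of range(n) from the key."""
    seed = int.from_bytes(hashlib.sha256(key.encode()).digest(), "big")
    order = list(range(n))
    random.Random(seed).shuffle(order)   # illustrative only, not secure
    return order


def lock(index_vector, key):
    """Scramble the index vector with the key-derived permutation."""
    order = _permutation(key, len(index_vector))
    return [index_vector[i] for i in order]


def unlock(locked, key):
    """Invert the permutation; only the same key restores the original."""
    order = _permutation(key, len(locked))
    restored = [None] * len(locked)
    for pos, src in enumerate(order):
        restored[src] = locked[pos]
    return restored
```

Unlocking with a different key applies the wrong inverse permutation, leaving the indices scrambled, which mirrors the accuracy collapse described above.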
This research proposes a new method for protecting data privacy in deep learning (training multi-layer neural networks, here on sensitive data) by adding Gaussian noise (random values from a bell-curve distribution) to ResNets (a type of neural network with skip connections). The method aims to provide differential privacy (a mathematical guarantee that an individual's data cannot be easily identified from the model's results) while achieving better accuracy and speed than existing privacy-protection techniques like DPSGD (differentially private stochastic gradient descent, a slower privacy-focused training method).
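For context, the classical Gaussian mechanism calibrates the noise scale to a privacy budget with the formula sigma = sqrt(2 ln(1.25/delta)) * sensitivity / epsilon, which holds for epsilon between 0 and 1. The sketch below applies that textbook calibration to a list of values. It does not reproduce where or how the proposed method injects noise into a ResNet, and the function names are hypothetical.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism (0 < epsilon < 1)."""
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("requires 0 < epsilon < 1 and 0 < delta < 1")
    return math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / epsilon


def privatize(values, sensitivity, epsilon, delta, rng):
    """Add independent Gaussian noise to each value."""
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    return [v + rng.gauss(0.0, sigma) for v in values]
```

A smaller epsilon (stronger privacy) gives a larger sigma, which is the accuracy cost that methods in this area try to reduce.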
GRACE-FL is a framework for federated learning (collaborative training where multiple devices learn together while keeping their data private) that reduces energy use and communication costs on resource-limited devices like smartphones or IoT sensors. The system adjusts each device's training settings based on how much battery or power it has available, so devices with more energy can do harder computational work while weaker devices do lighter work, and a special aggregation strategy (method for combining results) weights each device's contribution fairly based on its energy capacity.
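One way to picture energy-aware training and aggregation is sketched below. The mapping from energy to local epochs and the proportional weighting are assumptions made for illustration; the source does not specify GRACE-FL's actual rules, and both function names are hypothetical.

```python
def local_epochs(energy, min_epochs=1, max_epochs=5):
    """Map a normalized energy level in [0, 1] to a number of local epochs."""
    energy = min(1.0, max(0.0, energy))
    return min_epochs + round(energy * (max_epochs - min_epochs))


def aggregate(updates, energies):
    """Average model updates, weighting each device by its energy share.

    Proportional weighting is an assumption of this sketch.
    """
    total = sum(energies)
    weights = [e / total for e in energies]
    dim = len(updates[0])
    return [sum(w * u[i] for w, u in zip(weights, updates)) for i in range(dim)]
```

With two devices whose energy levels are 1 and 3, the second device's update receives three quarters of the weight in the combined model.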
This paper presents SUNG, a framework for offline-to-online reinforcement learning (RL), which is training an AI agent first on existing data and then improving it through live interactions. The framework addresses two main problems: limited exploration due to offline data constraints and distribution shift (when the agent encounters data patterns it wasn't trained on). SUNG uses uncertainty estimation via a VAE (variational autoencoder, a type of neural network that learns data patterns) to guide both exploration (trying new actions) and exploitation (using known good actions), achieving strong performance on standard benchmarks.
Deep model fusion is a technique that combines parameters or predictions from multiple deep learning models into one unified system to improve performance by reducing individual model biases and errors. The survey groups fusion methods into four main approaches: weight averaging (averaging model parameters), mode connectivity (connecting models through optimized paths), alignment (matching corresponding units between models), and ensemble learning (combining model outputs during inference). However, applying this technique to large-scale models like LLMs (large language models, which are AI systems trained on massive amounts of text) faces challenges including high computational cost and interference between different types of models.
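The difference between the first and last approaches can be shown in a few lines: weight averaging merges parameters into a single model, while ensembling keeps the models separate and merges their outputs. This is a minimal sketch with hypothetical names, representing a model as a dictionary of parameter lists.

```python
def average_weights(models):
    """Weight averaging: element-wise mean of same-shaped parameters."""
    n = len(models)
    return {
        name: [sum(m[name][i] for m in models) / n
               for i in range(len(models[0][name]))]
        for name in models[0]
    }


def ensemble_predict(predict_fns, x):
    """Ensemble learning: average the predictions of separate models."""
    preds = [f(x) for f in predict_fns]
    return sum(preds) / len(preds)
```

Weight averaging yields one model with the inference cost of a single network, whereas the ensemble must run every member at inference time.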
Researchers developed OR-SLZNet, a deep learning model that helps drones automatically identify safe landing zones by analyzing camera images in real time. The model assigns each pixel a safety score by combining visual features like color and texture with geometric information like flatness and slope, enabling drones to make quick landing decisions in emergencies or autonomous missions.