Academic papers, new techniques, benchmarks, and theoretical findings in AI/LLM security.
This research proposes ATRIA, a system for verifying that copies of data stored across multiple edge computing servers are authentic and haven't been tampered with. ATRIA uses TEEs (trusted execution environments, which are secure hardware areas that isolate sensitive operations) to shift the work of generating verification tags from resource-limited user devices to more powerful servers, while also protecting user privacy through anonymous identities that a trusted authority can trace if needed. The system protects against attacks where servers collude or create fake data on demand, and testing shows it uses less computing power than similar existing approaches.
Wireless Sensor Networks (WSNs, collections of small wireless devices that sense and relay data) are vulnerable to node failures and malicious attacks because they operate with limited resources in open environments. This paper proposes EFTE, a framework that evaluates the trustworthiness of individual nodes by measuring their communication quality, remaining battery power, behavior consistency, and movement patterns, then uses entropy-based weighting (a mathematical approach to handle uncertainty in data) and a fuzzy inference system (a method that makes decisions from incomplete or uncertain information) to identify and isolate untrustworthy nodes while protecting data with lightweight encryption.
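The entropy-based weighting step can be illustrated with the standard entropy weight method. This is a minimal sketch, not EFTE's actual formulation: the function names, the example readings, and the weighted-sum trust score are assumptions made for illustration, and the fuzzy inference stage is not shown.

```python
import math

def entropy_weights(scores):
    """Return one weight per metric using the standard entropy weight method.

    scores: list of rows, one per node; each row holds non-negative metric values.
    """
    n, m = len(scores), len(scores[0])
    if n < 2:
        raise ValueError("need at least two nodes to compare")
    diversity = []
    for j in range(m):
        column = [row[j] for row in scores]
        total = sum(column)
        # Share of each node in this metric; uniform if the column is all zeros.
        shares = [v / total for v in column] if total > 0 else [1.0 / n] * n
        # Normalised Shannon entropy in [0, 1]; 1 means every node has the same value.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        # Clamp tiny negative values caused by floating-point rounding.
        diversity.append(max(0.0, 1.0 - entropy))
    spread = sum(diversity)
    if spread == 0:
        return [1.0 / m] * m  # no metric discriminates, so weight them equally
    return [d / spread for d in diversity]

def trust_scores(scores, weights):
    """Weighted sum of metrics per node (an assumed, simplified trust score)."""
    return [sum(w * v for w, v in zip(weights, row)) for row in scores]

# Hypothetical readings: column 0 = communication quality, column 1 = remaining energy.
readings = [[0.9, 0.5], [0.1, 0.5], [0.5, 0.5]]
weights = entropy_weights(readings)
trust = trust_scores(readings, weights)
```

A metric that is identical across all nodes carries no information for ranking them, so its weight falls to zero and the varying metric dominates the score.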
Contrastive learning (a machine learning technique where the AI learns to group similar items together and push different items apart) can suffer from sampling bias when similar samples belong to different classes or dissimilar samples belong to the same class, hurting classification accuracy. This paper proposes using out-of-distribution (OOD) detection, which identifies and masks unusual or misclassified samples, to create a better contrastive learning model that can work without needing a separate collection of known unusual samples. The authors generate synthetic samples at the boundary between normal and unusual data to train an improved detector that produces more reliable classifications.
This research addresses how to safely explore environments using reinforcement learning (RL, a type of AI training where a system learns by trial and error) without causing damage or violating safety rules. The paper introduces safe equilibrium exploration (SEE), a method that balances two competing goals: expanding the area where exploration is allowed (the feasible zone) and building a more accurate model of how the environment works, showing that these two objectives improve each other and can reach an optimal balance without any safety violations.
This paper presents a method for compressing visual data in multimodal 3D object detection systems (systems that use multiple types of sensors like cameras and LiDAR to identify and locate objects in 3D space) when processing is split between edge devices (local computers) and cloud servers. The authors propose two compression approaches: T-FFC (Transmission-Friendly Feature Compression), which shrinks the data by a factor of 4933 with minimal accuracy loss, and A-FFC (Accuracy-Friendly Feature Compression), which shrinks the data by a factor of 733 with almost no accuracy loss, allowing cloud and edge devices to work together more efficiently.
Researchers proposed Adaptive Token Dictionary (ATD), a new transformer architecture (a type of AI model good at learning relationships between different parts of data) designed to improve image restoration tasks like super-resolution and denoising while reducing computational demands. Unlike traditional transformers that struggle with high computational costs, ATD uses a learnable token dictionary (a set of learned patterns representing typical image structures) and a cross-attention mechanism (a way for the model to compare input data against these learned patterns) to achieve better performance with lower computational complexity.
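The idea of comparing input tokens against a small dictionary with cross-attention can be sketched as follows. This is a simplified illustration under stated assumptions, not the ATD architecture: there are no learned projection matrices, the dictionary is a fixed list rather than a trained parameter, and the function names are hypothetical.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def dictionary_cross_attention(tokens, dictionary):
    """Attend from each input token (query) to every dictionary entry (key and value).

    tokens: n vectors of length d; dictionary: m vectors of length d.
    Returns (outputs, attention), where attention[i][j] is the weight of entry j for token i.
    """
    d = len(dictionary[0])
    scale = 1.0 / math.sqrt(d)
    outputs, attention = [], []
    for query in tokens:
        scores = [scale * sum(q * k for q, k in zip(query, key)) for key in dictionary]
        weights = softmax(scores)
        attention.append(weights)
        # Each output is a convex combination of dictionary entries.
        outputs.append([
            sum(weights[j] * dictionary[j][c] for j in range(len(dictionary)))
            for c in range(d)
        ])
    return outputs, attention

# Toy example: two tokens, a two-entry dictionary.
tokens = [[10.0, 0.0], [0.0, 10.0]]
dictionary = [[1.0, 0.0], [0.0, 1.0]]
outputs, attention = dictionary_cross_attention(tokens, dictionary)
```

In this sketch the score matrix has n × m entries, where m is the dictionary size, instead of the n × n entries needed for self-attention over n tokens, which is one way a small dictionary can lower the cost.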
AIRPNet is a new AI system that restores damaged images while keeping them hidden from cloud services, protecting user privacy. The system works by concealing low-quality images inside other images using a technique called steganography (hiding data within other data), then restoring the hidden image without ever exposing it during processing. This approach offers better privacy protection than existing methods while maintaining image quality.
PSDO is a privacy-preserving framework for managing energy systems where prosumers (people who both consume and produce energy) can trade power without relying on a central authority. It combines decentralized optimization (a method where multiple parties solve problems together without one central controller) with differential privacy (a mathematical technique that adds noise to data to protect individual information), allowing prosumers to manage their energy autonomously while keeping their data private. Tests on an IEEE 33-bus system showed PSDO can find optimal solutions while protecting privacy better than existing methods.
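The summary does not say which noise mechanism PSDO uses. As a generic illustration of differential privacy, the sketch below applies the standard Laplace mechanism to a single value; the function names, the example value, and the parameter choices are assumptions.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def privatize(value, sensitivity, epsilon, rng):
    """Laplace mechanism: add noise with scale sensitivity / epsilon.

    A smaller epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    return value + laplace_noise(sensitivity / epsilon, rng)

# Hypothetical use: a prosumer shares a noisy version of a power value (kW).
rng = random.Random(0)
true_power = 5.0
shared = [privatize(true_power, sensitivity=1.0, epsilon=0.5, rng=rng) for _ in range(20000)]
mean_shared = sum(shared) / len(shared)
mean_abs_noise = sum(abs(s - true_power) for s in shared) / len(shared)
```

Each shared value is perturbed, but the noise has zero mean, so quantities averaged over many reports stay close to the true value.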
OwnerHunter is a system that uses large language models (AI trained on vast amounts of text) to identify who owns a website by analyzing webpage content across multiple languages. It improves on older methods that struggled when webpages listed many names or were written in non-English languages, using strategies like checking multiple sources on a page and verifying results to accurately determine the true owner.
Living-off-the-land (LOTL) attacks use legitimate tools already built into a system to avoid being detected by security software. The article examines how attackers could use on-device large language models (AI systems running locally on a user's computer rather than in the cloud) as part of these attacks, though it does not detail specific attack methods or provide concrete defenses.
Researchers developed AdvDiffusion, a method that creates adversarial patches (special sticker patterns) that can fool face recognition systems into misidentifying people, even in real-world physical environments. The technique uses a diffusion model (an AI that learns to remove noise from images) to generate patches that work against black-box models (AI systems the attacker cannot see inside). These adversarial patches are more effective and transferable across different face recognition systems than previous attack methods.
CLIP and similar vision-language models (AI systems trained on paired images and text to understand both) are vulnerable to adversarial examples (carefully crafted image modifications designed to fool AI systems). Researchers proposed two methods, TGA-ZSR and Comp-TGA, that use text-guided attention (the model's focus on image regions based on text descriptions) to make these models more robust, improving accuracy on adversarial examples by 9.58% and 11.95%, respectively.
Split learning (SL, a technique that splits AI model training across multiple computers to reduce computational burden) faces efficiency and security problems in edge computing (distributed computing done on devices near data sources) environments, where slow computers can hinder training and malicious actors may sabotage the model. The paper proposes CoDefend, a framework that uses local epoch regulation (dynamically adjusting how many training rounds each computer performs) and time-aware detection (monitoring for suspicious behavior within specific time windows) to improve both training speed and security while protecting privacy.
This research proposes DSW, a dataset-specific watermarking method to detect when text-to-image diffusion models (AI systems that generate images from text descriptions) are illegally fine-tuned (customized for specific tasks) using protected datasets. The method embeds a hidden watermark image representing the dataset owner into training data, then extracts it from images generated by models that used that data, creating a one-to-one link between the watermark and the specific misused dataset to prove intellectual property theft.
StrokePIN is an authentication system that uses keystroke dynamics (the unique way a person types, including timing and pressure patterns) combined with other data types to verify users' PIN (personal identification number) entries on mobile devices. It relies on a Siamese network (a few-shot learning model that compares pairs of inputs and can therefore learn from very few examples), so it works efficiently without constant retraining, and its security analysis shows that keystroke dynamics can provide meaningful protection against guessing attacks.
Fix: StrokePIN dynamically updates the template library (the stored reference patterns of how each user types) to mitigate the impact of user behavior drift over time, achieving a False Acceptance Rate of 8.3% and False Rejection Rate of 0.4%.
IEEE Xplore (Security & AI Journals)
This paper describes SPARTA, a protocol designed to let people create multiple separate avatars (digital representations of users in virtual spaces) in the metaverse while keeping those avatars unlinkable, meaning no one can connect different avatars to the same real person. The protocol uses mercurial signatures (a cryptographic technique that allows flexible key usage) and zero-knowledge proofs (ways to prove something is true without revealing how you know it) to enable secure authentication and prevent misuse through a reputation system based on time-based hash chains (sequences of data linked by timestamps).
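The summary does not describe how SPARTA builds its time-based hash chains. The sketch below shows a generic one-way hash chain in which one link is revealed per time epoch and checked against a published anchor; the function names and the epoch scheme are assumptions made for illustration.

```python
import hashlib
import hmac

def build_chain(seed, length):
    """Return [H(seed), H(H(seed)), ...] with `length` links; the last link is the anchor."""
    chain = [hashlib.sha256(seed).digest()]
    for _ in range(length - 1):
        chain.append(hashlib.sha256(chain[-1]).digest())
    return chain

def verify_link(link, epoch, anchor):
    """Check that hashing `link` exactly `epoch` times reproduces the published anchor."""
    value = link
    for _ in range(epoch):
        value = hashlib.sha256(value).digest()
    return hmac.compare_digest(value, anchor)

# The holder publishes the anchor, then reveals chain[-1 - t] at epoch t.
chain = build_chain(b"hypothetical-secret-seed", 5)
anchor = chain[-1]
epoch = 2
revealed = chain[-1 - epoch]
is_valid = verify_link(revealed, epoch, anchor)
```

Because the hash is one-way, knowing the anchor or a later link does not let anyone compute an earlier link, so a revealed link ties a claim to a specific epoch.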
Ano2Rule is a new method that makes unsupervised anomaly detection models (AI systems that find unusual patterns without being trained on labeled examples of anomalies) more understandable to humans by converting them into simple rules. The approach breaks down the distribution of normal data into multiple parts and creates boundary rules that explain when the model flags something as anomalous (abnormal), making it easier for security experts to trust and deploy these systems in high-stakes situations like detecting network intrusions or protecting IoT devices (internet-connected devices).
Deep neural networks can be attacked through backdoors, where attackers secretly poison training data to make the model misclassify certain inputs while appearing normal otherwise. This paper proposes Cert-SSBD, a defense method that uses randomized smoothing (adding random noise to samples) with sample-specific noise levels, optimized per sample using stochastic gradient ascent, combined with a new certification approach to make models more resistant to these attacks.
Fix: The proposed Cert-SSBD method addresses the issue by employing stochastic gradient ascent to optimize the noise magnitude for each sample, applying this sample-specific noise to multiple poisoned training sets to retrain smoothed models, aggregating predictions from multiple smoothed models, and introducing a storage-update-based certification method that dynamically adjusts each sample's certification region to improve certification performance.
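Randomized smoothing, the building block named above, can be sketched as a majority vote over noisy copies of an input. This is a minimal illustration and not the Cert-SSBD pipeline: the noise level `sigma` is simply passed in rather than optimized per sample by stochastic gradient ascent, the toy classifier is hypothetical, and no certification step is shown.

```python
import random
from collections import Counter

def smoothed_predict(classifier, x, sigma, n_samples, rng):
    """Classify many Gaussian-perturbed copies of x and return the majority label.

    Returns (label, vote_fraction); a higher fraction suggests a more stable prediction.
    """
    votes = Counter()
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[classifier(noisy)] += 1
    label, count = votes.most_common(1)[0]
    return label, count / n_samples

def toy_classifier(features):
    """Hypothetical base classifier: class 1 if the feature sum is positive, else 0."""
    return 1 if sum(features) > 0 else 0

rng = random.Random(0)
label_pos, frac_pos = smoothed_predict(toy_classifier, [2.0, 2.0], 0.5, 500, rng)
label_neg, frac_neg = smoothed_predict(toy_classifier, [-2.0, -2.0], 0.5, 500, rng)
```

In a sample-specific scheme, `sigma` would differ from one input to the next; here it is a single argument so the voting logic stays visible.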
Gradient leakage attacks (methods that steal private data by analyzing the mathematical updates sent between computers in federated learning, where AI training happens across multiple devices) pose privacy risks in federated learning systems. Researchers discovered that different layers of neural networks (sections that process information at different stages) leak different amounts of private information, so they created Layer-Specific Gradient Protection (LSGP), which applies stronger privacy protection to layers that leak more sensitive data rather than protecting all layers equally.
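The summary does not specify how LSGP measures leakage or what protection it applies. One way to picture layer-specific protection is to clip each layer's gradient and add Gaussian noise whose strength scales with a per-layer leakage score. The sketch below does exactly that; the leakage scores, layer names, and function name are all hypothetical.

```python
import math
import random

def protect_gradients(layer_grads, leakage, base_sigma, clip_norm, rng):
    """Clip each layer's gradient to clip_norm, then add noise scaled by its leakage score.

    layer_grads: dict of layer name -> list of floats.
    leakage: dict of layer name -> non-negative score (assumed to be given).
    """
    protected = {}
    for name, grad in layer_grads.items():
        norm = math.sqrt(sum(g * g for g in grad))
        factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        sigma = base_sigma * leakage[name]  # leakier layers receive more noise
        protected[name] = [g * factor + rng.gauss(0.0, sigma) for g in grad]
    return protected

rng = random.Random(0)
grads = {"conv1": [3.0, 4.0], "fc": [0.3, 0.4]}
scores = {"conv1": 0.0, "fc": 1.0}  # hypothetical: only "fc" is treated as leaky
protected = protect_gradients(grads, scores, base_sigma=0.5, clip_norm=1.0, rng=rng)
```

With a leakage score of zero, a layer is only clipped, while a layer with a higher score is perturbed more, which mirrors the idea of not protecting all layers equally.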
When users send prompts to LLM services like ChatGPT, sensitive personal information (such as names, addresses, or ID numbers) can leak out, even when basic privacy protections are used. This paper presents Rap-LI, a framework that identifies which parts of a user's input contain sensitive data and applies stronger privacy protection to those specific parts, rather than treating all data equally.