Academic papers, new techniques, benchmarks, and theoretical findings in AI/LLM security.
IoT devices in rental settings such as Airbnbs need a secure way to transfer permissions (access rights) from owners to renters, but current systems do not reliably prevent abuses such as a malicious owner keeping camera access after handing the device over. Forseti is a new authorization framework that uses zero-knowledge proofs (a cryptographic method for proving a statement is true without revealing the underlying details) and a decentralized ledger (a shared, distributed record not controlled by any single party) to protect both owners' and renters' control over devices during permission transfers.
Fix: The source presents Forseti as a proposed solution framework that 'leverages zero-knowledge proof and a decentralized ledger to ensure that the rights of both hosts and tenants are not violated.' However, the source does not describe a specific implementation step, patch, update, or deployment procedure that users can apply.
IEEE Xplore (Security & AI Journals)
OCEAN is a security system designed for Industrial IoT (the use of connected devices in factories and industrial settings) that aims to prevent packet loss (data getting dropped during transmission) while keeping data transmission fast and secure. It combines specialized hardware (an ASIC, a chip built for one fixed task, and an FPGA, a chip that can be reprogrammed after manufacture) with a network protocol (a set of rules for how data moves between devices) that verifies packets at each hop and caches (temporarily stores) them until it receives confirmation that they arrived safely.
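The verify-at-each-hop and cache-until-confirmed behavior described for OCEAN can be illustrated with a minimal software sketch. Everything here is an assumption for illustration: the summary does not say how packets are verified, so HMAC-SHA256 over a shared key stands in for the check, and the names `Packet`, `make_packet`, and `Hop` are hypothetical rather than taken from the paper, which implements its design in hardware.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    seq: int        # sequence number used to match acknowledgements
    payload: bytes
    tag: bytes      # authentication tag over seq + payload


def _message(seq: int, payload: bytes) -> bytes:
    return seq.to_bytes(8, "big") + payload


def make_packet(key: bytes, seq: int, payload: bytes) -> Packet:
    """Build a packet with an HMAC-SHA256 tag (assumed verification scheme)."""
    tag = hmac.new(key, _message(seq, payload), hashlib.sha256).digest()
    return Packet(seq, payload, tag)


class Hop:
    """Hypothetical relay node: verify on receipt, cache until acknowledged."""

    def __init__(self, key: bytes) -> None:
        self.key = key
        self.cache: dict[int, Packet] = {}

    def receive(self, pkt: Packet) -> bool:
        expected = hmac.new(
            self.key, _message(pkt.seq, pkt.payload), hashlib.sha256
        ).digest()
        if not hmac.compare_digest(expected, pkt.tag):
            return False            # drop packets that fail verification
        self.cache[pkt.seq] = pkt   # hold a copy until the next hop confirms
        return True

    def on_ack(self, seq: int) -> None:
        self.cache.pop(seq, None)   # confirmed downstream, safe to forget

    def pending(self) -> list[Packet]:
        """Packets still awaiting confirmation, i.e. retransmission candidates."""
        return [self.cache[s] for s in sorted(self.cache)]
```

In this sketch a forged packet is rejected before it is cached, and anything returned by `pending()` can be resent if no confirmation arrives, which is the loss-avoidance idea the summary describes.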
This research presents mmFace, a face authentication system that uses millimeter wave radar (mmWave, radio signals that can penetrate materials and detect fine details) instead of cameras to verify a person's identity while resisting spoofing attacks (fake faces or replayed recordings). The system works even when users wear masks because mmWave signals can pass through them, and it uses techniques like liveness detection (checking that a face is real and alive) and amplitude modulation-based methods to prevent attackers from fooling it with fake faces or recorded videos.
This paper proposes CQT-AKA, a security method for mobile devices that combines cancelable biometrics (fingerprints or facial features that can be regenerated if compromised) with quantum-resistant encryption (encryption designed to withstand attacks by future quantum computers) to securely exchange encryption keys between devices. The approach is more secure than traditional methods that rely on passwords or smart cards alone, and it works well on resource-limited devices because it requires less storage and computing power.
OptiVersa-ECDSA is a new cryptographic protocol that improves threshold-ECDSA (a method where multiple parties must cooperate to sign blockchain transactions securely). The protocol uses a novel technique called verifiable secret-product sharing (VSPS, a way to distribute and verify secret values) to achieve 35-65% faster performance and a 99% improvement in cheater identification compared to previous approaches, making it practical for real-time blockchain use.
Recommender systems (platforms that suggest products or services to users) are vulnerable to data poisoning attacks (malicious manipulation of the data the system learns from to make it behave incorrectly). This paper presents METT, a detection method that identifies these attacks even when they are carefully hidden or small-scale, using techniques like causality inference (analyzing cause-and-effect relationships in user behavior) and a disturbance tolerance mechanism (a way to distinguish real attack patterns from false alarms).
This research paper presents a new method for coverless image steganography (CIS, a technique that conveys secret information by mapping it to existing, unmodified images rather than embedding it in an altered image), designed to resist black-box attacks (attacks where an attacker can't see how the system works, only its outputs). The method uses SIFT (Scale-Invariant Feature Transform, an algorithm that identifies distinctive points in images) to create a dataset and mapping structure that hides data more securely and with greater capacity than previous CIS methods.
This research presents a self-supervised learning (SSL, a training method where an AI learns patterns from unlabeled data without human annotations) framework to help soft robots understand their own body position and movement. The key innovation is that the approach uses large amounts of unannotated data to train an initial model, then fine-tunes it with just a small set of labeled examples, requiring only about 5% of the annotated data that traditional supervised learning methods need while achieving better results.
This research addresses how to make reinforcement learning (RL, where AI systems learn to make decisions by trial and error) safer for healthcare by proposing a method called Constraint Transformer that learns safety rules from historical medical records instead of requiring real-time interaction. The system uses a causal attention mechanism (a technique that identifies which past events matter most) and a generative world model (a simulation tool) to identify unsafe treatment decisions and improve patient outcomes while reducing harmful behaviors.
Glacial lake outburst floods (GLOFs, sudden releases of water from glacial lakes that threaten communities) are dangerous, and detecting them early requires accurate identification of glacial lakes and assessment of their risk. Researchers developed AdU-Net, a framework combining a dilated U-Net (a type of neural network architecture for image analysis) with a vision transformer encoder to identify glacial lakes in satellite imagery, and then used a modified spiking neural network (SNN, a type of AI model that processes information similarly to how neurons communicate) to analyze how the risk of outbursts changes over time.
Researchers studied how humans use two types of thinking (fast intuitive processing and slower logical reasoning) when looking at images, and tested whether AI systems like multimodal large language models (MLLMs, which process both text and images together) have similar abilities. They found that while MLLMs have improved at correcting intuitive errors, they still struggle with logical processing tasks that require deeper analysis, and segmentation models (AI systems that identify objects in images) make errors similar to human intuitive mistakes rather than using logical reasoning.
Researchers developed a new method for watermarking LLM outputs (adding hidden markers to prove ownership and track content) using a three-part system that works only through input prompts, without needing access to the model's internal parameters. The approach uses one AI to create watermarking instructions, another to generate marked outputs, and a third to detect the watermarks, making it work across different LLM types including both proprietary and open-source models.
This research creates a benchmark and evaluation framework for online safety analysis of LLMs, which involves detecting unsafe outputs while the AI is generating text rather than after it finishes. The study tests various safety detection methods on different LLMs and finds that combining multiple methods together, called hybridization, can improve safety detection effectiveness. The work aims to help developers choose appropriate safety methods for their specific applications.
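To make the idea of online checking and hybridization concrete, here is a toy sketch that inspects the output after every token and combines two detectors by taking the higher of their scores. The detectors, their word lists, the `max` combination rule, and the threshold are all assumptions for illustration; the benchmark's actual methods are not described in the summary above.

```python
import re
from typing import Callable, Iterable, List, Optional

# A detector maps the tokens generated so far to a risk score in [0, 1].
Detector = Callable[[List[str]], float]


def lexicon_detector(tokens: List[str]) -> float:
    """Toy detector: flags any token from a small hypothetical blocklist."""
    blocked = {"malware", "ransomware"}
    return 1.0 if any(t.lower() in blocked for t in tokens) else 0.0


def pattern_detector(tokens: List[str]) -> float:
    """Toy detector: flags a hypothetical multi-word pattern."""
    text = " ".join(tokens).lower()
    return 1.0 if re.search(r"\bbypass (the )?(login|filter)\b", text) else 0.0


def first_unsafe_step(
    stream: Iterable[str], detectors: List[Detector], threshold: float = 0.5
) -> Optional[int]:
    """Return the index of the first token at which the hybrid score reaches
    the threshold, or None if the whole stream stays below it."""
    seen: List[str] = []
    for i, token in enumerate(stream):
        seen.append(token)
        hybrid_score = max(d(seen) for d in detectors)  # assumed combination rule
        if hybrid_score >= threshold:
            return i    # generation could be stopped here, mid-output
    return None
```

Taking the maximum means the hybrid flags anything either detector flags; averaging or weighted voting would be alternative rules with different trade-offs between missed detections and false alarms.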
Deep neural networks (DNNs, AI models with multiple layers that learn patterns) are vulnerable to adversarial examples, which are inputs slightly modified to trick the model into making wrong predictions. This paper introduces a concept called the certified local transferable region, a mathematically guaranteed area around an input where a single small perturbation (adversarial attack) will fool the model, and proposes a method called RAOS (reverse attack oracle-based search) to measure how large these vulnerable areas are as a way to evaluate how robust neural networks truly are.
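As a minimal illustration of what an adversarial example is, the sketch below perturbs an input to a tiny linear classifier by a small, bounded step per feature. This is not RAOS and does not compute a certified region; the weights, input, and step sizes are made-up values, and the sign-based step is a standard textbook construction used here only to show how a small change can flip a prediction.

```python
from typing import List


def predict(w: List[float], b: float, x: List[float]) -> int:
    """Linear classifier: class 1 if w.x + b >= 0, else class 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else 0


def sign(v: float) -> float:
    return float((v > 0) - (v < 0))


def perturb(w: List[float], x: List[float], eps: float, label: int) -> List[float]:
    """Shift every feature by at most eps in the direction that moves the
    score away from the current label."""
    direction = -1.0 if label == 1 else 1.0
    return [xi + direction * eps * sign(wi) for wi, xi in zip(w, x)]


# Illustrative values (assumptions, not from the paper).
w, b = [2.0, -1.0, 0.5], -0.2
x = [0.3, 0.1, 0.2]

label = predict(w, b, x)                    # original prediction
x_small = perturb(w, x, eps=0.05, label=label)
x_large = perturb(w, x, eps=0.15, label=label)
```

With these values the smaller step leaves the prediction unchanged while the larger one flips it, which is the kind of boundary between safe and vulnerable neighborhoods that robustness measurements try to characterize.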
Researchers have developed a method to hide secret data inside large language models (AI systems trained on massive amounts of text) by encoding information into the model's parameters during training. The hidden data doesn't interfere with the model's normal functions like text classification or generation, but authorized users with a secret key can extract the concealed information, enabling covert communication. The method leverages transformers (the neural network architecture behind modern AI language models) and their self-attention mechanisms (components that help the model focus on relevant parts of input) to achieve high capacity for hidden data while remaining undetectable.
Organizations struggle with cyber supply chain risk management (C-SCRM, the practice of protecting digital products and services from threats as they move through their supply chain from creation to use). The paper identifies specific obstacles by combining research, past security incidents, and industry standards to understand what makes it hard for companies to protect hardware, firmware (low-level software that controls hardware), software, and services throughout their lifecycles.
Differential privacy (DP, a mathematical technique that adds controlled randomness to data to protect individual privacy while keeping data useful) is a widely-used method for protecting sensitive information, but putting it into practice in real-world systems has proven difficult. Researchers analyzed 21 actual deployments of differential privacy by major companies and institutions over the last ten years to understand what works and what doesn't.
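The "controlled randomness" in differential privacy can be illustrated with the Laplace mechanism applied to a counting query. This is a minimal sketch, not drawn from any of the surveyed deployments: the function name `laplace_count`, the sample data, and the choice of `epsilon` are assumptions for illustration.

```python
import random
from typing import Callable, Sequence


def laplace_count(
    records: Sequence[int],
    predicate: Callable[[int], bool],
    epsilon: float,
    rng: random.Random,
) -> float:
    """Return a noisy count of records matching predicate.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    scale = 1.0 / epsilon
    # The difference of two i.i.d. exponential samples with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


# Illustrative data (an assumption): ages in a small dataset.
ages = [23, 35, 41, 29, 62, 57]
rng = random.Random(0)
noisy = laplace_count(ages, lambda a: a >= 40, epsilon=1.0, rng=rng)
```

A smaller `epsilon` gives stronger privacy but noisier answers. Real deployments also need privacy-budget accounting across queries and noise sampling that is safe against floating-point artifacts, neither of which this sketch attempts.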
This research addresses the problem of stealing attacks against healthcare APIs (application programming interfaces, which are tools that let software systems communicate with each other), where attackers try to copy or extract data from medical AI models. The authors propose a defense strategy called "adaptive teleportation" that modifies incoming queries (requests) in clever ways to fool attackers while still allowing legitimate users to get accurate results from the healthcare API.
Fix: The source proposes 'adaptive teleportation of incoming queries' as the defense mechanism. According to the text, 'The adaptive teleportation operations are generated based on the formulated bi-level optimization target and follows the evolution trajectory depicted by the Wasserstein gradient flows, which effectively push attacking queries to cross decision boundary while constraining the deviation level of benign queries.' This approach 'provides misleading information on malicious queries while preserving model utility.' The authors validated this mechanism on three healthcare prediction tasks (in-hospital mortality, bleed risk, and ischemic risk prediction) and found it 'significantly more effective to suppress the performance of cloned model while maintaining comparable serving utility compared to existing defense approaches.'
IEEE Xplore (Security & AI Journals)
OWASP's Agentic Security Initiative has created a taxonomy (a classification system for threats and their fixes) that is now being used in real developer tools like PENSAR, SPLX.AI Agentic Radar, and AI&ME to help teams build and test secure agentic AI systems (AI systems that can take actions autonomously). This taxonomy is also informing the development of OWASP's Top 10 for Agentic AI, a list of the most critical security risks in this area.