Academic papers, new techniques, benchmarks, and theoretical findings in AI/LLM security.
Scene Graph Generation (SGG, a method that identifies objects and their relationships in images) is limited by long-tailed bias, where the AI model performs well on common relationships but poorly on rare ones. This paper proposes a Grounded Cognition Method (GCM) that mimics human thinking by using techniques like Out Domain Knowledge Injection to broaden visual understanding, a Semantic Group Aware Synthesizer to organize relationship categories, modality erasure (removing one type of input at a time) to improve robustness, and a Shapley Enhanced Multimodal Counterfactual module to handle diverse contexts.
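The paper does not publish code, but the modality-erasure idea (removing one type of input at a time during training) can be sketched as a simple augmentation step; the function name and feature format below are hypothetical:

```python
import random

def erase_one_modality(features, rng=random):
    """Modality-erasure sketch: zero out one randomly chosen modality's
    feature vector so the model cannot over-rely on a single input type.
    `features` maps modality name -> list of floats (hypothetical format)."""
    erased = dict(features)
    victim = rng.choice(sorted(erased))           # pick one modality to drop
    erased[victim] = [0.0] * len(erased[victim])  # blank out its features
    return erased, victim

# usage: applied per training example as a data augmentation
sample = {"visual": [0.2, 0.9], "semantic": [1.0, 0.5, 0.3]}
augmented, dropped = erase_one_modality(sample)
```

The untouched modalities pass through unchanged, so each training step forces the model to predict from an incomplete view.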
This paper presents mathematical approaches to solve Shape-from-Template (SfT, reconstructing a 3D object's shape from a single image using a known template) and Non-Rigid Structure-from-Motion (NRSfM, figuring out how a flexible object moves and its 3D structure from video). The researchers use Semi-Definite Programming (SDP, a mathematical optimization technique for solving certain types of problems) to find solutions that work with different types of object deformation models, requiring only point correspondences (matching points between images) rather than additional impractical assumptions.
This research addresses the problem of recognizing shapes that have been rotated at different angles in computer vision (the field of teaching computers to understand images). The authors propose a new method that focuses on analyzing the outline or contour points of shapes rather than individual pixels, and they use a special neural network module to identify geometric patterns in these contours while ignoring rotation. Their approach shows better results than previous methods, especially for complex shapes, and it works even when the contour data is slightly noisy or imperfect.
This research paper argues that the real problem with machine learning classifiers isn't that robustness (resistance to adversarial attacks, where small malicious changes trick the AI) and accuracy are fundamentally opposed, but rather that continuous functions (smooth mathematical functions without jumps or breaks) cannot achieve both properties simultaneously. The authors propose that effective robust and accurate classifiers should use discontinuous functions (functions with breaks or sudden changes) instead, and show that understanding this continuity property is crucial for building, analyzing, and testing modern machine learning models.
ATLAS Data v5.1.0 is an updated framework that documents security threats and defenses related to AI systems, now containing 16 tactics, 84 techniques, and 32 mitigations. The update adds new attack methods targeting AI, such as prompt injection (tricking an AI by hiding instructions in its input), deepfake generation, and data theft from AI services, along with new defensive measures like human oversight of AI agent actions and restricted permissions for AI tools. It also includes 42 real-world case studies showing how these attacks and defenses apply in practice.
This paper presents RINNs (reparameterizable integral neural networks), a new type of AI model designed to run efficiently on mobile devices with limited computing power. The key innovation is a reparameterization strategy that converts the complex mathematical structure used during training into a simpler feed-forward structure (a straightforward sequence of processing steps) at inference time, allowing these models to achieve high accuracy (79.1%) while running very fast (0.87 milliseconds) on mobile hardware.
ATLAS Data v5.0.0 introduces a new "Technique Maturity" field that categorizes AI attack techniques based on evidence level, ranging from feasible (proven in research) to realized (used in actual attacks). The release adds 11 new techniques covering AI agent attacks like context poisoning (injecting false information into an AI system's memory), credential theft from AI configurations, and prompt injection (tricking an AI by hiding malicious instructions in its input), plus updates to existing techniques and case studies.
This research presents LipVor, an algorithm that mathematically verifies whether a trained neural network (a computer model with interconnected nodes that learns patterns) follows partial monotonicity constraints, which means outputs change predictably with certain inputs. The method works by testing the network at specific points and using mathematical properties to guarantee the network behaves correctly across its entire domain, potentially allowing neural networks to be used in critical applications like credit scoring where trustworthiness and predictable behavior are required.
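The core idea of testing at specific points and extending the guarantee via mathematical properties can be illustrated in one dimension; this is a sketch of the coverage argument, not the paper's LipVor algorithm, and `certify_increasing` is a hypothetical name:

```python
import math

def certify_increasing(df, L, a, b, tol=1e-9):
    """Lipschitz-coverage sketch: certify that a 1-D function with
    derivative `df` is increasing on [a, b], given a Lipschitz constant
    L for `df`.  If df(x) = d > 0, then df stays positive on the ball
    of radius d / L around x, so we greedily cover [a, b] with balls."""
    x = a
    while x < b:
        d = df(x)
        if d <= tol:        # positivity cannot be certified at this point
            return False
        x += d / L          # df > 0 is guaranteed up to x + d / L
    return True

# usage: f(x) = x + 0.1*sin(x) has f'(x) = 1 + 0.1*cos(x), Lipschitz const 0.1
monotone = certify_increasing(lambda x: 1 + 0.1 * math.cos(x), L=0.1, a=0.0, b=10.0)
```

A handful of derivative evaluations suffices when the derivative is bounded well away from zero, which is what makes this style of certificate practical.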
This article describes BMMA-GPT, a biometric authentication system that uses multiple forms of identification (like fingerprints and facial recognition) together with mathematical optimization to improve security and speed. The system uses a dual-threshold approach (two decision points to verify identity) and can be tailored to different organizational needs, achieving high accuracy while keeping verification time under 1.5 seconds.
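The dual-threshold approach can be sketched as a three-way decision on a fused match score; the threshold values here are illustrative, not the article's:

```python
def dual_threshold_decision(score, t_reject=0.40, t_accept=0.85):
    """Dual-threshold matching sketch: a fused biometric match score is
    accepted outright above t_accept, rejected below t_reject, and
    escalated to a secondary check in the ambiguous band in between."""
    if score >= t_accept:
        return "accept"
    if score < t_reject:
        return "reject"
    return "secondary-check"
```

Tuning the band width is how such a system trades verification speed against false accepts for different organizational needs.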
Researchers developed TabExtractor, a tool that can steal tabular models (AI systems trained on spreadsheet-like data) without needing access to the original training data or knowing how the model was built. The attack works by creating synthetic data samples and using a special neural network architecture called a contrastive tabular transformer (CTT, a type of AI that learns by comparing similar and different examples) to reverse-engineer a clone of the victim model that performs almost as well as the original. This research shows that tabular models face serious security risks from extraction attacks.
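The query-and-clone loop at the heart of such extraction attacks can be sketched with only black-box access; where the paper trains a contrastive tabular transformer, this sketch swaps in a trivial 1-nearest-neighbour surrogate, and the `victim` function is a stand-in:

```python
import random

def victim(row):
    """Black-box tabular model being stolen (hidden from the attacker)."""
    return int(2 * row[0] - row[1] > 0.5)

def extract(query, n_queries=1000, dims=2, seed=0):
    """Extraction sketch: label synthetic samples via the victim's
    prediction API, then fit a surrogate that imitates those labels."""
    rng = random.Random(seed)
    data = [[rng.random() for _ in range(dims)] for _ in range(n_queries)]
    labels = [query(row) for row in data]     # only black-box access used

    def surrogate(row):                       # 1-NN stand-in for the CTT
        nearest = min(range(len(data)),
                      key=lambda i: sum((a - b) ** 2
                                        for a, b in zip(data[i], row)))
        return labels[nearest]
    return surrogate

clone = extract(victim)
```

Even this crude surrogate closely tracks the victim on fresh inputs, which is the risk the paper quantifies with a far stronger clone architecture.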
Machine unlearning allows AI models to forget the effects of specific training samples, but verifying that this actually happened is difficult. Existing checks (such as backdoor attacks or membership inference attacks, which probe whether a model still remembers data by trying to extract or manipulate it) can be fooled by a dishonest model provider who simply tunes the model to pass the test rather than truly unlearning. This paper proposes IndirectVerify, a verification method that uses pairs of connected samples (trigger samples that are unlearned, and reaction samples whose predictions should change as a result) built with intentional perturbations (small changes to training data), yielding indirect evidence of unlearning that is much harder to fake.
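The trigger/reaction mechanism can be made concrete with a toy model; this sketch uses a 1-nearest-neighbour "model" and hypothetical data points, not the paper's construction:

```python
def knn_predict(train, x):
    """1-nearest-neighbour 'model' standing in for the provider's model."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

# IndirectVerify-style sketch: the trigger sample sits closest to the
# reaction query, so its presence controls the reaction prediction.
base_train = [(0.0, 0), (1.0, 1)]
trigger = (0.45, 1)      # perturbed sample the provider is asked to unlearn
reaction = 0.40          # query whose label depends on the trigger

before = knn_predict(base_train + [trigger], reaction)  # trigger dominates
after = knn_predict(base_train, reaction)               # trigger removed
unlearning_verified = before != after
```

A provider who merely patches the model to "forget" the trigger's own prediction would still leave the reaction sample's prediction unchanged, which is what makes the evidence indirect.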
Researchers discovered a type of backdoor attack (hidden malicious instructions planted in AI systems) on multiagent reinforcement learning systems, where one adversary agent uses its actions to trigger hidden failures in other agents' decision-making policies. Unlike previous attacks that assumed unrealistic direct control over what victims observe, this attack is more practical because it works through normal agent interactions in partially observable environments (where agents cannot always see what others are doing). The researchers developed a training method to help adversary agents efficiently trigger these backdoors with minimal suspicious actions.
This research addresses privacy risks in decentralized optimization (where multiple networked computers work together to solve a problem without a central coordinator) by proposing ZS-DDAPush, an algorithm that adds mathematical noise structures to protect sensitive node information during communication. The key innovation is that ZS-DDAPush achieves privacy protection while maintaining the accuracy and efficiency of the optimization process, avoiding the typical trade-offs seen in other privacy methods like differential privacy (adding statistical noise to protect individual data) or encryption (scrambling data so only authorized parties can read it).
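The flavour of a structured noise scheme that protects messages without degrading accuracy can be sketched with telescoping (zero-sum) noise; this is an illustration of the principle, not the ZS-DDAPush algorithm itself:

```python
import random

def masked_messages(value, rounds, seed):
    """Zero-sum noise sketch: mask each transmitted value with
    n_t = s_{t+1} - s_t, where the auxiliary sequence starts and ends
    at zero, so the noise terms telescope away across the run."""
    rng = random.Random(seed)
    s = [0.0] + [rng.uniform(-1, 1) for _ in range(rounds - 1)] + [0.0]
    return [value + (s[t + 1] - s[t]) for t in range(rounds)]

msgs = masked_messages(value=3.0, rounds=8, seed=42)
recovered = sum(msgs) / len(msgs)   # noise cancels: the average is exact
```

Each individual message is perturbed, yet the aggregate the optimization relies on is untouched, which is the trade-off-free behaviour the paper contrasts with plain differential privacy.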
This research proposes a new method for deploying cyber deception (defensive tricks to confuse attackers) in networks by combining deep reinforcement learning (a type of AI that learns by trial and error) with game theory that accounts for time delays. The method uses an algorithm called proximal policy optimization (PPO, a technique for training AI to make optimal decisions) to figure out where and when to place deception resources, and tests show it outperforms existing approaches in handling complex network attacks.
This paper describes a new watermarking technique (a method to embed hidden ownership markers into AI models) that remains stable when models are fine-tuned (adjusted to perform new tasks) across different domains. The researchers propose a system that automatically adjusts synthetic training samples and watermark embedding based on the specific data, using out-of-distribution awareness (detecting when data differs significantly from expected patterns) to keep the watermark robust while maintaining the model's performance on its actual task.
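Verification of such a watermark is typically a trigger-set check; the sketch below shows that generic check (not the paper's full embedding scheme), with a hypothetical model and trigger set:

```python
def verify_watermark(model, trigger_set, min_match=0.9):
    """Trigger-set watermark check: the owner keeps secret
    (input, expected_label) pairs; a model that still reproduces them
    after fine-tuning is judged to retain the watermark."""
    hits = sum(model(x) == y for x, y in trigger_set)
    return hits / len(trigger_set) >= min_match

# usage with hypothetical models
trigger_set = [((i,), i % 2) for i in range(10)]
watermarked = lambda x: x[0] % 2    # reproduces the secret trigger labels
unrelated = lambda x: 0             # a model with no embedded watermark
```

The paper's contribution is keeping this check passing after cross-domain fine-tuning, which naive trigger sets tend not to survive.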
This research paper proposes a new cryptographic method for secure data sharing in Internet of Vehicles (IoV, a system where vehicles communicate with each other and road infrastructure). The method uses Certificateless Signcryption (CLSC, a technique that encrypts data and verifies its authenticity without requiring traditional certificates) to allow one sender to securely share customized data with multiple specific receivers while keeping it hidden from others, even across different geographic regions. The proposed approach reduces computational complexity and includes privacy protections through pseudonym generation (creating fake identities).
Mujaz is a system that uses natural language processing (NLP, the field of AI that helps computers understand human language) to automatically clean up and summarize vulnerability descriptions found in public databases. The system was trained on a collection of carefully labeled vulnerability summaries and uses pre-trained language models (AI systems trained on large amounts of text) to create clearer, more consistent descriptions that help developers and organizations understand and patch security issues more effectively.
This paper presents DynMD, a new machine learning model that uses Graph Neural Networks (GNNs, which are AI systems that analyze connected data points and their relationships) to detect malware by analyzing streaming behavioral data (information about what a program does over time). Unlike previous approaches that miss how malware behaviors connect over time, DynMD uses an energy-based method to better understand malware patterns and can detect threats 3.81 to 5.33 times faster than existing systems.
Researchers developed BPDA, a method for finding security vulnerabilities in embedded firmware (software that runs on devices like routers and IoT devices) by tracking how user input flows through code to reach dangerous functions called sinks. The method is faster and more accurate than existing tools, discovering 163 real vulnerabilities including 34 previously unknown ones when tested on firmware from major manufacturers.
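The source-to-sink tracking idea can be sketched as a tiny forward taint analysis over a toy instruction list; the IR format, source, and sink names here are hypothetical, not BPDA's:

```python
def taint_reaches_sink(instructions, sources=frozenset({"recv"}),
                       sinks=frozenset({"system"})):
    """Toy forward taint analysis: propagate taint from user-input
    sources through assignments and flag any tainted value that
    reaches a dangerous sink function."""
    tainted = set()
    for op, name, args in instructions:
        if op == "call" and name in sources:
            tainted.update(args)          # args now hold user input
        elif op == "assign" and any(a in tainted for a in args):
            tainted.add(name)             # taint flows through assignment
        elif op == "call" and name in sinks and any(a in tainted for a in args):
            return True                   # tainted data hits a sink
    return False

# usage: a vulnerable flow from network input to a command execution
firmware_ir = [
    ("call", "recv", ["buf"]),      # buf <- network input (source)
    ("assign", "cmd", ["buf"]),     # cmd derived from buf
    ("call", "system", ["cmd"]),    # system(cmd): tainted sink call
]
```

Real firmware analysis must additionally resolve indirect calls, binary formats, and inter-procedural flows, which is where tools like BPDA earn their speed and accuracy claims.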