Research

This paper proposes a comprehensive framework for integrating Zero Trust Architecture (ZTA) into cloud-based endpoint security for critical infrastructure such as power plants, healthcare systems, and financial systems. The framework aims to address the gap in applying ZTA to endpoint management within cloud environments, treating every access request as new with no implicit trust, thereby enhancing compliance, enabling continuous protection, and reducing attack surfaces.

Is Reasoning Capability Enough for Safety in Long-Context Language Models?

securitysafetyresearch

This research introduces "compositional reasoning attacks" where harmful queries are decomposed into fragments scattered across long contexts (up to 64k tokens), revealing that stronger reasoning capability in LLMs does not improve safety against such attacks. Testing 14 frontier LLMs showed that safety alignment degrades as context length increases, and models with better general reasoning often assemble harmful intent but fail to refuse it.

Fix: Increasing inference-time compute reduces attack success by over 50 percentage points on GPT-oss-120b model, indicating that inference-time reasoning effort is a key mitigating factor.

CryptoGen: Secure Transformer Generation with Encrypted KV-Cache Reuse

securityprivacyresearch

CryptoGen is a system designed to enable privacy-preserving autoregressive generation on cloud-hosted Transformer models by supporting encrypted key-value (KV) cache reuse. It combines homomorphic encryption and secret sharing to achieve near-linear scaling with 4.4x-7.6x lower per-token latency compared to existing discriminative secure inference systems when adapted for generation tasks. The system is released as an open-source library.

DyMA-Fuzz: Dynamic Direct Memory Access Abstraction for Re-hosted Monolithic Firmware Fuzzing

DyMA-Fuzz is a firmware fuzzing framework designed to address Direct Memory Access (DMA) challenges in re-hosted monolithic firmware testing. It uses runtime analysis techniques to automatically infer DMA memory access patterns and inject fuzzing data into target buffers without manual configuration. When evaluated on 94 firmware samples and 8 DMA-guarded CVE benchmarks, it achieved up to 122% higher code coverage compared to state-of-the-art tools.

Empirical Evaluation of SMOTE in Android Malware Detection with Machine Learning: Challenges and Performance in CICMalDroid 2020

This research evaluates machine learning algorithms (XGBoost, Naïve Bayes, SVC, and Random Forest) for Android malware detection using the CICMalDroid2020 dataset of dynamically obtained behavior samples. The study empirically tests the SMOTE technique for addressing class imbalance, finding that in 75% of configurations SMOTE led to performance degradation or marginal improvements with an average loss of 6.14 percentage points. Tree-based algorithms like XGBoost and Random Forest consistently outperformed others, achieving weighted recall above 94%.

Large Language Lobotomy: Jailbreaking Mixture-of-Experts via Expert Silencing

securitysafetyresearch

Researchers discovered a security vulnerability in Mixture-of-Experts (MoE) Large Language Models where safety-critical behaviors like refusal are concentrated in a small set of experts. They developed Large Language Lobotomy (L³), a training-free attack that exploits expert routing dynamics by silencing safety-relevant experts, increasing jailbreak attack success from 7.3% to 70.4% (up to 86.3%) across eight state-of-the-art MoE LLMs while silencing fewer than 20% of layer-wise experts.

SoK: The Pitfalls of Deep Reinforcement Learning for Cybersecurity

researchsecurity

This paper systematizes 11 methodological pitfalls commonly found in Deep Reinforcement Learning for Cybersecurity (DRL4Sec) research across environment modeling, agent training, evaluation, and deployment stages. Analysis of 66 significant DRL4Sec papers from 2018-2025 reveals an average of over five pitfalls per paper, with the authors demonstrating the practical impact through controlled experiments in autonomous cyber defense, adversarial malware creation, and web security testing.

Fix: The paper provides actionable recommendations for each of the 11 identified pitfalls to support the development of more rigorous and deployable DRL-based security systems.

Dashed Line Defense: Plug-And-Play Defense Against Adaptive Score-Based Query Attacks

This paper addresses the vulnerability of deep learning models to score-based query attacks that craft adversarial examples using only black-box access to model outputs. The authors demonstrate that existing plug-and-play defenses can be bypassed by adaptive attacks and propose Dashed Line Defense (DLD), a post-processing method designed to withstand adaptive query strategies by introducing ambiguity in loss observations.

Fix: The paper proposes Dashed Line Defense (DLD), a plug-and-play post-processing method that introduces ambiguity in how the observed loss reflects the true adversarial strength of candidate examples, preventing attackers from reliably analyzing and adapting their queries to disrupt the adversarial example generation process. DLD is validated through experiments on ImageNet and demonstrates effectiveness even under worst-case adaptive attacks while preserving model predicted labels.

Retrieval Pivot Attacks in Hybrid RAG: Measuring and Mitigating Amplified Leakage from Vector Seeds to Graph Expansion

securityprivacyresearch

Hybrid Retrieval-Augmented Generation (RAG) pipelines that combine vector similarity search with knowledge graph expansion create a security vulnerability where vector-retrieved seed chunks can pivot through entity links into sensitive graph neighborhoods, causing cross-tenant data leakage. The research formalizes this as Retrieval Pivot Risk (RPR) and demonstrates that naturally shared entities create cross-tenant pivot paths without requiring adversarial injection, with undefended systems showing RPR up to 0.95 and consistent leakage at pivot depth 2.

Sparse Models, Sparse Safety: Unsafe Routes in Mixture-of-Experts LLMs

securitysafetyresearch

This research reveals safety vulnerabilities in Mixture-of-Experts (MoE) large language models, demonstrating that manipulating specific routers can create "unsafe routes" that convert safe outputs into harmful ones. The study introduces RoSais (Router Safety importance score) to identify critical routers and proposes F-SOUR framework, which achieves attack success rates of 0.90 and 0.98 on safety benchmarks across four MoE LLM families by exploiting routing configurations.

Stateless Yet Not Forgetful: Implicit Memory as a Hidden Channel in LLMs

This research reveals that large language models (LLMs) possess "implicit memory"—the ability to encode information in their outputs and later recover it when those outputs are reintroduced as input, creating persistent information channels across supposedly independent interactions. The authors demonstrate this through "time bombs," a new class of temporal backdoors that activate only after a sequence of interactions satisfies hidden conditions, which can be induced through prompting or fine-tuning. The work discusses broader security implications including covert communication, benchmark contamination, and targeted manipulation.

Claude Opus 4.6 Finds 500+ High-Severity Flaws Across Major Open-Source Libraries

securityresearchindustry

2/6/2026

Anthropic's Claude Opus 4.6 AI model discovered over 500 previously unknown high-severity security vulnerabilities in major open-source libraries including Ghostscript, OpenSC, and CGIF. The model found these flaws without task-specific tooling or specialized prompting by analyzing code like a human researcher, identifying issues such as missing bounds checks, buffer overflows, and complex vulnerabilities requiring conceptual understanding of algorithms.

Microsoft Develops Scanner to Detect Backdoors in Open-Weight Large Language Models