ShadowCoT: Cognitive Hijacking for Stealthy Reasoning Backdoors in LLMs
Summary
ShadowCoT is a backdoor attack (a hidden vulnerability inserted into an AI model that causes it to misbehave when triggered) that targets Chain-of-Thought reasoning, a technique in which LLMs lay out their step-by-step thinking to solve complex problems. Unlike simpler attacks, ShadowCoT hijacks the model's internal reasoning process: it subtly rewires how attention flows through the model and alters intermediate representations (internal data the model creates while processing). As a result, the compromised model can produce logical-sounding but harmful outputs while avoiding detection.
Classification
Related Issues
CVE-2024-37052: Deserialization of untrusted data can occur in versions of the MLflow platform running version 1.1.0 or newer, enabling
CVE-2024-27444: langchain_experimental (aka LangChain Experimental) in LangChain before 0.1.8 allows an attacker to bypass the CVE-2023-
Original source: http://ieeexplore.ieee.org/document/11495247
First tracked: May 12, 2026 at 02:01 AM
Classified by LLM (prompt v3) · confidence: 92%