AI Sec Watch: A Security Intelligence Platform for AI Systems

Luu, T.J.

Exposing the Ghost in the Transformer: Abnormal Detection for Large Language Models via Hidden State Forensics

inforesearchPeer-ReviewedLLM-Specific

securityresearch

Source: IEEE Xplore (Security & AI Journals)April 13, 2026

Summary

Large language models (LLMs, which are AI systems trained on vast amounts of text) are vulnerable to serious attacks like hallucinations (making up false information), jailbreaks (tricking the AI into ignoring its safety rules), and backdoors (hidden malicious instructions inserted during training). This research proposes a detection method using hidden state forensics (analyzing the internal numerical patterns that flow through the model's layers) to identify abnormal or malicious behavior in real-time, achieving over 95% accuracy with minimal computational cost.