Trustworthy Agentic AI Requires Deterministic Architectural Boundaries
Summary
This paper argues that current agentic AI architectures are fundamentally incompatible with high-stakes scientific workflows because autoregressive language models cannot, through training alone, deterministically separate commands from data. The authors contend that probabilistic alignment and guardrails are insufficient for authorization security: without deterministic architectural enforcement, the "Lethal Trifecta" of untrusted inputs, privileged data access, and external action capability reduces security to an exploit-discovery problem.
Solution / Mitigation
The paper introduces the Trinity Defense Architecture, which enforces security through three mechanisms: action governance via a finite action calculus with reference-monitor enforcement, information-flow control via mandatory access labels preventing cross-scope leakage, and privilege separation isolating perception from execution.
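The first two mechanisms can be illustrated together: a reference monitor that deterministically mediates a finite set of actions and enforces mandatory flow labels. This is a minimal sketch, not the paper's implementation; the action names, label lattice, and the `reference_monitor` function are all illustrative assumptions.

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical finite action calculus: the agent can only request these verbs.
class Action(Enum):
    READ_DATASET = "read_dataset"
    RUN_ANALYSIS = "run_analysis"
    SEND_EMAIL = "send_email"  # external action capability

# Illustrative mandatory access labels, ordered PUBLIC < INTERNAL < SECRET.
class Label(Enum):
    PUBLIC = 0
    INTERNAL = 1
    SECRET = 2

@dataclass(frozen=True)
class Request:
    action: Action
    data_label: Label   # highest label of data flowing into the action
    sink_label: Label   # clearance of the action's destination (scope)

def reference_monitor(req: Request) -> bool:
    """Deterministic allow/deny: data may only flow to a sink whose
    clearance dominates the data's label (no cross-scope write-down).
    The decision never depends on model output, only on labels."""
    return req.data_label.value <= req.sink_label.value

# SECRET analysis results may stay in a SECRET scope...
print(reference_monitor(Request(Action.RUN_ANALYSIS, Label.SECRET, Label.SECRET)))
# ...but may not be exfiltrated through a PUBLIC email sink.
print(reference_monitor(Request(Action.SEND_EMAIL, Label.SECRET, Label.PUBLIC)))
```

The key design point mirrors the paper's thesis: because the check is a total function over a finite action set and a label lattice, the deny decision is architectural rather than probabilistic, regardless of what the model emits. The third mechanism, privilege separation, would correspond to running the perception component without any handle to `reference_monitor`-gated actuators.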
Original source: https://arxiv.org/abs/2602.09947v1
First tracked: February 11, 2026 at 06:00 PM