StealthRL: Reinforcement Learning Paraphrase Attacks for Multi-Detector Evasion of AI-Text Detectors
security, research
Source: arXiv (cs.CR + cs.AI)
February 9, 2026
Summary
StealthRL is a reinforcement learning framework that uses paraphrasing attacks to evade AI-text detectors while preserving semantic meaning. It reports a near-zero detection rate (a mean true-positive rate of 0.001 at a 1% false-positive rate, written TPR@1%FPR) and a 99.9% attack success rate against multiple detector families. The attacks also transfer to unseen detector types, which points to fundamental architectural vulnerabilities in current AI-text detection systems.
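The TPR@1%FPR figure in the summary can be read as follows: the detector's threshold is calibrated so that at most 1% of human-written texts are flagged, and the metric is the fraction of AI-generated (here, paraphrased) texts still flagged at that threshold. The sketch below illustrates this calculation only; it is not the paper's evaluation code. The function name `tpr_at_fpr`, the convention that a higher score means "more likely AI", and the toy scores are all assumptions made for illustration.

```python
import math


def tpr_at_fpr(human_scores, ai_scores, target_fpr=0.01):
    """Return (tpr, threshold) for a detector calibrated on human text.

    Assumption: a higher score means "more likely AI-generated", and a
    text is flagged when its score is strictly above the threshold.
    """
    if not 0.0 <= target_fpr < 1.0:
        raise ValueError("target_fpr must be in [0, 1)")
    if not human_scores or not ai_scores:
        raise ValueError("both score lists must be non-empty")

    human = sorted(human_scores)
    n = len(human)

    # Largest number of human texts that may be flagged without
    # exceeding the target false-positive rate.
    max_false_positives = math.floor(target_fpr * n)

    # At most max_false_positives human scores lie strictly above this value.
    threshold = human[n - 1 - max_false_positives]

    # True-positive rate: share of AI texts still flagged at that threshold.
    flagged = sum(1 for score in ai_scores if score > threshold)
    return flagged / len(ai_scores), threshold


if __name__ == "__main__":
    # Toy, hand-made scores (not data from the paper).
    human = [0.1, 0.2, 0.3, 0.9]
    original_ai = [0.95, 0.8, 0.99, 0.5]
    paraphrased_ai = [0.05, 0.2, 0.31, 0.1]

    for name, scores in [("original", original_ai), ("paraphrased", paraphrased_ai)]:
        tpr, thr = tpr_at_fpr(human, scores, target_fpr=0.25)
        print(f"{name}: TPR={tpr:.2f} at threshold={thr:.2f}")
```

With real detector outputs, `target_fpr=0.01` gives the 1% operating point; a meaningful estimate at that rate needs at least a few hundred human-written samples, since the threshold is set by the top 1% of their scores.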
Original source: https://arxiv.org/abs/2602.08934v1
First tracked: February 11, 2026 at 06:00 PM