Armor: Shielding Unlearnable Examples Against Data Augmentation
Summary
Unlearnable examples are protective noises added to private data to prevent AI models from learning useful information from them. This paper shows that data augmentation (a common technique that creates perturbed copies of training data to improve model performance) can undo this protection, restoring model accuracy on the protected data from 21.3% to 66.1%. The researchers propose Armor, a defense framework that crafts protective noise while accounting for data augmentation, using a surrogate model (a stand-in model that simulates the real training process) and a per-class augmentation selection strategy to keep private data unlearnable even after augmentation is applied.
Solution / Mitigation
The paper proposes Armor, a defense framework with three components: (1) a non-local-module-assisted surrogate model that better captures the effect of data augmentation; (2) a surrogate augmentation selection strategy that chooses, per class, the augmentation maximizing distribution alignment between augmented and non-augmented samples; and (3) a dynamic step-size adjustment algorithm that strengthens the defensive noise generation process. The authors state that 'Armor can preserve the unlearnability of protected private data under data augmentation' and plan to open-source the code upon publication.
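The three components above combine into an augmentation-aware noise-generation loop. The following is a minimal, hypothetical sketch (not the paper's implementation): a toy linear model stands in for the surrogate, a random jitter stands in for the selected augmentation policy, and the step size is halved whenever the loss stops decreasing, illustrating the dynamic step-size adjustment. All function names and parameters here are illustrative assumptions.

```python
import numpy as np

def random_augment(x, rng):
    # Toy augmentation: small additive jitter. Stands in for the
    # per-class augmentation policy Armor selects (crops, flips, etc.).
    return x + rng.normal(0.0, 0.01, x.shape)

def surrogate_loss(w, x, y):
    # Toy "surrogate model": squared error of a linear score.
    score = float(np.tensordot(w, x, axes=x.ndim))
    return (score - y) ** 2

def grad_wrt_x(w, x, y):
    # Gradient of the squared-error loss with respect to the input.
    score = float(np.tensordot(w, x, axes=x.ndim))
    return 2.0 * (score - y) * w

def augmentation_aware_noise(x, y, w, eps=8 / 255, steps=20,
                             base_lr=0.05, seed=0):
    """Error-minimizing noise crafted *under* augmentation (sketch).

    Each step augments the perturbed sample before querying the
    surrogate, so the noise remains effective after augmentation.
    The step size shrinks when the loss increases (dynamic step size).
    """
    rng = np.random.default_rng(seed)
    delta = np.zeros_like(x)
    lr = base_lr
    prev_loss = np.inf
    for _ in range(steps):
        x_aug = random_augment(x + delta, rng)
        loss = surrogate_loss(w, x_aug, y)
        if loss > prev_loss:      # loss went up: halve the step size
            lr *= 0.5
        prev_loss = loss
        g = grad_wrt_x(w, x_aug, y)
        # Descend (error-minimizing) and keep the noise imperceptible
        # by projecting onto the L-infinity ball of radius eps.
        delta = np.clip(delta - lr * g, -eps, eps)
    return delta

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 4))   # stand-in for one image
w = rng.normal(size=(4, 4))   # stand-in for surrogate weights
delta = augmentation_aware_noise(x, y=0.0, w=w)
```

The key design point the sketch mirrors is that the augmentation is applied *inside* the optimization loop, so the crafted noise cannot simply be washed out by augmenting the data at training time.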
Classification
Related Issues
CVE-2025-45150: Insecure permissions in LangChain-ChatGLM-Webui commit ef829 allows attackers to arbitrarily view and download sensitive ...
CVE-2025-54868: LibreChat is a ChatGPT clone with additional features. In versions 0.0.6 through 0.7.7-rc1, an exposed testing endpoint ...
Original source: http://ieeexplore.ieee.org/document/11345171
First tracked: March 9, 2026 at 08:01 PM
Classified by LLM (prompt v3) · confidence: 92%