AI Sec Watch: A Security Intelligence Platform for AI Systems

Luu, T.J.

AGFPS: An Automated Gradient-Free Framework for Prompt Stealing

info, research, Peer-Reviewed, LLM-Specific

security, research

Source: IEEE Xplore (Security & AI Journals), March 9, 2026

Summary

AGFPS is a new attack method that steals system prompts (the hidden instructions that control how an LLM behaves) from deployed AI applications. Instead of gradient-based methods, it relies on evolutionary optimization, a technique that mimics natural selection to search for solutions. The researchers report that their approach extracted prompts successfully 95.2% of the time and outperformed previous methods, which points to serious security weaknesses in how LLMs are currently deployed.
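
To make the term "evolutionary optimization" concrete, here is a minimal, generic sketch of a gradient-free evolutionary search in Python. It is not the AGFPS algorithm, whose details are not given in this summary. The names evolve, match_score, TARGET, and ALPHABET are hypothetical, and the scoring function is a toy stand-in that counts matching characters against a fixed string; it never queries a model. The point is only that the search uses scores alone, with no gradients.

```python
import random

# Toy search space and toy objective (both are assumptions for illustration).
ALPHABET = "abcdefghijklmnopqrstuvwxyz "
TARGET = "be helpful"


def match_score(candidate: str) -> int:
    """Black-box score: number of positions that agree with TARGET."""
    return sum(a == b for a, b in zip(candidate, TARGET))


def evolve(fitness, length, pop_size=50, generations=500,
           mutation_rate=0.05, max_score=None, seed=0):
    """Generic evolutionary loop: select, recombine, mutate, repeat."""
    rng = random.Random(seed)

    def random_candidate():
        return "".join(rng.choice(ALPHABET) for _ in range(length))

    def mutate(s):
        # Replace each character with a random one with small probability.
        return "".join(
            rng.choice(ALPHABET) if rng.random() < mutation_rate else c
            for c in s
        )

    def crossover(a, b):
        # Single-point crossover; requires length >= 2.
        cut = rng.randrange(1, length)
        return a[:cut] + b[cut:]

    population = [random_candidate() for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by score only; no gradient information is used anywhere.
        population.sort(key=fitness, reverse=True)
        if max_score is not None and fitness(population[0]) >= max_score:
            break
        parents = population[: max(2, pop_size // 5)]  # keep the fittest
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents)))
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


best = evolve(match_score, length=len(TARGET), max_score=len(TARGET))
```

Because the fittest candidates are carried over unchanged each generation, the best score never decreases. A real attack or defense evaluation would replace match_score with a measurement taken from the system under test, which is outside the scope of this sketch.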