How to make LLMs a defensive advantage without creating a new attack surface
Summary
LLMs are being used in security in three ways: as productivity tools for analysts, as embedded components in security products, and as targets for attackers to manipulate or steal. The same capabilities that help security teams (like summarizing incidents or drafting detection logic) can also enable attackers to create convincing phishing emails or extract sensitive information if the LLM is poorly integrated. To use LLMs defensively without creating new vulnerabilities, security teams should treat LLM output as untrusted, start with narrow, easy-to-verify use cases, and design systems with three layers of constraints: limited model capabilities, restricted data access, and human approval for any actions that change system state.
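The principle of treating LLM output as untrusted can be made concrete with a small sketch. This is an illustrative example, not from the source: a hypothetical helper that parses a model-drafted triage verdict as JSON and validates it against a strict allowlist before any downstream workflow sees it (the field names and allowed values are assumptions).

```python
import json

# Hypothetical sketch: treat model output as untrusted input.
# The model is asked to emit a triage verdict as JSON; we parse and
# validate it against a strict schema instead of acting on raw text.

ALLOWED_VERDICTS = {"benign", "suspicious", "malicious"}

def parse_triage_verdict(raw_output: str) -> dict:
    """Validate an LLM-drafted triage verdict before it reaches any workflow."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        raise ValueError("model output is not valid JSON")
    verdict = data.get("verdict")
    if verdict not in ALLOWED_VERDICTS:
        raise ValueError(f"unexpected verdict: {verdict!r}")
    confidence = data.get("confidence")
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be a number in [0, 1]")
    # Only the validated fields pass through; anything else the model
    # appended (instructions, extra keys, free text) is dropped.
    return {"verdict": verdict, "confidence": float(confidence)}
```

The point of the sketch is the failure mode: anything that does not match the schema is rejected outright rather than interpreted, which keeps injected instructions in model output from reaching code that acts on it.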
Solution / Mitigation
The source describes three design choices that reduce risk:
1. "Make sources explicit: Use retrieval-augmented generation so the assistant answers from curated documents, tickets or playbooks and show the cited snippets to the analyst."
2. "Keep the model out of the blast radius: The model should not hold secrets. Use short-lived credentials, scoped tokens and brokered access to tools."
3. "Gate actions: Anything that changes a system state (blocking, quarantining, deleting, emailing) should require human approval or a separate policy engine."
The source also recommends starting with a "narrow set of workflows where the output is advisory and easy to verify" before expanding capabilities.
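The "gate actions" pattern can be sketched as a dispatcher that sits between the model and the tools it may call. This is a minimal illustration under assumed names (STATE_CHANGING, request_human_approval, and the tool names are hypothetical, not from the source): read-only tools execute directly, while anything state-changing is denied unless a human approves.

```python
# Hypothetical sketch of the "gate actions" pattern: any tool call that
# changes system state must clear a policy gate (here, human approval)
# before it runs. The model never invokes tools directly.

STATE_CHANGING = {"block_ip", "quarantine_host", "delete_file", "send_email"}
READ_ONLY = {"lookup_indicator", "summarize_ticket"}

def request_human_approval(action: str, args: dict) -> bool:
    """Stand-in for a real approval workflow (ticket, chat prompt, etc.)."""
    print(f"APPROVAL NEEDED: {action} {args}")
    return False  # deny by default until a human explicitly approves

def dispatch(action: str, args: dict, tools: dict):
    """Broker a model-requested tool call through the policy gate."""
    if action not in READ_ONLY | STATE_CHANGING:
        raise PermissionError(f"unknown action: {action!r}")
    if action in STATE_CHANGING and not request_human_approval(action, args):
        raise PermissionError(f"{action!r} requires human approval")
    return tools[action](**args)
```

Because the dispatcher, not the model, holds the tool references, this also supports the blast-radius advice: the broker can attach short-lived, scoped credentials per call instead of giving the model standing secrets.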
Classification
Related Issues
CVE-2024-27444: langchain_experimental (aka LangChain Experimental) in LangChain before 0.1.8 allows an attacker to bypass the CVE-2023-
CVE-2025-45150: Insecure permissions in LangChain-ChatGLM-Webui commit ef829 allows attackers to arbitrarily view and download sensitive
Original source: https://www.csoonline.com/article/4137983/how-to-make-llms-a-defensive-advantage-without-creating-a-new-attack-surface.html
First tracked: February 27, 2026 at 07:00 AM
Classified by LLM (prompt v3) · confidence: 85%