Research
Academic papers, new techniques, benchmarks, and theoretical findings in AI/LLM security.
This paper proposes CAPID, a context-aware PII detection system for question-answering platforms that addresses the limitation of current approaches which redact all PII regardless of contextual relevance. The approach fine-tunes a locally owned small language model (SLM) to detect PII spans, classify their types, and determine contextual relevance before data is passed to LLMs, avoiding privacy concerns with closed-source models. A synthetic data generation pipeline using LLMs is introduced to create training data that captures context-dependent PII relevance across multiple domains.
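As a rough illustration of the pipeline (ours, not the authors' code), the sketch below redacts only the spans a detector marks as contextually irrelevant before the question reaches an external LLM; a toy regex detector stands in for the fine-tuned SLM.

```python
# Minimal sketch of CAPID-style context-aware redaction (not the authors' code).
# A locally hosted SLM would produce the spans; a toy regex detector stands in here.
import re
from dataclasses import dataclass

@dataclass
class PIISpan:
    start: int
    end: int
    pii_type: str
    relevant: bool  # would come from the SLM's contextual-relevance prediction

def toy_detect(question: str) -> list[PIISpan]:
    """Stand-in for the fine-tuned SLM: finds emails and phone-like numbers."""
    spans = []
    for m in re.finditer(r"[\w.]+@[\w.]+", question):
        spans.append(PIISpan(m.start(), m.end(), "EMAIL", relevant=False))
    for m in re.finditer(r"\b\d{3}-\d{3}-\d{4}\b", question):
        spans.append(PIISpan(m.start(), m.end(), "PHONE", relevant=False))
    return spans

def sanitize(question: str, spans: list[PIISpan]) -> str:
    """Redact only the spans the model judged irrelevant to answering the question."""
    out, cursor = [], 0
    for s in sorted(spans, key=lambda s: s.start):
        out.append(question[cursor:s.start])
        out.append(question[s.start:s.end] if s.relevant else f"[{s.pii_type}]")
        cursor = s.end
    out.append(question[cursor:])
    return "".join(out)

q = "My email is jane@example.com and my phone is 555-123-4567; why is my bill higher?"
print(sanitize(q, toy_detect(q)))
```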
This paper argues that current agentic AI architectures are fundamentally incompatible with high-stakes scientific workflows because autoregressive language models cannot deterministically separate commands from data through training alone. The authors contend that probabilistic alignment and guardrails are insufficient for authorization security, and that deterministic architectural enforcement is necessary to prevent the "Lethal Trifecta" of untrusted inputs, privileged data access, and external action capability from becoming an exploit-discovery problem.
As a remedy, the paper introduces the Trinity Defense Architecture, which enforces security through three mechanisms: action governance via a finite action calculus with reference-monitor enforcement, information-flow control via mandatory access labels that prevent cross-scope leakage, and privilege separation that isolates perception from execution.
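The sketch below is a minimal illustration of the reference-monitor idea, assuming a hypothetical finite action set and scope labels; it is not the paper's formalism.

```python
# Hedged sketch of a reference-monitor layer in the spirit of the Trinity proposal:
# a finite action set, mandatory labels on data, and a check that runs outside the model.
from dataclasses import dataclass

ALLOWED_ACTIONS = {"read_dataset", "run_analysis", "write_report"}  # finite action calculus

@dataclass(frozen=True)
class LabeledValue:
    value: str
    scope: str  # e.g. "project_A"; labels are attached by the runtime, not the model

def authorize(action: str, inputs: list[LabeledValue], session_scope: str) -> bool:
    """Deterministic check: action must be in the calculus and no input may cross scopes."""
    if action not in ALLOWED_ACTIONS:
        return False
    return all(v.scope == session_scope for v in inputs)

# The LLM only *proposes*; the monitor decides.
proposal = ("write_report", [LabeledValue("results.csv", "project_A")])
print(authorize(*proposal, session_scope="project_A"))        # True
print(authorize("send_email", [], session_scope="project_A")) # False: outside the calculus
```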
LLM4PQC is an LLM-based agentic framework designed to address bottlenecks in post-quantum cryptography (PQC) hardware design by automatically refactoring PQC reference C code into high-level synthesis (HLS)-ready and synthesizable code. The framework uses a hierarchy of verification checks including C compilation, simulation, and RTL simulation to ensure correctness, and demonstrates reduced manual effort and accelerated design-space exploration in case studies on NIST PQC reference designs.
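A simplified sketch of the verify-then-retry pattern such a framework implies, with the LLM refactoring step and the simulation/RTL rungs left as placeholders:

```python
# Sketch of a verify-then-retry loop like the one LLM4PQC describes (not the authors' code).
# The refactoring call is a placeholder for an LLM agent; later rungs of the hierarchy
# (simulation against test vectors, HLS, RTL simulation) would chain in the same pattern.
import pathlib
import subprocess
import tempfile

def llm_refactor(c_source: str, feedback: str) -> str:
    """Placeholder for the LLM agent that rewrites reference C into HLS-ready C."""
    raise NotImplementedError("call your LLM backend here")

def check_compiles(c_source: str) -> tuple[bool, str]:
    """First rung of the hierarchy: does the refactored C still compile?"""
    with tempfile.TemporaryDirectory() as d:
        src = pathlib.Path(d) / "kernel.c"
        src.write_text(c_source)
        r = subprocess.run(["cc", "-c", str(src), "-o", str(src.with_suffix(".o"))],
                           capture_output=True, text=True)
        return r.returncode == 0, r.stderr

def refine(reference_c: str, max_rounds: int = 5) -> str | None:
    feedback = ""
    for _ in range(max_rounds):
        candidate = llm_refactor(reference_c, feedback)
        ok, feedback = check_compiles(candidate)
        if not ok:
            continue  # feed compiler errors back to the agent on the next round
        return candidate  # passed this rung; hand off to simulation / HLS / RTL checks
    return None
```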
Spinel is a post-quantum digital signature scheme that combines the security of SPHINCS+ with a new family of algebraic hash functions based on the Tillich-Zemor paradigm over SL_n(F_p). The scheme's security relies on the hardness of navigating expander graphs over SL_n(F_p), which is believed to be resistant to quantum adversaries. The work includes empirical security evidence, integration within the SPHINCS+ framework, security analysis, parameter selection, and performance evaluation demonstrating practical feasibility.
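As a toy illustration of the Tillich-Zemor idea (SL_2 rather than SL_n, and parameters far too small for security), a message can be hashed by multiplying generator matrices selected by its bits:

```python
# Toy illustration of a Tillich–Zemor-style hash: the digest is the product of generator
# matrices chosen by the message bits, reduced mod p. Spinel's construction works over
# SL_n(F_p) with carefully chosen generators and parameters; this only shows the shape.
P = 2_147_483_647  # toy prime, nowhere near real security

A = ((1, 1), (0, 1))  # determinant 1, so products stay in SL_2(F_p)
B = ((1, 0), (1, 1))

def matmul(x, y, p=P):
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) % p for j in range(2))
        for i in range(2)
    )

def tz_hash(message: bytes) -> tuple:
    h = ((1, 0), (0, 1))  # identity
    for byte in message:
        for bit in range(8):
            h = matmul(h, A if (byte >> bit) & 1 else B)
    return h

print(tz_hash(b"hello"))
print(tz_hash(b"hellp"))  # one changed bit walks a different path in the Cayley graph
```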
QRS (Query, Review, Sanitize) is a neuro-symbolic framework that uses three autonomous agents with Large Language Models to generate CodeQL queries, validate findings through semantic reasoning, and perform automated exploit synthesis for vulnerability discovery. Unlike traditional SAST tools that rely on expert-crafted queries and predefined patterns, QRS autonomously discovers vulnerability classes beyond known patterns while reducing false positives. In testing on PyPI packages, QRS achieved 90.6% detection accuracy on 20 historical CVEs and identified 39 medium-to-high-severity vulnerabilities in the top 100 most-downloaded packages, with 5 assigned new CVEs.
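A hedged sketch of the three-agent loop, with the LLM calls stubbed out and the CodeQL invocation shown only as an assumed CLI shape:

```python
# Sketch of the QRS Query -> Review -> Sanitize loop (not the authors' implementation).
# The three agent calls are LLM placeholders; the CodeQL invocation and result parsing
# are assumptions about the surrounding tooling.
import pathlib
import subprocess
import tempfile

def query_agent(vuln_class: str) -> str:
    """Drafts CodeQL query source for a vulnerability class (placeholder LLM call)."""
    raise NotImplementedError

def review_agent(finding: dict) -> bool:
    """Semantic triage: is the flagged flow actually reachable/exploitable? (placeholder)."""
    raise NotImplementedError

def sanitize_agent(finding: dict) -> str | None:
    """Attempts a proof-of-concept exploit for a confirmed finding (placeholder)."""
    raise NotImplementedError

def run_codeql(query_source: str, database: str) -> list[dict]:
    with tempfile.TemporaryDirectory() as d:
        qpath = pathlib.Path(d) / "generated.ql"
        qpath.write_text(query_source)
        # Assumed CLI shape; adapt to your CodeQL pack layout and output format.
        subprocess.run(["codeql", "query", "run", str(qpath), f"--database={database}"],
                       check=True)
    return []  # parse the BQRS/SARIF results here

def qrs(vuln_class: str, database: str) -> list[dict]:
    confirmed = []
    for finding in run_codeql(query_agent(vuln_class), database):
        if review_agent(finding):
            finding["poc"] = sanitize_agent(finding)
            confirmed.append(finding)
    return confirmed
```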
This research investigates using large language models (LLMs) for zero-shot feature selection in malware detection as an alternative to traditional statistical methods. The study evaluates multiple LLMs (GPT-5.0, GPT-4.0, Gemini-2.5) on the EMBOD dataset against conventional feature selection methods across various classifiers. Results show that LLM-guided zero-shot feature selection achieves competitive performance with traditional methods while providing enhanced interpretability, stability, and reduced dependence on labeled data.
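A minimal sketch of the zero-shot selection idea, assuming a placeholder ranking call to an LLM and a standard scikit-learn classifier downstream:

```python
# Sketch of zero-shot, LLM-guided feature selection (the paper's idea, not its code).
# The ranking call is a placeholder for any chat-completion backend.
from sklearn.ensemble import RandomForestClassifier

def llm_rank_features(feature_names: list[str], task: str) -> list[str]:
    """Placeholder: prompt an LLM to order features by relevance to `task`,
    using only feature names/descriptions -- no labels, no statistics."""
    raise NotImplementedError

def select_and_train(X, y, feature_names: list[str], k: int = 10):
    # X is a NumPy array of shape (n_samples, n_features); y holds the labels.
    ranked = llm_rank_features(feature_names, task="Windows PE malware detection")
    keep = [feature_names.index(f) for f in ranked[:k] if f in feature_names]
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X[:, keep], y)
    return clf, keep
```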
This research introduces the Four-Checkpoint Framework to analyze where LLM safety mechanisms fail by organizing defenses along processing stage (input vs. output) and detection level (literal vs. intent). Testing GPT-5, Claude Sonnet 4, and Gemini 2.5 Pro with 13 targeted evasion techniques across 3,312 test cases reveals that output-stage defenses (CP3, CP4) are weakest at 72-79% Weighted Attack Success Rate (WASR), while input-literal defenses (CP1) are strongest at 13% WASR. The study finds that traditional Binary ASR underestimates vulnerabilities (22.6%) compared to WASR (52.7%), showing 2.3× higher actual vulnerability rates.
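The summary does not give the paper's exact weighting, but a toy calculation shows why a graded metric reads higher than a binary one once partial compliance is credited (the scores below are hypothetical):

```python
# Illustration (not the paper's exact formula): binary ASR counts only full jailbreaks,
# while a weighted rate also credits partial policy violations, so it reads higher.
# Hypothetical severity scores: 1.0 = full compliance with the harmful request,
# 0.5 = partial/indirect leakage, 0.0 = refusal.
outcomes = [1.0, 0.0, 0.5, 0.0, 0.5, 1.0, 0.0, 0.5, 0.0, 0.0]

binary_asr = sum(1 for o in outcomes if o == 1.0) / len(outcomes)
weighted_asr = sum(outcomes) / len(outcomes)

print(f"Binary ASR:   {binary_asr:.1%}")   # 20.0%
print(f"Weighted ASR: {weighted_asr:.1%}") # 35.0%
```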
This paper introduces AGMark (Attention-Guided Dynamic Watermarking), a novel watermarking framework for Large Vision-Language Models (LVLMs) that addresses limitations in existing approaches. AGMark dynamically identifies semantic-critical tokens at each decoding step using attention weights and context-aware coherence cues, while determining the proportion of protected tokens through uncertainty awareness and evidence calibration. The framework achieves at least 99.36% detection accuracy (AUC) and maintains robust attack resilience (at least 88.61% AUC) while preserving visual semantic fidelity and generation quality.
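A hedged sketch of the selective-biasing idea, using attention mass and logit entropy to gate a generic green-list watermark; AGMark's actual scoring, uncertainty, and calibration rules are more involved than this.

```python
# Hedged sketch of attention-guided, selective watermark biasing in the spirit of AGMark.
# Attention to the image and logit entropy gate whether a generic green-list bias is
# applied at this decoding step; the paper's own rules are not reproduced here.
import numpy as np

rng = np.random.default_rng(0)

def green_list(context_hash: int, vocab_size: int, gamma: float = 0.5) -> np.ndarray:
    g = np.random.default_rng(context_hash)
    return g.random(vocab_size) < gamma  # pseudo-random green set keyed on the context

def watermark_step(logits: np.ndarray, attn_to_image: float, context_hash: int,
                   delta: float = 2.0, attn_threshold: float = 0.6) -> np.ndarray:
    """Skip the bias when the step leans heavily on visual evidence (semantic-critical)
    or the model is already confident; otherwise push mass toward the green list."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    entropy = -(probs * np.log(probs + 1e-12)).sum()
    critical = attn_to_image > attn_threshold or entropy < 1.0
    if critical:
        return logits
    biased = logits.copy()
    biased[green_list(context_hash, logits.size)] += delta
    return biased

logits = rng.normal(size=32000)
print(np.argmax(watermark_step(logits, attn_to_image=0.2, context_hash=12345)))
```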
This paper introduces a novel fingerprinting framework for protecting the intellectual property of large language models by using "refusal vectors", behavioral patterns extracted from a model's internal representations when processing harmful versus harmless prompts. The method demonstrates 100% accuracy in identifying base model families across 76 offspring models and proves robust against common modifications like finetuning, merging, and quantization. The authors propose a theoretical framework using locality-sensitive hashing and zero-knowledge proofs to transform private fingerprints into publicly verifiable, privacy-preserving artifacts.
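Conceptually, the fingerprint can be sketched as a normalized difference of mean activations compared by cosine similarity; the extraction step below is a placeholder, and the paper's verification layer (LSH plus zero-knowledge proofs) is omitted.

```python
# Sketch of a refusal-vector fingerprint (the concept from the paper, not its code).
# hidden_state(model, prompt) would run the model and return one layer's activation.
import numpy as np

def hidden_state(model, prompt: str) -> np.ndarray:
    """Placeholder: forward `prompt` through `model`, return a chosen layer's activation."""
    raise NotImplementedError

def refusal_vector(model, harmful: list[str], harmless: list[str]) -> np.ndarray:
    h_bad = np.mean([hidden_state(model, p) for p in harmful], axis=0)
    h_ok = np.mean([hidden_state(model, p) for p in harmless], axis=0)
    v = h_bad - h_ok
    return v / np.linalg.norm(v)

def same_family(v1: np.ndarray, v2: np.ndarray, threshold: float = 0.8) -> bool:
    """Cosine similarity as a simple matcher; the paper adds LSH and ZK proofs on top."""
    return float(v1 @ v2) >= threshold
```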
This paper introduces Autonomous Action Runtime Management (AARM), an open specification for securing AI-driven actions at runtime as AI systems evolve from passive assistants to autonomous agents capable of executing consequential actions. AARM defines a runtime security system that intercepts actions before execution, evaluates them against policy and intent alignment, enforces authorization decisions, and records tamper-evident receipts, addressing threats like prompt injection, confused deputy attacks, data exfiltration, and intent drift. The specification proposes four implementation architectures and aims to establish industry-wide security requirements for AI agent systems before proprietary fragmentation occurs.
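The runtime pattern the specification describes can be sketched as an interception layer with hash-chained receipts; the policy shape and names below are illustrative, not taken from the spec.

```python
# Illustrative sketch of an AARM-style runtime gate (names and policy format are ours):
# intercept an action, evaluate it, enforce the decision, and record a hash-chained receipt.
import hashlib
import json
import time

class ActionRuntime:
    def __init__(self, allowed_tools: set[str], declared_intent: str):
        self.allowed_tools = allowed_tools
        self.declared_intent = declared_intent
        self.receipts: list[dict] = []
        self._prev_hash = "0" * 64

    def _record(self, entry: dict) -> None:
        entry["prev"] = self._prev_hash           # chain each receipt to the previous one
        entry["ts"] = time.time()
        digest = hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()
        entry["hash"] = digest
        self._prev_hash = digest
        self.receipts.append(entry)

    def execute(self, tool: str, args: dict, justification: str):
        decision = "allow" if tool in self.allowed_tools else "deny"
        # A fuller implementation would also score `justification` against the declared
        # intent (intent-drift and prompt-injection checks) before deciding.
        self._record({"tool": tool, "args": args, "decision": decision,
                      "intent": self.declared_intent})
        if decision == "deny":
            raise PermissionError(f"{tool} blocked by runtime policy")
        return f"executed {tool}"

rt = ActionRuntime({"search_docs", "draft_email"}, declared_intent="summarize Q3 report")
print(rt.execute("search_docs", {"q": "Q3 revenue"}, "need figures for the summary"))
```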
This research systematically studies adversarial transferability of encoder-based attacks against large vision-language models (LVLMs), revealing that existing attacks have severely limited transferability across different LVLM architectures. The study identifies two root causes hindering transferability: inconsistent visual grounding across models and redundant semantic alignment within models. To address these limitations, the authors propose Semantic-Guided Multimodal Attack (SGMA), a framework that achieves higher transferability by directing perturbations toward semantically critical regions and disrupting cross-modal grounding at both global and local levels.
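A toy sketch of region-masked perturbation, with the alignment-disruption gradient left as a placeholder (the real attack optimizes against encoder features and cross-modal grounding at global and local levels):

```python
# Toy sketch of region-masked adversarial perturbation in the spirit of SGMA.
# The gradient function is a placeholder and the "semantic mask" is supplied by the caller;
# SGMA's actual objectives target encoder features and cross-modal grounding.
import numpy as np

def loss_gradient(image: np.ndarray) -> np.ndarray:
    """Placeholder: gradient of an alignment-disruption loss w.r.t. the input image."""
    raise NotImplementedError

def masked_pgd(image: np.ndarray, semantic_mask: np.ndarray,
               eps: float = 8 / 255, alpha: float = 2 / 255, steps: int = 10) -> np.ndarray:
    adv = image.copy()
    for _ in range(steps):
        g = loss_gradient(adv)
        adv += alpha * np.sign(g) * semantic_mask      # perturb only the critical regions
        adv = np.clip(adv, image - eps, image + eps)   # stay within the L_inf budget
        adv = np.clip(adv, 0.0, 1.0)
    return adv
```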
LLMAC is a new access control framework that uses Large Language Models to unify traditional access control methods (RBAC, ABAC, DAC) into a single comprehensive system. Using Mistral 7B trained on synthetic datasets representing complex real-world scenarios, the system achieved 98.5% accuracy, significantly outperforming traditional methods (RBAC: 14.5%, ABAC: 58.5%, DAC: 27.5%) while providing human-readable explanations for decisions.
This research paper introduces a measurement framework for monitoring GPU utilization in untrusted environments to support AI governance. The framework uses four complementary primitives based on timing and memory characteristics—Proof-of-Work-inspired mechanisms, Verifiable Delay Functions, GEMM-based tensor-core measurements, and VRAM-residency tests—to detect GPU compute activity even without trusted firmware or vendor-controlled counters. The approach aims to provide compute-based telemetry that can help detect unauthorized repurposing of GPUs for model training or policy violations.
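A simplified GEMM-timing probe of the kind the paper describes might look like the following (our sketch, assuming PyTorch; the actual primitives add verifiability and careful calibration):

```python
# Sketch of a GEMM-timing probe like the paper's tensor-core primitive (our simplification):
# a matrix multiply of known size is timed; contention from other workloads inflates latency.
import time
import torch

def gemm_probe(n: int = 4096, repeats: int = 10) -> float:
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    a = torch.randn(n, n, device=device, dtype=dtype)
    b = torch.randn_like(a)
    torch.matmul(a, b)                      # warm-up
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(repeats):
        torch.matmul(a, b)
    if device == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / repeats

# A verifier compares the measured latency against a calibrated baseline for the claimed
# GPU model; sustained deviation suggests the device is busy with other work.
print(f"mean GEMM latency: {gemm_probe() * 1e3:.2f} ms")
```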
This paper introduces the first systematic benchmark for evaluating knowledge-extraction attacks on Retrieval-Augmented Generation (RAG) systems, which can be exploited through maliciously crafted queries to recover sensitive knowledge-base content. The benchmark consolidates fragmented research by providing a unified experimental framework covering various attack and defense strategies, retrieval embedding models, and both open- and closed-source generators across standardized datasets.
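The kind of evaluation loop such a benchmark standardizes can be sketched as follows, with the RAG system under test and the attack prompts left as placeholders:

```python
# Sketch of an extraction-attack evaluation loop of the sort the benchmark standardizes
# (our simplification; the rag_answer call and attack queries are placeholders).
from difflib import SequenceMatcher

def rag_answer(query: str) -> str:
    """Placeholder: the RAG system under test (retriever + generator)."""
    raise NotImplementedError

def recovered(chunk: str, answer: str, threshold: float = 0.6) -> bool:
    """Crude leakage check: string similarity between a knowledge-base chunk and the answer."""
    return SequenceMatcher(None, chunk.lower(), answer.lower()).ratio() >= threshold

def evaluate(attack_queries: list[str], kb_chunks: list[str]) -> float:
    leaked = set()
    for q in attack_queries:
        answer = rag_answer(q)
        for i, chunk in enumerate(kb_chunks):
            if recovered(chunk, answer):
                leaked.add(i)
    return len(leaked) / len(kb_chunks)  # fraction of the knowledge base recovered
```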
This research paper presents a study of 18 UK-based domestic workers examining privacy risks from AI-driven smart home devices in both employer-controlled homes and their own households. The study develops a sociotechnical threat model identifying how AI analytics, data logs, cross-household data flows, and employment agencies create surveillance and privacy boundary challenges for domestic workers.
MUZZLE is an automated agentic framework designed to evaluate the security of LLM-based web agents against indirect prompt injection attacks. The system adaptively identifies injection surfaces from agent trajectories and generates context-aware malicious instructions, successfully discovering 37 new attacks across 4 web applications that violate confidentiality, integrity, and availability properties, including novel cross-application attacks and agent-tailored phishing scenarios.
CIC-Trap4Phish is a multi-format dataset designed to improve detection of phishing and quishing attacks through malicious email attachments. The dataset covers five common file types (Word, Excel, PDF, HTML, and QR code images) and uses execution-free static feature pipelines for the first four types, while employing CNNs and lightweight language models for QR code-based phishing detection. Machine learning models including Random Forest, XGBoost, and Decision Tree demonstrated high detection accuracy across all formats.
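The execution-free pattern the dataset supports can be sketched with a handful of illustrative static HTML features (not the dataset's actual schema) feeding a Random Forest:

```python
# Sketch of the execution-free pipeline the dataset is built for. The features are
# illustrative, not CIC-Trap4Phish's schema: static signals from an attachment, no rendering.
import re
from sklearn.ensemble import RandomForestClassifier

def html_features(html: str) -> list[float]:
    """A few static signals often used for HTML attachments; nothing is executed."""
    return [
        float(len(html)),
        float(len(re.findall(r"<script\b", html, re.I))),
        float(len(re.findall(r"<form\b", html, re.I))),
        float(len(re.findall(r"https?://", html))),
        float("base64," in html),
    ]

def train(samples: list[str], labels: list[int]) -> RandomForestClassifier:
    X = [html_features(s) for s in samples]
    clf = RandomForestClassifier(n_estimators=300, random_state=0)
    clf.fit(X, labels)
    return clf
```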
This paper identifies a vulnerability in password-authenticated key exchange (PAKE) protocols called "reverse online guessing attacks," in which an adversary validates password guesses by impersonating a server rather than a client. The attack is particularly effective in phishing and password-spraying scenarios and in applications with automated logins such as WPA3-SAE, and it exploits the fact that PAKE protocols have no server authentication mechanism beyond the password itself.
StealthRL is a reinforcement learning framework that uses paraphrasing attacks to evade AI-text detectors while preserving semantic meaning. The system achieves near-zero detection rates (0.001 mean TPR@1%FPR) and 99.9% attack success rate against multiple detector families, with attacks successfully transferring to unseen detector types, revealing fundamental architectural vulnerabilities in current AI-text detection systems.
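The reward shaping such a setup implies can be sketched as a trade-off between detector evasion and semantic fidelity; the detector, similarity model, and paraphraser are placeholders, and StealthRL's exact objective and training loop are in the paper.

```python
# Sketch of the reward shaping a detector-evasion RL setup implies (our formulation,
# not StealthRL's exact objective): reward evasion while penalizing meaning drift.
def detector_score(text: str) -> float:
    """Placeholder: probability the detector assigns to 'AI-generated' (0..1)."""
    raise NotImplementedError

def semantic_similarity(a: str, b: str) -> float:
    """Placeholder: e.g. cosine similarity of sentence embeddings (0..1)."""
    raise NotImplementedError

def reward(original: str, paraphrase: str, lam: float = 0.5) -> float:
    evasion = 1.0 - detector_score(paraphrase)            # reward slipping past the detector
    fidelity = semantic_similarity(original, paraphrase)  # keep the meaning intact
    return lam * evasion + (1.0 - lam) * fidelity
```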
This paper proposes a comprehensive framework for integrating Zero Trust Architecture (ZTA) into cloud-based endpoint security for critical infrastructure such as power plants, healthcare systems, and financial systems. The framework aims to address the gap in applying ZTA to endpoint management within cloud environments, treating every access request as new with no implicit trust, thereby enhancing compliance, enabling continuous protection, and reducing attack surfaces.