Paper 2602.14211v1

SkillJect: Automating Stealthy Skill-Based Prompt Injection for Coding Agents with Trace-Driven Closed-Loop Refinement

extend tool-augmented behaviors. This abstraction introduces an under-measured attack surface: skill-based prompt injection, where poisoned skills can steer agents away from user intent and safety policies

high relevance attack
Paper 2512.15081v1

Quantifying Return on Security Controls in LLM Systems

subjected to automated attacks with Garak across five vulnerability classes: PII leakage, latent context injection, prompt injection, adversarial attack generation, and divergence. For each (vulnerability, control) pair, attack success probabilities

medium relevance tool
Paper 2511.20597v1

BrowseSafe: Understanding and Preventing Prompt Injection Within AI Browser Agents

security challenges that go beyond traditional web application threat models. Prior work has identified prompt injection as a new attack vector for web agents, yet the resulting impact within real

high relevance attack
Paper 2604.12284v1

WebAgentGuard: A Reasoning-Driven Guard Model for Detecting Prompt Injection Attacks in Web Agents

textual webpage content to accomplish user-specified tasks. However, they are highly vulnerable to prompt injection attacks, where adversarial instructions embedded in HTML or rendered screenshots can manipulate agent behavior

high relevance attack
Paper 2605.15030v1

WARD: Adversarially Robust Defense of Web Agents Against Prompt Injections

interacting with websites, but their exposure to open web environments makes them vulnerable to prompt injection attacks embedded in HTML content or visual interfaces. Existing guard models still suffer from

high relevance attack
Paper 2606.12737v1

PI-Hunter: Automated Red-Teaming for Exposing and Localizing Prompt Injections

that interact with external tools and environments, introducing new security risks such as indirect prompt injection attacks through untrusted external sources. Existing defenses mainly focus on blocking malicious content

high relevance attack
CVE CRITICAL CVE-2026-42074

OpenClaude Sandbox Bypass via Model-Controlled `dangerouslyDisableSandbox` Input

openclaude
Paper 2605.11868v1

IPI-proxy: An Intercepting Proxy for Red-Teaming Web-Browsing AI Agents Against Indirect Prompt Injection

HTML pages those domains serve. Existing red-teaming resources fall short of this scenario: prompt-injection benchmarks ship pre-built adversarial pages that whitelisted agents cannot reach, and generic

high relevance attack
Paper 2603.13424v1

Agent Privilege Separation in OpenClaw: A Structural Defense Against Prompt Injection

Prompt injection remains one of the most practical attack vectors against LLM-integrated applications. We replicate the Microsoft LLMail-Inject benchmark (Greshake et al., 2024) against current generation models running

high relevance attack

`JSONalyzeQueryEngine` in the run-llama/llama_index repository allows for SQL injection via prompt injection. This can lead to arbitrary file creation and Denial-of-Service (DoS) attacks. The vulnerability affects

CVSS 7.1 llamaindex
CVE CRITICAL CVE-2024-8309

GraphCypherQAChain class of langchain-ai/langchain version 0.2.5 allows for SQL injection through prompt injection. This vulnerability can lead to unauthorized data manipulation, data exfiltration, denial of service

CVSS 9.8 langchain

server CORS wildcard + auth-off-by-default enables CSRF graph exfiltration and persistent indirect prompt injection

Flowise: APIChain Prompt Injection SSRF in GET/POST API Chains

CVSS 7.1 flowise-components
Paper 2603.19469v1

A Framework for Formalizing LLM Agent Security

executes a user task. Using this framework, we reformalize existing attacks, such as indirect prompt injection, direct prompt injection, jailbreak, task drift, and memory poisoning, as violations

medium relevance tool
Paper 2602.13597v2

AlignSentinel: Alignment-Aware Detection of Prompt Injection Attacks

Prompt injection attacks insert malicious instructions into an LLM's input to steer it toward an attacker-chosen task instead of the intended one. Existing detection defenses typically classify

high relevance attack
CVE CRITICAL CVE-2024-7042

langchain-ai/langchainjs versions 0.2.5 and all versions with this class allows for prompt injection, leading to SQL injection. This vulnerability permits unauthorized data manipulation, data exfiltration, denial of service

CVSS 9.8 langchain
Paper 2511.15759v1

Securing AI Agents Against Prompt Injection Attacks

used for enhancing large language model capabilities, but they introduce significant security vulnerabilities through prompt injection attacks. We present a comprehensive benchmark for evaluating prompt injection risks in RAG-enabled

high relevance attack
Paper 2604.05179v1

Gradient-Controlled Decoding: A Safety Guardrail for LLMs with Dual-Anchor Steering

Large language models (LLMs) remain susceptible to jailbreak and direct prompt-injection attacks, yet the strongest defensive filters frequently over-refuse benign queries and degrade user experience. Previous work

medium relevance defense
CVE CRITICAL CVE-2024-12366

PandasAI uses an interactive prompt function that is vulnerable to prompt injection and can run arbitrary Python code, leading to Remote Code Execution (RCE), instead of the intended explanation

CVSS 9.8 pandasai
Paper 2606.22659v1

Confidently Wrong: Severity-Aware Calibration of Prompt-Injection Detectors under Attack Shift

Prompt-injection detectors are deployed as guards: a model scores an input and a downstream system trusts or blocks it on that score. I study the confidence of these scores

high relevance attack