Search: prompt injection | AI Threat Alert

Paper 2601.13612v1

2026-01-20

PINA: Prompt Injection Attack against Navigation Agents

actions. Compared to text-based applications, their security is far more critical: a successful prompt injection attack does not just alter outputs but can directly misguide physical navigation, leading

high relevance attack

Paper 2605.26999v1

2026-05-26

Prompt Injection Detection is Regime-Dependent: A Deployment-Aware Evaluation with Interpretable Structural Signals

Prompt injection poses a critical threat to the safe deployment of large language models, yet existing detection approaches are typically evaluated under limited settings that do not reflect real-world

high relevance benchmark

Paper 2601.17383v1

2026-01-24

Physical Prompt Injection Attacks on Large Vision-Language Models

reasoning in open physical environments. While LVLMs are known to be vulnerable to prompt injection attacks, existing methods either require access to input channels or depend on knowledge of user

high relevance attack

Paper 2509.25926v1

2025-09-30

Better Privilege Separation for Agents by Restricting Data Types

systems, such as AI agents. Unfortunately, these advantages have come with a vulnerability to prompt injections, an attack where an adversary subverts the LLM's intended functionality with an injected

medium relevance attack

Paper 2605.28017v1

2026-05-27

Can It Reach the Generator? Investigating the Survival of Prompt-Injection Attacks in Realistic RAG Settings

Recent generative engine optimisation (GEO) research has shown that prompt-injection attacks can push a target product to the top of an LLM's recommendation list, with the strongest attacks

high relevance attack

Paper 2605.18133v1

2026-05-18

An Empirical Study of Privacy Leakage Chains via Prompt Injection in Black-Box Chatbot Environments

user's task. This paper studies a privacy-leakage attack chain based on indirect prompt injection in black-box chatbot environments, where the attacker has no access to model weights

high relevance attack

Paper 2604.25562v1

2026-04-28

SnapGuard: Lightweight Prompt Injection Detection for Screenshot-Based Web Agents

effective paradigm for automating interactions with complex web environments, yet remain vulnerable to prompt injection attacks that embed malicious instructions into webpage content to induce unintended actions. This threat

high relevance attack

Paper 2602.09222v1

2026-02-09

MUZZLE: Adaptive Agentic Red-Teaming of Web Agents Against Indirect Prompt Injection Attacks

users' behalf. While these agents offer powerful capabilities, their design exposes them to indirect prompt injection attacks embedded in untrusted web content, enabling adversaries to hijack agent behavior and violate

high relevance attack

Paper 2601.17911v1

2026-01-25

Prompt Injection Evaluations: Refusal Boundary Instability and Artifact-Dependent Compliance in GPT-4-Series Models

Prompt injection evaluations typically treat refusal as a stable, binary indicator of safety. This study challenges that paradigm by modeling refusal as a local decision boundary and examining its stability

high relevance benchmark

Paper 2606.13038v1

2026-06-11

Nous: An Attempt to Extract and Inject the Cognition Behind Prediction-Market Behavior

measuring the cognitive-monoculture problem and the limits of a prompt-level remedy, motivating deeper, below-the-prompt injection (fine-tuning, activation steering). Code, frozen profiles, prompts, and model outputs

medium relevance attack

Paper 2510.16128v1

2025-10-17

Prompt injections as a tool for preserving identity in GAI image descriptions

have been described, but most require top down or external intervention. An emerging strategy, prompt injections, provides an empowering alternative: indirect users can mitigate harm against them, from within their

high relevance tool

Paper 2605.17324v1

2026-05-17

ASPI: Seeking Ambiguity Clarification Amplifies Prompt Injection Vulnerability in LLM Agents

from standard execution to a clarification-seeking state increases an agent's susceptibility to prompt injection attacks. We introduce ASPI (Ambiguous-State Prompt Injection), a benchmark of 728 task-attack

high relevance attack

Paper 2512.00966v1

2025-11-30

Mitigating Indirect Prompt Injection via Instruction-Following Intent Analysis

Indirect prompt injection attacks (IPIAs), where large language models (LLMs) follow malicious instructions hidden in input data, pose a critical threat to LLM-powered agents. In this paper, we present

high relevance attack

Paper 2602.14211v1

2026-02-15

SkillJect: Automating Stealthy Skill-Based Prompt Injection for Coding Agents with Trace-Driven Closed-Loop Refinement

extend tool-augmented behaviors. This abstraction introduces an under-measured attack surface: skill-based prompt injection, where poisoned skills can steer agents away from user intent and safety policies

high relevance attack

Paper 2512.15081v1

2025-12-17

Quantifying Return on Security Controls in LLM Systems

subjected to automated attacks with Garak across five vulnerability classes: PII leakage, latent context injection, prompt injection, adversarial attack generation, and divergence. For each (vulnerability, control) pair, attack success probabilities

medium relevance tool

Paper 2511.20597v1

2025-11-25

BrowseSafe: Understanding and Preventing Prompt Injection Within AI Browser Agents

security challenges that go beyond traditional web application threat models. Prior work has identified prompt injection as a new attack vector for web agents, yet the resulting impact within real

high relevance attack

Paper 2604.12284v1

2026-04-14

WebAgentGuard: A Reasoning-Driven Guard Model for Detecting Prompt Injection Attacks in Web Agents

textual webpage content to accomplish user-specified tasks. However, they are highly vulnerable to prompt injection attacks, where adversarial instructions embedded in HTML or rendered screenshots can manipulate agent behavior

high relevance attack

Paper 2605.15030v1

2026-05-14

WARD: Adversarially Robust Defense of Web Agents Against Prompt Injections

interacting with websites, but their exposure to open web environments makes them vulnerable to prompt injection attacks embedded in HTML content or visual interfaces. Existing guard models still suffer from

high relevance attack

Paper 2606.12737v1

2026-06-10

PI-Hunter: Automated Red-Teaming for Exposing and Localizing Prompt Injections

that interact with external tools and environments, introducing new security risks such as indirect prompt injection attacks through untrusted external sources. Existing defenses mainly focus on blocking malicious content

high relevance attack

Paper 2605.11868v1

2026-05-12

IPI-proxy: An Intercepting Proxy for Red-Teaming Web-Browsing AI Agents Against Indirect Prompt Injection

HTML pages those domains serve. Existing red-teaming resources fall short of this scenario: prompt-injection benchmarks ship pre-built adversarial pages that whitelisted agents cannot reach, and generic

high relevance attack
