Search: prompt injection | AI Threat Alert

451 results in 183ms

Paper 2605.10862v1

2026-05-11

RUBEN: Rule-Based Explanations for Retrieval-Augmented LLM Systems

safety, specifically to test the resiliency of safety training and effectiveness of adversarial prompt injections

medium relevance tool

Paper 2605.09822v1

2026-05-10

Oracle Poisoning: Corrupting Knowledge Graphs to Weaponise AI Agent Reasoning

query at runtime via tool-use protocols, causing incorrect conclusions through correct reasoning. Unlike prompt injection, Oracle Poisoning manipulates the data agents reason over, not their instructions. We demonstrate

medium relevance attack

Paper 2605.06393v1

2026-05-07

Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation

legitimately deployed agent may be steered toward unsafe operations through malicious messages, indirect prompt injection, unsafe skills, or tampering along the host-side control path. We argue that such risks

medium relevance benchmark

CVE MEDIUM CVE-2026-43901

2026-05-05

wireshark-mcp vulnerable to arbitrary file write via export_objects

CVSS 6.8 wireshark-mcp

Paper 2605.04261v1

2026-05-05

Laundering AI Authority with Adversarial Examples

produces confident and authoritative responses about the wrong input. Unlike jailbreaks or prompt injections, our attacks do not compromise model alignment; the attack operates entirely at the perceptual level

medium relevance attack

Paper 2605.03213v1

2026-05-04

When Agents Handle Secrets: A Survey of Confidential Computing for Agentic AI

sensitive context, hold credentials, and operate across pipelines no single party fully controls, enabling prompt injection, context exfiltration, credential theft, and inter-agent message poisoning. Current defenses operate entirely within

medium relevance survey

Paper 2605.03129v1

2026-05-04

PIIGuard: Mitigating PII Harvesting under Adversarial Sanitization

with limited deployable options. We present PIIGuard, a webpage-level defense that repurposes indirect prompt injection as a protective mechanism: the page owner embeds optimized hidden HTML fragments that steer

medium relevance attack

Paper 2605.02811v1

2026-05-04

Tool Use as Action: Towards Agentic Control in Mobile Core Networks

functions and break down the latency of end-to-end operations, starting from the prompt injection until the completion of the input task. This work demonstrates how an AI agent

medium relevance attack

Paper 2605.02187v1

2026-05-04

When Alignment Isn't Enough: Response-Path Attacks on LLM Agents

AgentDojo and ASB with six LLMs, RTA achieves up to 99.1% attack success, outperforming prompt-injection baselines with modest overhead. Case studies on OpenClaw and Claude Code demonstrate real-world

high relevance attack

Paper 2604.28129v1

2026-04-30

Latent Adversarial Detection: Adaptive Probing of LLM Activations for Multi-Turn Attack Detection

Multi-turn prompt injection follows a known attack path -- trust-building, pivoting, escalation -- but text-level defenses miss covert attacks where individual turns appear benign. We show this attack path

high relevance attack

CVE MEDIUM GHSA-gfg9-5357-hv4c

2026-04-29

OpenClaw: Webchat audio embedding could read local files without local

openclaw

Paper 2604.25200v1

2026-04-28

Making AI-Assisted Grant Evaluation Auditable without Exposing the Model

rubric measurement, and the evaluation output. The paper also considers a scenario-specific prompt injection risk: applicant-controlled documents may contain hidden or indirect instructions intended to influence

medium relevance benchmark

Paper 2604.25186v1

2026-04-28

FCMBench-Video: Benchmarking Document Video Intelligence

Cross-Document Validation and Evidence-Grounded Selection probe higher-level evidence integration, and Visual Prompt Injection provides a complementary robustness dimension. The overall score distribution is broad and approximately bell

medium relevance benchmark

Paper 2604.25102v1

2026-04-28

One Perturbation, Two Failure Modes: Probing VLM Safety via Embedding-Guided Typographic Perturbations

Typographic prompt injection exploits vision language models' (VLMs) ability to read text rendered in images, posing a growing threat as VLMs power autonomous agents. Prior work typically focuses on maximizing

medium relevance defense

Paper 2604.24920v1

2026-04-27

SUDP: Secret-Use Delegation Protocol for Agentic Systems

reusable artifact derived from it, within a model-steerable boundary, so a transient prompt-injection or tool-side compromise becomes durable account compromise. Existing defenses cover adjacent pieces such

medium relevance survey

Paper 2604.23593v1

2026-04-26

When AI reviews science: Can we trust the referee?

informal adoption have exposed acute failure modes. Recent incidents have revealed that hidden prompt injections embedded in manuscripts can steer LLM-generated reviews toward unjustifiably positive judgments. Complementary studies have

medium relevance survey

CVE MEDIUM GHSA-7jm2-g593-4qrc

2026-04-25

OpenClaw: Agent gateway config mutations could change protected operator settings

openclaw

Paper 2604.20732v1

2026-04-22

Anchor-and-Resume Concession Under Dynamic Pricing for LLM-Augmented Freight Negotiation

flexibility but require expensive reasoning models, produce non-deterministic pricing, and remain vulnerable to prompt injection. We propose a two-index anchor-and-resume framework that addresses both limitations

medium relevance benchmark

Paper 2604.18206v1

2026-04-20

A Control Architecture for Training-Free Memory Use

Prompt-injected memory can improve reasoning without updating model weights, but it also creates a control problem: retrieved content helps only when it is applied in the right state

low relevance benchmark

Paper 2604.17562v1

2026-04-19

SafeAgent: A Runtime Protection Architecture for Agentic Systems

Large language model (LLM) agents are vulnerable to prompt-injection attacks that propagate through multi-step workflows, tool interactions, and persistent context, making input-output filtering alone insufficient for reliable

medium relevance defense
