Search: prompt injection | AI Threat Alert

367 results in 156ms

Paper 2603.11875v1

2026-03-12

The Mirror Design Pattern: Strict Data Geometry over Model Scale for Prompt Injection Detection

Prompt injection defenses are often framed as semantic understanding problems and delegated to increasingly large neural detectors. For the first screening layer, however, the requirements are different: the detector runs

high relevance attack

Paper 2602.06268v1

2026-02-06

MPIB: A Benchmark for Medical Prompt Injection Attacks and Clinical Safety in LLMs

LLMs) and Retrieval-Augmented Generation (RAG) systems are increasingly integrated into clinical workflows; however, prompt injection attacks can steer these systems toward clinically unsafe or misleading outputs. We introduce

high relevance benchmark

CVE CRITICAL GHSA-2763-cj5r-c79m

2026-04-08

PraisonAI Vulnerable to OS Command Injection

CVSS 9.7 PraisonAI

Paper 2511.05797v1

2025-11-08

When AI Meets the Web: Prompt Injection Risks in Third-Party AI Chatbot Plugins

Prompt injection attacks pose a critical threat to large language models (LLMs), with prior work focusing on cutting-edge LLM applications like personal copilots. In contrast, simpler LLM applications, such

high relevance attack

Paper 2512.12583v1

2025-12-14

Detecting Prompt Injection Attacks Against Application Using Classifiers

networks, Random Forest, and Naive Bayes, to detect malicious prompts in LLM integrated web applications. The proposed approach improves prompt injection detection and mitigation, helping protect targeted applications and systems

high relevance attack

Paper 2510.01354v1

2025-10-01

WAInjectBench: Benchmarking Prompt Injection Detections for Web Agents

Multiple prompt injection attacks have been proposed against web agents. At the same time, various methods have been developed to detect general prompt injection attacks, but none have been systematically

high relevance benchmark

Paper 2509.24967v4

2025-09-29

SecInfer: Preventing Prompt Injection via Inference-time Scaling

Prompt injection attacks pose a pervasive threat to the security of Large Language Models (LLMs). State-of-the-art prevention-based defenses typically rely on fine-tuning

high relevance attack

Paper 2601.22240v1

2026-01-29

A Systematic Literature Review on LLM Defenses Against Prompt Injection and Jailbreaking: Expanding NIST Taxonomy

emergence of new security vulnerabilities and challenges, such as jailbreaking and other prompt injection attacks. These maliciously crafted inputs can exploit LLMs, causing data leaks, unauthorized actions, or compromised outputs

high relevance survey

Paper 2601.10173v1

2026-01-15

ReasAlign: Reasoning Enhanced Safety Alignment against Prompt Injection Attack

automating complex workflows across various fields. However, these systems are highly vulnerable to indirect prompt injection attacks, where malicious instructions embedded in external data can hijack agent behavior. In this

high relevance attack

Paper 2604.12548v1

2026-04-14

DeepSeek Robustness Against Semantic-Character Dual-Space Mutated Prompt Injection

Prompt injection has emerged as a critical security threat to large language models (LLMs), yet existing studies predominantly focus on single-dimensional attack strategies, such as semantic rewriting or character

high relevance attack

Paper 2511.00447v2

2025-11-01

DRIP: Defending Prompt Injection via Token-wise Representation Editing and Residual Instruction Fusion

they process user data according to predefined instructions. However, conventional LLMs remain vulnerable to prompt injection, where malicious users inject directive tokens into the data to subvert model behavior. Existing

high relevance attack

Paper 2512.23684v1

2025-12-29

Multilingual Hidden Prompt Injection Attacks on LLM-Based Academic Reviewing

find that prompt injection induces substantial changes in review scores and accept/reject decisions for English, Japanese, and Chinese injections, while Arabic injections produce little to no effect. These results highlight

high relevance survey

Paper 2511.21752v2

2025-11-23

Semantics as a Shield: Label Disguise Defense (LDD) against Prompt Injection in LLM Sentiment Classification

classification tasks such as sentiment analysis, yet their reliance on natural language prompts exposes them to prompt injection attacks. In particular, class-directive injections exploit knowledge of the model

high relevance attack

Paper 2602.03792v1

2026-02-03

WebSentinel: Detecting and Localizing Prompt Injection Attacks for Web Agents

Prompt injection attacks manipulate webpage content to cause web agents to execute attacker-specified tasks instead of the user's intended ones. Existing methods for detecting and localizing such attacks

high relevance attack

Paper 2512.09321v3

2025-12-10

ObliInjection: Order-Oblivious Prompt Injection Attack to LLM Agents with Multi-source Data

Prompt injection attacks aim to contaminate the input data of an LLM to mislead it into completing an attacker-chosen task instead of the intended task. In many applications

high relevance attack

Paper 2510.19207v2

2025-10-22

Defending Against Prompt Injection with DataFilter

agents are increasingly deployed to automate tasks and interact with untrusted external data, prompt injection emerges as a significant security threat. By injecting malicious instructions into the data that LLMs

high relevance attack

Paper 2511.12295v1

2025-11-15

Privacy-Preserving Prompt Injection Detection for LLMs Using Federated Learning and Embedding-Based NLP Classification

designed inputs. Existing detection approaches often require centralizing prompt data, creating significant privacy risks. This paper proposes a privacy-preserving prompt injection detection framework based on federated learning and embedding

high relevance attack

Paper 2601.13612v1

2026-01-20

PINA: Prompt Injection Attack against Navigation Agents

actions. Compared to text-based applications, their security is far more critical: a successful prompt injection attack does not just alter outputs but can directly misguide physical navigation, leading

high relevance attack

Paper 2601.17383v1

2026-01-24

Physical Prompt Injection Attacks on Large Vision-Language Models

reasoning in open physical environments. While LVLMs are known to be vulnerable to prompt injection attacks, existing methods either require access to input channels or depend on knowledge of user

high relevance attack

Paper 2509.25926v1

2025-09-30

Better Privilege Separation for Agents by Restricting Data Types

systems, such as AI agents. Unfortunately, these advantages have come with a vulnerability to prompt injections, an attack where an adversary subverts the LLM's intended functionality with an injected

medium relevance attack
