Search: prompt injection | AI Threat Intelligence

Severity:

288 results in 138ms

Paper 2603.06588v1

2026-02-02

vLLM Hook v0: A Plug-in for Programming Model Internals on vLLM

core functions of vLLM Hook, in version 0, we demonstrate 3 use cases including prompt injection detection, enhanced retrieval-augmented retrieval (RAG), and activation steering. Finally, we welcome the community

medium relevance attack

Paper 2602.01942v1

2026-02-02

Human Society-Inspired Approaches to Agentic AI Security: The 4C Framework

Although recent work has strengthened defenses against model and pipeline level vulnerabilities such as prompt injection, data poisoning, and tool misuse, these system centric approaches may fail to capture risks

medium relevance tool

Paper 2602.01129v1

2026-02-01

SMCP: Secure Model Context Protocol

security and privacy challenges. These include risks such as unauthorized access, tool poisoning, prompt injection, privilege escalation, and supply chain attacks, any of which can impact different parts

medium relevance attack

CVE CRITICAL CVE-2026-25130

2026-01-30

CAI find_file Agent Tool has Command Injection Vulnerability Through

CVSS 9.7 cai-framework View details

Paper 2601.16314v1

2026-01-22

Machine-Assisted Grading of Nationwide School-Leaving Essay Exams with LLMs and Statistical NLP

raters and tends to fall within the human scoring range. We also evaluate bias, prompt injection risks, and LLMs as essay writers. These findings demonstrate that a principled, rubric-driven

medium relevance benchmark

Paper 2602.12285v1

2026-01-21

From Biased Chatbots to Biased Agents: Examining Role Assignment Effects on LLM Agent Robustness

shifts appear across task types and model architectures, indicating that persona conditioning and simple prompt injections can distort an agent's decision-making reliability. Our findings reveal an overlooked vulnerability

medium relevance benchmark

Paper 2601.12822v1

2026-01-19

MirrorGuard: Toward Secure Computer-Use Agents via Simulation-to-Real Reasoning Correction

perform complex tasks. This autonomy introduces serious security risks: malicious instructions or visual prompt injections can trigger unsafe reasoning and cause harmful system-level actions. Existing defenses, such as detection

medium relevance benchmark

Paper 2601.12560v1

2026-01-18

Agentic Artificial Intelligence (AI): Architectures, Taxonomies, and Evaluation of Large Language Model Agents

practices. Finally, we highlight open challenges, such as hallucination in action, infinite loops, and prompt injection, and outline future research directions toward more robust and reliable autonomous systems

medium relevance benchmark

Paper 2601.12449v1

2026-01-18

AgenTRIM: Tool Risk Mitigation for Agentic AI

While such tools extend capability, improper tool permissions introduce security risks such as indirect prompt injection and tool misuse. We characterize these failures as unbalanced tool-driven agency. Agents

medium relevance tool

Paper 2601.10338v1

2026-01-15

Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale

skills contain at least one vulnerability, spanning 14 distinct patterns across four categories: prompt injection, data exfiltration, privilege escalation, and supply chain risks. Data exfiltration (13.3%) and privilege escalation

medium relevance survey

Paper 2601.10156v1

2026-01-15

ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback

percent on average and improves benign task completion by approximately 10 percent under prompt injection attacks

medium relevance tool

Paper 2601.09923v2

2026-01-14

CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents

agents are vulnerable to prompt injection attacks, where malicious content hijacks agent behavior to steal credentials or cause financial loss. The only known robust defense is architectural isolation that strictly

medium relevance tool

Paper 2601.07263v1

2026-01-12

When Bots Take the Bait: Exposing and Mitigating the Emerging Social Engineering Attack in Web Automation Agent

broadened the attack surface. While prior research has focused on model threats such as prompt injection and backdoors, the risks of social engineering remain largely unexplored. We present the first

high relevance attack

Paper 2601.07185v1

2026-01-12

Defenses Against Prompt Attacks Learn Surface Heuristics

test-time accuracy drops of up to \textbf{40\%}. These findings suggest that current prompt-injection defenses frequently respond to attack-like surface patterns rather than the underlying intent

high relevance attack

Paper 2601.07853v1

2026-01-09

FinVault: Benchmarking Financial Agent Safety in Execution-Grounded Environments

constraints, together with 107 real-world vulnerabilities and 963 test cases that systematically cover prompt injection, jailbreaking, financially adapted attacks, as well as benign inputs for false-positive evaluation. Experimental

medium relevance benchmark

Paper 2601.05059v1

2026-01-08

From Understanding to Engagement: Personalized pharmacy Video Clips via Vision Language Models (VLMs)

smooth transitions and audio/visual alignment; (ii) a personalization mechanism based on role definition and prompt injection for tailored outputs (marketing, training, regulatory); (iii) a cost efficient e2e pipeline strategy balancing

medium relevance benchmark

Paper 2601.04583v1

2026-01-08

Autonomous Agents on Blockchains: Standards, Execution Models, and Trust Boundaries

threat model tailored to agent-driven transaction pipelines that captures risks ranging from prompt injection and policy misuse to key compromise, adversarial execution dynamics, and multi-agent collusion

medium relevance survey

Paper 2601.01972v3

2026-01-05

Hidden State Poisoning Attacks against Mamba-based Language Models

also observe that HiSPA triggers significantly weaken the Jamba model on the popular Open-Prompt-Injections benchmark, unlike pure Transformers. Finally, our interpretability study reveals patterns in Mamba's hidden

high relevance attack

Paper 2601.01972v4

2026-01-05

Hidden State Poisoning Attacks against Mamba-based Language Models

also observe that HiSPA triggers significantly weaken the Jamba model on the popular Open-Prompt-Injections benchmark, unlike pure Transformers. We further show that the theoretical and empirical findings extend

high relevance attack

Paper 2601.01241v1

2026-01-03

MCP-SandboxScan: WASM-based Secure Execution and Runtime Analysis for MCP Tools

agents raise new security risks: tool executions can introduce runtime-only behaviors, including prompt injection and unintended exposure of external inputs (e.g., environment secrets or local files). While existing scanners

medium relevance benchmark

Previous Page 11 of 15 Next