Search: prompt injection | AI Threat Alert

458 results in 179ms

Paper 2605.05846v1

2026-05-07

LoopTrap: Termination Poisoning Attacks on LLM Agents

this self-directed loop facilitates autonomy, it also introduces a critical risk: by injecting malicious prompts into the agent's context, an adversary can distort the agent's termination judgment

high relevance attack

CVE CRITICAL GHSA-fq2m-6wqh-x44g

2026-06-18

PraisonAI: Jobs API exposes agent-execution endpoints with no authentication

CVSS 9.8 praisonai

Paper 2606.09499v1

2026-06-08

Targeting World Models to Compromise Robot Learning Pipelines

directly implant dangerous trajectories into sold or uploaded datasets, our novel attack methods inject malicious prompts or compromising transition dynamics into visibly safe teleoperated datasets which are only activated once

medium relevance benchmark

Paper 2603.17239v1

2026-03-18

LAAF: Logic-layer Automated Attack Framework - A Systematic Red-Teaming Methodology for LPCI Vulnerabilities in Agentic Large Language Model Systems

pipelines, and external tool connectors face a class of attacks - Logic-layer Prompt Control Injection (LPCI) - for which no automated red-teaming instrument existed. We present LAAF (Logic-layer Automated

high relevance attack

Paper 2603.12644v1

2026-03-13

Uncovering Security Threats and Architecting Defenses in Autonomous Agents: A Case Study of OpenClaw

OpenClaw ecosystem. We systematically investigate its current threat landscape, highlighting critical vulnerabilities such as prompt injection-driven Remote Code Execution (RCE), sequential tool attack chains, context amnesia, and supply chain

medium relevance defense

Paper 2512.17146v1

2025-12-19

Biosecurity-Aware AI: Agentic Risk Auditing of Soft Prompt Attacks on ESM-Based Variant Predictors

GFMs. SAGE functions through an interpretable and automated risk auditing loop. It injects soft prompt perturbations, monitors model behavior across training checkpoints, computes risk metrics such as AUROC and AUPR

high relevance attack

Paper 2605.04808v1

2026-05-06

DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents

propose DTap-Red, the first autonomous red-teaming agent that systematically explores diverse injection vectors (e.g., prompt, tool, skill, environment, combinations) and autonomously discovers effective attack strategies tailored to varying

high relevance tool

Paper 2512.17259v1

2025-12-19

Verifiability-First Agents: Provable Observability and Lightweight Audit Agents for Controlling Autonomous LLM Systems

detection under stealthy strategies, and (iii) resilience of verifiability mechanisms to adversarial prompt and persona injection. Our approach shifts the evaluation focus from how likely misalignment is to how quickly

medium relevance tool

Paper 2603.08387v1

2026-03-09

AULLM++: Structural Reasoning with Large Language Models for Micro-Expression Recognition

propose AULLM++, a reasoning-oriented framework leveraging Large Language Models (LLMs), which injects visual features into textual prompts as actionable semantic premises to guide inference. It formulates AU prediction into

low relevance benchmark

Paper 2606.18120v1

2026-06-16

Structural Role Injection in Handlebars-Templated LLM Prompts: Triple-Brace Interpolation, Delimiter Family, and the Limits of HTML Auto-Escaping

Large language model applications build prompts from templates, and Handlebars is a widely used templating engine and the default prompt-template format in Microsoft Semantic Kernel. Its double-brace

high relevance attack

Paper 2602.05401v1

2026-02-05

BadTemplate: A Training-Free Backdoor Attack via Chat Template Against Large Language Models

chat templates allows an attacker who controls the template to inject arbitrary strings into the system prompt without the user's notice. Building on this, we propose a training-free

high relevance attack

Paper 2604.21700v1

2026-04-23

Stealthy Backdoor Attacks against LLMs Based on Natural Style Triggers

attack in a realistic threat model and systematically evaluate BadStyle under both prompt-induced and PEFT-based injection strategies. Extensive experiments across seven victim LLMs, including LLaMA, Phi, DeepSeek

high relevance attack

Paper 2604.21829v1

2026-04-23

Black-Box Skill Stealing Attack from Proprietary LLM Agents: An Empirical Study

prior prompt-stealing methods and build an automated stealing prompt generation agent. This agent starts from model-generated seed prompts, expands them through scenario rationalization and structure injection, and enforces

high relevance attack

CVE CRITICAL CVE-2024-34359

2024-05-14

llama-cpp-python is the Python bindings for llama.cpp. `llama

CVSS 9.6

Paper 2601.02670v1

2026-01-06

Multi-Turn Jailbreaking of Aligned LLMs via Lexical Anchor Tree Search

injection. LATS reformulates jailbreaking as a breadth-first tree search over multi-turn dialogues, where each node incrementally injects missing content words from the attack goal into benign prompts. Evaluations

high relevance attack

Paper 2605.12746v1

2026-05-12

CoT-Guard: Small Models for Strong Monitoring

attacks, where the adversary is a third-party LLM router injecting hidden objectives into code-generation requests through either prompt manipulation or code manipulation attacks. To push beyond objectives that

medium relevance attack

CVE CRITICAL CVE-2025-9556

2025-09-12

files, which leads to a server side template injection vulnerability within langchaingo, allowing an attacker to insert a statement into a prompt to read the "etc/passwd" file

CVSS 9.8

Paper 2605.10600v1

2026-05-11

Generate "Normal", Edit Poisoned: Branding Injection via Hint Embedding in Image Editing

rendered onto semantically related objects, even when the user prompt does not explicitly mention it. This form of hidden payload injection makes the attack stealthy. We study two realistic attack

high relevance attack

Paper 2511.10913v1

2025-11-14

Synthetic Voices, Real Threats: Evaluating Large Text-to-Speech Models in Generating Harmful Audio

second leverages audio-modality exploits (Read, Spell, Phoneme) that inject harmful content through auxiliary audio channels while maintaining benign textual prompts. Through evaluation across five commercial LALMs-based TTS systems

medium relevance benchmark

Paper 2511.17666v1

2025-11-21

Evaluating Adversarial Vulnerabilities in Modern Large Language Models

prompted to circumvent their own safety protocols, and 'cross-bypass', where one model generated adversarial prompts to exploit vulnerabilities in the other. Four attack methods were employed - direct injection, role

medium relevance attack
