Paper 2605.02812v1

Autonomous LLM Agent Worms: Cross-Platform Propagation, Automated Discovery and Temporal Re-Entry Defense

graph analyzer, traces data flow from file I/O to LLM context injection points and ranks carriers by context injection position without manual analysis. SRPO, our summary-resilient payload optimizer, generates

medium relevance tool
Paper 2606.18310v1

Conflict-Aware Retriever Editing for Knowledge Injection Attacks on LLM-Based RAG Systems

mislead downstream generation, posing a serious security threat for AI applications. Existing RAG injection attacks mainly rely on manipulating external knowledge bases, such as crafting malicious corpora. However, the synthetic

high relevance tool
Paper 2602.17837v1

TFL: Targeted Bit-Flip Attack on Large Language Model

safety and security critical applications, raising concerns about their robustness to model parameter fault injection attacks. Recent studies have shown that bit-flip attacks (BFAs), which exploit computer main memory

high relevance attack
Paper 2606.24402v1

Poisoned Playbooks: Demystifying Knowledge Poisoning Effects on AI Security Agents

challenges and AI agents. First, we demonstrate how a single crafted poisoned write-up injected into public-style security knowledge sources, which we denote as Poisoned Playbooks, alters the behavior

medium relevance attack
Paper 2603.03332v2

Fragile Thoughts: How Large Language Models Handle Chain-of-Thought Perturbations

Chain-of-Thought (CoT) prompting has emerged as a foundational technique for eliciting reasoning from Large Language Models (LLMs), yet the robustness of this approach to corruptions in intermediate reasoning

medium relevance survey
Paper 2601.13300v1

OI-Bench: An Option Injection Benchmark for Evaluating LLM Susceptibility to Directive Interference

signals such as social cues, framing, and instructions. In this work, we introduce option injection, a benchmarking approach that augments the multiple-choice question answering (MCQA) interface with an additional

high relevance benchmark
Paper 2605.30096v1

How Reliable Are AI Attackers Against a Fixed Vulnerable Target? A 400-Run Empirical Study of LLM Penetration Testing Consistency

Large language models (LLMs) can autonomously conduct multi-stage cyber

high relevance attack
Paper 2604.21860v1

Transient Turn Injection: Exposing Stateless Multi-Turn Vulnerabilities in Large Language Models

workflows, raising the stakes for adversarial robustness and safety. This paper introduces Transient Turn Injection (TTI), a new multi-turn attack technique that systematically exploits stateless moderation by distributing adversarial

high relevance attack
Paper 2604.21131v1

Cross-Session Threats in AI Agents: Benchmark, Evaluation, and Algorithms

attack taxonomies classified by kill-chain stage and cross-session operation (accumulate, compose, launder, inject_on_reader), each bound to one of seven identity anchors that ground-truth "violation

medium relevance benchmark
Paper 2605.17480v1

The Capability Paradox: How Smarter Auditors Make Multi-Agent Systems Less Secure

domain-specific narratives and propagated to a Manager through Worker reports, without any syntactic injection primitives. Across 42,000 adversarial trials over 12 Manager models and 7 Worker configurations

medium relevance benchmark
Paper 2511.03675v1

Whisper Leak: a side-channel attack on Large Language Models

paramount. This paper introduces Whisper Leak, a side-channel attack that infers user prompt topics from encrypted LLM traffic by analyzing packet size and timing patterns in streaming responses. Despite

high relevance attack
Paper 2510.00490v1

Has the Two-Decade-Old Prophecy Come True? Artificial Bad Intelligence Triggered by Merely a Single-Bit Flip in Large Language Models

Recently, Bit-Flip Attack (BFA) has garnered widespread attention for

medium relevance attack