Paper 2603.03332v2

Fragile Thoughts: How Large Language Models Handle Chain-of-Thought Perturbations

Chain-of-Thought (CoT) prompting has emerged as a foundational technique for eliciting reasoning from Large Language Models (LLMs), yet the robustness of this approach to corruptions in intermediate reasoning …

medium relevance · survey
Paper 2601.13300v1

OI-Bench: An Option Injection Benchmark for Evaluating LLM Susceptibility to Directive Interference

… signals such as social cues, framing, and instructions. In this work, we introduce option injection, a benchmarking approach that augments the multiple-choice question answering (MCQA) interface with an additional …

high relevance · benchmark
Paper 2511.03675v1

Whisper Leak: a side-channel attack on Large Language Models

… paramount. This paper introduces Whisper Leak, a side-channel attack that infers user prompt topics from encrypted LLM traffic by analyzing packet size and timing patterns in streaming responses. Despite …

high relevance · attack
Paper 2510.00490v1

Has the Two-Decade-Old Prophecy Come True? Artificial Bad Intelligence Triggered by Merely a Single-Bit Flip in Large Language Models

Recently, Bit-Flip Attack (BFA) has garnered widespread attention for …

medium relevance · attack