Search: model poisoning | AI Threat Alert

Paper 2511.12423v1

2025-11-16

GRAPHTEXTACK: A Realistic Black-Box Node Injection Attack on LLM-Enhanced GNNs

Recent work integrates Large Language Models (LLMs) with Graph Neural Networks (GNNs) to jointly model semantics and structure, resulting in more general and expressive models that achieve state

high relevance attack

Paper 2606.16242v1

2026-06-15

Rapid Poison: Practical Poisoning Attacks Against the Rapid Response Framework

injection can infiltrate this pipeline to deliver poisoned samples into the classifier's training set, enabling two attack objectives: (I) targeted poisoning attacks that create false positives on harmless samples

high relevance tool

Paper 2512.14448v1

2025-12-16

Reasoning-Style Poisoning of LLM Agents via Stealthy Style Transfer: Process-Level Attacks and Runtime Monitoring in RSV Space

Large Language Model (LLM) agents relying on external retrieval are increasingly deployed in high-stakes environments. While existing adversarial attacks primarily focus on content falsification or instruction injection, we identify

high relevance attack

Paper 2603.21642v1

2026-03-23

Are AI-assisted Development Tools Immune to Prompt Injection?

development tools built on the Model Context Protocol (MCP). However, their convenience comes with security risks, especially prompt-injection attacks delivered via tool-poisoning vectors. While prior research has studied

high relevance tool

Paper 2510.13842v1

2025-10-11

ADMIT: Few-shot Knowledge Poisoning Attacks on RAG-based Fact Checking

Knowledge poisoning poses a critical threat to Retrieval-Augmented Generation (RAG) systems by injecting adversarial content into knowledge bases, tricking Large Language Models (LLMs) into producing attacker-controlled outputs grounded

high relevance attack

Paper 2510.12143v1

2025-10-14

Fairness-Constrained Optimization Attack in Federated Learning

demographics. FL enables model sharing, while restricting the movement of data. Since FL provides participants with independence over their training data, it becomes susceptible to poisoning attacks. Such collaboration also

high relevance attack

Paper 2602.22427v2

2026-02-25

Adversarial Hubness Detector: Detecting Hubness Poisoning in Retrieval-Augmented Generation Systems

Retrieval-Augmented Generation (RAG) systems are essential to contemporary AI

medium relevance attack

Paper 2511.14074v1

2025-11-18

Dynamic Black-box Backdoor Attacks on IoT Sensory Data

measurements can be fed to a machine learning-based model to train and classify human activities. While deep learning-based models have proven successful in classifying human activity and gestures

high relevance attack

Paper 2606.22700v1

2026-06-21

SCRUB-FL: Sanitizing and Cleansing Representations via Unlearning of Backdoors

data to manipulate model predictions. Existing defenses mainly operate before and during aggregation and cannot fully eliminate backdoor behaviors that persist in the converged global model. Moreover, the effectiveness

medium relevance benchmark

Paper 2601.05293v1

2026-01-08

A Survey of Agentic AI and Cybersecurity: Challenges, Opportunities and Use-case Prototypes

survey emerging threat models, security frameworks, and evaluation pipelines tailored to agentic systems, and analyze systemic risks including agent collusion, cascading failures, oversight evasion, and memory poisoning. Finally, we present

medium relevance survey

Paper 2509.26584v1

2025-09-30

Fairness Testing in Retrieval-Augmented Generation: How Small Perturbations Reveal Bias in Small Language Models

Large Language Models (LLMs) are widely used across multiple domains but continue to raise concerns regarding security and fairness. Beyond known attack vectors such as data poisoning and prompt injection

medium relevance benchmark

Paper 2603.00172v1

2026-02-26

Hidden in the Metadata: Stealth Poisoning Attacks on Multimodal Retrieval-Augmented Generation

augmented generation (RAG) has emerged as a powerful paradigm for enhancing multimodal large language models by grounding their responses in external, factual knowledge and thus mitigating hallucinations. However, the integration

high relevance attack

Paper 2511.17671v1

2025-11-21

MURMUR: Using cross-user chatter to break collaborative language agents in groups

today's language models lack a mechanism for isolating user interactions and concurrent tasks, creating a new attack vector inherent to this new setting: cross-user poisoning

medium relevance attack

Paper 2606.10322v1

2026-06-09

Game-Theoretic Multi-Agent Control for Robust Contextual Reasoning in LLMs

Large Language Models (LLMs) in multi-turn interactions maintain evolving context rather than generating isolated responses, making them vulnerable to prompt-injection and context-poisoning attacks in which locally plausible

medium relevance benchmark

Paper 2603.20357v1

2026-03-20

Memory poisoning and secure multi-agent systems

Memory poisoning attacks on agentic AI and multi-agent systems (MAS) have recently attracted attention, partly because Large Language Models (LLMs) facilitate the construction

medium relevance attack

Paper 2605.29960v1

2026-05-28

Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction

Large language model (LLM) agents increasingly leverage long-term memory to support persistent and autonomous task execution. However, this capability also introduces a new attack surface: memory poisoning, where adversaries

high relevance attack

Paper 2509.24408v2

2025-09-29

FuncPoison: Poisoning Function Library to Hijack Multi-agent Autonomous Driving Systems

Autonomous driving systems increasingly rely on multi-agent architectures powered by large language models (LLMs), where specialized agents collaborate to perceive, reason, and plan. A key component of these systems

medium relevance attack

Paper 2510.00586v2

2025-10-01

Eyes-on-Me: Scalable RAG Poisoning through Transferable Attention-Steering Attractors

data poisoning and show that modular, reusable components pose a practical threat to modern AI systems. They also reveal a strong link between attention concentration and model outputs, informing interpretability

medium relevance attack

Paper 2601.13112v1

2026-01-19

CODE: A Contradiction-Based Deliberation Extension Framework for Overthinking Attacks on Retrieval-Augmented Generation

multi-step self-verification. However, recent studies have shown that reasoning models suffer from overthinking attacks, where models are tricked into generating an unnecessarily high number of reasoning tokens. In this

high relevance attack

Paper 2605.25073v1

2026-05-24

Security in the Fine-Tuning Lifecycle of Large Language Models: Threats, Defenses, Evaluation, and Future Directions

Language Models (LLMs) to downstream tasks, but its reliance on training data, parameter updates, and reusable components opens entry points for attackers. Threats have evolved from data poisoning and weight

medium relevance benchmark
