Search: model poisoning | AI Threat Alert

Paper 2605.29960v1

2026-05-28

Hijacking Agent Memory: Stealthy Trojan Attacks Through Conversational Interaction

Large language model (LLM) agents increasingly leverage long term memory to support persistent and autonomous task execution. However, this capability also introduces a new attack surface: memory poisoning, where adversaries

high relevance attack

Paper 2509.24408v2

2025-09-29

FuncPoison: Poisoning Function Library to Hijack Multi-agent Autonomous Driving Systems

Autonomous driving systems increasingly rely on multi-agent architectures powered by large language models (LLMs), where specialized agents collaborate to perceive, reason, and plan. A key component of these systems

medium relevance attack

Paper 2510.00586v2

2025-10-01

Eyes-on-Me: Scalable RAG Poisoning through Transferable Attention-Steering Attractors

data poisoning and show that modular, reusable components pose a practical threat to modern AI systems. They also reveal a strong link between attention concentration and model outputs, informing interpretability

medium relevance attack

Paper 2601.13112v1

2026-01-19

CODE: A Contradiction-Based Deliberation Extension Framework for Overthinking Attacks on Retrieval-Augmented Generation

multi-step self-verification. However, recent studies have shown that reasoning models suffer from overthinking attacks, where models are tricked into generating an unnecessarily high number of reasoning tokens. In this

high relevance attack

Paper 2605.25073v1

2026-05-24

Security in the Fine-Tuning Lifecycle of Large Language Models: Threats, Defenses, Evaluation, and Future Directions

Language Models (LLMs) to downstream tasks, but its reliance on training data, parameter updates, and reusable components opens entry points for attackers. Threats have evolved from data poisoning and weight

medium relevance benchmark

Paper 2604.20932v1

2026-04-22

Adaptive Defense Orchestration for RAG: A Sentinel-Strategist Architecture against Multi-Vector Attacks

poisoning, the strongest ADO variants reduce attack success to near zero while restoring contextual recall to more than 75% of the undefended baseline, although robustness remains sensitive to model choice

high relevance attack

Paper 2512.13207v2

2025-12-15

Evaluating Adversarial Attacks on Federated Learning for Temperature Forecasting

high-resolution spatiotemporal forecasts that can surpass traditional numerical models, while FL allows institutions in different locations to collaboratively train models without sharing raw data, addressing efficiency and security concerns

high relevance attack

Paper 2605.30189v1

2026-05-28

Token-Level Generalization in LoRA Adapter Backdoors: Attack Characterization and Behavioral Detection

reliably backdoored through training data poisoning while preserving baseline task performance. On a Qwen 2.5 1.5B prompt-injection classifier, a small fraction of poisoned examples drives a clean-accuracy

high relevance attack

Paper 2601.14054v2

2026-01-20

SecureSplit: Mitigating Backdoor Attacks in Split Learning

trained model. To address this vulnerability, we introduce SecureSplit, a defense mechanism tailored to SL. SecureSplit applies a dimensionality transformation strategy to accentuate subtle differences between benign and poisoned embeddings

high relevance attack

Paper 2605.28074v1

2026-05-27

SilentRetrieval: Hijacking Retrieval-Augmented Generation via Semantically-Preserving Adversarial Data Poisoning

perplexity. Cross-model evaluation across four target LLMs shows nontrivial effectiveness under a fixed trigger generator, and transfer tests against unseen retrievers, including ColBERT and commercial embedding models, yield

medium relevance attack

Paper 2601.15474v1

2026-01-21

Multi-Targeted Graph Backdoor Attack

based attack. Our analysis on four GNN models confirms the generalization capability of our attack which is effective regardless of the GNN model architectures and training parameters settings. We further

high relevance attack

Paper 2602.08446v1

2026-02-09

RIFLE: Robust Distillation-based FL for Deep Model Deployment on Resource-Constrained IoT Networks

TinyML models, collaboratively train global models by sharing gradients with a central server while preserving data privacy. However, as data heterogeneity and task complexity increase, TinyML models often become insufficient

medium relevance benchmark

CVE HIGH CVE-2026-44552

2026-05-08

Open WebUI: Redis Cache Keys tool_servers and terminal_servers

CVSS 8.7 open-webui

Paper 2511.01268v1

2025-11-03

Rescuing the Unpoisoned: Efficient Defense against Knowledge Corruption Attacks on RAG Systems

poisoning) attacks in practical RAG deployments. RAGDefender operates during the post-retrieval phase, leveraging lightweight machine learning techniques to detect and filter out adversarial content without requiring additional model training

high relevance tool

Paper 2606.17223v1

2026-06-15

Safety, Security, and Cognitive Risks in Neuro-Symbolic AI

threat model extending MITRE ATLAS with 11 NeSy-specific tactic extensions and a five-profile attacker taxonomy; (3) a symbolic-layer threat catalogue covering knowledge graph (KG) poisoning, ontology-merging

medium relevance defense

Paper 2605.03213v1

2026-05-04

When Agents Handle Secrets: A Survey of Confidential Computing for Agentic AI

model inference. Agents accumulate sensitive context, hold credentials, and operate across pipelines no single party fully controls, enabling prompt injection, context exfiltration, credential theft, and inter-agent message poisoning. Current

medium relevance survey

Paper 2510.13893v2

2025-10-14

Guarding the Guardrails: A Taxonomy-Driven Approach to Jailbreak Detection

poisoning. Second, we analyzed the data collected from the challenge to examine the prevalence and success rates of different attack types, providing insights into how specific jailbreak strategies exploit model

high relevance survey

Paper 2510.14381v2

2025-10-16

Are My Optimized Prompts Compromised? Exploring Vulnerabilities of LLM-based Optimizers

systematic analysis of poisoning risks in LLM-based prompt optimization. Using HarmBench, we find systems are substantially more vulnerable to manipulated feedback than to query poisoning alone: feedback-based attacks

medium relevance attack

Paper 2509.21011v1

2025-09-25

Automatic Red Teaming LLM-based Agents with Model Context Protocol Tools

large language models (LLMs) has led to the wide application of LLM-based agents in various domains. To standardize interactions between LLM-based agents and their environments, model context protocol

high relevance tool

Paper 2512.10637v2

2025-12-11

Adaptive Intrusion Detection System Leveraging Dynamic Neural Models with Adversarial Learning for 5G/6G Networks

network security by providing robust, real-time threat detection and response capabilities. Unlike conventional models, which require costly retraining to update knowledge, the proposed framework integrates incremental learning algorithms, reducing

medium relevance attack
