RPP: A Certified Poisoned-Sample Detection Framework for Backdoor Attacks under Dataset Imbalance
propose Randomized Probability Perturbation (RPP), a certified poisoned-sample detection framework that operates in a black-box setting using only model output probabilities. For any inspected sample, RPP determines whether
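The excerpt gives only the setting (black-box detection from output probabilities), not the certified procedure. As a loose illustration of the general idea of testing a prediction's stability under random perturbation of the probability vector, and not RPP's actual algorithm, a minimal sketch (the function name, noise scale, and threshold are hypothetical):

```python
import numpy as np

def detect_by_perturbation(probs, n_trials=1000, sigma=0.05, tau=0.9, seed=0):
    """Flag a sample whose top-class decision is unstable under random
    perturbation of the model's output probability vector.

    probs : 1-D array of class probabilities returned by the black-box model.
    Returns (is_suspicious, stability), where stability is the fraction of
    trials in which the perturbed argmax matches the original argmax.
    """
    rng = np.random.default_rng(seed)
    probs = np.asarray(probs, dtype=float)
    base = int(np.argmax(probs))
    # Perturb the probability vector with Gaussian noise and renormalize.
    noise = rng.normal(0.0, sigma, size=(n_trials, probs.size))
    perturbed = np.clip(probs + noise, 1e-12, None)
    perturbed /= perturbed.sum(axis=1, keepdims=True)
    stability = float(np.mean(perturbed.argmax(axis=1) == base))
    return stability < tau, stability

# A confident, well-separated prediction stays stable; a near-tie flips often.
print(detect_by_perturbation([0.85, 0.10, 0.05]))
print(detect_by_perturbation([0.40, 0.38, 0.22]))
```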
RAGRank: Using PageRank to Counter Poisoning in CTI LLM Pipelines
dominant architectural pattern to operationalize Large Language Model (LLM) usage in Cyber Threat Intelligence (CTI) systems. However, this design is susceptible to poisoning attacks, and previously proposed defenses can fail
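The excerpt names PageRank as the countermeasure but not how the document graph is built or how scores are used. A minimal sketch under assumed conventions (a corroboration graph over CTI documents, with retrieved candidates filtered by authority score; the function name and threshold are illustrative, not the paper's design):

```python
import networkx as nx

def trusted_subset(edges, retrieved_ids, keep_fraction=0.8):
    """Rank candidate CTI documents by PageRank over a corroboration graph
    and keep only the best-supported ones for the LLM context.

    edges         : iterable of (citing_doc, cited_doc) pairs.
    retrieved_ids : document ids returned by the retriever for a query.
    """
    g = nx.DiGraph()
    g.add_edges_from(edges)
    scores = nx.pagerank(g, alpha=0.85)  # authority of each document
    ranked = sorted(retrieved_ids,
                    key=lambda d: scores.get(d, 0.0), reverse=True)
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]

# A freshly injected document with no corroborating links scores near zero
# and is dropped before generation.
edges = [("report_a", "feed_x"), ("report_b", "feed_x"), ("report_b", "report_a")]
print(trusted_subset(edges, ["feed_x", "report_a", "poisoned_doc"], keep_fraction=0.67))
```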
GShield: Mitigating Poisoning Attacks in Federated Learning
Learning models. In particular, it enables decentralized model training while preserving data privacy, but its distributed nature makes it highly vulnerable to a severe attack known as Data Poisoning
MCP-ITP: An Automated Framework for Implicit Tool Poisoning in MCP
environments, the Model Context Protocol (MCP) was proposed and has since been widely adopted. However, integrating external tools expands the attack surface, exposing agents to tool poisoning attacks. In such
TokenSwap: Backdoor Attack on the Compositional Understanding of Large Vision-Language Models
corresponding textual answers. However, the poisoned samples exhibit only subtle differences from the original ones, making it challenging for the model to learn the backdoor behavior. To address this, TokenSwap
IU: Imperceptible Universal Backdoor Attack
simultaneously controls all target classes with minimal poisoning while preserving stealth. Our key idea is to leverage graph convolutional networks (GCNs) to model inter-class relationships and generate class-specific
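The excerpt stops before describing how the class-specific triggers are generated, so the sketch below covers only the stated first half: a plain two-layer GCN over an assumed class-adjacency matrix (e.g., from label co-occurrence or embedding similarity) that produces one embedding per class, which could then condition a trigger generator. All names and dimensions are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ClassRelationGCN(nn.Module):
    """Two-layer GCN that turns a class-class adjacency matrix into one
    embedding per class; the embeddings could condition a trigger generator."""
    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hid_dim, bias=False)
        self.w2 = nn.Linear(hid_dim, out_dim, bias=False)

    @staticmethod
    def normalize(adj):
        # Symmetric normalization: D^{-1/2} (A + I) D^{-1/2}
        a_hat = adj + torch.eye(adj.size(0))
        d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
        return d_inv_sqrt.unsqueeze(1) * a_hat * d_inv_sqrt.unsqueeze(0)

    def forward(self, adj, feats):
        a = self.normalize(adj)
        h = torch.relu(a @ self.w1(feats))
        return a @ self.w2(h)  # shape: (num_classes, out_dim)

# Toy run: 10 classes, random adjacency standing in for real class similarity.
adj = (torch.rand(10, 10) > 0.7).float()
adj = ((adj + adj.t()) > 0).float()
feats = torch.eye(10)  # one-hot class features
gcn = ClassRelationGCN(in_dim=10, hid_dim=32, out_dim=16)
print(gcn(adj, feats).shape)  # torch.Size([10, 16])
```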
DropVLA: An Action-Level Backdoor Attack on Vision-Language-Action Models
tuning. On OpenVLA-7B evaluated with LIBERO, vision-only poisoning achieves 98.67%-99.83% attack success rate (ASR) with only 0.31% poisoned episodes while preserving 98.50%-99.17% clean-task retention
ToxicTextCLIP: Text-Based Poisoning and Backdoor Attacks on CLIP Pre-training
Contrastive Language-Image Pretraining (CLIP) model has significantly advanced vision-language modeling by aligning image-text pairs from large-scale web data through self-supervised contrastive learning. Yet, its reliance
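For context on the alignment mechanism the excerpt refers to, and which text-based poisoning exploits, a minimal sketch of the standard symmetric InfoNCE objective used in CLIP-style training; this shows the loss itself, not the ToxicTextCLIP attack, and the embeddings are random stand-ins for encoder outputs:

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings.
    Poisoning a caption shifts which image its text embedding is pulled toward."""
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    logits = img @ txt.t() / temperature     # (batch, batch) similarity matrix
    targets = torch.arange(img.size(0))      # i-th image matches i-th caption
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2

# Stand-in encoder outputs for a batch of 8 web-scraped image-text pairs.
img_emb, txt_emb = torch.randn(8, 512), torch.randn(8, 512)
print(clip_contrastive_loss(img_emb, txt_emb).item())
```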
KEPo: Knowledge Evolution Poison on Graph-based Retrieval-Augmented Generation
timeliness and accuracy of Large Language Model (LLM) generations. However, this reliance on external data introduces new attack surfaces. Attackers can inject poisoned texts into databases to manipulate LLMs into
MIRAGE: Misleading Retrieval-Augmented Generation via Black-box and Query-agnostic Poisoning Attacks
proposing MIRAGE, a novel multi-stage poisoning pipeline designed for strict black-box and query-agnostic environments. Operating on surrogate model feedback, MIRAGE functions as an automated optimization framework that
SPOILER: TEE-Shielded DNN Partitioning of On-Device Secure Inference with Poison Learning
Deploying deep neural networks (DNNs) on edge devices exposes valuable intellectual property to model-stealing attacks. While TEE-shielded DNN partitioning (TSDP) mitigates this by isolating sensitive computations, existing paradigms
CryptoGuard: Lightweight Hybrid Detection and Response to Host-based Cryptojackers in Linux Cloud Environments
phase process, leveraging deep learning models to identify suspicious activity with high precision. To counter evasion techniques such as entry point poisoning and PID manipulation, CryptoGuard integrates targeted remediation mechanisms
DF-LoGiT: Data-Free Logic-Gated Backdoor Attacks in Vision Transformers
backdoor attacks largely rely on poisoned-data training, while prior data-free attempts typically require synthetic-data fine-tuning or extra model components. This paper introduces Data-Free Logic-Gated
When Backdoors Go Beyond Triggers: Semantic Drift in Diffusion Models Under Encoder Attacks
attacks on text-to-image (T2I) models primarily measure trigger activation and visual fidelity. We challenge this paradigm, demonstrating that encoder-side poisoning induces persistent, trigger-free semantic corruption that
Robustness in Text-Attributed Graph Learning: Insights, Trade-offs, and New Defenses
based, and hybrid perturbations in both poisoning and evasion scenarios. Our extensive analysis reveals multiple findings, among which three are particularly noteworthy: 1) models have inherent robustness trade-offs between
RAG Security and Privacy: Formalizing the Threat Model and Attack Surface
knowledge, the first formal threat model for retrieval-augmented generation (RAG) systems. We introduce a structured taxonomy of adversary types based on their access to model components and data, and we formally
PEEL: A Poisoning-Exposing Encoding Theoretical Framework for Local Differential Privacy
widely adopted privacy-protection model in the Internet of Things (IoT) due to its lightweight, decentralized, and scalable nature. However, it is vulnerable to poisoning attacks, and existing defenses either
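As background on the LDP setting the excerpt describes, not PEEL's encoding, a minimal sketch of ε-LDP randomized response for a binary attribute, illustrating why clients who ignore the protocol and send crafted reports can skew the aggregate estimate (all numbers are illustrative):

```python
import math
import random

def randomized_response(bit, epsilon):
    """ε-LDP randomized response: report the true bit with probability
    e^ε / (e^ε + 1), otherwise flip it."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1)
    return bit if random.random() < p_truth else 1 - bit

def estimate_frequency(reports, epsilon):
    """Unbiased estimate of the true fraction of 1s from perturbed reports."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1)
    observed = sum(reports) / len(reports)
    return (observed - (1 - p)) / (2 * p - 1)

random.seed(0)
honest = [randomized_response(1, 1.0) for _ in range(9000)]  # honest clients hold bit 1
poisoned = [0] * 1000                                        # attackers always report 0
print(estimate_frequency(honest + poisoned, 1.0))            # pulled below the honest 0.9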
Revisiting Backdoor Threat in Federated Instruction Tuning from a Signal Aggregation Perspective
vulnerabilities from low-concentration poisoned data distributed across the datasets of benign clients. This scenario is increasingly common in federated instruction tuning for language models, which often rely on unverified
GRAPHTEXTACK: A Realistic Black-Box Node Injection Attack on LLM-Enhanced GNNs
Recent work integrates Large Language Models (LLMs) with Graph Neural Networks (GNNs) to jointly model semantics and structure, resulting in more general and expressive models that achieve state
Reasoning-Style Poisoning of LLM Agents via Stealthy Style Transfer: Process-Level Attacks and Runtime Monitoring in RSV Space
Large Language Model (LLM) agents relying on external retrieval are increasingly deployed in high-stakes environments. While existing adversarial attacks primarily focus on content falsification or instruction injection, we identify