Search: model poisoning | AI Threat Alert

Paper 2512.06172v1

2025-12-05

DEFEND: Poisoned Model Detection and Malicious Client Exclusion Mechanism for Secure Federated Learning-based Road Condition Classification

includes a poisoned model detection strategy that leverages neuron-wise magnitude analysis for attack goal identification and Gaussian Mixture Model (GMM)-based clustering. DEFEND discards poisoned model contributions in each

medium relevance defense

Paper 2606.18697v1

2026-06-17

Stealthy World Model Manipulation via Data Poisoning

three stages of the poisoning pipeline: pre-training detection of poisoned transitions, robust training during fine-tuning, and test-time monitoring of the resulting world model. Across diverse continuous-control

medium relevance attack

Paper 2605.27631v1

2026-05-26

Poison with Style: A Practical Poisoning Attack on Code Large Language Models

tasks. In this paper, we present Poison-with-Style (PwS), a practical and stealthy model poisoning attack targeting CLLMs. Unlike prior attacks that assume an active adversary capable of directly

high relevance attack

Paper 2605.22506v1

2026-05-21

EnCAgg: Enhanced Clustering Aggregation for Robust Federated Learning against Dynamic Model Poisoning

Federated learning faces increasing threats from model poisoning attacks, which undermine its use for improving privacy. Existing defense methods typically rely on fixed thresholds or perform clustering with a fixed

medium relevance attack

Paper 2510.05159v3

2025-10-03

Malice in Agentland: Down the Rabbit Hole of Backdoors in the AI Supply Chain

layers of the supply chain: direct poisoning of finetuning data, pre-backdoored base models, and environment poisoning, a novel attack vector that exploits vulnerabilities specific to agentic training pipelines. Evaluated

medium relevance benchmark

Paper 2606.04929v1

2026-06-03

Sequential Data Poisoning in LLM Post-Training

propose the threat model of sequential data poisoning, where multiple adversaries separately poison the SFT and preference datasets. Under this threat model, we identify the single-attacker illusion: each adversary

medium relevance attack

Paper 2606.09548v1

2026-06-08

Model Poisoning Against Federated Model Adaptation with Chain of Bit-Flips

surface. In the context of federated model adaptation, we introduce a novel category of backdoor attack against FL systems that relies on model poisoning based on hardware-fault attacks. More

medium relevance attack

Paper 2511.07176v2

2025-11-10

Graph Representation-based Model Poisoning on the Heterogeneous Internet of Agents

without centralizing local datasets. However, the FFT-enabled IoA systems remain vulnerable to model poisoning attacks, where adversaries can upload malicious updates to the server to degrade the performance

medium relevance attack

Paper 2601.19061v2

2026-01-27

Thought-Transfer: Indirect Targeted Poisoning Attacks on Chain-of-Thought Reasoning Models

resulting in a form of "clean label" poisoning. Unlike prior targeted poisoning attacks that explicitly require target task samples in the poisoned data, we demonstrate that thought-transfer achieves

high relevance attack

Paper 2604.14444v1

2026-04-15

Robustness Analysis of Machine Learning Models for IoT Intrusion Detection Under Data Poisoning Attacks

model training pipelines. This study evaluates the susceptibility of four widely used classifiers, Random Forest, Gradient Boosting Machine, Logistic Regression, and Deep Neural Network models, against multiple poisoning strategies using

high relevance attack

Paper 2510.22274v1

2025-10-25

SecureLearn -- An Attack-agnostic Defense for Multiclass Machine Learning Against Data Poisoning Attacks

these models are significant in developing multi-modal applications. Therefore, this paper proposes SecureLearn, a two-layer attack-agnostic defense to defend multiclass models from poisoning attacks. It comprises

high relevance attack

Paper 2602.16480v1

2026-02-18

SRFed: Mitigating Poisoning Attacks in Privacy-Preserving Federated Learning with Heterogeneous Data

develop a privacy-preserving defensive model aggregation mechanism based on DEFE. This mechanism filters poisonous models under Non-IID data by layer-wise projection and clustering-based analysis. Theoretical analysis

high relevance attack

Paper 2605.02202v1

2026-05-04

CBV: Clean-label Backdoor Attacks on Vision Language Models via Diffusion Models

propose the Clean-Label Backdoor Attack on VLMs via Diffusion Models (CBV), which leverages diffusion models to generate natural poisoned examples via score matching. Specifically, CBV modifies the score during

high relevance attack

Paper 2603.19101v1

2026-03-19

FedTrident: Resilient Road Condition Classification Against Poisoning Attacks in Federated Learning

necessary attack-free levels in various attack scenarios due to: 1) not tailoring poisoned local model detection to TLFAs, 2) not excluding malicious vehicular clients based on historical behavior

high relevance attack

Paper 2603.28673v1

2026-03-30

FL-PBM: Pre-Training Backdoor Mitigation for Federated Learning

threat to the integrity and reliability of Artificial Intelligence (AI) models, enabling adversaries to manipulate model behavior by injecting poisoned data with hidden triggers. These attacks can lead to severe

medium relevance defense

Paper 2602.02147v1

2026-02-02

HPE: Hallucinated Positive Entanglement for Backdoor Attacks in Federated Self-Supervised Learning

backdoor samples in the representation space. Finally, selective parameter poisoning and proximity-aware updates constrain the poisoned model within the vicinity of the global model, enhancing its stability and persistence

high relevance attack

Paper 2511.19248v1

2025-11-24

FedPoisonTTP: A Threat Model and Poisoning Attack for Federated Test-Time Personalization

data poisoning in the federated adaptation setting. FedPoisonTTP distills a surrogate model from adversarial queries, synthesizes in-distribution poisons using feature-consistency, and optimizes attack objectives to generate high-entropy

high relevance attack

Paper 2601.14687v1

2026-01-21

Beyond Denial-of-Service: The Puppeteer's Attack for Fine-Grained Control in Ranking-Based Federated Learning

Descending Edges to align the global model with the target model, and (ii) widening the selection boundary gap to stabilize the global model at the target accuracy. Extensive experiments across

high relevance attack

Paper 2510.04503v2

2025-10-06

P2P: A Poison-to-Poison Remedy for Reliable Backdoor Defense in LLMs

Poison-to-Poison (P2P), a general and effective backdoor defense algorithm. P2P injects benign triggers with safe alternative labels into a subset of training samples and fine-tunes the model

medium relevance defense

Paper 2604.27238v1

2026-04-29

SafeTune: Mitigating Data Poisoning in LLM Fine-Tuning for RTL Code Generation

datasets frequently lack security verification and are highly susceptible to data poisoning attacks. Such poisoning can cause models to generate syntactically valid but insecure hardware modules that bypass standard functionality

medium relevance attack
