Search: model poisoning | AI Threat Intelligence


Paper 2512.06172v1

2025-12-05

DEFEND: Poisoned Model Detection and Malicious Client Exclusion Mechanism for Secure Federated Learning-based Road Condition Classification

includes a poisoned model detection strategy that leverages neuron-wise magnitude analysis for attack goal identification and Gaussian Mixture Model (GMM)-based clustering. DEFEND discards poisoned model contributions in each

medium relevance defense

Paper 2510.05159v3

2025-10-03

Malice in Agentland: Down the Rabbit Hole of Backdoors in the AI Supply Chain

layers of the supply chain: direct poisoning of finetuning data, pre-backdoored base models, and environment poisoning, a novel attack vector that exploits vulnerabilities specific to agentic training pipelines. Evaluated

medium relevance benchmark

Paper 2511.07176v2

2025-11-10

Graph Representation-based Model Poisoning on the Heterogeneous Internet of Agents

without centralizing local datasets. However, the FFT-enabled IoA systems remain vulnerable to model poisoning attacks, where adversaries can upload malicious updates to the server to degrade the performance

medium relevance attack

Paper 2601.19061v2

2026-01-27

Thought-Transfer: Indirect Targeted Poisoning Attacks on Chain-of-Thought Reasoning Models

resulting in a form of "clean label" poisoning. Unlike prior targeted poisoning attacks that explicitly require target task samples in the poisoned data, we demonstrate that thought-transfer achieves

high relevance attack

Paper 2510.22274v1

2025-10-25

SecureLearn -- An Attack-agnostic Defense for Multiclass Machine Learning Against Data Poisoning Attacks

these models are significant in developing multi-modal applications. Therefore, this paper proposes SecureLearn, a two-layer attack-agnostic defense to defend multiclass models from poisoning attacks. It comprises

high relevance attack

Paper 2602.16480v1

2026-02-18

SRFed: Mitigating Poisoning Attacks in Privacy-Preserving Federated Learning with Heterogeneous Data

develop a privacy-preserving defensive model aggregation mechanism based on DEFE. This mechanism filters poisonous models under Non-IID data by layer-wise projection and clustering-based analysis. Theoretical analysis

high relevance attack

Paper 2603.19101v1

2026-03-19

FedTrident: Resilient Road Condition Classification Against Poisoning Attacks in Federated Learning

necessary attack-free levels in various attack scenarios due to: 1) not tailoring poisoned local model detection to TLFAs, 2) not excluding malicious vehicular clients based on historical behavior

high relevance attack

Paper 2602.02147v1

2026-02-02

HPE: Hallucinated Positive Entanglement for Backdoor Attacks in Federated Self-Supervised Learning

backdoor samples in the representation space. Finally, selective parameter poisoning and proximity-aware updates constrain the poisoned model within the vicinity of the global model, enhancing its stability and persistence

high relevance attack

Paper 2511.19248v1

2025-11-24

FedPoisonTTP: A Threat Model and Poisoning Attack for Federated Test-Time Personalization

data poisoning in the federated adaptation setting. FedPoisonTTP distills a surrogate model from adversarial queries, synthesizes in-distribution poisons using feature-consistency, and optimizes attack objectives to generate high-entropy

high relevance attack

Paper 2601.14687v1

2026-01-21

Beyond Denial-of-Service: The Puppeteer's Attack for Fine-Grained Control in Ranking-Based Federated Learning

Descending Edges to align the global model with the target model, and (ii) widening the selection boundary gap to stabilize the global model at the target accuracy. Extensive experiments across

high relevance attack

Paper 2510.04503v2

2025-10-06

P2P: A Poison-to-Poison Remedy for Reliable Backdoor Defense in LLMs

Poison-to-Poison (P2P), a general and effective backdoor defense algorithm. P2P injects benign triggers with safe alternative labels into a subset of training samples and fine-tunes the model

medium relevance defense

Paper 2601.01053v1

2026-01-03

Byzantine-Robust Federated Learning Framework with Post-Quantum Secure Aggregation for Real-Time Threat Intelligence Sharing in Critical IoT Infrastructure

security suffer from two critical vulnerabilities: susceptibility to Byzantine attacks where malicious participants poison model updates, and inadequacy against future quantum computing threats that can compromise cryptographic aggregation protocols. This

medium relevance benchmark

Paper 2511.02600v1

2025-11-04

On The Dangers of Poisoned LLMs In Security Automation

tuned Llama3.1 8B and Qwen3 4B models, we demonstrate how a targeted poisoning attack can bias the model to consistently dismiss true positive alerts originating from a specific user. Additionally

medium relevance benchmark

Paper 2603.03865v1

2026-03-04

Structure-Aware Distributed Backdoor Attacks in Federated Learning

across different model architectures. This assumption overlooks the impact of model structure on perturbation effectiveness. From a structure-aware perspective, this paper analyzes the coupling relationship between model architectures

high relevance attack

Paper 2603.12206v1

2026-03-12

CLASP: Defending Hybrid Large Language Models Against Hidden State Poisoning Attacks

State space models (SSMs) like Mamba have gained significant traction as efficient alternatives to Transformers, achieving linear complexity while maintaining competitive performance. However, Hidden State Poisoning Attacks (HiSPAs), a recently

high relevance attack

Paper 2510.26829v3

2025-10-29

Layer of Truth: Probing Belief Shifts under Continual Pre-Training Poisoning

track how internal preferences between competing facts evolve across checkpoints, layers, and model scales. Even moderate poisoning (50-100%) flips over 55% of responses from correct to counterfactual while leaving

low relevance attack

Paper 2603.17174v1

2026-03-17

Detecting Data Poisoning in Code Generation LLMs via Black-Box, Vulnerability-Oriented Scanning

generation large language models (LLMs) are increasingly integrated into modern software development workflows. Recent work has shown that these models are vulnerable to backdoor and poisoning attacks that induce

high relevance attack

Paper 2512.06556v1

2025-12-06

Securing the Model Context Protocol: Defending LLMs Against Tool Poisoning and Adversarial Attacks

Model Context Protocol (MCP) enables Large Language Models to integrate external tools through structured descriptors, increasing autonomy in decision-making, task execution, and multi-agent workflows. However, this autonomy creates

high relevance tool

Paper 2510.19145v4

2025-10-22

HAMLOCK: HArdware-Model LOgically Combined attacK

networks (DNNs) introduces new security vulnerabilities. Conventional model-level backdoor attacks, which only poison a model's weights to misclassify inputs with a specific trigger, are often detectable because

high relevance attack

Paper 2601.01972v3

2026-01-05

Hidden State Poisoning Attacks against Mamba-based Language Models

their hidden states, referred to as a Hidden State Poisoning Attack (HiSPA). Our benchmark RoBench25 allows evaluating a model's information retrieval capabilities when subject to HiSPAs, and confirms

high relevance attack
