Search: model poisoning | AI Threat Intelligence

Paper 2511.14989v2

2025-11-19

Critical Evaluation of Quantum Machine Learning for Adversarial Robustness

three threat models: black-box, gray-box, and white-box. We implement representative attacks in each category, including label-flipping for black-box, QUID encoder-level data poisoning for gray

medium relevance benchmark

Paper 2511.12936v1

2025-11-17

Privacy-Preserving Federated Learning from Partial Decryption Verifiable Threshold Multi-Client Functional Encryption

cooperate to train the model without directly exchanging their own private data, but the gradient leakage problem still threatens the privacy security and model integrity. Although the existing scheme uses

medium relevance benchmark

Paper 2512.09742v1

2025-12-10

Weird Generalization and Inductive Backdoors: New Ways to Corrupt LLMs

contexts can dramatically shift behavior outside those contexts. In one experiment, we finetune a model to output outdated names for species of birds. This causes it to behave

medium relevance benchmark

Paper 2603.02849v1

2026-03-03

DSBA: Dynamic Stealthy Backdoor Attack with Collaborative Optimization in Self-Supervised Learning

generalization capabilities, and its potential for privacy preservation. However, recent research reveals that SSL models are also vulnerable to backdoor attacks. Existing backdoor attack methods in the SSL context commonly

high relevance attack

Paper 2602.19555v1

2026-02-23

Agentic AI as a Cybersecurity Attack Surface: Threats, Exploits, and Defenses in Runtime Supply Chains

Agentic systems built on large language models (LLMs) extend beyond text generation to autonomously retrieve information and invoke tools. This runtime execution model shifts the attack surface from build-time

high relevance attack

Paper 2601.11207v1

2026-01-16

LoRA as Oracle

Existing defenses for backdoor detection and membership inference typically require access to clean reference models, extensive retraining, or strong assumptions about the attack mechanism. In this work, we introduce

medium relevance attack

Paper 2603.07835v1

2026-03-08

DistillGuard: Evaluating Defenses Against LLM Knowledge Distillation

Knowledge distillation from proprietary LLM APIs poses a growing threat to model providers, yet defenses against this attack remain fragmented and unevaluated. We present DistillGuard, a framework for systematically evaluating

medium relevance defense

Paper 2511.06212v1

2025-11-09

RAG-targeted Adversarial Attack on LLM-based Threat Detection and Mitigation Framework

Artificial Intelligence has become a valuable solution in securing IoT networks, with Large Language Models (LLMs) enabling automated attack behavior analysis and mitigation suggestion in Network Intrusion Detection Systems (NIDS

high relevance tool

Paper 2512.19297v1

2025-12-22

Causal-Guided Detoxify Backdoor Attack of Open-Weight LoRA Models

Backdoor Attack (CBA), a novel backdoor attack framework specifically designed for open-weight LoRA models. CBA operates without access to original training data and achieves high stealth through

high relevance attack

Paper 2603.03108v1

2026-03-03

RAIN: Secure and Robust Aggregation under Shuffle Model of Differential Privacy

achieving robustness under adversarial behavior remains challenging. Modern systems increasingly adopt the shuffle model of differential privacy (Shuffle-DP) to locally perturb client updates and globally anonymize them via shuffling

medium relevance benchmark

Paper 2512.13501v1

2025-12-15

Behavior-Aware and Generalizable Defense Against Black-Box Adversarial Attacks for ML-Based IDS

often fall short in practice. Most are tailored to specific attack types, require internal model access, or rely on static mechanisms that fail to generalize across evolving attack strategies. Furthermore

high relevance attack

Paper 2512.23307v1

2025-12-29

RobustMask: Certified Robustness against Adversarial Neural Ranking Attack via Randomized Masking

Neural ranking models have achieved remarkable progress and are now widely deployed in real-world applications such as Retrieval-Augmented Generation (RAG). However, like other neural architectures, they remain vulnerable

high relevance attack

Paper 2603.21654v1

2026-03-23

Towards Secure Retrieval-Augmented Generation: A Comprehensive Review of Threats, Defenses and Benchmarks

Augmented Generation (RAG) significantly mitigates the hallucinations and domain knowledge deficiency in large language models by incorporating external knowledge bases. However, the multi-module architecture of RAG introduces complex system

medium relevance survey

Paper 2602.02615v1

2026-02-02

TinyGuard: A lightweight Byzantine Defense for Resource-Constrained Federated Learning via Statistical Update Fingerprints

label poisoning. Against adaptive white-box adversaries, Pareto frontier analysis across four orders of magnitude confirms that attackers cannot simultaneously evade detection and achieve effective poisoning, features we term statistical

medium relevance defense

Paper 2602.17973v1

2026-02-20

PenTiDef: Enhancing Privacy and Robustness in Decentralized Federated Intrusion Detection Systems against Poisoning Attacks

Systems (IDS) introduces new challenges related to data privacy, centralized coordination, and susceptibility to poisoning attacks. While significant research has focused on protecting traditional FL-IDS with centralized aggregation servers

high relevance tool

Paper 2512.15799v1

2025-12-16

Cybercrime and Computer Forensics in Epoch of Artificial Intelligence in India

while Machine Learning offers high accuracy in pattern recognition, it introduces vulnerabilities regarding data poisoning and algorithmic bias. Findings highlight a critical tension between the Act's data minimization principles

low relevance benchmark

Paper 2602.01129v1

2026-02-01

SMCP: Secure Model Context Protocol

large language models (LLMs) are moving away from closed, single-model frameworks and toward open ecosystems that connect a variety of agents, external tools, and resources. The Model Context Protocol

medium relevance attack

Paper 2510.18541v1

2025-10-21

Pay Attention to the Triggers: Constructing Backdoors That Survive Distillation

often used by downstream users as teacher models for knowledge distillation, compressing their capabilities into memory-efficient models. However, as these teacher models may stem from untrusted parties, distillation

medium relevance benchmark

Paper 2601.09625v2

2026-01-14

The Promptware Kill Chain: How Prompt Injections Gradually Evolved Into a Multistep Malware Delivery Mechanism

Prompt injection was initially framed as the large language model (LLM) analogue of SQL injection. However, over the past three years, attacks labeled as prompt injection have evolved from isolated

high relevance attack

Paper 2603.20976v1

2026-03-21

Detection of adversarial intent in Human-AI teams using LLMs

Large language models (LLMs) are increasingly deployed in human-AI teams as support agents for complex tasks such as information retrieval, programming, and decision-making assistance. While these agents' autonomy

medium relevance attack
