Search: model poisoning | AI Threat Alert

Paper 2605.02372v1

2026-05-04

Privacy Preserving Machine Learning Workflow: from Anonymization to Personalized Differential Privacy Budgets in Federated Learning

called privacy preserving machine learning architectures, such as federated learning. While federated learning enables model training on decentralized data preventing their sharing and centralization, it still faces several challenges related

medium relevance benchmark

Paper 2603.07379v1

2026-03-07

SoK: Agentic Retrieval-Augmented Generation (RAG): Taxonomy, Architectures, Evaluation, and Research Directions

Retrieval-Augmented Generation (RAG) systems are increasingly evolving into agentic architectures where large language models autonomously coordinate multi-step reasoning, dynamic memory management, and iterative retrieval strategies. Despite rapid industrial

low relevance survey

Paper 2606.12797v1

2026-06-11

The Containment Gap: How Deployed Agentic AI Frameworks Fail Public-Facing Safety Requirements

Agentic large language model systems that autonomously invoke tools, maintain persistent memory, and execute multi-step plans are increasingly deployed in public-facing domains, including government services, healthcare triage

medium relevance tool

Paper 2601.05467v3

2026-01-09

STELP: Secure Transpilation and Execution of LLM-Generated Programs

Rapid evolution of Large Language Models (LLMs) has achieved major advances in reasoning, planning, and function-calling capabilities. Multi-agentic collaborative frameworks using such LLMs place them at the center

medium relevance survey

Paper 2511.18921v1

2025-11-24

BackdoorVLM: A Benchmark for Backdoor Attacks on Vision-Language Models

hijack. Each category captures a distinct pathway through which an adversary can manipulate a model's behavior. We evaluate these threats using 12 representative attack methods spanning text, image

high relevance benchmark

Paper 2602.22134v2

2026-02-25

Secure Semantic Communications via AI Defenses: Fundamentals, Solutions, and Future Directions

SemCom via AI defense. We analyze AI-centric threat models by consolidating existing studies and organizing attack surfaces across model-level, channel-realizable, knowledge-based, and networked inference vectors. Building

medium relevance defense

Paper 2602.00750v1

2026-01-31

Bypassing Prompt Injection Detectors through Evasive Injections

Large language models (LLMs) are increasingly used in interactive and retrieval-augmented systems, but they remain vulnerable to task drift: deviations from a user's intended instruction due to injected

high relevance attack

Paper 2512.21681v1

2025-12-25

Exploring the Security Threats of Retriever Backdoors in Retrieval-Augmented Code Generation

Retrieval-Augmented Code Generation (RACG) is increasingly adopted to enhance Large Language Models for software development, yet its security implications remain dangerously underexplored. This paper conducts the first systematic exploration

medium relevance attack

Paper 2512.10415v2

2025-12-11

How to Trick Your AI TA: A Systematic Study of Academic Jailbreaking in LLM Code Evaluation

The use of Large Language Models (LLMs) as automatic judges for code evaluation is becoming increasingly prevalent in academic environments. But their reliability can be compromised by students who may employ adversarial prompting

high relevance benchmark

Paper 2605.27674v1

2026-05-26

Backdoor Attacks on Fault Detection and Localization in Cyber-Physical Systems

intelligent models are vulnerable to adversarial machine learning attacks, particularly backdoor attacks. In a backdoor attack, an adversary injects malicious patterns into the training data so that the model behaves

high relevance attack

Paper 2512.08290v2

2025-12-09

Systematization of Knowledge: Security and Safety in the Model Context Protocol Ecosystem

Model Context Protocol (MCP) has emerged as the de facto standard for connecting Large Language Models (LLMs) to external data and tools, effectively functioning as the "USB-C for Agentic

medium relevance survey

Paper 2510.01157v2

2025-10-01

Backdoor Attacks Against Speech Language Models

resulting model inherit vulnerabilities from all of its components. In this work, we present the first systematic study of audio backdoor attacks against speech language models. We demonstrate its effectiveness

high relevance attack

Paper 2605.11442v1

2026-05-12

Can a Single Message Paralyze the AI Infrastructure? The Rise of AbO-DDoS Attacks through Targeted Mobius Injection

safety filters, and highly configurable, allowing for surgical targeting of specific environments or model providers. To evaluate the real-world impact, we conduct extensive experiments across three representative claw-style

high relevance attack

Paper 2605.19253v1

2026-05-19

Detecting and Mitigating Backdoor Attacks in OTA-FL Systems: A Two-Stage Robust Aggregation Scheme

server (PS) cannot access individual local updates, making it difficult to identify and exclude poisoned gradients. The challenge is further exacerbated under non-independent and identically distributed (Non-IID) training

high relevance attack

Paper 2606.10742v1

2026-06-09

MemVenom: Triggered Poisoning of Multimodal Memories in Web Agents

systematically study multimodal memory poisoning, an overlooked yet practical attack surface in web-agent systems. We propose MemVenom, a unified black-box attack framework that poisons graph-structured external memory

medium relevance attack

Paper 2603.29328v1

2026-03-31

Beyond Corner Patches: Semantics-Aware Backdoor Attack in Federated Learning

this paper, we revisit the backdoor threat to standard FL (a single global model) under a more realistic setting where triggers must be semantically meaningful, in-distribution, and visually plausible

high relevance attack

Paper 2512.14741v1

2025-12-12

Persistent Backdoor Attacks under Continual Fine-Tuning of LLMs

Backdoor attacks embed malicious behaviors into Large Language Models (LLMs), enabling adversaries to trigger harmful outputs or bypass safety controls. However, the persistence of the implanted backdoors under user-driven

high relevance attack

Paper 2602.20593v1

2026-02-24

Is the Trigger Essential? A Feature-Based Triggerless Backdoor Attack in Vertical Federated Learning

parties with distinct features and one active party with labels to collaboratively train a model. Although it is known for its privacy-preserving capabilities, VFL still faces significant privacy

high relevance attack

Paper 2602.18082v1

2026-02-20

AndroWasm: an Empirical Study on Android Malware Obfuscation through WebAssembly

detection mechanisms and harden manual analysis. Adversaries typically rely on obfuscation, anti-repacking, steganography, poisoning, and evasion techniques to AI-based tools, and in-memory execution to conceal malicious functionality

medium relevance attack

Paper 2604.08407v1

2026-04-09

Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain

enforces cryptographic integrity between client and upstream model. We present the first systematic study of this attack surface. We formalize a threat model for malicious LLM API routers and define

high relevance attack
