Paper 2602.06409v1

VENOMREC: Cross-Modal Interactive Poisoning for Targeted Promotion in Multimodal LLM Recommender Systems

language models (MLLMs) are pushing recommender systems (RecSys) toward content-grounded retrieval and ranking via cross-modal fusion. We find that while cross-modal consensus often mitigates conventional poisoning that

medium relevance tool
Paper 2603.20339v1

Graph-Aware Text-Only Backdoor Poisoning for Text-Attributed Graphs

platforms, an attacker may be able to quietly poison a small part of the training data and later make the model produce wrong predictions on demand. This paper studies that

medium relevance attack
Paper 2603.07191v2

Governance Architecture for Autonomous Agent Systems: Threats, Framework, and Engineering Practice

Autonomous agents powered by large language models introduce a class of execution-layer vulnerabilities -- prompt injection, retrieval poisoning, and uncontrolled tool invocation -- that existing guardrails fail to address systematically

medium relevance benchmark
Paper 2510.03705v1

Backdoor-Powered Prompt Injection Attacks Nullify Defense Methods

prompt injection attack purposes. Specifically, the attackers poison the supervised fine-tuning samples and insert the backdoor into the model. Once the trigger is activated, the backdoored model executes

high relevance attack
Paper 2601.00566v1

Low Rank Comes with Low Security: Gradient Assembly Poisoning Attacks against Distributed LoRA-based LLM Systems

separately, while only their product $AB$ determines the model update, yet this composite is never directly verified. We propose Gradient Assembly Poisoning (GAP), a novel attack that exploits this blind

high relevance tool
Paper 2510.00554v1

Sentry: Authenticating Machine Learning Artifacts on the Fly

reliance on external datasets and pre-trained models exposes the system to supply chain attacks where an artifact can be poisoned before it is delivered to the end-user. Such

medium relevance benchmark
Paper 2603.00516v1

ProtegoFed: Backdoor-Free Federated Instruction Tuning with Interspersed Poisoned Data

Federated Instruction Tuning (FIT) enables collaborative instruction tuning of large language models across multiple organizations (clients) in a cross-silo setting without requiring the sharing of private instructions. Recent findings

medium relevance benchmark
Paper 2510.06823v2

Exposing Citation Vulnerabilities in Generative Engines

generation that cites web pages using large language models. Because anyone can publish information on the web, GEs are vulnerable to poisoning attacks. Existing studies of citation evaluation focus

medium relevance benchmark
Paper 2510.04347v2

Unmasking Backdoors: An Explainable Defense via Gradient-Attention Anomaly Scoring for Pre-trained Language Models

behavior of backdoored pre-trained encoder-based language models, focusing on the consistent shift in attention and gradient attribution when processing poisoned inputs, where the trigger token dominates both attention

medium relevance defense
Paper 2603.18103v1

STEP: Detecting Audio Backdoor Attacks via Stability-based Trigger Exposure Profiling

serious threat: an adversary who poisons a small fraction of training data can implant a hidden trigger that controls the model's output while preserving normal behavior on clean inputs

high relevance attack
Paper 2605.10600v1

Generate "Normal", Edit Poisoned: Branding Injection via Hint Embedding in Image Editing

returned to users. The second is a poison-based setting, where an attacker distributes a compromised text-to-image diffusion model whose output contains hidden content. We evaluate both attacks

high relevance attack
Paper 2604.23640v1

Prompt-Unknown Promotion Attacks against LLM-based Sequential Recommender Systems

enabling the training of an effective surrogate model that mimics the behaviors of the victim model. Leveraging the distilled prompt and surrogate model, we devise a promotion attack that adversarially

high relevance tool
Paper 2602.00183v1

RPP: A Certified Poisoned-Sample Detection Framework for Backdoor Attacks under Dataset Imbalance

propose Randomized Probability Perturbation (RPP), a certified poisoned-sample detection framework that operates in a black-box setting using only model output probabilities. For any inspected sample, RPP determines whether

high relevance benchmark
Paper 2510.20768v2

RAGRank: Using PageRank to Counter Poisoning in CTI LLM Pipelines

dominant architectural pattern to operationalize Large Language Model (LLM) usage in Cyber Threat Intelligence (CTI) systems. However, this design is susceptible to poisoning attacks, and previously proposed defenses can fail

medium relevance attack
Paper 2512.19286v2

GShield: Mitigating Poisoning Attacks in Federated Learning

Learning models. In particular, it enables decentralized model training while preserving data privacy, but its distributed nature makes it highly vulnerable to a severe attack known as Data Poisoning

high relevance attack
Paper 2601.07395v1

MCP-ITP: An Automated Framework for Implicit Tool Poisoning in MCP

environments, the Model Context Protocol (MCP) was proposed and has since been widely adopted. However, integrating external tools expands the attack surface, exposing agents to tool poisoning attacks. In such

medium relevance attack
Paper 2605.10253v1

Knowledge Poisoning Attacks on Medical Multi-Modal Retrieval-Augmented Generation

injected with, adversarial knowledge, which can perturb model outputs and undermine system reliability. To investigate this risk, prior studies have explored knowledge poisoning attacks in medical RAG systems. Nevertheless, most

high relevance attack
Paper 2509.24566v1

TokenSwap: Backdoor Attack on the Compositional Understanding of Large Vision-Language Models

corresponding textual answers. However, the poisoned samples exhibit only subtle differences from the original ones, making it challenging for the model to learn the backdoor behavior. To address this, TokenSwap

high relevance attack
Paper 2606.19660v1

A Layered Security Framework Against Prompt Injection in RAG-Based Chatbots

cannot prevent malicious payloads from reaching the model. Consequently, retrieval-augmented generation (RAG) chatbots remain vulnerable to indirect injection, where a poisoned knowledge-base document compromises every user whose query

high relevance tool
Paper 2605.13796v1

Backdoor Threats in Variational Quantum Circuits: Taxonomy, Attacks, and Defenses

survey of backdoor attacks in VQCs, covering data-poisoning, compiler-level, and quantum-native mechanisms. We formalize key terminology and threat models, and review existing attack strategies along with their

high relevance survey