Search: model poisoning | AI Threat Alert

297 results in 148ms

Paper 2603.02240v1

2026-02-17

SuperLocalMemory: Privacy-Preserving Multi-Agent Memory with Bayesian Trust Defense Against Memory Poisoning

increasingly rely on persistent memory, cloud-based memory systems create centralized attack surfaces where poisoned memories propagate across sessions and users -- a threat demonstrated in documented attacks against production systems

medium relevance attack

Paper 2603.18063v1

2026-03-18

MCP-38: A Comprehensive Threat Taxonomy for Model Context Protocol Systems (v1.0)

Model Context Protocol (MCP) introduces a structurally distinct attack surface that existing threat frameworks, designed for traditional software systems or generic LLM deployments, do not adequately cover. This paper presents

medium relevance survey

Paper 2602.08563v1

2026-02-09

Stateless Yet Not Forgetful: Implicit Memory as a Hidden Channel in LLMs

supplied. We challenge this assumption by introducing implicit memory, the ability of a model to carry state across otherwise independent interactions by encoding information in its own outputs and later

low relevance benchmark

Paper 2512.22046v1

2025-12-26

Backdoor Attacks on Prompt-Driven Video Segmentation Foundation Models

Prompt-driven Video Segmentation Foundation Models (VSFMs) such as SAM2

high relevance attack

Paper 2512.15790v1

2025-12-15

Bilevel Optimization for Covert Memory Tampering in Heterogeneous Multi-Agent Architectures (XAMT)

inherently heterogeneous, integrating conventional Multi-Agent Reinforcement Learning (MARL) with emerging Large Language Model (LLM) agent architectures utilizing Retrieval-Augmented Generation (RAG). A critical shared vulnerability is reliance on centralized

medium relevance benchmark

Paper 2511.20920v1

2025-11-25

Securing the Model Context Protocol (MCP): Risks, Controls, and Governance

Model Context Protocol (MCP) replaces static, developer-controlled API integrations with more dynamic, user-driven agent systems, which also introduces new security risks. As MCP adoption grows across community servers

medium relevance attack

Paper 2606.12679v1

2026-06-10

Fed-FBD: Federated Functional Block Diversification for Isolation, Privacy, and Surgical Unlearning

Federated learning (FL) enables collaborative model training without sharing raw

medium relevance benchmark

Paper 2605.21780v1

2026-05-20

Provable Robustness against Backdoor Attacks via the Primal-Dual Perspective on Differential Privacy

Randomized smoothing is a powerful tool for certifying robustness to adversarial perturbations, including poisoning attacks via randomized training and evasion attacks via randomized inference. Extending these guarantees to backdoor attacks

high relevance attack

Paper 2605.18593v1

2026-05-18

Not What You Asked For: Typographic Attacks in Household Robot Manipulation

Open-vocabulary embodied AI agents increasingly rely on vision-language

high relevance attack

CVE CRITICAL CVE-2026-44336

2026-05-11

PraisonAI MCP `tools/call` path-traversal => RCE via Python `.pth` injection

CVSS 9.6 PraisonAI

Paper 2604.23593v1

2026-04-26

When AI reviews science: Can we trust the referee?

The volume of scientific submissions continues to climb, outpacing the

medium relevance survey

Paper 2603.08316v2

2026-03-09

SlowBA: An efficiency backdoor attack towards VLM-based GUI agents

Modern vision-language-model (VLM) based graphical user interface (GUI

high relevance attack

Paper 2512.06914v2

2025-12-07

SoK: Trust-Authorization Mismatch in LLM Agent Interactions

Large Language Models (LLMs) are evolving into autonomous agents capable of executing complex workflows via standardized protocols (e.g., MCP). However, this paradigm shifts control from deterministic code to probabilistic inference

medium relevance survey

Paper 2510.26420v1

2025-10-30

SSCL-BW: Sample-Specific Clean-Label Backdoor Watermarking for Dataset Ownership Verification

dataset owners. Existing backdoor-based dataset ownership verification methods suffer from inherent limitations: poison-label watermarks are easily detectable due to label inconsistencies, while clean-label watermarks face high technical

medium relevance benchmark

Paper 2510.14312v1

2025-10-16

Terrarium: Revisiting the Blackboard for Multi-Agent Safety, Privacy, and Security Studies

multi-agent system (MAS) powered by large language models (LLMs) can automate tedious user tasks such as meeting scheduling that requires inter-agent collaboration. LLMs enable nuanced protocols that account

medium relevance defense

Paper 2510.13322v1

2025-10-15

Injection, Attack and Erasure: Revocable Backdoor Attacks via Machine Unlearning

networks (DNNs) due to their stealth and durability. While recent research has explored leveraging model unlearning mechanisms to enhance backdoor concealment, existing attack strategies still leave persistent traces that

high relevance attack

Paper 2509.22040v1

2025-09-26

"Your AI, My Shell": Demystifying Prompt Injection Attacks on Agentic AI Coding Editors

Agentic AI coding editors driven by large language models have recently become more popular due to their ability to improve developer productivity during software development. Modern editors such as Cursor

high relevance attack
