AI Security Research

AI Threat Alert indexes 3,037+ peer-reviewed and preprint papers on AI/ML security, covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Adversarial attacks
Model defenses
Red-teaming benchmarks
Surveys
Security tooling

Total: 3,037
Attack: 1,183
Benchmark: 868
Defense: 410
Tool: 319
Survey: 177

Showing 161–180 of 868 papers

Benchmark MEDIUM

Enhancing Agent Safety Judgment: Controlled Benchmark Rewriting and Analogical Reasoning for Deceptive Out-of-Distribution Scenarios

Zuoyu Zhang, Yancheng Zhu

Tool-using agent systems powered by large language models (LLMs) are increasingly deployed across web, app, operating-system, and transactional...

1 month ago cs.AI PDF

Benchmark MEDIUM

MAGE: Safeguarding LLM Agents against Long-Horizon Threats via Shadow Memory

Yuhui Wang, Tanqiu Jiang, Jiacheng Liang +2 more

As large language model (LLM)-powered agents are increasingly deployed to perform complex, real-world tasks, they face a growing class of attacks...

1 month ago cs.CR cs.AI cs.CL PDF

Benchmark LOW

Distributed Deep Variational Approach for Privacy-preserving Data Release

Zahir Alsulaimawi, Huaping Liu

Federated learning (FL) lets distributed nodes train a shared model without exchanging their raw data, but in privacy-sensitive deployments medical...

1 month ago cs.CR cs.LG PDF

Benchmark LOW

An Empirical Study of Agent Skills for Healthcare: Practice, Gaps, and Governance

Gelei Xu, Ningzhi Tang, Xueyang Li +4 more

Healthcare automation is shaped by local procedures and organizational constraints, so agent capabilities rarely transfer unchanged across settings....

1 month ago cs.AI PDF

Benchmark MEDIUM

Privacy Preserving Machine Learning Workflow: from Anonymization to Personalized Differential Privacy Budgets in Federated Learning

Judith Sáinz-Pardo Díaz, Álvaro López García

The growing development of artificial intelligence based solutions, together with privacy legislation, has driven the rise of the so-called privacy...

1 month ago cs.CR cs.AI PDF

Benchmark MEDIUM

On the Privacy of LLMs: An Ablation Study

Karima Makhlouf, Lamiaa Basyoni, Syed Khaderi +4 more

Large language models (LLMs) are increasingly deployed in interactive and retrieval-augmented settings, raising significant privacy concerns. While...

1 month ago cs.CR cs.AI PDF

Benchmark MEDIUM

Trojan Hippo: Weaponizing Agent Memory for Data Exfiltration

Debeshee Das, Julien Piet, Darya Kaviani +3 more

Memory systems enable otherwise-stateless LLM agents to persist user information across sessions, but also introduce a new attack surface. We...

1 month ago cs.CR cs.AI PDF

Benchmark MEDIUM

Repurposing and Evaluating the (In)Feasibility of Dataset Poisoning enabled Watermarking for Contrastive Learning

Zhiyang Dai, Yansong Gao, Boyu Kuang +5 more

Contrastive learning (CL) reduces annotation cost via auto-derived supervisory signals. Since large-scale in-house CL datasets are infeasible,...

1 month ago cs.CR cs.AI PDF

Benchmark MEDIUM

Needle-in-RAG: Prompt-Conditioned Character-Level Traceback of Poisoned Spans in Retrieved Evidence

Huining Cui, Wei Liu

Retrieval-augmented generation (RAG) improves factual grounding by conditioning large language models on retrieved evidence, but it also opens a...

1 month ago cs.CR cs.DB PDF

Benchmark MEDIUM

AdaBFL: Multi-Layer Defensive Adaptive Aggregation for Byzantine-Robust Federated Learning

Zehui Tang, Yuchen Liu, Feihu Huang

Federated learning (FL) is a popular distributed learning paradigm in machine learning, which enables multiple clients to collaboratively train...

2 months ago cs.LG cs.AI cs.CR PDF

Benchmark LOW

Identifying and Characterizing Semantic Clones of Solidity Functions

Ermanno Francesco Sannini, Francesco Salzano, Simone Scalabrino +4 more

Smart Contracts are essential blockchain components, mainly written in Solidity. The high availability of public Solidity code leads to frequent...

2 months ago cs.SE PDF

Benchmark MEDIUM

PRAG: End-to-End Privacy-Preserving Retrieval-Augmented Generation

Zhijun Li, Minghui Xu, Huayi Qi +6 more

Retrieval-Augmented Generation (RAG) is essential for enhancing Large Language Models (LLMs) with external knowledge, but its reliance on cloud...

2 months ago cs.CR PDF

Benchmark MEDIUM

Making AI-Assisted Grant Evaluation Auditable without Exposing the Model

Kemal Bicakci

Public agencies are beginning to consider large language models (LLMs) as decision-support tools for grant evaluation. This creates a practical...

2 months ago cs.CR cs.AI cs.CY PDF

Benchmark MEDIUM

FCMBench-Video: Benchmarking Document Video Intelligence

Runze Cui, Fangxin Shang, Yehui Yang +2 more

Document understanding is a critical capability in financial credit review, onboarding, and remote verification, where both decision accuracy and...

2 months ago cs.CV cs.CE cs.MM PDF

Benchmark MEDIUM

MGTEVAL: An Interactive Platform for Systematic Evaluation of Machine-Generated Text Detectors

Yuanfan Li, Qi Zhou, Chengzhengxu Li +5 more

We present MGTEVAL, an extensible platform for systematic evaluation of Machine-Generated Text (MGT) detectors. Despite rapid progress in MGT...

2 months ago cs.CR cs.CL PDF

Benchmark MEDIUM

Green Shielding: A User-Centric Approach Towards Trustworthy AI

Aaron J. Li, Nicolas Sanchez, Hao Huang +8 more

Large language models (LLMs) are increasingly deployed, yet their outputs can be highly sensitive to routine, non-adversarial variation in how users...

2 months ago cs.CL cs.AI PDF

Benchmark LOW

Governing What You Cannot Observe: Adaptive Runtime Governance for Autonomous AI Agents

German Marin, Jatin Chaudhary

Autonomous AI agents can remain fully authorized and still become unsafe as behavior drifts, adversaries adapt, and decision patterns shift without...

2 months ago cs.AI PDF

Benchmark MEDIUM

A Comparative Evaluation of AI Agent Security Guardrails

Qi Li, Jiu Li, Pingtao Wei +8 more

This report presents a comparative evaluation of DKnownAI Guard in AI agent security scenarios, benchmarked against three competing products: AWS...

2 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

GAMMAF: A Common Framework for Graph-Based Anomaly Monitoring Benchmarking in LLM Multi-Agent Systems

Pablo Mateo-Torrejón, Alfonso Sánchez-Macián

The rapid integration of Large Language Models (LLMs) into Multi-Agent Systems (MAS) has significantly enhanced their collaborative problem-solving...

2 months ago cs.CR cs.AI cs.MA PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended. It covers adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,037+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.
