AI Security Research

AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Total: 3,023
Attack: 1,175
Benchmark: 866
Defense: 407
Tool: 319
Survey: 176

Showing 121–140 of 562 papers

Benchmark MEDIUM

LivePI: More Realistic Benchmarking of Agents Against Indirect Prompt Injection

Lei Zhao, Abhay Bhaskar, Edgar Dobriban

AI agents such as OpenClaw are increasingly deployed in local workflows with access to external tools. This creates indirect prompt-injection (IPI)...

1 month ago cs.CR cs.AI PDF

Benchmark LOW

SkyNative: A Native Multimodal Framework for Remote Sensing Visual Evidence Reasoning

Xiao Yang, Ronghao Fu, Zhiwen Lin +10 more

Remote sensing vision-language models commonly rely on pretrained visual encoders to convert images into semantic features before language-model...

1 month ago cs.CV PDF

Benchmark MEDIUM

The Capability Paradox: How Smarter Auditors Make Multi-Agent Systems Less Secure

Qiqi Liu, Thorsten Holz, Shilin Ye +1 more

Multi-agent systems extend large language models (LLMs) by decomposing tasks among specialized agents, but their distributed decision process creates...

1 month ago cs.AI PDF

Benchmark HIGH

ContraFix: Agentic Vulnerability Repair via Differential Runtime Evidence and Skill Reuse

Simiao Liu, Fang Liu, Li Zhang +2 more

Large language model (LLM) agents are increasingly used for automated vulnerability repair (AVR), where repository-level reasoning enables them to...

1 month ago cs.SE cs.AI cs.CL PDF

Benchmark HIGH

MemRepair: Hierarchical Memory for Agentic Repository-Level Vulnerability Repair

Simiao Liu, Li Zhang, Fang Liu +3 more

Modern software ecosystems face a rapidly growing number of disclosed vulnerabilities, increasing the need for automated repair techniques that can...

1 month ago cs.SE cs.AI cs.CL PDF

Benchmark MEDIUM

Rethinking Side-Channel Analysis: Automated Discovery and Analysis of Side-Channel Leakage with LLM-Assisted Agents

Zhen Xu, Zihao Wang, Yuhua Sun +1 more

Side-channel attacks exploit unintended information leakage from system behavior and continue to pose serious privacy risks in modern platforms....

1 month ago cs.CR PDF

Benchmark HIGH

Code-Centric Detection of Vulnerability-Fixing Commits: A Unified Benchmark and Empirical Study

Nils Loose, Joseph Bienhüls, Kristoffer Hempel +2 more

Automated detection of vulnerability-fixing commits (VFCs) is critical for timely security patch deployment, as advisory databases lag patch releases...

1 month ago cs.SE cs.CR cs.LG PDF

Benchmark LOW

Watermarking Should Be Treated as a Monitoring Primitive

Toluwani Aremu, Nils Lukas, Jie Zhang

Watermarking is widely proposed for provenance, attribution, and safety monitoring in generative models, yet is typically evaluated only under...

1 month ago cs.CR cs.AI cs.CY PDF

Benchmark LOW

Low-Rank Adapters Initialization via Gradient Surgery for Continual Learning

Joana Pasquali, Ramiro N. Barros, Arthur S. Bianchessi +7 more

LoRA is widely adopted for continual fine-tuning of Large Language Models due to its parameter efficiency, modularity across tasks, and compatibility...

1 month ago cs.LG PDF

Benchmark MEDIUM

Do Androids Dream of Breaking the Game? Systematically Auditing AI Agent Benchmarks with BenchJack

Hao Wang, Hanchen Li, Qiuyang Mang +3 more

Agent benchmarks have become the de facto measure of frontier AI competence, guiding model selection, investment, and deployment. However, reward...

1 month ago cs.AI cs.CR PDF

Benchmark LOW

MedHopQA: A Disease-Centered Multi-Hop Reasoning Benchmark and Evaluation Framework for LLM-Based Biomedical Question Answering

Rezarta Islamaj, Robert Leaman, Joey Chan +13 more

Evaluating large language models (LLMs) in the biomedical domain requires benchmarks that can distinguish reasoning from pattern matching and remain...

1 month ago cs.CL cs.AI cs.IR PDF

Benchmark MEDIUM

Targeted Neuron Modulation via Contrastive Pair Search

Sam Herring, Jake Naviasky, Karan Malhotra

Language models are instruction-tuned to refuse harmful requests, but the mechanisms underlying this behavior remain poorly understood. Popular...

1 month ago cs.LG PDF

Benchmark MEDIUM

Reconstruction of Personally Identifiable Information from Supervised Finetuned Models

Sae Furukawa, Alina Oprea

Supervised Finetuning (SFT) has become one of the primary methods for adapting a large language model (LLM) with extensive pre-trained knowledge to...

1 month ago cs.CR cs.CL cs.LG PDF

Benchmark LOW

UHR-Micro: Diagnosing and Mitigating the Resolution Illusion in Earth Observation VLMs

Shuo Ni, Tong Wang, Jing Zhang +4 more

Vision-Language Models (VLMs) increasingly operate on ultra-high-resolution (UHR) Earth observation imagery, yet they remain vulnerable to a severe...

1 month ago cs.CV PDF

Benchmark LOW

When Looking Is Not Enough: Visual Attention Structure Reveals Hallucination in MLLMs

Fanpu Cao, Xin Zou, Xuming Hu +1 more

Multimodal large language models (MLLMs) have become a key interface for visual reasoning and grounded question answering, yet they remain vulnerable...

1 month ago cs.CV cs.AI PDF

Benchmark LOW

Benchmarking LLM-Based Static Analysis for Secure Smart Contract Development: Reliability, Limitations, and Potential Hybrid Solutions

Stefan-Claudiu Susan, Andrei Arusoaie, Dorel Lucanu

The irreversible nature of blockchain transactions makes the identification of smart contract vulnerabilities an essential requirement for secure...

1 month ago cs.CR cs.AI PDF

Benchmark MEDIUM

From Controlled to the Wild: Evaluation of Pentesting Agents for the Real-World

Pedro Conde, Henrique Branquinho, Valerio Mazzone +3 more

AI pentesting agents are increasingly credible as offensive security systems, but current benchmarks still provide limited guidance on which will...

1 month ago cs.AI cs.CR PDF

Benchmark MEDIUM

Threat Modelling using Domain-Adapted Language Models: Empirical Evaluation and Insights

Saba Pourhanifeh, AbdulAziz AbdulGhaffar, Ashraf Matrawy

Large Language Models (LLMs) are increasingly explored for cybersecurity applications such as vulnerability detection. In the domain of threat...

1 month ago cs.CR cs.AI PDF

Benchmark HIGH

LITMUS: Benchmarking Behavioral Jailbreaks of LLM Agents in Real OS Environments

Chiyu Zhang, Huiqin Yang, Bendong Jiang +8 more

The rapid proliferation of LLM-based autonomous agents in real operating system environments introduces a new category of safety risk beyond content...

1 month ago cs.CR cs.CL PDF

Benchmark LOW

The Bystander Effect in Multi-Agent Reasoning: Quantifying Cognitive Loafing in Collaborative Interactions

Dahlia Shehata, Ming Li

Multi-agent systems (MAS) assume that collaborating inherently improves Large Language Model (LLM) reasoning. We challenge this by demonstrating that...

1 month ago cs.MA cs.AI PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.
