AI Security Research

AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Adversarial attacks
Model defenses
Red-teaming benchmarks
Surveys
Security tooling

Total: 3,023
Attack: 1,175
Benchmark: 866
Defense: 407
Tool: 319
Survey: 176

Showing 341–360 of 3,023 papers

Survey LOW

RAPTOR+: A Visually Grounded Vision-Language Framework to Improve Clinical Trust and Auditability in Automated Cancer Referral Processing

Sofiat Abioye, Ufaq Khan, Shazad Ashraf +6 more

Urgent suspected colorectal cancer (CRC) referrals create operational bottlenecks because semi-structured clinical documents often require manual...

1 month ago cs.CV PDF

Benchmark MEDIUM

Building an Adversarial Malware Dataset by Family and Type: Generation, Evasion, and Poisoning Evaluation

David Košťál, Martin Jureček

We present a dataset of adversarial malware samples derived from the public RawMal-TF collection of real-world malware binaries. Using a suite of...

1 month ago cs.CR cs.LG PDF

Attack MEDIUM

Closed-Loop Bidirectional Prompting for Adversarial Robustness of Vision Language Models

Xiao Liu, Jiaxiang Liu, Boci Peng +6 more

Vision Language Models adapt well to downstream tasks but are highly vulnerable to adversarial perturbations that disrupt cross-modal semantic...

1 month ago cs.CV PDF

Attack MEDIUM

Capability and Robustness Cannot Both Be Free: An Information-Theoretic Bound for Vision-Language-Action Models

Jianwei Tai

Vision-Language-Action (VLA) models are increasingly deployed on real robots, where each predicted action is executed and each failure carries a...

1 month ago cs.CR cs.LG PDF

Attack HIGH

How Agentic AI Coding Assistants Become the Attacker's Shell

Yue Liu, Yanjie Zhao, Yunbo Lyu +3 more

Agentic AI coding assistants can edit files, run commands, and access the internet on behalf of developers. However, their reliance on unvetted...

1 month ago cs.SE cs.CR PDF

Benchmark MEDIUM

TTPrint: Evidence-Grounded TTP Extraction via Diverge-then-Converge Verification

Yutong Cheng, Changze Li, Raihan Sultan Pasha Basuki +3 more

Extracting MITRE ATT&CK techniques from cyber threat intelligence (CTI) reports is an open-set, multi-label problem requiring both high recall (not...

1 month ago cs.CR cs.AI cs.CL PDF

Tool MEDIUM

"What is the Problem Space?" Defining Host-space Adversarial Perturbations against Network Intrusion Detection Systems

Miel Verkerken, Laurens D'hooge, Bruno Volckaert +2 more

Network Intrusion Detection Systems (NIDS) are now increasingly leveraging Machine Learning (ML) techniques to detect malicious network activities....

1 month ago cs.CR PDF

Benchmark MEDIUM

An Efficient and Privacy-Preserving Architecture for Cross-Institutional Collaborative RAG

Chenxin Mao, Shangyu Liu, Zhenzhe Zheng +3 more

Retrieval-Augmented Generation (RAG) empowers LLMs with external knowledge, making cross-institutional domain-specific knowledge base integration a...

1 month ago cs.CR cs.DC PDF

Survey HIGH

LLM-as-a-Reviewer: Benchmarking Their Ability, Divergence, and Prompt Injection Resistance as Paper Reviewers

Lingyao Li, Junjie Xiong, Changjia Zhu +5 more

Large language models (LLMs) are increasingly used in academic peer review, yet their reliability, alignment with human judgment, and robustness to...

1 month ago cs.CL cs.CY cs.ET PDF

Tool HIGH

Evo-Attacker: Memory-Augmented Reinforcement Learning for Long-Horizon Tool Attacks on LLM-MAS

Bingyu Yan, Xiaoming Zhang, Jinyu Hou +4 more

While Large Language Model-based Multi-Agent Systems (LLM-MAS) demonstrate remarkable capabilities in solving complex tasks by orchestrating...

1 month ago cs.CR cs.AI cs.MA PDF

Attack HIGH

When Interpretability Becomes a Liability: Adversarial Attacks on CBM Concept Layers

Aditya Sridhar

Concept Bottleneck Models (CBMs) have emerged as a cornerstone approach for interpretable machine learning, providing human-understandable...

1 month ago cs.LG cs.CR cs.CV PDF

Attack HIGH

Localization then Neutralization: Gradient-guided Token Suppression against Visual Prompt Injection Attack

Dongpeng Zhang, Ke Ma, Yangbangyan Jiang +4 more

Adversarial images pose a severe security threat to multimodal large language models through prompt injection. Existing defenses largely lack a...

1 month ago cs.LG PDF

Benchmark MEDIUM

Security in the Fine-Tuning Lifecycle of Large Language Models: Threats, Defenses, Evaluation, and Future Directions

Wenjuan Li, Yitao Liu, Runze Chen +1 more

Background: Fine-tuning is central to adapting pre-trained Large Language Models (LLMs) to downstream tasks, but its reliance on training data,...

1 month ago cs.CR cs.AI cs.LG PDF

Attack LOW

QML-PipeGuard: Drift-Aware Behavioral Fingerprinting for Quantum Machine Learning Pipeline Integrity

Esra Yeniaras

Quantum machine learning (QML) is moving from research prototypes to deployed cloud services. As QML enters regulated industries, the integrity of...

1 month ago quant-ph cs.CR cs.LG PDF

Attack MEDIUM

MemMark: State-Evolution Attribution Watermarking for Agent Long-Term Memory Systems

Haobo Zhang, Xutao Mao, Guangyuan Dong +5 more

Memory-backed agents need provenance that can survive leaked or migrated snapshots, where logs, visible outputs, and trusted metadata may be absent....

1 month ago cs.CR PDF

Attack MEDIUM

APT-Agent: Automated Penetration Testing using Large Language Models

William Guanting Li, Alsharif Abuadbba, Kristen Moore +1 more

Penetration testing is essential to securing modern web infrastructures, yet traditional manual methods struggle to keep pace with their scale and...

1 month ago cs.CR cs.AI PDF

Tool MEDIUM

Memory-Induced Tool-Drift in LLM Agents

Mahavir Dabas, Jihyun Jeong, Ming Jin +1 more

Modern LLM agents combine long-term memory for personalization with tool-calling interfaces for taking actions in the world -- a combination...

1 month ago cs.CR cs.LG PDF

Tool LOW

Contractual Skills: A GovernSpec Design Framework for Enterprise AI Agents

Ting Liu

Skills are increasingly used to package agent instructions, workflows, scripts, and reference materials. In enterprise settings, however, skills...

1 month ago cs.SE cs.AI PDF

Attack MEDIUM

EnCAgg: Enhanced Clustering Aggregation for Robust Federated Learning against Dynamic Model Poisoning

Tianyun Zhang, Zhen Yang, Haozhao Wang +2 more

Federated learning faces increasing threats from model poisoning attacks, which harms its application to improve privacy. Existing defense methods...

1 month ago cs.CR cs.LG PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.
