AI Security Research

AI Threat Alert indexes 2,841+ peer-reviewed and preprint papers on AI/ML security, covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Total: 2,841
Attack: 1,118
Benchmark: 804
Defense: 383
Tool: 299
Survey: 161

Showing 1241–1260 of 1,352 papers

Attack MEDIUM

VisualDAN: Exposing Vulnerabilities in VLMs with Visual-Driven DAN Commands

Aofan Liu, Lulu Tang

Vision-Language Models (VLMs) have garnered significant attention for their remarkable ability to interpret and generate multimodal content. However,...

8 months ago cs.CR cs.AI PDF

Attack MEDIUM

Chain-of-Trigger: An Agentic Backdoor that Paradoxically Enhances Agentic Robustness

Jiyang Qiu, Xinbei Ma, Yunqing Xu +2 more

The rapid deployment of large language model (LLM)-based agents in real-world applications has raised serious concerns about their trustworthiness....

8 months ago cs.AI PDF

Defense MEDIUM

From Defender to Devil? Unintended Risk Interactions Induced by LLM Defenses

Xiangtao Meng, Tianshuo Cong, Li Wang +4 more

Large Language Models (LLMs) have shown remarkable performance across various applications, but their deployment in real-world settings faces several...

8 months ago cs.CR PDF

Benchmark MEDIUM

Mitigating Over-Refusal in Aligned Large Language Models via Inference-Time Activation Energy

Eric Hanchen Jiang, Weixuan Ou, Run Liu +8 more

Safety alignment of large language models currently faces a central challenge: existing alignment techniques often prioritize mitigating responses to...

8 months ago cs.LG cs.AI cs.CL PDF

Survey MEDIUM

Rethinking Reasoning: A Survey on Reasoning-based Backdoors in LLMs

Man Hu, Xinyi Wu, Zuofeng Suo +5 more

With the rise of advanced reasoning capabilities, large language models (LLMs) are receiving increasing attention. However, although reasoning...

8 months ago cs.CR cs.AI PDF

Survey MEDIUM

LLM Unlearning Under the Microscope: A Full-Stack View on Methods and Metrics

Chongyu Fan, Changsheng Wang, Yancheng Huang +2 more

Machine unlearning for large language models (LLMs) aims to remove undesired data, knowledge, and behaviors (e.g., for safety, privacy, or copyright)...

8 months ago cs.LG cs.CL PDF

Benchmark MEDIUM

PEAR: Planner-Executor Agent Robustness Benchmark

Shen Dong, Mingxuan Zhang, Pengfei He +4 more

Large Language Model (LLM)-based Multi-Agent Systems (MAS) have emerged as a powerful paradigm for tackling complex, multi-step tasks across diverse...

8 months ago cs.LG PDF

Tool MEDIUM

VelLMes: A high-interaction AI-based deception framework

Muris Sladić, Veronica Valeros, Carlos Catania +1 more

There are very few SotA deception systems based on Large Language Models. The existing ones are limited only to simulating one type of service,...

8 months ago cs.CR cs.AI cs.CL PDF

Benchmark MEDIUM

Exposing Citation Vulnerabilities in Generative Engines

Riku Mochizuki, Shusuke Komatsu, Souta Noguchi +1 more

We analyze answers generated by generative engines (GEs) from the perspectives of citation publishers and the content-injection barrier, defined as...

8 months ago cs.CR cs.CL cs.IR PDF

Attack MEDIUM

Get RICH or Die Scaling: Profitably Trading Inference Compute for Robustness

Tavish McDonald, Bo Lei, Stanislav Fort +2 more

Models are susceptible to adversarially out-of-distribution (OOD) data despite large training-compute investments into their robustification. Zaremba...

8 months ago cs.LG PDF

Attack MEDIUM

Are LLMs Reliable Rankers? Rank Manipulation via Two-Stage Token Optimization

Tiancheng Xing, Jerry Li, Yixuan Du +1 more

Large language models (LLMs) are increasingly used as rerankers in information retrieval, yet their ranking behavior can be steered by small,...

8 months ago cs.CL cs.AI cs.IR PDF

Benchmark MEDIUM

Distilling Lightweight Language Models for C/C++ Vulnerabilities

Zhiyuan Wei, Xiaoxuan Yang, Jing Sun +1 more

The increasing complexity of modern software systems exacerbates the prevalence of security vulnerabilities, posing risks of severe breaches and...

8 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Code Agent can be an End-to-end System Hacker: Benchmarking Real-world Threats of Computer-use Agent

Weidi Luo, Qiming Zhang, Tianyu Lu +9 more

Computer-use agent (CUA) frameworks, powered by large language models (LLMs) or multimodal LLMs (MLLMs), are rapidly maturing as assistants that can...

8 months ago cs.CR PDF

Defense MEDIUM

From Description to Detection: LLM based Extendable O-RAN Compliant Blind DoS Detection in 5G and Beyond

Thusitha Dayaratne, Ngoc Duy Pham, Viet Vo +5 more

The quality and experience of mobile communication have significantly improved with the introduction of 5G, and these improvements are expected to...

8 months ago cs.CR cs.ET cs.LG PDF

Benchmark MEDIUM

Text-to-Image Models Leave Identifiable Signatures: Implications for Leaderboard Security

Ali Naseh, Anshuman Suri, Yuefeng Peng +3 more

Generative AI leaderboards are central to evaluating model capabilities, but remain vulnerable to manipulation. Among key adversarial objectives is...

8 months ago cs.LG cs.CR PDF

Benchmark MEDIUM

DP-SNP-TIHMM: Differentially Private, Time-Inhomogeneous Hidden Markov Models for Synthesizing Genome-Wide Association Datasets

Shadi Rahimian, Mario Fritz

Single nucleotide polymorphism (SNP) datasets are fundamental to genetic studies but pose significant privacy risks when shared. The correlation of...

8 months ago cs.LG cs.CR q-bio.GN PDF

Benchmark MEDIUM

Towards Reliable and Practical LLM Security Evaluations via Bayesian Modelling

Mary Llewellyn, Annie Gray, Josh Collyer +1 more

Before adopting a new large language model (LLM) architecture, it is critical to understand vulnerabilities accurately. Existing evaluations can be...

8 months ago cs.CR cs.AI cs.CL PDF

Tool MEDIUM

AutoPentester: An LLM Agent-based Framework for Automated Pentesting

Yasod Ginige, Akila Niroshan, Sajal Jain +1 more

Penetration testing and vulnerability assessment are essential industry practices for safeguarding computer systems. As cyber threats grow in scale...

8 months ago cs.CR cs.AI PDF

Survey MEDIUM

The Role of Federated Learning in Improving Financial Security: A Survey

Cade Houston Kennedy, Amr Hilal, Morteza Momeni

With the growth of digital financial systems, robust security and privacy have become a concern for financial institutions. Even though traditional...

8 months ago cs.CR cs.AI PDF

Attack MEDIUM

Adversarial Reinforcement Learning for Large Language Model Agent Safety

Zizhao Wang, Dingcheng Li, Vaishakh Keshava +4 more

Large Language Model (LLM) agents can leverage tools such as Google Search to complete complex tasks. However, this tool usage introduces the risk of...

8 months ago cs.LG cs.AI cs.CL PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended. It covers adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 2,841+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.
