AI Security Research

AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security, covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Total: 3,023
Attack: 1,175
Benchmark: 866
Defense: 407
Tool: 319
Survey: 176

Showing 781–800 of 866 Benchmark papers

Benchmark MEDIUM

Countermind: A Multi-Layered Security Architecture for Large Language Models

Dominik Schwarz

The security of Large Language Model (LLM) applications is fundamentally challenged by "form-first" attacks like prompt injection and jailbreaking,...

8 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Don't Walk the Line: Boundary Guidance for Filtered Generation

Sarah Ball, Andreas Haupt

Generative models are increasingly paired with safety classifiers that filter harmful or undesirable outputs. A common strategy is to fine-tune the...

8 months ago cs.LG cs.CL PDF

Benchmark MEDIUM

Information-Preserving Reformulation of Reasoning Traces for Antidistillation

Jiayu Ding, Lei Cui, Li Dong +2 more

Recent advances in Large Language Models (LLMs) show that extending the length of reasoning chains significantly improves performance on complex...

8 months ago cs.CL PDF

Benchmark LOW

Defects4C: Benchmarking Large Language Model Repair Capability with C/C++ Bugs

Jian Wang, Xiaofei Xie, Qiang Hu +4 more

Automated Program Repair (APR) plays a critical role in enhancing the quality and reliability of software systems. While substantial progress has...

8 months ago cs.SE PDF

Benchmark LOW

Judge Before Answer: Can MLLM Discern the False Premise in Question?

Jidong Li, Lingyong Fang, Haodong Zhao +2 more

Multimodal large language models (MLLMs) have witnessed astonishing advancements in recent years. Despite these successes, MLLMs remain vulnerable to...

8 months ago cs.CL cs.AI PDF

Benchmark LOW

The Hidden DNA of LLM-Generated JavaScript: Structural Patterns Enable High-Accuracy Authorship Attribution

Norbert Tihanyi, Bilel Cherif, Richard A. Dubniczky +2 more

In this paper, we present the first large-scale study exploring whether JavaScript code generated by Large Language Models (LLMs) can reveal which...

8 months ago cs.CR cs.LG PDF

Benchmark MEDIUM

One Token Embedding Is Enough to Deadlock Your Large Reasoning Model

Mohan Zhang, Yihua Zhang, Jinghan Jia +3 more

Modern large reasoning models (LRMs) exhibit impressive multi-step problem-solving via chain-of-thought (CoT) reasoning. However, this iterative...

8 months ago cs.LG cs.AI cs.CR PDF

Benchmark MEDIUM

PrediQL: Automated Testing of GraphQL APIs with LLMs

Shaolun Liu, Sina Marefat, Omar Tsai +4 more

GraphQL's flexible query model and nested data dependencies expose APIs to complex, context-dependent vulnerabilities that are difficult to uncover...

8 months ago cs.CR cs.SE PDF

Benchmark MEDIUM

SecureWebArena: A Holistic Security Evaluation Benchmark for LVLM-based Web Agents

Zonghao Ying, Yangguang Shao, Jianle Gan +9 more

Large vision-language model (LVLM)-based web agents are emerging as powerful tools for automating complex online tasks. However, when deployed in...

8 months ago cs.CR cs.CV PDF

Benchmark MEDIUM

Getting Your Indices in a Row: Full-Text Search for LLM Training Data for Real World

Ines Altemir Marinas, Anastasiia Kucherenko, Alexander Sternfeld +1 more

The performance of Large Language Models (LLMs) is determined by their training data. Despite the proliferation of open-weight LLMs, access to LLM...

8 months ago cs.CL PDF

Benchmark MEDIUM

Detecting Data Contamination from Reinforcement Learning Post-training for Large Language Models

Yongding Tao, Tian Wang, Yihong Dong +4 more

Data contamination poses a significant threat to the reliable evaluation of Large Language Models (LLMs). This issue arises when benchmark samples...

8 months ago cs.CL cs.AI cs.LG PDF

Benchmark MEDIUM

SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG

Xiaonan Si, Meilin Zhu, Simeng Qin +7 more

Retrieval-augmented generation (RAG) systems enhance large language models (LLMs) with external knowledge but are vulnerable to corpus poisoning and...

8 months ago cs.CL cs.AI PDF

Benchmark MEDIUM

CommandSans: Securing AI Agents with Surgical Precision Prompt Sanitization

Debeshee Das, Luca Beurer-Kellner, Marc Fischer +1 more

The increasing adoption of LLM agents with access to numerous tools and sensitive data significantly widens the attack surface for indirect prompt...

8 months ago cs.CR cs.AI cs.LG PDF

Benchmark HIGH

When Search Goes Wrong: Red-Teaming Web-Augmented Large Language Models

Haoran Ou, Kangjie Chen, Xingshuo Han +4 more

Large Language Models (LLMs) have been augmented with web search to overcome the limitations of the static knowledge boundary by accessing up-to-date...

8 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Mitigating Over-Refusal in Aligned Large Language Models via Inference-Time Activation Energy

Eric Hanchen Jiang, Weixuan Ou, Run Liu +8 more

Safety alignment of large language models currently faces a central challenge: existing alignment techniques often prioritize mitigating responses to...

8 months ago cs.LG cs.AI cs.CL PDF

Benchmark LOW

Fortifying LLM-Based Code Generation with Graph-Based Reasoning on Secure Coding Practices

Rupam Patir, Keyan Guo, Haipeng Cai +1 more

The code generation capabilities of Large Language Models (LLMs) have transformed the field of software development. However, this advancement also...

8 months ago cs.CR cs.AI cs.SE PDF

Benchmark MEDIUM

PEAR: Planner-Executor Agent Robustness Benchmark

Shen Dong, Mingxuan Zhang, Pengfei He +4 more

Large Language Model (LLM)-based Multi-Agent Systems (MAS) have emerged as a powerful paradigm for tackling complex, multi-step tasks across diverse...

8 months ago cs.LG PDF

Benchmark LOW

Secure-Instruct: An Automated Pipeline for Synthesizing Instruction-Tuning Datasets Using LLMs for Secure Code Generation

Junjie Li, Fazle Rabbi, Bo Yang +2 more

Although Large Language Models (LLMs) show promising solutions to automated code generation, they often produce insecure code that threatens software...

8 months ago cs.SE PDF

Benchmark MEDIUM

Exposing Citation Vulnerabilities in Generative Engines

Riku Mochizuki, Shusuke Komatsu, Souta Noguchi +1 more

We analyze answers generated by generative engines (GEs) from the perspectives of citation publishers and the content-injection barrier, defined as...

8 months ago cs.CR cs.CL cs.IR PDF

Benchmark MEDIUM

Distilling Lightweight Language Models for C/C++ Vulnerabilities

Zhiyuan Wei, Xiaoxuan Yang, Jing Sun +1 more

The increasing complexity of modern software systems exacerbates the prevalence of security vulnerabilities, posing risks of severe breaches and...

8 months ago cs.CR cs.AI PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended. It covers adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the work behind emerging AI threats.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.
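
As a rough illustration of what such a classified record might look like, the Python sketch below models a paper with a type, a relevance level, and a list of related CVEs, plus a simple filter over those fields. The names used here (`Paper`, `PaperType`, `Relevance`, `related_cves`, `filter_papers`) are hypothetical and are not AI Threat Alert's actual schema or API. The sample entries reuse titles and labels from the listing above, and their CVE lists are left empty because the listing does not show any.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class PaperType(Enum):
    ATTACK = "Attack"
    DEFENSE = "Defense"
    BENCHMARK = "Benchmark"
    SURVEY = "Survey"
    TOOL = "Tool"


class Relevance(Enum):
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"


@dataclass
class Paper:
    # Hypothetical record layout; field names are assumptions for illustration.
    title: str
    paper_type: PaperType
    relevance: Relevance
    arxiv_categories: List[str]
    related_cves: List[str] = field(default_factory=list)


def filter_papers(
    papers: List[Paper],
    paper_type: Optional[PaperType] = None,
    relevance: Optional[Relevance] = None,
) -> List[Paper]:
    """Keep papers matching every filter that is set; None means 'All'."""
    return [
        p
        for p in papers
        if (paper_type is None or p.paper_type is paper_type)
        and (relevance is None or p.relevance is relevance)
    ]


# Sample entries taken from the listing; CVE links are not shown there,
# so related_cves stays empty.
PAPERS = [
    Paper(
        "When Search Goes Wrong: Red-Teaming Web-Augmented Large Language Models",
        PaperType.BENCHMARK,
        Relevance.HIGH,
        ["cs.CR", "cs.AI"],
    ),
    Paper(
        "PEAR: Planner-Executor Agent Robustness Benchmark",
        PaperType.BENCHMARK,
        Relevance.MEDIUM,
        ["cs.LG"],
    ),
    Paper(
        "Defects4C: Benchmarking Large Language Model Repair Capability with C/C++ Bugs",
        PaperType.BENCHMARK,
        Relevance.LOW,
        ["cs.SE"],
    ),
]

if __name__ == "__main__":
    for paper in filter_papers(PAPERS, relevance=Relevance.HIGH):
        print(paper.title)
```

Treating an unset filter as "match everything" keeps the two criteria independent, so type and relevance can be combined or used alone.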

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.
