AI Security Research

AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Adversarial attacks
Model defenses
Red-teaming benchmarks
Surveys
Security tooling

Total: 3,023
Attack: 1,175
Benchmark: 866
Defense: 407
Tool: 319
Survey: 176

Showing 681–700 of 866 papers

Benchmark MEDIUM

BackWeak: Backdooring Knowledge Distillation Simply with Weak Triggers and Fine-tuning

Shanmin Wang, Dongdong Zhao

Knowledge Distillation (KD) is essential for compressing large models, yet relying on pre-trained "teacher" models downloaded from third-party...

7 months ago cs.CR cs.AI cs.CV PDF

Benchmark LOW

SCRUTINEER: Detecting Logic-Level Usage Violations of Reusable Components in Smart Contracts

Xingshuang Lin, Binbin Zhao, Jinwen Wang +3 more

Smart Contract Reusable Components (SCRs) play a vital role in accelerating the development of business-specific contracts by promoting modularity and...

7 months ago cs.SE cs.CR PDF

Benchmark MEDIUM

SEAL: Subspace-Anchored Watermarks for LLM Ownership

Yanbo Dai, Zongjie Li, Zhenlan Ji +1 more

Large language models (LLMs) have achieved remarkable success across a wide range of natural language processing tasks, demonstrating human-level...

7 months ago cs.CR PDF

Benchmark MEDIUM

PATCHEVAL: A New Benchmark for Evaluating LLMs on Patching Real-World Vulnerabilities

Zichao Wei, Jun Zeng, Ming Wen +8 more

Software vulnerabilities are increasing at an alarming rate. However, manual patching is both time-consuming and resource-intensive, while existing...

7 months ago cs.CR cs.SE PDF

Benchmark MEDIUM

Robustness of LLM-enabled vehicle trajectory prediction under data security threats

Feilong Wang, Fuqiang Liu

The integration of large language models (LLMs) into automated driving systems has opened new possibilities for reasoning and decision-making by...

7 months ago cs.LG cs.AI cs.CR PDF

Benchmark MEDIUM

Synthetic Voices, Real Threats: Evaluating Large Text-to-Speech Models in Generating Harmful Audio

Guangke Chen, Yuhui Wang, Shouling Ji +2 more

Modern text-to-speech (TTS) systems, particularly those built on Large Audio-Language Models (LALMs), generate high-fidelity speech that faithfully...

7 months ago cs.SD cs.AI cs.CR PDF

Benchmark MEDIUM

Can AI Models be Jailbroken to Phish Elderly Victims? An End-to-End Evaluation

Fred Heiding, Simon Lermen

We present an end-to-end demonstration of how attackers can exploit AI safety failures to harm vulnerable populations: from jailbreaking LLMs to...

7 months ago cs.CR cs.AI cs.CY PDF

Benchmark LOW

OutSafe-Bench: A Benchmark for Multimodal Offensive Content Detection in Large Language Models

Yuping Yan, Yuhan Xie, Yuanshuai Li +3 more

Since Multimodal Large Language Models (MLLMs) are increasingly being integrated into everyday tools and intelligent agents, growing concerns have...

7 months ago cs.LG cs.CL PDF

Benchmark LOW

CTRL-ALT-DECEIT: Sabotage Evaluations for Automated AI R&D

Francis Rhys Ward, Teun van der Weij, Hanna Gábor +6 more

AI systems are increasingly able to autonomously conduct realistic software engineering tasks, and may soon be deployed to automate machine learning...

7 months ago cs.AI PDF

Benchmark MEDIUM

Taught by the Flawed: How Dataset Insecurity Breeds Vulnerable AI Code

Catherine Xia, Manar H. Alalfi

AI programming assistants have demonstrated a tendency to generate code containing basic security vulnerabilities. While developers are ultimately...

7 months ago cs.CR cs.AI PDF

Benchmark LOW

CARScenes: Semantic VLM Dataset for Safe Autonomous Driving

Yuankai He, Weisong Shi

CAR-Scenes is a frame-level dataset for autonomous driving that enables training and evaluation of vision-language models (VLMs) for interpretable,...

7 months ago cs.CV cs.RO PDF

Benchmark LOW

Toward Honest Language Models for Deductive Reasoning

Jiarui Liu, Kaustubh Dhole, Yingheng Wang +7 more

Deductive reasoning is the process of deriving conclusions strictly from the given premises, without relying on external knowledge. We define honesty...

7 months ago cs.CL PDF

Benchmark MEDIUM

One Signature, Multiple Payments: Demystifying and Detecting Signature Replay Vulnerabilities in Smart Contracts

Zexu Wang, Jiachi Chen, Zewei Lin +7 more

Smart contracts have significantly advanced blockchain technology, and digital signatures are crucial for reliable verification of contract...

7 months ago cs.CR cs.SE PDF

Benchmark LOW

Preference is More Than Comparisons: Rethinking Dueling Bandits with Augmented Human Feedback

Shengbo Wang, Hong Sun, Ke Li

Interactive preference elicitation (IPE) aims to substantially reduce human effort while acquiring human preferences in wide personalization systems....

7 months ago cs.LG PDF

Benchmark MEDIUM

DeepTracer: Tracing Stolen Model via Deep Coupled Watermarks

Yunfei Yang, Xiaojun Chen, Yuexin Xuan +3 more

Model watermarking techniques can embed watermark information into the protected model for ownership declaration by constructing specific...

7 months ago cs.CR cs.LG PDF

Benchmark MEDIUM

Robust Backdoor Removal by Reconstructing Trigger-Activated Changes in Latent Representation

Kazuki Iwahana, Yusuke Yamasaki, Akira Ito +2 more

Backdoor attacks pose a critical threat to machine learning models, causing them to behave normally on clean data but misclassify poisoned data into...

7 months ago cs.LG cs.CR PDF

Benchmark MEDIUM

From LLMs to Agents: A Comparative Evaluation of LLMs and LLM-based Agents in Security Patch Detection

Junxiao Han, Zheng Yu, Lingfeng Bao +5 more

The widespread adoption of open-source software (OSS) has accelerated software innovation but also increased security risks due to the rapid...

7 months ago cs.CR cs.SE PDF

Benchmark HIGH

MSCR: Exploring the Vulnerability of LLMs' Mathematical Reasoning Abilities Using Multi-Source Candidate Replacement

Zhishen Sun, Guang Dai, Haishan Ye

LLMs demonstrate performance comparable to human abilities in complex tasks such as mathematical reasoning, but their robustness in mathematical...

7 months ago cs.AI PDF

Benchmark LOW

Probabilities Are All You Need: A Probability-Only Approach to Uncertainty Estimation in Large Language Models

Manh Nguyen, Sunil Gupta, Hung Le

Large Language Models (LLMs) exhibit strong performance across various natural language processing (NLP) tasks but remain vulnerable to...

7 months ago cs.LG PDF

Benchmark MEDIUM

Breaking the Stealth-Potency Trade-off in Clean-Image Backdoors with Generative Trigger Optimization

Binyan Xu, Fan Yang, Di Tang +2 more

Clean-image backdoor attacks, which use only label manipulation in training datasets to compromise deep neural networks, pose a significant threat to...

7 months ago cs.CV cs.CR cs.LG PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.
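
As a rough illustration of what classifying by type and relevance makes possible, the sketch below models a few entries from this index as records and filters them. The `Paper` class, its field names, and the `filter_papers` helper are hypothetical and only for illustration; they are not AI Threat Alert's actual schema or API. The `related_cves` field is likewise an assumed placeholder and is left empty.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """Hypothetical record for one indexed paper (not the site's real schema)."""
    title: str
    paper_type: str                     # "Attack", "Defense", "Survey", "Benchmark", or "Tool"
    relevance: str                      # "HIGH", "MEDIUM", or "LOW"
    arxiv_categories: tuple[str, ...]
    related_cves: tuple[str, ...] = ()  # assumed cross-reference field, empty here


def filter_papers(papers, paper_type=None, relevance=None):
    """Return the papers matching both filters; None means no restriction."""
    return [
        p for p in papers
        if (paper_type is None or p.paper_type == paper_type)
        and (relevance is None or p.relevance == relevance)
    ]


# Three entries copied from the listing on this page.
PAPERS = [
    Paper("SEAL: Subspace-Anchored Watermarks for LLM Ownership",
          "Benchmark", "MEDIUM", ("cs.CR",)),
    Paper("CTRL-ALT-DECEIT: Sabotage Evaluations for Automated AI R&D",
          "Benchmark", "LOW", ("cs.AI",)),
    Paper("MSCR: Exploring the Vulnerability of LLMs' Mathematical Reasoning "
          "Abilities Using Multi-Source Candidate Replacement",
          "Benchmark", "HIGH", ("cs.AI",)),
]

high_relevance = filter_papers(PAPERS, paper_type="Benchmark", relevance="HIGH")
for paper in high_relevance:
    print(paper.title)
```

Keeping type and relevance as separate fields lets the two filters combine independently, in the same way the listing can be narrowed by one or both.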

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.
