AI Security Research

AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Adversarial attacks
Model defenses
Red-teaming benchmarks
Surveys
Security tooling

Total: 3,023
Attack: 1,175
Benchmark: 866
Defense: 407
Tool: 319
Survey: 176

Showing 301–320 of 562 papers

Benchmark MEDIUM

CTI-REALM: Benchmark to Evaluate Agent Performance on Security Detection Rule Generation Capabilities

Arjun Chakraborty, Sandra Ho, Adam Cook +1 more

CTI-REALM (Cyber Threat Real World Evaluation and LLM Benchmarking) is a benchmark designed to evaluate AI agents' ability to interpret cyber threat...

3 months ago cs.CR PDF

Benchmark LOW

Visual-ERM: Reward Modeling for Visual Equivalence

Ziyu Liu, Shengyuan Ding, Xinyu Fang +7 more

Vision-to-code tasks require models to reconstruct structured visual inputs, such as charts, tables, and SVGs, into executable or structured...

3 months ago cs.CV cs.AI PDF

Benchmark MEDIUM

Test-Time Attention Purification for Backdoored Large Vision Language Models

Zhifang Zhang, Bojun Yang, Shuo He +5 more

Despite the strong multimodal performance, large vision-language models (LVLMs) are vulnerable during fine-tuning to backdoor attacks, where...

3 months ago cs.CV cs.CR PDF

Benchmark HIGH

Red-Teaming Vision-Language-Action Models via Quality Diversity Prompt Generation for Robust Robot Policies

Siddharth Srikanth, Freddie Liang, Sophie Hsu +9 more

Vision-Language-Action (VLA) models have significant potential to enable general-purpose robotic systems for a range of vision-language tasks....

3 months ago cs.RO cs.AI cs.CL PDF

Benchmark MEDIUM

Security Considerations for Artificial Intelligence Agents

Ninghui Li, Kaiyuan Zhang, Kyle Polley +1 more

This article, a lightly adapted version of Perplexity's response to NIST/CAISI Request for Information 2025-0035, details our observations and...

3 months ago cs.LG cs.AI cs.CR PDF

Benchmark LOW

Continual Learning with Vision-Language Models via Semantic-Geometry Preservation

Chiyuan He, Zihuan Qiu, Fanman Meng +4 more

Continual learning of pretrained vision-language models (VLMs) is prone to catastrophic forgetting, yet current approaches adapt to new tasks without...

3 months ago cs.CV cs.LG PDF

Benchmark MEDIUM

Understanding LLM Behavior When Encountering User-Supplied Harmful Content in Harmless Tasks

Junjie Chu, Yiting Qu, Ye Leng +4 more

Large Language Models (LLMs) are increasingly trained to align with human values, primarily focusing on task level, i.e., refusing to execute...

3 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

KEPo: Knowledge Evolution Poison on Graph-based Retrieval-Augmented Generation

Qizhi Chen, Chao Qi, Yihong Huang +5 more

Graph-based Retrieval-Augmented Generation (GraphRAG) constructs the Knowledge Graph (KG) from external databases to enhance the timeliness and...

3 months ago cs.LG cs.AI cs.CR PDF

Benchmark LOW

AutoVeriFix+: High-Correctness RTL Generation via Trace-Aware Causal Fix and Semantic Redundancy Pruning

Yan Tan, Xiangchen Meng, Zijun Jiang +1 more

Large language models (LLMs) have demonstrated impressive capabilities in generating software code for high-level programming languages such as...

3 months ago cs.PL cs.AR PDF

Benchmark LOW

Follow the Saliency: Supervised Saliency for Retrieval-augmented Dense Video Captioning

Seung hee Choi, MinJu Jeon, Hyunwoo Oh +2 more

Existing retrieval-augmented approaches for Dense Video Captioning (DVC) often fail to achieve accurate temporal segmentation aligned with true event...

3 months ago cs.CV PDF

Benchmark LOW

Security-by-Design for LLM-Based Code Generation: Leveraging Internal Representations for Concept-Driven Steering Mechanisms

Maximilian Wendlinger, Daniel Kowatsch, Konstantin Böttinger +1 more

Large Language Models (LLMs) show remarkable capabilities in understanding natural language and generating complex code. However, as practitioners...

3 months ago cs.CR cs.LG PDF

Benchmark MEDIUM

TOSSS: a CVE-based Software Security Benchmark for Large Language Models

Marc Damie, Murat Bilgehan Ertan, Domenico Essoussi +3 more

With their increasing capabilities, Large Language Models (LLMs) are now used across many industries. They have become useful tools for software...

3 months ago cs.LG cs.CL cs.CR PDF

Benchmark MEDIUM

IH-Challenge: A Training Dataset to Improve Instruction Hierarchy on Frontier LLMs

Chuan Guo, Juan Felipe Ceron Uribe, Sicheng Zhu +10 more

Instruction hierarchy (IH) defines how LLMs prioritize system, developer, user, and tool instructions under conflict, providing a concrete,...

3 months ago cs.AI cs.CL cs.CR PDF

Benchmark MEDIUM

CLaRE-ty Amid Chaos: Quantifying Representational Entanglement to Predict Ripple Effects in LLM Editing

Manit Baser, Alperen Yildiz, Dinil Mon Divakaran +1 more

The static knowledge representations of large language models (LLMs) inevitably become outdated or incorrect over time. While model-editing...

3 months ago cs.LG PDF

Benchmark MEDIUM

Why LLMs Fail: A Failure Analysis and Partial Success Measurement for Automated Security Patch Generation

Amir Al-Maamari

Large Language Models (LLMs) show promise for Automated Program Repair (APR), yet their effectiveness on security vulnerabilities remains poorly...

3 months ago cs.CR cs.AI PDF

Benchmark LOW

AULLM++: Structural Reasoning with Large Language Models for Micro-Expression Recognition

Zhishu Liu, Kaishen Yuan, Bo Zhao +2 more

Micro-expression Action Unit (AU) detection identifies localized AUs from subtle facial muscle activations, providing a foundation for decoding...

3 months ago cs.CV PDF

Benchmark MEDIUM

Models as Lego Builders: Assembling Malice from Benign Blocks via Semantic Blueprints

Chenxi Li, Xianggan Liu, Dake Shen +9 more

Despite the rapid progress of Large Vision-Language Models (LVLMs), the integration of visual modalities introduces new safety vulnerabilities that...

3 months ago cs.CV cs.LG PDF

Benchmark MEDIUM

Backdoor4Good: Benchmarking Beneficial Uses of Backdoors in LLMs

Yige Li, Wei Zhao, Zhe Li +6 more

Backdoor mechanisms have traditionally been studied as security threats that compromise the integrity of machine learning models. However, the same...

3 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Governance Architecture for Autonomous Agent Systems: Threats, Framework, and Engineering Practice

Yuxu Ge

Autonomous agents powered by large language models introduce a class of execution-layer vulnerabilities -- prompt injection, retrieval poisoning, and...

3 months ago cs.CR cs.AI PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.
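As a rough illustration of the classification described above, the sketch below models an indexed paper as a small record and filters a list by type and relevance. The class name, field names, and helper functions are hypothetical and are not AI Threat Alert's actual schema or API; the sample titles and labels are taken from the listing on this page, and the cross-reference field is left empty rather than filled with invented CVE IDs.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    """Hypothetical record for one indexed paper (not the site's real schema)."""
    title: str
    paper_type: str   # assumed values: attack, defense, benchmark, survey, tool
    relevance: str    # assumed values: high, medium, low
    categories: list[str] = field(default_factory=list)    # arXiv categories
    related_cves: list[str] = field(default_factory=list)  # cross-references, if any


def filter_papers(papers, paper_type=None, relevance=None):
    """Return papers matching the given type and relevance; None means no filter."""
    return [
        p for p in papers
        if (paper_type is None or p.paper_type == paper_type)
        and (relevance is None or p.relevance == relevance)
    ]


def page(items, number, size=20):
    """Return the 1-indexed page `number` of `items`, `size` items per page."""
    start = (number - 1) * size
    return items[start:start + size]


papers = [
    Paper("TOSSS: a CVE-based Software Security Benchmark for Large Language Models",
          "benchmark", "medium", ["cs.LG", "cs.CL", "cs.CR"]),
    Paper("Visual-ERM: Reward Modeling for Visual Equivalence",
          "benchmark", "low", ["cs.CV", "cs.AI"]),
    Paper("Red-Teaming Vision-Language-Action Models via Quality Diversity "
          "Prompt Generation for Robust Robot Policies",
          "benchmark", "high", ["cs.RO", "cs.AI", "cs.CL"]),
]

high_relevance = filter_papers(papers, paper_type="benchmark", relevance="high")
print([p.title for p in high_relevance])
```

With 20 results per page, `page(results, 16)` would return items 301 to 320 of a result list, which is how a view such as "Showing 301–320 of 562 papers" could be produced under these assumptions.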

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.
