AI Security Research

AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Adversarial attacks
Model defenses
Red-teaming benchmarks
Surveys
Security tooling

Total

3,023
Attack

1,175
Benchmark

866
Defense

407
Tool

319
Survey

176

Type

All Attack Defense Survey Benchmark Tool

Relevance

All High Medium

Date

All time 7 days 30 days 6 months

Showing 161–180 of 521 papers

Clear filters

Benchmark MEDIUM

Design Principles for the Construction of a Benchmark Evaluating Security Operation Capabilities of Multi-agent AI Systems

Yicheng Cai, Mitchell John DeStefano, Guodong Dong +5 more

As Large Language Models (LLMs) and multi-agent AI systems are demonstrating increasing potential in cybersecurity operations, organizations,...

2 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Evaluating Privilege Usage of Agents on Real-World Tools

Quan Zhang, Lianhang Fu, Lvsi Lian +5 more

Equipping LLM agents with real-world tools can substantially improve productivity. However, granting agents autonomy over tool use also transfers the...

2 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Seeing to Ground: Visual Attention for Hallucination-Resilient MDLLMs

Vishal Narnaware, Animesh Gupta, Kevin Zhai +2 more

Multimodal Diffusion Large Language Models (MDLLMs) achieve high-concurrency generation through parallel masked decoding, yet the architectures...

3 months ago cs.CV PDF

Benchmark MEDIUM

Unveiling the Resilience of LLM-Enhanced Search Engines against Black-Hat SEO Manipulation

Pei Chen, Geng Hong, Xinyi Wu +6 more

The emergence of Large Language Model-enhanced Search Engines (LLMSEs) has revolutionized information retrieval by integrating web-scale search...

3 months ago cs.CR cs.IR PDF

Benchmark MEDIUM

Environment-Grounded Multi-Agent Workflow for Autonomous Penetration Testing

Michael Somma, Markus Großpointner, Paul Zabalegui +2 more

The increasing complexity and interconnectivity of digital infrastructures make scalable and reliable security assessment methods essential. Robotic...

3 months ago cs.RO cs.AI PDF

Benchmark MEDIUM

Walma: Learning to See Memory Corruption in WebAssembly

Oussama Draissi, Mark Günzel, Ahmad-Reza Sadeghi +1 more

WebAssembly's (Wasm) monolithic linear memory model facilitates memory corruption attacks that can escalate to cross-site scripting in browsers or go...

3 months ago cs.CR cs.LG PDF

Benchmark MEDIUM

Do World Action Models Generalize Better than VLAs? A Robustness Study

Zhanguang Zhang, Zhiyuan Li, Behnam Rahmati +10 more

Robot action planning in the real world is challenging as it requires not only understanding the current state of the environment but also predicting...

3 months ago cs.RO PDF

Benchmark MEDIUM

SecureBreak -- A dataset towards safe and secure models

Marco Arazzi, Vignesh Kumar Kembu, Antonino Nocera

Large language models are becoming pervasive core components in many real-world applications. As a consequence, security alignment represents a...

3 months ago cs.CR cs.AI cs.CL PDF

Benchmark MEDIUM

Unveiling the Security Risks of Federated Learning in the Wild: From Research to Practice

Jiahao Chen, Zhiming Zhao, Yuwen Pu +4 more

Federated learning (FL) has attracted substantial attention in both academia and industry, yet its practical security posture remains poorly...

3 months ago cs.CR PDF

Benchmark MEDIUM

LJ-Bench: Ontology-Based Benchmark for U.S. Crime

Hung Yun Tseng, Wuzhen Li, Blerina Gkotse +1 more

The potential of Large Language Models (LLMs) to provide harmful information remains a significant concern due to the vast breadth of illegal queries...

3 months ago cs.LG PDF

Benchmark MEDIUM

The production of meaning in the processing of natural language

Christopher J. Agostino, Quan Le Thien, Nayan D'Souza +1 more

Understanding the fundamental mechanisms governing the production of meaning in the processing of natural language is critical for designing safe,...

3 months ago cs.CL cs.AI cs.HC PDF

Benchmark MEDIUM

Trojan's Whisper: Stealthy Manipulation of OpenClaw through Injected Bootstrapped Guidance

Fazhong Liu, Zhuoyan Chen, Tu Lan +6 more

Autonomous coding agents are increasingly integrated into software development workflows, offering capabilities that extend beyond code suggestion to...

3 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Functional Subspace Watermarking for Large Language Models

Zikang Ding, Junhao Li, Suling Wu +3 more

Model watermarking utilizes internal representations to protect the ownership of large language models (LLMs). However, these features inevitably...

3 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Parameter-Efficient Modality-Balanced Symmetric Fusion for Multimodal Remote Sensing Semantic Segmentation

Haocheng Li, Juepeng Zheng, Shuangxi Miao +4 more

Multimodal remote sensing semantic segmentation enhances scene interpretation by exploiting complementary physical cues from heterogeneous data....

3 months ago cs.CV PDF

Benchmark MEDIUM

WeatherReasonSeg: A Benchmark for Weather-Aware Reasoning Segmentation in Visual Language Models

Wanjun Du, Zifeng Yuan, Tingting Chen +3 more

Existing vision-language models (VLMs) have demonstrated impressive performance in reasoning-based segmentation. However, current benchmarks are...

3 months ago cs.CV cs.AI PDF

Benchmark MEDIUM

VeriGrey: Greybox Agent Validation

Yuntong Zhang, Sungmin Kang, Ruijie Meng +2 more

Agentic AI has been a topic of great interest recently. A Large Language Model (LLM) agent involves one or more LLMs in the back-end. In the front...

3 months ago cs.AI PDF

Benchmark MEDIUM

Differential Harm Propensity in Personalized LLM Agents: The Curious Case of Mental Health Disclosure

Caglar Yildirim

Large language models (LLMs) are increasingly deployed as tool-using agents, shifting safety concerns from harmful text generation to harmful task...

3 months ago cs.AI PDF

Benchmark MEDIUM

CoMAI: A Collaborative Multi-Agent Framework for Robust and Equitable Interview Evaluation

Gengxin Sun, Ruihao Yu, Liangyi Yin +3 more

Ensuring robust and fair interview assessment remains a key challenge in AI-driven evaluation. This paper presents CoMAI, a general-purpose...

3 months ago cs.MA cs.AI PDF

Benchmark MEDIUM

SFCoT: Safer Chain-of-Thought via Active Safety Evaluation and Calibration

Yu Pan, Wenlong Yu, Tiejun Wu +4 more

Large language models (LLMs) have demonstrated remarkable capabilities in complex reasoning tasks. However, they remain highly susceptible to...

3 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Directional Embedding Smoothing for Robust Vision Language Models

Ye Wang, Jing Liu, Toshiaki Koike-Akino

The safety and reliability of vision-language models (VLMs) are a crucial part of deploying trustworthy agentic AI systems. However, VLMs remain...

3 months ago cs.LG cs.AI cs.CL PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.

Track AI security vulnerabilities in real time

Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.

Start 14-Day Free Trial