AI Security Research

AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Adversarial attacks
Model defenses
Red-teaming benchmarks
Surveys
Security tooling

Total

3,023
Attack

1,175
Benchmark

866
Defense

407
Tool

319
Survey

176

Type

All Attack Defense Survey Benchmark Tool

Relevance

All High Medium

Date

All time 7 days 30 days 6 months

Showing 181–200 of 562 papers

Clear filters

Benchmark MEDIUM

System-aware contextual digital twin for ICS anomaly diagnosis

Eungyu Woo, Yooshin Kim, Wonje Heo +1 more

Industrial Control Systems (ICS) integrate computing, physical processes, and communication to operate critical infrastructures such as power grids,...

2 months ago cs.CR PDF

Benchmark HIGH

Evaluation of Prompt Injection Defenses in Large Language Models

Priyal Deep, Shane Emmons, Amy Fox +3 more

LLM-powered applications routinely embed secrets in system prompts, yet models can be tricked into revealing them. We built an adaptive attacker that...

2 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms

Qi Li, Bo Yin, Weiqi Huang +6 more

Vision-Language-Action (VLA) models are emerging as a unified substrate for embodied intelligence. This shift raises a new class of safety...

2 months ago cs.RO PDF

Benchmark LOW

When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs

Pegah Khayatan, Jayneel Parekh, Arnaud Dapogny +3 more

Despite impressive progress in capabilities of large vision-language models (LVLMs), these systems remain vulnerable to hallucinations, i.e., outputs...

2 months ago cs.CV cs.AI cs.CL PDF

Benchmark MEDIUM

CSC: Turning the Adversary's Poison against Itself

Yuchen Shi, Xin Guo, Huajie Chen +3 more

Poisoning-based backdoor attacks pose significant threats to deep neural networks by embedding triggers in training data, causing models to...

2 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Supervised Learning Has a Necessary Geometric Blind Spot: Theory, Consequences, and Minimal Repair

Vishal Rajput

We prove that empirical risk minimisation (ERM) imposes a necessary geometric constraint on learned representations: any encoder that minimises...

2 months ago cs.LG cs.AI cs.CV PDF

Benchmark LOW

Understanding and Mitigating Spurious Signal Amplification in Test-Time Reinforcement Learning for Math Reasoning

Yongcan Yu, Lingxiao He, Jian Liang +5 more

Test-time reinforcement learning (TTRL) always adapts models at inference time via pseudo-labeling, leaving it vulnerable to spurious optimization...

2 months ago cs.LG cs.AI cs.CL PDF

Benchmark MEDIUM

Cross-Session Threats in AI Agents: Benchmark, Evaluation, and Algorithms

Ari Azarafrooz

AI-agent guardrails are memoryless: each message is judged in isolation, so an adversary who spreads a single attack across dozens of sessions slips...

2 months ago cs.CR cs.AI cs.CL PDF

Benchmark MEDIUM

Residual Risk Analysis in Benign Code: How Far Are We? A Multi-Model Semantic and Structural Similarity Approach

Mohammad Farhad, Shuvalaxmi Dass

Software security relies on effective vulnerability detection and patching, yet determining whether a patch fully eliminates risk remains an...

2 months ago cs.SE cs.CR PDF

Benchmark HIGH

Synthesizing Multi-Agent Harnesses for Vulnerability Discovery

Hanzhi Liu, Chaofan Shou, Xiaonan Liu +4 more

LLM agents have begun to find real security vulnerabilities that human auditors and automated fuzzers missed for decades, in source-available targets...

2 months ago cs.CR PDF

Benchmark MEDIUM

Anchor-and-Resume Concession Under Dynamic Pricing for LLM-Augmented Freight Negotiation

Hoang Nguyen, Lu Wang, Marta Gaia Bras

Freight brokerages negotiate thousands of carrier rates daily under dynamic pricing conditions where models frequently revise targets...

2 months ago cs.MA cs.AI cs.CL PDF

Benchmark MEDIUM

Towards Secure Logging: Characterizing and Benchmarking Logging Code Security Issues with LLMs

He Yang Yuan, Xin Wang, Kundi Yao +3 more

Logging code plays an important role in software systems by recording key events and behaviors, which are essential for debugging and monitoring....

2 months ago cs.SE cs.AI cs.CR PDF

Benchmark MEDIUM

Indic-CodecFake meets SATYAM: Towards Detecting Neural Audio Codec Synthesized Speech Deepfakes in Indic Languages

Girish, Mohd Mujtaba Akhtar, Orchid Chetia Phukan +1 more

The rapid advancement of Audio Large Language Models (ALMs), driven by Neural Audio Codecs (NACs), has led to the emergence of highly realistic...

2 months ago eess.AS PDF

Benchmark MEDIUM

An AI Agent Execution Environment to Safeguard User Data

Robert Stanley, Avi Verma, Lillian Tsai +2 more

AI agents promise to serve as general-purpose personal assistants for their users, which requires them to have access to private user data (e.g.,...

2 months ago cs.CR cs.AI cs.OS PDF

Benchmark MEDIUM

Cyber Defense Benchmark: Agentic Threat Hunting Evaluation for LLMs in SecOps

Alankrit Chona, Igor Kozlov, Ambuj Kumar

We introduce the Cyber Defense Benchmark, a benchmark for measuring how well large language model (LLM) agents perform the core SOC analyst task of...

2 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Cyber Defense Benchmark: Agentic Threat Hunting Evaluation for LLMs in SecOps

Alankrit Chona, Igor Kozlov, Ambuj Kumar

We introduce the Cyber Defense Benchmark, a benchmark for measuring how well large language model (LLM) agents perform the core SOC analyst task of...

2 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Do Agents Dream of Root Shells? Partial-Credit Evaluation of LLM Agents in Capture The Flag Challenges

Ali Al-Kaswan, Maksim Plotnikov, Maxim Hájek +3 more

Large Language Model (LLM) agents are increasingly proposed for autonomous cybersecurity tasks, but their capabilities in realistic offensive...

2 months ago cs.AI cs.CR cs.SE PDF

Benchmark HIGH

HarDBench: A Benchmark for Draft-Based Co-Authoring Jailbreak Attacks for Safe Human-LLM Collaborative Writing

Euntae Kim, Soomin Han, Buru Chang

Large language models (LLMs) are increasingly used as co-authors in collaborative writing, where users begin with rough drafts and rely on LLMs to...

2 months ago cs.CL PDF

Benchmark MEDIUM

Towards Understanding the Robustness of Sparse Autoencoders

Ahson Saiyed, Sabrina Sadiekh, Chirag Agarwal

Large Language Models (LLMs) remain vulnerable to optimization-based jailbreak attacks that exploit internal gradient structure. While Sparse...

2 months ago cs.LG cs.AI cs.CL PDF

Benchmark MEDIUM

AgenTEE: Confidential LLM Agent Execution on Edge Devices

Sina Abdollahi, Mohammad M Maheri, Javad Forough +5 more

Large Language Model (LLM) agents provide powerful automation capabilities, but they also create a substantially broader attack surface than...

2 months ago cs.CR cs.OS PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.

Track AI security vulnerabilities in real time

Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.

Start 14-Day Free Trial