AI Security Research

AI Threat Alert indexes 3,037+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Total: 3,037
Attack: 1,183
Benchmark: 868
Defense: 410
Tool: 319
Survey: 177

Showing 641–660 of 951 papers

Attack MEDIUM

vLLM Semantic Router: Signal Driven Decision Routing for Mixture-of-Modality Models

Xunzhuo Liu, Huamin Chen, Samzong Lu +27 more

As large language models (LLMs) diversify across modalities, capabilities, and cost profiles, the problem of intelligent request routing -- selecting...

4 months ago cs.NI cs.AI

Tool MEDIUM

LLM-enabled Applications Require System-Level Threat Monitoring

Yedi Zhang, Haoyu Wang, Xianglin Yang +2 more

LLM-enabled applications are rapidly reshaping the software ecosystem by using large language models as core reasoning components for complex task...

4 months ago cs.CR cs.AI cs.SE

Attack MEDIUM

Efficient Multi-Party Secure Comparison over Different Domains with Preprocessing Assistance

Kaiwen Wang, Xiaolin Chang, Yuehan Dong +1 more

Secure comparison is a fundamental primitive in multi-party computation, supporting privacy-preserving applications such as machine learning and data...

4 months ago cs.CR

Benchmark MEDIUM

CIBER: A Comprehensive Benchmark for Security Evaluation of Code Interpreter Agents

Lei Ba, Qinbin Li, Songze Li

LLM-based code interpreter agents are increasingly deployed in critical workflows, yet their robustness against risks introduced by their code...

4 months ago cs.CR

Benchmark MEDIUM

CodeHacker: Automated Test Case Generation for Detecting Vulnerabilities in Competitive Programming Solutions

Jingwei Shi, Xinxiang Yin, Jing Huang +2 more

The evaluation of Large Language Models (LLMs) for code generation relies heavily on the quality and robustness of test cases. However, existing...

4 months ago cs.SE cs.AI cs.CR

Tool MEDIUM

ILION: Deterministic Pre-Execution Safety Gates for Agentic AI Systems

Florin Adrian Chitan

The proliferation of autonomous AI agents capable of executing real-world actions - filesystem operations, API calls, database modifications,...

4 months ago cs.AI cs.CR

Survey MEDIUM

LLM Scalability Risk for Agentic-AI and Model Supply Chain Security

Kiarash Ahi, Vaibhav Agrawal, Saeed Valizadeh

Large Language Models (LLMs) & Generative AI are transforming cybersecurity, enabling both advanced defenses and new attacks. Organizations now use...

4 months ago cs.CR

Tool MEDIUM

AMV-L: Lifecycle-Managed Agent Memory for Tail-Latency Control in Long-Running LLM Systems

Emmanuel Bamidele

Long-running LLM agents require persistent memory to preserve state across interactions, yet most deployed systems manage memory with age-based...

4 months ago cs.DC cs.AI cs.LG

Benchmark MEDIUM

LoMime: Query-Efficient Membership Inference using Model Extraction in Label-Only Settings

Abdullah Caglar Oksuz, Anisa Halimi, Erman Ayday

Membership inference attacks (MIAs) threaten the privacy of machine learning models by revealing whether a specific data point was used during...

4 months ago cs.LG cs.CR

Defense MEDIUM

MANATEE: Inference-Time Lightweight Diffusion Based Safety Defense for LLMs

Chun Yan Ryan Kan, Tommy Tran, Vedant Yadav +4 more

Defending LLMs against adversarial jailbreak attacks remains an open challenge. Existing defenses rely on binary classifiers that fail when...

4 months ago cs.CR cs.AI cs.CL

Attack MEDIUM

AndroWasm: an Empirical Study on Android Malware Obfuscation through WebAssembly

Diego Soi, Silvia Lucia Sanna, Lorenzo Pisu +2 more

In recent years, stealthy Android malware has increasingly adopted sophisticated techniques to bypass automatic detection mechanisms and harden...

4 months ago cs.CR

Benchmark MEDIUM

Asking Forever: Universal Activations Behind Turn Amplification in Conversational LLMs

Zachary Coalson, Bo Fang, Sanghyun Hong

Multi-turn interaction length is a dominant factor in the operational costs of conversational LLMs. In this work, we present a new failure mode in...

4 months ago cs.LG cs.CR

Benchmark MEDIUM

What Makes a Good LLM Agent for Real-world Penetration Testing?

Gelei Deng, Yi Liu, Yuekang Li +5 more

LLM-based agents show promise for automating penetration testing, yet reported performance varies widely across systems and benchmarks. We analyze 28...

4 months ago cs.CR cs.SE

Survey MEDIUM

What Breaks Embodied AI Security: LLM Vulnerabilities, CPS Flaws, or Something Else?

Boyang Ma, Hechuan Guo, Peizhuo Lv +5 more

Embodied AI systems (e.g., autonomous vehicles, service robots, and LLM-driven interactive agents) are rapidly transitioning from controlled...

4 months ago cs.CR cs.AI

Defense MEDIUM

Fail-Closed Alignment for Large Language Models

Zachary Coalson, Beth Sohler, Aiden Gabriel +1 more

We identify a structural weakness in current large language model (LLM) alignment: modern refusal mechanisms are fail-open. While existing approaches...

4 months ago cs.LG cs.CR

Tool MEDIUM

Mind the GAP: Text Safety Does Not Transfer to Tool-Call Safety in LLM Agents

Arnold Cartagena, Ariane Teixeira

Large language models deployed as agents increasingly interact with external systems through tool calls--actions with real-world consequences that...

4 months ago cs.AI cs.SE

Attack MEDIUM

DeepContext: Stateful Real-Time Detection of Multi-Turn Adversarial Intent Drift in LLMs

Justin Albrethsen, Yash Datta, Kunal Kumar +1 more

While Large Language Model (LLM) capabilities have scaled, safety guardrails remain largely stateless, treating multi-turn dialogues as a series of...

4 months ago cs.AI cs.ET cs.LG

Defense MEDIUM

NeST: Neuron Selective Tuning for LLM Safety

Sasha Behrouzi, Lichao Wu, Mohamadreza Rostami +1 more

Safety alignment is essential for the responsible deployment of large language models (LLMs). Yet, existing approaches often rely on heavyweight...

4 months ago cs.CR cs.LG

Benchmark MEDIUM

Large-scale online deanonymization with LLMs

Simon Lermen, Daniel Paleka, Joshua Swanson +3 more

We show that large language models can be used to perform at-scale deanonymization. With full Internet access, our agent can re-identify Hacker News...

4 months ago cs.CR cs.AI cs.LG

Attack MEDIUM

Policy Compiler for Secure Agentic Systems

Nils Palumbo, Sarthak Choudhary, Jihye Choi +2 more

LLM-based agents are increasingly being deployed in contexts requiring complex authorization policies: customer service protocols, approval...

4 months ago cs.CR cs.AI cs.MA

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,037+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.

