AI Security Research

AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Total: 3,023
Attack: 1,175
Benchmark: 866
Defense: 407
Tool: 319
Survey: 176

Showing 321–340 of 388 papers

Attack MEDIUM

ShadowLogic: Backdoors in Any Whitebox LLM

Kasimir Schulz, Amelia Kawasaki, Leo Ring

Large language models (LLMs) are widely deployed across various applications, often with safeguards to prevent the generation of harmful or...

7 months ago cs.CR cs.AI PDF

Attack MEDIUM

Diffusion LLMs are Natural Adversaries for any LLM

David Lüdke, Tom Wollschläger, Paul Ungermann +2 more

We introduce a novel framework that transforms the resource-intensive (adversarial) prompt optimization problem into an efficient, amortized...

7 months ago cs.LG stat.ML PDF

Attack MEDIUM

Measuring the Security of Mobile LLM Agents under Adversarial Prompts from Untrusted Third-Party Channels

Chenghao Du, Quanfeng Huang, Tingxuan Tang +3 more

Large Language Models (LLMs) have transformed software development, enabling AI-powered applications known as LLM-based agents that promise to...

7 months ago cs.CR PDF

Attack MEDIUM

PVMark: Enabling Public Verifiability for LLM Watermarking Schemes

Haohua Duan, Liyao Xiang, Xin Zhang

Watermarking schemes for large language models (LLMs) have been proposed to identify the source of the generated text, mitigating the potential...

8 months ago cs.CR cs.CL cs.LG PDF

Attack MEDIUM

PEEL: A Poisoning-Exposing Encoding Theoretical Framework for Local Differential Privacy

Lisha Shuai, Jiuling Dong, Nan Zhang +5 more

Local Differential Privacy (LDP) is a widely adopted privacy-protection model in the Internet of Things (IoT) due to its lightweight, decentralized,...

8 months ago cs.CR PDF

Attack MEDIUM

SmoothGuard: Defending Multimodal Large Language Models with Noise Perturbation and Clustering Aggregation

Guangzhi Su, Shuchang Huang, Yutong Ke +3 more

Multimodal large language models (MLLMs) have achieved impressive performance across diverse tasks by jointly reasoning over textual and visual...

8 months ago cs.LG cs.CR PDF

Attack MEDIUM

S3C2 Summit 2025-03: Industry Secure Supply Chain Summit

Elizabeth Lin, Jonah Ghebremichael, William Enck +5 more

Software supply chains, while providing immense economic and software development value, are only as strong as their weakest link. Over the past...

8 months ago cs.CR PDF

Attack MEDIUM

Retracing the Past: LLMs Emit Training Data When They Get Lost

Myeongseob Ko, Nikhil Reddy Billa, Adam Nguyen +3 more

The memorization of training data in large language models (LLMs) poses significant privacy and copyright concerns. Existing data extraction methods,...

8 months ago cs.CL cs.AI PDF

Attack MEDIUM

Is Your Prompt Poisoning Code? Defect Induction Rates and Security Mitigation Strategies

Bin Wang, YiLu Zhong, MiDi Wan +4 more

Large language models (LLMs) have become indispensable for automated code generation, yet the quality and security of their outputs remain a critical...

8 months ago cs.CR cs.AI PDF

Attack MEDIUM

Self-Calibrated Consistency can Fight Back for Adversarial Robustness in Vision-Language Models

Jiaxiang Liu, Jiawei Du, Xiao Liu +2 more

Pre-trained vision-language models (VLMs) such as CLIP have demonstrated strong zero-shot capabilities across diverse domains, yet remain highly...

8 months ago cs.CV PDF

Attack MEDIUM

Adapting Noise-Driven PUF and AI for Secure WBG ICS: A Proof-of-Concept Study

Devon A. Kelly, Christiana Chamon

Wide-bandgap (WBG) technologies offer unprecedented improvements in power system efficiency, size, and performance, but also introduce unique sensor...

8 months ago cs.CR cs.LG eess.SY PDF

Attack MEDIUM

Toward Understanding the Transferability of Adversarial Suffixes in Large Language Models

Sarah Ball, Niki Hasrati, Alexander Robey +4 more

Discrete optimization-based jailbreaking attacks on large language models aim to generate short, nonsensical suffixes that, when appended onto input...

8 months ago cs.CL cs.AI PDF

Attack MEDIUM

Security Logs to ATT&CK Insights: Leveraging LLMs for High-Level Threat Understanding and Cognitive Trait Inference

Soham Hans, Stacy Marsella, Sophia Hirschmann +1 more

Understanding adversarial behavior in cybersecurity has traditionally relied on high-level intelligence reports and manual interpretation of attack...

8 months ago cs.CR cs.AI PDF

Attack MEDIUM

RAGRank: Using PageRank to Counter Poisoning in CTI LLM Pipelines

Austin Jia, Avaneesh Ramesh, Zain Shamsi +2 more

Retrieval-Augmented Generation (RAG) has emerged as the dominant architectural pattern to operationalize Large Language Model (LLM) usage in Cyber...

8 months ago cs.CR cs.AI cs.IR PDF

Attack MEDIUM

NeuPerm: Disrupting Malware Hidden in Neural Network Parameters by Leveraging Permutation Symmetry

Daniel Gilkarov, Ran Dubin

Pretrained deep learning model sharing holds tremendous value for researchers and enterprises alike. It allows them to apply deep learning by...

8 months ago cs.CR PDF

Attack MEDIUM

SecureInfer: Heterogeneous TEE-GPU Architecture for Privacy-Critical Tensors for Large Language Model Deployment

Tushar Nayan, Ziqi Zhang, Ruimin Sun

With the increasing deployment of Large Language Models (LLMs) on mobile and edge platforms, securing them against model extraction attacks has...

8 months ago cs.CR cs.LG cs.SE PDF

Attack MEDIUM

Collaborative penetration testing suite for emerging generative AI algorithms

Petar Radanliev

Problem Space: AI Vulnerabilities and Quantum Threats. Generative AI vulnerabilities: model inversion, data poisoning, adversarial inputs. Quantum...

8 months ago cs.CR cs.AI cs.LG PDF

Attack MEDIUM

Agentic Reinforcement Learning for Search is Unsafe

Yushi Yang, Shreyansh Padarha, Andrew Lee +1 more

Agentic reinforcement learning (RL) trains large language models to autonomously call tools during reasoning, with search as the most common...

8 months ago cs.CL PDF

Attack MEDIUM

Can Transformer Memory Be Corrupted? Investigating Cache-Side Vulnerabilities in Large Language Models

Elias Hossain, Swayamjit Saha, Somshubhra Roy +1 more

Even when prompts and parameters are secured, transformer language models remain vulnerable because their key-value (KV) cache during inference...

8 months ago cs.CR cs.AI PDF

Attack MEDIUM

Black-box Optimization of LLM Outputs by Asking for Directions

Jie Zhang, Meng Ding, Yang Liu +2 more

We present a novel approach for attacking black-box large language models (LLMs) by exploiting their ability to express confidence in natural...

8 months ago cs.CR cs.LG PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.
