AI Security Research

AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Adversarial attacks
Model defenses
Red-teaming benchmarks
Surveys
Security tooling

Total: 3,023
Attack: 1,175
Benchmark: 866
Defense: 407
Tool: 319
Survey: 176

Showing 1,421–1,440 of 1,455 papers

Benchmark MEDIUM

WAREX: Web Agent Reliability Evaluation on Existing Benchmarks

Su Kara, Fazle Faisal, Suman Nath

Recent advances in browser-based LLM agents have shown promise for automating tasks ranging from simple form filling to hotel booking or online...

9 months ago cs.AI cs.CR cs.LG PDF

Benchmark MEDIUM

An Ensemble Framework for Unbiased Language Model Watermarking

Yihan Wu, Ruibo Chen, Georgios Milis +1 more

As large language models become increasingly capable and widely deployed, verifying the provenance of machine-generated content is critical to...

9 months ago cs.CR PDF

Defense MEDIUM

Policy-as-Prompt: Turning AI Governance Rules into Guardrails for AI Agents

Gauri Kholkar, Ratinder Ahuja

As autonomous AI agents are used in regulated and safety-critical settings, organizations need effective ways to turn policy into enforceable...

9 months ago cs.CL cs.AI PDF

Benchmark MEDIUM

Binary Diff Summarization using Large Language Models

Meet Udeshi, Venkata Sai Charan Putrevu, Prashanth Krishnamurthy +4 more

Security of software supply chains is necessary to ensure that software updates do not contain maliciously injected code or introduce vulnerabilities...

9 months ago cs.CR PDF

Benchmark MEDIUM

Quant Fever, Reasoning Blackholes, Schrodinger's Compliance, and More: Probing GPT-OSS-20B

Shuyi Lin, Tian Lu, Zikai Wang +3 more

OpenAI's GPT-OSS family provides open-weight language models with explicit chain-of-thought (CoT) reasoning and a Harmony prompt format. We summarize...

9 months ago cs.AI cs.CR PDF

Benchmark MEDIUM

How LLMs Learn to Reason: A Complex Network Perspective

Sihan Hu, Xiansheng Cai, Yuan Huang +5 more

Training large language models with Reinforcement Learning with Verifiable Rewards (RLVR) exhibits a set of distinctive and puzzling behaviors that...

9 months ago cs.AI cond-mat.dis-nn cond-mat.stat-mech PDF

Benchmark MEDIUM

AutoML in Cybersecurity: An Empirical Study

Sherif Saad, Kevin Shi, Mohammed Mamun +1 more

Automated machine learning (AutoML) has emerged as a promising paradigm for automating machine learning (ML) pipeline design, broadening AI adoption....

9 months ago cs.CR PDF

Defense MEDIUM

Uncovering Vulnerabilities of LLM-Assisted Cyber Threat Intelligence

Yuqiao Meng, Luoxi Tang, Feiyang Yu +4 more

Large language models (LLMs) are increasingly used to help security analysts manage the surge of cyber threats, automating tasks from vulnerability...

9 months ago cs.CR cs.AI PDF

Other MEDIUM

Contrastive Learning Enhances Language Model Based Cell Embeddings for Low-Sample Single Cell Transcriptomics

Luxuan Zhang, Douglas Jiang, Qinglong Wang +2 more

Large language models (LLMs) have shown strong ability in generating rich representations across domains such as natural language processing and...

9 months ago q-bio.GN cs.NE q-bio.MN PDF

Defense MEDIUM

ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search

Zeyu Shen, Basileal Imana, Tong Wu +3 more

Retrieval-Augmented Generation (RAG) enhances Large Language Models by grounding their outputs in external documents. These systems, however, remain...

9 months ago cs.CR cs.AI PDF

Defense MEDIUM

Beyond Embeddings: Interpretable Feature Extraction for Binary Code Similarity

Charles E. Gagnon, Steven H. H. Ding, Philippe Charland +1 more

Binary code similarity detection is a core task in reverse engineering. It supports malware analysis and vulnerability discovery by identifying...

9 months ago cs.AI cs.CR cs.SE PDF

Attack MEDIUM

Dual-Space Smoothness for Robust and Balanced LLM Unlearning

Han Yan, Zheyuan Liu, Meng Jiang

With the rapid advancement of large language models, Machine Unlearning has emerged to address growing concerns around user privacy, copyright...

9 months ago cs.CL cs.AI PDF

Benchmark MEDIUM

Reinforcement Learning-Based Prompt Template Stealing for Text-to-Image Models

Xiaotian Zou

Multimodal Large Language Models (MLLMs) have transformed text-to-image workflows, allowing designers to create novel visual concepts with...

9 months ago cs.CV cs.AI PDF

Attack MEDIUM

LLM Watermark Evasion via Bias Inversion

Jeongyeon Hwang, Sangdon Park, Jungseul Ok

Watermarking offers a promising solution for detecting LLM-generated content, yet its robustness under realistic query-free (black-box) evasion...

9 months ago cs.CR cs.AI PDF

Attack MEDIUM

What Do They Fix? LLM-Aided Categorization of Security Patches for Critical Memory Bugs

Xingyu Li, Juefei Pu, Yifan Wu +13 more

Open-source software projects are foundational to modern software ecosystems, with the Linux kernel standing out as a critical exemplar due to its...

9 months ago cs.CR cs.LG PDF

Benchmark MEDIUM

Evaluating the Limits of Large Language Models in Multilingual Legal Reasoning

Antreas Ioannou, Andreas Shiamishis, Nora Hollenstein +1 more

In an era dominated by Large Language Models (LLMs), understanding their capabilities and limitations, especially in high-stakes fields like law, is...

9 months ago cs.CL cs.AI cs.LG PDF

Benchmark MEDIUM

Erase or Hide? Suppressing Spurious Unlearning Neurons for Robust Unlearning

Nakyeong Yang, Dong-Kyum Kim, Jea Kwon +3 more

Large language models trained on web-scale data can memorize private or sensitive knowledge, raising significant privacy risks. Although some...

9 months ago cs.LG PDF

Benchmark MEDIUM

Secure and Efficient Access Control for Computer-Use Agents via Context Space

Haochen Gong, Chenxiao Li, Rui Chang +1 more

Large language model (LLM)-based computer-use agents represent a convergence of AI and OS capabilities, enabling natural language to control system-...

9 months ago cs.CR cs.AI cs.OS PDF

Benchmark MEDIUM

Polysemous Language Gaussian Splatting via Matching-based Mask Lifting

Jiayu Ding, Xinpeng Liu, Zhiyi Pan +2 more

Lifting 2D open-vocabulary understanding into 3D Gaussian Splatting (3DGS) scenes is a critical challenge. However, mainstream methods suffer from...

9 months ago cs.CV cs.AI PDF

Tool MEDIUM

Library Hallucinations in LLMs: Risk Analysis Grounded in Developer Queries

Lukas Twist, Jie M. Zhang, Mark Harman +1 more

Large language models (LLMs) are increasingly used to generate code, yet they continue to hallucinate, often inventing non-existent libraries. Such...

9 months ago cs.SE cs.CL PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.
