AI Security Research

AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Adversarial attacks
Model defenses
Red-teaming benchmarks
Surveys
Security tooling

Total: 3,023
Attack: 1,175
Benchmark: 866
Defense: 407
Tool: 319
Survey: 176

Showing 2901–2920 of 3,023 papers

Benchmark MEDIUM

Sentry: Authenticating Machine Learning Artifacts on the Fly

Andrew Gan, Zahra Ghodsi

Machine learning systems increasingly rely on open-source artifacts such as datasets and models that are created or hosted by other parties. The...

8 months ago cs.CR PDF

Tool MEDIUM

PolyLink: A Blockchain Based Decentralized Edge AI Platform for LLM Inference

Hongbo Liu, Jiannong Cao, Bo Yang +7 more

The rapid advancement of large language models (LLMs) in recent years has revolutionized the AI landscape. However, the deployment model and usage of...

8 months ago cs.CR cs.DC PDF

Attack HIGH

Attack logics, not outputs: Towards efficient robustification of deep neural networks by falsifying concept-based properties

Raik Dankworth, Gesina Schwalbe

Deep neural networks (NNs) for computer vision are vulnerable to adversarial attacks, i.e., minuscule malicious changes to inputs may induce...

8 months ago cs.CR cs.LG PDF

Attack MEDIUM

Understanding Sensitivity of Differential Attention through the Lens of Adversarial Robustness

Tsubasa Takahashi, Shojiro Yamabe, Futa Waseda +1 more

Differential Attention (DA) has been proposed as a refinement to standard attention, suppressing redundant or noisy context through a subtractive...

8 months ago cs.LG cs.CR PDF

Attack MEDIUM

Has the Two-Decade-Old Prophecy Come True? Artificial Bad Intelligence Triggered by Merely a Single-Bit Flip in Large Language Models

Yu Yan, Siqi Lu, Yang Gao +4 more

Recently, Bit-Flip Attack (BFA) has garnered widespread attention for its ability to compromise software system integrity remotely through hardware...

8 months ago cs.CR PDF

Attack HIGH

SVDefense: Effective Defense against Gradient Inversion Attacks via Singular Value Decomposition

Chenxiang Luo, David K. Y. Yau, Qun Song

Federated learning (FL) enables collaborative model training without sharing raw data but is vulnerable to gradient inversion attacks (GIAs), where...

8 months ago cs.CR cs.LG PDF

Tool MEDIUM

Cloud Investigation Automation Framework (CIAF): An AI-Driven Approach to Cloud Forensics

Dalal Alharthi, Ivan Roberto Kawaminami Garcia

Large Language Models (LLMs) have gained prominence in domains including cloud security and forensics. Yet cloud forensic investigations still rely...

8 months ago cs.CR cs.AI cs.LG PDF

Attack MEDIUM

A Call to Action for a Secure-by-Design Generative AI Paradigm

Dalal Alharthi, Ivan Roberto Kawaminami Garcia

Large language models have gained widespread prominence, yet their vulnerability to prompt injection and other adversarial attacks remains a critical...

8 months ago cs.CR cs.AI cs.LG PDF

Benchmark HIGH

From Trace to Line: LLM Agent for Real-World OSS Vulnerability Localization

Haoran Xi, Minghao Shao, Brendan Dolan-Gavitt +2 more

Large language models show promise for vulnerability discovery, yet prevailing methods inspect code in isolation, struggle with long contexts, and...

9 months ago cs.SE cs.CR cs.LG PDF

Attack MEDIUM

MOLM: Mixture of LoRA Markers

Samar Fares, Nurbek Tastan, Noor Hussein +1 more

Generative models can generate photorealistic images at scale. This raises urgent concerns about the ability to detect synthetically generated images...

9 months ago cs.CV cs.CR cs.LG PDF

Benchmark MEDIUM

SecureBERT 2.0: Advanced Language Model for Cybersecurity Intelligence

Ehsan Aghaei, Sarthak Jain, Prashanth Arun +1 more

Effective analysis of cybersecurity and threat intelligence data demands language models that can interpret specialized terminology, complex document...

9 months ago cs.CR cs.AI cs.LG PDF

Attack MEDIUM

CHAI: Command Hijacking against embodied AI

Luis Burbano, Diego Ortiz, Qi Sun +5 more

Embodied Artificial Intelligence (AI) promises to handle edge cases in robotic vehicle systems where data is scarce by using common-sense reasoning...

9 months ago cs.CR cs.AI cs.LG PDF

Tool LOW

SPATA: Systematic Pattern Analysis for Detailed and Transparent Data Cards

João Vitorino, Eva Maia, Isabel Praça +1 more

Due to the susceptibility of Artificial Intelligence (AI) to data perturbations and adversarial examples, it is crucial to perform a thorough...

9 months ago cs.LG cs.CR PDF

Attack MEDIUM

Are Robust LLM Fingerprints Adversarially Robust?

Anshul Nasery, Edoardo Contente, Alkin Kaz +2 more

Model fingerprinting has emerged as a promising paradigm for claiming model ownership. However, robustness evaluations of these schemes have mostly...

9 months ago cs.CR cs.AI cs.LG PDF

Benchmark MEDIUM

Fairness Testing in Retrieval-Augmented Generation: How Small Perturbations Reveal Bias in Small Language Models

Matheus Vinicius da Silva de Oliveira, Jonathan de Andrade Silva, Awdren de Lima Fontao

Large Language Models (LLMs) are widely used across multiple domains but continue to raise concerns regarding security and fairness. Beyond known...

9 months ago cs.AI cs.IR cs.LG PDF

Other LOW

Linking Process to Outcome: Conditional Reward Modeling for LLM Reasoning

Zheng Zhang, Ziwei Shan, Kaitao Song +2 more

Process Reward Models (PRMs) have emerged as a promising approach to enhance the reasoning capabilities of large language models (LLMs) by guiding...

9 months ago cs.LG PDF

Attack MEDIUM

DeepProv: Behavioral Characterization and Repair of Neural Networks via Inference Provenance Graph Analysis

Firas Ben Hmida, Abderrahmen Amich, Ata Kaboudi +1 more

Deep neural networks (DNNs) are increasingly being deployed in high-stakes applications, from self-driving cars to biometric authentication. However,...

9 months ago cs.CR cs.LG PDF

Benchmark LOW

Towards Reliable Benchmarking: A Contamination Free, Controllable Evaluation Framework for Multi-step LLM Function Calling

Seiji Maekawa, Jackson Hassell, Pouya Pezeshkpour +2 more

Existing benchmarks for tool-augmented language models (TaLMs) lack fine-grained control over task difficulty and remain vulnerable to data...

9 months ago cs.CL cs.PL PDF

Benchmark LOW

SeedPrints: Fingerprints Can Even Tell Which Seed Your Large Language Model Was Trained From

Yao Tong, Haonan Wang, Siquan Li +2 more

Fingerprinting Large Language Models (LLMs) is essential for provenance verification and model attribution. Existing methods typically extract...

9 months ago cs.CR cs.AI cs.CL PDF

Attack LOW

Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents

Shuai Shao, Qihan Ren, Chen Qian +8 more

Advances in Large Language Models (LLMs) have enabled a new class of self-evolving agents that autonomously improve through interaction with the...

9 months ago cs.AI cs.CL cs.LG PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.
