AI Security Research

AI Threat Alert indexes 3,055+ peer-reviewed and preprint papers on AI/ML security, covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.


Total: 3,055

Attack: 1,187

Benchmark: 875

Defense: 413

Tool: 321

Survey: 179


Showing 2321–2340 of 3,055 papers

Benchmark HIGH

BackdoorVLM: A Benchmark for Backdoor Attacks on Vision-Language Models

Juncheng Li, Yige Li, Hanxun Huang +5 more

Backdoor attacks undermine the reliability and trustworthiness of machine learning systems by injecting hidden behaviors that can be maliciously...

7 months ago cs.CV PDF

Defense MEDIUM

EAGER: Edge-Aligned LLM Defense for Robust, Efficient, and Accurate Cybersecurity Question Answering

Onat Gungor, Roshan Sood, Jiasheng Zhou +1 more

Large Language Models (LLMs) are highly effective for cybersecurity question answering (QA) but are difficult to deploy on edge devices due to their...

7 months ago cs.CR PDF

Benchmark MEDIUM

RoguePrompt: Dual-Layer Ciphering for Self-Reconstruction to Circumvent LLM Moderation

Benyamin Tafreshian

Large language models (LLMs) are becoming increasingly integrated into mainstream development platforms and daily technological workflows, typically...

7 months ago cs.CR PDF

Attack MEDIUM

Towards Realistic Guarantees: A Probabilistic Certificate for SmoothLLM

Adarsh Kumarappan, Ayushi Mehrotra

The SmoothLLM defense provides a certification guarantee against jailbreaking attacks, but it relies on a strict "k-unstable" assumption that rarely...

7 months ago cs.LG cs.AI PDF

Attack HIGH

Automating Deception: Scalable Multi-Turn LLM Jailbreaks

Adarsh Kumarappan, Ananya Mujoo

Multi-turn conversational attacks, which leverage psychological principles like Foot-in-the-Door (FITD), where a small initial request paves the way...

7 months ago cs.LG cs.AI PDF

Survey MEDIUM

From Reviewers' Lens: Understanding Bug Bounty Report Invalid Reasons with LLMs

Jiangrui Zheng, Yingming Zhou, Ali Abdullah Ahmad +2 more

Bug bounty platforms (e.g., HackerOne, BugCrowd) leverage crowd-sourced vulnerability discovery to improve continuous coverage, reduce the cost of...

7 months ago cs.SE cs.CR PDF

Attack HIGH

Semantics as a Shield: Label Disguise Defense (LDD) against Prompt Injection in LLM Sentiment Classification

Yanxi Li, Ruocheng Shan

Large language models are increasingly used for text classification tasks such as sentiment analysis, yet their reliance on natural language prompts...

7 months ago cs.CL cs.AI PDF

Attack HIGH

TASO: Jailbreak LLMs via Alternative Template and Suffix Optimization

Yanting Wang, Runpeng Geng, Jinghui Chen +2 more

Many recent studies showed that LLMs are vulnerable to jailbreak attacks, where an attacker can perturb the input of an LLM to induce it to generate...

7 months ago cs.CR PDF

Defense MEDIUM

Beyond Binary Classification: A Semi-supervised Approach to Generalized AI-generated Image Detection

Hong-Hanh Nguyen-Le, Van-Tuan Tran, Dinh-Thuc Nguyen +1 more

The rapid advancement of generators (e.g., StyleGAN, Midjourney, DALL-E) has produced highly realistic synthetic images, posing significant...

7 months ago cs.LG cs.AI cs.CR PDF

Other HIGH

For Those Who May Find Themselves on the Red Team

Tyler Shoemaker

This position paper argues that literary scholars must engage with large language model (LLM) interpretability research. While doing so will involve...

7 months ago cs.CL PDF

Tool MEDIUM

Shadows in the Code: Exploring the Risks and Defenses of LLM-based Multi-Agent Software Development Systems

Xiaoqing Wang, Keman Huang, Bin Liang +2 more

The rapid advancement of Large Language Model (LLM)-driven multi-agent systems has significantly streamlined software developing tasks, enabling...

7 months ago cs.CR cs.AI cs.CL PDF

Tool MEDIUM

LLMs as Firmware Experts: A Runtime-Grown Tree-of-Agents Framework

Xiangrui Zhang, Zeyu Chen, Haining Wang +1 more

Large Language Models (LLMs) and their agent systems have recently demonstrated strong potential in automating code reasoning and vulnerability...

7 months ago cs.CR cs.SE PDF

Tool MEDIUM

Z-Space: A Multi-Agent Tool Orchestration Framework for Enterprise-Grade LLM Automation

Qingsong He, Jing Nan, Jiayu Jiao +5 more

Large Language Models can break through knowledge and timeliness limitations by invoking external tools within the Model Context Protocol framework...

7 months ago cs.SE cs.AI PDF

Defense MEDIUM

SafeCiM: Investigating Resilience of Hybrid Floating-Point Compute-in-Memory Deep Learning Accelerators

Swastik Bhattacharya, Sanjay Das, Anand Menon +3 more

Deep Neural Networks (DNNs) continue to grow in complexity with Large Language Models (LLMs) incorporating vast numbers of parameters. Handling these...

7 months ago cs.AR cs.LG PDF

Other MEDIUM

Can LLMs Help Allocate Public Health Resources? A Case Study on Childhood Lead Testing

Mohamed Afane, Ying Wang, Juntao Chen

Public health agencies face critical challenges in identifying high-risk neighborhoods for childhood lead exposure with limited resources for...

7 months ago cs.CY cs.AI PDF

Benchmark MEDIUM

Think Fast: Real-Time IoT Intrusion Reasoning Using IDS and LLMs at the Edge Gateway

Saeid Jamshidi, Amin Nikanjam, Negar Shahabi +4 more

As the number of connected IoT devices continues to grow, securing these systems against cyber threats remains a major challenge, especially in...

7 months ago cs.CR PDF

Attack HIGH

Exploiting the Experts: Unauthorized Compression in MoE-LLMs

Pinaki Prasad Guha Neogi, Ahmad Mohammadshirazi, Dheeraj Kulshrestha +1 more

Mixture-of-Experts (MoE) architectures are increasingly adopted in large language models (LLMs) for their scalability and efficiency. However, their...

7 months ago cs.LG cs.AI PDF

Attack HIGH

Vulnerability-Aware Robust Multimodal Adversarial Training

Junrui Zhang, Xinyu Zhao, Jie Peng +3 more

Multimodal learning has shown significant superiority on various tasks by integrating multiple modalities. However, the interdependencies among...

7 months ago cs.LG cs.CR PDF

Attack MEDIUM

ASTRA: Agentic Steerability and Risk Assessment Framework

Itay Hazan, Yael Mathov, Guy Shtar +2 more

Securing AI agents powered by Large Language Models (LLMs) represents one of the most critical challenges in AI security today. Unlike traditional...

7 months ago cs.CR PDF

Benchmark MEDIUM

Building Browser Agents: Architecture, Security, and Practical Solutions

Aram Vardanyan

Browser agents enable autonomous web interaction but face critical reliability and security challenges in production. This paper presents findings...

7 months ago cs.SE PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended, covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,055+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.
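As a rough illustration of what classifying by type and relevance makes possible, the sketch below models a paper record and a filter over it. It is not AI Threat Alert's actual schema or API: the `Paper` class, its field names, and the `filter_papers` function are hypothetical, and only the label values and the three sample entries are taken from the listing on this page.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

# Labels as they appear on this page; the record layout itself is an assumption.
PAPER_TYPES = {"Attack", "Defense", "Survey", "Benchmark", "Tool", "Other"}
RELEVANCE_LEVELS = {"High", "Medium"}


@dataclass(frozen=True)
class Paper:
    """Hypothetical record for one indexed paper."""
    title: str
    paper_type: str                   # one of PAPER_TYPES
    relevance: str                    # one of RELEVANCE_LEVELS
    arxiv_categories: Tuple[str, ...]  # e.g. ("cs.CR",)


def filter_papers(
    papers: Iterable[Paper],
    paper_type: Optional[str] = None,
    relevance: Optional[str] = None,
) -> List[Paper]:
    """Return papers matching both filters; None means no restriction."""
    if paper_type is not None and paper_type not in PAPER_TYPES:
        raise ValueError(f"unknown paper type: {paper_type!r}")
    if relevance is not None and relevance not in RELEVANCE_LEVELS:
        raise ValueError(f"unknown relevance level: {relevance!r}")
    return [
        p for p in papers
        if (paper_type is None or p.paper_type == paper_type)
        and (relevance is None or p.relevance == relevance)
    ]


# Three entries copied from the listing on this page.
SAMPLE = [
    Paper("BackdoorVLM: A Benchmark for Backdoor Attacks on Vision-Language Models",
          "Benchmark", "High", ("cs.CV",)),
    Paper("TASO: Jailbreak LLMs via Alternative Template and Suffix Optimization",
          "Attack", "High", ("cs.CR",)),
    Paper("EAGER: Edge-Aligned LLM Defense for Robust, Efficient, and Accurate "
          "Cybersecurity Question Answering",
          "Defense", "Medium", ("cs.CR",)),
]

if __name__ == "__main__":
    for paper in filter_papers(SAMPLE, relevance="High"):
        print(paper.paper_type, "|", paper.title)
```

Combining both arguments narrows the result further, for example `filter_papers(SAMPLE, paper_type="Attack", relevance="High")`. Rejecting unknown labels with `ValueError` is a design choice of this sketch, meant to surface typos rather than silently return an empty list.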

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.
