AI Security Research

AI Threat Alert indexes 3,082+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Adversarial attacks
Model defenses
Red-teaming benchmarks
Surveys
Security tooling

Total

3,082
Attack

1,196
Benchmark

883
Defense

421
Tool

321
Survey

181

Type

All Attack Defense Survey Benchmark Tool

Relevance

All High Medium

Date

All time 7 days 30 days 6 months

Showing 881–900 of 3,082 papers

Attack HIGH

Making MLLMs Blind: Adversarial Smuggling Attacks in MLLM Content Moderation

Zhiheng Li, Zongyang Ma, Yuntong Pan +8 more

Multimodal Large Language Models (MLLMs) are increasingly being deployed as automated content moderators. Within this landscape, we uncover a...

2 months ago cs.CV PDF

Attack MEDIUM

Evaluating PQC KEMs, Combiners, and Cascade Encryption via Adaptive IND-CPA Testing Using Deep Learning

Simon Calderon, Niklas Johansson, Onur Günlü

Ensuring ciphertext indistinguishability is fundamental to cryptographic security, but empirically validating this property in real implementations...

2 months ago cs.CR cs.IT cs.LG PDF

Attack HIGH

RefineRAG: Word-Level Poisoning Attacks via Retriever-Guided Text Refinement

Ziye Wang, Guanyu Wang, Kailong Wang

Retrieval-Augmented Generation (RAG) significantly enhances Large Language Models (LLMs), but simultaneously exposes a critical vulnerability to...

2 months ago cs.CR PDF

Defense MEDIUM

SentinelSphere: Integrating AI-Powered Real-Time Threat Detection with Cybersecurity Awareness Training

Nikolaos D. Tantaroudas, Ilias Karachalios, Andrew J. McCracken

The field of cybersecurity is confronted with two interrelated challenges: a worldwide deficit of qualified practitioners and ongoing human-factor...

2 months ago cs.CE cs.AI cs.CR PDF

Attack HIGH

MirageBackdoor: A Stealthy Attack that Induces Think-Well-Answer-Wrong Reasoning

Yizhe Zeng, Wei Zhang, Yunpeng Li +3 more

While Chain-of-Thought (CoT) prompting has become a standard paradigm for eliciting complex reasoning capabilities in Large Language Models, it...

2 months ago cs.CR PDF

Defense LOW

FedDetox: Robust Federated SLM Alignment via On-Device Data Sanitization

Shunan Zhu, Jiawei Chen, Yonghao Yu +1 more

As high quality public data becomes scarce, Federated Learning (FL) provides a vital pathway to leverage valuable private user data while preserving...

2 months ago cs.CR cs.LG PDF

Defense HIGH

Argus: Reorchestrating Static Analysis via a Multi-Agent Ensemble for Full-Chain Security Vulnerability Detection

Zi Liang, Qipeng Xie, Jun He +7 more

Recent advancements in Large Language Models (LLMs) have sparked interest in their application to Static Application Security Testing (SAST),...

2 months ago cs.CR cs.CL cs.SE PDF

Benchmark HIGH

PoC-Adapt: Semantic-Aware Automated Vulnerability Reproduction with LLM Multi-Agents and Reinforcement Learning-Driven Adaptive Policy

Phan The Duy, Nguyen Viet Duy, Khoa Ngo-Khanh +2 more

While recent approaches leverage large language models (LLMs) and multi-agent pipelines to automatically generate proof-of-concept (PoC) exploits...

2 months ago cs.CR PDF

Attack HIGH

Can Drift-Adaptive Malware Detectors Be Made Robust? Attacks and Defenses Under White-Box and Black-Box Threats

Adrian Shuai Li, Md Ajwad Akil, Elisa Bertino

Concept drift and adversarial evasion are two major challenges for deploying machine learning-based malware detectors. While both have been studied...

2 months ago cs.CR PDF

Tool MEDIUM

SkillSieve: A Hierarchical Triage Framework for Detecting Malicious AI Agent Skills

Yinghan Hou, Zongyou Yang

OpenClaw's ClawHub marketplace hosts over 13,000 community-contributed agent skills, and between 13% and 26% of them contain security vulnerabilities...

3 months ago cs.CR cs.AI PDF

Defense MEDIUM

VLMShield: Efficient and Robust Defense of Vision-Language Models against Malicious Prompts

Peigui Qi, Kunsheng Tang, Yanpu Yu +7 more

Vision-Language Models (VLMs) face significant safety vulnerabilities from malicious prompt attacks due to weakened alignment during visual...

3 months ago cs.LG PDF

Attack HIGH

The Defense Trilemma: Why Prompt Injection Defense Wrappers Fail?

Manish Bhatt, Sarthak Munshi, Vineeth Sai Narajala +4 more

We prove that no continuous, utility-preserving wrapper defense-a function $D: X\to X$ that preprocesses inputs before the model sees them-can make...

3 months ago cs.CR cs.AI PDF

Other LOW

Evidence-Based Actor-Verifier Reasoning for Echocardiographic Agents

Peng Huang, Yiming Wang, Yineng Chen +9 more

Echocardiography plays an important role in the screening and diagnosis of cardiovascular diseases. However, automated intelligent analysis of...

3 months ago cs.CV PDF

Attack MEDIUM

Exclusive Unlearning

Mutsumi Sasaki, Kouta Nakayama, Yusuke Miyao +2 more

When introducing Large Language Models (LLMs) into industrial applications, such as healthcare and education, the risk of generating harmful content...

3 months ago cs.CL PDF

Attack LOW

Social Dynamics as Critical Vulnerabilities that Undermine Objective Decision-Making in LLM Collectives

Changgeon Ko, Jisu Shin, Hoyun Song +3 more

Large language model (LLM) agents are increasingly acting as human delegates in multi-agent environments, where a representative agent integrates...

3 months ago cs.CL cs.AI cs.MA PDF

Survey MEDIUM

A Formal Security Framework for MCP-Based AI Agents: Threat Taxonomy, Verification Models, and Defense Mechanisms

Nirajan Acharya, Gaurav Kumar Gupta

The Model Context Protocol (MCP), introduced by Anthropic in November 2024 and now governed by the Linux Foundation's Agentic AI Foundation, has...

3 months ago cs.CR cs.AI PDF

Attack MEDIUM

Adversarial Robustness of Time-Series Classification for Crystal Collimator Alignment

Xaver Fink, Borja Fernandez Adiego, Daniele Mirarchi +4 more

In this paper, we analyze and improve the adversarial robustness of a convolutional neural network (CNN) that assists crystal-collimator alignment at...

3 months ago cs.CR cs.LG PDF

Attack HIGH

Reading Between the Pixels: An Inscriptive Jailbreak Attack on Text-to-Image Models

Zonghao Ying, Haowen Dai, Lianyu Hu +5 more

Modern text-to-image (T2I) models can now render legible, paragraph-length text, enabling a fundamentally new class of misuse. We identify and...

3 months ago cs.CV PDF

Attack HIGH

Reading Between the Pixels: An Inscriptive Jailbreak Attack on Text-to-Image Models

Zonghao Ying, Haowen Dai, Lianyu Hu +5 more

Modern text-to-image (T2I) models can now render legible, paragraph-length text, enabling a fundamentally new class of misuse. We identify and...

3 months ago cs.CV PDF

Defense MEDIUM

Harnessing Hyperbolic Geometry for Harmful Prompt Detection and Sanitization

Igor Maljkovic, Maria Rosaria Briglia, Iacopo Masi +2 more

Vision-Language Models (VLMs) have become essential for tasks such as image synthesis, captioning, and retrieval by aligning textual and visual...

3 months ago cs.CR cs.AI cs.CV PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,082+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.

Track AI security vulnerabilities in real time

Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.

Start 14-Day Free Trial