AI Security Research

AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Adversarial attacks
Model defenses
Red-teaming benchmarks
Surveys
Security tooling

Total

3,023
Attack

1,175
Benchmark

866
Defense

407
Tool

319
Survey

176

Type

All Attack Defense Survey Benchmark Tool

Relevance

All High Medium

Date

All time 7 days 30 days 6 months

Showing 481–500 of 945 papers

Clear filters

Benchmark MEDIUM

The production of meaning in the processing of natural language

Christopher J. Agostino, Quan Le Thien, Nayan D'Souza +1 more

Understanding the fundamental mechanisms governing the production of meaning in the processing of natural language is critical for designing safe,...

3 months ago cs.CL cs.AI cs.HC PDF

Attack MEDIUM

Memory poisoning and secure multi-agent systems

Vicenç Torra, Maria Bras-Amorós

Memory poisoning attacks for Agentic AI and multi-agent systems (MAS) have recently caught attention. It is partially due to the fact that Large...

3 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Trojan's Whisper: Stealthy Manipulation of OpenClaw through Injected Bootstrapped Guidance

Fazhong Liu, Zhuoyan Chen, Tu Lan +6 more

Autonomous coding agents are increasingly integrated into software development workflows, offering capabilities that extend beyond code suggestion to...

3 months ago cs.CR cs.AI PDF

Attack MEDIUM

Graph-Aware Text-Only Backdoor Poisoning for Text-Attributed Graphs

Qi Luo, Minghui Xu, Dongxiao Yu +1 more

Many learning systems now use graph data in which each node also contains text, such as papers with abstracts or users with posts. Because these...

3 months ago cs.LG cs.CR PDF

Attack MEDIUM

Neural Uncertainty Principle: A Unified View of Adversarial Fragility and LLM Hallucination

Dong-Xiao Zhang, Hu Lou, Jun-Jie Zhang +2 more

Adversarial vulnerability in vision and hallucination in large language models are conventionally viewed as separate problems, each addressed with...

3 months ago cs.LG cs.IT physics.comp-ph PDF

Tool MEDIUM

A Framework for Formalizing LLM Agent Security

Vincent Siu, Jingxuan He, Kyle Montgomery +4 more

Security in LLM agents is inherently contextual. For example, the same action taken by an agent may represent legitimate behavior or a security...

3 months ago cs.CR cs.AI PDF

Defense MEDIUM

The Autonomy Tax: Defense Training Breaks LLM Agents

Shawn Li, Yue Zhao

Large language model (LLM) agents increasingly rely on external tools (file operations, API calls, database transactions) to autonomously complete...

3 months ago cs.CR cs.AI cs.LG PDF

Defense MEDIUM

SAVeS: Steering Safety Judgments in Vision-Language Models via Semantic Cues

Carlos Hinojosa, Clemens Grange, Bernard Ghanem

Vision-language models (VLMs) are increasingly deployed in real-world and embodied settings where safety decisions depend on visual context. However,...

3 months ago cs.CV cs.AI cs.CL PDF

Benchmark MEDIUM

Functional Subspace Watermarking for Large Language Models

Zikang Ding, Junhao Li, Suling Wu +3 more

Model watermarking utilizes internal representations to protect the ownership of large language models (LLMs). However, these features inevitably...

3 months ago cs.CR cs.AI PDF

Survey MEDIUM

Toward Reliable, Safe, and Secure LLMs for Scientific Applications

Saket Sanjeev Chaturvedi, Joshua Bergerson, Tanwi Mallick

As large language models (LLMs) evolve into autonomous "AI scientists," they promise transformative advances but introduce novel vulnerabilities,...

3 months ago cs.CR cs.CV PDF

Attack MEDIUM

Retrieval-Augmented LLMs for Security Incident Analysis

Xavier Cadet, Aditya Vikram Singh, Harsh Mamania +6 more

Investigating cybersecurity incidents requires collecting and analyzing evidence from multiple log sources, including intrusion detection alerts,...

3 months ago cs.CR cs.AI PDF

Attack MEDIUM

Retrieval-Augmented LLMs for Security Incident Analysis

Xavier Cadet, Aditya Vikram Singh, Harsh Mamania +6 more

Investigating cybersecurity incidents requires collecting and analyzing evidence from multiple log sources, including intrusion detection alerts,...

3 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Parameter-Efficient Modality-Balanced Symmetric Fusion for Multimodal Remote Sensing Semantic Segmentation

Haocheng Li, Juepeng Zheng, Shuangxi Miao +4 more

Multimodal remote sensing semantic segmentation enhances scene interpretation by exploiting complementary physical cues from heterogeneous data....

3 months ago cs.CV PDF

Benchmark MEDIUM

WeatherReasonSeg: A Benchmark for Weather-Aware Reasoning Segmentation in Visual Language Models

Wanjun Du, Zifeng Yuan, Tingting Chen +3 more

Existing vision-language models (VLMs) have demonstrated impressive performance in reasoning-based segmentation. However, current benchmarks are...

3 months ago cs.CV cs.AI PDF

Benchmark MEDIUM

VeriGrey: Greybox Agent Validation

Yuntong Zhang, Sungmin Kang, Ruijie Meng +2 more

Agentic AI has been a topic of great interest recently. A Large Language Model (LLM) agent involves one or more LLMs in the back-end. In the front...

3 months ago cs.AI PDF

Attack MEDIUM

Caging the Agents: A Zero Trust Security Architecture for Autonomous AI in Healthcare

Saikat Maiti

Autonomous AI agents powered by large language models are being deployed in production with capabilities including shell execution, file system...

3 months ago cs.CR cs.AI PDF

Survey MEDIUM

Is Your LLM-as-a-Recommender Agent Trustable? LLMs' Recommendation is Easily Hacked by Biases (Preferences)

Zichen Tang, Zirui Zhang, Qian Wang +3 more

Current Large Language Models (LLMs) are gradually exploited in practically valuable agentic workflows such as Deep Research, E-commerce...

3 months ago cs.CY cs.MA PDF

Survey MEDIUM

Is Your LLM-as-a-Recommender Agent Trustable? LLMs' Recommendation is Easily Hacked by Biases (Preferences)

Zichen Tang, Zirui Zhang, Qian Wang +3 more

Current Large Language Models (LLMs) are gradually exploited in practically valuable agentic workflows such as Deep Research, E-commerce...

3 months ago cs.CY cs.MA PDF

Survey MEDIUM

MCP-38: A Comprehensive Threat Taxonomy for Model Context Protocol Systems (v1.0)

Yi Ting Shen, Kentaroh Toyoda, Alex Leung

The Model Context Protocol (MCP) introduces a structurally distinct attack surface that existing threat frameworks, designed for traditional software...

3 months ago cs.CR cs.AI PDF

Survey MEDIUM

Network and Device Level Cyber Deception for Contested Environments Using RL and LLMs

Abhijeet Sahu, Shuva Paul, Richard Macwan

Cyber deception assists in increasing the attacker's budget in reconnaissance or any early phases of threat intrusions. In the past, numerous methods...

3 months ago cs.CR cs.ET PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.

Track AI security vulnerabilities in real time

Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.

Start 14-Day Free Trial