AI Security Research

AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Adversarial attacks
Model defenses
Red-teaming benchmarks
Surveys
Security tooling

Total: 3,023
Attack: 1,175
Benchmark: 866
Defense: 407
Tool: 319
Survey: 176

Showing 1–20 of 28 papers

Tool HIGH

ShareLock: A Stealthy Multi-Tool Threshold Poisoning Attack Against MCP

Liwei Liu, Tianzhu Han, Zijian Liu +2 more

With the rapid evolution of LLM-driven agents, Model Context Protocol (MCP), an open protocol bridging LLMs with external tools, has quickly become...

2 days ago cs.CR cs.AI PDF

Tool MEDIUM

Empirical Software Engineering TerraProbe: A Layered-Oracle Framework for Detecting Deceptive Fixes in LLM-Assisted Terraform

Manar Alsaid, Chimdumebi Nebolisa, Faris Abbas

Security misconfigurations in Terraform Infrastructure-as-Code are a growing risk in cloud deployments, and large language models are increasingly...

2 days ago cs.LG cs.CR PDF

Tool MEDIUM

Beyond Feedforward Networks: Reentry Neural Systems as the Fundamental Basis of Subjecthood and Intrinsic Safety of Next-Generation AGI

A. S. Ushakov, Yu. N. Berdinsk

We propose a complete architectural blueprint for safe artificial general intelligence based on a closed reentry loop (D <-> I cycle). In contrast to...

2 days ago cs.LG cs.AI math-ph PDF

Tool LOW

The Unfireable Safety Kernel: Execution-Time AI Alignment for AI Agents and Other Escapable AI Systems

Seth Dobrin, Łukasz Chmiel

AI agents are granted access to tools, APIs, and other infrastructure, making them active principals in those systems. The dominant approach places...

3 days ago cs.AI cs.CR cs.LG PDF

Tool MEDIUM

Security and Privacy in Retrieval-Augmented Generation: Architectures, Threats, Defenses, and Future Directions for Building Trustworthy Systems

Balamurugan Palanisamy, G S S Chalapathi, Vikas Hassija +1 more

Retrieval-Augmented Generation (RAG) has emerged as a dominant paradigm for enhancing large language models with external knowledge. By coupling...

3 days ago cs.CR cs.CL PDF

Tool HIGH

RIFT-Bench: Dynamic Red-teaming For Agentic AI Systems

Yarin Yerushalmi Levi, Roy Betser, Amit Giloni +5 more

Agentic AI systems powered by large language models (LLMs) are rapidly evolving into autonomous decision-making systems, exposing attack vectors...

4 days ago cs.AI PDF

Tool MEDIUM

FlexServe: A Fast and Secure LLM Serving System for Mobile Devices with Flexible Resource Isolation

Yinpeng Wu, Yitong Chen, Lixiang Wang +3 more

Device-side Large Language Models (LLMs) have grown explosively, offering stronger privacy and higher availability than their cloud-side...

5 days ago cs.CR cs.LG cs.OS PDF

Tool MEDIUM

Safety in Self-Evolving LLM Agent Systems: Threats, Amplification, and Case Studies

Ruixiao Lin, Xinhao Deng, Qingming Li +12 more

Self-evolving LLM agent systems, which autonomously update their model parameters, memory, tools, and architectures, introduce a qualitatively new...

5 days ago cs.CR cs.AI PDF

Tool HIGH

Analyzing Defensive Misdirection Against Model-Guided Automated Attacks on Agentic AI Systems

Reza Soosahabi, Vivek Namsani

Agentic AI systems increasingly rely on language-model components to interpret instructions, process external data, invoke tools, and coordinate with...

1 week ago cs.CR cs.AI PDF

Tool HIGH

A Layered Security Framework Against Prompt Injection in RAG-Based Chatbots

Gulshan Saleem, Nisar Ahmed, Muhammad Imran Zaman +1 more

Prompt injection is ranked as the most critical vulnerability in large language model (LLM) deployments by the OWASP Top 10 for LLM Applications, yet...

1 week ago cs.CR cs.CL PDF

Tool MEDIUM

FloatDoor: Platform-Triggered Backdoors in LLMs

Nils Loose, Jonas Sander, Felix Mächtle +1 more

Large language models (LLMs) are increasingly deployed in sensitive settings such as software engineering, where their outputs directly shape...

1 week ago cs.CR cs.LG PDF

Tool HIGH

Image Prompt Reconstruction Attacks on Distributed MLLM Inference Frameworks

Xinjian Luo, Hongyan Chang, Jianxin Wei +5 more

Distributed large language model (LLM) inference frameworks connect isolated consumer-grade devices for large-scale model inference, substantially...

1 week ago cs.CR PDF

Tool MEDIUM

SafeClawBench: Separating Semantic, Audit-Evidence, and Sandbox Harm in Tool-Using LLM Agents

Yuchuan Tian, Mengyu Zheng, Haocheng Mei +5 more

Tool-using language-model agents introduce security failures that go beyond unsafe text: they can disclose protected objects, write persistent...

1 week ago cs.CR cs.AI PDF

Tool HIGH

Conflict-Aware Retriever Editing for Knowledge Injection Attacks on LLM-Based RAG Systems

Xinru Liu, Xianglong Zhang, Di Cai +3 more

Injecting malicious knowledge into retrieval-augmented generation (RAG) systems can manipulate retrieved evidence and mislead downstream generation,...

1 week ago cs.CR cs.AI PDF

Tool HIGH

Rapid Poison: Practical Poisoning Attacks Against the Rapid Response Framework

David Huang, Jaewon Chang, Avidan Shah +2 more

The Rapid Response (RR) framework, deployed in production systems, including Anthropic's ASL-3 safeguards, continuously improves jailbreak-detection...

1 week ago cs.LG cs.CL PDF

Tool MEDIUM

The Emergence of Autonomous Penetration Capabilities in Large Language Model-Powered AI Systems

Jiaqi Luo, Jiarun Dai, Zhile Chen +8 more

Nowadays, the autonomous execution of cyberattacks capable of causing substantial real-world harm is widely regarded as one of the critical red lines...

2 weeks ago cs.CR cs.AI PDF

Tool MEDIUM

The Containment Gap: How Deployed Agentic AI Frameworks Fail Public-Facing Safety Requirements

Md Jafrin Hossain, Mohammad Arif Hossain, Weiqi Liu +1 more

Agentic large language model systems that autonomously invoke tools, maintain persistent memory, and execute multi-step plans are increasingly...

2 weeks ago cs.AI PDF

Tool MEDIUM

SMSR: Certified Defence Against Runtime Memory Poisoning in Persistent LLM Agent Systems

Tarun Sharma

Retrieval-augmented generation (RAG) agents increasingly run with persistent memory that accumulates across user sessions. This creates a new attack...

2 weeks ago cs.CR cs.AI cs.LG PDF

Tool MEDIUM

External Experience Serving in Production LLM Systems: A Deployment-Oriented Study of Quality-Cost Trade-offs

Lin Sun, Heming Zhang, Xiangzheng Zhang

Production LLM systems accumulate reusable operational experience, but the practical deployment issue is not merely whether such experience can help....

2 weeks ago cs.CL PDF

Tool HIGH

JailbreakOPT: Tool-Assisted Iterative Jailbreak Prompt Optimization

Ge Shi, Jun Yin, Donglin Xie +3 more

Jailbreak attacks expose persistent safety weaknesses in large language models (LLMs), but existing stateless single-turn methods face a trade-off:...

2 weeks ago cs.CR cs.AI PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.
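
As a rough illustration of the classification described above, the sketch below models a paper record with a type, a relevance level, and arXiv categories, then filters a small list the way the index's Type and Relevance filters might. It is only a sketch under stated assumptions: the `Paper` class, its field names, and the `filter_papers` helper are hypothetical and are not AI Threat Alert's actual schema or API. The three sample records reuse titles, labels, and categories from the listing on this page; the `related_cves` field is left empty because the listing does not show any.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class PaperType(Enum):
    ATTACK = "attack"
    DEFENSE = "defense"
    BENCHMARK = "benchmark"
    SURVEY = "survey"
    TOOL = "tool"


class Relevance(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass
class Paper:
    # Hypothetical record layout; the real index schema is not published here.
    title: str
    paper_type: PaperType
    relevance: Relevance
    arxiv_categories: list[str] = field(default_factory=list)
    related_cves: list[str] = field(default_factory=list)  # assumed cross-reference field


def filter_papers(
    papers: list[Paper],
    paper_type: Optional[PaperType] = None,
    relevance: Optional[Relevance] = None,
) -> list[Paper]:
    """Keep papers matching every filter that is set; None means 'All'."""
    return [
        p
        for p in papers
        if (paper_type is None or p.paper_type is paper_type)
        and (relevance is None or p.relevance is relevance)
    ]


# Sample records taken from the listing on this page.
papers = [
    Paper(
        "RIFT-Bench: Dynamic Red-teaming For Agentic AI Systems",
        PaperType.TOOL,
        Relevance.HIGH,
        ["cs.AI"],
    ),
    Paper(
        "FloatDoor: Platform-Triggered Backdoors in LLMs",
        PaperType.TOOL,
        Relevance.MEDIUM,
        ["cs.CR", "cs.LG"],
    ),
    Paper(
        "The Unfireable Safety Kernel: Execution-Time AI Alignment for AI Agents "
        "and Other Escapable AI Systems",
        PaperType.TOOL,
        Relevance.LOW,
        ["cs.AI", "cs.CR", "cs.LG"],
    ),
]

high_relevance_tools = filter_papers(
    papers, paper_type=PaperType.TOOL, relevance=Relevance.HIGH
)
for paper in high_relevance_tools:
    print(paper.title, paper.arxiv_categories)
```

Treating an unset filter as "match everything" mirrors the "All" option in each filter group, so filters can be combined without special cases.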

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.
