AI Security Research

AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Adversarial attacks
Model defenses
Red-teaming benchmarks
Surveys
Security tooling

Total

3,023
Attack

1,175
Benchmark

866
Defense

407
Tool

319
Survey

176

Type

All Attack Defense Survey Benchmark Tool

Relevance

All High Medium

Date

All time 7 days 30 days 6 months

Showing 2761–2780 of 3,023 papers

Attack HIGH

ADMIT: Few-shot Knowledge Poisoning Attacks on RAG-based Fact Checking

Yutao Wu, Xiao Liu, Yinghui Li +5 more

Knowledge poisoning poses a critical threat to Retrieval-Augmented Generation (RAG) systems by injecting adversarial content into knowledge bases,...

8 months ago cs.CL cs.AI cs.CR PDF

Attack HIGH

A Systematic Study on Generating Web Vulnerability Proof-of-Concepts Using Large Language Models

Mengyao Zhao, Kaixuan Li, Lyuye Zhang +4 more

Recent advances in Large Language Models (LLMs) have brought remarkable progress in code understanding and reasoning, creating new opportunities and...

8 months ago cs.SE PDF

Attack HIGH

Adversarial Attacks on Downstream Weather Forecasting Models: Application to Tropical Cyclone Trajectory Prediction

Yue Deng, Francisco Santos, Pang-Ning Tan +1 more

Deep learning based weather forecasting (DLWF) models leverage past weather observations to generate future forecasts, supporting a wide range of...

8 months ago cs.LG cs.CR stat.ML PDF

Benchmark MEDIUM

SecureWebArena: A Holistic Security Evaluation Benchmark for LVLM-based Web Agents

Zonghao Ying, Yangguang Shao, Jianle Gan +9 more

Large vision-language model (LVLM)-based web agents are emerging as powerful tools for automating complex online tasks. However, when deployed in...

8 months ago cs.CR cs.CV PDF

Defense MEDIUM

Path Drift in Large Reasoning Models:How First-Person Commitments Override Safety

Yuyi Huang, Runzhe Zhan, Lidia S. Chao +2 more

As large language models (LLMs) are increasingly deployed for complex reasoning tasks, Long Chain-of-Thought (Long-CoT) prompting has emerged as a...

8 months ago cs.CL PDF

Attack HIGH

Text Prompt Injection of Vision Language Models

Ruizhe Zhu

The widespread application of large vision language models has significantly raised safety concerns. In this project, we investigate text prompt...

8 months ago cs.CL cs.CV PDF

Benchmark MEDIUM

Getting Your Indices in a Row: Full-Text Search for LLM Training Data for Real World

Ines Altemir Marinas, Anastasiia Kucherenko, Alexander Sternfeld +1 more

The performance of Large Language Models (LLMs) is determined by their training data. Despite the proliferation of open-weight LLMs, access to LLM...

8 months ago cs.CL PDF

Attack HIGH

Adaptive Attacks on Trusted Monitors Subvert AI Control Protocols

Mikhail Terekhov, Alexander Panfilov, Daniil Dzenhaliou +4 more

AI control protocols serve as a defense mechanism to stop untrusted LLM agents from causing harm in autonomous settings. Prior work treats this as a...

8 months ago cs.LG cs.AI cs.CR PDF

Benchmark MEDIUM

Detecting Data Contamination from Reinforcement Learning Post-training for Large Language Models

Yongding Tao, Tian Wang, Yihong Dong +4 more

Data contamination poses a significant threat to the reliable evaluation of Large Language Models (LLMs). This issue arises when benchmark samples...

8 months ago cs.CL cs.AI cs.LG PDF

Defense MEDIUM

VisuoAlign: Safety Alignment of LVLMs with Multimodal Tree Search

MingSheng Li, Guangze Zhao, Sichen Liu

Large Vision-Language Models (LVLMs) have achieved remarkable progress in multimodal perception and generation, yet their safety alignment remains a...

8 months ago cs.AI cs.CR PDF

Attack HIGH

Provable Watermarking for Data Poisoning Attacks

Yifan Zhu, Lijia Yu, Xiao-Shan Gao

In recent years, data poisoning attacks have been increasingly designed to appear harmless and even beneficial, often with the intention of verifying...

8 months ago cs.CR cs.LG PDF

Tool HIGH

Exploiting Web Search Tools of AI Agents for Data Exfiltration

Dennis Rall, Bernhard Bauer, Mohit Mittal +1 more

Large language models (LLMs) are now routinely used to autonomously execute complex tasks, from natural language processing to dynamic workflows like...

8 months ago cs.CR cs.CL PDF

Other MEDIUM

Repairing Regex Vulnerabilities via Localization-Guided Instructions

Sicheol Sung, Joonghyuk Hahn, Yo-Sub Han

Regular expressions (regexes) are foundational to modern computing for critical tasks like input validation and data parsing, yet their ubiquity...

8 months ago cs.AI cs.PL PDF

Attack HIGH

The Attacker Moves Second: Stronger Adaptive Attacks Bypass Defenses Against Llm Jailbreaks and Prompt Injections

Milad Nasr, Nicholas Carlini, Chawin Sitawarin +11 more

How should we evaluate the robustness of language model defenses? Current defenses against jailbreaks and prompt injections (which aim to prevent an...

8 months ago cs.LG cs.CR PDF

Benchmark MEDIUM

SeCon-RAG: A Two-Stage Semantic Filtering and Conflict-Free Framework for Trustworthy RAG

Xiaonan Si, Meilin Zhu, Simeng Qin +7 more

Retrieval-augmented generation (RAG) systems enhance large language models (LLMs) with external knowledge but are vulnerable to corpus poisoning and...

8 months ago cs.CL cs.AI PDF

Attack MEDIUM

"I know it's not right, but that's what it said to do": Investigating Trust in AI Chatbots for Cybersecurity Policy

Brandon Lit, Edward Crowder, Daniel Vogel +1 more

AI chatbots are an emerging security attack vector, vulnerable to threats such as prompt injection, and rogue chatbot creation. When deployed in...

8 months ago cs.HC PDF

Attack HIGH

Pattern Enhanced Multi-Turn Jailbreaking: Exploiting Structural Vulnerabilities in Large Language Models

Ragib Amin Nihal, Rui Wen, Kazuhiro Nakadai +1 more

Large language models (LLMs) remain vulnerable to multi-turn jailbreaking attacks that exploit conversational context to bypass safety constraints...

8 months ago cs.CL cs.AI cs.CR PDF

Survey LOW

Vibe Coding: Toward an AI-Native Paradigm for Semantic and Intent-Driven Programming

Vinay Bamil

Recent advances in large language models have enabled developers to generate software by conversing with artificial intelligence systems rather than...

8 months ago cs.SE cs.HC PDF

Benchmark MEDIUM

CommandSans: Securing AI Agents with Surgical Precision Prompt Sanitization

Debeshee Das, Luca Beurer-Kellner, Marc Fischer +1 more

The increasing adoption of LLM agents with access to numerous tools and sensitive data significantly widens the attack surface for indirect prompt...

8 months ago cs.CR cs.AI cs.LG PDF

Attack MEDIUM

The Model's Language Matters: A Comparative Privacy Analysis of LLMs

Abhishek K. Mishra, Antoine Boutet, Lucas Magnana

Large Language Models (LLMs) are increasingly deployed across multilingual applications that handle sensitive data, yet their scale and linguistic...

8 months ago cs.CL cs.CR PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.

Track AI security vulnerabilities in real time

Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.

Start 14-Day Free Trial