AI Security Research

AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.

Total: 3,023
Attack: 1,175
Benchmark: 866
Defense: 407
Tool: 319
Survey: 176
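
As a quick consistency check on the figures above, the following sketch sums the five named categories and compares the result with the stated total. The variable names are illustrative only. The counts are copied from this page, and the code assumes nothing about how AI Threat Alert computes them.

```python
# Category counts copied from the summary figures on this page.
counts = {
    "Attack": 1175,
    "Benchmark": 866,
    "Defense": 407,
    "Tool": 319,
    "Survey": 176,
}
total = 3023  # the "Total" figure from the same summary

# Papers accounted for by the five named categories.
categorized = sum(counts.values())

# Papers in the total that fall outside those five categories.
remainder = total - categorized

# Share of the total for each category, as a percentage to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

print(f"categorized: {categorized}, remainder: {remainder}")
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The five categories do not add up to the total, so some papers sit outside them. Some entries in the listing carry an "Other" label, which may account for the difference, but the page does not say so and that reading is an assumption.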

Showing 441–460 of 1,455 papers


Attack MEDIUM

AEGIS: Adversarial Entropy-Guided Immune System -- Thermodynamic State Space Models for Zero-Day Network Evasion Detection

Vickson Ferrel

As TLS 1.3 encryption limits traditional Deep Packet Inspection (DPI), the security community has pivoted to Euclidean Transformer-based classifiers...

2 months ago cs.CR cs.LG PDF

Tool MEDIUM

MTI: A Behavior-Based Temperament Profiling System for AI Agents

Jihoon Jeong

AI models of equivalent capability can exhibit fundamentally different behavioral patterns, yet no standardized instrument exists to measure these...

2 months ago cs.AI cs.CL PDF

Benchmark MEDIUM

From Component Manipulation to System Compromise: Understanding and Detecting Malicious MCP Servers

Yiheng Huang, Zhijia Zhao, Bihuan Chen +5 more

The model context protocol (MCP) standardizes how LLMs connect to external tools and data sources, enabling faster integration but introducing new...

2 months ago cs.CR cs.SE PDF

Defense MEDIUM

Assertain: Automated Security Assertion Generation Using Large Language Models

Shams Tarek, Dipayan Saha, Khan Thamid Hasan +3 more

The increasing complexity of modern system-on-chip designs amplifies hardware security risks and makes manual security property specification a major...

2 months ago cs.CR PDF

Benchmark MEDIUM

Cooking Up Risks: Benchmarking and Reducing Food Safety Risks in Large Language Models

Weidi Luo, Xiaofei Wen, Tenghao Huang +5 more

Large language models (LLMs) are increasingly deployed for everyday tasks, including food preparation and health-related guidance. However, food...

2 months ago cs.CR PDF

Defense MEDIUM

ClawSafety: "Safe" LLMs, Unsafe Agents

Bowen Wei, Yunbei Zhang, Jinhao Pan +5 more

Personal AI agents like OpenClaw run with elevated privileges on users' local machines, where a single successful prompt injection can leak...

2 months ago cs.AI PDF

Defense MEDIUM

Safety, Security, and Cognitive Risks in World Models

Manoj Parmar

World models -- learned internal simulators of environment dynamics -- are rapidly becoming foundational to autonomous decision-making in robotics,...

2 months ago cs.CR cs.AI cs.LG PDF

Benchmark MEDIUM

SERSEM: Selective Entropy-Weighted Scoring for Membership Inference in Code Language Models

Kıvanç Kuzey Dikici, Serdar Kara, Semih Çağlar +2 more

As Large Language Models (LLMs) for code increasingly utilize massive, often non-permissively licensed datasets, evaluating data contamination...

2 months ago cs.SE cs.CR PDF

Defense MEDIUM

Multi-Agent LLM Governance for Safe Two-Timescale Reinforcement Learning in SDN-IoT Defense

Saeid Jamshidi, Negar Shahabi, Foutse Khomh +2 more

Software-Defined Networking (SDN) is increasingly adopted to secure Internet-of-Things (IoT) networks due to its centralized control and programmable...

2 months ago cs.CR PDF

Other MEDIUM

SCPatcher: Automated Smart Contract Code Repair via Retrieval-Augmented Generation and Knowledge Graph

Xiaoqi Li, Shipeng Ye, Wenkai Li +1 more

Smart contract vulnerabilities can cause substantial financial losses due to the immutability of code after deployment. While existing tools detect...

2 months ago cs.SE PDF

Benchmark MEDIUM

EnsembleSHAP: Faithful and Certifiably Robust Attribution for Random Subspace Method

Yanting Wang, Jinyuan Jia

Random subspace method has wide security applications such as providing certified defenses against adversarial and backdoor attacks, and building...

3 months ago cs.CR PDF

Attack MEDIUM

Performative Scenario Optimization

Quanyan Zhu, Zhengye Han

This paper introduces a performative scenario optimization framework for decision-dependent chance-constrained problems. Unlike classical stochastic...

3 months ago cs.GT PDF

Other MEDIUM

BotVerse: Real-Time Event-Driven Simulation of Social Agents

Edoardo Allegrini, Edoardo Di Paolo, Angelo Spognardi +1 more

BotVerse is a scalable, event-driven framework for high-fidelity social simulation using LLM-based agents. It addresses the ethical risks of studying...

3 months ago cs.SI cs.AI cs.MA PDF

Survey MEDIUM

Security in LLM-as-a-Judge: A Comprehensive SoK

Aiman Almasoud, Antony Anju, Marco Arazzi +6 more

LLM-as-a-Judge (LaaJ) is a novel paradigm in which powerful language models are used to assess the quality, safety, or correctness of generated...

3 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning

Yubo Li, Lu Zhang, Tianchong Jiang +2 more

Large language models systematically fail when a salient surface cue conflicts with an unstated feasibility constraint. We study this through a...

3 months ago cs.CL cs.AI PDF

Benchmark MEDIUM

Design Principles for the Construction of a Benchmark Evaluating Security Operation Capabilities of Multi-agent AI Systems

Yicheng Cai, Mitchell John DeStefano, Guodong Dong +5 more

As Large Language Models (LLMs) and multi-agent AI systems are demonstrating increasing potential in cybersecurity operations, organizations,...

3 months ago cs.CR cs.AI PDF

Defense MEDIUM

FL-PBM: Pre-Training Backdoor Mitigation for Federated Learning

Osama Wehbi, Sarhad Arisdakessian, Omar Abdel Wahab +3 more

Backdoor attacks pose a significant threat to the integrity and reliability of Artificial Intelligence (AI) models, enabling adversaries to...

3 months ago cs.LG cs.CR cs.DC PDF

Survey MEDIUM

Crossing the NL/PL Divide: Information Flow Analysis Across the NL/PL Boundary in LLM-Integrated Code

Zihao Xu, Xiao Cheng, Ruijie Meng +1 more

LLM API calls are becoming a ubiquitous program construct, yet they create a boundary that no existing program analysis can cross: runtime values...

3 months ago cs.SE cs.AI PDF

Benchmark MEDIUM

Evaluating Privilege Usage of Agents on Real-World Tools

Quan Zhang, Lianhang Fu, Lvsi Lian +5 more

Equipping LLM agents with real-world tools can substantially improve productivity. However, granting agents autonomy over tool use also transfers the...

3 months ago cs.CR cs.AI PDF

Attack MEDIUM

FedFG: Privacy-Preserving and Robust Federated Learning via Flow-Matching Generation

Ruiyang Wang, Rong Pan, Zhengan Yao

Federated learning (FL) enables distributed clients to collaboratively train a global model using local private data. Nevertheless, recent studies...

3 months ago cs.CR cs.AI cs.CV PDF

Frequently asked questions

What is AI security research?

AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.

How many AI security papers does AI Threat Alert track?

AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.

Where do the research papers come from?

Papers are sourced from arXiv, then classified by type and by relevance to real-world AI/ML threats, and cross-referenced with the CVEs and incidents they relate to.

What topics does the AI security research cover?

Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.

How is this different from a generic paper search?

Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.
