Benchmark HIGH
Ivan K. Tung, Yu Xiang Shi, Alex Chien +2 more
Creating attack paths for cyber defence exercises requires substantial expert effort. Existing automation requires vulnerability graphs or exploit...
3 months ago cs.CR cs.AI
PDF
Benchmark HIGH
Miao Lin, Feng Yu, Rui Ning +6 more
Deep neural networks are highly susceptible to backdoor attacks, yet most defense methods to date rely on balanced data, overlooking the pervasive...
3 months ago cs.CR cs.CV cs.LG
PDF
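As a rough illustration of the trigger-based poisoning this entry defends against, the sketch below stamps a BadNets-style patch trigger onto a small fraction of training images and relabels them to an attacker-chosen class. It is a generic illustration, not the paper's setup; add_trigger, poison_dataset, the patch size, and the 5% poisoning rate are all assumed for the example.

import numpy as np

def add_trigger(image, patch_size=3, value=1.0):
    # Stamp a small white patch in the bottom-right corner (BadNets-style trigger).
    poisoned = image.copy()
    poisoned[-patch_size:, -patch_size:] = value
    return poisoned

def poison_dataset(images, labels, target_label, rate=0.05, seed=0):
    # Poison a fraction of the training set: add the trigger and relabel to the target class.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    images, labels = images.copy(), labels.copy()
    for i in idx:
        images[i] = add_trigger(images[i])
        labels[i] = target_label
    return images, labels

# Toy usage: 100 grayscale 28x28 images, 10 classes, poison 5% toward class 0.
X = np.random.rand(100, 28, 28).astype(np.float32)
y = np.random.randint(0, 10, size=100)
Xp, yp = poison_dataset(X, y, target_label=0)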
Attack HIGH
Tanusree Debi, Wentian Zhu
Large language model (LLM) based agents are increasingly used to automate financial transactions, yet their reliance on contextual reasoning exposes...
3 months ago cs.CR cs.AI
PDF
Attack HIGH
Naen Xu, Jinghuai Zhang, Ping He +6 more
Large language models (LLMs) have been widely integrated into critical automated workflows, including contract review and job application processes....
3 months ago cs.CR cs.AI cs.CL
PDF
Attack HIGH
Aarush Noheria, Yuguang Yao
Vision-language models (VLMs) have become central to tasks such as visual question answering, image captioning, and text-to-image generation....
3 months ago cs.CV cs.AI
PDF
Tool HIGH
Chanwoo Park, Chanwoo Kim
Evasion attacks pose significant threats to AI systems, exploiting vulnerabilities in machine learning models to bypass detection mechanisms. The...
3 months ago cs.SD cs.CR eess.AS
PDF
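The entry concerns evasion attacks that perturb inputs so a detector misclassifies them. A minimal sketch of the classic one-step FGSM formulation is below, using a throwaway linear "detector" over a raw waveform as a stand-in; the model, epsilon, and clipping range are assumptions, and this is not the paper's tool.

import torch
import torch.nn as nn

# Stand-in detector: a tiny linear classifier over a flattened 1-second, 16 kHz waveform.
model = nn.Sequential(nn.Flatten(), nn.Linear(16000, 2))
loss_fn = nn.CrossEntropyLoss()

def fgsm_evasion(waveform, true_label, epsilon=0.001):
    # One-step FGSM: nudge the input in the direction that increases the detector's loss.
    x = waveform.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), true_label)
    loss.backward()
    adv = x + epsilon * x.grad.sign()
    return adv.clamp(-1.0, 1.0).detach()

x = torch.randn(1, 16000) * 0.1   # placeholder audio clip
y = torch.tensor([1])             # class the attacker wants the input steered away from
x_adv = fgsm_evasion(x, y)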
Attack HIGH
Javier Carnerero-Cano, Luis Muñoz-González, Phillippa Spencer +1 more
Regression models are widely used in industrial processes, engineering, and the natural and physical sciences, yet their robustness to poisoning has...
3 months ago cs.LG cs.AI cs.CR
PDF
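To make the poisoning threat to regression concrete, the sketch below fits ordinary least squares before and after injecting a handful of high-leverage adversarial points; printing both fits shows the slope being dragged off the clean trend. It is a toy illustration, not the attack studied in the paper.

import numpy as np

rng = np.random.default_rng(0)

# Clean data: y = 2x + noise
X = rng.uniform(0, 10, size=(100, 1))
y = 2.0 * X[:, 0] + rng.normal(0, 0.5, size=100)

def fit_ols(X, y):
    # Ordinary least squares with an intercept term; returns [slope, intercept].
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

# Poisoning: inject a few high-leverage points far below the clean trend.
X_poison = np.full((5, 1), 10.0)
y_poison = np.full(5, -50.0)
X_all = np.vstack([X, X_poison])
y_all = np.concatenate([y, y_poison])

print("clean fit   :", fit_ols(X, y))
print("poisoned fit:", fit_ols(X_all, y_all))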
Survey HIGH
Pedro H. Barcha Correia, Ryan W. Achjian, Diego E. G. Caetano de Oliveira +5 more
The rapid advancement and widespread adoption of generative artificial intelligence (GenAI) and large language models (LLMs) have been accompanied by...
3 months ago cs.CR cs.AI cs.CL
PDF
Attack HIGH
Xiaogeng Liu, Xinyan Wang, Yechao Zhang +5 more
Large reasoning models (LRMs) extend large language models with explicit multi-step reasoning traces, but this capability introduces a new class of...
3 months ago cs.CR cs.AI
PDF
Attack HIGH
Ningyuan He, Ronghong Huang, Qianqian Tang +3 more
In-context learning (ICL) has become a powerful, data-efficient paradigm for text classification using large language models. However, its robustness...
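For readers unfamiliar with the ICL setup being attacked, a minimal few-shot classification prompt looks like the sketch below; the task, labels, and formatting are assumed for illustration. The robustness question concerns what happens when the demonstrations or their labels are adversarially tampered with.

def build_icl_prompt(demonstrations, query):
    # Few-shot in-context learning prompt: labeled examples followed by the unlabeled query.
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for text, label in demonstrations:
        lines.append(f"Review: {text}\nSentiment: {label}\n")
    lines.append(f"Review: {query}\nSentiment:")
    return "\n".join(lines)

demos = [
    ("The plot was gripping from start to finish.", "positive"),
    ("I want those two hours of my life back.", "negative"),
]
print(build_icl_prompt(demos, "Surprisingly good, would watch again."))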
Attack HIGH
Xingwei Lin, Wenhao Lin, Sicong Cao +4 more
Multi-turn jailbreak attacks have emerged as a critical threat to Large Language Models (LLMs), bypassing safety mechanisms by progressively...
3 months ago cs.CR cs.AI
PDF
Attack HIGH
Yuetian Chen, Kaiyuan Zhang, Yuntao Du +5 more
Diffusion Language Models (DLMs) represent a promising alternative to autoregressive language models, using bidirectional masked token prediction....
3 months ago cs.LG cs.AI
PDF
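A toy view of the bidirectional masked-token prediction the entry refers to: start from an all-mask sequence and iteratively commit predictions for a few positions per refinement step. The dummy_predictor below is a random stand-in for a real DLM and the unmasking schedule is assumed, purely to show the shape of the decoding loop.

import numpy as np

VOCAB = ["<mask>", "the", "cat", "sat", "on", "mat"]
MASK = 0

def dummy_predictor(tokens, rng):
    # Stand-in for a bidirectional DLM: emits a random non-mask token id per position.
    # A real model would condition on the unmasked context to both sides.
    return rng.integers(1, len(VOCAB), size=len(tokens))

def mask_predict_decode(length=6, steps=3, seed=0):
    rng = np.random.default_rng(seed)
    tokens = np.full(length, MASK)
    per_step = length // steps
    for step in range(steps):
        preds = dummy_predictor(tokens, rng)
        masked_positions = np.flatnonzero(tokens == MASK)
        # Commit a fixed number of positions per step, all remaining ones on the last step.
        n_fill = len(masked_positions) if step == steps - 1 else per_step
        tokens[masked_positions[:n_fill]] = preds[masked_positions[:n_fill]]
    return [VOCAB[t] for t in tokens]

print(mask_predict_decode())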
Attack HIGH
Md Tasnim Jawad, Mingyan Xiao, Yanzhao Wu
With the widespread adoption of Large Language Models (LLMs) and increasingly stringent privacy regulations, protecting data privacy in LLMs has...
Attack HIGH
Haonan Zhang, Dongxia Wang, Yi Liu +2 more
Safety-aligned LLMs suffer from two failure modes: jailbreak (answering harmful inputs) and over-refusal (declining benign queries). Existing vector...
3 months ago cs.LG cs.AI
PDF
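As background for the vector-based approaches this entry contrasts itself with, a common recipe builds a difference-of-means "refusal direction" and adds or subtracts it from hidden activations at inference time. The sketch below is a generic illustration on random placeholder activations, not the paper's method; steering_vector, alpha, and the dimensions are assumed.

import numpy as np

def steering_vector(harmful_acts, benign_acts):
    # Difference of mean hidden activations between harmful and benign prompts:
    # a crude "refusal direction" in activation space.
    return harmful_acts.mean(axis=0) - benign_acts.mean(axis=0)

def steer(hidden, direction, alpha):
    # Add (alpha > 0) or subtract (alpha < 0) the direction from a hidden state.
    return hidden + alpha * direction

rng = np.random.default_rng(0)
d = 64
harmful = rng.normal(0.5, 1.0, size=(32, d))   # placeholder activations on harmful prompts
benign = rng.normal(-0.5, 1.0, size=(32, d))   # placeholder activations on benign prompts

v = steering_vector(harmful, benign)
h = rng.normal(size=d)
h_more_refusal = steer(h, v, alpha=+1.0)   # push toward refusal (counters jailbreak)
h_less_refusal = steer(h, v, alpha=-1.0)   # push away from refusal (counters over-refusal)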
Tool HIGH
Nirhoshan Sivaroopan, Kanchana Thilakarathna, Albert Zomaya +6 more
Sponge attacks increasingly threaten LLM systems by inducing excessive computation, leading to denial of service (DoS). Existing defenses either rely on statistical filters that...
3 months ago cs.CR cs.AI
PDF
Attack HIGH
Harsh Chaudhari, Ethan Rathbun, Hanna Foerster +5 more
Chain-of-Thought (CoT) reasoning has emerged as a powerful technique for enhancing large language models' capabilities by generating intermediate...
3 months ago cs.CR cs.LG
PDF
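For context, the "intermediate reasoning" surface that CoT exposes comes from prompts like the minimal zero-shot sketch below; the wording is an assumed example, not the paper's attack prompt.

def cot_prompt(question):
    # Zero-shot chain-of-thought: ask the model to produce intermediate reasoning
    # before its final answer. The reasoning trace is the surface discussed above.
    return (
        f"Question: {question}\n"
        "Let's think step by step, then give the final answer on its own line "
        "prefixed with 'Answer:'."
    )

print(cot_prompt("A train leaves at 3:40 pm and the trip takes 95 minutes. When does it arrive?"))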
Defense HIGH
Zihan Wu, Jie Xu, Yun Peng +2 more
Large Language Models (LLMs) struggle to automate real-world vulnerability detection due to two key limitations: the heterogeneity of vulnerability...
3 months ago cs.SE cs.AI
PDF
Attack HIGH
Gabriel Lee Jun Rong, Christos Korgialas, Dion Jia Xu Ho +3 more
Existing automated attack suites operate as static ensembles with fixed sequences, lacking strategic adaptation and semantic awareness. This paper...
Attack HIGH
Alexandra Chouldechova, A. Feder Cooper, Solon Barocas +3 more
We argue that conclusions drawn about relative system safety or attack method efficacy via AI red teaming are often not supported by evidence...
Benchmark HIGH
Thomas Heverin
Prompt injection evaluations typically treat refusal as a stable, binary indicator of safety. This study challenges that paradigm by modeling refusal...