LoopTrap: Termination Poisoning Attacks on LLM Agents
Huiyu Xu, Zhibo Wang, Wenhui Zhang +4 more
Modern LLM agents solve complex tasks by operating in iterative execution loops, where they repeatedly reason, act, and self-evaluate progress to...
Md Farhamdur Reza, Richeng Jin, Tianfu Wu +1 more
Intent-obfuscation-based jailbreak attacks on multimodal large language models (MLLMs) transform a harmful query into a concealed multimodal input to...
Wesley Hanwen Deng, Mingxi Yan, Sunnie S. Y. Kim +5 more
Recent developments in AI safety research have called for red-teaming methods that effectively surface potential risks posed by generative AI models,...
Feiyue Xu, Hongsheng Hu, Chaoxiang He +9 more
Large Language Models (LLMs) have achieved remarkable success but remain highly susceptible to jailbreak attacks, in which adversarial prompts coerce...
Zhaorun Chen, Xun Liu, Haibo Tong +14 more
AI agents are increasingly deployed across diverse domains to automate complex workflows through long-horizon and high-stakes action executions. Due...
Zheng Fang, Xiaosen Wang, Shenyi Zhang +2 more
Jailbreak attacks on audio language models (ALMs) optimize audio perturbations to elicit unsafe generations, and they typically update the entire...
Zekun Fei, Zihao Wang, Weijie Liu +4 more
Mixture-of-Experts (MoE) architectures have emerged as a leading paradigm for scaling large language models through sparse, routing-based...
Raja Sekhar Rao Dheekonda, Will Pearce, Nick Landers
AI systems are entering critical domains like healthcare, finance, and defense, yet remain vulnerable to adversarial attacks. While AI red teaming is...
Shravya Kanchi, Xiaoyan Zang, Ying Zhang +2 more
Developers create modern software applications (Apps) on top of third-party libraries (Libs). When library vulnerabilities are reachable through...
Tejas Kulkarni, Antti Koskela, Laith Zumot
We show that remotely hosted applications employing in-context learning when augmented with a retrieval function to select in-context examples can be...
Haoyu Zhang, Mohammad Zandsalimy, Shanu Sushmita
Large language models (LLMs) employ safety mechanisms to prevent harmful outputs, yet these defenses primarily rely on semantic pattern matching. We...
Shihao Weng, Yang Feng, Jinrui Zhang +3 more
The rise of Large Language Model (LLM) agents, augmented with tool use, skills, and external knowledge, has introduced new security risks. Among...
Kemal Derya, Berk Sunar
Defending large language models (LLMs) against jailbreak attacks, such as Greedy Coordinate Gradient (GCG), remains a challenge, particularly under...
Ruichao Liang, Jing Chen, Xianglong Li +5 more
Smart contract vulnerabilities in Decentralized Finance cause billions of dollars in losses every year, yet the security community faces a...
Mario Rodríguez Béjar, Francisco J. Cortés-Delgado, S. Braghin +1 more
Large language models (LLMs) remain vulnerable to jailbreak attacks that bypass safety alignment and elicit harmful responses. A growing body of work...
Arne Roszeitis, Bartosz Burgiel, Victor Jüttner +1 more
Smart devices, such as light bulbs, TVs, fridges, etc., equipped with computing capabilities and wireless communication, are part of everyday life in...
Adel ElZemity, Budi Arief, Shujun Li +6 more
Bare-metal operational technology (OT) devices -- especially the microcontrollers running Modbus/TCP and CoAP at the base of industrial control...
Ji Guo, Xiaolong Qin, Cencen Liu +3 more
Vision-Language Models (VLMs) have achieved remarkable success in tasks such as image captioning and visual question answering (VQA). However, as...
Mingyu Luo, Zihan Zhang, Zesen Liu +7 more
Bring-Your-Own-Key (BYOK) agent architectures let users route LLM traffic through third-party relays, creating a critical integrity gap: a malicious...
Yanting Wang, Chenlong Yin, Ying Chen +1 more
Long-context large language models (LLMs), for example Gemini-3.1-Pro and Qwen-3.5, are widely used to empower many real-world applications, such as...