Benchmark HIGH
Xiaojun Jia, Jie Liao, Qi Guo +11 more
Recent advances in multi-modal large language models (MLLMs) have enabled unified perception-reasoning capabilities, yet these systems remain highly...
5 months ago cs.CR cs.CV
Benchmark HIGH
Caleb Gross
Security research is fundamentally a problem of resource constraint and consequent prioritization. There is simply too much attack surface and too...
5 months ago cs.CR cs.IR
Benchmark HIGH
Xiuyuan Chen, Jian Zhao, Yuxiang He +10 more
While the deployment of large language models (LLMs) in high-value industries continues to expand, the systematic assessment of their safety against...
Benchmark HIGH
Songwen Zhao, Danqing Wang, Kexun Zhang +3 more
Vibe coding is a new programming paradigm in which human engineers instruct large language model (LLM) agents to complete complex coding tasks with...
5 months ago cs.SE cs.CL
Benchmark HIGH
Jiawei Chen, Yang Yang, Chao Yu +6 more
Large Reasoning Models (LRMs) have emerged as a powerful advancement in multi-step reasoning tasks, offering enhanced transparency and logical...
5 months ago cs.CR cs.AI
Benchmark HIGH
Juncheng Li, Yige Li, Hanxun Huang +5 more
Backdoor attacks undermine the reliability and trustworthiness of machine learning systems by injecting hidden behaviors that can be maliciously...
Benchmark HIGH
Zhijie Chen, Xiang Chen, Ziming Li +2 more
Context: Software Vulnerability Assessment (SVA) plays a vital role in evaluating and ranking vulnerabilities in software systems to ensure their...
Benchmark HIGH
Chunyang Li, Zifeng Kang, Junwei Zhang +4 more
The adoption of Vision-Language Models (VLMs) in embodied AI agents, while being effective, brings safety concerns such as jailbreaking. Prior work...
5 months ago cs.CR cs.CY cs.RO
Benchmark HIGH
Henry Wong, Clement Fung, Weiran Lin +3 more
To autonomously control vehicles, driving agents use outputs from a combination of machine-learning (ML) models, controller logic, and custom...
5 months ago cs.CR cs.CV cs.LG
Benchmark HIGH
Jiayu Li, Yunhan Zhao, Xiang Zheng +4 more
Vision-Language-Action (VLA) models enable robots to interpret natural-language instructions and perform diverse tasks, yet their integration of...
5 months ago cs.CR cs.AI cs.CV
Benchmark HIGH
Zhishen Sun, Guang Dai, Haishan Ye
LLMs demonstrate performance comparable to human abilities in complex tasks such as mathematical reasoning, but their robustness in mathematical...
Benchmark HIGH
Kaiwen Zhou, Ahmed Elgohary, A S M Iftekhar +1 more
The ability of LLM agents to plan and invoke tools exposes them to new safety risks, making a comprehensive red-teaming system crucial for...
6 months ago cs.CR cs.AI cs.CL
Benchmark HIGH
Euodia Dodd, Nataša Krčo, Igor Shilov +1 more
Membership inference attacks (MIAs) have emerged as the standard tool for evaluating the privacy risks of AI models. However, state-of-the-art...
6 months ago cs.LG cs.CR
Benchmark HIGH
Osama Al Haddad, Muhammad Ikram, Ejaz Ahmed +1 more
Security analysts face increasing pressure to triage large and complex vulnerability backlogs. Large Language Models (LLMs) offer a potential aid by...
Benchmark HIGH
Pranshav Gajjar, Molham Khoja, Abiodun Ganiyu +4 more
The impending adoption of Open Radio Access Network (O-RAN) is fueling innovation in the RAN towards data-driven operation. Unlike traditional RAN...
6 months ago cs.CR cs.NI
Benchmark HIGH
Chengquan Guo, Yuzhou Nie, Chulin Xie +3 more
As large language models (LLMs) are increasingly used for code generation, concerns over the security risks have grown substantially. Early research...
Benchmark HIGH
Bin Liu, Yanjie Zhao, Guoai Xu +1 more
Large language model (LLM) agents have demonstrated remarkable capabilities in software engineering and cybersecurity tasks, including code...
6 months ago cs.SE cs.CR
Benchmark HIGH
Trilok Padhi, Pinxian Lu, Abdulkadir Erol +5 more
Large Language Model (LLM) agents are powering a growing share of interactive web applications, yet remain vulnerable to misuse and harm. Prior...
Benchmark HIGH
Ivan Dubrovsky, Anastasia Orlova, Illarion Iov +3 more
Benchmarking outcomes increasingly govern trust, selection, and deployment of LLMs, yet these evaluations remain vulnerable to semantically...
Benchmark HIGH
Dongsen Zhang, Zekun Li, Xu Luo +3 more
The Model Context Protocol (MCP) standardizes how large language model (LLM) agents discover, describe, and call external tools. While MCP unlocks...
7 months ago cs.CR cs.AI