Benchmark HIGH
Xuhui Dou, Hayretdin Bahsi, Alejandro Guerra-Manzanares
Recent work applies Large Language Models (LLMs) to source-code vulnerability detection, but most evaluations still rely on random train-test splits...
2 months ago cs.CR cs.AI cs.LG
Benchmark HIGH
Mirae Kim, Seonghun Jeong, Youngjun Kwak
Jailbreaking poses a significant risk to the deployment of Large Language Models (LLMs) and Vision Language Models (VLMs). VLMs are particularly...
2 months ago cs.CL cs.AI cs.DB
Benchmark HIGH
Priyaranjan Pattnayak, Sanchari Chowdhuri
Safety alignment of large language models (LLMs) is mostly evaluated in English and contract-bound, leaving multilingual vulnerabilities...
2 months ago cs.AI cs.CL
Benchmark HIGH
Haoyu Li, Xijia Che, Yanhao Wang +2 more
Proof-of-Vulnerability (PoV) generation is a critical task in software security, serving as a cornerstone for vulnerability validation, false...
2 months ago cs.SE cs.CR
Benchmark HIGH
André Storhaug, Jiamou Sun, Jingyue Li
Identifying vulnerability-fixing commits corresponding to disclosed CVEs is essential for secure software maintenance but remains challenging at...
2 months ago cs.SE cs.AI cs.CR
Benchmark HIGH
Adriana Alvarado Garcia, Ruyuan Wan, Ozioma C. Oguine +1 more
Recently, red teaming, with roots in security, has become a key evaluative approach to ensure the safety and reliability of Generative Artificial...
3 months ago cs.CY cs.AI cs.CL
Benchmark HIGH
Chaeyun Kim, YongTaek Lim, Kihyun Kim +2 more
Existing red-teaming benchmarks, when adapted to new languages via direct translation, fail to capture socio-technical vulnerabilities rooted in...
3 months ago cs.CY cs.AI
Benchmark HIGH
Yuhang Wang, Feiming Xu, Zheng Lin +6 more
Although large language model (LLM)-based agents, exemplified by OpenClaw, are increasingly evolving from task-oriented systems into personalized AI...
Benchmark HIGH
Nanda Rani, Kimberly Milner, Minghao Shao +9 more
Real-world offensive security operations are inherently open-ended: attackers explore unknown attack surfaces, revise hypotheses under uncertainty,...
3 months ago cs.CR cs.AI cs.MA
Benchmark HIGH
Tianyi Wu, Mingzhe Du, Yue Liu +4 more
Large language models (LLMs) are increasingly used in software development, yet their tendency to generate insecure code remains a major barrier to...
3 months ago cs.CR cs.AI cs.CL
Benchmark HIGH
Li Lu, Yanjie Zhao, Hongzhou Rao +2 more
Large Language Models (LLMs) have demonstrated remarkable proficiency in vulnerability detection. However, a critical reliability gap persists:...
Benchmark HIGH
Junhyeok Lee, Han Jang, Kyu Sung Choi
Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) systems are increasingly integrated into clinical workflows; however, prompt...
3 months ago cs.CL cs.LG
Benchmark HIGH
Hao Li, Ruoyao Wen, Shanghao Shi +2 more
AI agents that autonomously interact with external tools and environments show great promise across real-world applications. However, the external...
Benchmark HIGH
Yunpeng Xiong, Ting Zhang
Static Application Security Testing (SAST) tools are essential for identifying software vulnerabilities, but they often produce a high volume of...
Benchmark HIGH
Ivan K. Tung, Yu Xiang Shi, Alex Chien +2 more
Creating attack paths for cyber defence exercises requires substantial expert effort. Existing automation requires vulnerability graphs or exploit...
3 months ago cs.CR cs.AI
Benchmark HIGH
Miao Lin, Feng Yu, Rui Ning +6 more
Deep neural networks are highly susceptible to backdoor attacks, yet most defense methods to date rely on balanced data, overlooking the pervasive...
3 months ago cs.CR cs.CV cs.LG
Benchmark HIGH
Thomas Heverin
Prompt injection evaluations typically treat refusal as a stable, binary indicator of safety. This study challenges that paradigm by modeling refusal...
Benchmark HIGH
Zelong Zheng, Jiayuan Zhou, Xing Hu +2 more
Software vulnerability management has become increasingly critical as modern systems scale in size and complexity. However, existing automated...
Benchmark HIGH
Fan Huang, Haewoon Kwak, Jisun An
Large Language Models (LLMs) are increasingly employed in various question-answering tasks. However, recent studies showcase that LLMs are...
3 months ago cs.CL cs.AI
Benchmark HIGH
Jiayi Yuan, Jonathan Nöther, Natasha Jaques +1 more
While recent automated red-teaming methods show promise for systematically exposing model vulnerabilities, most existing approaches rely on...
3 months ago cs.AI cs.NE