AI Security Research

2,529+ academic papers on AI security, attacks, and defenses

Total

2,529

Attack

969

Benchmark

729

Defense

345

Tool

272

Survey

142

Type

All Attack Defense Survey Benchmark Tool

Relevance

All High Medium

Date

All time 7 days 30 days 6 months

Showing 41–60 of 90 papers

Clear filters

Benchmark HIGH

OI-Bench: An Option Injection Benchmark for Evaluating LLM Susceptibility to Directive Interference

Yow-Fu Liou, Yu-Chien Tang, Yu-Hsiang Liu +1 more

Benchmarking large language models (LLMs) is critical for understanding their capabilities, limitations, and robustness. In addition to interface...

3 months ago cs.CL PDF

Benchmark HIGH

Ethical Risks in Deploying Large Language Models: An Evaluation of Medical Ethics Jailbreaking

Chutian Huang, Dake Cao, Jiacheng Ji +3 more

Background: While Large Language Models (LLMs) have achieved widespread adoption, malicious prompt engineering specifically "jailbreak attacks" poses...

3 months ago cs.CY PDF

Benchmark HIGH

Hidden-in-Plain-Text: A Benchmark for Social-Web Indirect Prompt Injection in RAG

Haoze Guo, Ziqi Wei

Retrieval-augmented generation (RAG) systems put more and more emphasis on grounding their responses in user-generated content found on the Web,...

3 months ago cs.CR cs.HC PDF

Benchmark HIGH

LLMs in Code Vulnerability Analysis: A Proof of Concept

Shaznin Sultana, Sadia Afreen, Nasir U. Eisty

Context: Traditional software security analysis methods struggle to keep pace with the scale and complexity of modern codebases, requiring...

3 months ago cs.SE PDF

Benchmark HIGH

RedBench: A Universal Dataset for Comprehensive Red Teaming of Large Language Models

Quy-Anh Dang, Chris Ngo, Truong-Son Hy

As large language models (LLMs) become integral to safety-critical applications, ensuring their robustness against adversarial prompts is paramount....

4 months ago cs.CL PDF

Benchmark HIGH

Jailbreaking LLMs & VLMs: Mechanisms, Evaluation, and Unified Defense

Zejian Chen, Chaozhuo Li, Chao Li +3 more

This paper provides a systematic survey of jailbreak attacks and defenses on Large Language Models (LLMs) and Vision-Language Models (VLMs),...

4 months ago cs.CR PDF

Benchmark HIGH

The Anatomy of Conversational Scams: A Topic-Based Red Teaming Analysis of Multi-Turn Interactions in LLMs

Xiangzhe Yuan, Zhenhao Zhang, Haoming Tang +1 more

As LLMs gain persuasive agentic capabilities through extended dialogues, they introduce novel risks in multi-turn conversational scams that...

4 months ago cs.CL PDF

Benchmark HIGH

How Real is Your Jailbreak? Fine-grained Jailbreak Evaluation with Anchored Reference

Songyang Liu, Chaozhuo Li, Rui Pu +5 more

Jailbreak attacks present a significant challenge to the safety of Large Language Models (LLMs), yet current automated evaluation methods largely...

4 months ago cs.CR cs.CL PDF

Benchmark HIGH

An Empirical Evaluation of LLM-Based Approaches for Code Vulnerability Detection: RAG, SFT, and Dual-Agent Systems

Md Hasan Saju, Maher Muhtadi, Akramul Azim

The rapid advancement of Large Language Models (LLMs) presents new opportunities for automated software vulnerability detection, a crucial task in...

4 months ago cs.SE cs.AI PDF

Benchmark HIGH

Language Model Agents Under Attack: A Cross Model-Benchmark of Profit-Seeking Behaviors in Customer Service

Jingyu Zhang

Customer-service LLM agents increasingly make policy-bound decisions (refunds, rebooking, billing disputes), but the same ``helpful'' interaction...

4 months ago cs.CR cs.HC PDF

Benchmark HIGH

Prompt-Induced Over-Generation as Denial-of-Service: A Black-Box Attack-Side Benchmark

Manu, Yi Guo, Kanchana Thilakarathna +5 more

Large Language Models (LLMs) can be driven into over-generation, emitting thousands of tokens before producing an end-of-sequence (EOS) token. This...

4 months ago cs.CR cs.AI cs.LG PDF

Benchmark HIGH

Rethinking the Capability of Fine-Tuned Language Models for Automated Vulnerability Repair

Woorim Han, Yeongjun Kwak, Miseon Yu +4 more

Learning-based automated vulnerability repair (AVR) techniques that utilize fine-tuned language models have shown promise in generating vulnerability...

4 months ago cs.SE PDF

Benchmark HIGH

Beyond Single Bugs: Benchmarking Large Language Models for Multi-Vulnerability Detection

Chinmay Pushkar, Sanchit Kabra, Dhruv Kumar +1 more

Large Language Models (LLMs) have demonstrated significant potential in automated software security, particularly in vulnerability detection....

4 months ago cs.CR cs.AI PDF

Benchmark HIGH

Well Begun is Half Done: Location-Aware and Trace-Guided Iterative Automated Vulnerability Repair

Zhenlei Ye, Xiaobing Sun, Sicong Cao +2 more

The advances of large language models (LLMs) have paved the way for automated software vulnerability repair approaches, which iteratively refine the...

4 months ago cs.SE PDF

Benchmark HIGH

DREAM: Dynamic Red-teaming across Environments for AI Models

Liming Lu, Xiang Gu, Junyu Huang +5 more

Large Language Models (LLMs) are increasingly used in agentic systems, where their interactions with diverse tools and environments create complex,...

4 months ago cs.CR PDF

Benchmark HIGH

Learning-Based Automated Adversarial Red-Teaming for Robustness Evaluation of Large Language Models

Zhang Wei, Peilu Hu, Zhenyuan Wei +16 more

The increasing deployment of large language models (LLMs) in safety-critical applications raises fundamental challenges in systematically evaluating...

4 months ago cs.CR cs.CL PDF

Benchmark HIGH

Beyond the Benchmark: Innovative Defenses Against Prompt Injection Attacks

Safwan Shaheer, G. M. Refatul Islam, Mohammad Rafid Hamid +1 more

In this fast-evolving area of LLMs, our paper discusses the significant security risk presented by prompt injection attacks. It focuses on small...

4 months ago cs.CR cs.AI PDF

Benchmark HIGH

From Lab to Reality: A Practical Evaluation of Deep Learning Models and LLMs for Vulnerability Detection

Chaomeng Lu, Bert Lagaisse

Vulnerability detection methods based on deep learning (DL) have shown strong performance on benchmark datasets, yet their real-world effectiveness...

5 months ago cs.CR cs.LG cs.SE PDF

Benchmark HIGH

How to Trick Your AI TA: A Systematic Study of Academic Jailbreaking in LLM Code Evaluation

Devanshu Sahoo, Vasudev Majhi, Arjun Neekhra +3 more

The use of Large Language Models (LLMs) as automatic judges for code evaluation is becoming increasingly prevalent in academic environments. But...

5 months ago cs.SE cs.AI PDF

Benchmark HIGH

Read or Ignore? A Unified Benchmark for Typographic-Attack Robustness and Text Recognition in Vision-Language Models

Futa Waseda, Shojiro Yamabe, Daiki Shiono +2 more

Large vision-language models (LVLMs) are vulnerable to typographic attacks, where misleading text within an image overrides visual understanding....

5 months ago cs.CV PDF

Track AI security vulnerabilities in real time

Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.

Start 14-Day Free Trial