AI Security Research

2,560+ academic papers on AI security, attacks, and defenses

Total: 2,560 | Attack: 982 | Benchmark: 736 | Defense: 350 | Tool: 275 | Survey: 144

Showing 481–500 of 736 papers

Benchmark MEDIUM

Biothreat Benchmark Generation Framework for Evaluating Frontier AI Models II: Benchmark Generation Process

Gary Ackerman, Zachary Kallenborn, Anna Wetzel +7 more

The potential for rapidly-evolving frontier artificial intelligence (AI) models, especially large language models (LLMs), to facilitate bioterrorism...

5 months ago cs.LG cs.AI cs.CY PDF

Benchmark MEDIUM

Secure or Suspect? Investigating Package Hallucinations of Shell Command in Original and Quantized LLMs

Md Nazmul Haque, Elizabeth Lin, Lawrence Arkoh +2 more

Large Language Models for code (LLMs4Code) are increasingly used to generate software artifacts, including library and package recommendations in...

5 months ago cs.SE PDF

Benchmark MEDIUM

An Adaptive Multi-Layered Honeynet Architecture for Threat Behavior Analysis via Deep Learning

Lukas Johannes Möller

The escalating sophistication and variety of cyber threats have rendered static honeypots inadequate, necessitating adaptive, intelligence-driven...

5 months ago cs.CR cs.DC cs.LG PDF

Benchmark MEDIUM

Auditing Games for Sandbagging

Jordan Taylor, Sid Black, Dillon Bowen +10 more

Future AI systems could conceal their capabilities ('sandbagging') during evaluations, potentially misleading developers and auditors. We...

5 months ago cs.AI PDF

Benchmark LOW

SAVE: Sparse Autoencoder-Driven Visual Information Enhancement for Mitigating Object Hallucination

Sangha Park, Seungryong Yoo, Jisoo Mok +1 more

Although Multimodal Large Language Models (MLLMs) have advanced substantially, they remain vulnerable to object hallucination caused by language...

5 months ago cs.CV cs.AI PDF

Benchmark LOW

Privacy Practices of Browser Agents

Alisha Ukani, Hamed Haddadi, Ali Shahin Shamsabadi +1 more

This paper presents a systematic evaluation of the privacy behaviors and attributes of eight recent, popular browser agents. Browser agents are...

5 months ago cs.CR PDF

Benchmark MEDIUM

How Do LLMs Fail In Agentic Scenarios? A Qualitative Analysis of Success and Failure Scenarios of Various LLMs in Agentic Simulations

JV Roig

We investigate how large language models (LLMs) fail when operating as autonomous agents with tool-use capabilities. Using the Kamiwaza Agentic Merit...

5 months ago cs.AI cs.SE PDF

Benchmark MEDIUM

Pay Less Attention to Function Words for Free Robustness of Vision-Language Models

Qiwei Tian, Chenhao Lin, Zhengyu Zhao +1 more

To address the trade-off between robustness and performance for robust VLM, we observe that function words could incur vulnerability of VLMs against...

5 months ago cs.LG cs.CL PDF

Benchmark HIGH

OmniSafeBench-MM: A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense Evaluation

Xiaojun Jia, Jie Liao, Qi Guo +11 more

Recent advances in multi-modal large language models (MLLMs) have enabled unified perception-reasoning capabilities, yet these systems remain highly...

5 months ago cs.CR cs.CV PDF

Benchmark MEDIUM

CFCEval: Evaluating Security Aspects in Code Generated by Large Language Models

Cheng Cheng, Jinqiu Yang

Code-focused Large Language Models (LLMs), such as CodeX and Star-Coder, have demonstrated remarkable capabilities in enhancing developer...

5 months ago cs.SE PDF

Benchmark HIGH

Sift or Get Off the PoC: Applying Information Retrieval to Vulnerability Research with SiftRank

Caleb Gross

Security research is fundamentally a problem of resource constraint and consequent prioritization. There is simply too much attack surface and too...

5 months ago cs.CR cs.IR PDF

Benchmark HIGH

TeleAI-Safety: A comprehensive LLM jailbreaking benchmark towards attacks, defenses, and evaluations

Xiuyuan Chen, Jian Zhao, Yuxiang He +10 more

While the deployment of large language models (LLMs) in high-value industries continues to expand, the systematic assessment of their safety against...

5 months ago cs.CR PDF

Benchmark MEDIUM

Auto-SPT: Automating Semantic Preserving Transformations for Code

Ashish Hooda, Mihai Christodorescu, Chuangang Ren +3 more

Machine learning (ML) models for code clone detection determine whether two pieces of code are semantically equivalent, which in turn is a key...

5 months ago cs.SE cs.AI PDF

Benchmark LOW

Beyond Detection: A Comprehensive Benchmark and Study on Representation Learning for Fine-Grained Webshell Family Classification

Feijiang Han

Malicious WebShells pose a significant and evolving threat by compromising critical digital infrastructures and endangering public services in...

5 months ago cs.CR cs.AI cs.LG PDF

Benchmark MEDIUM

Boundary-Aware Test-Time Adaptation for Zero-Shot Medical Image Segmentation

Chenlin Xu, Lei Zhang, Lituan Wang +5 more

Due to the scarcity of annotated data and the substantial computational costs of model, conventional tuning methods in medical image segmentation...

5 months ago cs.CV PDF

Benchmark MEDIUM

MarkTune: Improving the Quality-Detectability Trade-off in Open-Weight LLM Watermarking

Yizhou Zhao, Zhiwei Steven Wu, Adam Block

Watermarking aims to embed hidden signals in generated text that can be reliably detected when given access to a secret key. Open-weight language...

5 months ago cs.LG cs.AI cs.CR PDF

Benchmark MEDIUM

Context-Aware Hierarchical Learning: A Two-Step Paradigm towards Safer LLMs

Tengyun Ma, Jiaqi Yao, Daojing He +4 more

Large Language Models (LLMs) have emerged as powerful tools for diverse applications. However, their uniform token processing paradigm introduces...

5 months ago cs.CR cs.AI PDF

Benchmark HIGH

Is Vibe Coding Safe? Benchmarking Vulnerability of Agent-Generated Code in Real-World Tasks

Songwen Zhao, Danqing Wang, Kexun Zhang +3 more

Vibe coding is a new programming paradigm in which human engineers instruct large language model (LLM) agents to complete complex coding tasks with...

5 months ago cs.SE cs.CL PDF

Benchmark MEDIUM

COGNITION: From Evaluation to Defense against Multimodal LLM CAPTCHA Solvers

Junyu Wang, Changjia Zhu, Yuanbo Zhou +3 more

This paper studies how multimodal large language models (MLLMs) undermine the security guarantees of visual CAPTCHA. We identify the attack surface...

5 months ago cs.CR cs.AI PDF

Benchmark LOW

DialogGuard: Multi-Agent Psychosocial Safety Evaluation of Sensitive LLM Responses

Han Luo, Guy Laban

Large language models (LLMs) now mediate many web-based mental-health, crisis, and other emotionally sensitive services, yet their psychosocial...

5 months ago cs.AI cs.HC cs.MA PDF
