AI Security Research

2,583+ academic papers on AI security, attacks, and defenses

Total: 2,583
Attack: 994
Benchmark: 740
Defense: 355
Tool: 275
Survey: 146

Showing 381–400 of 890 papers

Attack HIGH

RedVisor: Reasoning-Aware Prompt Injection Defense via Zero-Copy KV Cache Reuse

Mingrui Liu, Sixiao Zhang, Cheng Long +1 more

Large Language Models (LLMs) are increasingly vulnerable to Prompt Injection (PI) attacks, where adversarial instructions hidden within retrieved...

3 months ago cs.CR cs.AI cs.LG PDF

Attack HIGH

Efficient Adversarial Attacks on High-dimensional Offline Bandits

Seyed Mohammad Hadi Hosseini, Amir Najafi, Mahdieh Soleymani Baghshah

Bandit algorithms have recently emerged as a powerful tool for evaluating machine learning models, including generative image models and large...

3 months ago cs.LG cs.AI PDF

Tool HIGH

Provable Defense Framework for LLM Jailbreaks via Noise-Augumented Alignment

Zehua Cheng, Jianwei Yang, Wei Dai +1 more

Large Language Models (LLMs) remain vulnerable to adaptive jailbreaks that easily bypass empirical defenses like GCG. We propose a framework for...

3 months ago cs.CL cs.AI PDF

Attack HIGH

SGHA-Attack: Semantic-Guided Hierarchical Alignment for Transferable Targeted Attacks on Vision-Language Models

Haobo Wang, Weiqi Luo, Xiaojun Jia +1 more

Large vision-language models (VLMs) are vulnerable to transfer-based adversarial perturbations, enabling attackers to optimize on surrogate models...

3 months ago cs.CV PDF

Attack HIGH

MAGIC: A Co-Evolving Attacker-Defender Adversarial Game for Robust LLM Safety

Xiaoyu Wen, Zhida He, Han Qi +7 more

Ensuring robust safety alignment is crucial for Large Language Models (LLMs), yet existing defenses often lag behind evolving adversarial attacks due...

3 months ago cs.AI cs.CL cs.LG PDF

Attack HIGH

TxRay: Agentic Postmortem of Live Blockchain Attacks

Ziyue Wang, Jiangshan Yu, Kaihua Qin +3 more

Decentralized Finance (DeFi) has turned blockchains into financial infrastructure, allowing anyone to trade, lend, and build protocols without...

3 months ago cs.CR cs.AI PDF

Attack HIGH

To Defend Against Cyber Attacks, We Must Teach AI Agents to Hack

Terry Yue Zhuo, Yangruibo Ding, Wenbo Guo +1 more

For over a decade, cybersecurity has relied on human labor scarcity to limit attackers to high-value targets manually or generic automated attacks at...

3 months ago cs.CR cs.AI cs.CY PDF

Attack HIGH

Toward Universal and Transferable Jailbreak Attacks on Vision-Language Models

Kaiyuan Cui, Yige Li, Yutao Wu +4 more

Vision-language models (VLMs) extend large language models (LLMs) with vision encoders, enabling text generation conditioned on both images and text....

3 months ago cs.LG cs.AI cs.CV PDF

Attack HIGH

GradingAttack: Attacking Large Language Models Towards Short Answer Grading Ability

Xueyi Li, Zhuoneng Zhou, Zitao Liu +2 more

Large language models (LLMs) have demonstrated remarkable potential for automatic short answer grading (ASAG), significantly boosting student...

3 months ago cs.CR cs.AI cs.CL PDF

Attack HIGH

A Causal Perspective for Enhancing Jailbreak Attack and Defense

Licheng Pan, Yunsheng Lu, Jiexi Liu +5 more

Uncovering the mechanisms behind "jailbreaks" in large language models (LLMs) is crucial for enhancing their safety and reliability, yet these...

3 months ago cs.LG cs.AI cs.CR PDF

Attack HIGH

Bypassing Prompt Injection Detectors through Evasive Injections

Md Jahedur Rahman, Ihsen Alouani

Large language models (LLMs) are increasingly used in interactive and retrieval-augmented systems, but they remain vulnerable to task drift;...

3 months ago cs.CR cs.AI PDF

Attack HIGH

Jailbreaking LLMs via Calibration

Yuxuan Lu, Yongkang Guo, Yuqing Kong

Safety alignment in Large Language Models (LLMs) often creates a systematic discrepancy between a model's aligned output and the underlying...

3 months ago cs.CL cs.AI cs.CR PDF

Tool HIGH

DECEIVE-AFC: Adversarial Claim Attacks against Search-Enabled LLM-based Fact-Checking Systems

Haoran Ou, Kangjie Chen, Gelei Deng +4 more

Fact-checking systems with search-enabled large language models (LLMs) have shown strong potential for verifying claims by dynamically retrieving...

3 months ago cs.CR cs.AI PDF

Attack HIGH

Text is All You Need for Vision-Language Model Jailbreaking

Yihang Chen, Zhao Xu, Youyuan Jiang +2 more

Large Vision-Language Models (LVLMs) are increasingly equipped with robust safety safeguards to prevent responses to harmful or disallowed prompts....

3 months ago cs.CV cs.AI cs.CR PDF

Attack HIGH

"Someone Hid It": Query-Agnostic Black-Box Attacks on LLM-Based Retrieval

Jiate Li, Defu Cao, Li Li +8 more

Large language models (LLMs) have been serving as effective backbones for retrieval systems, including Retrieval-Augmentation-Generation (RAG), Dense...

3 months ago cs.CR PDF

Attack HIGH

Optimal Transport-Guided Adversarial Attacks on Graph Neural Network-Based Bot Detection

Kunal Mukherjee, Zulfikar Alom, Tran Gia Bao Ngo +2 more

The rise of bot accounts on social media poses significant risks to public discourse. To address this threat, modern bot detectors increasingly rely...

3 months ago cs.LG cs.AI cs.CR PDF

Survey HIGH

Semantics-Preserving Evasion of LLM Vulnerability Detectors

Luze Sun, Alina Oprea, Eric Wong

LLM-based vulnerability detectors are increasingly deployed in security-critical code review, yet their resilience to evasion under...

3 months ago cs.CR cs.AI cs.LG PDF

Attack HIGH

Now You Hear Me: Audio Narrative Attacks Against Large Audio-Language Models

Ye Yu, Haibo Jin, Yaoning Yu +2 more

Large audio-language models increasingly operate on raw speech inputs, enabling more seamless integration across domains such as voice assistants,...

3 months ago cs.CL cs.AI cs.CR PDF

Attack HIGH

From Similarity to Vulnerability: Key Collision Attack on LLM Semantic Caching

Zhixiang Zhang, Zesen Liu, Yuchong Xie +2 more

Semantic caching has emerged as a pivotal technique for scaling LLM applications, widely adopted by major providers including AWS and Microsoft. By...

3 months ago cs.CR cs.AI PDF

Benchmark HIGH

Sifting the Noise: A Comparative Study of LLM Agents in Vulnerability False Positive Filtering

Yunpeng Xiong, Ting Zhang

Static Application Security Testing (SAST) tools are essential for identifying software vulnerabilities, but they often produce a high volume of...

3 months ago cs.SE PDF
