Attack HIGH
Yuanhe Zhang, Weiliu Wang, Zhenhong Zhou +5 more
Large Language Model (LLM)-based agents have demonstrated remarkable capabilities in reasoning, planning, and tool usage. The recently proposed Model...
5 months ago cs.CR cs.CL
Attack HIGH
Haowei Fu, Bo Ni, Han Xu +3 more
Retrieval-Augmented Generation (RAG) and Supervised Finetuning (SFT) have become the predominant paradigms for equipping Large Language Models (LLMs)...
5 months ago cs.CR cs.AI
Attack HIGH
Omar Farooq Khan Suri, John McCrae
Large Language Models (LLMs) are increasingly being deployed in real-world applications, but their flexibility exposes them to prompt injection...
5 months ago cs.CR cs.CL cs.LG
Attack HIGH
Zihao Wang, Kar Wai Fok, Vrizlynn L. L. Thing
Multi-modal large language models (MLLMs), capable of processing text, images, and audio, have been widely adopted in various AI applications....
Attack HIGH
Mintong Kang, Chong Xiang, Sanjay Kariyappa +3 more
Indirect prompt injection attacks (IPIAs), where large language models (LLMs) follow malicious instructions hidden in input data, pose a critical...
5 months ago cs.CR cs.LG
Attack HIGH
Hao Wu, Prateek Saxena
This paper explores attacks and defenses on vector databases in retrieval-augmented generation (RAG) systems. Prior work on knowledge poisoning...
5 months ago cs.CR cs.AI cs.DB
Attack HIGH
Haoyu Shen, Weimin Lyu, Haotian Xu +1 more
Vision-Language Models (VLMs) have achieved impressive progress in multimodal text generation, yet their rapid adoption raises increasing concerns...
5 months ago cs.CR cs.AI
Benchmark HIGH
Jiawei Chen, Yang Yang, Chao Yu +6 more
Large Reasoning Models (LRMs) have emerged as a powerful advancement in multi-step reasoning tasks, offering enhanced transparency and logical...
5 months ago cs.CR cs.AI
Attack HIGH
Mohammad M Maheri, Xavier Cadet, Peter Chin +1 more
Approximate machine unlearning aims to efficiently remove the influence of specific data points from a trained model, offering a practical...
5 months ago cs.LG cs.AI cs.CR
Defense HIGH
Fouad Trad, Ali Chehab
Few-shot prompting has emerged as a practical alternative to fine-tuning for leveraging the capabilities of large language models (LLMs) in...
5 months ago cs.SE cs.AI cs.CL
Attack HIGH
Richard J. Young
Large Language Model (LLM) safety guardrail models have emerged as a primary defense mechanism against harmful content generation, yet their...
Attack HIGH
Tianyu Zhang, Zihang Xi, Jingyu Hua +1 more
In the realm of black-box jailbreak attacks on large language models (LLMs), the feasibility of constructing a narrow safety proxy, a lightweight...
5 months ago cs.CR cs.AI
Attack HIGH
Kaiyuan Zhang, Mark Tenenholtz, Kyle Polley +3 more
The integration of artificial intelligence (AI) agents into web browsers introduces security challenges that go beyond traditional web application...
5 months ago cs.LG cs.AI cs.CR
Attack HIGH
Jakub Hoscilowicz, Artur Janicki
We introduce the Adversarial Confusion Attack, a new class of threats against multimodal large language models (MLLMs). Unlike jailbreaks or targeted...
Attack HIGH
Sen Nie, Jie Zhang, Jianxin Yan +2 more
Adversarial attacks have evolved from simply disrupting predictions on conventional task-specific models to the more complex goal of manipulating...
Attack HIGH
Yingjia Shang, Yi Liu, Huimin Wang +4 more
With the rapid advancement of retrieval-augmented vision-language models, multimodal medical retrieval-augmented generation (MMed-RAG) systems are...
5 months ago cs.CR cs.AI cs.LG
Attack HIGH
Md Akil Raihan Iftee, Syed Md. Ahnaf Hasan, Amin Ahsan Ali +3 more
Test-time personalization in federated learning enables models at clients to adjust online to local domain shifts, enhancing robustness and...
5 months ago cs.CR cs.CV
Attack HIGH
Xurui Li, Kaisong Song, Rui Zhu +2 more
Large Language Models (LLMs) have developed rapidly in web services, delivering unprecedented capabilities while amplifying societal risks. Existing...
5 months ago cs.CR cs.AI
Attack HIGH
Yixin Wu, Rui Wen, Chi Cui +2 more
Inference attacks have been widely studied and offer a systematic risk assessment of ML services; however, their implementation and the attack...
5 months ago cs.CR cs.AI
Attack HIGH
Ryan Wong, Hosea David Yu Fei Ng, Dhananjai Sharma +2 more
Large Language Models (LLMs) remain susceptible to jailbreak exploits that bypass safety filters and induce harmful or unethical behavior. This work...
5 months ago cs.CR cs.AI