Bias Injection Attacks on RAG Databases and Sanitization Defenses
Hao Wu, Prateek Saxena
This paper explores attacks and defenses on vector databases in retrieval-augmented generation (RAG) systems. Prior work on knowledge poisoning...
Haoyu Shen, Weimin Lyu, Haotian Xu +1 more
Vision-Language Models (VLMs) have achieved impressive progress in multimodal text generation, yet their rapid adoption raises increasing concerns...
Mohammad M Maheri, Xavier Cadet, Peter Chin +1 more
Approximate machine unlearning aims to efficiently remove the influence of specific data points from a trained model, offering a practical...
Richard J. Young
Large Language Model (LLM) safety guardrail models have emerged as a primary defense mechanism against harmful content generation, yet their...
Tianyu Zhang, Zihang Xi, Jingyu Hua +1 more
In the realm of black-box jailbreak attacks on large language models (LLMs), the feasibility of constructing a narrow safety proxy, a lightweight...
Kaiyuan Zhang, Mark Tenenholtz, Kyle Polley +3 more
The integration of artificial intelligence (AI) agents into web browsers introduces security challenges that go beyond traditional web application...
Jakub Hoscilowicz, Artur Janicki
We introduce the Adversarial Confusion Attack, a new class of threats against multimodal large language models (MLLMs). Unlike jailbreaks or targeted...
Sen Nie, Jie Zhang, Jianxin Yan +2 more
Adversarial attacks have evolved from simply disrupting predictions on conventional task-specific models to the more complex goal of manipulating...
Yingjia Shang, Yi Liu, Huimin Wang +4 more
With the rapid advancement of retrieval-augmented vision-language models, multimodal medical retrieval-augmented generation (MMed-RAG) systems are...
Md Akil Raihan Iftee, Syed Md. Ahnaf Hasan, Amin Ahsan Ali +3 more
Test-time personalization in federated learning enables models at clients to adjust online to local domain shifts, enhancing robustness and...
Xurui Li, Kaisong Song, Rui Zhu +2 more
Large Language Models (LLMs) have developed rapidly in web services, delivering unprecedented capabilities while amplifying societal risks. Existing...
Yixin Wu, Rui Wen, Chi Cui +2 more
Inference attacks have been widely studied and offer a systematic risk assessment of ML services; however, their implementation and the attack...
Ryan Wong, Hosea David Yu Fei Ng, Dhananjai Sharma +2 more
Large Language Models (LLMs) remain susceptible to jailbreak exploits that bypass safety filters and induce harmful or unethical behavior. This work...
Adarsh Kumarappan, Ananya Mujoo
Multi-turn conversational attacks, which leverage psychological principles like Foot-in-the-Door (FITD), where a small initial request paves the way...
Yanxi Li, Ruocheng Shan
Large language models are increasingly used for text classification tasks such as sentiment analysis, yet their reliance on natural language prompts...
Yanting Wang, Runpeng Geng, Jinghui Chen +2 more
Many recent studies showed that LLMs are vulnerable to jailbreak attacks, where an attacker can perturb the input of an LLM to induce it to generate...
Pinaki Prasad Guha Neogi, Ahmad Mohammadshirazi, Dheeraj Kulshrestha +1 more
Mixture-of-Experts (MoE) architectures are increasingly adopted in large language models (LLMs) for their scalability and efficiency. However, their...
Junrui Zhang, Xinyu Zhao, Jie Peng +3 more
Multimodal learning has shown significant superiority on various tasks by integrating multiple modalities. However, the interdependencies among...
Oluleke Babayomi, Dong-Seong Kim
Electric Vehicle (EV) charging infrastructure faces escalating cybersecurity threats that can severely compromise operational efficiency and grid...
Yunyi Zhang, Shibo Cui, Baojun Liu +4 more
LLM applications (i.e., LLM apps) leverage the powerful capabilities of LLMs to provide users with customized services, revolutionizing traditional...