AI Security Research

2,560+ academic papers on AI security, attacks, and defenses

Total: 2,560
Attack: 982
Benchmark: 736
Defense: 350
Tool: 275
Survey: 144

Showing 501–520 of 879 papers

Attack HIGH

Emoji-Based Jailbreaking of Large Language Models

M P V S Gopinadh, S Mahaboob Hussain

Large Language Models (LLMs) are integral to modern AI applications, but their safety alignment mechanisms can be bypassed through adversarial prompt...

4 months ago cs.CR cs.AI PDF

Tool HIGH

Low Rank Comes with Low Security: Gradient Assembly Poisoning Attacks against Distributed LoRA-based LLM Systems

Yueyan Dong, Minghui Xu, Qin Hu +5 more

Low-Rank Adaptation (LoRA) has become a popular solution for fine-tuning large language models (LLMs) in federated settings, dramatically reducing...

4 months ago cs.CR PDF

Attack HIGH

Engineering Attack Vectors and Detecting Anomalies in Additive Manufacturing

Md Mahbub Hasan, Marcus Sternhagen, Krishna Chandra Roy

Additive manufacturing (AM) is rapidly integrating into critical sectors such as aerospace, automotive, and healthcare. However, this cyber-physical...

4 months ago cs.CR cs.AI cs.LG PDF

Benchmark HIGH

An Empirical Evaluation of LLM-Based Approaches for Code Vulnerability Detection: RAG, SFT, and Dual-Agent Systems

Md Hasan Saju, Maher Muhtadi, Akramul Azim

The rapid advancement of Large Language Models (LLMs) presents new opportunities for automated software vulnerability detection, a crucial task in...

4 months ago cs.SE cs.AI PDF

Attack HIGH

Overlooked Safety Vulnerability in LLMs: Malicious Intelligent Optimization Algorithm Request and its Jailbreak

Haoran Gu, Handing Wang, Yi Mei +2 more

The widespread deployment of large language models (LLMs) has raised growing concerns about their misuse risks and associated safety issues. While...

4 months ago cs.CR cs.CL PDF

Attack HIGH

Large Empirical Case Study: Go-Explore adapted for AI Red Team Testing

Manish Bhatt, Adrian Wood, Idan Habler +1 more

Production LLM agents with tool-using capabilities require security testing despite their safety training. We adapt Go-Explore to evaluate...

4 months ago cs.CR cs.AI cs.LG PDF

Attack HIGH

GCG Attack On A Diffusion LLM

Ruben Neyroud, Sam Corley

While most LLMs are autoregressive, diffusion-based LLMs have recently emerged as an alternative method for generation. Greedy Coordinate Gradient...

4 months ago cs.LG cs.CL cs.CR PDF

Benchmark HIGH

Language Model Agents Under Attack: A Cross Model-Benchmark of Profit-Seeking Behaviors in Customer Service

Jingyu Zhang

Customer-service LLM agents increasingly make policy-bound decisions (refunds, rebooking, billing disputes), but the same "helpful" interaction...

4 months ago cs.CR cs.HC PDF

Attack HIGH

Jailbreaking Attacks vs. Content Safety Filters: How Far Are We in the LLM Safety Arms Race?

Yuan Xin, Dingfan Chen, Linyi Yang +2 more

As large language models (LLMs) are increasingly deployed, ensuring their safe use is paramount. Jailbreaking, adversarial prompts that bypass model...

4 months ago cs.CR cs.AI cs.CL PDF

Attack HIGH

Breaking Audio Large Language Models by Attacking Only the Encoder: A Universal Targeted Latent-Space Audio Attack

Roee Ziv, Raz Lapid, Moshe Sipper

Audio-language models combine audio encoders with large language models to enable multimodal reasoning, but they also introduce new security...

4 months ago cs.SD cs.AI cs.CR PDF

Survey HIGH

Multilingual Hidden Prompt Injection Attacks on LLM-Based Academic Reviewing

Panagiotis Theocharopoulos, Ajinkya Kulkarni, Mathew Magimai.-Doss

Large language models (LLMs) are increasingly considered for use in high-impact workflows, including academic peer review. However, LLMs are...

4 months ago cs.CL cs.AI PDF

Tool HIGH

Toward Trustworthy Agentic AI: A Multimodal Framework for Preventing Prompt Injection Attacks

Toqeer Ali Syed, Mishal Ateeq Almutairi, Mahmoud Abdel Moaty

Powerful autonomous systems, which reason, plan, and converse using and between numerous tools and agents, are made possible by Large Language Models...

4 months ago cs.CR cs.AI PDF

Defense HIGH

Agentic AI for Autonomous Defense in Software Supply Chain Security: Beyond Provenance to Vulnerability Mitigation

Toqeer Ali Syed, Mohammad Riyaz Belgaum, Salman Jan +2 more

The software supply chain attacks are becoming more and more focused on trusted development and delivery procedures, so the conventional post-build...

4 months ago cs.CR cs.AI PDF

Benchmark HIGH

Prompt-Induced Over-Generation as Denial-of-Service: A Black-Box Attack-Side Benchmark

Manu, Yi Guo, Kanchana Thilakarathna +5 more

Large Language Models (LLMs) can be driven into over-generation, emitting thousands of tokens before producing an end-of-sequence (EOS) token. This...

4 months ago cs.CR cs.AI cs.LG PDF

Attack HIGH

RobustMask: Certified Robustness against Adversarial Neural Ranking Attack via Randomized Masking

Jiawei Liu, Zhuo Chen, Rui Zhu +4 more

Neural ranking models have achieved remarkable progress and are now widely deployed in real-world applications such as Retrieval-Augmented Generation...

4 months ago cs.CR cs.IR PDF

Attack HIGH

EquaCode: A Multi-Strategy Jailbreak Approach for Large Language Models via Equation Solving and Code Completion

Zhen Liang, Hai Huang, Zhengkui Chen

Large language models (LLMs), such as ChatGPT, have achieved remarkable success across a wide range of fields. However, their trustworthiness remains...

4 months ago cs.CR cs.AI PDF

Attack HIGH

Adaptive Trust Consensus for Blockchain IoT: Comparing RL, DRL, and MARL Against Naive, Collusive, Adaptive, Byzantine, and Sleeper Attacks

Soham Padia, Dhananjay Vaidya, Ramchandra Mangrulkar

Securing blockchain-enabled IoT networks against sophisticated adversarial attacks remains a critical challenge. This paper presents a trust-based...

4 months ago cs.CR cs.LG cs.MA PDF

Benchmark HIGH

Rethinking the Capability of Fine-Tuned Language Models for Automated Vulnerability Repair

Woorim Han, Yeongjun Kwak, Miseon Yu +4 more

Learning-based automated vulnerability repair (AVR) techniques that utilize fine-tuned language models have shown promise in generating vulnerability...

4 months ago cs.SE PDF

Attack HIGH

Backdoor Attacks on Prompt-Driven Video Segmentation Foundation Models

Zongmin Zhang, Zhen Sun, Yifan Liao +5 more

Prompt-driven Video Segmentation Foundation Models (VSFMs) such as SAM2 are increasingly deployed in applications like autonomous driving and digital...

4 months ago cs.CV cs.CR PDF

Benchmark HIGH

Beyond Single Bugs: Benchmarking Large Language Models for Multi-Vulnerability Detection

Chinmay Pushkar, Sanchit Kabra, Dhruv Kumar +1 more

Large Language Models (LLMs) have demonstrated significant potential in automated software security, particularly in vulnerability detection....

4 months ago cs.CR cs.AI PDF
