AI Security Research

2,560+ academic papers on AI security, attacks, and defenses

Total: 2,560
Attack: 982
Benchmark: 736
Defense: 350
Tool: 275
Survey: 144

Showing 301–320 of 635 papers

Attack HIGH

Persona Jailbreaking in Large Language Models

Jivnesh Sandhan, Fei Cheng, Tushar Sandhan +1 more

Large Language Models (LLMs) are increasingly deployed in domains such as education, mental health and customer support, where stable and consistent...

3 months ago cs.CL PDF

Attack HIGH

Attributing and Exploiting Safety Vectors through Global Optimization in Large Language Models

Fengheng Chu, Jiahao Chen, Yuhong Wang +4 more

While Large Language Models (LLMs) are aligned to mitigate risks, their safety guardrails remain fragile against jailbreak attacks. This reveals...

3 months ago cs.LG cs.CR PDF

Attack HIGH

Beyond Visual Safety: Jailbreaking Multimodal Large Language Models for Harmful Image Generation via Semantic-Agnostic Inputs

Mingyu Yu, Lana Liu, Zhehao Zhao +2 more

The rapid advancement of Multimodal Large Language Models (MLLMs) has introduced complex security challenges, particularly at the intersection of...

3 months ago cs.CV cs.AI PDF

Attack HIGH

Multi-Targeted Graph Backdoor Attack

Md Nabi Newaz Khan, Abdullah Arafat Miah, Yu Bi

Graph neural networks (GNNs) have demonstrated exceptional performance in solving critical problems across diverse domains yet remain susceptible to...

3 months ago cs.LG cs.AI cs.CR PDF

Attack HIGH

Robust Fake News Detection using Large Language Models under Adversarial Sentiment Attacks

Sahar Tahmasebi, Eric Müller-Budack, Ralph Ewerth

Misinformation and fake news have become a pressing societal challenge, driving the need for reliable automated detection methods. Prior research has...

3 months ago cs.CL PDF

Attack HIGH

Lightweight LLMs for Network Attack Detection in IoT Networks

Piyumi Bhagya Sudasinghe, Kushan Sudheera Kalupahana Liyanage, Harsha S. Gardiyawasam Pussewalage

The rapid growth of Internet of Things (IoT) devices has increased the scale and diversity of cyberattacks, exposing limitations in traditional...

3 months ago cs.CR PDF

Attack HIGH

Beyond Denial-of-Service: The Puppeteer's Attack for Fine-Grained Control in Ranking-Based Federated Learning

Zhihao Chen, Zirui Gong, Jianting Ning +2 more

Federated Rank Learning (FRL) is a promising Federated Learning (FL) paradigm designed to be resilient against model poisoning attacks due to its...

3 months ago cs.LG cs.CR cs.DC PDF

Attack HIGH

Uncovering and Understanding FPR Manipulation Attack in Industrial IoT Networks

Mohammad Shamim Ahsan, Peng Liu

In the network security domain, due to practical issues -- including imbalanced data and heterogeneous legitimate network traffic -- adversarial...

3 months ago cs.CR cs.LG PDF

Attack HIGH

SecureSplit: Mitigating Backdoor Attacks in Split Learning

Zhihao Dou, Dongfei Cui, Weida Wang +7 more

Split Learning (SL) offers a framework for collaborative model training that respects data privacy by allowing participants to share the same dataset...

3 months ago cs.CR cs.DC cs.LG PDF

Attack HIGH

PINA: Prompt Injection Attack against Navigation Agents

Jiani Liu, Yixin He, Lanlan Fan +5 more

Navigation agents powered by large language models (LLMs) convert natural language instructions into executable plans and actions. Compared to...

3 months ago cs.CR PDF

Attack HIGH

SilentDrift: Exploiting Action Chunking for Stealthy Backdoor Attacks on Vision-Language-Action Models

Bingxin Xu, Yuzhang Shang, Binghui Wang +1 more

Vision-Language-Action (VLA) models are increasingly deployed in safety-critical robotic applications, yet their security vulnerabilities remain...

3 months ago cs.CR cs.AI cs.RO PDF

Attack HIGH

Sockpuppetting: Jailbreaking LLMs Without Optimization Through Output Prefix Injection

Asen Dotsinski, Panagiotis Eustratiadis

As open-weight large language models (LLMs) increase in capabilities, safeguarding them against malicious prompts and understanding possible attack...

3 months ago cs.CL cs.CR cs.LG PDF

Attack HIGH

Prompt Injection Mitigation with Agentic AI, Nested Learning, and AI Sustainability via Semantic Caching

Diego Gosmar, Deborah A. Dahl

Prompt injection remains a central obstacle to the safe deployment of large language models, particularly in multi-agent settings where intermediate...

3 months ago cs.AI cs.MA PDF

Attack HIGH

CODE: A Contradiction-Based Deliberation Extension Framework for Overthinking Attacks on Retrieval-Augmented Generation

Xiaolei Zhang, Xiaojun Jia, Liquan Chen +1 more

Introducing reasoning models into Retrieval-Augmented Generation (RAG) systems enhances task performance through step-by-step reasoning, logical...

3 months ago cs.CR PDF

Attack HIGH

ChartAttack: Testing the Vulnerability of LLMs to Malicious Prompting in Chart Generation

Jesus-German Ortiz-Barajas, Jonathan Tonglet, Vivek Gupta +1 more

Multimodal large language models (MLLMs) are increasingly used to automate chart generation from data tables, enabling efficient data analysis and...

3 months ago cs.CL PDF

Attack HIGH

TrojanPraise: Jailbreak LLMs via Benign Fine-Tuning

Zhixin Xie, Xurui Song, Jun Luo

The demand for customized large language models (LLMs) has led to commercial LLMs offering black-box fine-tuning APIs, yet this convenience introduces...

3 months ago cs.CR cs.LG PDF

Attack HIGH

Zero-Shot Embedding Drift Detection: A Lightweight Defense Against Prompt Injections in LLMs

Anirudh Sekar, Mrinal Agarwal, Rachel Sharma +4 more

Prompt injection attacks have become an increasing vulnerability for LLM applications, where adversarial prompts exploit indirect input channels such...

3 months ago cs.CR cs.CL PDF

Attack HIGH

SD-RAG: A Prompt-Injection-Resilient Framework for Selective Disclosure in Retrieval-Augmented Generation

Aiman Al Masoud, Marco Arazzi, Antonino Nocera

Retrieval-Augmented Generation (RAG) has attracted significant attention due to its ability to combine the generative capabilities of Large Language...

3 months ago cs.CR cs.AI PDF

Attack HIGH

AJAR: Adaptive Jailbreak Architecture for Red-teaming

Yipu Dou, Wang Yang

Large language model (LLM) safety evaluation is moving from content moderation to action security as modern systems gain persistent state, tool...

3 months ago cs.CR cs.CL PDF

Attack HIGH

Serverless AI Security: Attack Surface Analysis and Runtime Protection Mechanisms for FaaS-Based Machine Learning

Chetan Pathade, Vinod Dhimam, Sheheryar Ahmad +1 more

Serverless computing has achieved widespread adoption, with over 70% of AWS organizations using serverless solutions [1]. Meanwhile, machine learning...

3 months ago cs.CR cs.AI PDF
