AI Security Research

2,529+ academic papers on AI security, attacks, and defenses

Total

2,529

Attack

969

Benchmark

729

Defense

345

Tool

272

Survey

142

Type

All Attack Defense Survey Benchmark Tool

Relevance

All High Medium

Date

All time 7 days 30 days 6 months

Showing 261–280 of 312 papers

Clear filters

Attack MEDIUM

Collaborative penetration testing suite for emerging generative AI algorithms

Petar Radanliev

Problem Space: AI Vulnerabilities and Quantum Threats Generative AI vulnerabilities: model inversion, data poisoning, adversarial inputs. Quantum...

6 months ago cs.CR cs.AI cs.LG PDF

Attack MEDIUM

Agentic Reinforcement Learning for Search is Unsafe

Yushi Yang, Shreyansh Padarha, Andrew Lee +1 more

Agentic reinforcement learning (RL) trains large language models to autonomously call tools during reasoning, with search as the most common...

6 months ago cs.CL PDF

Attack MEDIUM

Can Transformer Memory Be Corrupted? Investigating Cache-Side Vulnerabilities in Large Language Models

Elias Hossain, Swayamjit Saha, Somshubhra Roy +1 more

Even when prompts and parameters are secured, transformer language models remain vulnerable because their key-value (KV) cache during inference...

6 months ago cs.CR cs.AI PDF

Attack MEDIUM

Black-box Optimization of LLM Outputs by Asking for Directions

Jie Zhang, Meng Ding, Yang Liu +2 more

We present a novel approach for attacking black-box large language models (LLMs) by exploiting their ability to express confidence in natural...

6 months ago cs.CR cs.LG PDF

Attack MEDIUM

DistilLock: Safeguarding LLMs from Unauthorized Knowledge Distillation on the Edge

Asmita Mohanty, Gezheng Kang, Lei Gao +1 more

Large Language Models (LLMs) have demonstrated strong performance across diverse tasks, but fine-tuning them typically relies on cloud-based,...

6 months ago cs.CR cs.LG PDF

Attack MEDIUM

Detecting Adversarial Fine-tuning with Auditing Agents

Sarah Egler, John Schulman, Nicholas Carlini

Large Language Model (LLM) providers expose fine-tuning APIs that let end users fine-tune their frontier LLMs. Unfortunately, it has been shown that...

6 months ago cs.CR cs.AI PDF

Attack MEDIUM

Are My Optimized Prompts Compromised? Exploring Vulnerabilities of LLM-based Optimizers

Andrew Zhao, Reshmi Ghosh, Vitor Carvalho +4 more

Large language model (LLM) systems increasingly power everyday AI applications such as chatbots, computer-use assistants, and autonomous robots,...

6 months ago cs.LG cs.AI cs.CL PDF

Attack MEDIUM

RHINO: Guided Reasoning for Mapping Network Logs to Adversarial Tactics and Techniques with Large Language Models

Fanchao Meng, Jiaping Gui, Yunbo Li +1 more

Modern Network Intrusion Detection Systems generate vast volumes of low-level alerts, yet these outputs remain semantically fragmented, requiring...

6 months ago cs.CR PDF

Attack MEDIUM

TAO: Tolerance-Aware Optimistic Verification for Floating-Point Neural Networks

Jianzhu Yao, Hongxu Su, Taobo Liao +4 more

Neural networks increasingly run on hardware outside the user's control (cloud GPUs, inference marketplaces). Yet ML-as-a-Service reveals little...

6 months ago cs.CR cs.AI cs.LG PDF

Attack MEDIUM

DeepTrust: Multi-Step Classification through Dissimilar Adversarial Representations for Robust Android Malware Detection

Daniel Pulido-Cortázar, Daniel Gibert, Felip Manyà

Over the last decade, machine learning has been extensively applied to identify malicious Android applications. However, such approaches remain...

7 months ago cs.CR cs.LG PDF

Attack MEDIUM

Robust ML-based Detection of Conventional, LLM-Generated, and Adversarial Phishing Emails Using Advanced Text Preprocessing

Deeksha Hareesha Kulal, Chidozie Princewill Arannonu, Afsah Anwar +2 more

Phishing remains a critical cybersecurity threat, especially with the advent of large language models (LLMs) capable of generating highly convincing...

7 months ago cs.CR PDF

Attack MEDIUM

Living Off the LLM: How LLMs Will Change Adversary Tactics

Sean Oesch, Jack Hutchins, Luke Koch +1 more

In living off the land attacks, malicious actors use legitimate tools and processes already present on a system to avoid detection. In this paper, we...

7 months ago cs.CR cs.AI PDF

Attack MEDIUM

Large Language Models Are Effective Code Watermarkers

Rui Xu, Jiawei Chen, Zhaoxia Yin +2 more

The widespread use of large language models (LLMs) and open-source code has raised ethical and security concerns regarding the distribution and...

7 months ago cs.CR cs.AI cs.LG PDF

Attack MEDIUM

Generative AI for Biosciences: Emerging Threats and Roadmap to Biosecurity

Zaixi Zhang, Souradip Chakraborty, Amrit Singh Bedi +16 more

The rapid adoption of generative artificial intelligence (GenAI) in the biosciences is transforming biotechnology, medicine, and synthetic biology....

7 months ago cs.CR q-bio.BM PDF

Attack MEDIUM

Safeguarding Efficacy in Large Language Models: Evaluating Resistance to Human-Written and Algorithmic Adversarial Prompts

Tiarnaigh Downey-Webb, Olamide Jogunola, Oluwaseun Ajao

This paper presents a systematic security assessment of four prominent Large Language Models (LLMs) against diverse adversarial attack vectors. We...

7 months ago cs.CR cs.AI cs.CY PDF

Attack MEDIUM

"I know it's not right, but that's what it said to do": Investigating Trust in AI Chatbots for Cybersecurity Policy

Brandon Lit, Edward Crowder, Daniel Vogel +1 more

AI chatbots are an emerging security attack vector, vulnerable to threats such as prompt injection, and rogue chatbot creation. When deployed in...

7 months ago cs.HC PDF

Attack MEDIUM

The Model's Language Matters: A Comparative Privacy Analysis of LLMs

Abhishek K. Mishra, Antoine Boutet, Lucas Magnana

Large Language Models (LLMs) are increasingly deployed across multilingual applications that handle sensitive data, yet their scale and linguistic...

7 months ago cs.CL cs.CR PDF

Attack MEDIUM

VisualDAN: Exposing Vulnerabilities in VLMs with Visual-Driven DAN Commands

Aofan Liu, Lulu Tang

Vision-Language Models (VLMs) have garnered significant attention for their remarkable ability to interpret and generate multimodal content. However,...

7 months ago cs.CR cs.AI PDF

Attack MEDIUM

Chain-of-Trigger: An Agentic Backdoor that Paradoxically Enhances Agentic Robustness

Jiyang Qiu, Xinbei Ma, Yunqing Xu +2 more

The rapid deployment of large language model (LLM)-based agents in real-world applications has raised serious concerns about their trustworthiness....

7 months ago cs.AI PDF

Attack MEDIUM

Get RICH or Die Scaling: Profitably Trading Inference Compute for Robustness

Tavish McDonald, Bo Lei, Stanislav Fort +2 more

Models are susceptible to adversarially out-of-distribution (OOD) data despite large training-compute investments into their robustification. Zaremba...

7 months ago cs.LG PDF

Track AI security vulnerabilities in real time

Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.

Start 14-Day Free Trial