AI Security Research

2,560+ academic papers on AI security, attacks, and defenses

Total: 2,560
Attack: 982
Benchmark: 736
Defense: 350
Tool: 275
Survey: 144

Showing 981–1000 of 1,220 papers

Defense MEDIUM

Reimagining Safety Alignment with An Image

Yifan Xia, Guorui Chen, Wenqian Yu +3 more

Large language models (LLMs) excel in diverse applications but face dual challenges: generating harmful content under jailbreak attacks and...

6 months ago cs.AI cs.CR PDF

Defense MEDIUM

Proactive DDoS Detection and Mitigation in Decentralized Software-Defined Networking via Port-Level Monitoring and Zero-Training Large Language Models

Mohammed N. Swileh, Shengli Zhang

Centralized Software-Defined Networking (cSDN) offers flexible and programmable control of networks but suffers from scalability and reliability...

6 months ago cs.CR cs.AI PDF

Attack MEDIUM

Diffusion LLMs are Natural Adversaries for any LLM

David Lüdke, Tom Wollschläger, Paul Ungermann +2 more

We introduce a novel framework that transforms the resource-intensive (adversarial) prompt optimization problem into an efficient, amortized...

6 months ago cs.LG stat.ML PDF

Survey MEDIUM

Prevalence of Security and Privacy Risk-Inducing Usage of AI-based Conversational Agents

Kathrin Grosse, Nico Ebert

Recent improvement gains in large language models (LLMs) have led to everyday usage of AI-based Conversational Agents (CAs). At the same time, LLMs...

6 months ago cs.CR PDF

Attack MEDIUM

Measuring the Security of Mobile LLM Agents under Adversarial Prompts from Untrusted Third-Party Channels

Chenghao Du, Quanfeng Huang, Tingxuan Tang +3 more

Large Language Models (LLMs) have transformed software development, enabling AI-powered applications known as LLM-based agents that promise to...

6 months ago cs.CR PDF

Benchmark MEDIUM

Self-HarmLLM: Can Large Language Model Harm Itself?

Heehwan Kim, Sungjune Park, Daeseon Choi

Large Language Models (LLMs) are generally equipped with guardrails to block the generation of harmful responses. However, existing defenses always...

6 months ago cs.CL cs.AI PDF

Benchmark MEDIUM

Adapting Large Language Models to Emerging Cybersecurity using Retrieval Augmented Generation

Arnabh Borah, Md Tanvirul Alam, Nidhi Rastogi

Security applications are increasingly relying on large language models (LLMs) for cyber threat detection; however, their opaque reasoning often...

6 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Reasoning Up the Instruction Ladder for Controllable Language Models

Zishuo Zheng, Vidhisha Balachandran, Chan Young Park +2 more

As large language model (LLM) based systems take on high-stakes roles in real-world decision-making, they must reconcile competing instructions from...

6 months ago cs.CL cs.AI PDF

Benchmark MEDIUM

Broken-Token: Filtering Obfuscated Prompts by Counting Characters-Per-Token

Shaked Zychlinski, Yuval Kainan

Large Language Models (LLMs) are susceptible to jailbreak attacks where malicious prompts are disguised using ciphers and character-level encodings...

6 months ago cs.CR cs.AI cs.CL PDF

Benchmark MEDIUM

SSCL-BW: Sample-Specific Clean-Label Backdoor Watermarking for Dataset Ownership Verification

Yingjia Wang, Ting Qiao, Xing Liu +3 more

The rapid advancement of deep neural networks (DNNs) heavily relies on large-scale, high-quality datasets. However, unauthorized commercial use of...

6 months ago cs.CR cs.AI PDF

Attack MEDIUM

PVMark: Enabling Public Verifiability for LLM Watermarking Schemes

Haohua Duan, Liyao Xiang, Xin Zhang

Watermarking schemes for large language models (LLMs) have been proposed to identify the source of the generated text, mitigating the potential...

6 months ago cs.CR cs.CL cs.LG PDF

Attack MEDIUM

PEEL: A Poisoning-Exposing Encoding Theoretical Framework for Local Differential Privacy

Lisha Shuai, Jiuling Dong, Nan Zhang +5 more

Local Differential Privacy (LDP) is a widely adopted privacy-protection model in the Internet of Things (IoT) due to its lightweight, decentralized,...

6 months ago cs.CR PDF

Defense MEDIUM

ALMGuard: Safety Shortcuts and Where to Find Them as Guardrails for Audio-Language Models

Weifei Jin, Yuxin Cao, Junjie Su +5 more

Recent advances in Audio-Language Models (ALMs) have significantly improved multimodal understanding capabilities. However, the introduction of the...

6 months ago cs.SD cs.CR cs.LG PDF

Benchmark MEDIUM

LLMBisect: Breaking Barriers in Bug Bisection with A Comparative Analysis Pipeline

Zheng Zhang, Haonan Li, Xingyu Li +2 more

Bug bisection has been an important security task that aims to understand the range of software versions impacted by a bug, i.e., identifying the...

6 months ago cs.LG PDF

Benchmark MEDIUM

RECAP: Reproducing Copyrighted Data from LLMs Training with an Agentic Pipeline

André V. Duarte, Xuying Li, Bin Zeng +3 more

If we cannot inspect the training data of a large language model (LLM), how can we ever know what it has seen? We believe the most compelling...

6 months ago cs.CL PDF

Survey MEDIUM

SoK: Honeypots & LLMs, More Than the Sum of Their Parts?

Robert A. Bridges, Thomas R. Mitchell, Mauricio Muñoz +1 more

The advent of Large Language Models (LLMs) promised to resolve the long-standing paradox in honeypot design, achieving high-fidelity deception with...

6 months ago cs.CR PDF

Benchmark MEDIUM

VISAT: Benchmarking Adversarial and Distribution Shift Robustness in Traffic Sign Recognition with Visual Attributes

Simon Yu, Peilin Yu, Hongbo Zheng +3 more

We present VISAT, a novel open dataset and benchmarking suite for evaluating model robustness in the task of traffic sign recognition with the...

6 months ago cs.CR cs.AI cs.LG PDF

Tool MEDIUM

AAGATE: A NIST AI RMF-Aligned Governance Platform for Agentic AI

Ken Huang, Kyriakos Rock Lambros, Jerry Huang +8 more

This paper introduces the Agentic AI Governance Assurance & Trust Engine (AAGATE), a Kubernetes-native control plane designed to address the unique...

6 months ago cs.CR cs.AI PDF

Attack MEDIUM

SmoothGuard: Defending Multimodal Large Language Models with Noise Perturbation and Clustering Aggregation

Guangzhi Su, Shuchang Huang, Yutong Ke +3 more

Multimodal large language models (MLLMs) have achieved impressive performance across diverse tasks by jointly reasoning over textual and visual...

6 months ago cs.LG cs.CR PDF

Benchmark MEDIUM

NetEcho: From Real-World Streaming Side-Channels to Full LLM Conversation Recovery

Zheng Zhang, Guanlong Wu, Sen Deng +2 more

In the rapidly expanding landscape of Large Language Model (LLM) applications, real-time output streaming has become the dominant interaction...

6 months ago cs.CR PDF
