AI Security Research

2,529+ academic papers on AI security, attacks, and defenses

Total: 2,529

Attack: 969

Benchmark: 729

Defense: 345

Tool: 272

Survey: 142

Showing 601–620 of 715 papers

Attack HIGH

Adversarial Attacks Against Deep Learning-Based Radio Frequency Fingerprint Identification

Jie Ma, Junqing Zhang, Guanxiong Shen +2 more

Radio frequency fingerprint identification (RFFI) is an emerging technique for the lightweight authentication of wireless Internet of things (IoT)...

5 months ago cs.CR cs.LG PDF

Attack MEDIUM

PHANTOM: Progressive High-fidelity Adversarial Network for Threat Object Modeling

Jamal Al-Karaki, Muhammad Al-Zafar Khan, Rand Derar Mohammad Al Athamneh

The scarcity of cyberattack data hinders the development of robust intrusion detection systems. This paper introduces PHANTOM, a novel adversarial...

5 months ago cs.CR cs.AI cs.LG PDF

Attack HIGH

Persistent Backdoor Attacks under Continual Fine-Tuning of LLMs

Jing Cui, Yufei Han, Jianbin Jiao +1 more

Backdoor attacks embed malicious behaviors into Large Language Models (LLMs), enabling adversaries to trigger harmful outputs or bypass safety...

5 months ago cs.CR cs.AI PDF

Attack MEDIUM

Adaptive Intrusion Detection System Leveraging Dynamic Neural Models with Adversarial Learning for 5G/6G Networks

Neha, Tarunpreet Bhatia

Intrusion Detection Systems (IDS) are critical components in safeguarding 5G/6G networks from both internal and external cyber threats. While...

5 months ago cs.CR cs.LG PDF

Attack HIGH

FlipLLM: Efficient Bit-Flip Attacks on Multimodal LLMs using Reinforcement Learning

Khurram Khalil, Khaza Anuarul Hoque

Generative Artificial Intelligence models, such as Large Language Models (LLMs) and Large Vision Models (VLMs), exhibit state-of-the-art performance...

5 months ago cs.CR cs.AI PDF

Attack HIGH

SCOUT: A Defense Against Data Poisoning Attacks in Fine-Tuned Language Models

Mohamed Afane, Abhishek Satyam, Ke Chen +3 more

Backdoor attacks create significant security threats to language models by embedding hidden triggers that manipulate model behavior during inference,...

5 months ago cs.CR cs.CL PDF

Attack HIGH

ObliInjection: Order-Oblivious Prompt Injection Attack to LLM Agents with Multi-source Data

Reachal Wang, Yuqi Jia, Neil Zhenqiang Gong

Prompt injection attacks aim to contaminate the input data of an LLM to mislead it into completing an attacker-chosen task instead of the intended...

5 months ago cs.CR PDF

Attack MEDIUM

Improved Pseudorandom Codes from Permuted Puzzles

Miranda Christ, Noah Golowich, Sam Gunn +2 more

Watermarks are an essential tool for identifying AI-generated content. Recently, Christ and Gunn (CRYPTO '24) introduced pseudorandom...

5 months ago cs.CR PDF

Attack HIGH

When Tables Leak: Attacking String Memorization in LLM-Based Tabular Data Generation

Joshua Ward, Bochao Gu, Chi-Hua Wang +1 more

Large Language Models (LLMs) have recently demonstrated remarkable performance in generating high-quality tabular synthetic data. In practice, two...

5 months ago cs.LG cs.AI PDF

Attack MEDIUM

Insured Agents: A Decentralized Trust Insurance Mechanism for Agentic Economy

Botao 'Amber' Hu, Bangdao Chen

The emerging "agentic web" envisions large populations of autonomous agents coordinating, transacting, and delegating across open networks. Yet many...

5 months ago cs.CY cs.MA PDF

Attack HIGH

Attention is All You Need to Defend Against Indirect Prompt Injection Attacks in LLMs

Yinan Zhong, Qianhao Miao, Yanjiao Chen +3 more

Large Language Models (LLMs) have been integrated into many applications (e.g., web agents) to perform more sophisticated tasks. However,...

5 months ago cs.CR PDF

Attack HIGH

MIRAGE: Misleading Retrieval-Augmented Generation via Black-box and Query-agnostic Poisoning Attacks

Tailun Chen, Yu He, Yan Wang +9 more

Retrieval-Augmented Generation (RAG) systems enhance LLMs with external knowledge but introduce a critical attack surface: corpus poisoning. While...

5 months ago cs.CR PDF

Attack HIGH

How a Bit Becomes a Story: Semantic Steering via Differentiable Fault Injection

Zafaryab Haider, Md Hafizur Rahman, Shane Moeykens +2 more

Hard-to-detect hardware bit flips, from either malicious circuitry or bugs, have already been shown to make transformers vulnerable in non-generative...

5 months ago cs.LG cs.AI PDF

Attack LOW

Universal Adversarial Suffixes for Language Models Using Reinforcement Learning with Calibrated Reward

Sampriti Soor, Suklav Ghosh, Arijit Sur

Language models are vulnerable to short adversarial suffixes that can reliably alter predictions. Previous works usually find such suffixes with...

5 months ago cs.CL PDF

Attack HIGH

Detecting Ambiguity Aversion in Cyberattack Behavior to Inform Cognitive Defense Strategies

Stephan Carney, Soham Hans, Sofia Hirschmann +4 more

Adversaries (hackers) attempting to infiltrate networks frequently face uncertainty in their operational environments. This research explores the...

5 months ago cs.CR cs.HC PDF

Attack HIGH

TROJail: Trajectory-Level Optimization for Multi-Turn Large Language Model Jailbreaks with Process Rewards

Xiqiao Xiong, Ouxiang Li, Zhuo Liu +5 more

Large language models have seen widespread adoption, yet they remain vulnerable to multi-turn jailbreak attacks, threatening their safe deployment....

5 months ago cs.AI cs.LG PDF

Attack LOW

AdLift: Lifting Adversarial Perturbations to Safeguard 3D Gaussian Splatting Assets Against Instruction-Driven Editing

Ziming Hong, Tianyu Huang, Runnan Chen +4 more

Recent studies have extended diffusion-based instruction-driven 2D image editing pipelines to 3D Gaussian Splatting (3DGS), enabling faithful...

5 months ago cs.CV cs.CR cs.LG PDF

Attack HIGH

Response-Based Knowledge Distillation for Multilingual Jailbreak Prevention Unwittingly Compromises Safety

Max Zhang, Derek Liu, Kai Zhang +2 more

Large language models (LLMs) are increasingly deployed worldwide, yet their safety alignment remains predominantly English-centric. This allows for...

5 months ago cs.CL PDF

Attack HIGH

ThinkTrap: Denial-of-Service Attacks against Black-box LLM Services via Infinite Thinking

Yunzhe Li, Jianan Wang, Hongzi Zhu +3 more

Large Language Models (LLMs) have become foundational components in a wide range of applications, including natural language understanding and...

5 months ago cs.CR cs.AI cs.LG PDF

Attack HIGH

Replicating TEMPEST at Scale: Multi-Turn Adversarial Attacks Against Trillion-Parameter Frontier Models

Richard Young

Despite substantial investment in safety alignment, the vulnerability of large language models to sophisticated multi-turn adversarial attacks...

5 months ago cs.CL PDF
