Security in LLM-as-a-Judge: A Comprehensive SoK
Aiman Almasoud, Antony Anju, Marco Arazzi +6 more
LLM-as-a-Judge (LaaJ) is a novel paradigm in which powerful language models are used to assess the quality, safety, or correctness of generated...
Kavindu Herath, Joshua Zhao, Saurabh Bagchi
Backdoor attacks on federated learning (FL) are most often evaluated with synthetic corner patches or out-of-distribution (OOD) patterns that are...
Miles Farmer, Ekincan Ufuktepe, Anne Watson +4 more
Large Language Models (LLMs) have emerged as a popular choice in vulnerability detection studies given their foundational capabilities, open source...
Yunrui Yu, Xuxiang Feng, Pengda Qin +5 more
Adversarial robustness evaluation faces a critical challenge as new defense paradigms emerge that can exploit limitations in existing assessment...
KrishnaSaiReddy Patil
LLM-based chatbots in government services face critical security gaps. Multi-turn adversarial attacks achieve over 90% success against current...
Bilgehan Sel, Xuanli He, Alwin Peng +2 more
Fine-tuning APIs offered by major AI providers create new attack surfaces where adversaries can bypass safety measures through targeted fine-tuning....
Yubo Li, Lu Zhang, Tianchong Jiang +2 more
Large language models systematically fail when a salient surface cue conflicts with an unstated feasibility constraint. We study this through a...
Yicheng Cai, Mitchell John DeStefano, Guodong Dong +5 more
As Large Language Models (LLMs) and multi-agent AI systems are demonstrating increasing potential in cybersecurity operations, organizations,...
Chihan Huang, Huaijin Wang, Shuai Wang
The pervasive deployment of deep learning models across critical domains has concurrently intensified privacy concerns due to their inherent...
Osama Wehbi, Sarhad Arisdakessian, Omar Abdel Wahab +3 more
Backdoor attacks pose a significant threat to the integrity and reliability of Artificial Intelligence (AI) models, enabling adversaries to...
Chengyin Hu, Jiaju Han, Xuemeng Sun +6 more
Vision-language models (VLMs) rely on a shared visual-textual representation space to perform tasks such as zero-shot classification, image...
Zihao Xu, Xiao Cheng, Ruijie Meng +1 more
LLM API calls are becoming a ubiquitous program construct, yet they create a boundary that no existing program analysis can cross: runtime values...
Aymen Lassoued, Nacef Mbarek, Bechir Dardouri +3 more
Vulnerability detection in C programs is a critical challenge in software security. Although large language models (LLMs) achieve strong detection...
Quan Zhang, Lianhang Fu, Lvsi Lian +5 more
Equipping LLM agents with real-world tools can substantially improve productivity. However, granting agents autonomy over tool use also transfers the...
Tran Duong Minh Dai, Triet Huynh Minh Le, M. Ali Babar +2 more
Although Graph Neural Networks (GNNs) have shown promise for smart contract vulnerability detection, they still face significant limitations....
Haochuan Kevin Wang
We present a stage-decomposed analysis of prompt injection attacks against five frontier LLM agents. Prior work measures task-level attack success...
Ruiyang Wang, Rong Pan, Zhengan Yao
Federated learning (FL) enables distributed clients to collaboratively train a global model using local private data. Nevertheless, recent studies...
Kesheng Chen, Yamin Hu, Qi Zhou +2 more
Vision-language models (VLMs) achieve strong performance on many benchmarks, yet a basic reliability question remains underexplored: when visual...
Bhavuk Jain, Sercan Ö. Arık, Hardeo K. Thakur
Multimodal large language models (MLLMs) integrate information from multiple modalities such as text, images, audio, and video, enabling complex...
Vishal Narnaware, Animesh Gupta, Kevin Zhai +2 more
Multimodal Diffusion Large Language Models (MDLLMs) achieve high-concurrency generation through parallel masked decoding, yet the architectures...