The Autonomy Tax: Defense Training Breaks LLM Agents
Shawn Li, Yue Zhao
Large language model (LLM) agents increasingly rely on external tools (file operations, API calls, database transactions) to autonomously complete...
Toan Tran, Olivera Kotevska, Li Xiong
Membership inference attacks (MIAs), which enable adversaries to determine whether specific data points were part of a model's training dataset, have...
Rohan Siva, Kai Cheung, Lichi Li +1 more
Modern machine learning systems rely on complex data engineering workflows to extract, transform, and load (ETL) data into production pipelines....
Zou Qiang
Large language models (LLMs) demonstrate strong generative capabilities but remain vulnerable to hallucination and unreliable reasoning under...
Aravind Krishnan, Karolina Stańczak, Dietrich Klakow
As Spoken Language Models (SLMs) integrate speech and text modalities, they inherit the safety vulnerabilities of their LLM backbone and an expanded...
Sheng Liu, Panos Papadimitratos
Federated learning (FL) has emerged as a transformative paradigm for intelligent transportation systems (ITS), notably camera-based Road Condition Classification (RCC). However, by enabling collaboration,...
Carlos Hinojosa, Clemens Grange, Bernard Ghanem
Vision-language models (VLMs) are increasingly deployed in real-world and embodied settings where safety decisions depend on visual context. However,...
Pranay Anchuri, Matteo Campanelli, Paul Cesaretti +4 more
When large AI models are deployed as cloud-based services, clients have no guarantee that responses are correct or were produced by the intended...
Zikang Ding, Junhao Li, Suling Wu +3 more
Model watermarking utilizes internal representations to protect the ownership of large language models (LLMs). However, these features inevitably...
Dimitris Mitropoulos, Nikolaos Alexopoulos, Georgios Alexopoulos +1 more
Security code reviews increasingly rely on systems integrating Large Language Models (LLMs), ranging from interactive assistants to autonomous agents...
Mohammadhossein Homaei, Iman Khazrak, Rubén Molano +2 more
Industrial Cyber-Physical Systems (ICPS) face growing threats from cyber-attacks that exploit sensor and control vulnerabilities. Digital Twin (DT)...
Jiahao Zhang, Yilong Wang, Suhang Wang
Graph neural networks (GNNs) are widely used for learning from graph-structured data in domains such as social networks, recommender systems, and...
Md Takrim Ul Alam, Akif Islam, Mohd Ruhul Ameen +2 more
Large language models (LLMs) deployed behind APIs and retrieval-augmented generation (RAG) stacks are vulnerable to prompt injection attacks that may...
Alvin Rajkomar, Pavan Sudarshan, Angela Lai +1 more
Background: Clinical trials rely on transparent inclusion criteria to ensure generalizability. In contrast, benchmarks validating health-related...
Saket Sanjeev Chaturvedi, Joshua Bergerson, Tanwi Mallick
As large language models (LLMs) evolve into autonomous "AI scientists," they promise transformative advances but introduce novel vulnerabilities,...
Xavier Cadet, Aditya Vikram Singh, Harsh Mamania +6 more
Investigating cybersecurity incidents requires collecting and analyzing evidence from multiple log sources, including intrusion detection alerts,...
Iakovos-Christos Zarkadis, Christos Douligeris
Supervised detection of network attacks has always been a critical part of network intrusion detection systems (NIDS). Nowadays, in a pivotal time...
Haocheng Li, Juepeng Zheng, Shuangxi Miao +4 more
Multimodal remote sensing semantic segmentation enhances scene interpretation by exploiting complementary physical cues from heterogeneous data....
Wanjun Du, Zifeng Yuan, Tingting Chen +3 more
Existing vision-language models (VLMs) have demonstrated impressive performance in reasoning-based segmentation. However, current benchmarks are...