Tool MEDIUM
Jiangrong Wu, Zitong Yao, Yuhong Nan +1 more
Tool-augmented LLM agents increasingly rely on multi-step, multi-tool workflows to complete real tasks. This design expands the attack surface,...
1 week ago cs.SE cs.CR
Attack MEDIUM
Xiangkui Cao, Jie Zhang, Meina Kan +2 more
Large Vision-Language Models (LVLMs) have shown remarkable potential across a wide array of vision-language tasks, leading to their adoption in...
Benchmark MEDIUM
Ninghui Li, Kaiyuan Zhang, Kyle Polley +1 more
This article, a lightly adapted version of Perplexity's response to NIST/CAISI Request for Information 2025-0035, details our observations and...
1 week ago cs.LG cs.AI cs.CR
Attack MEDIUM
Haodong Zhao, Jinming Hu, Yijie Bai +6 more
Federated Language Models (FedLM) allow collaborative learning without sharing raw data, yet they introduce a critical vulnerability, as every...
Benchmark MEDIUM
Junjie Chu, Yiting Qu, Ye Leng +4 more
Large Language Models (LLMs) are increasingly trained to align with human values, primarily focusing on the task level, i.e., refusing to execute...
1 week ago cs.CR cs.AI
Tool MEDIUM
Frank Li
Tool-augmented LLM agents introduce security risks that extend beyond user-input filtering, including indirect prompt injection through fetched...
Defense MEDIUM
Xinhao Deng, Yixiang Zhang, Jiaqing Wu +15 more
Autonomous Large Language Model (LLM) agents, exemplified by OpenClaw, demonstrate remarkable capabilities in executing complex, long-horizon tasks....
1 week ago cs.CR cs.AI
Benchmark MEDIUM
Qizhi Chen, Chao Qi, Yihong Huang +5 more
Graph-based Retrieval-Augmented Generation (GraphRAG) constructs the Knowledge Graph (KG) from external databases to enhance the timeliness and...
1 week ago cs.LG cs.AI cs.CR
Defense MEDIUM
Zhiyu Xue, Zimo Qi, Guangliang Liu +2 more
Safety alignment aims to ensure that large language models (LLMs) refuse harmful requests by post-training on harmful queries paired with refusal...
Benchmark MEDIUM
Marc Damie, Murat Bilgehan Ertan, Domenico Essoussi +3 more
With their increasing capabilities, Large Language Models (LLMs) are now used across many industries. They have become useful tools for software...
2 weeks ago cs.LG cs.CL cs.CR
Tool MEDIUM
Zixun Xiong, Gaoyi Wu, Lingfeng Yao +3 more
Communication topology is a critical factor in the utility and safety of LLM-based multi-agent systems (LLM-MAS), making it a high-value intellectual...
2 weeks ago cs.CR cs.AI
Tool MEDIUM
Panagiotis Georgios Pennas, Konstantinos Papaioannou, Marco Guarnieri +1 more
Large Language Models (LLMs) rely on optimizations like Automatic Prefix Caching (APC) to accelerate inference. APC works by reusing previously...
2 weeks ago cs.CR cs.DC cs.LG
Benchmark MEDIUM
Chuan Guo, Juan Felipe Ceron Uribe, Sicheng Zhu +10 more
Instruction hierarchy (IH) defines how LLMs prioritize system, developer, user, and tool instructions under conflict, providing a concrete,...
2 weeks ago cs.AI cs.CL cs.CR
Benchmark MEDIUM
Manit Baser, Alperen Yildiz, Dinil Mon Divakaran +1 more
The static knowledge representations of large language models (LLMs) inevitably become outdated or incorrect over time. While model-editing...
Tool MEDIUM
Zhengyang Shan, Jiayun Xin, Yue Zhang +1 more
Code agents powered by large language models can execute shell commands on behalf of users, introducing severe security vulnerabilities. This paper...
Tool MEDIUM
Shriti Priya, Julian James Stephen, Arjun Natarajan
Enterprises and organizations today increasingly deploy in-house, cloud-based applications and APIs for internal operations or external customers....
Attack MEDIUM
Pratyay Kumar, Miguel Antonio Guirao Aguilera, Srikathyayani Srikanteswara +2 more
Model Context Protocol (MCP) servers have rapidly emerged over the past year as a widely adopted way to enable Large Language Model (LLM) agents to...
2 weeks ago cs.CR cs.AI
Attack MEDIUM
Meenatchi Sundaram Muthu Selva Annamalai, Emiliano De Cristofaro, Peter Kairouz
As AI assistants become widely used, privacy-aware platforms like Anthropic's Clio have been introduced to generate insights from real-world AI use....
Attack MEDIUM
Jia Hu, Youcheng Sun, Pierre Olivier
Software compartmentalization breaks down an application into compartments isolated from each other: an attacker taking over a compartment will be...
Attack MEDIUM
Ali Raza, Gurang Gupta, Nikolay Matyunin +1 more
Warning: This article includes red-teaming experiments, which contain examples of compromised LLM responses that may be offensive or upsetting. Large...
2 weeks ago cs.CR cs.AI cs.LG