AI Security Research

2,529+ academic papers on AI security, attacks, and defenses

Total: 2,529
Attack: 969
Benchmark: 729
Defense: 345
Tool: 272
Survey: 142

Showing 21–40 of 312 papers

Attack MEDIUM

Low Rank Adaptation for Adversarial Perturbation

Han Liu, Shanghao Shi, Yevgeniy Vorobeychik +2 more

Low-Rank Adaptation (LoRA), which leverages the insight that model updates typically reside in a low-dimensional space, has significantly improved...

1 week ago cs.LG cs.CR PDF

Attack MEDIUM

Understanding Adversarial Transferability in Vision-Language Models for Autonomous Driving: A Cross-Architecture Analysis

David Fernandez, Pedram MohajerAnsari, Amir Salarpour +1 more

Vision-language models (VLMs) are increasingly used in autonomous driving because they combine visual perception with language-based reasoning,...

1 week ago cs.CV cs.CR cs.LG PDF

Attack MEDIUM

SafeTune: Mitigating Data Poisoning in LLM Fine-Tuning for RTL Code Generation

Mahshid Rezakhani, Nowfel Mashnoor, Kimia Azar +1 more

As large language models (LLMs) are increasingly fine-tuned for hardware tasks like RTL code generation, the scarcity of high-quality datasets often...

1 week ago cs.CR cs.AR PDF

Attack MEDIUM

Dynamic Adversarial Fine-Tuning Reorganizes Refusal Geometry

Wenhao Lan, Shan Li, Junbin Yang +2 more

Safety-aligned language models must refuse harmful requests without collapsing into broad over-refusal, but the training-time mechanisms behind this...

1 week ago cs.LG cs.CL cs.CR PDF

Attack MEDIUM

Quantamination: Dynamic Quantization Leaks Your Data Across the Batch

Hanna Foerster, Ilia Shumailov, Cheng Zhang +3 more

Dynamic quantization emerged as a practical approach to increase the utilization and efficiency of the machine learning serving flow. Unlike static...

1 week ago cs.CR cs.LG PDF

Attack MEDIUM

Unveiling the Backdoor Mechanism Hidden Behind Catastrophic Overfitting in Fast Adversarial Training

Mengnan Zhao, Lihe Zhang, Tianhang Zheng +2 more

Fast Adversarial Training (FAT) has attracted significant attention due to its efficiency in enhancing neural network robustness against adversarial...

2 weeks ago cs.LG cs.AI cs.CR PDF

Attack MEDIUM

Mitigating Error Amplification in Fast Adversarial Training

Mengnan Zhao, Lihe Zhang, Bo Wang +3 more

Fast Adversarial Training (FAT) has proven effective in enhancing model robustness by encouraging networks to learn perturbation-invariant...

2 weeks ago cs.LG cs.CR PDF

Attack MEDIUM

Dialect vs Demographics: Quantifying LLM Bias from Implicit Linguistic Signals vs. Explicit User Profiles

Irti Haq, Belén Saldías

As state-of-the-art Large Language Models (LLMs) have become ubiquitous, ensuring equitable performance across diverse demographics is critical....

2 weeks ago cs.CY cs.AI cs.CL PDF

Attack MEDIUM

Auto-ART: Structured Literature Synthesis and Automated Adversarial Robustness Testing

Abhijit Talluri

Adversarial robustness evaluation underpins every claim of trustworthy ML deployment, yet the field suffers from fragmented protocols and undetected...

2 weeks ago cs.CR cs.LG PDF

Attack MEDIUM

Beyond Indistinguishability: Measuring Extraction Risk in LLM APIs

Ruixuan Liu, David Evans, Li Xiong

Indistinguishability properties such as differential privacy bounds or low empirically measured membership inference are widely treated as proxies to...

3 weeks ago cs.CR cs.CL cs.LG PDF

Attack MEDIUM

Privatar: Scalable Privacy-preserving Multi-user VR via Secure Offloading

Jianming Tong, Hanshen Xiao, Krishna Kumar Nair +5 more

Multi-user virtual reality enables immersive interaction. However, rendering avatars for numerous participants on each headset incurs prohibitive...

3 weeks ago cs.CR cs.AR cs.CV PDF

Attack MEDIUM

Segment-Level Coherence for Robust Harmful Intent Probing in LLMs

Xuanli He, Bilgehan Sel, Faizan Ali +3 more

Large Language Models (LLMs) are increasingly exposed to adaptive jailbreaking, particularly in high-stakes Chemical, Biological, Radiological, and...

3 weeks ago cs.CL cs.CR PDF

Attack MEDIUM

NeuroTrace: Inference Provenance-Based Detection of Adversarial Examples

Firas Ben Hmida, Philemon Hailemariam, Kashif Ali Khan +1 more

Deep neural networks (DNNs) remain largely opaque at inference time, limiting our ability to detect and diagnose malicious input manipulations such...

3 weeks ago cs.CR PDF

Attack MEDIUM

From Where Words Come: Efficient Regularization of Code Tokenizers Through Source Attribution

Pavel Chizhov, Egor Bogomolov, Ivan P. Yamshchikov

Efficiency and safety of Large Language Models (LLMs), among other factors, rely on the quality of tokenization. A good tokenizer not only improves...

3 weeks ago cs.CL PDF

Attack MEDIUM

Understanding and Improving Continuous Adversarial Training for LLMs via In-context Learning Theory

Shaopeng Fu, Di Wang

Adversarial training (AT) is an effective defense for large language models (LLMs) against jailbreak attacks, but performing AT on LLMs is costly. To...

4 weeks ago cs.LG cs.CR stat.ML PDF

Attack MEDIUM

Robust Semi-Supervised Temporal Intrusion Detection for Adversarial Cloud Networks

Anasuya Chattopadhyay, Daniel Reti, Hans D. Schotten

Cloud networks increasingly rely on machine learning based Network Intrusion Detection Systems to defend against evolving cyber threats. However,...

4 weeks ago cs.LG cs.CR PDF

Attack MEDIUM

LLM-Guided Prompt Evolution for Password Guessing

Vladimir A. Mazin, Mikhail A. Zorin, Dmitrii S. Korzh +3 more

Passwords still remain a dominant authentication method, yet their security is routinely subverted by predictable user choices and large-scale...

4 weeks ago cs.CR cs.AI PDF

Attack MEDIUM

AdversarialCoT: Single-Document Retrieval Poisoning for LLM Reasoning

Hongru Song, Yu-An Liu, Ruqing Zhang +4 more

Retrieval-augmented generation (RAG) enhances large language model (LLM) reasoning by retrieving external documents, but also opens up new attack...

4 weeks ago cs.IR PDF

Attack MEDIUM

Fully Homomorphic Encryption on Llama 3 model for privacy preserving LLM inference

Anes Abdennebi, Nadjia Kara, Laaziz Lahlou

The applications of Generative Artificial Intelligence (GenAI) and their intersections with data-driven fields, such as healthcare, finance,...

4 weeks ago cs.CR cs.AI PDF

Attack MEDIUM

Beyond A Fixed Seal: Adaptive Stealing Watermark in Large Language Models

Shuhao Zhang, Yuli Chen, Jiale Han +2 more

Watermarking provides a critical safeguard for large language model (LLM) services by facilitating the detection of LLM-generated text....

1 month ago cs.CR cs.AI PDF
