AI Security Research

2,529+ academic papers on AI security, attacks, and defenses

Total: 2,529 · Attack: 969 · Benchmark: 729 · Defense: 345 · Tool: 272 · Survey: 142

Showing 41–60 of 1,207 papers

Defense MEDIUM

Self-Mined Hardness for Safety Fine-Tuning

Prakhar Gupta, Garv Shah, Donghua Zhang

Safety fine-tuning of language models typically requires a curated adversarial dataset. We take a different approach: score each candidate prompt's...

1 week ago cs.LG cs.AI cs.CR PDF
Attack MEDIUM

Dependency-Aware Privacy for Multi-turn Agents

Divyam Anshumaan, Sarthak Choudhary, Nils Palumbo +1 more

LLM agents release private data across multi-service interactions. Existing prompt sanitizers based on metric differential privacy treat each release...

1 week ago cs.CR PDF
Benchmark MEDIUM

On the Privacy of LLMs: An Ablation Study

Karima Makhlouf, Lamiaa Basyoni, Syed Khaderi +4 more

Large language models (LLMs) are increasingly deployed in interactive and retrieval-augmented settings, raising significant privacy concerns. While...

1 week ago cs.CR cs.AI PDF
Attack MEDIUM

Low Rank Adaptation for Adversarial Perturbation

Han Liu, Shanghao Shi, Yevgeniy Vorobeychik +2 more

Low-Rank Adaptation (LoRA), which leverages the insight that model updates typically reside in a low-dimensional space, has significantly improved...

1 week ago cs.LG cs.CR PDF
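The abstract above rests on LoRA's core idea: fine-tuning updates are approximately low-rank, so a full weight update ΔW can be parameterized as a product BA of two small matrices of rank r. A minimal sketch of that parameterization (NumPy; all dimensions and names are illustrative, not from the paper):

```python
import numpy as np

# Illustrative dimensions: a d_out x d_in weight matrix adapted with rank-r factors.
d_out, d_in, r = 64, 128, 4

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))  # frozen pretrained weight

# LoRA trains only B and A: r * (d_out + d_in) parameters
# instead of the full d_out * d_in.
B = np.zeros((d_out, r))                 # zero-initialized, so the adapter starts as a no-op
A = rng.standard_normal((r, d_in)) * 0.01
alpha = 1.0                              # scaling hyperparameter

delta_W = (alpha / r) * (B @ A)
W_adapted = W + delta_W

# Whatever values B and A take during training, the update's rank never exceeds r.
assert np.linalg.matrix_rank(delta_W) <= r
```

The rank bound is the point: the adapted model can only move within an r-dimensional update subspace, which is also why low-rank structure is an interesting handle for crafting (or constraining) adversarial perturbations.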