AI Security Research

2,077+ academic papers on AI security, attacks, and defenses

Total: 2,077
Attack: 809
Benchmark: 603
Defense: 272
Tool: 226
Survey: 113

Showing 21–40 of 259 papers

Attack MEDIUM

Good-Enough LLM Obfuscation (GELO)

Anatoly Belikov, Ilya Fedotov

Large Language Models (LLMs) are increasingly served on shared accelerators where an adversary with read access to device memory can observe KV...

2 weeks ago cs.CR cs.LG

Attack MEDIUM

Tracking Capabilities for Safer Agents

Martin Odersky, Yaoyu Zhao, Yichen Xu +2 more

AI agents that interact with the real world through tool calls pose fundamental safety challenges: agents might leak private information, cause...

3 weeks ago cs.AI cs.PL

Attack MEDIUM

Training Agents to Self-Report Misbehavior

Bruce W. Lee, Chen Yueh-Han, Tomek Korbak

Frontier AI agents may pursue hidden goals while concealing their pursuit from oversight. Alignment training aims to prevent such behavior by...

3 weeks ago cs.LG cs.AI
