Survey MEDIUM
Chongyu Fan, Changsheng Wang, Yancheng Huang +2 more
Machine unlearning for large language models (LLMs) aims to remove undesired data, knowledge, and behaviors (e.g., for safety, privacy, or copyright)...
7 months ago · cs.LG cs.CL · PDF
Benchmark MEDIUM
Shen Dong, Mingxuan Zhang, Pengfei He +4 more
Large Language Model (LLM)-based Multi-Agent Systems (MAS) have emerged as a powerful paradigm for tackling complex, multi-step tasks across diverse...
Tool MEDIUM
Muris Sladić, Veronica Valeros, Carlos Catania +1 more
There are very few state-of-the-art (SotA) deception systems based on Large Language Models. The existing ones are limited to simulating only one type of service,...
7 months ago · cs.CR cs.AI cs.CL · PDF
Benchmark MEDIUM
Riku Mochizuki, Shusuke Komatsu, Souta Noguchi +1 more
We analyze answers generated by generative engines (GEs) from the perspectives of citation publishers and the content-injection barrier, defined as...
7 months ago · cs.CR cs.CL cs.IR · PDF
Attack MEDIUM
Tavish McDonald, Bo Lei, Stanislav Fort +2 more
Models remain susceptible to adversarially out-of-distribution (OOD) data despite large training-compute investments in their robustification. Zaremba...
Attack MEDIUM
Tiancheng Xing, Jerry Li, Yixuan Du +1 more
Large language models (LLMs) are increasingly used as rerankers in information retrieval, yet their ranking behavior can be steered by small,...
7 months ago · cs.CL cs.AI cs.IR · PDF
Benchmark MEDIUM
Zhiyuan Wei, Xiaoxuan Yang, Jing Sun +1 more
The increasing complexity of modern software systems exacerbates the prevalence of security vulnerabilities, posing risks of severe breaches and...
7 months ago · cs.CR cs.AI · PDF
Benchmark MEDIUM
Weidi Luo, Qiming Zhang, Tianyu Lu +9 more
Computer-use agent (CUA) frameworks, powered by large language models (LLMs) or multimodal LLMs (MLLMs), are rapidly maturing as assistants that can...
Defense MEDIUM
Thusitha Dayaratne, Ngoc Duy Pham, Viet Vo +5 more
The quality and experience of mobile communication have significantly improved with the introduction of 5G, and these improvements are expected to...
7 months ago · cs.CR cs.ET cs.LG · PDF
Benchmark MEDIUM
Ali Naseh, Anshuman Suri, Yuefeng Peng +3 more
Generative AI leaderboards are central to evaluating model capabilities, but remain vulnerable to manipulation. Among key adversarial objectives is...
7 months ago · cs.LG cs.CR · PDF
Benchmark MEDIUM
Shadi Rahimian, Mario Fritz
Single nucleotide polymorphism (SNP) datasets are fundamental to genetic studies but pose significant privacy risks when shared. The correlation of...
7 months ago · cs.LG cs.CR q-bio.GN · PDF
Benchmark MEDIUM
Mary Llewellyn, Annie Gray, Josh Collyer +1 more
Before adopting a new large language model (LLM) architecture, it is critical to understand vulnerabilities accurately. Existing evaluations can be...
7 months ago · cs.CR cs.AI cs.CL · PDF
Tool MEDIUM
Yasod Ginige, Akila Niroshan, Sajal Jain +1 more
Penetration testing and vulnerability assessment are essential industry practices for safeguarding computer systems. As cyber threats grow in scale...
7 months ago · cs.CR cs.AI · PDF
Survey MEDIUM
Cade Houston Kennedy, Amr Hilal, Morteza Momeni
With the growth of digital financial systems, robust security and privacy have become a concern for financial institutions. Even though traditional...
7 months ago · cs.CR cs.AI · PDF
Attack MEDIUM
Zizhao Wang, Dingcheng Li, Vaishakh Keshava +4 more
Large Language Model (LLM) agents can leverage tools such as Google Search to complete complex tasks. However, this tool usage introduces the risk of...
7 months ago · cs.LG cs.AI cs.CL · PDF
Benchmark MEDIUM
Yongan Yu, Xianda Du, Qingchen Hu +7 more
Historical archives on weather events are collections of enduring primary source records that offer rich, untapped narratives of how societies have...
7 months ago · cs.CL cs.AI · PDF
Benchmark MEDIUM
Ruoxing Yang
Large language models (LLMs) such as ChatGPT have evolved into powerful and ubiquitous tools. Fine-tuning on small datasets allows LLMs to acquire...
7 months ago · cs.LG cs.AI cs.CR · PDF
Benchmark MEDIUM
Punya Syon Pandey, Hai Son Le, Devansh Bhardwaj +2 more
Large language models (LLMs) are increasingly deployed in contexts where their failures can have direct sociopolitical consequences. Yet, existing...
7 months ago · cs.CL cs.AI cs.LG · PDF
Defense MEDIUM
Shuai Zhao, Xinyi Wu, Shiqian Zhao +4 more
During fine-tuning, large language models (LLMs) are increasingly vulnerable to data-poisoning backdoor attacks, which compromise their reliability...
7 months ago · cs.CR cs.AI cs.CL · PDF
Defense MEDIUM
Anindya Sundar Das, Kangjie Chen, Monowar Bhuyan
Pre-trained language models have achieved remarkable success across a wide range of natural language processing (NLP) tasks, particularly when...
7 months ago · cs.CL cs.LG · PDF