AI Security Research
2,529+ academic papers on AI security, attacks, and defenses
Attack (MEDIUM) · Thomas Fargues, Ye Dong, Tianwei Zhang +1 more
The rapid growth of Large Language Models (LLMs) has highlighted the pressing need for reliable mechanisms to verify content ownership and ensure...
Attack (MEDIUM) · Yuzhen Long, Songze Li
Autonomous driving systems increasingly rely on multi-agent architectures powered by large language models (LLMs), where specialized agents...
7 months ago · cs.CR, cs.LG
Attack (MEDIUM) · Jongwook Han, Jongwon Lim, Injin Kong +1 more
Large language models can express values in two main ways: (1) intrinsic expression, reflecting the model's inherent values learned during training,...
7 months ago · cs.CL, cs.AI
Attack (MEDIUM) · Zihao Zhu, Xinyu Wu, Gehan Hu +3 more
Large Reasoning Models (LRMs) have demonstrated remarkable capabilities in complex problem-solving through Chain-of-Thought (CoT) reasoning. However,...
7 months ago · cs.AI, cs.CL
Attack (MEDIUM) · Han Yan, Zheyuan Liu, Meng Jiang
With the rapid advancement of large language models, Machine Unlearning has emerged to address growing concerns around user privacy, copyright...
7 months ago · cs.CL, cs.AI
Attack (MEDIUM) · Jeongyeon Hwang, Sangdon Park, Jungseul Ok
Watermarking offers a promising solution for detecting LLM-generated content, yet its robustness under realistic query-free (black-box) evasion...
7 months ago · cs.CR, cs.AI
Attack (MEDIUM) · Xingyu Li, Juefei Pu, Yifan Wu +13 more
Open-source software projects are foundational to modern software ecosystems, with the Linux kernel standing out as a critical exemplar due to its...
7 months ago · cs.CR, cs.LG
Attack (MEDIUM) · David Benfield, Stefano Coniglio, Phan Tu Vuong +1 more
Adversarial machine learning concerns situations in which learners face attacks from active adversaries. Such scenarios arise in applications such as...
7 months ago · cs.LG, cs.CR
Attack (MEDIUM) · Miao Yu, Zhenhong Zhou, Moayad Aloqaily +5 more
Fine-tuned Large Language Models (LLMs) are vulnerable to backdoor attacks through data poisoning, yet the internal mechanisms governing these...
7 months ago · cs.CR, cs.AI
Attack (MEDIUM) · Jiahao Huo, Shuliang Liu, Bin Wang +5 more
Semantic-level watermarking (SWM) for large language models (LLMs) enhances watermarking robustness against text modifications and paraphrasing...
7 months ago · cs.CR, cs.CL
Attack (MEDIUM) · Anh Tu Ngo, Anupam Chattopadhyay, Subhamoy Maitra
In this paper we show that cryptographic backdoors in a neural network (NN) can be highly effective in two directions, namely mounting the attacks as...
7 months ago · cs.CR, cs.LG
Attack (MEDIUM) · Xiaofan Li, Xing Gao
In recent years, various software supply chain (SSC) attacks have posed significant risks to the global community. Severe consequences may arise if...
7 months ago · cs.CR, cs.AI