AI Security Research

2,077+ academic papers on AI security, attacks, and defenses

Total: 2,077
Attack: 809
Benchmark: 603
Defense: 272
Tool: 226
Survey: 113

Showing 961–978 of 978 papers

Defense MEDIUM

ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search

Zeyu Shen, Basileal Imana, Tong Wu +3 more

Retrieval-Augmented Generation (RAG) enhances Large Language Models by grounding their outputs in external documents. These systems, however, remain...

5 months ago cs.CR cs.AI

Defense MEDIUM

Beyond Embeddings: Interpretable Feature Extraction for Binary Code Similarity

Charles E. Gagnon, Steven H. H. Ding, Philippe Charland +1 more

Binary code similarity detection is a core task in reverse engineering. It supports malware analysis and vulnerability discovery by identifying...

5 months ago cs.AI cs.CR cs.SE

Attack MEDIUM

Dual-Space Smoothness for Robust and Balanced LLM Unlearning

Han Yan, Zheyuan Liu, Meng Jiang

With the rapid advancement of large language models, Machine Unlearning has emerged to address growing concerns around user privacy, copyright...

5 months ago cs.CL cs.AI

Benchmark MEDIUM

Reinforcement Learning-Based Prompt Template Stealing for Text-to-Image Models

Xiaotian Zou

Multimodal Large Language Models (MLLMs) have transformed text-to-image workflows, allowing designers to create novel visual concepts with...

5 months ago cs.CV cs.AI

Attack MEDIUM

LLM Watermark Evasion via Bias Inversion

Jeongyeon Hwang, Sangdon Park, Jungseul Ok

Watermarking offers a promising solution for detecting LLM-generated content, yet its robustness under realistic query-free (black-box) evasion...

5 months ago cs.CR cs.AI

Attack MEDIUM

What Do They Fix? LLM-Aided Categorization of Security Patches for Critical Memory Bugs

Xingyu Li, Juefei Pu, Yifan Wu +13 more

Open-source software projects are foundational to modern software ecosystems, with the Linux kernel standing out as a critical exemplar due to its...

5 months ago cs.CR cs.LG

Benchmark MEDIUM

Evaluating the Limits of Large Language Models in Multilingual Legal Reasoning

Antreas Ioannou, Andreas Shiamishis, Nora Hollenstein +1 more

In an era dominated by Large Language Models (LLMs), understanding their capabilities and limitations, especially in high-stakes fields like law, is...

6 months ago cs.CL cs.AI cs.LG

Benchmark MEDIUM

Erase or Hide? Suppressing Spurious Unlearning Neurons for Robust Unlearning

Nakyeong Yang, Dong-Kyum Kim, Jea Kwon +3 more

Large language models trained on web-scale data can memorize private or sensitive knowledge, raising significant privacy risks. Although some...

6 months ago cs.LG

Benchmark MEDIUM

Secure and Efficient Access Control for Computer-Use Agents via Context Space

Haochen Gong, Chenxiao Li, Rui Chang +1 more

Large language model (LLM)-based computer-use agents represent a convergence of AI and OS capabilities, enabling natural language to control system-...

6 months ago cs.CR cs.AI cs.OS

Benchmark MEDIUM

Polysemous Language Gaussian Splatting via Matching-based Mask Lifting

Jiayu Ding, Xinpeng Liu, Zhiyi Pan +2 more

Lifting 2D open-vocabulary understanding into 3D Gaussian Splatting (3DGS) scenes is a critical challenge. However, mainstream methods suffer from...

6 months ago cs.CV cs.AI

Tool MEDIUM

Library Hallucinations in LLMs: Risk Analysis Grounded in Developer Queries

Lukas Twist, Jie M. Zhang, Mark Harman +1 more

Large language models (LLMs) are increasingly used to generate code, yet they continue to hallucinate, often inventing non-existent libraries. Such...

6 months ago cs.SE cs.CL

Attack MEDIUM

Adversarial training with restricted data manipulation

David Benfield, Stefano Coniglio, Phan Tu Vuong +1 more

Adversarial machine learning concerns situations in which learners face attacks from active adversaries. Such scenarios arise in applications such as...

6 months ago cs.LG cs.CR

Defense MEDIUM

The Rogue Scalpel: Activation Steering Compromises LLM Safety

Anton Korznikov, Andrey Galichin, Alexey Dontsov +3 more

Activation steering is a promising technique for controlling LLM behavior by adding semantically meaningful vectors directly into a model's hidden...

6 months ago cs.LG cs.AI

Tool MEDIUM

You Can't Steal Nothing: Mitigating Prompt Leakages in LLMs via System Vectors

Bochuan Cao, Changjiang Li, Yuanpu Cao +3 more

Large language models (LLMs) have been widely adopted across various applications, leveraging customized system prompts for diverse tasks. Facing...

6 months ago cs.CR cs.AI cs.CL

Defense MEDIUM

Defending MoE LLMs against Harmful Fine-Tuning via Safety Routing Alignment

Jaehan Kim, Minkyoo Song, Seungwon Shin +1 more

Recent large language models (LLMs) have increasingly adopted the Mixture-of-Experts (MoE) architecture for efficiency. MoE-based LLMs heavily depend...

6 months ago cs.CR cs.AI

Tool MEDIUM

PhishLumos: An Adaptive Multi-Agent System for Proactive Phishing Campaign Mitigation

Daiki Chiba, Hiroki Nakano, Takashi Koide

Phishing attacks are a significant societal threat, disproportionately harming vulnerable populations and eroding trust in essential digital...

6 months ago cs.CR

Attack MEDIUM

Backdoor Attribution: Elucidating and Controlling Backdoor in Language Models

Miao Yu, Zhenhong Zhou, Moayad Aloqaily +5 more

Fine-tuned Large Language Models (LLMs) are vulnerable to backdoor attacks through data poisoning, yet the internal mechanisms governing these...

6 months ago cs.CR cs.AI

Tool MEDIUM

MobiLLM: An Agentic AI Framework for Closed-Loop Threat Mitigation in 6G Open RANs

Prakhar Sharma, Haohuang Wen, Vinod Yegneswaran +3 more

The evolution toward 6G networks is being accelerated by the Open Radio Access Network (O-RAN) paradigm -- an open, interoperable architecture that...

6 months ago cs.CR cs.AI cs.LG
