SpectralGuard: Detecting Memory Collapse Attacks in State Space Models
Davi Bonetto
State Space Models (SSMs) such as Mamba achieve linear-time sequence processing through input-dependent recurrence, but this mechanism introduces a...
Alexandre Le Mercier, Thomas Demeester, Chris Develder
State space models (SSMs) like Mamba have gained significant traction as efficient alternatives to Transformers, achieving linear complexity while...
Sarbartha Banerjee, Prateek Sahu, Anjo Vahldiek-Oberwagner +2 more
Rapid progress in generative AI has given rise to Compound AI systems - pipelines comprised of multiple large language models (LLM), software tools...
J Alex Corll
Prompt injection defenses are often framed as semantic understanding problems and delegated to increasingly large neural detectors. For the first...
Indranil Halder, Annesya Banerjee, Cengiz Pehlevan
Adversarial attacks can reliably steer safety-aligned large language models toward unsafe behavior. Empirically, we find that adversarial...
Xiangwen Wang, Ananth Balashankar, Varun Chandrasekaran
Large language models remain vulnerable to jailbreak attacks, yet we still lack a systematic understanding of how jailbreak success scales with...
Fabrizio Dimino, Bhaskarjit Sarmah, Stefano Pasquali
The rapid adoption of large language models (LLMs) in financial services introduces new operational, regulatory, and security risks. Yet most...
Yu He, Haozhe Zhu, Yiming Li +4 more
LLM agents are highly vulnerable to Indirect Prompt Injection (IPI), where adversaries embed malicious directives in untrusted tool outputs to hijack...
Nasim Soltani, Shayan Nejadshamsi, Zakaria Abou El Houda +4 more
Adversarial examples can represent a serious threat to machine learning (ML) algorithms. If used to manipulate the behaviour of ML-based Network...
Scott Thornton
Retrieval-Augmented Generation (RAG) systems extend large language models (LLMs) with external knowledge sources but introduce new attack surfaces...
Nanzi Yang, Weiheng Bai, Kangjie Lu
The Model Context Protocol (MCP) is a recently proposed interoperability standard that unifies how AI agents connect with external tools and data...
Ailiya Borjigin, Igor Stadnyk, Ben Bilski +2 more
OpenClaw-style agent stacks turn language into privileged execution: LLM intents flow through tool interception, policy gates, and a local executor....
Fan Yang
The widespread adoption of thinking mode in large language models (LLMs) has significantly enhanced complex task processing capabilities while...
Quanchen Zou, Moyang Chen, Zonghao Ying +6 more
Large Vision-Language Models (LVLMs) undergo safety alignment to suppress harmful content. However, current defenses predominantly target explicit...
Pratyay Kumar, Abu Saleh Md Tayeen, Satyajayant Misra +4 more
Deep learning (DL)-based Network Intrusion Detection System (NIDS) has demonstrated great promise in detecting malicious network traffic. However,...
David Fernandez, Pedram MohajerAnsari, Amir Salarpour +3 more
Vision-language models are emerging for autonomous driving, yet their robustness to physical adversarial attacks remains unexplored. This paper...
Junxian Li, Tu Lan, Haozhen Tan +2 more
Modern vision-language-model (VLM) based graphical user interface (GUI) agents are expected not only to execute actions accurately but also to...
Yonghong Deng, Zhen Yang, Ping Jian +3 more
With the rapid advancement of large language models (LLMs), the safety of LLMs has become a critical concern. Despite significant efforts in safety...
Jialai Wang, Ya Wen, Zhongmou Liu +4 more
Targeted bit-flip attacks (BFAs) exploit hardware faults to manipulate model parameters, posing a significant security threat. While prior work...
Ondřej Lukáš, Jihoon Shin, Emilia Rivas +6 more
Autonomous offensive agents often fail to transfer beyond the networks on which they are trained. We isolate a minimal but fundamental shift --...