Tool HIGH
Tim Van hamme, Thomas Vissers, Javier Carnerero-Cano +4 more
LLMs are increasingly deployed as autonomous agents with access to tools, databases, and external services, yet practitioners (across different...
Yesterday cs.AI cs.CR
Tool HIGH
Zhaorun Chen, Xun Liu, Haibo Tong +14 more
AI agents are increasingly deployed across diverse domains to automate complex workflows through long-horizon and high-stakes action executions. Due...
Tool HIGH
Haoyu Zhang, Mohammad Zandsalimy, Shanu Sushmita
Large language models (LLMs) employ safety mechanisms to prevent harmful outputs, yet these defenses primarily rely on semantic pattern matching. We...
1 week ago cs.CR cs.AI cs.CL
Tool HIGH
Weiyi Kong, Ahmad Mohammad Saber, Amr Youssef +1 more
In modern energy systems, industrial control systems (ICS) and power-system SCADA require intrusion detection that is not only accurate but also...
Tool HIGH
Yuchuan Zhao, Tong Chen, Junliang Yu +3 more
Large language model-powered sequential recommender systems (LLM-SRSs) have recently demonstrated remarkable performance, enabling recommendations...
Tool HIGH
Run Hao, Zhuoran Tan
Model Context Protocol (MCP) is increasingly adopted for tool-integrated LLM agents, but its multi-layer design and third-party server ecosystem...
Tool HIGH
Jiamin Chang, Minhui Xue, Ruoxi Sun +3 more
Recent advances in embodied Vision-Language Agentic Systems (VLAS), powered by large vision-language models (LVLMs), enable AI systems to perceive...
3 weeks ago cs.CV cs.AI
Tool HIGH
Jiacheng Liang, Yao Ma, Tharindu Kumarage +5 more
Reinforcement Learning from Human Feedback (RLHF) is central to aligning Large Language Models (LLMs), yet it introduces a critical vulnerability: an...
3 weeks ago cs.AI cs.CR cs.LG
Tool HIGH
Wei Zhao, Zhe Li, Peixin Zhang +1 more
Tool-augmented Large Language Model (LLM) agents have demonstrated impressive capabilities in automating complex, multi-step real-world tasks, yet...
4 weeks ago cs.CR cs.AI
Tool HIGH
Yihao Zhang, Kai Wang, Jiangrong Wu +7 more
Large Language Models (LLMs) face prominent security risks from jailbreaking, a practice that manipulates models to bypass built-in security...
4 weeks ago cs.CR cs.AI cs.CL
Tool HIGH
Vu Tuan Truong, Long Bao Le
Large Language Models (LLMs), despite their impressive capabilities across domains, have been shown to be vulnerable to backdoor attacks. Prior...
1 month ago cs.CR cs.AI
Tool HIGH
Zhuowen Yuan, Zhaorun Chen, Zhen Xiang +5 more
Existing research on LLM agent security mainly focuses on prompt injection and unsafe input/output behaviors. However, as agents increasingly rely on...
Tool HIGH
Anubhab Sahu, Diptisha Samanta, Reza Soosahabi
System Instructions in Large Language Models (LLMs) are commonly used to enforce safety policies, define agent behavior, and protect sensitive...
1 month ago cs.CR cs.AI
Tool HIGH
Jingning Xu, Haochen Luo, Chen Liu
Vision-language models (VLMs) are vulnerable to adversarial image perturbations. Existing works based on adversarial training against task-specific...
1 month ago cs.CV cs.MM
Tool HIGH
Aengus Lynch
Autonomous AI agents are being deployed with filesystem access, email control, and multi-step planning. This thesis contributes to four open problems...
1 month ago cs.LG cs.AI
Tool HIGH
Chong Xiang, Drew Zagieboylo, Shaona Ghosh +5 more
AI agents, predominantly powered by large language models (LLMs), are vulnerable to indirect prompt injection, in which malicious instructions...
1 month ago cs.CR cs.AI
Tool HIGH
KrishnaSaiReddy Patil
LLM-based chatbots in government services face critical security gaps. Multi-turn adversarial attacks achieve over 90% success against current...
1 month ago cs.CR cs.AI
Tool HIGH
Tran Duong Minh Dai, Triet Huynh Minh Le, M. Ali Babar +2 more
Although Graph Neural Networks (GNNs) have shown promise for smart contract vulnerability detection, they still face significant limitations....
1 month ago cs.LG cs.CR
Tool HIGH
Ron Litvak
System prompt configuration can make the difference between near-total phishing blindness and near-perfect detection in LLM email agents. We present...
1 month ago cs.CR cs.AI
Tool HIGH
Charoes Huang, Xin Huang, Amin Milani Fard
Prompt injection is listed as the number-one vulnerability class in the OWASP Top 10 for LLM Applications that can subvert LLM guardrails, disclose...
1 month ago cs.CR cs.SE