AI Security Research

2,560+ academic papers on AI security, attacks, and defenses

Total

2,560

Attack

982

Benchmark

736

Defense

350

Tool

275

Survey

144

Type

All Attack Defense Survey Benchmark Tool

Relevance

All High Medium

Date

All time 7 days 30 days 6 months

Showing 381–400 of 2,560 papers

Tool MEDIUM

From Incomplete Architecture to Quantified Risk: Multimodal LLM-Driven Security Assessment for Cyber-Physical Systems

Shaofei Huang, Christopher M. Poskitt, Lwin Khin Shar

Cyber-physical systems often contend with incomplete architectural documentation or outdated information resulting from legacy technologies,...

1 months ago cs.CR cs.AI PDF

Benchmark HIGH

Uncovering Linguistic Fragility in Vision-Language-Action Models via Diversity-Aware Red Teaming

Baoshun Tong, Haoran He, Ling Pan +2 more

Vision-Language-Action (VLA) models have achieved remarkable success in robotic manipulation. However, their robustness to linguistic nuances remains...

1 months ago cs.RO cs.CV PDF

Other HIGH

Stop Fixating on Prompts: Reasoning Hijacking and Constraint Tightening for Red-Teaming LLM Agents

Yanxu Mao, Peipei Liu, Tiehan Cui +3 more

With the widespread application of LLM-based agents across various domains, their complexity has introduced new security threats. Existing red-team...

1 months ago cs.CL PDF

Defense MEDIUM

MA-IDS: Multi-Agent RAG Framework for IoT Network Intrusion Detection with an Experience Library

Md Shamimul Islam, Luis G. Jaimes, Ayesha S. Dina

Network Intrusion Detection Systems (NIDS) face important limitations. Signature-based methods are effective for known attack patterns, but they...

1 months ago cs.CR cs.AI PDF

Tool MEDIUM

LanG -- A Governance-Aware Agentic AI Platform for Unified Security Operations

Anes Abdennebi, Nadjia Kara, Laaziz Lahlou +1 more

Modern Security Operations Centers struggle with alert fatigue, fragmented tooling, and limited cross-source event correlation. Challenges that...

1 months ago cs.CR cs.AI PDF

Tool MEDIUM

Your LLM Agent Can Leak Your Data: Data Exfiltration via Backdoored Tool Use

Wuyang Zhang, Shichao Pei

Tool-use large language model (LLM) agents are increasingly deployed to support sensitive workflows, relying on tool calls for retrieval, external...

1 months ago cs.CR cs.AI PDF

Defense MEDIUM

Gradient-Controlled Decoding: A Safety Guardrail for LLMs with Dual-Anchor Steering

Purva Chiniya, Kevin Scaria, Sagar Chaturvedi

Large language models (LLMs) remain susceptible to jailbreak and direct prompt-injection attacks, yet the strongest defensive filters frequently...

1 months ago cs.CL PDF

Benchmark MEDIUM

Compiled AI: Deterministic Code Generation for LLM-Based Workflow Automation

Geert Trooskens, Aaron Karlsberg, Anmol Sharma +6 more

We study compiled AI, a paradigm in which large language models generate executable code artifacts during a compilation phase, after which workflows...

1 months ago cs.SE cs.AI PDF

Tool MEDIUM

Strengthening Human-Centric Chain-of-Thought Reasoning Integrity in LLMs via a Structured Prompt Framework

Jiling Zhou, Aisvarya Adeseye, Seppo Virtanen +2 more

Chain-of-Thought (CoT) prompting has been used to enhance the reasoning capability of LLMs. However, its reliability in security-sensitive analytical...

1 months ago cs.CR cs.AI PDF

Attack HIGH

Do No Harm: Exposing Hidden Vulnerabilities of LLMs via Persona-based Client Simulation Attack in Psychological Counseling

Qingyang Xu, Yaling Shen, Stephanie Fong +7 more

The increasing use of large language models (LLMs) in mental healthcare raises safety concerns in high-stakes therapeutic interactions. A key...

1 months ago cs.CL PDF

Benchmark LOW

LiveFact: A Dynamic, Time-Aware Benchmark for LLM-Driven Fake News Detection

Cheng Xu, Changhong Jin, Yingjie Niu +5 more

The rapid development of Large Language Models (LLMs) has transformed fake news detection and fact-checking tasks from simple classification to...

1 months ago cs.CL cs.AI PDF

Benchmark LOW

Forgetting to Witness: Efficient Federated Unlearning and Its Visible Evaluation

Houzhe Wang, Xiaojie Zhu, Chi Chen

With the increasing importance of data privacy and security, federated unlearning has emerged as a novel research field dedicated to ensuring that...

1 months ago cs.LG cs.CR PDF

Defense MEDIUM

Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw

Zijun Wang, Haoqin Tu, Letian Zhang +11 more

OpenClaw, the most widely deployed personal AI agent in early 2026, operates with full local system access and integrates with sensitive services...

1 months ago cs.CR cs.AI cs.CL PDF

Attack MEDIUM

Undetectable Conversations Between AI Agents via Pseudorandom Noise-Resilient Key Exchange

Vinod Vaikuntanathan, Or Zamir

AI agents are increasingly deployed to interact with other agents on behalf of users and organizations. We ask whether two such agents, operated by...

1 months ago cs.CR cs.AI cs.LG PDF

Benchmark LOW

Discovering Failure Modes in Vision-Language Models using RL

Kanishk Jain, Qian Yang, Shravan Nayak +3 more

Vision-language Models (VLMs), despite achieving strong performance on multimodal benchmarks, often misinterpret straightforward visual concepts that...

1 months ago cs.CV cs.AI PDF

Benchmark MEDIUM

From Curiosity to Caution: Mitigating Reward Hacking for Best-of-N with Pessimism

Zhuohao Yu, Zhiwei Steven Wu, Adam Block

Inference-time compute scaling has emerged as a powerful paradigm for improving language model performance on a wide range of tasks, but the question...

1 months ago cs.LG PDF

Survey HIGH

Mapping the Exploitation Surface: A 10,000-Trial Taxonomy of What Makes LLM Agents Exploit Vulnerabilities

Charafeddine Mouzouni

LLM agents with tool access can discover and exploit security vulnerabilities. This is known. What is not known is which features of a system prompt...

1 months ago cs.CR cs.AI cs.CL PDF

Benchmark MEDIUM

Beyond Standard Benchmarks: A Systematic Audit of Vision-Language Model's Robustness to Natural Semantic Variation Across Diverse Tasks

Jia Chengyu, AprilPyone MaungMaung, Huy H. Nguyen +2 more

Recent advances in vision-language models (VLMs) trained on web-scale image-text pairs have enabled impressive zero-shot transfer across a diverse...

1 months ago cs.CV PDF

Benchmark MEDIUM

Bounded by Risk, Not Capability: Quantifying AI Occupational Substitution Rates via a Tech-Risk Dual-Factor Model

Shuyao Gao, Minghao Huang

The deployment of Large Language Models (LLMs) has ignited concerns about technological unemployment. Existing task-based evaluations predominantly...

1 months ago cs.CY econ.GN PDF

Tool HIGH

ShieldNet: Network-Level Guardrails against Emerging Supply-Chain Injections in Agentic Systems

Zhuowen Yuan, Zhaorun Chen, Zhen Xiang +5 more

Existing research on LLM agent security mainly focuses on prompt injection and unsafe input/output behaviors. However, as agents increasingly rely on...

1 months ago cs.AI PDF

Track AI security vulnerabilities in real time

Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.

Start 14-Day Free Trial