SecCodePRM: A Process Reward Model for Code Security
Weichen Yu, Ravi Mangal, Yinyi Luo +4 more
Large Language Models are rapidly becoming core components of modern software development workflows, yet ensuring code security remains challenging....
Showing 481–500 of 2,077 papers
Tri Nguyen, Huy Hoang Bao Le, Lohith Srikanth Pentapalli +2 more
Detecting jailbreak attempts in clinical training large language models (LLMs) requires accurate modeling of linguistic deviations that signal unsafe...
Adriana Alvarado Garcia, Ruyuan Wan, Ozioma C. Oguine +1 more
Recently, red teaming, with roots in security, has become a key evaluative approach to ensure the safety and reliability of Generative Artificial...
George Tsigkourakos, Constantinos Patsakis
Static Application Security Testing (SAST) tools are integral to modern DevSecOps pipelines, yet tools like CodeQL, Semgrep, and SonarQube remain...
Jayesh Choudhari, Piyush Kumar Singh
Domain fine-tuning is a common path to deploy small instruction-tuned language models as customer-support assistants, yet its effects on...
Hayfa Dhabhi, Kashyap Thimmaraju
Large Language Models (LLMs) deploy safety mechanisms to prevent harmful outputs, yet these defenses remain vulnerable to adversarial prompts. While...
Kun Wang, Zherui Li, Zhenhong Zhou +8 more
Omni-modal Large Language Models (OLLMs) greatly expand LLMs' multimodal capabilities but also introduce cross-modal safety risks. However, a...
Zhenyu Xu, Victor S. Sheng
Protecting the intellectual property of large language models (LLMs) is a critical challenge due to the proliferation of unauthorized derivative...
Herman Errico
As artificial intelligence systems evolve from passive assistants into autonomous agents capable of executing consequential actions, the security...
Pei-Chi Pan, Yingbin Liang, Sen Lin
Large Language Models (LLMs) demonstrate transformative potential, yet their reasoning remains inconsistent and unreliable. Reinforcement learning...
Chaeyun Kim, YongTaek Lim, Kihyun Kim +2 more
Existing red-teaming benchmarks, when adapted to new languages via direct translation, fail to capture socio-technical vulnerabilities rooted in...
Georgios Syros, Evan Rose, Brian Grinstead +4 more
Large language model (LLM) based web agents are increasingly deployed to automate complex online tasks by directly interacting with web sites and...
Kotekar Annapoorna Prabhu, Andrew Gan, Zahra Ghodsi
Machine learning relies on randomness as a fundamental component in various steps such as data sampling, data augmentation, weight initialization,...
Ashwin Sreevatsa, Sebastian Prasanna, Cody Rushing
The AI Control research agenda aims to develop control protocols: safety techniques that prevent untrusted AI systems from taking harmful actions...
Yuting Ning, Jaylen Jones, Zhehao Zhang +5 more
Computer-use agents (CUAs) have made tremendous progress in the past year, yet they still frequently produce misaligned actions that deviate from the...
Yu Yan, Sheng Sun, Shengjia Cheng +3 more
Vision-Language Models (VLMs) with multimodal reasoning capabilities are high-value attack targets, given their potential for handling complex...
Suraj Ranganath, Atharv Ramesh
AI-text detectors face a critical robustness challenge: adversarial paraphrasing attacks that preserve semantics while evading detection. We...
Oliver Daniels, Perusha Moodley, Benjamin M. Marlin +1 more
Alignment audits aim to robustly identify hidden goals from strategic, situationally aware misaligned models. Despite this threat model, existing...
Yu Fu, Haz Sameen Shahgir, Huanli Gong +3 more
Large language models (LLMs) increasingly combine long-context processing with advanced reasoning, enabling them to retrieve and synthesize...