AI Security Research

2,560+ academic papers on AI security, attacks, and defenses

Total: 2,560
Attack: 982
Benchmark: 736
Defense: 350
Tool: 275
Survey: 144

Showing 221–240 of 727 papers

Attack HIGH

SpectralGuard: Detecting Memory Collapse Attacks in State Space Models

Davi Bonetto

State Space Models (SSMs) such as Mamba achieve linear-time sequence processing through input-dependent recurrence, but this mechanism introduces a...

2 months ago cs.LG cs.CR PDF

Attack HIGH

CLASP: Defending Hybrid Large Language Models Against Hidden State Poisoning Attacks

Alexandre Le Mercier, Thomas Demeester, Chris Develder

State space models (SSMs) like Mamba have gained significant traction as efficient alternatives to Transformers, achieving linear complexity while...

2 months ago cs.CL PDF

Attack MEDIUM

EmbTracker: Traceable Black-box Watermarking for Federated Language Models

Haodong Zhao, Jinming Hu, Yijie Bai +6 more

Federated Language Model (FedLM) allows collaborative learning without sharing raw data, yet it introduces a critical vulnerability, as every...

2 months ago cs.CR PDF

Attack HIGH

The Mirror Design Pattern: Strict Data Geometry over Model Scale for Prompt Injection Detection

J Alex Corll

Prompt injection defenses are often framed as semantic understanding problems and delegated to increasingly large neural detectors. For the first...

2 months ago cs.CR cs.AI PDF

Attack HIGH

Jailbreak Scaling Laws for Large Language Models: Polynomial-Exponential Crossover

Indranil Halder, Annesya Banerjee, Cengiz Pehlevan

Adversarial attacks can reliably steer safety-aligned large language models toward unsafe behavior. Empirically, we find that adversarial...

2 months ago cs.LG cs.AI PDF

Attack HIGH

Enhancing Network Intrusion Detection Systems: A Multi-Layer Ensemble Approach to Mitigate Adversarial Attacks

Nasim Soltani, Shayan Nejadshamsi, Zakaria Abou El Houda +4 more

Adversarial examples can represent a serious threat to machine learning (ML) algorithms. If used to manipulate the behaviour of ML-based Network...

2 months ago cs.CR cs.AI PDF

Attack HIGH

Semantic Chameleon: Corpus-Dependent Poisoning Attacks and Defenses in RAG Systems

Scott Thornton

Retrieval-Augmented Generation (RAG) systems extend large language models (LLMs) with external knowledge sources but introduce new attack surfaces...

2 months ago cs.CR cs.AI cs.LG PDF

Attack MEDIUM

MCP-in-SoS: Risk assessment framework for open-source MCP servers

Pratyay Kumar, Miguel Antonio Guirao Aguilera, Srikathyayani Srikanteswara +2 more

Model Context Protocol (MCP) servers have rapidly emerged over the past year as a widely adopted way to enable Large Language Model (LLM) agents to...

2 months ago cs.CR cs.AI PDF

Attack HIGH

Compatibility at a Cost: Systematic Discovery and Exploitation of MCP Clause-Compliance Vulnerabilities

Nanzi Yang, Weiheng Bai, Kangjie Lu

The Model Context Protocol (MCP) is a recently proposed interoperability standard that unifies how AI agents connect with external tools and data...

2 months ago cs.CR cs.AI PDF

Attack HIGH

Execution Is the New Attack Surface: Survivability-Aware Agentic Crypto Trading with OpenClaw-Style Local Executors

Ailiya Borjigin, Igor Stadnyk, Ben Bilski +2 more

OpenClaw-style agent stacks turn language into privileged execution: LLM intents flow through tool interception, policy gates, and a local executor....

2 months ago cs.CR cs.AI PDF

Attack HIGH

Multi-Stream Perturbation Attack: Breaking Safety Alignment of Thinking LLMs Through Concurrent Task Interference

Fan Yang

The widespread adoption of thinking mode in large language models (LLMs) has significantly enhanced complex task processing capabilities while...

2 months ago cs.CR cs.AI PDF

Attack MEDIUM

CLIOPATRA: Extracting Private Information from LLM Insights

Meenatchi Sundaram Muthu Selva Annamalai, Emiliano De Cristofaro, Peter Kairouz

As AI assistants become widely used, privacy-aware platforms like Anthropic's Clio have been introduced to generate insights from real-world AI use....

2 months ago cs.CR PDF

Attack MEDIUM

Compartmentalization-Aware Automated Program Repair

Jia Hu, Youcheng Sun, Pierre Olivier

Software compartmentalization breaks down an application into compartments isolated from each other: an attacker taking over a compartment will be...

2 months ago cs.CR PDF

Attack MEDIUM

Amnesia: Adversarial Semantic Layer Specific Activation Steering in Large Language Models

Ali Raza, Gurang Gupta, Nikolay Matyunin +1 more

Warning: This article includes red-teaming experiments, which contain examples of compromised LLM responses that may be offensive or upsetting. Large...

2 months ago cs.CR cs.AI cs.LG PDF

Attack HIGH

Reasoning-Oriented Programming: Chaining Semantic Gadgets to Jailbreak Large Vision Language Models

Quanchen Zou, Moyang Chen, Zonghao Ying +6 more

Large Vision-Language Models (LVLMs) undergo safety alignment to suppress harmful content. However, current defenses predominantly target explicit...

2 months ago cs.CR PDF

Attack MEDIUM

AgenticCyOps: Securing Multi-Agentic AI Integration in Enterprise Cyber Operations

Shaswata Mitra, Raj Patel, Sudip Mittal +2 more

Multi-agent systems (MAS) powered by LLMs promise adaptive, reasoning-driven enterprise workflows, yet granting agents autonomous control over tools,...

2 months ago cs.CR cs.MA cs.SE PDF

Attack HIGH

NetDiffuser: Deceiving DNN-Based Network Attack Detection Systems with Diffusion-Generated Adversarial Traffic

Pratyay Kumar, Abu Saleh Md Tayeen, Satyajayant Misra +4 more

Deep learning (DL)-based Network Intrusion Detection System (NIDS) has demonstrated great promise in detecting malicious network traffic. However,...

2 months ago cs.CR cs.AI PDF

Attack HIGH

Comparative Analysis of Patch Attack on VLM-Based Autonomous Driving Architectures

David Fernandez, Pedram MohajerAnsari, Amir Salarpour +3 more

Vision-language models are emerging for autonomous driving, yet their robustness to physical adversarial attacks remains unexplored. This paper...

2 months ago cs.CV PDF

Attack MEDIUM

LLM-Agent Interactions on Markets with Information Asymmetries

Alexander Erlei, Lukas Meub

As AI agents increasingly act on behalf of human stakeholders in economic settings, understanding their behavior in complex market environments...

2 months ago econ.GN PDF

Attack HIGH

SlowBA: An efficiency backdoor attack towards VLM-based GUI agents

Junxian Li, Tu Lan, Haozhen Tan +2 more

Modern vision-language-model (VLM) based graphical user interface (GUI) agents are expected not only to execute actions accurately but also to...

2 months ago cs.CR cs.CL cs.CV PDF
