AI Security Research

2,529+ academic papers on AI security, attacks, and defenses

Total: 2,529

Attack: 969

Benchmark: 729

Defense: 345

Tool: 272

Survey: 142

Showing 201–220 of 312 papers

Attack MEDIUM

Large Language Models and Forensic Linguistics: Navigating Opportunities and Threats in the Age of Generative AI

George Mikros

Large language models (LLMs) present a dual challenge for forensic linguistics. They serve as powerful analytical tools enabling scalable corpus...

5 months ago cs.CL cs.CY PDF

Attack MEDIUM

From Description to Score: Can LLMs Quantify Vulnerabilities?

Sima Jafarikhah, Daniel Thompson, Eva Deans +2 more

Manual vulnerability scoring, such as assigning Common Vulnerability Scoring System (CVSS) scores, is a resource-intensive process that is often...

5 months ago cs.CR cs.AI cs.PL PDF

Attack MEDIUM

Look Twice before You Leap: A Rational Agent Framework for Localized Adversarial Anonymization

Donghang Duan, Xu Zheng, Yuefeng He +3 more

Current LLM-based text anonymization frameworks usually rely on remote API services from powerful LLMs, which creates an inherent privacy paradox:...

5 months ago cs.CR cs.CL PDF

Attack MEDIUM

Topology Matters: Measuring Memory Leakage in Multi-Agent LLMs

Jinbo Liu, Defu Cao, Yifei Wei +6 more

Graph topology is a fundamental determinant of memory leakage in multi-agent LLM systems, yet its effects remain poorly quantified. We introduce MAMA...

5 months ago cs.CR cs.AI cs.CL PDF

Attack MEDIUM

In-Context Representation Hijacking

Itay Yona, Amir Sarid, Michael Karasik +1 more

We introduce Doublespeak, a simple in-context representation hijacking attack against large language models (LLMs). The attack works by...

5 months ago cs.CL cs.AI cs.CR PDF

Attack MEDIUM

SELF: A Robust Singular Value and Eigenvalue Approach for LLM Fingerprinting

Hanxiu Zhang, Yue Zheng

The protection of Intellectual Property (IP) in Large Language Models (LLMs) represents a critical challenge in contemporary AI research. While...

5 months ago cs.CR cs.AI cs.CL PDF

Attack MEDIUM

Invasive Context Engineering to Control Large Language Models

Thomas Rivasseau

Current research on operator control of Large Language Models improves model robustness against adversarial attacks and misbehavior by training on...

5 months ago cs.AI PDF

Attack MEDIUM

Adversarial Robustness of Traffic Classification under Resource Constraints: Input Structure Matters

Adel Chehade, Edoardo Ragusa, Paolo Gastaldo +1 more

Traffic classification (TC) plays a critical role in cybersecurity, particularly in IoT and embedded contexts, where inspection must often occur...

5 months ago cs.NI cs.CR cs.LG PDF

Attack MEDIUM

CluCERT: Certifying LLM Robustness via Clustering-Guided Denoising Smoothing

Zixia Wang, Gaojie Jin, Jia Hu +1 more

Recent advancements in Large Language Models (LLMs) have led to their widespread adoption in daily applications. Despite their impressive...

5 months ago cs.LG cs.AI PDF

Attack MEDIUM

From monoliths to modules: Decomposing transducers for efficient world modelling

Alexander Boyd, Franz Nowak, David Hyland +2 more

World models have been recently proposed as sandbox environments in which AI agents can be trained and evaluated before deployment. Although...

5 months ago cs.AI PDF

Attack MEDIUM

Factor(T,U): Factored Cognition Strengthens Monitoring of Untrusted AI

Aaron Sandoval, Cody Rushing

The field of AI Control seeks to develop robust control protocols, deployment safeguards for untrusted AI which may be intentionally subversive....

5 months ago cs.CR cs.CL PDF

Attack MEDIUM

Many-to-One Adversarial Consensus: Exposing Multi-Agent Collusion Risks in AI-Based Healthcare

Adeela Bashir, The Anh Han, Zia Ush Shamszaman

The integration of large language models (LLMs) into healthcare IoT systems promises faster decisions and improved medical support. LLMs are also...

5 months ago cs.CR cs.LG cs.MA PDF

Attack MEDIUM

On the Regulatory Potential of User Interfaces for AI Agent Governance

K. J. Kevin Feng, Tae Soo Kim, Rock Yuren Pang +3 more

AI agents that take actions in their environment autonomously over extended time horizons require robust governance interventions to curb their...

5 months ago cs.CY cs.AI PDF

Attack MEDIUM

An Empirical Study on the Security Vulnerabilities of GPTs

Tong Wu, Weibin Wu, Zibin Zheng

Equipped with various tools and knowledge, GPTs, one kind of customized AI agent based on OpenAI's large language models, have illustrated great...

5 months ago cs.CR cs.SE PDF

Attack MEDIUM

NetDeTox: Adversarial and Efficient Evasion of Hardware-Security GNNs via RL-LLM Orchestration

Zeng Wang, Minghao Shao, Akashdeep Saha +4 more

Graph neural networks (GNNs) have shown promise in hardware security by learning structural motifs from netlist graphs. However, this reliance on...

5 months ago cs.CR cs.AI PDF

Attack MEDIUM

Securing the Model Context Protocol (MCP): Risks, Controls, and Governance

Herman Errico, Jiquan Ngiam, Shanita Sojan

The Model Context Protocol (MCP) replaces static, developer-controlled API integrations with more dynamic, user-driven agent systems, which also...

5 months ago cs.CR PDF

Attack MEDIUM

Ranking-Enhanced Anomaly Detection Using Active Learning-Assisted Attention Adversarial Dual AutoEncoders

Sidahmed Benabderrahmane, James Cheney, Talal Rahwan

Advanced Persistent Threats (APTs) pose a significant challenge in cybersecurity due to their stealthy and long-term nature. Modern supervised...

5 months ago cs.LG cs.AI cs.CR PDF

Attack MEDIUM

Prompt Fencing: A Cryptographic Approach to Establishing Security Boundaries in Large Language Model Prompts

Steven Peh

Large Language Models (LLMs) remain vulnerable to prompt injection attacks, representing the most significant security threat in production...

5 months ago cs.CR cs.AI PDF

Attack MEDIUM

Towards Realistic Guarantees: A Probabilistic Certificate for SmoothLLM

Adarsh Kumarappan, Ayushi Mehrotra

The SmoothLLM defense provides a certification guarantee against jailbreaking attacks, but it relies on a strict "k-unstable" assumption that rarely...

5 months ago cs.LG cs.AI PDF

Attack MEDIUM

ASTRA: Agentic Steerability and Risk Assessment Framework

Itay Hazan, Yael Mathov, Guy Shtar +2 more

Securing AI agents powered by Large Language Models (LLMs) represents one of the most critical challenges in AI security today. Unlike traditional...

5 months ago cs.CR PDF
