AI Security Research

2,583+ academic papers on AI security, attacks, and defenses

Total: 2,583

Attack: 994

Benchmark: 740

Defense: 355

Tool: 275

Survey: 146

Showing 1721–1740 of 1,932 papers

Attack HIGH

Replicating TEMPEST at Scale: Multi-Turn Adversarial Attacks Against Trillion-Parameter Frontier Models

Richard Young

Despite substantial investment in safety alignment, the vulnerability of large language models to sophisticated multi-turn adversarial attacks...

5 months ago cs.CL PDF

Attack MEDIUM

Large Language Models and Forensic Linguistics: Navigating Opportunities and Threats in the Age of Generative AI

George Mikros

Large language models (LLMs) present a dual challenge for forensic linguistics. They serve as powerful analytical tools enabling scalable corpus...

5 months ago cs.CL cs.CY PDF

Survey MEDIUM

SoK: Trust-Authorization Mismatch in LLM Agent Interactions

Guanquan Shi, Haohua Du, Zhiqiang Wang +4 more

Large Language Models (LLMs) are evolving into autonomous agents capable of executing complex workflows via standardized protocols (e.g., MCP)....

5 months ago cs.CR cs.AI PDF

Defense MEDIUM

MINES: Explainable Anomaly Detection through Web API Invariant Inference

Wenjie Zhang, Yun Lin, Chun Fung Amos Kwok +5 more

Detecting the anomalies of web applications, important infrastructures for running modern companies and governments, is crucial for providing...

5 months ago cs.SE cs.CR cs.DB PDF

Defense MEDIUM

CKG-LLM: LLM-Assisted Detection of Smart Contract Access Control Vulnerabilities Based on Knowledge Graphs

Xiaoqi Li, Hailu Kuang, Wenkai Li +2 more

Traditional approaches for smart contract analysis often rely on intermediate representations such as abstract syntax trees, control-flow graphs, or...

5 months ago cs.CR PDF

Attack MEDIUM

From Description to Score: Can LLMs Quantify Vulnerabilities?

Sima Jafarikhah, Daniel Thompson, Eva Deans +2 more

Manual vulnerability scoring, such as assigning Common Vulnerability Scoring System (CVSS) scores, is a resource-intensive process that is often...

5 months ago cs.CR cs.AI cs.PL PDF

Tool MEDIUM

Cognitive Control Architecture (CCA): A Lifecycle Supervision Framework for Robustly Aligned AI Agents

Zhibo Liang, Tianze Hu, Zaiye Chen +1 more

Autonomous Large Language Model (LLM) agents exhibit significant vulnerability to Indirect Prompt Injection (IPI) attacks. These attacks hijack agent...

5 months ago cs.AI cs.CL cs.CR PDF

Attack MEDIUM

Look Twice before You Leap: A Rational Agent Framework for Localized Adversarial Anonymization

Donghang Duan, Xu Zheng, Yuefeng He +3 more

Current LLM-based text anonymization frameworks usually rely on remote API services from powerful LLMs, which creates an inherent privacy paradox:...

5 months ago cs.CR cs.CL PDF

Attack HIGH

RunawayEvil: Jailbreaking the Image-to-Video Generative Models

Songping Wang, Rufan Qian, Yueming Lyu +5 more

Image-to-Video (I2V) generation synthesizes dynamic visual content from image and text inputs, providing significant creative control. However, the...

5 months ago cs.CV PDF

Defense MEDIUM

GSAE: Graph-Regularized Sparse Autoencoders for Robust LLM Safety Steering

Jehyeok Yeon, Federico Cinus, Yifan Wu +1 more

Large language models (LLMs) face critical safety challenges, as they can be manipulated to generate harmful content through adversarial prompts and...

5 months ago cs.LG cs.AI PDF

Benchmark HIGH

OmniSafeBench-MM: A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense Evaluation

Xiaojun Jia, Jie Liao, Qi Guo +11 more

Recent advances in multi-modal large language models (MLLMs) have enabled unified perception-reasoning capabilities, yet these systems remain highly...

5 months ago cs.CR cs.CV PDF

Tool HIGH

Securing the Model Context Protocol: Defending LLMs Against Tool Poisoning and Adversarial Attacks

Saeid Jamshidi, Kawser Wazed Nafi, Arghavan Moradi Dakhel +3 more

The Model Context Protocol (MCP) enables Large Language Models to integrate external tools through structured descriptors, increasing autonomy in...

5 months ago cs.CR cs.AI PDF

Tool MEDIUM

BEACON: A Unified Behavioral-Tactical Framework for Explainable Cybercrime Analysis with Large Language Models

Arush Sachdeva, Rajendraprasad Saravanan, Gargi Sarkar +2 more

Cybercrime increasingly exploits human cognitive biases in addition to technical vulnerabilities, yet most existing analytical frameworks focus...

5 months ago cs.CR cs.AI cs.CY PDF

Attack HIGH

Metaphor-based Jailbreaking Attacks on Text-to-Image Models

Chenyu Zhang, Yiwen Ma, Lanjun Wang +3 more

Text-to-image (T2I) models commonly incorporate defense mechanisms to prevent the generation of sensitive images. Unfortunately, recent jailbreaking...

5 months ago cs.CR cs.AI cs.CV PDF

Tool MEDIUM

UncertaintyZoo: A Unified Toolkit for Quantifying Predictive Uncertainty in Deep Learning Systems

Xianzong Wu, Xiaohong Li, Lili Quan +1 more

Large language models (LLMs) are increasingly expanding their real-world applications across domains, e.g., question answering, autonomous driving,...

5 months ago cs.AI cs.LG PDF

Survey MEDIUM

Web Technologies Security in the AI Era: A Survey of CDN-Enhanced Defenses

Mehrab Hosain, Sabbir Alom Shuvo, Matthew Ogbe +4 more

The modern web stack, which is dominated by browser-based applications and API-first backends, now operates under an adversarial equilibrium where...

5 months ago cs.CR cs.AI cs.LG PDF

Tool HIGH

Beyond Model Jailbreak: Systematic Dissection of the "Ten Deadly Sins" in Embodied Intelligence

Yuhang Huang, Junchao Li, Boyang Ma +6 more

Embodied AI systems integrate language models with real world sensing, mobility, and cloud connected mobile apps. Yet while model jailbreaks have...

5 months ago cs.CR cs.RO PDF

Benchmark MEDIUM

CFCEval: Evaluating Security Aspects in Code Generated by Large Language Models

Cheng Cheng, Jinqiu Yang

Code-focused Large Language Models (LLMs), such as CodeX and Star-Coder, have demonstrated remarkable capabilities in enhancing developer...

5 months ago cs.SE PDF

Defense MEDIUM

DEFEND: Poisoned Model Detection and Malicious Client Exclusion Mechanism for Secure Federated Learning-based Road Condition Classification

Sheng Liu, Panos Papadimitratos

Federated Learning (FL) has drawn the attention of the Intelligent Transportation Systems (ITS) community. FL can train various models for ITS tasks,...

5 months ago cs.CR cs.AI PDF

Benchmark HIGH

Sift or Get Off the PoC: Applying Information Retrieval to Vulnerability Research with SiftRank

Caleb Gross

Security research is fundamentally a problem of resource constraint and consequent prioritization. There is simply too much attack surface and too...

5 months ago cs.CR cs.IR PDF
