Soft Instruction De-escalation Defense
Nils Philipp Walter, Chawin Sitawarin, Jamie Hayes +2 more
Large Language Models (LLMs) are increasingly deployed in agentic systems that interact with an external environment; this makes them susceptible to...
Li An, Yujian Liu, Yepeng Liu +3 more
Watermarking has emerged as a promising solution for tracing and authenticating text generated by large language models (LLMs). A common approach to...
Nguyen Linh Bao Nguyen, Alsharif Abuadbba, Kristen Moore +1 more
The rapid advancement of generative models has enabled the creation of increasingly stealthy synthetic voices, commonly referred to as audio...
Zheng-Xin Yong, Stephen H. Bach
We discover a novel and surprising phenomenon of unintentional misalignment in reasoning language models (RLMs), which we call self-jailbreaking....
Soham Hans, Stacy Marsella, Sophia Hirschmann +1 more
Understanding adversarial behavior in cybersecurity has traditionally relied on high-level intelligence reports and manual interpretation of attack...
Austin Jia, Avaneesh Ramesh, Zain Shamsi +2 more
Retrieval-Augmented Generation (RAG) has emerged as the dominant architectural pattern to operationalize Large Language Model (LLM) usage in Cyber...
Antônio H. Ribeiro, David Vävinggren, Dave Zachariah +2 more
Adversarial training has emerged as a key technique to enhance model robustness against adversarial input perturbations. Many of the existing methods...
Ronghao Ni, Aidan Z. H. Yang, Min-Chien Hsu +5 more
Program analysis tools often produce large volumes of candidate vulnerability reports that require costly manual review, creating a practical...
Alyssa Gerhart, Balaji Iyangar
Adversarial attacks pose a severe risk to AI systems used in healthcare, capable of misleading models into dangerous misclassifications that can...
Wei Shao, Yuhao Wang, Rongguang He +2 more
Existing defence mechanisms have demonstrated significant effectiveness in mitigating rule-based Denial-of-Service (DoS) attacks, leveraging...
Mohamed Seif, Malcolm Egan, Andrea J. Goldsmith +1 more
AI-based sensing at wireless edge devices has the potential to significantly enhance Artificial Intelligence (AI) applications, particularly for...
Zhenghao Xu, Qin Lu, Qingru Zhang +9 more
The reward model (RM) plays a pivotal role in reinforcement learning with human feedback (RLHF) for aligning large language models (LLMs). However,...
Daniel Gilkarov, Ran Dubin
Pretrained deep learning model sharing holds tremendous value for researchers and enterprises alike. It allows them to apply deep learning by...
Chiyu Chen, Xinhao Song, Yunkai Chai +7 more
Vision-Language Models (VLMs) are increasingly deployed as autonomous agents to navigate mobile graphical user interfaces (GUIs). Operating in...
Wu Yichao, Wang Yirui, Ding Panpan +3 more
With the wide application of deep reinforcement learning (DRL) techniques in complex fields such as autonomous driving, intelligent manufacturing,...
Divyanshu Kumar, Shreyas Jena, Nitin Aravind Birur +3 more
Multimodal large language models (MLLMs) have achieved remarkable progress, yet remain critically vulnerable to adversarial attacks that exploit...
Yulong Chen, Yadong Liu, Jiawen Zhang +3 more
Large Language Models (LLMs), despite advances in safety alignment, remain vulnerable to jailbreak attacks designed to circumvent protective...
Wm. Matthew Kennedy, Cigdem Patlak, Jayraj Dave +10 more
AI systems have the potential to produce both benefits and harms, but without rigorous and ongoing adversarial evaluation, AI actors will struggle to...
Xin Lian, Kenneth D. Forbus
Despite the broad applicability of large language models (LLMs), their reliance on probabilistic inference makes them vulnerable to errors such as...
Tushar Nayan, Ziqi Zhang, Ruimin Sun
With the increasing deployment of Large Language Models (LLMs) on mobile and edge platforms, securing them against model extraction attacks has...