AgentTrust: Runtime Safety Evaluation and Interception for AI Agent Tool Use
Chenglin Yang
Modern AI agents execute real-world side effects through tool calls such as file operations, shell commands, HTTP requests, and database queries. A...
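The abstract above is truncated, and the paper's actual mechanism is not shown here. Purely as a generic illustration of the runtime-interception pattern the title describes, a minimal policy gate that sits between an agent and its tools might look like the sketch below; every name in it (the deny-list, `is_safe_shell`, `intercept`) is a hypothetical example, not AgentTrust's API.

```python
# Illustrative sketch only: a generic runtime gate for agent tool calls.
# None of these names come from the AgentTrust paper; the deny-list and
# the wrapper are hypothetical examples of the interception pattern.
import shlex

BLOCKED_SHELL_TOKENS = {"rm", "mkfs", "dd"}  # example deny-list


def is_safe_shell(command: str) -> bool:
    """Reject shell commands whose first token is on the deny-list."""
    tokens = shlex.split(command)
    return bool(tokens) and tokens[0] not in BLOCKED_SHELL_TOKENS


def intercept(tool_name: str, argument: str, execute):
    """Run execute(argument) only if the call passes the safety check."""
    if tool_name == "shell" and not is_safe_shell(argument):
        return {"status": "blocked", "reason": "denied shell command"}
    return {"status": "ok", "result": execute(argument)}


# A benign call passes through; a destructive one is blocked before
# the side effect ever executes.
print(intercept("shell", "ls -l", lambda a: f"ran: {a}"))
print(intercept("shell", "rm -rf /", lambda a: f"ran: {a}"))
```

The same wrapper shape extends to the other side-effect channels the abstract lists (file operations, HTTP requests, database queries) by adding per-tool checks.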
Zheng Fang, Xiaosen Wang, Shenyi Zhang +2 more
Jailbreak attacks on audio language models (ALMs) optimize audio perturbations to elicit unsafe generations, and they typically update the entire...
Jan Dolejš, Martin Jureček, Róbert Lórencz
Modern malware detection pipelines rely on continuous data ingestion and machine learning to counter the high volume of novel threats. This work...
Kaifeng He, Xiaojun Zhang, Peiliang Cai +7 more
Large language models (LLMs) frequently generate defective outputs in code generation tasks, ranging from logical bugs to security vulnerabilities....
Hanum Ko, Sangheum Yeon, Jong Hwan Ko +1 more
As DRAM scales in density and adopts 3D integration, raw fault rates increase and multi-bit errors are no longer rare. Such errors can severely...
Zekun Fei, Zihao Wang, Weijie Liu +4 more
Mixture-of-Experts (MoE) architectures have emerged as a leading paradigm for scaling large language models through sparse, routing-based...
Zhenning Yang, Yuhan Chen, Patrick Tser Jern Kon +5 more
To unleash the full potential of AI for Science, we must untether the agents from a purely digital environment. The agent's ability to control and...
Jie Zhang, Pura Peetathawatchai, Florian Tramèr +1 more
Vision-language models (VLMs) are increasingly deployed as trusted authorities -- fact-checking images on social media, comparing products, and...
Sarthak Choudhary, Atharv Singh Patlan, Nils Palumbo +3 more
We present Sparse Backdoor, a supply-chain attack that plants a provably undetectable backdoor in pre-trained image classifiers, including...
Raja Sekhar Rao Dheekonda, Will Pearce, Nick Landers
AI systems are entering critical domains like healthcare, finance, and defense, yet remain vulnerable to adversarial attacks. While AI red teaming is...
Shravya Kanchi, Xiaoyan Zang, Ying Zhang +2 more
Developers create modern software applications (Apps) on top of third-party libraries (Libs). When library vulnerabilities are reachable through...
Gabriel Hortea, Juan Tapiador
Malware authors have traditionally relied on polymorphic techniques to produce variants in the same malware family, complicating signature-based...
Tejas Kulkarni, Antti Koskela, Laith Zumot
We show that remotely hosted applications employing in-context learning when augmented with a retrieval function to select in-context examples can be...
Ishrith Gowda
Persistent external memory enables LLM agents to maintain context across sessions, yet its security properties remain formally uncharacterized. We...
Haoyu Zhang, Mohammad Zandsalimy, Shanu Sushmita
Large language models (LLMs) employ safety mechanisms to prevent harmful outputs, yet these defenses primarily rely on semantic pattern matching. We...
Srinath Perera, Kaviru Hapuarachchi, Frank Leymann +1 more
We present Robust Agent Compensation (RAC), a log-based recovery paradigm (providing a safety net) implemented through an architectural extension...
Rishi Raj Sahoo, Jyotirmaya Shivottam, Subhankar Mishra
Regulatory frameworks such as GDPR increasingly require that ML predictions be accompanied by post-hoc explanations, even when raw data and trained...
Bikrant Bikram Pratap Maurya, Nitin Choudhury, Daksh Agarwal +1 more
Acoustic side-channel attacks (ASCA) on keyboards pose a significant security risk, as keystrokes can be inferred from typing acoustics, revealing...
Shihao Weng, Yang Feng, Jinrui Zhang +3 more
The rise of Large Language Model (LLM) agents, augmented with tool use, skills, and external knowledge, has introduced new security risks. Among...