Securing AI Agent Execution
Christoph Bühler, Matteo Biagiola, Luca Di Grazia, et al.
Large Language Models (LLMs) have evolved into AI agents that interact with external tools and environments to perform complex tasks. The Model...
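The abstract is cut off, but agents of the kind it describes typically run a model-decides, runtime-executes loop, and the tool allow-list is the boundary that work on securing agent execution aims to harden. A minimal sketch of such a loop, assuming a hypothetical `call_llm` stand-in and a hand-rolled registry (neither is this paper's design):

```python
import json

# Hypothetical tool registry; the allow-list is the security boundary
# that execution-securing work tries to enforce.
TOOLS = {
    "get_time": lambda args: "2025-01-01T00:00:00Z",
}

def call_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns a canned tool request."""
    return json.dumps({"tool": "get_time", "args": {}})

def agent_step(user_request: str) -> str:
    request = json.loads(call_llm(user_request))
    tool = TOOLS.get(request["tool"])  # refuse anything off the allow-list
    if tool is None:
        return f"refused: unknown tool {request['tool']!r}"
    return str(tool(request["args"]))

print(agent_step("What time is it?"))  # -> 2025-01-01T00:00:00Z
```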
Divyanshu Kumar, Nitin Aravind Birur, Tanay Baswa, et al.
Frontier Large Language Models (LLMs) pose unprecedented dual-use risks through the potential proliferation of chemical, biological, radiological,...
Nils Philipp Walter, Chawin Sitawarin, Jamie Hayes, et al.
Large Language Models (LLMs) are increasingly deployed in agentic systems that interact with an external environment; this makes them susceptible to...
Li An, Yujian Liu, Yepeng Liu, et al.
Watermarking has emerged as a promising solution for tracing and authenticating text generated by large language models (LLMs). A common approach to...
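The truncated sentence likely refers to green-list logit biasing. A minimal detection-side sketch in the style of Kirchenbauer et al. (an assumption about which "common approach" the paper means): count tokens that fall in a pseudo-random "green" subset of the vocabulary and test against the unwatermarked null. `VOCAB_SIZE` and `GREEN_FRACTION` are illustrative parameters.

```python
import hashlib
import math

GREEN_FRACTION = 0.5  # gamma: fraction of vocabulary marked "green"

def is_green(prev_token: int, token: int) -> bool:
    """Pseudo-random vocabulary partition, keyed on the previous token;
    reports whether `token` landed in the green list."""
    digest = hashlib.sha256(f"{prev_token}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < GREEN_FRACTION

def detection_z_score(tokens: list[int]) -> float:
    """z-score of the green-token count against the null hypothesis of
    unwatermarked text (each token green with probability gamma)."""
    n = len(tokens) - 1
    greens = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    expected = GREEN_FRACTION * n
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (greens - expected) / std

print(detection_z_score([3, 14, 159, 26, 535, 897, 93, 238]))
```

A generator that biases sampling toward green tokens will push this z-score far above what chance allows, which is what makes the text traceable.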
Soham Hans, Stacy Marsella, Sophia Hirschmann, et al.
Understanding adversarial behavior in cybersecurity has traditionally relied on high-level intelligence reports and manual interpretation of attack...
Austin Jia, Avaneesh Ramesh, Zain Shamsi, et al.
Retrieval-Augmented Generation (RAG) has emerged as the dominant architectural pattern to operationalize Large Language Model (LLM) usage in Cyber...
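For readers unfamiliar with the pattern: RAG retrieves relevant documents and prepends them to the model's prompt. A toy sketch with an in-memory corpus and crude lexical scoring (both illustrative assumptions; production systems use dense embeddings and a vector store):

```python
import math
from collections import Counter

DOCS = [
    "CVE-2024-0001 is a buffer overflow in the example parser.",
    "Phishing campaigns often spoof login pages of cloud providers.",
    "Ransomware encrypts files and demands payment for the key.",
]

def score(query: str, doc: str) -> float:
    """Crude lexical overlap, length-normalized; a real system would
    rank by embedding similarity instead."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values()) / math.sqrt(len(doc.split()) + 1)

def rag_prompt(query: str) -> str:
    context = max(DOCS, key=lambda doc: score(query, doc))
    return f"Context: {context}\nQuestion: {query}\nAnswer:"

# In a real pipeline this prompt is sent to the LLM.
print(rag_prompt("What does ransomware do to files?"))
```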
Ronghao Ni, Aidan Z. H. Yang, Min-Chien Hsu, et al.
Program analysis tools often produce large volumes of candidate vulnerability reports that require costly manual review, creating a practical...
Alyssa Gerhart, Balaji Iyangar
Adversarial attacks pose a severe risk to AI systems used in healthcare, capable of misleading models into dangerous misclassifications that can...
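As a concrete instance of the threat: the Fast Gradient Sign Method (Goodfellow et al.) flips a model's prediction with a single imperceptible gradient step. A minimal PyTorch sketch; the toy linear model and epsilon value are assumptions for illustration, not anything from this paper.

```python
import torch

def fgsm_perturb(model: torch.nn.Module, x: torch.Tensor, y: torch.Tensor,
                 epsilon: float = 0.03) -> torch.Tensor:
    """One gradient-sign step that maximizes the loss, clipped back to
    the valid pixel range [0, 1]."""
    x = x.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    return (x + epsilon * x.grad.sign()).clamp(0.0, 1.0).detach()

# Toy usage: a linear "classifier" over flattened 8x8 inputs.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64, 2))
x_adv = fgsm_perturb(model, torch.rand(1, 1, 8, 8), torch.tensor([0]))
```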
Daniel Gilkarov, Ran Dubin
Pretrained deep learning model sharing holds tremendous value for researchers and enterprises alike. It allows them to apply deep learning by...
Yulong Chen, Yadong Liu, Jiawen Zhang, et al.
Large Language Models (LLMs), despite advances in safety alignment, remain vulnerable to jailbreak attacks designed to circumvent protective...
Tushar Nayan, Ziqi Zhang, Ruimin Sun
With the increasing deployment of Large Language Models (LLMs) on mobile and edge platforms, securing them against model extraction attacks has...
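The attack class the abstract refers to generally works by querying the deployed model and fitting a surrogate to its answers. A toy sketch of that threat model (the linear victim and sklearn surrogate are assumptions chosen to keep it self-contained, not this paper's setup):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def victim_predict(x: np.ndarray) -> np.ndarray:
    """Stand-in for the deployed model the attacker can only query."""
    return (x @ np.array([1.5, -2.0]) > 0.3).astype(int)

# Attacker: label random queries with the victim's outputs, then fit a
# surrogate that imitates its decision boundary.
rng = np.random.default_rng(0)
queries = rng.normal(size=(500, 2))
surrogate = LogisticRegression().fit(queries, victim_predict(queries))

test = rng.normal(size=(1000, 2))
agreement = (surrogate.predict(test) == victim_predict(test)).mean()
print(f"surrogate agrees with victim on {agreement:.1%} of inputs")
```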
Hanbin Hong, Ashish Kundu, Ali Payani, et al.
Randomized smoothing has become essential for achieving certified adversarial robustness in machine learning models. However, current methods...
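For context, randomized smoothing (Cohen et al.) classifies by majority vote over Gaussian perturbations of the input; the actual robustness certificate additionally needs a confidence bound and the Gaussian inverse CDF, which this Monte Carlo sketch omits. The base classifier below is a stand-in assumption.

```python
import numpy as np

def base_classifier(x: np.ndarray) -> int:
    """Stand-in for any hard classifier f; here, the sign of x[0]."""
    return int(x[0] > 0)

def smoothed_predict(x: np.ndarray, sigma: float = 0.5,
                     n: int = 1000, seed: int = 0) -> int:
    """Monte Carlo estimate of the smoothed classifier
    g(x) = argmax_c P[f(x + eps) = c], with eps ~ N(0, sigma^2 I)."""
    rng = np.random.default_rng(seed)
    votes = np.zeros(2, dtype=int)
    for _ in range(n):
        votes[base_classifier(x + rng.normal(0.0, sigma, size=x.shape))] += 1
    return int(votes.argmax())

print(smoothed_predict(np.array([0.3, -1.2])))
```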
Zhonghao Zhan, Amir Al Sadi, Krinos Li, et al.
In this work, we study the security of Model Context Protocol (MCP) agent toolchains and their applications in smart homes. We introduce AegisMCP, a...
Chengcan Wu, Zhixin Zhang, Mingqian Xu, et al.
Large Language Model (LLM)-based Multi-Agent Systems (MAS) have become a popular paradigm of AI applications. However, trustworthiness issues in MAS...
Petar Radanliev
Problem space: AI vulnerabilities and quantum threats. Generative AI vulnerabilities include model inversion, data poisoning, and adversarial inputs. Quantum...
Thomas Wang, Haowen Li
As large language models (LLMs) are increasingly integrated into real-world applications, ensuring their safety, robustness, and privacy compliance...
Alexander Nemecek, Zebin Yun, Zahra Rahmani, et al.
As large language models (LLMs) become progressively more embedded in clinical decision-support, documentation, and patient-information systems,...
Marco Alecci, Jordan Samhi, Tegawendé F. Bissyandé, et al.
Mobile apps often embed authentication secrets, such as API keys, tokens, and client IDs, to integrate with cloud services. However, developers often...
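A scanner for such embedded secrets can be as simple as pattern matching over decompiled app resources. The rules below are illustrative assumptions (real tools ship hundreds of vetted patterns and entropy checks):

```python
import re

# Illustrative patterns only; not an exhaustive or production rule set.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "google_api_key": re.compile(r"AIza[0-9A-Za-z_\-]{35}"),
    "generic_token": re.compile(
        r"""(?i)(api[_-]?key|token)\s*[:=]\s*['"][^'"]{16,}['"]"""),
}

def scan_for_secrets(text: str) -> list[tuple[str, str]]:
    """Return (rule name, match) pairs for anything that looks like an
    embedded credential, e.g. in decompiled resources or config files."""
    hits = []
    for name, pattern in SECRET_PATTERNS.items():
        hits += [(name, m.group(0)) for m in pattern.finditer(text)]
    return hits

print(scan_for_secrets('API_KEY = "sk-test-aaaaaaaaaaaaaaaaaaaa"'))
```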
Giovanni De Muri, Mark Vero, Robin Staab, et al.
LLMs are often used by downstream users as teacher models for knowledge distillation, compressing their capabilities into memory-efficient models....
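The distillation the abstract refers to usually means Hinton-style soft-label training, where the student matches the teacher's temperature-softened output distribution. A minimal sketch of that standard objective (the temperature and random logits are illustrative, not this paper's configuration):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 so gradient magnitudes stay comparable across T."""
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_student = F.log_softmax(student_logits / t, dim=-1)
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * t * t

loss = distillation_loss(torch.randn(4, 10), torch.randn(4, 10))
print(loss.item())
```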
Oleksandr Adamov, Anders Carlsson
This paper explores the challenges of cyberattack attribution, specifically for APTs, applying a case-study approach to the WhisperGate cyber...