AI Security Research

2,560+ academic papers on AI security, attacks, and defenses

Total

2,560

Attack

982

Benchmark

736

Defense

350

Tool

275

Survey

144

Type

All Attack Defense Survey Benchmark Tool

Relevance

All High Medium

Date

All time 7 days 30 days 6 months

Showing 881–900 of 1,220 papers

Clear filters

Benchmark MEDIUM

RoguePrompt: Dual-Layer Ciphering for Self-Reconstruction to Circumvent LLM Moderation

Benyamin Tafreshian

Large language models (LLMs) are becoming increasingly integrated into mainstream development platforms and daily technological workflows, typically...

5 months ago cs.CR PDF

Attack MEDIUM

Towards Realistic Guarantees: A Probabilistic Certificate for SmoothLLM

Adarsh Kumarappan, Ayushi Mehrotra

The SmoothLLM defense provides a certification guarantee against jailbreaking attacks, but it relies on a strict "k-unstable" assumption that rarely...

5 months ago cs.LG cs.AI PDF

Survey MEDIUM

From Reviewers' Lens: Understanding Bug Bounty Report Invalid Reasons with LLMs

Jiangrui Zheng, Yingming Zhou, Ali Abdullah Ahmad +2 more

Bug bounty platforms (e.g., HackerOne, BugCrowd) leverage crowd-sourced vulnerability discovery to improve continuous coverage, reduce the cost of...

5 months ago cs.SE cs.CR PDF

Defense MEDIUM

Beyond Binary Classification: A Semi-supervised Approach to Generalized AI-generated Image Detection

Hong-Hanh Nguyen-Le, Van-Tuan Tran, Dinh-Thuc Nguyen +1 more

The rapid advancement of generators (e.g., StyleGAN, Midjourney, DALL-E) has produced highly realistic synthetic images, posing significant...

5 months ago cs.LG cs.AI cs.CR PDF

Tool MEDIUM

Shadows in the Code: Exploring the Risks and Defenses of LLM-based Multi-Agent Software Development Systems

Xiaoqing Wang, Keman Huang, Bin Liang +2 more

The rapid advancement of Large Language Model (LLM)-driven multi-agent systems has significantly streamlined software developing tasks, enabling...

5 months ago cs.CR cs.AI cs.CL PDF

Tool MEDIUM

LLMs as Firmware Experts: A Runtime-Grown Tree-of-Agents Framework

Xiangrui Zhang, Zeyu Chen, Haining Wang +1 more

Large Language Models (LLMs) and their agent systems have recently demonstrated strong potential in automating code reasoning and vulnerability...

5 months ago cs.CR cs.SE PDF

Tool MEDIUM

Z-Space: A Multi-Agent Tool Orchestration Framework for Enterprise-Grade LLM Automation

Qingsong He, Jing Nan, Jiayu Jiao +5 more

Large Language Models can break through knowledge and timeliness limitations by invoking external tools within the Model Context Protocol framework...

5 months ago cs.SE cs.AI PDF

Defense MEDIUM

SafeCiM: Investigating Resilience of Hybrid Floating-Point Compute-in-Memory Deep Learning Accelerators

Swastik Bhattacharya, Sanjay Das, Anand Menon +3 more

Deep Neural Networks (DNNs) continue to grow in complexity with Large Language Models (LLMs) incorporating vast numbers of parameters. Handling these...

5 months ago cs.AR cs.LG PDF

Other MEDIUM

Can LLMs Help Allocate Public Health Resources? A Case Study on Childhood Lead Testing

Mohamed Afane, Ying Wang, Juntao Chen

Public health agencies face critical challenges in identifying high-risk neighborhoods for childhood lead exposure with limited resources for...

5 months ago cs.CY cs.AI PDF

Benchmark MEDIUM

Think Fast: Real-Time IoT Intrusion Reasoning Using IDS and LLMs at the Edge Gateway

Saeid Jamshidi, Amin Nikanjam, Negar Shahabi +4 more

As the number of connected IoT devices continues to grow, securing these systems against cyber threats remains a major challenge, especially in...

5 months ago cs.CR PDF

Attack MEDIUM

ASTRA: Agentic Steerability and Risk Assessment Framework

Itay Hazan, Yael Mathov, Guy Shtar +2 more

Securing AI agents powered by Large Language Models (LLMs) represents one of the most critical challenges in AI security today. Unlike traditional...

5 months ago cs.CR PDF

Benchmark MEDIUM

Building Browser Agents: Architecture, Security, and Practical Solutions

Aram Vardanyan

Browser agents enable autonomous web interaction but face critical reliability and security challenges in production. This paper presents findings...

5 months ago cs.SE PDF

Benchmark MEDIUM

Vision Language Models are Confused Tourists

Patrick Amadeus Irawan, Ikhlasul Akmal Hanif, Muhammad Dehan Al Kautsar +3 more

Although the cultural dimension has been one of the key aspects in evaluating Vision-Language Models (VLMs), their ability to remain stable across...

5 months ago cs.CV cs.CL PDF

Benchmark MEDIUM

Cognitive Inception: Agentic Reasoning against Visual Deceptions by Injecting Skepticism

Yinjie Zhao, Heng Zhao, Bihan Wen +1 more

As the development of AI-generated contents (AIGC), multi-modal Large Language Models (LLM) struggle to identify generated visual inputs from real...

5 months ago cs.AI PDF

Attack MEDIUM

MURMUR: Using cross-user chatter to break collaborative language agents in groups

Atharv Singh Patlan, Peiyao Sheng, S. Ashwin Hebbar +2 more

Language agents are rapidly expanding from single-user assistants to multi-user collaborators in shared workspaces and groups. However, today's...

5 months ago cs.CR cs.AI cs.CL PDF

Attack MEDIUM

Evaluating Adversarial Vulnerabilities in Modern Large Language Models

Tom Perel

The recent boom and rapid integration of Large Language Models (LLMs) into a wide range of applications warrants a deeper understanding of their...

5 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

AssurAI: Experience with Constructing Korean Socio-cultural Datasets to Discover Potential Risks of Generative AI

Chae-Gyun Lim, Seung-Ho Han, EunYoung Byun +51 more

The rapid evolution of generative AI necessitates robust safety evaluations. However, current safety datasets are predominantly English-centric,...

5 months ago cs.AI cs.CY cs.LG PDF

Benchmark MEDIUM

Q-MLLM: Vector Quantization for Robust Multimodal Large Language Model Security

Wei Zhao, Zhe Li, Yige Li +1 more

Multimodal Large Language Models (MLLMs) have demonstrated impressive capabilities in cross-modal understanding, but remain vulnerable to adversarial...

5 months ago cs.CR cs.AI PDF

Attack MEDIUM

PSM: Prompt Sensitivity Minimization via LLM-Guided Black-Box Optimization

Huseein Jawad, Nicolas Brunel

System prompts are critical for guiding the behavior of Large Language Models (LLMs), yet they often contain proprietary logic or sensitive...

5 months ago cs.CR cs.CL PDF

Defense MEDIUM

Entropy-Based Measurement of Value Drift and Alignment Work in Large Language Models

Samih Fadli

Large language model safety is usually assessed with static benchmarks, but key failures are dynamic: value drift under distribution shift, jailbreak...

5 months ago cs.CL cs.AI cs.LG PDF

Track AI security vulnerabilities in real time

Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.

Start 14-Day Free Trial