AI Security Research

2,583+ academic papers on AI security, attacks, and defenses

Total: 2,583
Attack: 994
Benchmark: 740
Defense: 355
Tool: 275
Survey: 146

Showing 121–140 of 740 papers

Benchmark LOW

Teaching Students to Question the Machine: An AI Literacy Intervention Improves Students' Regulation of LLM Use in a Science Task

O. Clerc, R. Abdelghani, C. Desvaux +3 more

The rapid adoption of generative artificial intelligence (GenAI) in schools raises concerns about students' uncritical reliance on its outputs....

1 month ago cs.CY PDF

Benchmark MEDIUM

From Component Manipulation to System Compromise: Understanding and Detecting Malicious MCP Servers

Yiheng Huang, Zhijia Zhao, Bihuan Chen +5 more

The model context protocol (MCP) standardizes how LLMs connect to external tools and data sources, enabling faster integration but introducing new...

1 month ago cs.CR cs.SE PDF

Benchmark LOW

AURA: Multimodal Shared Autonomy for Real-World Urban Navigation

Yukai Ma, Honglin He, Selina Song +2 more

Long-horizon navigation in complex urban environments relies heavily on continuous human operation, which leads to fatigue, reduced efficiency, and...

1 month ago cs.RO PDF

Benchmark MEDIUM

Cooking Up Risks: Benchmarking and Reducing Food Safety Risks in Large Language Models

Weidi Luo, Xiaofei Wen, Tenghao Huang +5 more

Large language models (LLMs) are increasingly deployed for everyday tasks, including food preparation and health-related guidance. However, food...

1 month ago cs.CR PDF

Benchmark MEDIUM

SERSEM: Selective Entropy-Weighted Scoring for Membership Inference in Code Language Models

Kıvanç Kuzey Dikici, Serdar Kara, Semih Çağlar +2 more

As Large Language Models (LLMs) for code increasingly utilize massive, often non-permissively licensed datasets, evaluating data contamination...

1 month ago cs.SE cs.CR PDF

Benchmark LOW

BloClaw: An Omniscient, Multi-Modal Agentic Workspace for Next-Generation Scientific Discovery

Yao Qin, Yangyang Yan, Jinhua Pang +1 more

The integration of Large Language Models (LLMs) into life sciences has catalyzed the development of "AI Scientists." However, translating these...

1 month ago cs.AI PDF

Benchmark MEDIUM

EnsembleSHAP: Faithful and Certifiably Robust Attribution for Random Subspace Method

Yanting Wang, Jinyuan Jia

Random subspace method has wide security applications such as providing certified defenses against adversarial and backdoor attacks, and building...

1 month ago cs.CR PDF

Benchmark MEDIUM

The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning

Yubo Li, Lu Zhang, Tianchong Jiang +2 more

Large language models systematically fail when a salient surface cue conflicts with an unstated feasibility constraint. We study this through a...

1 month ago cs.CL cs.AI PDF

Benchmark MEDIUM

Design Principles for the Construction of a Benchmark Evaluating Security Operation Capabilities of Multi-agent AI Systems

Yicheng Cai, Mitchell John DeStefano, Guodong Dong +5 more

As Large Language Models (LLMs) and multi-agent AI systems are demonstrating increasing potential in cybersecurity operations, organizations,...

1 month ago cs.CR cs.AI PDF

Benchmark MEDIUM

Evaluating Privilege Usage of Agents on Real-World Tools

Quan Zhang, Lianhang Fu, Lvsi Lian +5 more

Equipping LLM agents with real-world tools can substantially improve productivity. However, granting agents autonomy over tool use also transfers the...

1 month ago cs.CR cs.AI PDF

Benchmark LOW

CDH-Bench: A Commonsense-Driven Hallucination Benchmark for Evaluating Visual Fidelity in Vision-Language Models

Kesheng Chen, Yamin Hu, Qi Zhou +2 more

Vision-language models (VLMs) achieve strong performance on many benchmarks, yet a basic reliability question remains underexplored: when visual...

1 month ago cs.CV cs.AI cs.CL PDF

Benchmark MEDIUM

Seeing to Ground: Visual Attention for Hallucination-Resilient MDLLMs

Vishal Narnaware, Animesh Gupta, Kevin Zhai +2 more

Multimodal Diffusion Large Language Models (MDLLMs) achieve high-concurrency generation through parallel masked decoding, yet the architectures...

1 month ago cs.CV PDF

Benchmark MEDIUM

Unveiling the Resilience of LLM-Enhanced Search Engines against Black-Hat SEO Manipulation

Pei Chen, Geng Hong, Xinyi Wu +6 more

The emergence of Large Language Model-enhanced Search Engines (LLMSEs) has revolutionized information retrieval by integrating web-scale search...

1 month ago cs.CR cs.IR PDF

Benchmark LOW

AuthorityBench: Benchmarking LLM Authority Perception for Reliable Retrieval-Augmented Generation

Zhihui Yao, Hengran Zhang, Keping Bi

Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) with external knowledge but remains vulnerable to low-authority sources...

1 month ago cs.IR PDF

Benchmark LOW

From Weights to Concepts: Data-Free Interpretability of CLIP via Singular Vector Decomposition

Francesco Gentile, Nicola Dall'Asen, Francesco Tonini +3 more

As vision-language models are deployed at scale, understanding their internal mechanisms becomes increasingly critical. Existing interpretability...

1 month ago cs.CV PDF

Benchmark MEDIUM

Environment-Grounded Multi-Agent Workflow for Autonomous Penetration Testing

Michael Somma, Markus Großpointner, Paul Zabalegui +2 more

The increasing complexity and interconnectivity of digital infrastructures make scalable and reliable security assessment methods essential. Robotic...

1 month ago cs.RO cs.AI PDF

Benchmark MEDIUM

Walma: Learning to See Memory Corruption in WebAssembly

Oussama Draissi, Mark Günzel, Ahmad-Reza Sadeghi +1 more

WebAssembly's (Wasm) monolithic linear memory model facilitates memory corruption attacks that can escalate to cross-site scripting in browsers or go...

1 month ago cs.CR cs.LG PDF

Benchmark MEDIUM

Do World Action Models Generalize Better than VLAs? A Robustness Study

Zhanguang Zhang, Zhiyuan Li, Behnam Rahmati +10 more

Robot action planning in the real world is challenging as it requires not only understanding the current state of the environment but also predicting...

1 month ago cs.RO PDF

Benchmark MEDIUM

SecureBreak -- A dataset towards safe and secure models

Marco Arazzi, Vignesh Kumar Kembu, Antonino Nocera

Large language models are becoming pervasive core components in many real-world applications. As a consequence, security alignment represents a...

1 month ago cs.CR cs.AI cs.CL PDF

Benchmark LOW

Mirage: The Illusion of Visual Understanding

Mohammad Asadi, Jack W. O'Sullivan, Fang Cao +5 more

Multimodal AI systems have achieved remarkable performance across a broad range of real-world tasks, yet the mechanisms underlying visual-language...

1 month ago cs.AI PDF
