AI Security Research

2,529+ academic papers on AI security, attacks, and defenses

Total

2,529

Attack

969

Benchmark

729

Defense

345

Tool

272

Survey

142

Type

All Attack Defense Survey Benchmark Tool

Relevance

All High Medium

Date

All time 7 days 30 days 6 months

Showing 101–120 of 919 papers

Clear filters

Benchmark MEDIUM

Towards Secure Logging: Characterizing and Benchmarking Logging Code Security Issues with LLMs

He Yang Yuan, Xin Wang, Kundi Yao +3 more

Logging code plays an important role in software systems by recording key events and behaviors, which are essential for debugging and monitoring....

3 weeks ago cs.SE cs.AI cs.CR PDF

Benchmark MEDIUM

Indic-CodecFake meets SATYAM: Towards Detecting Neural Audio Codec Synthesized Speech Deepfakes in Indic Languages

Girish, Mohd Mujtaba Akhtar, Orchid Chetia Phukan +1 more

The rapid advancement of Audio Large Language Models (ALMs), driven by Neural Audio Codecs (NACs), has led to the emergence of highly realistic...

3 weeks ago eess.AS PDF

Benchmark MEDIUM

An AI Agent Execution Environment to Safeguard User Data

Robert Stanley, Avi Verma, Lillian Tsai +2 more

AI agents promise to serve as general-purpose personal assistants for their users, which requires them to have access to private user data (e.g.,...

3 weeks ago cs.CR cs.AI cs.OS PDF

Benchmark MEDIUM

Cyber Defense Benchmark: Agentic Threat Hunting Evaluation for LLMs in SecOps

Alankrit Chona, Igor Kozlov, Ambuj Kumar

We introduce the Cyber Defense Benchmark, a benchmark for measuring how well large language model (LLM) agents perform the core SOC analyst task of...

3 weeks ago cs.CR cs.AI PDF

Benchmark MEDIUM

Cyber Defense Benchmark: Agentic Threat Hunting Evaluation for LLMs in SecOps

Alankrit Chona, Igor Kozlov, Ambuj Kumar

We introduce the Cyber Defense Benchmark, a benchmark for measuring how well large language model (LLM) agents perform the core SOC analyst task of...

3 weeks ago cs.CR cs.AI PDF

Defense MEDIUM

Evaluating LLM-Generated Obfuscated XSS Payloads for Machine Learning-Based Detection

Divyesh Gabbireddy, Suman Saha

Cross-site scripting (XSS) remains a persistent web security vulnerability, especially because obfuscation can change the surface form of a malicious...

3 weeks ago cs.CR cs.LG cs.SE PDF

Defense MEDIUM

Malicious ML Model Detection by Learning Dynamic Behaviors

Sarang Nambiar, Dhruv Pradhan, Ezekiel Soremekun

Pre-trained machine learning models (PTMs) are commonly provided via Model Hubs (e.g., Hugging Face) in standard formats like Pickles to facilitate...

3 weeks ago cs.CR cs.SE PDF

Benchmark MEDIUM

Do Agents Dream of Root Shells? Partial-Credit Evaluation of LLM Agents in Capture The Flag Challenges

Ali Al-Kaswan, Maksim Plotnikov, Maxim Hájek +3 more

Large Language Model (LLM) agents are increasingly proposed for autonomous cybersecurity tasks, but their capabilities in realistic offensive...

3 weeks ago cs.AI cs.CR cs.SE PDF

Defense MEDIUM

ProjLens: Unveiling the Role of Projectors in Multimodal Model Safety

Kun Wang, Cheng Qian, Miao Yu +6 more

Multimodal Large Language Models (MLLMs) have achieved remarkable success in cross-modal understanding and generation, yet their deployment is...

3 weeks ago cs.CR cs.AI PDF

Defense MEDIUM

Mechanistic Anomaly Detection via Functional Attribution

Hugo Lyons Keenan, Christopher Leckie, Sarah Erfani

We can often verify the correctness of neural network outputs using ground truth labels, but we cannot reliably determine whether the output was...

3 weeks ago cs.LG cs.CR PDF

Benchmark MEDIUM

Towards Understanding the Robustness of Sparse Autoencoders

Ahson Saiyed, Sabrina Sadiekh, Chirag Agarwal

Large Language Models (LLMs) remain vulnerable to optimization-based jailbreak attacks that exploit internal gradient structure. While Sparse...

3 weeks ago cs.LG cs.AI cs.CL PDF

Attack MEDIUM

Beyond Indistinguishability: Measuring Extraction Risk in LLM APIs

Ruixuan Liu, David Evans, Li Xiong

Indistinguishability properties such as differential privacy bounds or low empirically measured membership inference are widely treated as proxies to...

3 weeks ago cs.CR cs.CL cs.LG PDF

Benchmark MEDIUM

AgenTEE: Confidential LLM Agent Execution on Edge Devices

Sina Abdollahi, Mohammad M Maheri, Javad Forough +5 more

Large Language Model (LLM) agents provide powerful automation capabilities, but they also create a substantially broader attack surface than...

3 weeks ago cs.CR cs.OS PDF

Defense MEDIUM

Committed SAE-Feature Traces for Audited-Session Substitution Detection in Hosted LLMs

Ziyang Liu

Hosted-LLM providers have a silent-substitution incentive: advertise a stronger model while serving cheaper replies. Probe-after-return schemes such...

3 weeks ago cs.CR cs.AI PDF

Defense MEDIUM

Owner-Harm: A Missing Threat Model for AI Agent Safety

Dongcheng Zhang, Yiqing Jiang

Existing AI agent safety benchmarks focus on generic criminal harm (cybercrime, harassment, weapon synthesis), leaving a systematic blind spot for a...

3 weeks ago cs.CR cs.AI cs.CL PDF

Benchmark MEDIUM

Spatiotemporal Sycophancy: Negation-Based Gaslighting in Video Large Language Models

Ziyao Tang, Pengkun Jiao, Bin Zhu +3 more

Video Large Language Models (Vid-LLMs) have demonstrated remarkable performance in video understanding tasks, yet their robustness under...

3 weeks ago cs.CV PDF

Defense MEDIUM

TitanCA: Lessons from Orchestrating LLM Agents to Discover 100+ CVEs

Ting Zhang, Yikun Li, Chengran Yang +15 more

Software vulnerabilities remain one of the most persistent threats to modern digital infrastructure. While static application security testing (SAST)...

3 weeks ago cs.CR PDF

Benchmark MEDIUM

Privacy-Preserving Product-Quantized Approximate Nearest Neighbor Search Framework for Large-scale Datasets via A Hybrid of Fully Homomorphic Encryption and Trusted Execution Environment

Shozo Saeki, Minoru Kawahara, Hirohisa Aman

A nearest-neighbor framework is a fundamental tool for various applications involving Large Language Models (LLMs) and Visual Language Models (VLMs)....

3 weeks ago cs.CR PDF

Tool MEDIUM

Reverse Constitutional AI: A Framework for Controllable Toxic Data Generation via Probability-Clamped RLAIF

Yuan Fang, Yiming Luo, Aimin Zhou +1 more

Ensuring the safety of large language models (LLMs) requires robust red teaming, yet the systematic synthesis of high-quality toxic data remains...

3 weeks ago cs.CL cs.AI PDF

Benchmark MEDIUM

SDLLMFuzz: Dynamic-static LLM-assisted greybox fuzzing for structured input programs

Yihao Zou, Tianming Zheng, Futai Zou +1 more

Fuzzing has become a widely adopted technique for vulnerability discovery, yet it remains ineffective for structured-input programs due to strict...

3 weeks ago cs.CR cs.PL PDF

Track AI security vulnerabilities in real time

Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.

Start 14-Day Free Trial