AI Security Research

2,529+ academic papers on AI security, attacks, and defenses

Total: 2,529
Attack: 969
Benchmark: 729
Defense: 345
Tool: 272
Survey: 142

Showing 2441–2460 of 2,529 papers

Benchmark LOW

EMO-TTA: Improving Test-Time Adaptation of Audio-Language Models for Speech Emotion Recognition

Jiacheng Shi, Hongfei Du, Y. Alicia Hong +1 more

Speech emotion recognition (SER) with audio-language models (ALMs) remains vulnerable to distribution shifts at test time, leading to performance...

7 months ago cs.SD cs.AI PDF

Other LOW

A Method for Quantifying Human Risk and a Blueprint for LLM Integration

Giuseppe Canale

This paper presents the Cybersecurity Psychology Framework (CPF), a novel methodology for quantifying human-centric vulnerabilities in security...

7 months ago cs.CR PDF

Attack HIGH

Fingerprinting LLMs via Prompt Injection

Yuepeng Hu, Zhengyuan Jiang, Mengyuan Li +4 more

Large language models (LLMs) are often modified after release through post-processing such as post-training or quantization, which makes it...

7 months ago cs.CR cs.CL PDF

Survey MEDIUM

Where LLM Agents Fail and How They can Learn From Failures

Kunlun Zhu, Zijia Liu, Bingxuan Li +15 more

Large Language Model (LLM) agents, which integrate planning, memory, reflection, and tool-use modules, have shown promise in solving complex,...

7 months ago cs.AI PDF

Attack LOW

Incentive-Aligned Multi-Source LLM Summaries

Yanchen Jiang, Zhe Feng, Aranyak Mehta

Large language models (LLMs) are increasingly used in modern search and answer systems to synthesize multiple, sometimes conflicting, texts into a...

7 months ago cs.CL cs.AI cs.GT PDF

Benchmark LOW

GHOST: Hallucination-Inducing Image Generation for Multimodal LLMs

Aryan Yazdan Parast, Parsa Hosseini, Hesam Asadollahzadeh +4 more

Object hallucination in Multimodal Large Language Models (MLLMs) is a persistent failure mode that causes the model to perceive objects absent in the...

7 months ago cs.CV cs.AI cs.LG PDF

Defense MEDIUM

A Hybrid CAPTCHA Combining Generative AI with Keystroke Dynamics for Enhanced Bot Detection

Ayda Aghaei Nia

Completely Automated Public Turing tests to tell Computers and Humans Apart (CAPTCHAs) are a foundational component of web security, yet traditional...

7 months ago cs.CR cs.AI PDF

Survey LOW

Who's Your Judge? On the Detectability of LLM-Generated Judgments

Dawei Li, Zhen Tan, Chengshuai Zhao +6 more

Large Language Model (LLM)-based judgments leverage powerful LLMs to efficiently evaluate candidate content and provide judgment scores. However, the...

7 months ago cs.AI PDF

Defense LOW

Towards Trustworthy Lexical Simplification: Exploring Safety and Efficiency with Small LLMs

Akio Hayakawa, Stefan Bott, Horacio Saggion

Despite their strong performance, large language models (LLMs) face challenges in real-world application of lexical simplification (LS), particularly...

7 months ago cs.CL PDF

Tool MEDIUM

A-MemGuard: A Proactive Defense Framework for LLM-Based Agent Memory

Qianshan Wei, Tengchao Yang, Yaochen Wang +7 more

Large Language Model (LLM) agents use memory to learn from past interactions, enabling autonomous planning and decision-making in complex...

7 months ago cs.CR cs.AI PDF

Attack HIGH

SecInfer: Preventing Prompt Injection via Inference-time Scaling

Yupei Liu, Yanting Wang, Yuqi Jia +2 more

Prompt injection attacks pose a pervasive threat to the security of Large Language Models (LLMs). State-of-the-art prevention-based defenses...

7 months ago cs.CR cs.AI PDF

Benchmark LOW

Between Help and Harm: An Evaluation of Mental Health Crisis Handling by LLMs

Adrian Arnaiz-Rodriguez, Miguel Baidal, Erik Derner +5 more

Large language model-powered chatbots have transformed how people seek information, especially in high-stakes contexts like mental health. Despite...

7 months ago cs.CL cs.CY PDF

Attack MEDIUM

PRIVMARK: Private Large Language Models Watermarking with MPC

Thomas Fargues, Ye Dong, Tianwei Zhang +1 more

The rapid growth of Large Language Models (LLMs) has highlighted the pressing need for reliable mechanisms to verify content ownership and ensure...

7 months ago cs.CR PDF

Attack HIGH

TokenSwap: Backdoor Attack on the Compositional Understanding of Large Vision-Language Models

Zhifang Zhang, Qiqi Tao, Jiaqi Lv +3 more

Large vision-language models (LVLMs) have achieved impressive performance across a wide range of vision-language tasks, while they remain vulnerable...

7 months ago cs.CV PDF

Survey LOW

Sanitize Your Responses: Mitigating Privacy Leakage in Large Language Models

Wenjie Fu, Huandong Wang, Junyao Gao +2 more

As Large Language Models (LLMs) achieve remarkable success across a wide range of applications, such as chatbots and code copilots, concerns...

7 months ago cs.CL cs.CR cs.LG PDF

Attack MEDIUM

FuncPoison: Poisoning Function Library to Hijack Multi-agent Autonomous Driving Systems

Yuzhen Long, Songze Li

Autonomous driving systems increasingly rely on multi-agent architectures powered by large language models (LLMs), where specialized agents...

7 months ago cs.CR cs.LG PDF

Attack MEDIUM

Dual Mechanisms of Value Expression: Intrinsic vs. Prompted Values in Large Language Models

Jongwook Han, Jongwon Lim, Injin Kong +1 more

Large language models can express values in two main ways: (1) intrinsic expression, reflecting the model's inherent values learned during training,...

7 months ago cs.CL cs.AI PDF

Defense MEDIUM

DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models

Zherui Li, Zheng Nie, Zhenhong Zhou +7 more

The rapid advancement of Diffusion Large Language Models (dLLMs) introduces unprecedented vulnerabilities that are fundamentally distinct from...

7 months ago cs.CL cs.AI PDF

Survey HIGH

When MCP Servers Attack: Taxonomy, Feasibility, and Mitigation

Weibo Zhao, Jiahao Liu, Bonan Ruan +2 more

Model Context Protocol (MCP) servers enable AI applications to connect to external systems in a plug-and-play manner, but their rapid proliferation...

7 months ago cs.CR cs.SE PDF

Attack MEDIUM

AdvChain: Adversarial Chain-of-Thought Tuning for Robust Safety Alignment of Large Reasoning Models

Zihao Zhu, Xinyu Wu, Gehan Hu +3 more

Large Reasoning Models (LRMs) have demonstrated remarkable capabilities in complex problem-solving through Chain-of-Thought (CoT) reasoning. However,...

7 months ago cs.AI cs.CL PDF
