SEAL: Subspace-Anchored Watermarks for LLM Ownership
Yanbo Dai, Zongjie Li, Zhenlan Ji +1 more
Large language models (LLMs) have achieved remarkable success across a wide range of natural language processing tasks, demonstrating human-level...
Lama Sleem, Jerome Francois, Lujun Li +3 more
Jailbreak attacks designed to bypass safety mechanisms pose a serious threat by prompting LLMs to generate harmful or inappropriate content, despite...
Shaowei Guan, Hin Chi Kwok, Ngai Fong Law +3 more
Retrieval-augmented generation (RAG) has rapidly emerged as a transformative approach for integrating large language models into clinical and...
Ruoxi Cheng, Haoxuan Ma, Teng Ma +1 more
Large Vision-Language Models (LVLMs) exhibit powerful reasoning capabilities but suffer from sophisticated jailbreak vulnerabilities. Fundamentally,...
Biagio Boi, Christian Esposito
Smart contracts have emerged as key components within decentralized environments, enabling the automation of transactions through self-executing...
Farhad Abtahi, Fernando Seoane, Iván Pau +1 more
Healthcare AI systems are highly vulnerable to data poisoning attacks that current defenses and regulations cannot adequately address. We analyzed eight...
Zichao Wei, Jun Zeng, Ming Wen +8 more
Software vulnerabilities are increasing at an alarming rate. However, manual patching is both time-consuming and resource-intensive, while existing...
Feilong Wang, Fuqiang Liu
The integration of large language models (LLMs) into automated driving systems has opened new possibilities for reasoning and decision-making by...
Guangke Chen, Yuhui Wang, Shouling Ji +2 more
Modern text-to-speech (TTS) systems, particularly those built on Large Audio-Language Models (LALMs), generate high-fidelity speech that faithfully...
Dennis Wei, Ronny Luss, Xiaomeng Hu +6 more
Large Language Models (LLMs) have become ubiquitous in everyday life and are entering higher-stakes applications ranging from summarizing meeting...
Fred Heiding, Simon Lermen
We present an end-to-end demonstration of how attackers can exploit AI safety failures to harm vulnerable populations: from jailbreaking LLMs to...
Runpeng Geng, Yanting Wang, Chenlong Yin +3 more
Long-context LLMs are vulnerable to prompt injection, where an attacker can inject an instruction into a long context to induce an LLM to generate an...
Srikant Panda, Avinash Rai
Large Language Models (LLMs) are commonly evaluated for robustness against paraphrased or semantically equivalent jailbreak prompts, yet little...
Shuaitong Liu, Renjue Li, Lijia Yu +3 more
Recent advances in Chain-of-Thought (CoT) prompting have substantially improved the reasoning capabilities of large language models (LLMs), but have...
Yuping Yan, Yuhan Xie, Yuanshuai Li +3 more
Since Multimodal Large Language Models (MLLMs) are increasingly being integrated into everyday tools and intelligent agents, growing concerns have...
Yudong Yang, Xuezhen Zhang, Zhifeng Han +6 more
Recent progress in LLMs has enabled understanding of audio signals, but has also exposed new safety risks arising from complex audio inputs that are...