AI Security Research

2,589+ academic papers on AI security, attacks, and defenses

Total: 2,589
Attack: 998
Benchmark: 740
Defense: 355
Tool: 276
Survey: 147

Showing 1221–1240 of 1,931 papers

Defense LOW

Semantic Uncertainty Quantification of Hallucinations in LLMs: A Quantum Tensor Network Based Method

Pragatheeswaran Vipulanandan, Kamal Premaratne, Dilip Sarkar

Large language models (LLMs) exhibit strong generative capabilities but remain vulnerable to confabulations, fluent yet unreliable outputs that vary...

3 months ago cs.CL PDF

Benchmark MEDIUM

VoxMorph: Scalable Zero-shot Voice Identity Morphing via Disentangled Embeddings

Bharath Krishnamurthy, Ajita Rattani

Morphing techniques generate artificial biometric samples that combine features from multiple individuals, allowing each contributor to be verified...

3 months ago cs.SD cs.CR cs.LG PDF

Benchmark MEDIUM

Benchmarking LLAMA Model Security Against OWASP Top 10 For LLM Applications

Nourin Shahin, Izzat Alsmadi

As large language models (LLMs) move from research prototypes to enterprise systems, their security vulnerabilities pose serious risks to data...

3 months ago cs.CR cs.LG PDF

Tool MEDIUM

RvB: Automating AI System Hardening via Iterative Red-Blue Games

Lige Huang, Zicheng Liu, Jie Zhang +3 more

The dual offensive and defensive utility of Large Language Models (LLMs) highlights a critical gap in AI security: the lack of unified frameworks for...

3 months ago cs.CR cs.AI cs.CL PDF

Benchmark MEDIUM

Automated Safety Benchmarking: A Multi-agent Pipeline for LVLMs

Xiangyang Zhu, Yuan Tian, Zicheng Zhang +6 more

Large vision-language models (LVLMs) exhibit remarkable capabilities in cross-modal tasks but face significant safety challenges, which undermine...

3 months ago cs.CL PDF

Attack HIGH

LLM-VA: Resolving the Jailbreak-Overrefusal Trade-off via Vector Alignment

Haonan Zhang, Dongxia Wang, Yi Liu +2 more

Safety-aligned LLMs suffer from two failure modes: jailbreak (answering harmful inputs) and over-refusal (declining benign queries). Existing vector...

3 months ago cs.LG cs.AI PDF

Defense MEDIUM

From Internal Diagnosis to External Auditing: A VLM-Driven Paradigm for Online Test-Time Backdoor Defense

Binyan Xu, Fan Yang, Xilin Dai +2 more

Deep Neural Networks remain inherently vulnerable to backdoor attacks. Traditional test-time defenses largely operate under the paradigm of internal...

3 months ago cs.LG cs.CR PDF

Defense LOW

An Agentic AI Control Plane for 6G Network Slice Orchestration, Monitoring, and Trading

Eranga Bandara, Ross Gore, Sachin Shetty +9 more

6G networks are expected to be AI-native, intent-driven, and economically programmable, requiring fundamentally new approaches to network slice...

3 months ago cs.NI cs.AI PDF

Defense LOW

Epistemic Traps: Rational Misalignment Driven by Model Misspecification

Xingcheng Xu, Jingjing Qu, Qiaosheng Zhang +4 more

The rapid deployment of Large Language Models and AI agents across critical societal and technical domains is hindered by persistent behavioral...

3 months ago cs.AI cs.CL cs.LG PDF

Benchmark MEDIUM

Selective Steering: Norm-Preserving Control Through Discriminative Layer Selection

Quy-Anh Dang, Chris Ngo

Despite significant progress in alignment, large language models (LLMs) remain vulnerable to adversarial attacks that elicit harmful behaviors....

3 months ago cs.LG cs.AI PDF

Benchmark MEDIUM

VoxPrivacy: A Benchmark for Evaluating Interactional Privacy of Speech Language Models

Yuxiang Wang, Hongyu Liu, Dekun Chen +2 more

As Speech Language Models (SLMs) transition from personal devices to shared, multi-user environments such as smart homes, a new challenge emerges:...

3 months ago eess.AS cs.AI cs.SD PDF

Attack MEDIUM

LLMs Can Unlearn Refusal with Only 1,000 Benign Samples

Yangyang Guo, Ziwei Xu, Si Liu +2 more

This study reveals a previously unexplored vulnerability in the safety alignment of Large Language Models (LLMs). Existing aligned LLMs predominantly...

3 months ago cs.CR PDF

Attack MEDIUM

Contrastive Spectral Rectification: Test-Time Defense towards Zero-shot Adversarial Robustness of CLIP

Sen Nie, Jie Zhang, Zhuo Wang +2 more

Vision-language models (VLMs) such as CLIP have demonstrated remarkable zero-shot generalization, yet remain highly vulnerable to adversarial...

3 months ago cs.CV PDF

Benchmark LOW

Do Images Speak Louder than Words? Investigating the Effect of Textual Misinformation in VLMs

Chi Zhang, Wenxuan Ding, Jiale Liu +3 more

Vision-Language Models (VLMs) have shown strong multimodal reasoning capabilities on Visual-Question-Answering (VQA) benchmarks. However, their...

3 months ago cs.CL PDF

Tool HIGH

SHIELD: An Auto-Healing Agentic Defense Framework for LLM Resource Exhaustion Attacks

Nirhoshan Sivaroopan, Kanchana Thilakarathna, Albert Zomaya +6 more

Sponge attacks increasingly threaten LLM systems by inducing excessive computation and DoS. Existing defenses either rely on statistical filters that...

3 months ago cs.CR cs.AI PDF

Survey MEDIUM

AgenticSCR: An Autonomous Agentic Secure Code Review for Immature Vulnerabilities Detection

Wachiraphan Charoenwet, Kla Tantithamthavorn, Patanamon Thongtanunam +3 more

Secure code review is critical at the pre-commit stage, where vulnerabilities must be caught early under tight latency and limited-context...

3 months ago cs.CR cs.AI cs.LG PDF

Tool MEDIUM

Evaluating Nova 2.0 Lite model under Amazon's Frontier Model Safety Framework

Satyapriya Krishna, Matteo Memelli, Tong Wang +5 more

Amazon published its Frontier Model Safety Framework (FMSF) as part of the Paris AI summit, following which we presented a report on Amazon's Premier...

3 months ago cs.CR cs.SE PDF

Attack HIGH

Thought-Transfer: Indirect Targeted Poisoning Attacks on Chain-of-Thought Reasoning Models

Harsh Chaudhari, Ethan Rathbun, Hanna Foerster +5 more

Chain-of-Thought (CoT) reasoning has emerged as a powerful technique for enhancing large language models' capabilities by generating intermediate...

3 months ago cs.CR cs.LG PDF

Defense MEDIUM

Proactive Hardening of LLM Defenses with HASTE

Henry Chen, Victor Aranda, Samarth Keshari +2 more

Prompt-based attack techniques are one of the primary challenges in securely deploying and protecting LLM-based AI systems. LLM inputs are an...

3 months ago cs.CR PDF

Benchmark MEDIUM

Malicious Repurposing of Open Science Artefacts by Using Large Language Models

Zahra Hashemi, Zhiqiang Zhong, Jun Pang +1 more

The rapid evolution of large language models (LLMs) has fuelled enthusiasm about their role in advancing scientific discovery, with studies exploring...

3 months ago cs.CL PDF
