AI Security Research

2,529+ academic papers on AI security, attacks, and defenses

Total

2,529

Attack

969

Benchmark

729

Defense

345

Tool

272

Survey

142

Type

All Attack Defense Survey Benchmark Tool

Relevance

All High Medium

Date

All time 7 days 30 days 6 months

Showing 101–120 of 345 papers

Clear filters

Defense LOW

Hierarchical Decoding for Discrete Speech Synthesis with Multi-Resolution Spoof Detection

Junchuan Zhao, Minh Duc Vu, Ye Wang

Neural codec language models enable high-quality discrete speech synthesis, yet their inference remains vulnerable to token-level artifacts and...

2 months ago cs.SD eess.AS PDF

Defense MEDIUM

ThaiSafetyBench: Assessing Language Model Safety in Thai Cultural Contexts

Trapoom Ukarapol, Nut Chukamphaeng, Kunat Pipatanakul +1 more

The safety evaluation of large language models (LLMs) remains largely centered on English, leaving non-English languages and culturally grounded...

2 months ago cs.CL PDF

Defense MEDIUM

Steering Frozen LLMs: Adaptive Social Alignment via Online Prompt Routing

Zeyu Zhang, Xiangxiang Dai, Ziyi Han +2 more

Large language models (LLMs) are typically governed by post-training alignment (e.g., RLHF or DPO), which yields a largely static policy during...

2 months ago cs.LG cs.AI PDF

Defense LOW

Molt Dynamics: Emergent Social Phenomena in Autonomous AI Agent Populations

Brandon Yee, Krishna Sharma

MoltBook is a large-scale multi-agent coordination environment where over 770,000 autonomous LLM agents interact without human participation,...

2 months ago cs.MA cs.AI cs.SI PDF

Defense LOW

RIVA: Leveraging LLM Agents for Reliable Configuration Drift Detection

Sami Abuzakuk, Lucas Crijns, Anne-Marie Kermarrec +2 more

Infrastructure as code (IaC) tools automate cloud provisioning but verifying that deployed systems remain consistent with the IaC specifications...

2 months ago cs.SE cs.AI cs.MA PDF

Defense LOW

ZeroDayBench: Evaluating LLM Agents on Unseen Zero-Day Vulnerabilities for Cyberdefense

Nancy Lau, Louis Sloot, Jyoutir Raj +6 more

Large language models (LLMs) are increasingly being deployed as software engineering agents that autonomously contribute to repositories. A major...

2 months ago cs.CR cs.AI PDF

Defense MEDIUM

Inference-Time Safety For Code LLMs Via Retrieval-Augmented Revision

Manisha Mukherjee, Vincent J. Hellendoorn

Large Language Models (LLMs) are increasingly deployed for code generation in high-stakes software development, yet their limited transparency in...

2 months ago cs.SE cs.AI cs.CR PDF

Defense MEDIUM

Pragma-VL: Towards a Pragmatic Arbitration of Safety and Helpfulness in MLLMs

Ming Wen, Kun Yang, Xin Chen +4 more

Multimodal Large Language Models (MLLMs) pose critical safety challenges, as they are susceptible not only to adversarial attacks such as...

2 months ago cs.LG cs.AI PDF

Defense MEDIUM

TAS-GNN: A Status-Aware Signed Graph Neural Network for Anomaly Detection in Bitcoin Trust Systems

Chang Xue, Fang Liu, Jiaye Wang +2 more

Decentralized financial platforms rely heavily on Web of Trust reputation systems to mitigate counterparty risk in the absence of centralized...

2 months ago cs.CR cs.AI cs.LG PDF

Defense LOW

Look Carefully: Adaptive Visual Reinforcements in Multimodal Large Language Models for Hallucination Mitigation

Xingyu Zhu, Kesen Zhao, Liang Yi +4 more

Multimodal large language models (MLLMs) have achieved remarkable progress in vision-language reasoning, yet they remain vulnerable to hallucination,...

2 months ago cs.CV PDF

Defense LOW

LLM-Powered Silent Bug Fuzzing in Deep Learning Libraries via Versatile and Controlled Bug Transfer

Kunpeng Zhang, Dongwei Xiao, Daoyuan Wu +5 more

Deep learning (DL) libraries are widely used in critical applications, where even subtle silent bugs can lead to serious consequences. While existing...

2 months ago cs.SE PDF

Defense MEDIUM

Secure Semantic Communications via AI Defenses: Fundamentals, Solutions, and Future Directions

Lan Zhang, Chengsi Liang, Zeming Zhuang +4 more

Semantic communication (SemCom) redefines wireless communication from reproducing symbols to transmitting task-relevant semantics. However, this...

2 months ago cs.CR eess.SY PDF

Defense MEDIUM

MemoPhishAgent: Memory-Augmented Multi-Modal LLM Agent for Phishing URL Detection

Xuan Chen, Hao Liu, Tao Yuan +3 more

Traditional phishing website detection relies on static heuristics or reference lists, which lag behind rapidly evolving attacks. While recent...

2 months ago cs.CR PDF

Defense MEDIUM

Alignment-Weighted DPO: A principled reasoning approach to improve safety alignment

Mengxuan Hu, Vivek V. Datla, Anoop Kumar +4 more

Recent advances in alignment techniques such as Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and Direct...

2 months ago cs.CL cs.AI PDF

Defense MEDIUM

A Lightweight Defense Mechanism against Next Generation of Phishing Emails using Distilled Attention-Augmented BiLSTM

Morteza Eskandarian, Mahdi Rabbani, Arun Kaniyamattam +6 more

The current generation of large language models produces sophisticated social-engineering content that bypasses standard text screening systems in...

2 months ago cs.CR PDF

Defense MEDIUM

MANATEE: Inference-Time Lightweight Diffusion Based Safety Defense for LLMs

Chun Yan Ryan Kan, Tommy Tran, Vedant Yadav +4 more

Defending LLMs against adversarial jailbreak attacks remains an open challenge. Existing defenses rely on binary classifiers that fail when...

2 months ago cs.CR cs.AI cs.CL PDF

Defense LOW

Mining Type Constructs Using Patterns in AI-Generated Code

Imgyeong Lee, Tayyib Ul Hassan, Abram Hindle

Artificial Intelligence (AI) increasingly automates various parts of the software development tasks. Although AI has enhanced the productivity of...

2 months ago cs.SE PDF

Defense MEDIUM

Fail-Closed Alignment for Large Language Models

Zachary Coalson, Beth Sohler, Aiden Gabriel +1 more

We identify a structural weakness in current large language model (LLM) alignment: modern refusal mechanisms are fail-open. While existing approaches...

2 months ago cs.LG cs.CR PDF

Defense MEDIUM

NeST: Neuron Selective Tuning for LLM Safety

Sasha Behrouzi, Lichao Wu, Mohamadreza Rostami +1 more

Safety alignment is essential for the responsible deployment of large language models (LLMs). Yet, existing approaches often rely on heavyweight...

2 months ago cs.CR cs.LG PDF

Defense LOW

Agentic AI, Medical Morality, and the Transformation of the Patient-Physician Relationship

Robert Ranisch, Sabine Salloch

The emergence of agentic AI marks a new phase in the digital transformation of healthcare. Distinct from conventional generative AI, agentic AI...

2 months ago cs.CY PDF

Track AI security vulnerabilities in real time

Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.

Start 14-Day Free Trial