Emoji-Based Jailbreaking of Large Language Models
M P V S Gopinadh, S Mahaboob Hussain
Large Language Models (LLMs) are integral to modern AI applications, but their safety alignment mechanisms can be bypassed through adversarial prompt...
Zhenhong Zhou, Shilinlu Yan, Chuanpu Liu +3 more
Large language models (LLMs) are increasingly deployed in cost-sensitive and on-device scenarios, and safety guardrails have advanced mainly in...
Yueyan Dong, Minghui Xu, Qin Hu +5 more
Low-Rank Adaptation (LoRA) has become a popular solution for fine-tuning large language models (LLMs) in federated settings, dramatically reducing...
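The low-rank update behind LoRA is compact enough to show inline. The sketch below is a generic, minimal PyTorch illustration of the core idea (a frozen weight matrix plus a trainable rank-r update scaled by alpha/r), not the federated fine-tuning method studied in the entry above; the layer sizes and hyperparameters are placeholder assumptions.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with a trainable low-rank update:
    y = base(x) + (alpha / r) * B(A(x))."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # original weights stay frozen
        self.lora_A = nn.Linear(base.in_features, r, bias=False)
        self.lora_B = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_B.weight)   # update starts at zero
        self.scaling = alpha / r

    def forward(self, x):
        return self.base(x) + self.scaling * self.lora_B(self.lora_A(x))

# Usage: only the small A/B matrices are trained (and, in a federated
# setting, exchanged between clients), which is where the savings come from.
layer = LoRALinear(nn.Linear(768, 768), r=8)
out = layer(torch.randn(4, 768))
```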
Vidyut Sriram, Sawan Pandita, Achintya Lakshmanan +2 more
Large Language Models (LLMs) can generate code but often introduce security vulnerabilities, logical inconsistencies, and compilation errors. Prior...
Hyunjun Kim
Guardrail models are essential for ensuring the safety of Large Language Model (LLM) deployments, but processing full multi-turn conversation...
Muhammad Bilal, Omer Tariq, Hasan Ahmed
Timing and burst patterns can leak through encryption, and an adaptive adversary can exploit them. This undermines metadata-only detection in a...
Md Mahbub Hasan, Marcus Sternhagen, Krishna Chandra Roy
Additive manufacturing (AM) is rapidly integrating into critical sectors such as aerospace, automotive, and healthcare. However, this cyber-physical...
Nandish Chattopadhyay, Abdul Basit, Amira Guesmi +3 more
Adversarial attacks pose a significant challenge to the reliable deployment of machine learning models in EdgeAI applications, such as autonomous...
Sixue Xing, Xuanye Xia, Kerui Wu +3 more
Clinical trial failure remains a central bottleneck in drug development, where minor protocol design flaws can irreversibly compromise outcomes...
Weijie Wang, Peizhuo Lv, Yan Wang +7 more
Graph Retrieval-Augmented Generation (GraphRAG) has emerged as a key technique for enhancing Large Language Models (LLMs) with proprietary Knowledge...
Fumiya Morimoto, Ryuto Morita, Satoshi Ono
Deep neural network-based classifiers are prone to errors when processing adversarial examples (AEs). AEs are minimally perturbed input data...
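To make "minimally perturbed input data" concrete, here is a minimal sketch of the classic Fast Gradient Sign Method (Goodfellow et al.), one standard way adversarial examples are crafted. It illustrates the general concept only, not the method of the entry above; `model`, `eps`, and the [0, 1] pixel range are placeholder assumptions.

```python
import torch
import torch.nn.functional as F

def fgsm_example(model, x, label, eps=8 / 255):
    """Fast Gradient Sign Method: nudge x by eps in the direction that
    increases the classifier's loss, keeping the change visually small."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    x_adv = x + eps * x.grad.sign()          # one signed-gradient step
    return x_adv.clamp(0.0, 1.0).detach()    # keep pixels in a valid range
```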
Md Hasan Saju, Maher Muhtadi, Akramul Azim
The rapid advancement of Large Language Models (LLMs) presents new opportunities for automated software vulnerability detection, a crucial task in...
Haoran Gu, Handing Wang, Yi Mei +2 more
The widespread deployment of large language models (LLMs) has raised growing concerns about their misuse risks and associated safety issues. While...
Xiaoze Liu, Weichen Yu, Matt Fredrikson +2 more
The open-weight language model ecosystem is increasingly defined by model composition techniques (such as weight merging, speculative decoding, and...
Yuchao Hou, Zixuan Zhang, Jie Wang +9 more
As a critical application of computational intelligence in remote sensing, deep learning-based synthetic aperture radar (SAR) image target...
Yiming Liang, Yizhi Li, Yantao Du +14 more
Benchmarks play a crucial role in tracking the rapid advancement of large language models (LLMs) and identifying their capability boundaries....
Bohan Liang, Zijian Chen, Qi Jia +3 more
Stock prediction, a subject closely related to people's investment activities in fully dynamic and live environments, has been widely studied....
Manish Bhatt, Adrian Wood, Idan Habler +1 more
Production LLM agents with tool-using capabilities require security testing despite their safety training. We adapt Go-Explore to evaluate...
Muhammad Abdullahi Said, Muhammad Sammani Sani
As Large Language Models (LLMs) integrate into critical global infrastructure, the assumption that safety alignment transfers zero-shot from English...
Ruben Neyroud, Sam Corley
While most LLMs are autoregressive, diffusion-based LLMs have recently emerged as an alternative method for generation. Greedy Coordinate Gradient...
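Greedy Coordinate Gradient (GCG) is best known as a gradient-guided token-substitution attack on autoregressive LLMs (Zou et al., 2023). The sketch below shows only that core candidate-scoring step as background for the entry above, not its diffusion-LLM adaptation; the HuggingFace-style `model(inputs_embeds=...)` interface, the `suffix_slice`, and the assumption that `target_ids` sit at the end of `input_ids` are all placeholders.

```python
import torch
import torch.nn.functional as F

def gcg_candidate_swaps(model, embed_matrix, input_ids, suffix_slice,
                        target_ids, top_k=256):
    """Core of a Greedy Coordinate Gradient step: rank candidate token swaps
    in an adversarial suffix by the gradient of the target-continuation loss
    taken with respect to one-hot token inputs.

    Assumed interface: model(inputs_embeds=...) returns an object with
    .logits of shape [1, seq_len, vocab], as HuggingFace causal LMs do."""
    vocab_size, _ = embed_matrix.shape
    one_hot = F.one_hot(input_ids, vocab_size).float()
    one_hot.requires_grad_(True)

    logits = model(inputs_embeds=(one_hot @ embed_matrix).unsqueeze(0)).logits
    # Loss of the forced target tokens appended at the end of the sequence;
    # logits at position i predict the token at position i + 1.
    tgt_logits = logits[0, -target_ids.numel() - 1:-1]
    loss = F.cross_entropy(tgt_logits, target_ids)
    loss.backward()

    # For each suffix position, the most promising replacement tokens are
    # those whose one-hot coordinates have the most negative gradient.
    grad = one_hot.grad[suffix_slice]
    return (-grad).topk(top_k, dim=-1).indices   # [suffix_len, top_k]
```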