AI Security Research

2,560+ academic papers on AI security, attacks, and defenses

Total

2,560

Attack

982

Benchmark

736

Defense

350

Tool

275

Survey

144

Type

All Attack Defense Survey Benchmark Tool

Relevance

All High Medium

Date

All time 7 days 30 days 6 months

Showing 901–920 of 1,220 papers

Clear filters

Defense MEDIUM

When Harmless Words Harm: A New Threat to LLM Safety via Conceptual Triggers

Zhaoxin Zhang, Borui Chen, Yiming Hu +3 more

Recent research on large language model (LLM) jailbreaks has primarily focused on techniques that bypass safety mechanisms to elicit overtly harmful...

5 months ago cs.CL PDF

Tool MEDIUM

Trustworthy GenAI over 6G: Integrated Applications and Security Frameworks

Bui Duc Son, Trinh Van Chien, Dong In Kim

The integration of generative artificial intelligence (GenAI) into 6G networks promises substantial performance gains while simultaneously exposing...

5 months ago cs.CR cs.IT PDF

Benchmark MEDIUM

Can MLLMs Detect Phishing? A Comprehensive Security Benchmark Suite Focusing on Dynamic Threats and Multimodal Evaluation in Academic Environments

Jingzhuo Zhou

The rapid proliferation of Multimodal Large Language Models (MLLMs) has introduced unprecedented security challenges, particularly in phishing...

5 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Critical Evaluation of Quantum Machine Learning for Adversarial Robustness

Saeefa Rubaiyet Nowmi, Jesus Lopez, Md Mahmudul Alam Imon +2 more

Quantum Machine Learning (QML) integrates quantum computational principles into learning algorithms, offering improved representational capacity and...

5 months ago cs.CR PDF

Benchmark MEDIUM

Harmful Traits of AI Companions

W. Bradley Knox, Katie Bradford, Samanta Varela Castro +6 more

Amid the growing prevalence of human-AI interaction, large language models and other AI-based entities increasingly provide forms of companionship to...

5 months ago cs.HC cs.AI PDF

Benchmark MEDIUM

FLARE: Adaptive Multi-Dimensional Reputation for Robust Client Reliability in Federated Learning

Abolfazl Younesi, Leon Kiss, Zahra Najafabadi Samani +2 more

Federated learning (FL) enables collaborative model training while preserving data privacy. However, it remains vulnerable to malicious clients who...

5 months ago cs.LG cs.AI cs.CR PDF

Benchmark MEDIUM

ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning

Hongwei Liu, Junnan Liu, Shudong Liu +33 more

The rapid advancement of Large Language Models (LLMs) has led to performance saturation on many established benchmarks, questioning their ability to...

5 months ago cs.CL PDF

Defense MEDIUM

N-GLARE: An Non-Generative Latent Representation-Efficient LLM Safety Evaluator

Zheyu Lin, Jirui Yang, Yukui Qiu +3 more

Evaluating the safety robustness of LLMs is critical for their deployment. However, mainstream Red Teaming methods rely on online generation and...

5 months ago cs.LG cs.CR PDF

Defense MEDIUM

Certified but Fooled! Breaking Certified Defences with Ghost Certificates

Quoc Viet Vo, Tashreque M. Haq, Paul Montague +3 more

Certified defenses promise provable robustness guarantees. We study the malicious exploitation of probabilistic certification frameworks to better...

5 months ago cs.LG cs.CR cs.CV PDF

Benchmark MEDIUM

Tight and Practical Privacy Auditing for Differentially Private In-Context Learning

Yuyang Xia, Ruixuan Liu, Li Xiong

Large language models (LLMs) perform in-context learning (ICL) by adapting to tasks from prompt demonstrations, which in practice often contain...

5 months ago cs.CR PDF

Attack MEDIUM

DualTAP: A Dual-Task Adversarial Protector for Mobile MLLM Agents

Fuyao Zhang, Jiaming Zhang, Che Wang +6 more

The reliance of mobile GUI agents on Multimodal Large Language Models (MLLMs) introduces a severe privacy vulnerability: screenshots containing...

5 months ago cs.CR PDF

Benchmark MEDIUM

SmartPoC: Generating Executable and Validated PoCs for Smart Contract Bug Reports

Longfei Chen, Ruibin Yan, Taiyu Wong +2 more

Smart contracts are prone to vulnerabilities and are analyzed by experts as well as automated systems, such as static analysis and AI-assisted...

5 months ago cs.SE cs.CR PDF

Benchmark MEDIUM

Privacy-Preserving Federated Learning from Partial Decryption Verifiable Threshold Multi-Client Functional Encryption

Minjie Wang, Jinguang Han, Weizhi Meng

In federated learning, multiple parties can cooperate to train the model without directly exchanging their own private data, but the gradient leakage...

5 months ago cs.CR cs.AI PDF

Attack MEDIUM

Efficient Adversarial Malware Defense via Trust-Based Raw Override and Confidence-Adaptive Bit-Depth Reduction

Ayush Chaudhary, Sisir Doppalpudi

The deployment of robust malware detection systems in big data environments requires careful consideration of both security effectiveness and...

5 months ago cs.CR cs.LG PDF

Attack MEDIUM

LLM Reinforcement in Context

Thomas Rivasseau

Current Large Language Model alignment research mostly focuses on improving model robustness against adversarial attacks and misbehavior by training...

5 months ago cs.CL cs.CR PDF

Tool MEDIUM

Scalable Hierarchical AI-Blockchain Framework for Real-Time Anomaly Detection in Large-Scale Autonomous Vehicle Networks

Rathin Chandra Shit, Sharmila Subudhi

The security of autonomous vehicle networks is facing major challenges, owing to the complexity of sensor integration, real-time performance demands,...

5 months ago cs.CR cs.AI cs.LG PDF

Defense MEDIUM

SGuard-v1: Safety Guardrail for Large Language Models

JoonHo Lee, HyeonMin Cho, Jaewoong Yun +3 more

We present SGuard-v1, a lightweight safety guardrail for Large Language Models (LLMs), which comprises two specialized models to detect harmful...

5 months ago cs.CL cs.AI cs.CR PDF

Attack MEDIUM

ToxSearch: Evolving Prompts for Toxicity Search in Large Language Models

Onkar Shelar, Travis Desell

Large Language Models remain vulnerable to adversarial prompts that elicit toxic content even after safety alignment. We present ToxSearch, a...

5 months ago cs.NE cs.AI cs.CL PDF

Attack MEDIUM

The 'Sure' Trap: Multi-Scale Poisoning Analysis of Stealthy Compliance-Only Backdoors in Fine-Tuned Large Language Models

Yuting Tan, Yi Huang, Zhuo Li

Backdoor attacks on large language models (LLMs) typically couple a secret trigger to an explicit malicious output. We show that this explicit...

5 months ago cs.LG cs.CR PDF

Defense MEDIUM

Rethinking Deep Alignment Through The Lens Of Incomplete Learning

Thong Bach, Dung Nguyen, Thao Minh Le +1 more

Large language models exhibit systematic vulnerabilities to adversarial attacks despite extensive safety alignment. We provide a mechanistic analysis...

5 months ago cs.LG PDF

Track AI security vulnerabilities in real time

Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.

Start 14-Day Free Trial