SafeMT: Multi-turn Safety for Multimodal Language Models
Han Zhu, Juntao Dai, Jiaming Ji +8 more
With the widespread use of Multimodal Large Language Models (MLLMs), safety issues have become a growing concern. Multi-turn dialogues, which are...
Jiahao Liu, Bonan Ruan, Xianglin Yang +5 more
LLM-based agents have demonstrated promising adaptability in real-world applications. However, these agents remain vulnerable to a wide range of...
Zhuochen Yang, Kar Wai Fok, Vrizlynn L. L. Thing
Large language models have gained widespread attention recently, but their potential security vulnerabilities, especially privacy leakage, are also...
Yuyi Huang, Runzhe Zhan, Lidia S. Chao +2 more
As large language models (LLMs) are increasingly deployed for complex reasoning tasks, Long Chain-of-Thought (Long-CoT) prompting has emerged as a...
MingSheng Li, Guangze Zhao, Sichen Liu
Large Vision-Language Models (LVLMs) have achieved remarkable progress in multimodal perception and generation, yet their safety alignment remains a...
Xiangtao Meng, Tianshuo Cong, Li Wang +4 more
Large Language Models (LLMs) have shown remarkable performance across various applications, but their deployment in real-world settings faces several...
Thusitha Dayaratne, Ngoc Duy Pham, Viet Vo +5 more
The quality and experience of mobile communication have significantly improved with the introduction of 5G, and these improvements are expected to...
Abhejay Murali, Saleh Afroogh, Kevin Chen +3 more
Current safety alignment for Large Language Models (LLMs) implicitly optimizes for a "modal adult user," leaving models vulnerable to distributional...
Yining She, Daniel W. Peterson, Marianne Menglin Liu +4 more
With the increasing adoption of large language models (LLMs), ensuring the safety of LLM systems has become a pressing concern. External LLM-based...
Siwei Han, Kaiwen Xiong, Jiaqi Liu +9 more
As Large Language Model (LLM) agents increasingly gain self-evolutionary capabilities to adapt and refine their strategies through real-world...
Shuai Zhao, Xinyi Wu, Shiqian Zhao +4 more
During fine-tuning, large language models (LLMs) are increasingly vulnerable to data-poisoning backdoor attacks, which compromise their reliability...
Anindya Sundar Das, Kangjie Chen, Monowar Bhuyan
Pre-trained language models have achieved remarkable success across a wide range of natural language processing (NLP) tasks, particularly when...
Rui Wu, Yihao Quan, Zeru Shi +3 more
Safety-aligned Large Language Models (LLMs) still show two dominant failure modes: they are easily jailbroken, or they over-refuse harmless inputs...
Lesly Miculicich, Mihir Parmar, Hamid Palangi +4 more
The deployment of autonomous AI agents in sensitive domains, such as healthcare, introduces critical risks to safety, security, and privacy. These...
Yuhao Sun, Zhuoer Xu, Shiwen Cui +4 more
Large Language Models (LLMs) have achieved remarkable progress across a wide range of tasks, but remain vulnerable to safety risks such as harmful...
Muhammad Faheemur Rahman, Wayne Burleson
Memristive crossbar arrays enable in-memory computing by performing parallel analog computations directly within memory, making them well-suited for...
Guobin Shen, Dongcheng Zhao, Haibo Tong +3 more
Ensuring Large Language Model (LLM) safety remains challenging due to the absence of universal standards and reliable content validators, making it...
Shojiro Yamabe, Jun Sakuma
Diffusion language models (DLMs) generate tokens in parallel through iterative denoising, which can reduce latency and enable bidirectional...
Boyang Zhang, Istemi Ekin Akkus, Ruichuan Chen +4 more
Multimodal large language models (MLLMs) have demonstrated remarkable capabilities in processing and reasoning over diverse modalities, but their...
Ayda Aghaei Nia
Completely Automated Public Turing tests to tell Computers and Humans Apart (CAPTCHAs) are a foundational component of web security, yet traditional...