AI Security Research

2,529+ academic papers on AI security, attacks, and defenses

Total: 2,529; Attack: 969; Benchmark: 729; Defense: 345; Tool: 272; Survey: 142

Showing 121–140 of 345 papers

Defense MEDIUM

Mind the Gap: Evaluating LLMs for High-Level Malicious Package Detection vs. Fine-Grained Indicator Identification

Ahmed Ryan, Ibrahim Khalil, Abdullah Al Jahid +4 more

The prevalence of malicious packages in open-source repositories, such as PyPI, poses a critical threat to the software supply chain. While Large...

2 months ago cs.CR cs.SE PDF

Defense LOW

Unforgeable Watermarks for Language Models via Robust Signatures

Huijia Lin, Kameron Shahabi, Min Jae Song

Language models now routinely produce text that is difficult to distinguish from human writing, raising the need for robust tools to verify content...

2 months ago cs.CR cs.AI cs.LG PDF

Defense LOW

Visual Persuasion: What Influences Decisions of Vision-Language Models?

Manuel Cherep, Pranav M R, Pattie Maes +1 more

The web is littered with images, once created for human consumption and now increasingly interpreted by agents using vision-language models (VLMs)....

2 months ago cs.CV cs.AI PDF

Defense MEDIUM

Weight space Detection of Backdoors in LoRA Adapters

David Puertolas Merenciano, Ekaterina Vasyagina, Raghav Dixit +4 more

LoRA adapters let users fine-tune large language models (LLMs) efficiently. However, LoRA adapters are shared through open repositories like Hugging...

2 months ago cs.CR cs.AI cs.CL PDF

Defense MEDIUM

A Trajectory-Based Safety Audit of Clawdbot (OpenClaw)

Tianyu Chen, Dongrui Liu, Xia Hu +2 more

Clawdbot is a self-hosted, tool-using personal AI agent with a broad action space spanning local execution and web-mediated workflows, which raises...

2 months ago cs.CR cs.AI PDF

Defense MEDIUM

GPTZero: Robust Detection of LLM-Generated Texts

George Alexandru Adam, Alexander Cui, Edwin Thomas +7 more

While historical considerations surrounding text authenticity revolved primarily around plagiarism, the advent of large language models (LLMs) has...

2 months ago cs.LG PDF

Defense LOW

Fool Me If You Can: On the Robustness of Binary Code Similarity Detection Models against Semantics-preserving Transformations

Jiyong Uhm, Minseok Kim, Michalis Polychronakis +1 more

Binary code analysis plays an essential role in cybersecurity, facilitating reverse engineering to reveal the inner workings of programs in the...

2 months ago cs.CR cs.LG PDF

Defense MEDIUM

SafeNeuron: Neuron-Level Safety Alignment for Large Language Models

Zhaoxin Wang, Jiaming Liang, Fengbin Zhu +5 more

Large language models (LLMs) and multimodal LLMs are typically safety-aligned before release to prevent harmful content generation. However, recent...

2 months ago cs.LG PDF

Defense MEDIUM

Capability-Oriented Training Induced Alignment Risk

Yujun Zhou, Yue Huang, Han Bao +8 more

While most AI alignment research focuses on preventing models from generating explicitly harmful content, a more subtle risk is emerging:...

2 months ago cs.LG cs.CL PDF

Defense MEDIUM

LoRA-based Parameter-Efficient LLMs for Continuous Learning in Edge-based Malware Detection

Christian Rondanini, Barbara Carminati, Elena Ferrari +2 more

The proliferation of edge devices has created an urgent need for security solutions capable of detecting malware in real time while operating under...

2 months ago cs.CR cs.AI cs.DC PDF

Defense MEDIUM

Future Mining: Learning for Safety and Security

Md Sazedur Rahman, Mizanur Rahman Jewel, Sanjay Madria

Mining is rapidly evolving into an AI driven cyber physical ecosystem where safety and operational reliability depend on robust perception,...

3 months ago cs.CR cs.DC PDF

Defense MEDIUM

Agentic Knowledge Distillation: Autonomous Training of Small Language Models for SMS Threat Detection

Adel ElZemity, Joshua Sylvester, Budi Arief +1 more

SMS-based phishing (smishing) attacks have surged, yet training effective on-device detectors requires labelled threat data that quickly becomes...

3 months ago cs.CR PDF

Defense HIGH

VulReaD: Knowledge-Graph-guided Software Vulnerability Reasoning and Detection

Samal Mukhtar, Yinghua Yao, Zhu Sun +3 more

Software vulnerability detection (SVD) is a critical challenge in modern systems. Large language models (LLMs) offer natural-language explanations...

3 months ago cs.SE cs.AI cs.CR PDF

Defense MEDIUM

Kill it with FIRE: On Leveraging Latent Space Directions for Runtime Backdoor Mitigation in Deep Neural Networks

Enrico Ahlers, Daniel Passon, Yannic Noller +1 more

Machine learning models are increasingly present in our everyday lives; as a result, they become targets of adversarial attackers seeking to...

3 months ago cs.LG cs.AI cs.CR PDF

Defense MEDIUM

TRACE: Timely Retrieval and Alignment for Cybersecurity Knowledge Graph Construction and Expansion

Zijing Xu, Ziwei Ning, Tiancheng Hu +4 more

The rapid evolution of cyber threats has highlighted significant gaps in security knowledge integration. Cybersecurity Knowledge Graphs (CKGs)...

3 months ago cs.CR PDF

Defense MEDIUM

SecCodePRM: A Process Reward Model for Code Security

Weichen Yu, Ravi Mangal, Yinyi Luo +4 more

Large Language Models are rapidly becoming core components of modern software development workflows, yet ensuring code security remains challenging....

3 months ago cs.CR cs.SE PDF

Defense LOW

The Hidden Costs of Domain Fine-Tuning: PII-Bearing Data Degrades Safety and Increases Leakage

Jayesh Choudhari, Piyush Kumar Singh

Domain fine-tuning is a common path to deploy small instruction-tuned language models as customer-support assistants, yet its effects on...

3 months ago cs.CR cs.LG PDF

Defense MEDIUM

Omni-Safety under Cross-Modality Conflict: Vulnerabilities, Dynamics Mechanisms and Efficient Alignment

Kun Wang, Zherui Li, Zhenhong Zhou +8 more

Omni-modal Large Language Models (OLLMs) greatly expand LLMs' multimodal capabilities but also introduce cross-modal safety risks. However, a...

3 months ago cs.CR cs.AI cs.CL PDF

Defense MEDIUM

Stress-Testing Alignment Audits With Prompt-Level Strategic Deception

Oliver Daniels, Perusha Moodley, Benjamin M. Marlin +1 more

Alignment audits aim to robustly identify hidden goals from strategic, situationally aware misaligned models. Despite this threat model, existing...

3 months ago cs.LG PDF

Defense MEDIUM

Is Reasoning Capability Enough for Safety in Long-Context Language Models?

Yu Fu, Haz Sameen Shahgir, Huanli Gong +3 more

Large language models (LLMs) increasingly combine long-context processing with advanced reasoning, enabling them to retrieve and synthesize...

3 months ago cs.CL cs.CR PDF
