AI Security Research

2,583+ academic papers on AI security, attacks, and defenses

Total: 2,583

Attack: 994

Benchmark: 740

Defense: 355

Tool: 275

Survey: 146

Showing 781–800 of 1,228 papers

Defense MEDIUM

Auto-Tuning Safety Guardrails for Black-Box Large Language Models

Perry Abdulkadir

Large language models (LLMs) are increasingly deployed behind safety guardrails such as system prompts and content filters, especially in settings...

5 months ago cs.CR cs.CL cs.LG PDF

Attack MEDIUM

Adversarial Robustness in Financial Machine Learning: Defenses, Economic Impact, and Governance Evidence

Samruddhi Baviskar

We evaluate adversarial robustness in tabular machine learning models used in financial decision making. Using credit scoring and fraud detection...

5 months ago cs.LG cs.AI cs.CR PDF

Attack MEDIUM

GradID: Adversarial Detection via Intrinsic Dimensionality of Gradients

Mohammad Mahdi Razmjoo, Mohammad Mahdi Sharifian, Saeed Bagheri Shouraki

Despite their remarkable performance, deep neural networks exhibit a critical vulnerability: small, often imperceptible, adversarial perturbations...

5 months ago cs.LG cs.CR cs.CV PDF

Attack MEDIUM

CODE ACROSTIC: Robust Watermarking for Code Generation

Li Lin, Siyuan Xin, Yang Cao +1 more

Watermarking large language models (LLMs) is vital for preventing their misuse, including the fabrication of fake news, plagiarism, and spam. It is...

5 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

COBRA: Catastrophic Bit-flip Reliability Analysis of State-Space Models

Sanjay Das, Swastik Bhattacharya, Shamik Kundu +3 more

State-space models (SSMs), exemplified by the Mamba architecture, have recently emerged as state-of-the-art sequence-modeling frameworks, offering...

5 months ago cs.CR cs.LG PDF

Benchmark MEDIUM

ceLLMate: Sandboxing Browser AI Agents

Luoxi Meng, Henry Feng, Ilia Shumailov +1 more

Browser-using agents (BUAs) are an emerging class of AI agents that interact with web browsers in human-like ways, including clicking, scrolling,...

5 months ago cs.CR cs.LG PDF

Benchmark MEDIUM

Diverse LLMs vs. Vulnerabilities: Who Detects and Fixes Them Better?

Arastoo Zibaeirad, Marco Vieira

Large Language Models (LLMs) are increasingly being studied for Software Vulnerability Detection (SVD) and Repair (SVR). Individual LLMs have...

5 months ago cs.SE cs.AI PDF

Survey MEDIUM

The Role of AI in Modern Penetration Testing

J. Alexander Curtis, Nasir U. Eisty

Penetration testing is a cornerstone of cybersecurity, traditionally driven by manual, time-intensive processes. As systems grow in complexity, there...

5 months ago cs.SE PDF

Defense MEDIUM

Taint-Based Code Slicing for LLMs-based Malicious NPM Package Detection

Dang-Khoa Nguyen, Gia-Thang Ho, Quang-Minh Pham +5 more

Software supply chain attacks targeting the npm ecosystem have become increasingly sophisticated, leveraging obfuscation and complex logic to evade...

5 months ago cs.CR PDF

Attack MEDIUM

Keep the Lights On, Keep the Lengths in Check: Plug-In Adversarial Detection for Time-Series LLMs in Energy Forecasting

Hua Ma, Ruoxi Sun, Minhui Xue +4 more

Accurate time-series forecasting is increasingly critical for planning and operations in low-carbon power systems. Emerging time-series large...

5 months ago cs.CR cs.LG PDF

Tool MEDIUM

BRIDG-ICS: AI-Grounded Knowledge Graphs for Intelligent Threat Analytics in Industry 5.0 Cyber-Physical Systems

Padmeswari Nandiya, Ahmad Mohsin, Ahmed Ibrahim +2 more

Industry 5.0's increasing integration of IT and OT systems is transforming industrial operations but also expanding the cyber-physical attack...

5 months ago cs.CR PDF

Benchmark MEDIUM

CLOAK: Contrastive Guidance for Latent Diffusion-Based Data Obfuscation

Xin Yang, Omid Ardakanian

Data obfuscation is a promising technique for mitigating attribute inference attacks by semi-trusted parties with access to time-series data emitted...

5 months ago cs.LG cs.CR PDF

Benchmark MEDIUM

Factor(U,T): Controlling Untrusted AI by Monitoring their Plans

Edward Lue Chee Lip, Anthony Channg, Diana Kim +2 more

As AI capabilities advance, we increasingly rely on powerful models to decompose complex tasks – but what if the decomposer itself is...

5 months ago cs.CR cs.AI PDF

Defense MEDIUM

Super Suffixes: Bypassing Text Generation Alignment and Guard Models Simultaneously

Andrew Adiletta, Kathryn Adiletta, Kemal Derya +1 more

The rapid deployment of Large Language Models (LLMs) has created an urgent need for enhanced security and privacy measures in Machine Learning (ML)....

5 months ago cs.CR cs.AI PDF

Attack MEDIUM

PHANTOM: Progressive High-fidelity Adversarial Network for Threat Object Modeling

Jamal Al-Karaki, Muhammad Al-Zafar Khan, Rand Derar Mohammad Al Athamneh

The scarcity of cyberattack data hinders the development of robust intrusion detection systems. This paper introduces PHANTOM, a novel adversarial...

5 months ago cs.CR cs.AI cs.LG PDF

Survey MEDIUM

Mapping AI Risk Mitigations: Evidence Scan and Preliminary AI Risk Mitigation Taxonomy

Alexander K. Saeri, Sophia Lloyd George, Jess Graham +4 more

Organizations and governments that develop, deploy, use, and govern AI must coordinate on effective risk mitigation. However, the landscape of AI...

5 months ago cs.CY cs.AI PDF

Defense MEDIUM

Challenges of Evaluating LLM Safety for User Welfare

Manon Kempermann, Sai Suresh Macharla Vasu, Mahalakshmi Raveenthiran +2 more

Safety evaluations of large language models (LLMs) typically focus on universal risks like dangerous capabilities or undesirable propensities....

5 months ago cs.AI cs.CY PDF

Attack MEDIUM

Adaptive Intrusion Detection System Leveraging Dynamic Neural Models with Adversarial Learning for 5G/6G Networks

Neha, Tarunpreet Bhatia

Intrusion Detection Systems (IDS) are critical components in safeguarding 5G/6G networks from both internal and external cyber threats. While...

5 months ago cs.CR cs.LG PDF

Benchmark MEDIUM

Authority Backdoor: A Certifiable Backdoor Mechanism for Authoring DNNs

Han Yang, Shaofeng Li, Tian Dong +3 more

Deep Neural Networks (DNNs), as valuable intellectual property, face unauthorized use. Existing protections, such as digital watermarking, are...

5 months ago cs.CR cs.LG PDF

Benchmark MEDIUM

Differential Privacy for Secure Machine Learning in Healthcare IoT-Cloud Systems

N Mangala, Murtaza Rangwala, S Aishwarya +5 more

Healthcare has become exceptionally sophisticated, as wearables and connected medical devices are revolutionising remote patient monitoring,...

5 months ago cs.CR cs.DC PDF
