Attack MEDIUM
Ahmed M. Hussain, Salahuddin Salahuddin, Panos Papadimitratos
Current Large Language Model (LLM) safety approaches focus on explicitly harmful content while overlooking a critical vulnerability: the inability...
4 months ago cs.AI cs.CL cs.CR
Attack MEDIUM
Yifan Yao, Baojuan Wang, Jinhao Duan +4 more
Chat-based cybercrime has emerged as a pervasive threat, with attackers leveraging real-time messaging platforms to conduct scams that rely on...
Attack MEDIUM
Honglin Mu, Jinghao Liu, Kaiyang Wan +4 more
Large Language Models (LLMs) excel at text comprehension and generation, making them ideal for automated tasks like code review and content...
4 months ago cs.CL cs.AI
Attack MEDIUM
Rahul Yumlembam, Biju Issac, Seibu Mary Jacob +1 more
Since the Internet of Things (IoT) is widely accessed through Android applications, detecting malicious Android apps is essential. In recent years,...
4 months ago cs.CR cs.AI cs.LG
Attack MEDIUM
Samruddhi Baviskar
Machine learning models used in financial decision systems operate in nonstationary economic environments, yet adversarial robustness is typically...
4 months ago cs.LG cs.AI cs.CR
Attack MEDIUM
A. A. Gde Yogi Pramana, Jason Ray, Anthony Jaya +1 more
Vision-Language Models (VLMs) show significant promise for Medical Visual Question Answering (VQA), yet their deployment in clinical settings is...
Attack MEDIUM
Tung-Ling Li, Yuhao Wu, Hongliang Liu
Reward models and LLM-as-a-Judge systems are central to modern post-training pipelines such as RLHF, DPO, and RLAIF, where they provide scalar...
4 months ago cs.LG cs.CL cs.CR
Attack MEDIUM
Yidong Chai, Yi Liu, Mohammadreza Ebrahimi +2 more
Social media platforms are plagued by harmful content such as hate speech, misinformation, and extremist rhetoric. Machine learning (ML) models are...
Attack MEDIUM
Zhexi Lu, Hongliang Chi, Nathalie Baracaldo +3 more
Membership inference attacks (MIAs) pose a critical privacy threat to fine-tuned large language models (LLMs), especially when models are adapted to...
4 months ago cs.CR cs.LG
Attack MEDIUM
Seok-Hyun Ga, Chun-Yen Chang
The rapid development of Generative AI is bringing innovative changes to education and assessment. As the prevalence of students utilizing AI for...
4 months ago cs.AI cs.CL cs.CY
Attack MEDIUM
Piercosma Bisconti, Marcello Galisai, Matteo Prandi +6 more
Safety mechanisms in LLMs remain vulnerable to attacks that reframe harmful requests through culturally coded structures. We introduce Adversarial...
4 months ago cs.CL cs.AI cs.CY
Attack MEDIUM
David Lindner, Charlie Griffin, Tomek Korbak +4 more
Automated control monitors could play an important role in overseeing highly capable AI agents that we do not fully trust. Prior work has explored...
4 months ago cs.CR cs.AI cs.MA
Attack MEDIUM
Samruddhi Baviskar
We evaluate adversarial robustness in tabular machine learning models used in financial decision making. Using credit scoring and fraud detection...
4 months ago cs.LG cs.AI cs.CR
Attack MEDIUM
Mohammad Mahdi Razmjoo, Mohammad Mahdi Sharifian, Saeed Bagheri Shouraki
Despite their remarkable performance, deep neural networks exhibit a critical vulnerability: small, often imperceptible, adversarial perturbations...
4 months ago cs.LG cs.CR cs.CV
Attack MEDIUM
Li Lin, Siyuan Xin, Yang Cao +1 more
Watermarking large language models (LLMs) is vital for preventing their misuse, including the fabrication of fake news, plagiarism, and spam. It is...
4 months ago cs.CR cs.AI
Attack MEDIUM
Hua Ma, Ruoxi Sun, Minhui Xue +4 more
Accurate time-series forecasting is increasingly critical for planning and operations in low-carbon power systems. Emerging time-series large...
5 months ago cs.CR cs.LG
Attack MEDIUM
Jamal Al-Karaki, Muhammad Al-Zafar Khan, Rand Derar Mohammad Al Athamneh
The scarcity of cyberattack data hinders the development of robust intrusion detection systems. This paper introduces PHANTOM, a novel adversarial...
5 months ago cs.CR cs.AI cs.LG
Attack MEDIUM
Neha, Tarunpreet Bhatia
Intrusion Detection Systems (IDS) are critical components in safeguarding 5G/6G networks from both internal and external cyber threats. While...
5 months ago cs.CR cs.LG
Attack MEDIUM
Miranda Christ, Noah Golowich, Sam Gunn +2 more
Watermarks are an essential tool for identifying AI-generated content. Recently, Christ and Gunn (CRYPTO '24) introduced pseudorandom...
Attack MEDIUM
Botao 'Amber' Hu, Bangdao Chen
The emerging "agentic web" envisions large populations of autonomous agents coordinating, transacting, and delegating across open networks. Yet many...
5 months ago cs.CY cs.MA