Tool HIGH
Md. Mehedi Hasan, Ziaur Rahman, Rafid Mostafiz +1 more
This paper presents a real-time modular defense system named Sentra-Guard. The system detects and mitigates jailbreak and prompt injection attacks...
5 months ago cs.CR cs.AI
Benchmark MEDIUM
Julia Bazinska, Max Mathys, Francesco Casucci +4 more
AI agents powered by large language models (LLMs) are being deployed at scale, yet we lack a systematic understanding of how the choice of backbone...
5 months ago cs.CR cs.AI cs.LG
Attack HIGH
Dongyi Liu, Jiangtong Li, Dawei Cheng +1 more
Graph Neural Networks (GNNs) are vulnerable to backdoor attacks, where adversaries implant malicious triggers to manipulate model predictions....
5 months ago cs.CR cs.LG
Benchmark MEDIUM
Hao Zheng, Zirui Pang, Ling li +5 more
Advances in Multimodal Large Language Models (MLLMs) intensify concerns about data privacy, making Machine Unlearning (MU), the selective removal of...
5 months ago cs.AI cs.CL
Attack MEDIUM
Devon A. Kelly, Christiana Chamon
Wide-bandgap (WBG) technologies offer unprecedented improvements in power system efficiency, size, and performance, but also introduce unique sensor...
5 months ago cs.CR cs.LG eess.SY
Attack HIGH
Anum Paracha, Junaid Arshad, Mohamed Ben Farah +1 more
Data poisoning attacks pose a serious threat to machine learning (ML) models, aiming to manipulate training datasets to disrupt their performance....
5 months ago cs.CR cs.LG
Benchmark LOW
Wenxuan Bao, Ruxi Deng, Jingrui He
Pretrained vision-language models such as CLIP achieve strong zero-shot generalization but remain vulnerable to distribution shifts caused by input...
5 months ago cs.CV cs.LG
Attack HIGH
Pavlos Ntais
Large language models (LLMs) remain vulnerable to sophisticated prompt engineering attacks that exploit contextual framing to bypass safety...
5 months ago cs.CR cs.AI cs.CL
Attack MEDIUM
Sarah Ball, Niki Hasrati, Alexander Robey +4 more
Discrete optimization-based jailbreaking attacks on large language models aim to generate short, nonsensical suffixes that, when appended to input...
5 months ago cs.CL cs.AI
Attack HIGH
Havva Alizadeh Noughabi, Julien Serbanescu, Fattane Zarrinkalam +1 more
Despite recent advances, Large Language Models remain vulnerable to jailbreak attacks that bypass alignment safeguards and elicit harmful outputs....
5 months ago cs.CL cs.AI
Attack HIGH
Kieu Dang, Phung Lai, NhatHai Phan +3 more
Large language models (LLMs) demonstrate remarkable capabilities across various tasks. However, their deployment introduces significant risks related...
Attack HIGH
Mahavir Dabas, Tran Huynh, Nikhil Reddy Billa +8 more
Large language models remain vulnerable to jailbreak attacks that bypass safety guardrails to elicit harmful outputs. Defending against novel...
Tool MEDIUM
Adetayo Adebimpe, Helmut Neukirchen, Thomas Welsh
Honeypots are decoy systems used for gathering valuable threat intelligence or diverting attackers away from production systems. Maximising attacker...
5 months ago cs.CR cs.CL cs.LG
Benchmark MEDIUM
Mojtaba Eshghie, Gabriele Morello, Matteo Lauretano +2 more
Smart contract vulnerabilities cost billions of dollars annually, yet existing automated analysis tools fail to generate deployable defenses. We...
5 months ago cs.CR cs.SE
Defense MEDIUM
Lu Liu, Wuqi Zhang, Lili Wei +3 more
Decentralized Finance (DeFi) smart contracts manage billions of dollars, making them a prime target for exploits. Price manipulation vulnerabilities,...
5 months ago cs.CR cs.SE
Benchmark MEDIUM
Christoph Bühler, Matteo Biagiola, Luca Di Grazia +1 more
Large Language Models (LLMs) have evolved into AI agents that interact with external tools and environments to perform complex tasks. The Model...
5 months ago cs.CR cs.AI cs.SE
Attack HIGH
Xingwei Zhong, Kar Wai Fok, Vrizlynn L. L. Thing
Multimodal large language models (MLLMs) comprise both visual and textual modalities to process vision-language tasks. However, MLLMs are...
Attack HIGH
Mingrui Liu, Sixiao Zhang, Cheng Long +1 more
As Large Language Models (LLMs) become integral to computing infrastructure, safety alignment serves as the primary security control preventing the...
Attack HIGH
Yukun Jiang, Mingjie Li, Michael Backes +1 more
Despite their superior performance on a wide range of domains, large language models (LLMs) remain vulnerable to misuse for generating harmful...
Benchmark MEDIUM
Divyanshu Kumar, Nitin Aravind Birur, Tanay Baswa +2 more
Frontier Large Language Models (LLMs) pose unprecedented dual-use risks through the potential proliferation of chemical, biological, radiological,...
5 months ago cs.CR cs.AI