AI Security Research

2,589+ academic papers on AI security, attacks, and defenses

Total: 2,589
Attack: 998
Benchmark: 740
Defense: 355
Tool: 276
Survey: 147

Showing 1241–1260 of 1,931 papers

Benchmark MEDIUM

$α^3$-SecBench: A Large-Scale Evaluation Suite of Security, Resilience, and Trust for LLM-based UAV Agents over 6G Networks

Mohamed Amine Ferrag, Abderrahmane Lakas, Merouane Debbah

Autonomous unmanned aerial vehicle (UAV) systems are increasingly deployed in safety-critical, networked environments where they must operate...

3 months ago cs.CR cs.AI PDF

Tool LOW

AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security

Dongrui Liu, Qihan Ren, Chen Qian +40 more

The rise of AI agents introduces complex safety and security challenges arising from autonomous tool use and environmental interactions. Current...

3 months ago cs.AI cs.CC cs.CL PDF

Defense HIGH

MulVul: Retrieval-augmented Multi-Agent Code Vulnerability Detection via Cross-Model Prompt Evolution

Zihan Wu, Jie Xu, Yun Peng +2 more

Large Language Models (LLMs) struggle to automate real-world vulnerability detection due to two key limitations: the heterogeneity of vulnerability...

3 months ago cs.SE cs.AI PDF

Attack HIGH

ARMOR: Agentic Reasoning for Methods Orchestration and Reparameterization for Robust Adversarial Attacks

Gabriel Lee Jun Rong, Christos Korgialas, Dion Jia Xu Ho +3 more

Existing automated attack suites operate as static ensembles with fixed sequences, lacking strategic adaptation and semantic awareness. This paper...

3 months ago cs.CV PDF

Benchmark MEDIUM

A Generative AI-Driven Reliability Layer for Action-Oriented Disaster Resilience

Geunsik Lim

As climate-related hazards intensify, conventional early warning systems (EWS) disseminate alerts rapidly but often fail to trigger timely protective...

3 months ago cs.AI cs.SI eess.SY PDF

Defense LOW

V-Loop: Visual Logical Loop Verification for Hallucination Detection in Medical Visual Question Answering

Mengyuan Jin, Zehui Liao, Yong Xia

Multimodal Large Language Models (MLLMs) have shown remarkable capability in assisting disease diagnosis in medical visual question answering (VQA)....

3 months ago cs.CV PDF

Benchmark MEDIUM

From Transcripts to AI Agents: Knowledge Extraction, RAG Integration, and Robust Evaluation of Conversational AI Assistants

Krittin Pachtrachai, Petmongkon Pornpichitsuwan, Wachiravit Modecrua +1 more

Building reliable conversational AI assistants for customer-facing industries remains challenging due to noisy conversational data, fragmented...

3 months ago cs.CL PDF

Benchmark MEDIUM

MalURLBench: A Benchmark Evaluating Agents' Vulnerabilities When Processing Web URLs

Dezhang Kong, Zhuxi Wu, Shiqi Liu +8 more

LLM-based web agents have become increasingly popular for their utility in daily life and work. However, they exhibit critical vulnerabilities when...

3 months ago cs.CR cs.AI PDF

Other LOW

Mitigating the OWASP Top 10 For Large Language Models Applications using Intelligent Agents

Mohammad Fasha, Faisal Abul Rub, Nasim Matar +2 more

Large Language Models (LLMs) have emerged as a transformative and disruptive technology, enabling a wide range of applications in natural language...

3 months ago cs.CR cs.AI PDF

Attack HIGH

Comparison requires valid measurement: Rethinking attack success rate comparisons in AI red teaming

Alexandra Chouldechova, A. Feder Cooper, Solon Barocas +3 more

We argue that conclusions drawn about relative system safety or attack method efficacy via AI red teaming are often not supported by evidence...

3 months ago cs.LG PDF

Benchmark HIGH

Prompt Injection Evaluations: Refusal Boundary Instability and Artifact-Dependent Compliance in GPT-4-Series Models

Thomas Heverin

Prompt injection evaluations typically treat refusal as a stable, binary indicator of safety. This study challenges that paradigm by modeling refusal...

3 months ago cs.CR PDF

Defense MEDIUM

When Personalization Legitimizes Risks: Uncovering Safety Vulnerabilities in Personalized Dialogue Agents

Jiahe Guo, Xiangran Guo, Yulin Hu +8 more

Long-term memory enables large language model (LLM) agents to support personalized and sustained interactions. However, most work on personalized...

3 months ago cs.AI PDF

Benchmark MEDIUM

An Effective and Cost-Efficient Agentic Framework for Ethereum Smart Contract Auditing

Xiaohui Hu, Wun Yu Chan, Yuejie Shi +5 more

Smart contract security is paramount, but identifying intricate business logic vulnerabilities remains a persistent challenge because existing...

3 months ago cs.CR PDF

Benchmark HIGH

Multi-Agent End-to-End Vulnerability Management for Mitigating Recurring Vulnerabilities

Zelong Zheng, Jiayuan Zhou, Xing Hu +2 more

Software vulnerability management has become increasingly critical as modern systems scale in size and complexity. However, existing automated...

3 months ago cs.SE PDF

Benchmark MEDIUM

Improving User Privacy in Personalized Generation: Client-Side Retrieval-Augmented Modification of Server-Side Generated Speculations

Alireza Salemi, Hamed Zamani

Personalization is crucial for aligning Large Language Model (LLM) outputs with individual user preferences and background knowledge....

3 months ago cs.CL cs.AI cs.CR PDF

Tool HIGH

Sponge Tool Attack: Stealthy Denial-of-Efficiency against Tool-Augmented Agentic Reasoning

Qi Li, Xinchao Wang

Enabling large language models (LLMs) to solve complex reasoning tasks is a key step toward artificial general intelligence. Recent work augments...

3 months ago cs.CV PDF

Tool HIGH

Breaking the Protocol: Security Analysis of the Model Context Protocol Specification and Prompt Injection Vulnerabilities in Tool-Integrated LLM Agents

Narek Maloyan, Dmitry Namiot

The Model Context Protocol (MCP) has emerged as a de facto standard for integrating Large Language Models with external tools, yet no formal security...

3 months ago cs.CR cs.AI PDF

Attack HIGH

Prompt Injection Attacks on Agentic Coding Assistants: A Systematic Analysis of Vulnerabilities in Skills, Tools, and Protocol Ecosystems

Narek Maloyan, Dmitry Namiot

The proliferation of agentic AI coding assistants, including Claude Code, GitHub Copilot, Cursor, and emerging skill-based architectures, has...

3 months ago cs.CR PDF

Benchmark MEDIUM

Unintended Memorization of Sensitive Information in Fine-Tuned Language Models

Marton Szep, Jorge Marin Ruiz, Georgios Kaissis +4 more

Fine-tuning Large Language Models (LLMs) on sensitive datasets carries a substantial risk of unintended memorization and leakage of Personally...

3 months ago cs.LG cs.AI cs.CL PDF

Attack HIGH

Physical Prompt Injection Attacks on Large Vision-Language Models

Chen Ling, Kai Hu, Hangcheng Liu +3 more

Large Vision-Language Models (LVLMs) are increasingly deployed in real-world intelligent systems for perception and reasoning in open physical...

3 months ago cs.CV cs.AI PDF
