AI Security Research

2,529+ academic papers on AI security, attacks, and defenses

Total

2,529

Attack

969

Benchmark

729

Defense

345

Tool

272

Survey

142

Type

All Attack Defense Survey Benchmark Tool

Relevance

All High Medium

Date

All time 7 days 30 days 6 months

Showing 141–160 of 2,529 papers

Benchmark MEDIUM

FCMBench-Video: Benchmarking Document Video Intelligence

Runze Cui, Fangxin Shang, Yehui Yang +2 more

Document understanding is a critical capability in financial credit review, onboarding, and remote verification, where both decision accuracy and...

2 weeks ago cs.CV cs.CE cs.MM PDF

Benchmark MEDIUM

MGTEVAL: An Interactive Platform for Systemtic Evaluation of Machine-Generated Text Detectors

Yuanfan Li, Qi Zhou, Chengzhengxu Li +5 more

We present MGTEVAL, an extensible platform for systematic evaluation of Machine-Generated Text (MGT) detectors. Despite rapid progress in MGT...

2 weeks ago cs.CR cs.CL PDF

Defense MEDIUM

One Perturbation, Two Failure Modes: Probing VLM Safety via Embedding-Guided Typographic Perturbations

Ravikumar Balakrishnan, Sanket Mendapara

Typographic prompt injection exploits vision language models' (VLMs) ability to read text rendered in images, posing a growing threat as VLMs power...

2 weeks ago cs.CV PDF

Attack HIGH

Adaptive Prompt Embedding Optimization for LLM Jailbreaking

Miles Q. Li, Benjamin C. M. Fung, Boyang Li +2 more

Existing white-box jailbreak attacks against aligned LLMs typically append discrete adversarial suffixes to the user prompt, which visibly alters the...

2 weeks ago cs.AI PDF

Attack HIGH

Poisoning Learned Index Structures: Static and Dynamic Adversarial Attacks on ALEX

Allen Jue

Learned index structures achieve high performance by modeling the cumulative distribution function (CDF) of keys, but this reliance on data...

2 weeks ago cs.CR cs.DB PDF

Survey MEDIUM

SUDP: Secret-Use Delegation Protocol for Agentic Systems

Xiaohang Yu, Hejia Geng, William Knottenbelt

Agentic systems increasingly act with user secrets for APIs, messaging platforms, and cloud services. Today's bearer-secret interfaces implement...

2 weeks ago cs.CR cs.AI PDF

Benchmark MEDIUM

Green Shielding: A User-Centric Approach Towards Trustworthy AI

Aaron J. Li, Nicolas Sanchez, Hao Huang +8 more

Large language models (LLMs) are increasingly deployed, yet their outputs can be highly sensitive to routine, non-adversarial variation in how users...

2 weeks ago cs.CL cs.AI PDF

Benchmark LOW

Governing What You Cannot Observe: Adaptive Runtime Governance for Autonomous AI Agents

German Marin, Jatin Chaudhary

Autonomous AI agents can remain fully authorized and still become unsafe as behavior drifts, adversaries adapt, and decision patterns shift without...

2 weeks ago cs.AI PDF

Benchmark MEDIUM

A Comparative Evaluation of AI Agent Security Guardrails

Qi Li, Jiu Li, Pingtao Wei +8 more

This report presents a comparative evaluation of DKnownAI Guard in AI agent security scenarios, benchmarked against three competing products: AWS...

2 weeks ago cs.CR cs.AI PDF

Defense MEDIUM

Layerwise Convergence Fingerprints for Runtime Misbehavior Detection in Large Language Models

Nay Myat Min, Long H. Pham, Jun Sun

Large language models deployed at runtime can misbehave in ways that clean-data validation cannot anticipate: training-time backdoors lie dormant...

2 weeks ago cs.CR cs.AI cs.CL PDF

Benchmark MEDIUM

GAMMAF: A Common Framework for Graph-Based Anomaly Monitoring Benchmarking in LLM Multi-Agent Systems

Pablo Mateo-Torrejón, Alfonso Sánchez-Macián

The rapid integration of Large Language Models (LLMs) into Multi-Agent Systems (MAS) has significantly enhanced their collaborative problem-solving...

2 weeks ago cs.CR cs.AI cs.MA PDF

Survey MEDIUM

A Survey on Split Learning for LLM Fine-Tuning: Models, Systems, and Privacy Optimizations

Zihan Liu, Yizhen Wang, Rui Wang +2 more

Fine-tuning unlocks large language models (LLMs) for specialized applications, but its high computational cost often puts it out of reach for...

2 weeks ago cs.CR cs.CL cs.DC PDF

Attack MEDIUM

Unveiling the Backdoor Mechanism Hidden Behind Catastrophic Overfitting in Fast Adversarial Training

Mengnan Zhao, Lihe Zhang, Tianhang Zheng +2 more

Fast Adversarial Training (FAT) has attracted significant attention due to its efficiency in enhancing neural network robustness against adversarial...

2 weeks ago cs.LG cs.AI cs.CR PDF

Tool LOW

OS-SPEAR: A Toolkit for the Safety, Performance,Efficiency, and Robustness Analysis of OS Agents

Zheng Wu, Yi Hua, Zhaoyuan Huang +8 more

The evolution of Multimodal Large Language Models (MLLMs) has shifted the focus from text generation to active behavioral execution, particularly via...

2 weeks ago cs.CL PDF

Benchmark MEDIUM

GoAT-X: A Graph of Auditing Thoughts for Securing Token Transactions in Cross-Chain Contracts

Zijun Feng, Yuming Feng, Yu Wang +4 more

Cross-chain bridges, the critical infrastructure of the multi-chain ecosystem, have become a primary target for attackers, resulting in over $2.8...

2 weeks ago cs.CR PDF

Attack MEDIUM

Mitigating Error Amplification in Fast Adversarial Training

Mengnan Zhao, Lihe Zhang, Bo Wang +3 more

Fast Adversarial Training (FAT) has proven effective in enhancing model robustness by encouraging networks to learn perturbation-invariant...

2 weeks ago cs.LG cs.CR PDF

Benchmark MEDIUM

Dynamic Cyber Ranges

Víctor Mayoral-Vilches, María Sanz-Gómez, Francesco Balassone +6 more

As LLM-driven agents advance in cybersecurity, Jeopardy CTF benchmarks are approaching saturation and cyber ranges, the natural next evaluation...

2 weeks ago cs.CR PDF

Defense MEDIUM

Defusing the Trigger: Plug-and-Play Defense for Backdoored LLMs via Tail-Risk Intrinsic Geometric Smoothing

Kaisheng Fan, Weizhe Zhang, Yishu Gao +2 more

Defending against backdoor attacks in large language models remains a critical practical challenge. Existing defenses mitigate these threats but...

2 weeks ago cs.CR cs.AI PDF

Attack HIGH

AgentVisor: Defending LLM Agents Against Prompt Injection via Semantic Virtualization

Zonghao Ying, Haozheng Wang, Jiangfan Liu +5 more

Large Language Model (LLM) agents are increasingly used to automate complex workflows, but integrating untrusted external data with privileged...

2 weeks ago cs.CR PDF

Tool LOW

Closing the Loop: A Software Framework for AI to Support Business Decision Making

Jeffrey Wong, Antoine Creux

Create an idea, prototype it, evaluate if users like it, then learn. It is the circle of business. If AI can operate in all parts of the circle, it...

2 weeks ago cs.SE cs.MS stat.AP PDF

Track AI security vulnerabilities in real time

Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.

Start 14-Day Free Trial