AI Security Research

2,583+ academic papers on AI security, attacks, and defenses

Total: 2,583
Attack: 994
Benchmark: 740
Defense: 355
Tool: 275
Survey: 146

Showing 801–820 of 1,927 papers

Defense LOW

Look Carefully: Adaptive Visual Reinforcements in Multimodal Large Language Models for Hallucination Mitigation

Xingyu Zhu, Kesen Zhao, Liang Yi +4 more

Multimodal large language models (MLLMs) have achieved remarkable progress in vision-language reasoning, yet they remain vulnerable to hallucination,...

2 months ago cs.CV PDF

Benchmark HIGH

Jailbreak Foundry: From Papers to Runnable Attacks for Reproducible Benchmarking

Zhicheng Fang, Jingjie Zheng, Chenxu Fu +1 more

Jailbreak techniques for large language models (LLMs) evolve faster than benchmarks, making robustness estimates stale and difficult to compare...

2 months ago cs.CR cs.AI cs.CL PDF

Attack MEDIUM

SwitchCraft: Training-Free Multi-Event Video Generation with Attention Controls

Qianxun Xu, Chenxi Song, Yujun Cai +1 more

Recent advances in text-to-video diffusion models have enabled high-fidelity and temporally coherent video synthesis. However, current models are...

2 months ago cs.CV PDF

Benchmark HIGH

Enhancing Continual Learning for Software Vulnerability Prediction: Addressing Catastrophic Forgetting via Hybrid-Confidence-Aware Selective Replay for Temporal LLM Fine-Tuning

Xuhui Dou, Hayretdin Bahsi, Alejandro Guerra-Manzanares

Recent work applies Large Language Models (LLMs) to source-code vulnerability detection, but most evaluations still rely on random train-test splits...

2 months ago cs.CR cs.AI cs.LG PDF

Tool MEDIUM

LiaisonAgent: An Multi-Agent Framework for Autonomous Risk Investigation and Governance

Chuanming Tang, Ling Qing, Shifeng Chen

The rapid evolution of sophisticated cyberattacks has strained modern Security Operations Centers (SOC), which traditionally rely on rule-based or...

2 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Your Inference Request Will Become a Black Box: Confidential Inference for Cloud-based Large Language Models

Chung-ju Huang, Huiqiang Zhao, Yuanpeng He +5 more

The increasing reliance on cloud-hosted Large Language Models (LLMs) exposes sensitive client data, such as prompts and responses, to potential...

2 months ago cs.CR cs.AI cs.CL PDF

Tool MEDIUM

SGAgent: Suggestion-Guided LLM-Based Multi-Agent Framework for Repository-Level Software Repair

Quanjun Zhang, Chengyu Gao, Yu Han +4 more

The rapid advancement of Large Language Models (LLMs) has led to the emergence of intelligent agents capable of autonomously interacting with...

2 months ago cs.SE PDF

Benchmark MEDIUM

Detecting Cognitive Signatures in Typing Behavior for Non-Intrusive Authorship Verification

David Condrey

The proliferation of AI-generated text has intensified the need for reliable authorship verification, yet current output-based methods are...

2 months ago cs.CR cs.HC cs.LG PDF

Benchmark LOW

CiteAudit: You Cited It, But Did You Read It? A Benchmark for Verifying Scientific References in the LLM Era

Zhengqing Yuan, Kaiwen Shi, Zheyuan Zhang +3 more

Scientific research relies on accurate citation for attribution and integrity, yet large language models (LLMs) introduce a new risk: fabricated...

2 months ago cs.CL cs.DL PDF

Other LOW

ESAA: Event Sourcing for Autonomous Agents in LLM-Based Software Engineering

Elzo Brito dos Santos Filho

Autonomous agents based on Large Language Models (LLMs) have evolved from reactive assistants to systems capable of planning, executing actions via...

2 months ago cs.AI PDF

Attack HIGH

Hidden in the Metadata: Stealth Poisoning Attacks on Multimodal Retrieval-Augmented Generation

Kennedy Edemacu, Mohammad Mahdi Shokri

Retrieval-augmented generation (RAG) has emerged as a powerful paradigm for enhancing multimodal large language models by grounding their responses...

2 months ago cs.CR cs.AI PDF

Defense LOW

LLM-Powered Silent Bug Fuzzing in Deep Learning Libraries via Versatile and Controlled Bug Transfer

Kunpeng Zhang, Dongwei Xiao, Daoyuan Wu +5 more

Deep learning (DL) libraries are widely used in critical applications, where even subtle silent bugs can lead to serious consequences. While existing...

2 months ago cs.SE PDF

Benchmark LOW

SkillNet: Create, Evaluate, and Connect AI Skills

Yuan Liang, Ruobin Zhong, Haoming Xu +46 more

Current AI agents can flexibly invoke tools and execute complex tasks, yet their long-term advancement is hindered by the lack of systematic...

2 months ago cs.AI cs.CL cs.CV PDF

Attack HIGH

Obscure but Effective: Classical Chinese Jailbreak Prompt Optimization via Bio-Inspired Search

Xun Huang, Simeng Qin, Xiaoshuang Jia +6 more

As Large Language Models (LLMs) are increasingly used, their security risks have drawn increasing attention. Existing research reveals that LLMs are...

2 months ago cs.AI cs.CR PDF

Benchmark LOW

Learning to Generate Secure Code via Token-Level Rewards

Jiazheng Quan, Xiaodong Li, Bin Wang +5 more

Large language models (LLMs) have demonstrated strong capabilities in code generation, yet they remain prone to producing security vulnerabilities....

2 months ago cs.CR cs.AI cs.SE PDF

Attack HIGH

AgentSentry: Mitigating Indirect Prompt Injection in LLM Agents via Temporal Causal Diagnostics and Context Purification

Tian Zhang, Yiwei Xu, Juan Wang +8 more

Large language model (LLM) agents increasingly rely on external tools and retrieval systems to autonomously complete complex tasks. However, this...

2 months ago cs.CR cs.AI PDF

Attack HIGH

Reverse CAPTCHA: Evaluating LLM Susceptibility to Invisible Unicode Instruction Injection

Marcus Graves

We introduce Reverse CAPTCHA, an evaluation framework that tests whether large language models follow invisible Unicode-encoded instructions embedded...

2 months ago cs.CR cs.AI PDF

Attack HIGH

Systems-Level Attack Surface of Edge Agent Deployments on IoT

Zhonghao Zhan, Krinos Li, Yefan Zhang +1 more

Edge deployment of LLM agents on IoT hardware introduces attack surfaces absent from cloud-hosted orchestration. We present an empirical security...

2 months ago cs.CR PDF

Benchmark LOW

Beyond performance-wise Contribution Evaluation in Federated Learning

Balazs Pejo

Federated learning offers a privacy-friendly collaborative learning framework, yet its success, like any joint venture, hinges on the contributions...

2 months ago cs.LG cs.CR PDF
