Attack HIGH
Hanzhi Liu, Chaofan Shou, Hongbo Wen +3 more
Large language model (LLM) agents increasingly rely on third-party API routers to dispatch tool-calling requests across multiple upstream providers....
Survey HIGH
Yuming Xu, Mingtao Zhang, Zhuohan Ge +5 more
Retrieval-augmented generation (RAG) significantly enhances large language models (LLMs) but introduces novel security risks through external...
1 month ago cs.CR cs.AI
Attack HIGH
Wenpeng Xing, Moran Fang, Guangtai Wang +2 more
While Large Language Models (LLMs) have achieved remarkable performance, they remain vulnerable to jailbreak attacks that circumvent safety...
Attack HIGH
Wenkui Yang, Chao Jin, Haisu Zhu +7 more
Existing red-teaming studies on GUI agents have important limitations. Adversarial perturbations typically require white-box access, which is...
1 month ago cs.CR cs.CL cs.CV
Attack HIGH
Cheng Liu, Xiaolei Liu, Xingyu Li +2 more
Existing jailbreak defense paradigms primarily rely on static detection of prompts, outputs, or internal states, often neglecting the dynamic...
1 month ago cs.CR cs.AI
Other HIGH
Luat Do, Jiao Yin, Jinli Cao +1 more
Software vulnerabilities continue to pose significant threats to modern information systems, requiring a timely and accurate risk assessment. Public...
1 month ago cs.CR cs.DB
Attack HIGH
Zhiheng Li, Zongyang Ma, Yuntong Pan +8 more
Multimodal Large Language Models (MLLMs) are increasingly being deployed as automated content moderators. Within this landscape, we uncover a...
Attack HIGH
Ziye Wang, Guanyu Wang, Kailong Wang
Retrieval-Augmented Generation (RAG) significantly enhances Large Language Models (LLMs), but simultaneously exposes a critical vulnerability to...
Attack HIGH
Yizhe Zeng, Wei Zhang, Yunpeng Li +3 more
While Chain-of-Thought (CoT) prompting has become a standard paradigm for eliciting complex reasoning capabilities in Large Language Models, it...
Defense HIGH
Zi Liang, Qipeng Xie, Jun He +7 more
Recent advancements in Large Language Models (LLMs) have sparked interest in their application to Static Application Security Testing (SAST),...
1 month ago cs.CR cs.CL cs.SE
Benchmark HIGH
Phan The Duy, Nguyen Viet Duy, Khoa Ngo-Khanh +2 more
While recent approaches leverage large language models (LLMs) and multi-agent pipelines to automatically generate proof-of-concept (PoC) exploits...
Attack HIGH
Adrian Shuai Li, Md Ajwad Akil, Elisa Bertino
Concept drift and adversarial evasion are two major challenges for deploying machine learning-based malware detectors. While both have been studied...
Attack HIGH
Manish Bhatt, Sarthak Munshi, Vineeth Sai Narajala +4 more
We prove that no continuous, utility-preserving wrapper defense-a function $D: X\to X$ that preprocesses inputs before the model sees them-can make...
1 month ago cs.CR cs.AI
Attack HIGH
Zonghao Ying, Haowen Dai, Lianyu Hu +5 more
Modern text-to-image (T2I) models can now render legible, paragraph-length text, enabling a fundamentally new class of misuse. We identify and...
Attack HIGH
Yiyang Zhang, Chaojian Yu, Ziming Hong +4 more
Multimodal pretrained models are vulnerable to backdoor attacks, yet most existing methods rely on visual or multimodal triggers, which are...
1 month ago cs.CR cs.LG
Benchmark HIGH
Baoshun Tong, Haoran He, Ling Pan +2 more
Vision-Language-Action (VLA) models have achieved remarkable success in robotic manipulation. However, their robustness to linguistic nuances remains...
1 month ago cs.RO cs.CV
Other HIGH
Yanxu Mao, Peipei Liu, Tiehan Cui +3 more
With the widespread application of LLM-based agents across various domains, their complexity has introduced new security threats. Existing red-team...
Attack HIGH
Qingyang Xu, Yaling Shen, Stephanie Fong +7 more
The increasing use of large language models (LLMs) in mental healthcare raises safety concerns in high-stakes therapeutic interactions. A key...