AI Security Research

2,583+ academic papers on AI security, attacks, and defenses

Total: 2,583
Attack: 994
Benchmark: 740
Defense: 355
Tool: 275
Survey: 146

Showing 581–600 of 1,930 papers

Attack HIGH

Structured Semantic Cloaking for Jailbreak Attacks on Large Language Models

Xiaobing Sun, Perry Lam, Shaohua Li +4 more

Modern LLMs employ safety mechanisms that extend beyond surface-level input filtering to latent semantic representations and generation-time...

1 month ago cs.CL PDF

Tool MEDIUM

SIA: A Synthesize-Inject-Align Framework for Knowledge-Grounded and Secure E-commerce Search LLMs with Industrial Deployment

Zhouwei Zhai, Mengxiang Chen, Anmeng Zhang

Large language models offer transformative potential for e-commerce search by enabling intent-aware recommendations. However, their industrial...

1 month ago cs.CL PDF

Tool LOW

From Workflow Automation to Capability Closure: A Formal Framework for Safe and Revenue-Aware Customer Service AI

Cosimo Spera

Customer service automation is undergoing a structural transformation. The dominant paradigm is shifting from scripted chatbots and single-agent...

1 month ago cs.AI PDF

Attack MEDIUM

Do Not Leave a Gap: Hallucination-Free Object Concealment in Vision-Language Models

Amira Guesmi, Muhammad Shafique

Vision-language models (VLMs) have recently shown remarkable capabilities in visual understanding and generation, but remain vulnerable to...

1 month ago cs.CR cs.CV PDF

Defense LOW

The Internet of Physical AI Agents: Interoperability, Longevity, and the Cost of Getting It Wrong

Roberto Morabito, Mallik Tatipamula

The Internet has evolved by progressively expanding what humanity connects: first computers, then people, and later billions of devices through the...

1 month ago cs.NI cs.AI PDF

Defense MEDIUM

Evolving Contextual Safety in Multi-Modal Large Language Models via Inference-Time Self-Reflective Memory

Ce Zhang, Jinxi He, Junyi He +2 more

Multi-modal Large Language Models (MLLMs) have achieved remarkable performance across a wide range of visual reasoning tasks, yet their vulnerability...

2 months ago cs.CV cs.CL cs.CR PDF

Benchmark LOW

Mechanistic Origin of Moral Indifference in Language Models

Lingyu Li, Yan Teng, Yingchun Wang

Existing behavioral alignment techniques for Large Language Models (LLMs) often neglect the discrepancy between surface compliance and internal...

2 months ago cs.CL cs.AI PDF

Tool HIGH

ClawWorm: Self-Propagating Attacks Across LLM Agent Ecosystems

Yihao Zhang, Zeming Wei, Xiaokun Luan +7 more

Autonomous LLM-based agents increasingly operate as long-running processes forming densely interconnected multi-agent ecosystems, whose security...

2 months ago cs.CR cs.AI cs.LG PDF

Benchmark LOW

Context-Length Robustness in Question Answering Models: A Comparative Empirical Study

Trishita Dhara, Siddhesh Sheth

Large language models are increasingly deployed in settings where relevant information is embedded within long and noisy contexts. Despite this,...

2 months ago cs.AI PDF

Defense MEDIUM

Are Dilemmas and Conflicts in LLM Alignment Solvable? A View from Priority Graph

Zhenheng Tang, Xiang Liu, Qian Wang +3 more

As Large Language Models (LLMs) become more powerful and autonomous, they increasingly face conflicts and dilemmas in many scenarios. We first...

2 months ago cs.AI cs.CY PDF

Benchmark LOW

Evasive Intelligence: Lessons from Malware Analysis for Evaluating AI Agents

Simone Aonzo, Merve Sahin, Aurélien Francillon +1 more

Artificial intelligence (AI) systems are increasingly adopted as tool-using agents that can plan, observe their environment, and take actions over...

2 months ago cs.CR cs.AI PDF

Benchmark LOW

CLAG: Adaptive Memory Organization via Agent-Driven Clustering for Small Language Model Agents

Taeyun Roh, Wonjune Jang, Junha Jung +1 more

Large language model agents heavily rely on external memory to support knowledge reuse and complex reasoning tasks. Yet most memory systems store...

2 months ago cs.CL cs.AI PDF

Defense MEDIUM

Amplification Effects in Test-Time Reinforcement Learning: Safety and Reasoning Vulnerabilities

Vanshaj Khattar, Md Rafi ur Rashid, Moumita Choudhury +4 more

Test-time training (TTT) has recently emerged as a promising method to improve the reasoning abilities of large language models (LLMs), in which the...

2 months ago cs.LG cs.AI cs.CL PDF

Survey MEDIUM

TrinityGuard: A Unified Framework for Safeguarding Multi-Agent Systems

Kai Wang, Biaojie Zeng, Zeming Wei +7 more

With the rapid development of LLM-based multi-agent systems (MAS), their significant safety and security concerns have emerged, which introduce novel...

2 months ago cs.CR cs.AI cs.CL PDF

Benchmark MEDIUM

SFCoT: Safer Chain-of-Thought via Active Safety Evaluation and Calibration

Yu Pan, Wenlong Yu, Tiejun Wu +4 more

Large language models (LLMs) have demonstrated remarkable capabilities in complex reasoning tasks. However, they remain highly susceptible to...

2 months ago cs.CR cs.AI PDF

Attack HIGH

How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition

Mateusz Dziemian, Maxwell Lin, Xiaohan Fu +28 more

LLM-based agents are increasingly deployed in high-stakes settings where they process external data sources such as emails, documents, and code...

2 months ago cs.CR cs.AI PDF

Benchmark MEDIUM

Directional Embedding Smoothing for Robust Vision Language Models

Ye Wang, Jing Liu, Toshiaki Koike-Akino

The safety and reliability of vision-language models (VLMs) are a crucial part of deploying trustworthy agentic AI systems. However, VLMs remain...

2 months ago cs.LG cs.AI cs.CL PDF

Benchmark MEDIUM

SCAN: Sparse Circuit Anchor Interpretable Neuron for Lifelong Knowledge Editing

Yuhuan Liu, Haitian Zhong, Xinyuan Xia +3 more

Large Language Models (LLMs) often suffer from catastrophic forgetting and collapse during sequential knowledge editing. This vulnerability stems...

2 months ago cs.AI PDF

Attack HIGH

From Storage to Steering: Memory Control Flow Attacks on LLM Agents

Zhenlin Xu, Xiaogang Zhu, Yu Yao +2 more

Modern agentic systems allow Large Language Model (LLM) agents to tackle complex tasks through extensive tool usage, forming structured control flows...

2 months ago cs.CR PDF
