AI Agents May Always Fall for Prompt Injections
Sahar Abdelnabi, Eugene Bagdasarian
Prompt injection is the most critical vulnerability in deployed AI agents. Despite recent progress, we show that the prevailing defense paradigm...
AI Threat Alert indexes 3,023+ peer-reviewed and preprint papers on AI/ML security — covering adversarial attacks, model defenses, red-teaming benchmarks, surveys, and security tooling. Papers are sourced from arXiv, classified by type and by relevance to real-world threats, and cross-referenced with the CVEs and incidents they relate to.
Udari Madhushani Sehwag, Zhengyang Shan, Heming Liu +3 more
Clarification-seeking behavior is widely regarded as a desirable property of LLM agents, enabling them to resolve ambiguity before acting on...
Rui Wen, Mark Russinovich, Andrew Paverd +2 more
Backdoor attacks pose a serious security threat to large language models (LLMs), which are increasingly deployed as general-purpose assistants in...
Tri Cao, Yulin Chen, Hieu Cao +8 more
Web agents can autonomously complete online tasks by interacting with websites, but their exposure to open web environments makes them vulnerable to...
Yi Wang, Hongye Qiu, Yue Xu +4 more
Large Language Models (LLMs) and Vision Language Models (VLMs) have demonstrated impressive capabilities but remain vulnerable to jailbreaking...
Chenyi Wang, Ruoyu Song, Raymond Muller +5 more
Autonomous vehicles depend on online HD map construction to perceive lane boundaries, dividers, and pedestrian crossings -- safety-critical road...
Narek Maloyan, Dmitry Namiot
Always-on AI agents (OpenClaw, Hermes Agent) run as a single persistent process under the owner's identity, folding messaging, memory, self-authored...
Xiaozhe Zhang, Chaozhuo Li, Hui Liu +4 more
Large language models remain vulnerable to adversarial prompts that elicit harmful outputs. Existing safety paradigms typically couple red-teaming...
Shuqiang Wang, Wei Cao, Jiaqi Weng +4 more
Large Reasoning Models (LRMs) are increasingly integrated into systems requiring reliable multi-step inference, yet this growing dependence exposes...
Ying Li, Hongbo Wen, Yanju Chen +3 more
LLM-powered agents can silently delete documents, leak credentials, or transfer funds on a routine user request, not because the agent was attacked,...
Zvi Topol
Large language models (LLMs) are increasingly deployed in a wide range of applications, yet remain vulnerable to adversarial jailbreak attacks that...
Buyun Liang, Jinqi Luo, Liangzu Peng +6 more
Large language models (LLMs) achieve strong performance across many tasks but remain vulnerable to hallucinations, motivating the need for realistic...
Matthew D. Laws, Alina Oprea, Cristina Nita-Rotaru
Agentic AI governance is a critical component of agentic AI infrastructure ensuring that agents follow their owner's communication and interaction...
Chang Jin, An Wang, Zeming Wei +7 more
Reusable skills are becoming a common interface for extending large language model agents, packaging procedural guidance with access to files, tools,...
Zhaojiacheng Zhou
Agent skills extend LLM agents with reusable instructions, tool interfaces, and executable code, and users increasingly install third-party skills...
Chia-Pei Chen, Kentaroh Toyoda +2 more
Web-browsing AI agents are increasingly deployed in enterprise settings under strict whitelists of approved domains, yet adversaries can still...
Cristian Morasso, Anisa Halimi, Muhammad Zaid Hameed +1 more
Automated red-teaming for LLMs often discovers narrow attack slices, missing diverse real-world threats, and yielding insufficient data for safety...
Zeguan Xiao, Xuanzhe Xu, Yun Chen +4 more
Large language model (LLM) unlearning aims to remove specific data influences from a pre-trained model without costly retraining, addressing privacy,...
Ziyu Liu, Tao Li, Tianjie Ni +5 more
Backdoor vulnerabilities widely exist in the fine-tuning of large language models (LLMs). Most backdoor poisoning methods operate mainly at the token...
AI security research studies how AI and machine-learning systems can be attacked and defended — covering adversarial examples, prompt injection, model poisoning, training-data extraction, and the mitigations against them. AI Threat Alert curates this research from academic sources so security teams can track the threats behind emerging AI risks.
AI Threat Alert indexes 3,023+ papers on AI/ML security, classified across attack, defense, benchmark, survey, and tool categories and updated continuously.
Coverage spans adversarial attacks, model and system defenses, red-teaming benchmarks, literature surveys, and security tooling for LLMs, ML libraries, AI agents, and inference pipelines.
Every paper is filtered for AI security relevance and linked to the vulnerabilities, vendors, and incidents it relates to, so the research connects directly to operational threat intelligence.