When Skills Lie: Hidden-Comment Injection in LLM Agents
Qianli Wang, Boyang Ma, Minghui Xu +1 more
LLM agents often rely on Skills to describe available tools and recommended procedures. We study a hidden-comment prompt injection risk in this...
2,583+ academic papers on AI security, attacks, and defenses
Showing 321–340 of 890 papers
Clear filtersQianli Wang, Boyang Ma, Minghui Xu +1 more
LLM agents often rely on Skills to describe available tools and recommended procedures. We study a hidden-comment prompt injection risk in this...
Peiran Wang, Xinfeng Li, Chong Xiang +5 more
The evolution of Large Language Models (LLMs) has resulted in a paradigm shift towards autonomous agents, necessitating robust security against...
Tri Nguyen, Huy Hoang Bao Le, Lohith Srikanth Pentapalli +2 more
Detecting jailbreak attempts in clinical training large language models (LLMs) requires accurate modeling of linguistic deviations that signal unsafe...
Adriana Alvarado Garcia, Ruyuan Wan, Ozioma C. Oguine +1 more
Recently, red teaming, with roots in security, has become a key evaluative approach to ensure the safety and reliability of Generative Artificial...
George Tsigkourakos, Constantinos Patsakis
Static Application Security Testing (SAST) tools are integral to modern DevSecOps pipelines, yet tools like CodeQL, Semgrep, and SonarQube remain...
Hayfa Dhabhi, Kashyap Thimmaraju
Large Language Models (LLMs) deploy safety mechanisms to prevent harmful outputs, yet these defenses remain vulnerable to adversarial prompts. While...
Chaeyun Kim, YongTaek Lim, Kihyun Kim +2 more
Existing red-teaming benchmarks, when adapted to new languages via direct translation, fail to capture socio-technical vulnerabilities rooted in...
Georgios Syros, Evan Rose, Brian Grinstead +4 more
Large language model (LLM) based web agents are increasingly deployed to automate complex online tasks by directly interacting with web sites and...
Kotekar Annapoorna Prabhu, Andrew Gan, Zahra Ghodsi
Machine learning relies on randomness as a fundamental component in various steps such as data sampling, data augmentation, weight initialization,...
Yu Yan, Sheng Sun, Shengjia Cheng +3 more
Vision-Language Models (VLMs) with multimodal reasoning capabilities are high-value attack targets, given their potential for handling complex...
Suraj Ranganath, Atharv Ramesh
AI-text detectors face a critical robustness challenge: adversarial paraphrasing attacks that preserve semantics while evading detection. We...
Suraj Ranganath, Atharv Ramesh
AI-text detectors face a critical robustness challenge: adversarial paraphrasing attacks that preserve semantics while evading detection. We...
Jona te Lintelo, Lichao Wu, Stjepan Picek
The rapid adoption of Mixture-of-Experts (MoE) architectures marks a major shift in the deployment of Large Language Models (LLMs). MoE LLMs improve...
Yanzhang Fu, Zizheng Guo, Jizhou Luo
Score-based query attacks pose a serious threat to deep learning models by crafting adversarial examples (AEs) using only black-box access to model...
Scott Thornton
Hybrid Retrieval-Augmented Generation (RAG) pipelines combine vector similarity search with knowledge graph expansion for multi-hop reasoning. We...
Yuhang Wang, Feiming Xu, Zheng Lin +6 more
Although large language model (LLM)-based agents, exemplified by OpenClaw, are increasingly evolving from task-oriented systems into personalized AI...
Xiaoxu Peng, Dong Zhou, Jianwen Zhang +3 more
Vision Language Models (VLMs) have advanced perception in autonomous driving (AD), but they remain vulnerable to adversarial threats. These risks...
Sahar Zargarzadeh, Mohammad Islam
The Internet of Things (IoT) has revolutionized connectivity by linking billions of devices worldwide. However, this rapid expansion has also...
Md Rafi Ur Rashid, MD Sadik Hossain Shanto, Vishnu Asutosh Dasu +1 more
Vision-Language Models (VLMs) are now a core part of modern AI. Recent work proposed several visual jailbreak attacks using single/ holistic images....
Nanda Rani, Kimberly Milner, Minghao Shao +9 more
Real-world offensive security operations are inherently open-ended: attackers explore unknown attack surfaces, revise hypotheses under uncertainty,...
Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.
Start 14-Day Free Trial