Attack MEDIUM
Yizhe Xie, Congcong Zhu, Xinyue Zhang +5 more
Large Language Model-based Multi-Agent Systems (LLM-MAS) are increasingly applied to complex collaborative scenarios. However, their collaborative...
2 months ago cs.MA cs.AI
PDF
Attack MEDIUM
Achyutha Menon, Magnus Saebo, Tyler Crosse +3 more
The accelerating adoption of language models (LMs) as agents for deployment in long-context tasks motivates a thorough understanding of goal drift:...
Attack MEDIUM
Edouard Lansiaux
Federated Learning (FL) enables collaborative training of medical AI models across hospitals without centralizing patient data. However, the exchange...
2 months ago cs.CR cs.AI
PDF
Attack MEDIUM
Shuyi Zhou, Zeen Song, Wenwen Qiang +4 more
Large Language Models remain vulnerable to adversarial prefix attacks (e.g., "Sure, here is") despite robust standard safety. We diagnose this...
Attack MEDIUM
Guoxin Shi, Haoyu Wang, Zaihui Yang +2 more
Adversarial behavior plays a central role in aligning large language models with human values. However, existing alignment methods largely rely on...
2 months ago cs.CR cs.AI
PDF
Attack MEDIUM
Martin Odersky, Yaoyu Zhao, Yichen Xu +2 more
AI agents that interact with the real world through tool calls pose fundamental safety challenges: agents might leak private information, cause...
2 months ago cs.AI cs.PL
PDF
Attack MEDIUM
Jingyuan Xie, Wenjie Wang, Ji Wu +1 more
Supervised fine-tuning (SFT) is essential for the development of medical large language models (LLMs), yet prior poisoning studies have mainly...
2 months ago cs.CR cs.AI cs.LG
PDF
Attack MEDIUM
Qianxun Xu, Chenxi Song, Yujun Cai +1 more
Recent advances in text-to-video diffusion models have enabled high-fidelity and temporally coherent video synthesis. However, current models are...
Attack MEDIUM
Idan Habler, Vineeth Sai Narajala, Stav Koren +2 more
Retrieval-Augmented Generation (RAG) systems are essential to contemporary AI applications, allowing large language models to obtain external...
2 months ago cs.CR cs.AI
PDF
Attack MEDIUM
Bruce W. Lee, Chen Yueh-Han, Tomek Korbak
Frontier AI agents may pursue hidden goals while concealing their pursuit from oversight. Alignment training aims to prevent such behavior by...
2 months ago cs.LG cs.AI
PDF
Attack MEDIUM
Sarthak Munshi, Manish Bhatt, Vineeth Sai Narajala +4 more
While prior work has focused on projecting adversarial examples back onto the manifold of natural data to restore safety, we argue that a...
2 months ago cs.LG cs.AI cs.CR
PDF
Attack MEDIUM
Inderjeet Singh, Vikas Pahuja, Aishvariya Priya Rathina Sabapathy +8 more
Current stateless defences for multimodal agentic RAG fail to detect adversarial strategies that distribute malicious semantics across retrieval,...
2 months ago cs.CR cs.AI cs.CL
PDF
Attack MEDIUM
Zac Garby, Andrew D. Gordon, David Sands
A conversation with a large language model (LLM) is a sequence of prompts and responses, with each response generated from the preceding...
2 months ago cs.PL cs.AI cs.CR
PDF
Attack MEDIUM
Natalie Shapira, Chris Wendler, Avery Yen +35 more
We report an exploratory red-teaming study of autonomous language-model-powered agents deployed in a live laboratory environment with persistent...
2 months ago cs.AI cs.CY
PDF
Attack MEDIUM
Xunzhuo Liu, Huamin Chen, Samzong Lu +27 more
As large language models (LLMs) diversify across modalities, capabilities, and cost profiles, the problem of intelligent request routing -- selecting...
2 months ago cs.NI cs.AI
PDF
Attack MEDIUM
Kaiwen Wang, Xiaolin Chang, Yuehan Dong +1 more
Secure comparison is a fundamental primitive in multi-party computation, supporting privacy-preserving applications such as machine learning and data...
Attack MEDIUM
Diego Soi, Silvia Lucia Sanna, Lorenzo Pisu +2 more
In recent years, stealthy Android malware has increasingly adopted sophisticated techniques to bypass automatic detection mechanisms and harden...
Attack MEDIUM
Justin Albrethsen, Yash Datta, Kunal Kumar +1 more
While Large Language Model (LLM) capabilities have scaled, safety guardrails remain largely stateless, treating multi-turn dialogues as a series of...
2 months ago cs.AI cs.ET cs.LG
PDF
Attack MEDIUM
Nils Palumbo, Sarthak Choudhary, Jihye Choi +2 more
LLM-based agents are increasingly being deployed in contexts requiring complex authorization policies: customer service protocols, approval...
2 months ago cs.CR cs.AI cs.MA
PDF