Attack HIGH
Zhixin Xie, Xurui Song, Jun Luo
Despite substantial efforts in safety alignment, recent research indicates that Large Language Models (LLMs) remain highly susceptible to jailbreak...
Attack HIGH
Chinthana Wimalasuriya, Spyros Tragoudas
Adversarial attacks present a significant threat to modern machine learning systems. Yet, existing detection methods often lack the ability to detect...
5 months ago cs.CR cs.CV cs.LG
Attack HIGH
Zhaorun Chen, Xun Liu, Mintong Kang +4 more
As vision-language models (VLMs) gain prominence, their multimodal interfaces also introduce new safety vulnerabilities, making the safety evaluation...
5 months ago cs.AI cs.LG
Benchmark HIGH
Chengquan Guo, Chulin Xie, Yu Yang +6 more
Code agents have gained widespread adoption due to their strong code generation capabilities and integration with code interpreters, enabling dynamic...
Tool HIGH
Jonathan Sneh, Ruomei Yan, Jialin Yu +6 more
As LLMs increasingly power agents that interact with external tools, tool use has become an essential mechanism for extending their capabilities....
5 months ago cs.CR cs.AI
Attack HIGH
Ruohao Guo, Afshin Oroojlooy, Roshan Sridhar +3 more
Despite recent rapid progress in AI safety, current large language models remain vulnerable to adversarial attacks in multi-turn interaction...
5 months ago cs.LG cs.AI cs.CL
Attack HIGH
Kedong Xiu, Churui Zeng, Tianhang Zheng +6 more
Existing gradient-based jailbreak attacks typically optimize an adversarial suffix to induce a fixed affirmative response, e.g., "Sure, here...
5 months ago cs.CR cs.AI
Attack HIGH
Milad Nasr, Yanick Fratantonio, Luca Invernizzi +7 more
As deep learning models become widely deployed as components within larger production systems, their individual shortcomings can create system-level...
5 months ago cs.CR cs.LG
Attack HIGH
John Hawkins, Aditya Pramar, Rodney Beard +1 more
Large Language Models (LLMs) suffer from a range of vulnerabilities that allow malicious users to solicit undesirable responses through manipulation...
5 months ago cs.CL cs.AI cs.CY
Attack HIGH
Isha Gupta, Rylan Schaeffer, Joshua Kazdan +2 more
The field of adversarial robustness has long established that adversarial examples can successfully transfer between image classifiers and that text...
5 months ago cs.LG cs.AI
Tool HIGH
Shoumik Saha, Jifan Chen, Sam Mayers +3 more
Code-capable large language model (LLM) agents are increasingly embedded into software engineering workflows where they can read, write, and execute...
5 months ago cs.CR cs.AI
Benchmark HIGH
Yinuo Liu, Ruohan Xu, Xilong Wang +2 more
Multiple prompt injection attacks have been proposed against web agents. At the same time, various methods have been developed to detect general...
5 months ago cs.CR cs.AI cs.CL
Attack HIGH
Xiangfang Li, Yu Wang, Bo Li
With the rapid advancement of large language models (LLMs), ensuring their safe use becomes increasingly critical. Fine-tuning is a widely used...
Attack HIGH
Alexandrine Fortier, Thomas Thebaud, Jesús Villalba +2 more
Large Language Models (LLMs) and their multimodal extensions are becoming increasingly popular. One common approach to enable multimodality is to...
5 months ago cs.CL cs.CR cs.SD
Defense HIGH
Shojiro Yamabe, Jun Sakuma
Diffusion language models (DLMs) generate tokens in parallel through iterative denoising, which can reduce latency and enable bidirectional...
5 months ago cs.AI cs.LG
Attack HIGH
Raik Dankworth, Gesina Schwalbe
Deep neural networks (NNs) for computer vision are vulnerable to adversarial attacks, i.e., minuscule malicious changes to inputs may induce...
5 months ago cs.CR cs.LG
Attack HIGH
Chenxiang Luo, David K. Y. Yau, Qun Song
Federated learning (FL) enables collaborative model training without sharing raw data but is vulnerable to gradient inversion attacks (GIAs), where...
5 months ago cs.CR cs.LG
Benchmark HIGH
Haoran Xi, Minghao Shao, Brendan Dolan-Gavitt +2 more
Large language models show promise for vulnerability discovery, yet prevailing methods inspect code in isolation, struggle with long contexts, and...
5 months ago cs.SE cs.CR cs.LG
Attack HIGH
Qinjian Zhao, Jiaqi Wang, Zhiqiang Gao +3 more
Large Language Models (LLMs) have achieved impressive performance across diverse natural language processing tasks, but their growing power also...
Attack HIGH
Xiaobao Wang, Ruoxiao Sun, Yujun Zhang +4 more
Graph Neural Networks (GNNs) have demonstrated strong performance across tasks such as node classification, link prediction, and graph...
5 months ago cs.LG cs.CR