Attack MEDIUM
Shuyi Zhou, Zeen Song, Wenwen Qiang +4 more
Large Language Models remain vulnerable to adversarial prefix attacks (e.g., "Sure, here is") despite robust standard safety. We diagnose this...
Tool MEDIUM
Zixuan Xu, Tiancheng He, Huahui Yi +7 more
Vision-language models remain susceptible to multimodal jailbreaks and over-refusal because safety hinges on both visual evidence and user intent,...
Benchmark MEDIUM
Minseok Choi, Dongjin Kim, Seungbin Yang +5 more
With the growing deployment of large language models (LLMs) in real-world applications, establishing robust safety guardrails to moderate their...
Benchmark MEDIUM
Zhongxi Wang, Yueqian Lin, Jingyang Zhang +2 more
Safety evaluation and red-teaming of large language models remain predominantly text-centric, and existing frameworks lack the infrastructure to...
3 weeks ago cs.LG cs.CL cs.CV
Tool MEDIUM
Bhanu Pallakonda, Mikkel Hindsbo, Sina Ehsani +1 more
The proliferation of open-weight Large Language Models (LLMs) has democratized agentic AI, yet fine-tuned weights are frequently shared and adopted...
3 weeks ago cs.CR cs.AI
Survey MEDIUM
Tatiana Chakravorti, Pranav Narayanan Venkit, Sourojit Ghosh +1 more
Generative AI tools are increasingly entering academic peer review workflows, raising questions about fairness, accountability, and the legitimacy of...
3 weeks ago cs.CY cs.AI cs.HC
Attack MEDIUM
Guoxin Shi, Haoyu Wang, Zaihui Yang +2 more
Adversarial behavior plays a central role in aligning large language models with human values. However, existing alignment methods largely rely on...
3 weeks ago cs.CR cs.AI
Survey MEDIUM
Zhihang Deng, Jiaping Gui, Weinan Zhang
Large Language Models (LLMs) are increasingly deployed as agentic systems that plan, memorize, and act in open-world environments. This shift brings...
Benchmark MEDIUM
Yu Lin, Qizhi Zhang, Wenqiang Ruan +6 more
The rapid development of large language models (LLMs) has driven the widespread adoption of cloud-based LLM inference services, while also bringing...
3 weeks ago cs.CR cs.AI
Defense MEDIUM
Manisha Mukherjee, Vincent J. Hellendoorn
Large Language Models (LLMs) are increasingly deployed for code generation in high-stakes software development, yet their limited transparency in...
3 weeks ago cs.SE cs.AI cs.CR
Benchmark MEDIUM
Rahul Marchand, Art O Cathain, Jerome Wynne +5 more
Large language models (LLMs) increasingly act as autonomous agents, using tools to execute code, read and write files, and access networks, creating...
3 weeks ago cs.CR cs.AI
Tool MEDIUM
Qingxiao Xu, Ze Sheng, Zhicheng Chen +1 more
Large language models (LLMs) have shown promise for automated patching, but their effectiveness depends strongly on how they are integrated into...
3 weeks ago cs.CR cs.SE
Benchmark MEDIUM
Huajie Chen, Tianqing Zhu, Yuchen Zhong +7 more
Dataset distillation compresses a large real dataset into a small synthetic one, enabling models trained on the synthetic data to achieve performance...
3 weeks ago cs.CR cs.AI cs.LG
Attack MEDIUM
Martin Odersky, Yaoyu Zhao, Yichen Xu +2 more
AI agents that interact with the real world through tool calls pose fundamental safety challenges: agents might leak private information, cause...
3 weeks ago cs.AI cs.PL
Defense MEDIUM
Ming Wen, Kun Yang, Xin Chen +4 more
Multimodal Large Language Models (MLLMs) pose critical safety challenges, as they are susceptible not only to adversarial attacks such as...
3 weeks ago cs.LG cs.AI
Benchmark MEDIUM
Haodong Zhao, Jinming Hu, Zhaomin Wu +7 more
Federated Instruction Tuning (FIT) enables collaborative instruction tuning of large language models across multiple organizations (clients) in a...
Attack MEDIUM
Jingyuan Xie, Wenjie Wang, Ji Wu +1 more
Supervised fine-tuning (SFT) is essential for the development of medical large language models (LLMs), yet prior poisoning studies have mainly...
3 weeks ago cs.CR cs.AI cs.LG
Tool MEDIUM
Yijun Yu
Agentic AI systems exhibit numerous cross-cutting concerns -- security, observability, cost management, fault tolerance -- that are poorly modularized...
3 weeks ago cs.AI cs.SE
Defense MEDIUM
Chang Xue, Fang Liu, Jiaye Wang +2 more
Decentralized financial platforms rely heavily on Web of Trust reputation systems to mitigate counterparty risk in the absence of centralized...
3 weeks ago cs.CR cs.AI cs.LG
Benchmark MEDIUM
Om Tailor
Colluding language-model agents can hide coordination in messages that remain policy-compliant at the surface level. We present CLBC, a protocol...
3 weeks ago cs.CR cs.AI eess.SY