Benchmark MEDIUM
Jinhu Qi, Yifan Li, Minghao Zhao +4 more
As agentic AI systems move beyond static question answering into open-ended, tool-augmented, and multi-step real-world workflows, their increased...
2 months ago cs.CL cs.DB
PDF
Tool MEDIUM
Zhuoshang Wang, Yubing Ren, Yanan Cao +3 more
While watermarking serves as a critical mechanism for LLM provenance, existing secret-key schemes tightly couple detection with injection, requiring...
2 months ago cs.CR cs.CL
PDF
Defense MEDIUM
Yewon Han, Yumin Seol, EunGyung Kong +2 more
Existing jailbreak defence frameworks for Large Vision-Language Models often suffer from a safety-utility trade-off, where strengthening safety...
2 months ago cs.CV cs.AI
PDF
Attack MEDIUM
Ruyi Zhang, Heng Gao, Songlei Jian +2 more
Backdoor attacks compromise model reliability by using triggers to manipulate outputs. Trigger inversion can accurately locate these triggers via a...
2 months ago cs.CR cs.AI
PDF
Benchmark HIGH
Lidor Erez, Omer Hofman, Tamir Nizri +1 more
Automated LLM vulnerability scanners are increasingly used to assess security risks by measuring different attack type success rates (ASR). Yet the...
2 months ago cs.CR cs.PF
PDF
Benchmark LOW
Andrew Seohwan Yu, Mohsen Hariri, Kunio Nakamura +3 more
Vision language models (VLMs) have shown significant promise in visual grounding for images as well as videos. In medical imaging research, VLMs...
2 months ago cs.CV cs.LG
PDF
Defense LOW
Max Hellrigel-Holderbaum, Edward James Young
As AI systems advance in capabilities, measuring their safety and alignment to human values is becoming paramount. A fast-growing field of AI...
2 months ago cs.CY cs.AI cs.CL
PDF
Benchmark LOW
Xiaoya Lu, Yijin Zhou, Zeren Chen +6 more
Vision-Language Models (VLMs) empower embodied agents to execute complex instructions, yet they remain vulnerable to contextual safety risks where...
Defense MEDIUM
Suvadeep Hajra, Palash Nandi, Tanmoy Chakraborty
Safety tuning through supervised fine-tuning and reinforcement learning from human feedback has substantially improved the robustness of large...
Defense MEDIUM
Pengcheng Li, Jie Zhang, Tianwei Zhang +5 more
Safety alignment in large language models is typically evaluated under isolated queries, yet real-world use is inherently multi-turn. Although...
2 months ago cs.CR cs.AI
PDF
Tool MEDIUM
Ziling Zhou
AI agents dynamically acquire capabilities at runtime via MCP and A2A, yet no framework detects when capabilities change post-authorization. We term...
Tool MEDIUM
Ziling Zhou
AI agents dynamically acquire tools, orchestrate sub-agents, and transact across organizational boundaries, yet no existing security layer verifies...
Other LOW
Xiaowen Jiang, Andrei Semenov, Sebastian U. Stich
While spectral-based optimizers like Muon operate directly on the spectrum of updates, standard adaptive methods such as AdamW do not account for the...
2 months ago cs.LG math.OC
PDF
Attack HIGH
Maël Jenny, Jérémie Dentan, Sonia Vanier +1 more
Most jailbreak techniques for Large Language Models (LLMs) primarily rely on prompt modifications, including paraphrasing, obfuscation, or...
Attack HIGH
Chongxin Li, Hanzhang Wang, Lian Duan
Safety prompts constitute an interpretable layer of defense against jailbreak attacks in vision-language models (VLMs); however, their efficacy is...
Benchmark MEDIUM
Ivan Lopez, Selin S. Everett, Bryan J. Bunning +10 more
Large language models (LLMs) are entering clinician workflows, yet evaluations rarely measure how clinician reasoning shapes model behavior during...
2 months ago cs.HC cs.LG
PDF
Attack HIGH
Yiling Tao, Xinran Zheng, Shuo Yang +2 more
While large language model-based agents demonstrate great potential in collaborative tasks, their interactivity also introduces security...
Attack HIGH
Zijian Ling, Pingyi Hu, Xiuyong Gao +6 more
Speech-driven large language models (LLMs) are increasingly accessed through speech interfaces, introducing new security risks via open acoustic...
2 months ago cs.CR cs.AI cs.SD
PDF
Survey LOW
Zakia Zaman, Praveen Gauravaram, Mahbub Hassan +2 more
The rapid proliferation of the Internet of Things has intensified demand for robust privacy-preserving machine learning mechanisms to safeguard...
2 months ago cs.LG cs.CR
PDF