Benchmark LOW
Seung hee Choi, MinJu Jeon, Hyunwoo Oh +2 more
Existing retrieval-augmented approaches for Dense Video Captioning (DVC) often fail to achieve accurate temporal segmentation aligned with true event...
Benchmark LOW
Maximilian Wendlinger, Daniel Kowatsch, Konstantin Böttinger +1 more
Large Language Models (LLMs) show remarkable capabilities in understanding natural language and generating complex code. However, as practitioners...
2 months ago cs.CR cs.LG
Benchmark MEDIUM
Marc Damie, Murat Bilgehan Ertan, Domenico Essoussi +3 more
With their increasing capabilities, Large Language Models (LLMs) are now used across many industries. They have become useful tools for software...
2 months ago cs.LG cs.CL cs.CR
Benchmark MEDIUM
Chuan Guo, Juan Felipe Ceron Uribe, Sicheng Zhu +10 more
Instruction hierarchy (IH) defines how LLMs prioritize system, developer, user, and tool instructions under conflict, providing a concrete,...
2 months ago cs.AI cs.CL cs.CR
Benchmark MEDIUM
Manit Baser, Alperen Yildiz, Dinil Mon Divakaran +1 more
The static knowledge representations of large language models (LLMs) inevitably become outdated or incorrect over time. While model-editing...
Benchmark MEDIUM
Amir Al-Maamari
Large Language Models (LLMs) show promise for Automated Program Repair (APR), yet their effectiveness on security vulnerabilities remains poorly...
2 months ago cs.CR cs.AI
Benchmark LOW
Zhishu Liu, Kaishen Yuan, Bo Zhao +2 more
Micro-expression Action Unit (AU) detection identifies localized AUs from subtle facial muscle activations, providing a foundation for decoding...
Benchmark MEDIUM
Chenxi Li, Xianggan Liu, Dake Shen +9 more
Despite the rapid progress of Large Vision-Language Models (LVLMs), the integration of visual modalities introduces new safety vulnerabilities that...
2 months ago cs.CV cs.LG
Benchmark MEDIUM
Yige Li, Wei Zhao, Zhe Li +6 more
Backdoor mechanisms have traditionally been studied as security threats that compromise the integrity of machine learning models. However, the same...
2 months ago cs.CR cs.AI
Benchmark MEDIUM
Yuxu Ge
Autonomous agents powered by large language models introduce a class of execution-layer vulnerabilities -- prompt injection, retrieval poisoning, and...
2 months ago cs.CR cs.AI
Benchmark HIGH
Zheng Yu, Wenxuan Shi, Xinqian Sun +3 more
Automated Vulnerability Repair (AVR) systems, especially those leveraging large language models (LLMs), have demonstrated promising results in...
Benchmark LOW
Yanbang Sun, Quan Luo, Yuelin Wang +6 more
Network protocols are the foundation of modern communication, yet their implementations often contain semantic vulnerabilities stemming from...
2 months ago cs.CR cs.CY
Benchmark LOW
Amirpasha Mozaffari, Amanda Duarte, Lina Teckentrup +8 more
The rapid adoption of AI in Earth system science promises unprecedented speed and fidelity in the generation of climate information. However, this...
2 months ago physics.ao-ph cs.AI cs.LG
Benchmark MEDIUM
Xiaoguang Li, Hanyi Wang, Yaowei Huang +6 more
Shuffler-based differential privacy (shuffle-DP) is a privacy paradigm providing high utility by involving a shuffler to permute noisy reports from...
Benchmark MEDIUM
Yuchen Shi, Huajie Chen, Heng Xu +6 more
Transfer learning is devised to leverage knowledge from pre-trained models to solve new tasks with limited data and computational resources....
2 months ago cs.CR cs.LG
Benchmark MEDIUM
Kelly L Vomo-Donfack, Adryel Hoszu, Grégory Ginot +1 more
Federated learning (FL) faces two structural tensions: gradient sharing enables data-reconstruction attacks, while non-IID client distributions...
2 months ago cs.LG cs.CR cs.DC
Benchmark MEDIUM
Jiaxun Guo, Ziyuan Yang, Mengyu Sun +3 more
The rapid adoption of Large Language Models (LLMs) has transformed modern software development by enabling automated code generation at scale. While...
2 months ago cs.SE cs.CL
Benchmark MEDIUM
Maheep Chaudhary
Humans often become more self-aware under threat, yet can lose self-awareness when absorbed in a task; we hypothesize that language models exhibit...
2 months ago cs.AI cs.CL cs.LG
Benchmark MEDIUM
Aradhye Agarwal, Gurdit Siyan, Yash Pandya +3 more
Agentic language models operate in a fundamentally different safety regime than chat models: they must plan, call tools, and execute long-horizon...