Benchmark HIGH
Siddharth Srikanth, Freddie Liang, Sophie Hsu +9 more
Vision-Language-Action (VLA) models have significant potential to enable general-purpose robotic systems for a range of vision-language tasks....
1 week ago cs.RO cs.AI cs.CL
PDF
Benchmark MEDIUM
Ninghui Li, Kaiyuan Zhang, Kyle Polley +1 more
This article, a lightly adapted version of Perplexity's response to NIST/CAISI Request for Information 2025-0035, details our observations and...
1 week ago cs.LG cs.AI cs.CR
PDF
Benchmark LOW
Chiyuan He, Zihuan Qiu, Fanman Meng +4 more
Continual learning of pretrained vision-language models (VLMs) is prone to catastrophic forgetting, yet current approaches adapt to new tasks without...
1 week ago cs.CV cs.LG
PDF
Benchmark MEDIUM
Junjie Chu, Yiting Qu, Ye Leng +4 more
Large Language Models (LLMs) are increasingly trained to align with human values, primarily focusing on the task level, i.e., refusing to execute...
1 week ago cs.CR cs.AI
PDF
Benchmark MEDIUM
Qizhi Chen, Chao Qi, Yihong Huang +5 more
Graph-based Retrieval-Augmented Generation (GraphRAG) constructs the Knowledge Graph (KG) from external databases to enhance the timeliness and...
1 week ago cs.LG cs.AI cs.CR
PDF
Benchmark LOW
Yan Tan, Xiangchen Meng, Zijun Jiang +1 more
Large language models (LLMs) have demonstrated impressive capabilities in generating software code for high-level programming languages such as...
1 week ago cs.PL cs.AR
PDF
Benchmark LOW
Seung hee Choi, MinJu Jeon, Hyunwoo Oh +2 more
Existing retrieval-augmented approaches for Dense Video Captioning (DVC) often fail to achieve accurate temporal segmentation aligned with true event...
Benchmark LOW
Maximilian Wendlinger, Daniel Kowatsch, Konstantin Böttinger +1 more
Large Language Models (LLMs) show remarkable capabilities in understanding natural language and generating complex code. However, as practitioners...
1 week ago cs.CR cs.LG
PDF
Benchmark MEDIUM
Marc Damie, Murat Bilgehan Ertan, Domenico Essoussi +3 more
With their increasing capabilities, Large Language Models (LLMs) are now used across many industries. They have become useful tools for software...
2 weeks ago cs.LG cs.CL cs.CR
PDF
Benchmark MEDIUM
Chuan Guo, Juan Felipe Ceron Uribe, Sicheng Zhu +10 more
Instruction hierarchy (IH) defines how LLMs prioritize system, developer, user, and tool instructions under conflict, providing a concrete,...
2 weeks ago cs.AI cs.CL cs.CR
PDF
Benchmark MEDIUM
Manit Baser, Alperen Yildiz, Dinil Mon Divakaran +1 more
The static knowledge representations of large language models (LLMs) inevitably become outdated or incorrect over time. While model-editing...
Benchmark MEDIUM
Amir Al-Maamari
Large Language Models (LLMs) show promise for Automated Program Repair (APR), yet their effectiveness on security vulnerabilities remains poorly...
2 weeks ago cs.CR cs.AI
PDF
Benchmark LOW
Zhishu Liu, Kaishen Yuan, Bo Zhao +2 more
Micro-expression Action Unit (AU) detection identifies localized AUs from subtle facial muscle activations, providing a foundation for decoding...
Benchmark MEDIUM
Chenxi Li, Xianggan Liu, Dake Shen +9 more
Despite the rapid progress of Large Vision-Language Models (LVLMs), the integration of visual modalities introduces new safety vulnerabilities that...
2 weeks ago cs.CV cs.LG
PDF
Benchmark MEDIUM
Yige Li, Wei Zhao, Zhe Li +6 more
Backdoor mechanisms have traditionally been studied as security threats that compromise the integrity of machine learning models. However, the same...
2 weeks ago cs.CR cs.AI
PDF
Benchmark MEDIUM
Yuxu Ge
Autonomous agents powered by large language models introduce a class of execution-layer vulnerabilities -- prompt injection, retrieval poisoning, and...
2 weeks ago cs.CR cs.AI
PDF
Benchmark HIGH
Zheng Yu, Wenxuan Shi, Xinqian Sun +3 more
Automated Vulnerability Repair (AVR) systems, especially those leveraging large language models (LLMs), have demonstrated promising results in...
Benchmark LOW
Yanbang Sun, Quan Luo, Yuelin Wang +6 more
Network protocols are the foundation of modern communication, yet their implementations often contain semantic vulnerabilities stemming from...
2 weeks ago cs.CR cs.CY
PDF