Benchmark LOW
Chiyuan He, Zihuan Qiu, Fanman Meng +4 more
Continual learning of pretrained vision-language models (VLMs) is prone to catastrophic forgetting, yet current approaches adapt to new tasks without...
2 months ago cs.CV cs.LG
PDF
Tool HIGH
Sarbartha Banerjee, Prateek Sahu, Anjo Vahldiek-Oberwagner +2 more
Rapid progress in generative AI has given rise to Compound AI systems - pipelines composed of multiple large language models (LLMs), software tools...
2 months ago cs.CR cs.AI
PDF
Benchmark MEDIUM
Junjie Chu, Yiting Qu, Ye Leng +4 more
Large Language Models (LLMs) are increasingly trained to align with human values, primarily focusing on the task level, i.e., refusing to execute...
2 months ago cs.CR cs.AI
PDF
Survey LOW
Kele Xu, Yifan Wang, Ming Feng +5 more
Human-computer interaction has traditionally relied on the acoustic channel, a dependency that introduces systemic vulnerabilities to environmental...
Attack HIGH
J Alex Corll
Prompt injection defenses are often framed as semantic understanding problems and delegated to increasingly large neural detectors. For the first...
2 months ago cs.CR cs.AI
PDF
Survey LOW
Jiongchi Yu, Xiaolin Wen, Sizhe Cheng +3 more
Fuzz testing is one of the most effective techniques for detecting bugs and vulnerabilities in software. However, as the basis of fuzz testing,...
2 months ago cs.SE cs.HC
PDF
Tool MEDIUM
Frank Li
Tool-augmented LLM agents introduce security risks that extend beyond user-input filtering, including indirect prompt injection through fetched...
Tool LOW
Chingkwun Lam, Jiaxin Li, Lingfei Zhang +1 more
Long-term memory has emerged as a foundational component of autonomous Large Language Model (LLM) agents, enabling continuous adaptation, lifelong...
Defense MEDIUM
Xinhao Deng, Yixiang Zhang, Jiaqing Wu +15 more
Autonomous Large Language Model (LLM) agents, exemplified by OpenClaw, demonstrate remarkable capabilities in executing complex, long-horizon tasks....
2 months ago cs.CR cs.AI
PDF
Defense LOW
Lu Niu, Cheng Xue
Vision-language models offer strong few-shot capability through prompt tuning but remain vulnerable to noisy labels, which can corrupt prompts and...
Benchmark MEDIUM
Qizhi Chen, Chao Qi, Yihong Huang +5 more
Graph-based Retrieval-Augmented Generation (GraphRAG) constructs the Knowledge Graph (KG) from external databases to enhance the timeliness and...
2 months ago cs.LG cs.AI cs.CR
PDF
Benchmark LOW
Yan Tan, Xiangchen Meng, Zijun Jiang +1 more
Large language models (LLMs) have demonstrated impressive capabilities in generating software code for high-level programming languages such as...
2 months ago cs.PL cs.AR
PDF
Benchmark LOW
Seung hee Choi, MinJu Jeon, Hyunwoo Oh +2 more
Existing retrieval-augmented approaches for Dense Video Captioning (DVC) often fail to achieve accurate temporal segmentation aligned with true event...
Defense MEDIUM
Zhiyu Xue, Zimo Qi, Guangliang Liu +2 more
Safety alignment aims to ensure that large language models (LLMs) refuse harmful requests by post-training on harmful queries paired with refusal...
Attack HIGH
Indranil Halder, Annesya Banerjee, Cengiz Pehlevan
Adversarial attacks can reliably steer safety-aligned large language models toward unsafe behavior. Empirically, we find that adversarial...
2 months ago cs.LG cs.AI
PDF
Tool LOW
Raj Sanjay Shah, Jing Huang, Keerthiram Murugesan +2 more
Unlearning in Large Language Models (LLMs) aims to enhance safety, mitigate biases, and comply with legal mandates, such as the right to be...
Benchmark LOW
Maximilian Wendlinger, Daniel Kowatsch, Konstantin Böttinger +1 more
Large Language Models (LLMs) show remarkable capabilities in understanding natural language and generating complex code. However, as practitioners...
2 months ago cs.CR cs.LG
PDF
Tool HIGH
Xiangwen Wang, Ananth Balashankar, Varun Chandrasekaran
Large language models remain vulnerable to jailbreak attacks, yet we still lack a systematic understanding of how jailbreak success scales with...
2 months ago cs.LG cs.CR
PDF
Benchmark MEDIUM
Marc Damie, Murat Bilgehan Ertan, Domenico Essoussi +3 more
With their increasing capabilities, Large Language Models (LLMs) are now used across many industries. They have become useful tools for software...
2 months ago cs.LG cs.CL cs.CR
PDF