Benchmark MEDIUM
Mojtaba Eshghie, Gabriele Morello, Matteo Lauretano +2 more
Smart contract vulnerabilities cost billions of dollars annually, yet existing automated analysis tools fail to generate deployable defenses. We...
6 months ago cs.CR cs.SE
Benchmark MEDIUM
Christoph Bühler, Matteo Biagiola, Luca Di Grazia +1 more
Large Language Models (LLMs) have evolved into AI agents that interact with external tools and environments to perform complex tasks. The Model...
6 months ago cs.CR cs.AI cs.SE
Benchmark MEDIUM
Divyanshu Kumar, Nitin Aravind Birur, Tanay Baswa +2 more
Frontier Large Language Models (LLMs) pose unprecedented dual-use risks through the potential proliferation of chemical, biological, radiological,...
6 months ago cs.CR cs.AI
Benchmark LOW
Mohamed Seif, Malcolm Egan, Andrea J. Goldsmith +1 more
AI-based sensing at wireless edge devices has the potential to significantly enhance Artificial Intelligence (AI) applications, particularly for...
6 months ago cs.IT cs.CR cs.LG
Benchmark LOW
Zhenghao Xu, Qin Lu, Qingru Zhang +9 more
The reward model (RM) plays a pivotal role in reinforcement learning with human feedback (RLHF) for aligning large language models (LLMs). However,...
Benchmark HIGH
Euodia Dodd, Nataša Krčo, Igor Shilov +1 more
Membership inference attacks (MIAs) have emerged as the standard tool for evaluating the privacy risks of AI models. However, state-of-the-art...
6 months ago cs.LG cs.CR
Benchmark MEDIUM
Chengcan Wu, Zhixin Zhang, Mingqian Xu +2 more
Large Language Model (LLM)-based Multi-Agent Systems (MAS) have become a popular paradigm of AI applications. However, trustworthiness issues in MAS...
6 months ago cs.CR cs.AI cs.LG
Benchmark LOW
Joydeep Chandra, Satyam Kumar Navneet
Domestic AI agents face ethical, autonomy, and inclusion challenges, particularly for overlooked groups like children, elderly, and neurodivergent...
6 months ago cs.HC cs.AI cs.LG
Benchmark LOW
Sophia Xiao Pu, Sitao Cheng, Xin Eric Wang +1 more
Oversensitivity occurs when language models defensively reject prompts that are actually benign. This behavior not only disrupts user interactions...
Benchmark MEDIUM
Alexander Nemecek, Zebin Yun, Zahra Rahmani +4 more
As large language models (LLMs) become progressively more embedded in clinical decision-support, documentation, and patient-information systems,...
6 months ago cs.CR cs.AI
Benchmark MEDIUM
Marco Alecci, Jordan Samhi, Tegawendé F. Bissyandé +1 more
Mobile apps often embed authentication secrets, such as API keys, tokens, and client IDs, to integrate with cloud services. However, developers often...
6 months ago cs.CR cs.SE
Benchmark MEDIUM
Giovanni De Muri, Mark Vero, Robin Staab +1 more
LLMs are often used by downstream users as teacher models for knowledge distillation, compressing their capabilities into memory-efficient models....
6 months ago cs.LG cs.AI cs.CR
Benchmark HIGH
Osama Al Haddad, Muhammad Ikram, Ejaz Ahmed +1 more
Security analysts face increasing pressure to triage large and complex vulnerability backlogs. Large Language Models (LLMs) offer a potential aid by...
Benchmark LOW
Yasser Hamidullah, Koel Dutta Chowdhury, Yusser Al Ghussin +4 more
Hallucination, where models generate fluent text unsupported by visual evidence, remains a major flaw in vision-language models and is particularly...
Benchmark MEDIUM
Yixuan Liu, Xinlei Li, Yi Li
Phishing attacks in Web3 ecosystems are increasingly sophisticated, exploiting deceptive contract logic, malicious frontend scripts, and token...
Benchmark LOW
Lei Li, Xiao Zhou, Yingying Zhang +1 more
Medical question answering (QA) requires extensive access to domain-specific knowledge. A promising direction is to enhance large language models...
6 months ago cs.CL cs.AI
Benchmark LOW
Jiahao Shi, Tianyi Zhang
Despite recent advances, Large Language Models (LLMs) still generate vulnerable code. Retrieval-Augmented Generation (RAG) has the potential to...
6 months ago cs.CR cs.LG cs.SE
Benchmark HIGH
Pranshav Gajjar, Molham Khoja, Abiodun Ganiyu +4 more
The impending adoption of Open Radio Access Network (O-RAN) is fueling innovation in the RAN towards data-driven operation. Unlike traditional RAN...
6 months ago cs.CR cs.NI
Benchmark HIGH
Chengquan Guo, Yuzhou Nie, Chulin Xie +3 more
As large language models (LLMs) are increasingly used for code generation, concerns over the security risks have grown substantially. Early research...
Benchmark LOW
Navreet Kaur, Hoda Ayad, Hayoung Jung +3 more
Language model users often embed personal and social context in their questions. The asker's role -- implicit in how the question is framed --...
6 months ago cs.CL cs.AI cs.CY