Benchmark MEDIUM
Antonio Colacicco, Vito Guida, Dario Di Palma +2 more
Large Language Models (LLMs) are increasingly applied in recommendation scenarios due to their strong natural language understanding and generation...
4 months ago cs.IR cs.AI cs.CL
Benchmark MEDIUM
Jinwei Hu, Xinmiao Huang, Youcheng Sun +2 more
As large language models (LLMs) transition to autonomous agents synthesizing real-time information, their reasoning capabilities introduce an...
4 months ago cs.CL cs.AI cs.MA
Benchmark MEDIUM
Junyu Liu, Zirui Li, Qian Niu +7 more
As Large Language Models (LLMs) are increasingly deployed in the healthcare field, it becomes essential to carefully evaluate their medical safety before...
4 months ago cs.CL cs.AI
Benchmark MEDIUM
Muntasir Adnan, Carlos C. N. Kuhn
Large Language Models have become integral to software development, yet they frequently generate vulnerable code. Existing code vulnerability...
4 months ago cs.SE cs.AI
Benchmark MEDIUM
Zhuoran Tan, Run Hao, Jeremy Singer +2 more
Tool-augmented LLM agents raise new security risks: tool executions can introduce runtime-only behaviors, including prompt injection and unintended...
4 months ago cs.CR cs.SE
Benchmark MEDIUM
Milad Rahmati, Nima Rahmati
The proliferation of Internet of Things devices in critical infrastructure has created unprecedented cybersecurity challenges, necessitating...
4 months ago cs.CR cs.LG
Benchmark MEDIUM
Muhammad Bilal, Omer Tariq, Hasan Ahmed
Timing and burst patterns can leak through encryption, and an adaptive adversary can exploit them. This undermines metadata-only detection in a...
4 months ago cs.CR cs.LG cs.NI
Benchmark MEDIUM
Yiming Liang, Yizhi Li, Yantao Du +14 more
Benchmarks play a crucial role in tracking the rapid advancement of large language models (LLMs) and identifying their capability boundaries....
4 months ago cs.CL cs.AI
Benchmark MEDIUM
Bohan Liang, Zijian Chen, Qi Jia +3 more
Stock prediction, a subject closely related to people's investment activities in fully dynamic and live environments, has been widely studied....
4 months ago q-fin.ST cs.LG
Benchmark MEDIUM
Muhammad Abdullahi Said, Muhammad Sammani Sani
As Large Language Models (LLMs) integrate into critical global infrastructure, the assumption that safety alignment transfers zero-shot from English...
4 months ago cs.CL cs.AI cs.CY
Benchmark MEDIUM
Zhe Huang, Hao Wen, Aiming Hao +6 more
Multimodal Large Language Models (MLLMs) have made remarkable progress in video understanding. However, they suffer from a critical vulnerability: an...
4 months ago cs.CV cs.AI
Benchmark MEDIUM
Heba Osama, Omar Elebiary, Youssef Qassim +4 more
Web applications increasingly face evasive and polymorphic attack payloads, yet traditional web application firewalls (WAFs) based on static rule...
Benchmark MEDIUM
Karolina Korgul, Yushi Yang, Arkadiusz Drohomirecki +7 more
Web-based agents powered by large language models are increasingly used for tasks such as email management or professional networking. Their reliance...
4 months ago cs.HC cs.AI cs.MA
Benchmark MEDIUM
Yifan Huang, Xiaojun Jia, Wenbo Guo +4 more
Large language models (LLMs) have revolutionized software development through AI-assisted coding tools, enabling developers with limited programming...
4 months ago cs.CR cs.AI cs.SE
Benchmark MEDIUM
Jiashuo Liu, Jiayun Wu, Chunjie Wu +5 more
The rapid proliferation of Large Language Models (LLMs) and diverse specialized benchmarks necessitates a shift from fragmented, task-specific...
4 months ago cs.LG cs.AI cs.PF
Benchmark MEDIUM
Adam Elaoumari
The purpose of this project is to assess how well defenders can detect DNS-over-HTTPS (DoH) file exfiltration, and which evasion strategies can be...
4 months ago cs.CR cs.AI cs.NI
Benchmark MEDIUM
Aaron Chan, Alex Ding, Frank Chen +3 more
The rapid integration of Large Language Models (LLMs) into decentralized physical infrastructure networks (DePIN) is currently bottlenecked by the...
Benchmark MEDIUM
Naseem Machlovi, Maryam Saleki, Ruhul Amin +5 more
As large language models (LLMs) become deeply embedded in daily life, the urgent need for safer moderation systems, distinguishing naive from...
4 months ago cs.CL cs.AI cs.HC
Benchmark MEDIUM
Naseem Machlovi, Maryam Saleki, Ruhul Amin +5 more
As large language models (LLMs) become deeply embedded in daily life, the urgent need for safer moderation systems that distinguish between naive and...
4 months ago cs.CL cs.AI cs.HC
Benchmark MEDIUM
Sumanth Bharadwaj Hachalli Karanam, Dhiwahar Adhithya Kennady
Manual software beta testing is costly and time-consuming, while single-agent large language model (LLM) approaches suffer from hallucinations and...
4 months ago cs.SE cs.AI cs.MA