Benchmark MEDIUM
André V. Duarte, Xuying Li, Bin Zeng +3 more
If we cannot inspect the training data of a large language model (LLM), how can we ever know what it has seen? We believe the most compelling...
Survey MEDIUM
Robert A. Bridges, Thomas R. Mitchell, Mauricio Muñoz +1 more
The advent of Large Language Models (LLMs) promised to resolve the long-standing paradox in honeypot design, achieving high-fidelity deception with...
Benchmark MEDIUM
Simon Yu, Peilin Yu, Hongbo Zheng +3 more
We present VISAT, a novel open dataset and benchmarking suite for evaluating model robustness in the task of traffic sign recognition with the...
4 months ago cs.CR cs.AI cs.LG
Tool MEDIUM
Ken Huang, Kyriakos Rock Lambros, Jerry Huang +8 more
This paper introduces the Agentic AI Governance Assurance & Trust Engine (AAGATE), a Kubernetes-native control plane designed to address the unique...
4 months ago cs.CR cs.AI
Attack MEDIUM
Guangzhi Su, Shuchang Huang, Yutong Ke +3 more
Multimodal large language models (MLLMs) have achieved impressive performance across diverse tasks by jointly reasoning over textual and visual...
4 months ago cs.LG cs.CR
Benchmark MEDIUM
Zheng Zhang, Guanlong Wu, Sen Deng +2 more
In the rapidly expanding landscape of Large Language Model (LLM) applications, real-time output streaming has become the dominant interaction...
Benchmark MEDIUM
Juan Ren, Mark Dras, Usman Naseem
Agentic methods have emerged as a powerful and autonomous paradigm that enhances reasoning, collaboration, and adaptive control, enabling systems to...
Attack MEDIUM
Elizabeth Lin, Jonah Ghebremichael, William Enck +5 more
Software supply chains, while providing immense economic and software development value, are only as strong as their weakest link. Over the past...
Benchmark MEDIUM
Yifan Wu, Xuewei Feng, Yuxiang Yang +1 more
As the core of the Internet infrastructure, the TCP/IP protocol stack undertakes the task of network data transmission. However, due to the...
4 months ago cs.CR cs.NI
Benchmark MEDIUM
María Sanz-Gómez, Víctor Mayoral-Vilches, Francesco Balassone +3 more
Cybersecurity spans multiple interconnected domains, complicating the development of meaningful, labor-relevant benchmarks. Existing benchmarks...
Defense MEDIUM
Xingyu Zhu, Beier Zhu, Shuo Wang +2 more
Vision-language models (VLMs) such as CLIP demonstrate strong generalization in zero-shot classification but remain highly vulnerable to adversarial...
4 months ago cs.CV cs.MA
Benchmark MEDIUM
Vladyslav Larin, Ihor Naumenko, Aleksei Ivashov +2 more
As centralized AI hits compute ceilings and diminishing returns from ever-larger training runs, meeting demand requires an inference layer that...
4 months ago cs.LG cs.AI cs.CL
Other MEDIUM
Yifan Zhang, Xin Zhang
Directed greybox fuzzing (DGF) aims to efficiently trigger bugs at specific target locations by prioritizing seeds whose execution paths are more...
4 months ago cs.CR cs.PL cs.SE
Benchmark MEDIUM
Hiromu Takahashi, Shotaro Ishihara
We propose Fast-MIA (https://github.com/Nikkei/fast-mia), a Python library for efficiently evaluating membership inference attacks (MIA) against...
5 months ago cs.CR cs.CL
Survey MEDIUM
Bin Wang, Zexin Liu, Hao Yu +6 more
The Model Context Protocol (MCP) has emerged as a standardized interface enabling seamless integration between Large Language Models (LLMs) and...
5 months ago cs.CR cs.AI
Attack MEDIUM
Myeongseob Ko, Nikhil Reddy Billa, Adam Nguyen +3 more
The memorization of training data in large language models (LLMs) poses significant privacy and copyright concerns. Existing data extraction methods,...
5 months ago cs.CL cs.AI
Attack MEDIUM
Bin Wang, YiLu Zhong, MiDi Wan +4 more
Large language models (LLMs) have become indispensable for automated code generation, yet the quality and security of their outputs remain a critical...
5 months ago cs.CR cs.AI
Benchmark MEDIUM
Armin Gerami, Kazem Faghih, Ramani Duraiswami
Retrieval Augmented Generation (RAG) enhances Large Language Models (LLMs) by connecting them to external knowledge, improving accuracy and reducing...
5 months ago cs.IR cs.AI cs.CL
Attack MEDIUM
Jiaxiang Liu, Jiawei Du, Xiao Liu +2 more
Pre-trained vision-language models (VLMs) such as CLIP have demonstrated strong zero-shot capabilities across diverse domains, yet remain highly...
Benchmark MEDIUM
Julia Bazinska, Max Mathys, Francesco Casucci +4 more
AI agents powered by large language models (LLMs) are being deployed at scale, yet we lack a systematic understanding of how the choice of backbone...
5 months ago cs.CR cs.AI cs.LG