Benchmark MEDIUM
Ying Zhou, Jiacheng Wei, Yu Qi +2 more
Large language models (LLMs) demonstrate remarkable capabilities in natural language understanding and generation. Despite being trained on...
2 months ago cs.CR cs.AI
PDF
Benchmark MEDIUM
Vasanth Iyer, Leonardo Bobadilla, S. S. Iyengar
Large Language Models (LLMs) such as Gemma-2B have shown strong performance in various natural language processing tasks. However, general-purpose...
Benchmark MEDIUM
Qiang Zhang, Elena Emma Wang, Jiaming Li +1 more
This study presents a Secure Multi-Tenant Architecture (SMTA) combined with a novel concept Burn-After-Use (BAU) mechanism for enterprise LLM...
2 months ago cs.CR cs.AI
PDF
Benchmark MEDIUM
Minfeng Qi, Dongyang He, Qin Wang +1 more
Visual Reasoning CAPTCHAs (VRCs) combine visual scenes with natural-language queries that demand compositional inference over objects, attributes,...
2 months ago cs.CR cs.CV cs.ET
PDF
Benchmark MEDIUM
Keyang Zhang, Zeyu Chen, Xuan Feng +4 more
The security of scripting languages such as PowerShell is critical given their powerful automation and administration capabilities, often exercised...
2 months ago cs.CR cs.PL
PDF
Benchmark MEDIUM
Hoang-Chau Luong, Lingwei Chen
Low-Rank Adaptation (LoRA) is widely used for parameter-efficient fine-tuning of large language models, but it is notably ineffective at removing...
Benchmark MEDIUM
Tianshi Li
On December 4, 2025, Anthropic released Anthropic Interviewer, an AI tool for running qualitative interviews at scale, along with a public dataset of...
2 months ago cs.CR cs.AI cs.CY
PDF
Benchmark MEDIUM
Zhi Yang, Runguo Li, Qiqi Qiang +15 more
Financial agents powered by large language models (LLMs) are increasingly deployed for investment analysis, risk assessment, and automated...
2 months ago cs.CR cs.AI
PDF
Benchmark MEDIUM
Suyash Mishra, Qiang Li, Srikanth Patil +1 more
Vision Language Models (VLMs) are poised to revolutionize the digital transformation of pharmacyceutical industry by enabling intelligent, scalable,...
2 months ago cs.CV cs.LG
PDF
Benchmark MEDIUM
Konstantinos E. Kampourakis, Vyron Kampourakis, Efstratios Chatzoglou +2 more
Realistic, large-scale, and well-labeled cybersecurity datasets are essential for training and evaluating Intrusion Detection Systems (IDS). However,...
Benchmark MEDIUM
Huawei Zheng, Xinqi Jiang, Sen Yang +3 more
Large language models (LLMs) are increasingly applied in specialized domains such as finance and healthcare, where they introduce unique safety...
2 months ago cs.CL cs.AI
PDF
Benchmark MEDIUM
Xiaoyu Xu, Minxin Du, Zitong Li +6 more
Although machine unlearning is essential for removing private, harmful, or copyrighted content from LLMs, current benchmarks often fail to faithfully...
2 months ago cs.CL cs.AI cs.CR
PDF
Benchmark MEDIUM
Dinesh Srivasthav P, Ashok Urlana, Rahul Mishra +2 more
Machine unlearning aims to selectively remove the influence of specific training samples to satisfy privacy regulations such as the GDPR's 'Right to...
2 months ago cs.CR cs.AI cs.CL
PDF
Benchmark MEDIUM
Antonio Colacicco, Vito Guida, Dario Di Palma +2 more
Large Language Models (LLMs) are increasingly applied in recommendation scenarios due to their strong natural language understanding and generation...
2 months ago cs.IR cs.AI cs.CL
PDF
Benchmark MEDIUM
Jinwei Hu, Xinmiao Huang, Youcheng Sun +2 more
As large language models (LLMs) transition to autonomous agents synthesizing real-time information, their reasoning capabilities introduce an...
2 months ago cs.CL cs.AI cs.MA
PDF
Benchmark MEDIUM
Junyu Liu, Zirui Li, Qian Niu +7 more
As Large Language Models (LLMs) are increasingly deployed in healthcare field, it becomes essential to carefully evaluate their medical safety before...
2 months ago cs.CL cs.AI
PDF
Benchmark MEDIUM
Muntasir Adnan, Carlos C. N. Kuhn
Large Language Models have become integral to software development, yet they frequently generate vulnerable code. Existing code vulnerability...
2 months ago cs.SE cs.AI
PDF
Benchmark MEDIUM
Zhuoran Tan, Run Hao, Jeremy Singer +2 more
Tool-augmented LLM agents raise new security risks: tool executions can introduce runtime-only behaviors, including prompt injection and unintended...
2 months ago cs.CR cs.SE
PDF
Benchmark MEDIUM
Milad Rahmati, Nima Rahmati
The proliferation of Internet of Things devices in critical infrastructure has created unprecedented cybersecurity challenges, necessitating...
2 months ago cs.CR cs.LG
PDF
Benchmark MEDIUM
Muhammad Bilal, Omer Tariq, Hasan Ahmed
Timing and burst patterns can leak through encryption, and an adaptive adversary can exploit them. This undermines metadata-only detection in a...
2 months ago cs.CR cs.LG cs.NI
PDF
Track AI security vulnerabilities in real time
Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act),
and CISO risk assessments for your AI/ML stack.
Start 14-Day Free Trial