Defense MEDIUM
Zhenxiong Yu, Zhi Yang, Zhiheng Jin +19 more
As large language models (LLMs) evolve into autonomous agents, their real-world applicability has expanded significantly, accompanied by new security...
1 month ago cs.CR cs.AI
Tool MEDIUM
Guangwei Zhang, Jianing Zhu, Cheng Qian +12 more
We present Copyright Detective, the first interactive forensic system for detecting, analyzing, and visualizing potential copyright risks in LLM...
Benchmark MEDIUM
Ruixin Yang, Ethan Mendes, Arthur Wang +4 more
Vision-language models (VLMs) have demonstrated strong performance in image geolocation, a capability further sharpened by frontier multimodal large...
1 month ago cs.CR cs.AI
Attack MEDIUM
Vishruti Kakkad, Paul Chung, Hanan Hibshi +1 more
The exponential growth of Machine Learning and its Generative AI applications brings with it significant security challenges, often referred to as...
1 month ago cs.CR cs.AI
Benchmark MEDIUM
Casey Ford, Madison Van Doren, Emily Dix
Multimodal large language models (MLLMs) are increasingly deployed in real-world systems, yet their safety under adversarial prompting remains...
1 month ago cs.CL cs.AI cs.HC
Attack MEDIUM
Yike Sun, Haotong Yang, Zhouchen Lin +1 more
Tokenization is fundamental to how language models represent and process text, yet the behavior of widely used BPE tokenizers has received far less...
Attack MEDIUM
Ariel Fogel, Omer Hofman, Eilon Cohen +1 more
Open-weight language models are increasingly used in production settings, raising new security challenges. One prominent threat in this context is...
1 month ago cs.CR cs.LG
Attack MEDIUM
Leo Schwinn, Moritz Ladenburger, Tim Beyer +3 more
Automated "LLM-as-a-Judge" frameworks have become the de facto standard for scalable evaluation across natural language processing. For...
1 month ago cs.CL cs.AI
Benchmark MEDIUM
Debargha Ganguly, Sreehari Sankar, Biyao Zhang +8 more
Current approaches to LLM safety fundamentally rely on a brittle cat-and-mouse game of identifying and blocking known threats via guardrails. We...
1 month ago cs.CL cs.AI cs.DC
Defense MEDIUM
Jiacheng Liang, Yuhui Wang, Tanqiu Jiang +1 more
Mixture-of-Experts (MoE) language models introduce unique challenges for safety alignment due to their sparse routing mechanisms, which can enable...
1 month ago cs.LG cs.AI cs.CR
Tool MEDIUM
Gautam Savaliya, Robert Aufschläger, Abhishek Subedi +2 more
Artificial intelligence systems introduce complex privacy risks throughout their lifecycle, especially when processing sensitive or high-dimensional...
1 month ago cs.CR cs.AI
Attack MEDIUM
Youngji Roh, Hyunjin Cho, Jaehyung Kim
Large Language Models (LLMs) exhibit highly anisotropic internal representations, often characterized by massive activations, a phenomenon where a...
Attack MEDIUM
Zeming Wei, Qiaosheng Zhang, Xia Hu +1 more
Large Reasoning Models (LRMs) have achieved tremendous success with their chain-of-thought (CoT) reasoning, yet also face safety issues similar to...
1 month ago cs.LG cs.AI cs.CL
Defense MEDIUM
Guang Yang, Xing Hu, Xiang Chen +1 more
Large language models (LLMs) for Verilog code generation are increasingly adopted in hardware design, yet remain vulnerable to backdoor attacks where...
1 month ago cs.SE cs.CR
Attack MEDIUM
Andrew Draganov, Tolga H. Dur, Anandmayi Bhongade +1 more
We present Phantom Transfer, a data poisoning attack with the property that, even if you know precisely how the poison was placed into an...
1 month ago cs.CR cs.AI
Benchmark MEDIUM
Omar Abdelnasser, Fatemah Alharbi, Khaled Khasawneh +2 more
Safety alignment in Language Models (LMs) is fundamental for trustworthy AI. However, while different stakeholders are trying to leverage Arabic...
1 month ago cs.CL cs.AI
Attack MEDIUM
Matthew P. Lad, Louisa Conwill, Megan Levis Scheirer
With the rapid growth of Large Language Models (LLMs), criticism of their societal impact has also grown. Work in Responsible AI (RAI) has focused on...
Defense MEDIUM
Sidahmed Benabderrahmane, Petko Valtchev, James Cheney +1 more
Detecting rare and diverse anomalies in highly imbalanced datasets, such as Advanced Persistent Threats (APTs) in cybersecurity, remains a fundamental...
1 month ago cs.LG cs.AI cs.CR
Benchmark MEDIUM
Tomer Kordonsky, Maayan Yamin, Noam Benzimra +2 more
LLMs are increasingly used for code generation, but their outputs often follow recurring templates that can induce predictable vulnerabilities. We...
1 month ago cs.CR cs.AI
Defense MEDIUM
Rohan Saxena
Fine-tuning language models on narrowly harmful data causes emergent misalignment (EM): behavioral failures extending far beyond training...
1 month ago cs.CL cs.AI