Attack HIGH
Berk Atil, Rebecca J. Passonneau, Fred Morstatter
Large language models (LLMs) undergo safety alignment after training and tuning, yet recent work shows that safety can be bypassed through jailbreak...
Attack MEDIUM
Kasimir Schulz, Amelia Kawasaki, Leo Ring
Large language models (LLMs) are widely deployed across various applications, often with safeguards to prevent the generation of harmful or...
6 months ago cs.CR cs.AI
Tool LOW
Dong Chen, Yanzhe Wei, Zonglin He +7 more
Large language models (LLMs) offer transformative potential for clinical decision support in spine surgery but pose significant risks through...
6 months ago cs.LG cs.AI cs.CY
Attack HIGH
Peng Ding, Jun Kuang, Wen Sun +5 more
Large language models (LLMs) remain vulnerable to jailbreaking attacks despite their impressive capabilities. Investigating these weaknesses is...
Attack HIGH
Phil Blandfort, Robert Graham
Activation probes are attractive monitors for AI systems due to low cost and latency, but their real-world robustness remains underexplored. We ask:...
6 months ago cs.LG cs.AI
Benchmark MEDIUM
Ariyan Hossain, Khondokar Mohammad Ahanaf Hannan, Rakinul Haque +4 more
Gender bias in language models has gained increasing attention in the field of natural language processing. Encoder-based transformer models, which...
Defense MEDIUM
Yifan Xia, Guorui Chen, Wenqian Yu +3 more
Large language models (LLMs) excel in diverse applications but face dual challenges: generating harmful content under jailbreak attacks and...
6 months ago cs.AI cs.CR
Defense MEDIUM
Mohammed N. Swileh, Shengli Zhang
Centralized Software-Defined Networking (cSDN) offers flexible and programmable control of networks but suffers from scalability and reliability...
6 months ago cs.CR cs.AI
Attack HIGH
Ruofan Liu, Yun Lin, Zhiyong Huang +1 more
Large language models (LLMs) are increasingly integrated into IT infrastructures, where they process user data according to predefined instructions....
6 months ago cs.CR cs.AI
Attack HIGH
Xin Yao, Haiyang Zhao, Yimin Chen +3 more
The Contrastive Language-Image Pretraining (CLIP) model has significantly advanced vision-language modeling by aligning image-text pairs from...
6 months ago cs.CV cs.CR cs.LG
Attack HIGH
Kayua Oleques Paim, Rodrigo Brandao Mansilha, Diego Kreutz +2 more
The rapid proliferation of Large Language Models (LLMs) has raised significant concerns about their security against adversarial attacks. In this...
6 months ago cs.CR cs.AI cs.LG
Attack MEDIUM
David Lüdke, Tom Wollschläger, Paul Ungermann +2 more
We introduce a novel framework that transforms the resource-intensive (adversarial) prompt optimization problem into an efficient, amortized...
6 months ago cs.LG stat.ML
Other LOW
David Farr, Lynnette Hui Xian Ng, Stephen Prochaska +2 more
Disinformation campaigns can distort public perception and destabilize institutions. Understanding how different populations respond to information...
6 months ago cs.SI cs.AI cs.CL
Defense HIGH
Md Abdul Hannan, Ronghao Ni, Chi Zhang +3 more
Large language models (LLMs) have demonstrated impressive capabilities across a wide range of coding tasks, including summarization, translation,...
6 months ago cs.SE cs.CR cs.LG
Survey MEDIUM
Kathrin Grosse, Nico Ebert
Recent improvements in large language models (LLMs) have led to everyday usage of AI-based Conversational Agents (CAs). At the same time, LLMs...
Attack MEDIUM
Chenghao Du, Quanfeng Huang, Tingxuan Tang +3 more
Large Language Models (LLMs) have transformed software development, enabling AI-powered applications known as LLM-based agents that promise to...
Benchmark MEDIUM
Heehwan Kim, Sungjune Park, Daeseon Choi
Large Language Models (LLMs) are generally equipped with guardrails to block the generation of harmful responses. However, existing defenses always...
6 months ago cs.CL cs.AI
Benchmark MEDIUM
Arnabh Borah, Md Tanvirul Alam, Nidhi Rastogi
Security applications are increasingly relying on large language models (LLMs) for cyber threat detection; however, their opaque reasoning often...
6 months ago cs.CR cs.AI
Attack HIGH
Alex Irpan, Alexander Matt Turner, Mark Kurzeja +2 more
An LLM's factuality and refusal training can be compromised by simple changes to a prompt. Models often adopt user beliefs (sycophancy) or satisfy...
6 months ago cs.LG cs.AI
Benchmark MEDIUM
Zishuo Zheng, Vidhisha Balachandran, Chan Young Park +2 more
As large language model (LLM) based systems take on high-stakes roles in real-world decision-making, they must reconcile competing instructions from...
6 months ago cs.CL cs.AI