Attack HIGH
Christos Ziakas, Nicholas Loo, Nishita Jain +1 more
Automated red-teaming has emerged as a scalable approach for auditing Large Language Models (LLMs) prior to deployment, yet existing approaches lack...
Attack HIGH
Artur Horal, Daniel Pina, Henrique Paz +7 more
This paper presents the vision, scientific contributions, and technical details of RedTWIZ: an adaptive and diverse multi-turn red teaming framework,...
5 months ago cs.CR cs.CL
Attack HIGH
Sri Durga Sai Sowmya Kadali, Evangelos E. Papalexakis
Jailbreaking large language models (LLMs) has emerged as a pressing concern with the increasing prevalence and accessibility of conversational LLMs....
Attack HIGH
Giorgio Giannone, Guangxuan Xu, Nikhil Shivakumar Nayak +4 more
Inference-Time Scaling (ITS) improves language models by allocating more computation at generation time. Particle Filtering (PF) has emerged as a...
5 months ago cs.LG cs.AI cs.CL
Attack HIGH
Nouar Aldahoul, Yasir Zaki
The rapid spread of misinformation on digital platforms threatens public discourse, emotional stability, and decision-making. While prior work has...
5 months ago cs.CL cs.AI cs.CR
Attack HIGH
Raffaele Mura, Giorgio Piras, Kamilė Lukošiūtė +3 more
Jailbreaks are adversarial attacks designed to bypass the built-in safety mechanisms of large language models. Automated jailbreaks typically...
5 months ago cs.CL cs.AI cs.LG
Attack HIGH
Meng Tong, Yuntao Du, Kejiang Chen +2 more
Membership inference attacks (MIAs) are widely used to assess the privacy risks associated with machine learning models. However, when these attacks...
5 months ago cs.CR cs.AI
Attack HIGH
Xiaogeng Liu, Chaowei Xiao
Recent advancements in jailbreaking large language models (LLMs), such as AutoDAN-Turbo, have demonstrated the power of automated strategy discovery....
5 months ago cs.CR cs.AI
Attack HIGH
Weiliang Zhao, Jinjun Peng, Daniel Ben-Levi +2 more
The proliferation of powerful large language models (LLMs) has necessitated robust safety alignment, yet these models remain vulnerable to evolving...
5 months ago cs.CR cs.CL
Attack HIGH
Kuofeng Gao, Yiming Li, Chao Du +4 more
Jailbreaking attacks on the vision modality typically rely on imperceptible adversarial perturbations, whereas attacks on the textual modality are...
5 months ago cs.CL cs.AI cs.CR
Attack HIGH
Yuxin Wen, Arman Zharmagambetov, Ivan Evtimov +4 more
Prompt injection poses a serious threat to the reliability and safety of LLM agents. Recent defenses against prompt injection, such as Instruction...
5 months ago cs.CR cs.LG
Attack HIGH
Santhosh Kumar Ravindran
The rapid adoption of large language models (LLMs) in enterprise systems exposes vulnerabilities to prompt injection attacks, strategic deception,...
5 months ago cs.CR cs.AI
Attack HIGH
Buyun Liang, Liangzu Peng, Jinqi Luo +3 more
Large Language Models (LLMs) are increasingly deployed in high-risk domains. However, state-of-the-art LLMs often exhibit hallucinations, raising...
5 months ago cs.CL cs.AI cs.CR
Attack HIGH
Yu Cui, Sicheng Pan, Yifei Liu +2 more
Large language models (LLMs) have been widely deployed in Conversational AIs (CAIs), while exposing privacy and security threats. Recent research...
Attack HIGH
Yanjie Li, Yiming Cao, Dong Wang +1 more
Multimodal agents built on large vision-language models (LVLMs) are increasingly deployed in open-world settings but remain highly vulnerable to...
5 months ago cs.CR cs.AI
Attack HIGH
Xiangxiang Chen, Peixin Zhang, Jun Sun +2 more
Model quantization is a popular technique for deploying deep learning models on resource-constrained environments. However, it may also introduce...
5 months ago cs.CR cs.AI cs.LG
Attack HIGH
Yulin Chen, Haoran Li, Yuan Sui +2 more
With the development of technology, large language models (LLMs) have dominated the downstream natural language processing (NLP) tasks. However,...
Attack HIGH
Rabeya Amin Jhuma, Mostafa Mohaimen Akand Faisal
This study explored how in-context learning (ICL) in large language models can be disrupted by data poisoning attacks in the setting of public health...
5 months ago cs.LG cs.CL cs.CR
Attack HIGH
Maraz Mia, Mir Mehedi A. Pritom
Explainable Artificial Intelligence (XAI) has aided machine learning (ML) researchers with the power of scrutinizing the decisions of the black-box...
5 months ago cs.CR cs.AI
Attack HIGH
Javad Rafiei Asl, Sidhant Narula, Mohammad Ghasemigol +2 more
Large Language Models (LLMs) have revolutionized natural language processing but remain vulnerable to jailbreak attacks, especially multi-turn...
5 months ago cs.CR cs.AI