SafeMT: Multi-turn Safety for Multimodal Language Models
Han Zhu, Juntao Dai, Jiaming Ji +8 more
With the widespread use of multi-modal Large Language models (MLLMs), safety issues have become a growing concern. Multi-turn dialogues, which are...
2,529+ academic papers on AI security, attacks, and defenses
Showing 201–220 of 222 papers
Clear filtersHan Zhu, Juntao Dai, Jiaming Ji +8 more
With the widespread use of multi-modal Large Language models (MLLMs), safety issues have become a growing concern. Multi-turn dialogues, which are...
Jiahao Liu, Bonan Ruan, Xianglin Yang +5 more
LLM-based agents have demonstrated promising adaptability in real-world applications. However, these agents remain vulnerable to a wide range of...
Zhuochen Yang, Kar Wai Fok, Vrizlynn L. L. Thing
Large language models have gained widespread attention recently, but their potential security vulnerabilities, especially privacy leakage, are also...
Yuyi Huang, Runzhe Zhan, Lidia S. Chao +2 more
As large language models (LLMs) are increasingly deployed for complex reasoning tasks, Long Chain-of-Thought (Long-CoT) prompting has emerged as a...
MingSheng Li, Guangze Zhao, Sichen Liu
Large Vision-Language Models (LVLMs) have achieved remarkable progress in multimodal perception and generation, yet their safety alignment remains a...
Xiangtao Meng, Tianshuo Cong, Li Wang +4 more
Large Language Models (LLMs) have shown remarkable performance across various applications, but their deployment in real-world settings faces several...
Thusitha Dayaratne, Ngoc Duy Pham, Viet Vo +5 more
The quality and experience of mobile communication have significantly improved with the introduction of 5G, and these improvements are expected to...
Shuai Zhao, Xinyi Wu, Shiqian Zhao +4 more
During fine-tuning, large language models (LLMs) are increasingly vulnerable to data-poisoning backdoor attacks, which compromise their reliability...
Anindya Sundar Das, Kangjie Chen, Monowar Bhuyan
Pre-trained language models have achieved remarkable success across a wide range of natural language processing (NLP) tasks, particularly when...
Rui Wu, Yihao Quan, Zeru Shi +3 more
Safety-aligned Large Language Models (LLMs) still show two dominant failure modes: they are easily jailbroken, or they over-refuse harmless inputs...
Lesly Miculicich, Mihir Parmar, Hamid Palangi +4 more
The deployment of autonomous AI agents in sensitive domains, such as healthcare, introduces critical risks to safety, security, and privacy. These...
Yuhao Sun, Zhuoer Xu, Shiwen Cui +4 more
Large Language Models (LLMs) have achieved remarkable progress across a wide range of tasks, but remain vulnerable to safety risks such as harmful...
Guobin Shen, Dongcheng Zhao, Haibo Tong +3 more
Ensuring Large Language Model (LLM) safety remains challenging due to the absence of universal standards and reliable content validators, making it...
Ayda Aghaei Nia
Completely Automated Public Turing tests to tell Computers and Humans Apart (CAPTCHAs) are a foundational component of web security, yet traditional...
Zherui Li, Zheng Nie, Zhenhong Zhou +7 more
The rapid advancement of Diffusion Large Language Models (dLLMs) introduces unprecedented vulnerabilities that are fundamentally distinct from...
Gauri Kholkar, Ratinder Ahuja
As autonomous AI agents are used in regulated and safety-critical settings, organizations need effective ways to turn policy into enforceable...
Yuqiao Meng, Luoxi Tang, Feiyang Yu +4 more
Large language models (LLMs) are increasingly used to help security analysts manage the surge of cyber threats, automating tasks from vulnerability...
Zeyu Shen, Basileal Imana, Tong Wu +3 more
Retrieval-Augmented Generation (RAG) enhances Large Language Models by grounding their outputs in external documents. These systems, however, remain...
Charles E. Gagnon, Steven H. H. Ding, Philippe Charland +1 more
Binary code similarity detection is a core task in reverse engineering. It supports malware analysis and vulnerability discovery by identifying...
Anton Korznikov, Andrey Galichin, Alexey Dontsov +3 more
Activation steering is a promising technique for controlling LLM behavior by adding semantically meaningful vectors directly into a model's hidden...
Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.
Start 14-Day Free Trial