Multimodal Large Language Models (MLLMs) have demonstrated impressive capabilities in cross-modal understanding, but remain vulnerable to adversarial...
The growing misuse of Vision-Language Models (VLMs) has led providers to deploy multiple safeguards, including alignment tuning, system prompts, and...
Strahinja Janjusevic, Anna Baron Garcia, Sohrob Kazerounian
Generative AI is reshaping offensive cybersecurity by enabling autonomous red team agents that can plan, execute, and adapt during penetration tests....
Large language model safety is usually assessed with static benchmarks, but key failures are dynamic: value drift under distribution shift, jailbreak...
Recent research on large language model (LLM) jailbreaks has primarily focused on techniques that bypass safety mechanisms to elicit overtly harmful...
Retrieval-augmented generation (RAG) systems have become widely used for enhancing large language model capabilities, but they introduce significant...
W. Bradley Knox, Katie Bradford, Samanta Varela Castro, et al.
Amid the growing prevalence of human-AI interaction, large language models and other AI-based entities increasingly provide forms of companionship to...
Adaptive optimizers with decoupled weight decay, such as AdamW, are the de facto standard for pre-training large transformer-based generative models....
Large Language Models (LLMs) are increasingly integrated into educational applications. However, they remain vulnerable to jailbreak and fine-tuning...