Large Language Models (LLMs) have been augmented with web search to overcome their static knowledge boundary by accessing up-to-date...
Large Language Models (LLMs) have shown remarkable performance across various applications, but their deployment in real-world settings faces several...
Safety alignment of large language models currently faces a central challenge: existing alignment techniques often prioritize mitigating responses to...
Chongyu Fan, Changsheng Wang, Yancheng Huang +2 more
Machine unlearning for large language models (LLMs) aims to remove undesired data, knowledge, and behaviors (e.g., for safety, privacy, or copyright)...
Large Language Model (LLM)-based Multi-Agent Systems (MAS) have emerged as a powerful paradigm for tackling complex, multi-step tasks across diverse...
Christos Ziakas, Nicholas Loo, Nishita Jain +1 more
Automated red-teaming has emerged as a scalable approach for auditing Large Language Models (LLMs) prior to deployment, yet existing approaches lack...
Abhishek Anand, Matthias C. Caro, Ari Karchmer +1 more
Quantum learning from remotely accessed quantum compute and data must address two key challenges: verifying the correctness of data and ensuring the...
Although Large Language Models (LLMs) show promise for automated code generation, they often produce insecure code that threatens software...
This paper presents the vision, scientific contributions, and technical details of RedTWIZ: an adaptive and diverse multi-turn red teaming framework,...
Models are susceptible to adversarially out-of-distribution (OOD) data despite large training-compute investments into their robustification. Zaremba...