Large Language Models (LLMs) have achieved remarkable success but remain highly susceptible to jailbreak attacks, in which adversarial prompts coerce...
AI agents are increasingly deployed across diverse domains to automate complex workflows through long-horizon, high-stakes action execution. Due...
Raja Sekhar Rao Dheekonda, Will Pearce, Nick Landers
AI systems are entering critical domains like healthcare, finance, and defense, yet remain vulnerable to adversarial attacks. While AI red teaming is...
We show that remotely hosted applications employing in-context learning, when augmented with a retrieval function to select in-context examples, can be...
Large language models (LLMs) employ safety mechanisms to prevent harmful outputs, yet these defenses primarily rely on semantic pattern matching. We...
Mario Rodríguez Béjar, Francisco J. Cortés-Delgado, S. Braghin +1 more
Large language models (LLMs) remain vulnerable to jailbreak attacks that bypass safety alignment and elicit harmful responses. A growing body of work...
Arne Roszeitis, Bartosz Burgiel, Victor Jüttner +1 more
Smart devices, such as light bulbs, TVs, and fridges, equipped with computing capabilities and wireless communication, are part of everyday life in...
Bring-Your-Own-Key (BYOK) agent architectures let users route LLM traffic through third-party relays, creating a critical integrity gap: a malicious...