Towards a Science of AI Agent Reliability
Stephan Rabanser, Sayash Kapoor, Peter Kirgis +3 more
AI agents are increasingly deployed to execute important tasks. While rising accuracy scores on standard benchmarks suggest rapid progress, many...
2,589+ academic papers on AI security, attacks, and defenses
Showing 901–920 of 1,931 papers
Clear filtersStephan Rabanser, Sayash Kapoor, Peter Kirgis +3 more
AI agents are increasingly deployed to execute important tasks. While rising accuracy scores on standard benchmarks suggest rapid progress, many...
Adib Sakhawat, Fardeen Sadab
Evaluating the social intelligence of Large Language Models (LLMs) increasingly requires moving beyond static text generation toward dynamic,...
Thomas Michel, Debabrota Basu, Emilie Kaufmann
Modern AI models are not static. They go through multiple updates in their lifecycles. Thus, exploiting the model dynamics to create stronger...
Robert Ranisch, Sabine Salloch
The emergence of agentic AI marks a new phase in the digital transformation of healthcare. Distinct from conventional generative AI, agentic AI...
Doron Shavit
Jailbreak prompts are a practical and evolving threat to large language models (LLMs), particularly in agentic systems that execute tools over...
Yiwen Lu
Federated Learning (FL) enables collaborative model training without exposing clients' private data, and has been widely adopted in privacy-sensitive...
Michael Cunningham
We present a practical system for privacy-aware large language model (LLM) inference that splits a transformer between a trusted local GPU and an...
Philipp Schoenegger, Matt Carlson, Chris Schneider +1 more
Multiagent AI systems require consistent communication, but we lack methods to verify that agents share the same understanding of the terms used....
Nivya Talokar, Ayush K Tarun, Murari Mandal +2 more
LLM-based agents execute real-world workflows via tools and memory. These affordances enable ill-intended adversaries to also use these agents to...
Johannes Bertram, Jonas Geiping
We introduce NESSiE, the NEceSsary SafEty benchmark for large language models (LLMs). With minimal test cases of information and access security,...
Ahmed Ryan, Ibrahim Khalil, Abdullah Al Jahid +4 more
The prevalence of malicious packages in open-source repositories, such as PyPI, poses a critical threat to the software supply chain. While Large...
Yu Yin, Shuai Wang, Bevan Koopman +1 more
Large Language Models (LLMs) have emerged as powerful re-rankers. Recent research has however showed that simple prompt injections embedded within a...
Scott Thornton
AI-assisted code review is widely used to detect vulnerabilities before production release. Prior work shows that adversarial prompt manipulation can...
Brennan Bell, Andreas Trügler, Konstantin Beyer +1 more
We study a sequential coherent side-channel model in which an adversarial probe qubit interacts with a target qubit during a hidden gate sequence....
Yuval Felendler, Parth A. Gandhi, Idan Habler +2 more
Model Context Protocols (MCPs) provide a unified platform for agent systems to discover, select, and orchestrate tools across heterogeneous execution...
Shahriar Golchin, Marc Wetter
We systematically evaluate the quality of widely used AI safety datasets from two perspectives: in isolation and in practice. In isolation, we...
Or Zamir
A natural and informal approach to verifiable (or zero-knowledge) ML inference over floating-point data is: ``prove that each layer was computed...
Haodong Zhao, Jinming Hu, Gongshen Liu
Federated learning security research has predominantly focused on backdoor threats from a minority of malicious clients that intentionally corrupt...
Xianglin Yang, Yufei He, Shuo Ji +2 more
Self-evolving LLM agents update their internal state across sessions, often by writing and reusing long-term memory. This design improves performance...
Varun Pratap Bhardwaj
We present SuperLocalMemory, a local-first memory system for multi-agent AI that defends against OWASP ASI06 memory poisoning through architectural...
Get breaking CVE alerts, compliance reports (ISO 42001, EU AI Act), and CISO risk assessments for your AI/ML stack.
Start 14-Day Free Trial