The LLMbda Calculus: AI Agents, Conversations, and Information Flow
Zac Garby, Andrew D. Gordon, David Sands
A conversation with a large language model (LLM) is a sequence of prompts and responses, with each response generated from the preceding...
Natalie Shapira, Chris Wendler, Avery Yen +35 more
We report an exploratory red-teaming study of autonomous language-model-powered agents deployed in a live laboratory environment with persistent...
Xunzhuo Liu, Huamin Chen, Samzong Lu +27 more
As large language models (LLMs) diversify across modalities, capabilities, and cost profiles, the problem of intelligent request routing -- selecting...
Yedi Zhang, Haoyu Wang, Xianglin Yang +2 more
LLM-enabled applications are rapidly reshaping the software ecosystem by using large language models as core reasoning components for complex task...
Kaiwen Wang, Xiaolin Chang, Yuehan Dong +1 more
Secure comparison is a fundamental primitive in multi-party computation, supporting privacy-preserving applications such as machine learning and data...
Lei Ba, Qinbin Li, Songze Li
LLM-based code interpreter agents are increasingly deployed in critical workflows, yet their robustness against risks introduced by their code...
Jingwei Shi, Xinxiang Yin, Jing Huang +2 more
The evaluation of Large Language Models (LLMs) for code generation relies heavily on the quality and robustness of test cases. However, existing...
Florin Adrian Chitan
The proliferation of autonomous AI agents capable of executing real-world actions - filesystem operations, API calls, database modifications,...
Kiarash Ahi, Vaibhav Agrawal, Saeed Valizadeh
Large Language Models (LLMs) & Generative AI are transforming cybersecurity, enabling both advanced defenses and new attacks. Organizations now use...
Emmanuel Bamidele
Long-running LLM agents require persistent memory to preserve state across interactions, yet most deployed systems manage memory with age-based...
Abdullah Caglar Oksuz, Anisa Halimi, Erman Ayday
Membership inference attacks (MIAs) threaten the privacy of machine learning models by revealing whether a specific data point was used during...
Chun Yan Ryan Kan, Tommy Tran, Vedant Yadav +4 more
Defending LLMs against adversarial jailbreak attacks remains an open challenge. Existing defenses rely on binary classifiers that fail when...
Diego Soi, Silvia Lucia Sanna, Lorenzo Pisu +2 more
In recent years, stealthy Android malware has increasingly adopted sophisticated techniques to bypass automatic detection mechanisms and harden...
Zachary Coalson, Bo Fang, Sanghyun Hong
Multi-turn interaction length is a dominant factor in the operational costs of conversational LLMs. In this work, we present a new failure mode in...
Gelei Deng, Yi Liu, Yuekang Li +5 more
LLM-based agents show promise for automating penetration testing, yet reported performance varies widely across systems and benchmarks. We analyze 28...
Boyang Ma, Hechuan Guo, Peizhuo Lv +5 more
Embodied AI systems (e.g., autonomous vehicles, service robots, and LLM-driven interactive agents) are rapidly transitioning from controlled...
Zachary Coalson, Beth Sohler, Aiden Gabriel +1 more
We identify a structural weakness in current large language model (LLM) alignment: modern refusal mechanisms are fail-open. While existing approaches...
Arnold Cartagena, Ariane Teixeira
Large language models deployed as agents increasingly interact with external systems through tool calls--actions with real-world consequences that...
Justin Albrethsen, Yash Datta, Kunal Kumar +1 more
While Large Language Model (LLM) capabilities have scaled, safety guardrails remain largely stateless, treating multi-turn dialogues as a series of...
Sasha Behrouzi, Lichao Wu, Mohamadreza Rostami +1 more
Safety alignment is essential for the responsible deployment of large language models (LLMs). Yet, existing approaches often rely on heavyweight...