AgentTrust: Runtime Safety Evaluation and Interception for AI Agent Tool Use
Chenglin Yang
Modern AI agents execute real-world side effects through tool calls such as file operations, shell commands, HTTP requests, and database queries. A...
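The abstract above is truncated, and the paper's actual mechanism is not shown here. Purely as a generic illustration of the runtime-interception pattern the title describes, a minimal policy gate that sits between an agent and its tools might look like the sketch below; every name in it (the deny-list, `is_safe_shell`, `intercept`) is a hypothetical example, not AgentTrust's API.

```python
# Illustrative sketch only: a generic runtime gate for agent tool calls.
# None of these names come from the AgentTrust paper; the deny-list and
# the wrapper are hypothetical examples of the interception pattern.
import shlex

BLOCKED_SHELL_TOKENS = {"rm", "mkfs", "dd"}  # example deny-list


def is_safe_shell(command: str) -> bool:
    """Reject shell commands whose first token is on the deny-list."""
    tokens = shlex.split(command)
    return bool(tokens) and tokens[0] not in BLOCKED_SHELL_TOKENS


def intercept(tool_name: str, argument: str, execute):
    """Run execute(argument) only if the call passes the safety check."""
    if tool_name == "shell" and not is_safe_shell(argument):
        return {"status": "blocked", "reason": "denied shell command"}
    return {"status": "ok", "result": execute(argument)}


# A benign call passes through; a destructive one is blocked before
# the side effect ever executes.
print(intercept("shell", "ls -l", lambda a: f"ran: {a}"))
print(intercept("shell", "rm -rf /", lambda a: f"ran: {a}"))
```

The same wrapper shape extends to the other side-effect channels the abstract lists (file operations, HTTP requests, database queries) by adding per-tool checks.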
Zheng Fang, Xiaosen Wang, Shenyi Zhang +2 more
Jailbreak attacks on audio language models (ALMs) optimize audio perturbations to elicit unsafe generations, and they typically update the entire...
Jan Dolejš, Martin Jureček, Róbert Lórencz
Modern malware detection pipelines rely on continuous data ingestion and machine learning to counter the high volume of novel threats. This work...
Kaifeng He, Xiaojun Zhang, Peiliang Cai +7 more
Large language models (LLMs) frequently generate defective outputs in code generation tasks, ranging from logical bugs to security vulnerabilities....
Hanum Ko, Sangheum Yeon, Jong Hwan Ko +1 more
As DRAM scales in density and adopts 3D integration, raw fault rates increase and multi-bit errors are no longer rare. Such errors can severely...
Zekun Fei, Zihao Wang, Weijie Liu +4 more
Mixture-of-Experts (MoE) architectures have emerged as a leading paradigm for scaling large language models through sparse, routing-based...
Zhenning Yang, Yuhan Chen, Patrick Tser Jern Kon +5 more
To unleash the full potential of AI for Science, we must untether the agents from a purely digital environment. The agent's ability to control and...
Jie Zhang, Pura Peetathawatchai, Florian Tramèr +1 more
Vision-language models (VLMs) are increasingly deployed as trusted authorities -- fact-checking images on social media, comparing products, and...
Sarthak Choudhary, Atharv Singh Patlan, Nils Palumbo +3 more
We present Sparse Backdoor, a supply-chain attack that plants a provably undetectable backdoor in pre-trained image classifiers, including...
Raja Sekhar Rao Dheekonda, Will Pearce, Nick Landers
AI systems are entering critical domains like healthcare, finance, and defense, yet remain vulnerable to adversarial attacks. While AI red teaming is...
Shravya Kanchi, Xiaoyan Zang, Ying Zhang +2 more
Developers create modern software applications (Apps) on top of third-party libraries (Libs). When library vulnerabilities are reachable through...
Gabriel Hortea, Juan Tapiador
Malware authors have traditionally relied on polymorphic techniques to produce variants in the same malware family, complicating signature-based...
Tejas Kulkarni, Antti Koskela, Laith Zumot
We show that remotely hosted applications employing in-context learning when augmented with a retrieval function to select in-context examples can be...
Ishrith Gowda
Persistent external memory enables LLM agents to maintain context across sessions, yet its security properties remain formally uncharacterized. We...
Haoyu Zhang, Mohammad Zandsalimy, Shanu Sushmita
Large language models (LLMs) employ safety mechanisms to prevent harmful outputs, yet these defenses primarily rely on semantic pattern matching. We...
Srinath Perera, Kaviru Hapuarachchi, Frank Leymann +1 more
We present Robust Agent Compensation (RAC), a log-based recovery paradigm (providing a safety net) implemented through an architectural extension...
Rishi Raj Sahoo, Jyotirmaya Shivottam, Subhankar Mishra
Regulatory frameworks such as GDPR increasingly require that ML predictions be accompanied by post-hoc explanations, even when raw data and trained...
Bikrant Bikram Pratap Maurya, Nitin Choudhury, Daksh Agarwal +1 more
Acoustic side-channel attacks (ASCA) on keyboards pose a significant security risk, as keystrokes can be inferred from typing acoustics, revealing...
Shihao Weng, Yang Feng, Jinrui Zhang +3 more
The rise of Large Language Model (LLM) agents, augmented with tool use, skills, and external knowledge, has introduced new security risks. Among...