Red Teaming Large Reasoning Models
Jiawei Chen, Yang Yang, Chao Yu +6 more
Large Reasoning Models (LRMs) have emerged as a powerful advancement in multi-step reasoning tasks, offering enhanced transparency and logical...
Mohammad M Maheri, Xavier Cadet, Peter Chin +1 more
Approximate machine unlearning aims to efficiently remove the influence of specific data points from a trained model, offering a practical...
Henry Onyeka, Emmanuel Samson, Liang Hong +3 more
The increasing complexity of IoT edge networks presents significant challenges for anomaly detection, particularly in identifying sophisticated...
Aayush Garg, Zanis Ali Khan, Renzo Degiovanni +1 more
Automated vulnerability patching is crucial for software security, and recent advancements in Large Language Models (LLMs) present promising...
Neemesh Yadav, Francesco Ortu, Jiarui Liu +5 more
Large Language Models (LLMs) are trained to refuse to respond to harmful content. However, systematic analyses of whether this behavior is truly a...
Tong Wu, Weibin Wu, Zibin Zheng
Equipped with various tools and knowledge, GPTs, customized AI agents built on OpenAI's large language models, have demonstrated great...
Fouad Trad, Ali Chehab
Few-shot prompting has emerged as a practical alternative to fine-tuning for leveraging the capabilities of large language models (LLMs) in...
Peng Kuang, Xiangxiang Wang, Wentao Liu +2 more
Multimodal Large Language Models (MLLMs) have achieved impressive performance in mathematical reasoning, yet they remain vulnerable to visual...
Kaixiang Wang, Zhaojiacheng Zhou, Bunyod Suvonov +2 more
Large Language Model (LLM)-based Multi-Agent Systems (MAS) are susceptible to linguistic attacks that can trigger cascading failures across the...
Anudeex Shetty
Large Language Models (LLMs) have demonstrated exceptional capabilities in natural language understanding and generation. Based on these LLMs,...
Zeng Wang, Minghao Shao, Akashdeep Saha +4 more
Graph neural networks (GNNs) have shown promise in hardware security by learning structural motifs from netlist graphs. However, this reliance on...
Abeer Matar A. Almalky, Ziyan Wang, Mohaiminul Al Nahian +2 more
In recent years, large language models (LLMs) have achieved substantial advancements and are increasingly integrated into critical applications...
Mohaiminul Al Nahian, Abeer Matar A. Almalky, Gamana Aragonda +6 more
Adversarial weight perturbation has emerged as a concerning threat to LLMs, in which attackers use either training privileges or system-level access to inject...
Boyuan Chen, Sitong Fang, Jiaming Ji +57 more
As intelligence increases, so does its shadow. AI deception, in which systems induce false beliefs to secure self-beneficial outcomes, has evolved...
Richard J. Young
Large Language Model (LLM) safety guardrail models have emerged as a primary defense mechanism against harmful content generation, yet their...
Tianyu Zhang, Zihang Xi, Jingyu Hua +1 more
In the realm of black-box jailbreak attacks on large language models (LLMs), the feasibility of constructing a narrow safety proxy, a lightweight...
Shaona Ghosh, Barnaby Simkin, Kyriacos Shiarlis +9 more
This paper introduces a dynamic and actionable framework for securing agentic AI systems in enterprise deployment. We contend that safety and...
Gauri Pradhan, Joonas Jälkö, Santiago Zanella-Béguelin +1 more
Training machine learning models with differential privacy (DP) limits an adversary's ability to infer sensitive information about the training data...
Junjian Wang, Lidan Zhao, Xi Sheryl Zhang
Ensuring the safety of embodied AI agents during task planning is critical for real-world deployment, especially in household environments where...
Rebeka Toth, Tamas Bisztray, Nils Gruschka
In this paper, we introduce a metadata-enriched generation framework (PhishFuzzer) that seeds real emails into Large Language Models (LLMs) to...