Benchmark MEDIUM
Kai Williams, Rohan Subramani, Francis Rhys Ward
Frontier AI developers may fail to align or control highly capable AI agents. In many cases, it could be useful to have emergency shutdown mechanisms...
3 months ago cs.CR cs.AI cs.CY
PDF
Benchmark HIGH
Jiawei Chen, Yang Yang, Chao Yu +6 more
Large Reasoning Models (LRMs) have emerged as a powerful advancement in multi-step reasoning tasks, offering enhanced transparency and logical...
3 months ago cs.CR cs.AI
PDF
Attack HIGH
Mohammad M Maheri, Xavier Cadet, Peter Chin +1 more
Approximate machine unlearning aims to efficiently remove the influence of specific data points from a trained model, offering a practical...
3 months ago cs.LG cs.AI cs.CR
PDF
Defense MEDIUM
Henry Onyeka, Emmanuel Samson, Liang Hong +3 more
The increasing complexity of IoT edge networks presents significant challenges for anomaly detection, particularly in identifying sophisticated...
3 months ago cs.LG cs.CR
PDF
Benchmark MEDIUM
Aayush Garg, Zanis Ali Khan, Renzo Degiovanni +1 more
Automated vulnerability patching is crucial for software security, and recent advancements in Large Language Models (LLMs) present promising...
3 months ago cs.CR cs.AI cs.SE
PDF
Defense MEDIUM
Neemesh Yadav, Francesco Ortu, Jiarui Liu +5 more
Large Language Models (LLMs) are trained to refuse to respond to harmful content. However, systematic analyses of whether this behavior is truly a...
Attack MEDIUM
Tong Wu, Weibin Wu, Zibin Zheng
Equipped with various tools and knowledge, GPTs, a kind of customized AI agent based on OpenAI's large language models, have demonstrated great...
3 months ago cs.CR cs.SE
PDF
Defense HIGH
Fouad Trad, Ali Chehab
Few-shot prompting has emerged as a practical alternative to fine-tuning for leveraging the capabilities of large language models (LLMs) in...
3 months ago cs.SE cs.AI cs.CL
PDF
Benchmark LOW
Peng Kuang, Xiangxiang Wang, Wentao Liu +2 more
Multimodal Large Language Models (MLLMs) have achieved impressive performances in mathematical reasoning, yet they remain vulnerable to visual...
Tool MEDIUM
Kaixiang Wang, Zhaojiacheng Zhou, Bunyod Suvonov +2 more
Large Language Model (LLM)-based Multi-Agent Systems (MAS) are susceptible to linguistic attacks that can trigger cascading failures across the...
3 months ago cs.MA cs.AI cs.CR
PDF
Benchmark MEDIUM
Anudeex Shetty
Large Language Models (LLMs) have demonstrated exceptional capabilities in natural language understanding and generation. Based on these LLMs,...
3 months ago cs.CL cs.CR cs.LG
PDF
Attack MEDIUM
Zeng Wang, Minghao Shao, Akashdeep Saha +4 more
Graph neural networks (GNNs) have shown promise in hardware security by learning structural motifs from netlist graphs. However, this reliance on...
4 months ago cs.CR cs.AI
PDF
Benchmark MEDIUM
Abeer Matar A. Almalky, Ziyan Wang, Mohaiminul Al Nahian +2 more
In recent years, large language models (LLMs) have achieved substantial advancements and are increasingly integrated into critical applications...
Benchmark MEDIUM
Mohaiminul Al Nahian, Abeer Matar A. Almalky, Gamana Aragonda +6 more
Adversarial weight perturbation has emerged as a concerning threat to LLMs, whereby adversaries use training privileges or system-level access to inject...
Survey LOW
Boyuan Chen, Sitong Fang, Jiaming Ji +57 more
As intelligence increases, so does its shadow. AI deception, in which systems induce false beliefs to secure self-beneficial outcomes, has evolved...
Attack HIGH
Richard J. Young
Large Language Model (LLM) safety guardrail models have emerged as a primary defense mechanism against harmful content generation, yet their...
Attack HIGH
Tianyu Zhang, Zihang Xi, Jingyu Hua +1 more
In the realm of black-box jailbreak attacks on large language models (LLMs), the feasibility of constructing a narrow safety proxy, a lightweight...
4 months ago cs.CR cs.AI
PDF
Tool MEDIUM
Shaona Ghosh, Barnaby Simkin, Kyriacos Shiarlis +9 more
This paper introduces a dynamic and actionable framework for securing agentic AI systems in enterprise deployment. We contend that safety and...
4 months ago cs.LG cs.AI cs.CR
PDF
Benchmark MEDIUM
Gauri Pradhan, Joonas Jälkö, Santiago Zanella-Béguelin +1 more
Training machine learning models with differential privacy (DP) limits an adversary's ability to infer sensitive information about the training data....
4 months ago cs.CR cs.LG
PDF
Benchmark LOW
Junjian Wang, Lidan Zhao, Xi Sheryl Zhang
Ensuring the safety of embodied AI agents during task planning is critical for real-world deployment, especially in household environments where...