Tool MEDIUM
Shaofei Huang, Christopher M. Poskitt, Lwin Khin Shar
Cyber-physical systems often contend with incomplete architectural documentation or outdated information resulting from legacy technologies,...
1 month ago cs.CR cs.AI
Benchmark HIGH
Baoshun Tong, Haoran He, Ling Pan +2 more
Vision-Language-Action (VLA) models have achieved remarkable success in robotic manipulation. However, their robustness to linguistic nuances remains...
1 month ago cs.RO cs.CV
Other HIGH
Yanxu Mao, Peipei Liu, Tiehan Cui +3 more
With the widespread application of LLM-based agents across various domains, their complexity has introduced new security threats. Existing red-team...
Defense MEDIUM
Md Shamimul Islam, Luis G. Jaimes, Ayesha S. Dina
Network Intrusion Detection Systems (NIDS) face important limitations. Signature-based methods are effective for known attack patterns, but they...
1 month ago cs.CR cs.AI
Tool MEDIUM
Anes Abdennebi, Nadjia Kara, Laaziz Lahlou +1 more
Modern Security Operations Centers struggle with alert fatigue, fragmented tooling, and limited cross-source event correlation. Challenges that...
1 month ago cs.CR cs.AI
Tool MEDIUM
Wuyang Zhang, Shichao Pei
Tool-use large language model (LLM) agents are increasingly deployed to support sensitive workflows, relying on tool calls for retrieval, external...
1 month ago cs.CR cs.AI
Defense MEDIUM
Purva Chiniya, Kevin Scaria, Sagar Chaturvedi
Large language models (LLMs) remain susceptible to jailbreak and direct prompt-injection attacks, yet the strongest defensive filters frequently...
Benchmark MEDIUM
Geert Trooskens, Aaron Karlsberg, Anmol Sharma +6 more
We study compiled AI, a paradigm in which large language models generate executable code artifacts during a compilation phase, after which workflows...
1 month ago cs.SE cs.AI
Tool MEDIUM
Jiling Zhou, Aisvarya Adeseye, Seppo Virtanen +2 more
Chain-of-Thought (CoT) prompting has been used to enhance the reasoning capability of LLMs. However, its reliability in security-sensitive analytical...
1 month ago cs.CR cs.AI
Attack HIGH
Qingyang Xu, Yaling Shen, Stephanie Fong +7 more
The increasing use of large language models (LLMs) in mental healthcare raises safety concerns in high-stakes therapeutic interactions. A key...
Benchmark LOW
Cheng Xu, Changhong Jin, Yingjie Niu +5 more
The rapid development of Large Language Models (LLMs) has transformed fake news detection and fact-checking tasks from simple classification to...
1 month ago cs.CL cs.AI
Benchmark LOW
Houzhe Wang, Xiaojie Zhu, Chi Chen
With the increasing importance of data privacy and security, federated unlearning has emerged as a novel research field dedicated to ensuring that...
1 month ago cs.LG cs.CR
Defense MEDIUM
Zijun Wang, Haoqin Tu, Letian Zhang +11 more
OpenClaw, the most widely deployed personal AI agent in early 2026, operates with full local system access and integrates with sensitive services...
1 month ago cs.CR cs.AI cs.CL
Attack MEDIUM
Vinod Vaikuntanathan, Or Zamir
AI agents are increasingly deployed to interact with other agents on behalf of users and organizations. We ask whether two such agents, operated by...
1 month ago cs.CR cs.AI cs.LG
Benchmark LOW
Kanishk Jain, Qian Yang, Shravan Nayak +3 more
Vision-language Models (VLMs), despite achieving strong performance on multimodal benchmarks, often misinterpret straightforward visual concepts that...
1 month ago cs.CV cs.AI
Benchmark MEDIUM
Zhuohao Yu, Zhiwei Steven Wu, Adam Block
Inference-time compute scaling has emerged as a powerful paradigm for improving language model performance on a wide range of tasks, but the question...
Survey HIGH
Charafeddine Mouzouni
LLM agents with tool access can discover and exploit security vulnerabilities. This is known. What is not known is which features of a system prompt...
1 month ago cs.CR cs.AI cs.CL
Benchmark MEDIUM
Jia Chengyu, AprilPyone MaungMaung, Huy H. Nguyen +2 more
Recent advances in vision-language models (VLMs) trained on web-scale image-text pairs have enabled impressive zero-shot transfer across a diverse...
Benchmark MEDIUM
Shuyao Gao, Minghao Huang
The deployment of Large Language Models (LLMs) has ignited concerns about technological unemployment. Existing task-based evaluations predominantly...
1 month ago cs.CY econ.GN
Tool HIGH
Zhuowen Yuan, Zhaorun Chen, Zhen Xiang +5 more
Existing research on LLM agent security mainly focuses on prompt injection and unsafe input/output behaviors. However, as agents increasingly rely on...