Attack HIGH
Xingyu Zhu, Beier Zhu, Shuo Wang +4 more
As vision-language models (VLMs) are increasingly deployed in open-world scenarios, they can be easily induced by visual jailbreak attacks to...
Benchmark MEDIUM
Zhanguang Zhang, Zhiyuan Li, Behnam Rahmati +10 more
Robot action planning in the real world is challenging as it requires not only understanding the current state of the environment but also predicting...
Benchmark MEDIUM
Marco Arazzi, Vignesh Kumar Kembu, Antonino Nocera
Large language models are becoming pervasive core components in many real-world applications. As a consequence, security alignment represents a...
Yesterday cs.CR cs.AI cs.CL
Defense MEDIUM
Rui Yang Tan, Yujia Hu, Roy Ka-Wei Lee
Multimodal Large Language Models (MLLMs) extend text-only LLMs with visual reasoning, but also introduce new safety failure modes under visually...
2 days ago cs.CR cs.AI cs.MM
Benchmark LOW
Mohammad Asadi, Jack W. O'Sullivan, Fang Cao +5 more
Multimodal AI systems have achieved remarkable performance across a broad range of real-world tasks, yet the mechanisms underlying visual-language...
Survey MEDIUM
Yanming Mu, Hao Hu, Feiyang Li +7 more
Retrieval-Augmented Generation (RAG) significantly mitigates the hallucinations and domain knowledge deficiency in large language models by...
2 days ago cs.CR cs.AI
Tool HIGH
Charoes Huang, Xin Huang, Amin Milani Fard
Prompt injection is listed as the number-one vulnerability class in the OWASP Top 10 for LLM Applications that can subvert LLM guardrails, disclose...
2 days ago cs.CR cs.SE
Benchmark LOW
Zhongyi Li, Wan Tian, Jingyu Chen +8 more
Multi-agent collaboration has emerged as a powerful paradigm for enhancing the reasoning capabilities of large language models, yet it suffers from...
Attack MEDIUM
Huamin Chen, Xunzhuo Liu, Bowei He +5 more
Over the past year, the vLLM Semantic Router project has released a series of work spanning: (1) core routing mechanisms -- signal-driven routing,...
2 days ago cs.LG cs.DC
Benchmark LOW
Zongjie Li, Chaozheng Wang, Yuchong Xie +2 more
Large Language Models are increasingly being considered for deployment in safety-critical military applications. However, current benchmarks suffer...
2 days ago cs.CY cs.AI
Attack MEDIUM
Kwanyoung Kim, Byeongsu Sim
Reinforcement learning from human feedback (RLHF) has proven effective in aligning large language models with human preferences, inspiring the...
3 days ago cs.LG cs.AI
Attack HIGH
Zihui Chen, Yuling Wang, Pengfei Jiao +4 more
Text-attributed graphs (TAGs) enhance graph learning by integrating rich textual semantics and topological context for each node. While boosting...
Tool LOW
Octavian Untila
An autonomous AI ecosystem (SUBSTRATE S3), generating product specifications without explicit instructions about formal methods, independently...
3 days ago cs.SE cs.AI
Attack HIGH
Yasamin Medghalchi, Milad Yazdani, Amirhossein Dabiriaghdam +7 more
Ultrasound is widely used in clinical practice due to its portability, cost-effectiveness, safety, and real-time imaging capabilities. However, image...
Benchmark LOW
Zihan Guo, Zhiyu Chen, Xiaohang Nie +3 more
With the rapid evolution of Large Language Model (LLM) agent ecosystems, centralized skill marketplaces have emerged as pivotal infrastructure for...
3 days ago cs.CR cs.SE
Attack MEDIUM
Abed K. Musaffar, Ambuj Singh, Francesco Bullo
Large language models (LLMs) are increasingly deployed in human-AI teams as support agents for complex tasks such as information retrieval,...
3 days ago cs.LG cs.AI cs.HC
Defense MEDIUM
Xinyue Liu, Niloofar Mireshghallah, Jane C. Ginsburg +1 more
Frontier LLM companies have repeatedly assured courts and regulators that their models do not store copies of training data. They further rely on...
3 days ago cs.CL cs.AI cs.CY
Tool MEDIUM
Uchi Uchibeke
AI agents today have passwords but no permission slips. They execute tool calls (fund transfers, database queries, shell commands, sub-agent...
3 days ago cs.CR cs.AI
Survey HIGH
Shouqiao Wang, Marcello Politi, Samuele Marro +1 more
As agentic systems move into real-world deployments, their decisions increasingly depend on external inputs such as retrieved content, tool outputs,...
Benchmark LOW
Yandan Zheng, Haoran Luo, Zhenghong Lin +2 more
Benchmarks are the de facto standard for tracking progress in large language models (LLMs), yet static test sets can rapidly saturate, become...