Any unauthenticated attacker can crash a vLLM inference server that serves Idefics3 multimodal models on versions 0.6.4 through 0.11.x with a single HTTP request containing a 1x1 pixel image. No credentials or sophistication are required. Upgrade to vLLM 0.12.0 immediately. If patching is delayed, add API gateway input validation to reject images below a minimum dimension threshold.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| vllm | pip | >= 0.6.4, < 0.12.0 | 0.12.0 |
| vllm | pip | — | No patch |
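To check whether a given environment falls inside the vulnerable range listed in the table, a small script along the following lines may help. It is a minimal sketch using only the standard library. The helper names (`parse_release`, `is_vulnerable`, `installed_vllm_status`) are hypothetical, and the parser looks only at the numeric prefix of the version string, so a pre-release such as a release candidate of 0.12.0 would be treated as 0.12.0 itself.

```python
import re
from importlib import metadata

VULNERABLE_FROM = (0, 6, 4)  # first affected release, per the table
PATCHED_IN = (0, 12, 0)      # first fixed release, per the table


def parse_release(version: str) -> tuple:
    """Return (major, minor, patch) from the numeric prefix of a version string.

    Suffixes such as ".post1" or "+cu121" are ignored, and so are
    pre-release markers, which is a simplification.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_vulnerable(version: str) -> bool:
    """True when the version lies in the range >= 0.6.4 and < 0.12.0."""
    return VULNERABLE_FROM <= parse_release(version) < PATCHED_IN


def installed_vllm_status() -> str:
    """Describe the vllm distribution installed in the current environment."""
    try:
        version = metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return "vllm is not installed in this environment"
    state = "VULNERABLE" if is_vulnerable(version) else "outside the affected range"
    return f"vllm {version}: {state}"


if __name__ == "__main__":
    print(installed_vllm_status())
```

Running the script inside each serving image or virtual environment reports the installed version. A deployment is only exposed in practice if it also serves a model that uses the Idefics3 vision implementation.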
Severity & Risk
Recommended Action
1. PATCH: Upgrade vLLM to 0.12.0 or later. This is the definitive fix.
2. WORKAROUND (if patching is delayed): Implement API gateway or middleware input validation to reject images below a minimum dimension threshold (e.g., block images smaller than 32x32 pixels).
3. DETECTION: Monitor for abnormal inference server termination events, especially those correlated with multimodal API requests containing small image payloads. Alert on process restart events in vLLM containers and track inference endpoint availability.
4. RESILIENCE: Verify that vLLM containers have auto-restart policies and health checks to minimize per-attack downtime windows.
5. AUDIT: Enumerate all endpoints (internal and external) that accept image inputs routed to vLLM Idefics3 models, and prioritize patching for public-facing instances.
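The dimension check described in the workaround could look roughly like the sketch below. It reads the width and height directly from a PNG header using only the standard library. The function names and the fail-closed policy are assumptions made for illustration, and the 32-pixel threshold is the example figure from the workaround, not a value confirmed by the vLLM project. A production gateway would more likely decode uploads with a full image library so that JPEG, WebP, and other formats are covered too.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MIN_SIDE = 32  # illustrative threshold taken from the workaround text


def png_dimensions(data: bytes):
    """Return (width, height) from a PNG's IHDR chunk, or None if not a PNG.

    A PNG starts with an 8-byte signature, followed by the IHDR chunk:
    4-byte length, the 4-byte type "IHDR", then big-endian width and height.
    """
    if len(data) < 24 or not data.startswith(PNG_SIGNATURE):
        return None
    if data[12:16] != b"IHDR":
        return None
    return struct.unpack(">II", data[16:24])


def should_reject(data: bytes, min_side: int = MIN_SIDE) -> bool:
    """Decide whether a gateway should refuse to forward this image."""
    dims = png_dimensions(data)
    if dims is None:
        # Assumption: fail closed on anything this sketch cannot parse.
        return True
    width, height = dims
    return width < min_side or height < min_side
```

In a gateway or middleware layer, `should_reject` would run on the decoded image bytes before the request reaches vLLM, and a rejected request would get a 4xx response instead of being forwarded.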
Classification
Technical Details
NVD Description
vLLM is an inference and serving engine for large language models (LLMs). In versions from 0.6.4 to before 0.12.0, users can crash the vLLM engine serving multimodal models that use the Idefics3 vision model implementation by sending a specially crafted 1x1 pixel image. This causes a tensor dimension mismatch that results in an unhandled runtime error, leading to complete server termination. This issue has been patched in version 0.12.0.
Exploitation Scenario
An attacker identifies a public-facing API or internal endpoint serving a multimodal LLM application built on vLLM, discoverable via job postings, GitHub repos, or API fingerprinting. With no credentials required, the attacker constructs a multipart HTTP POST request containing a 1x1 pixel PNG image and submits it to any inference endpoint that processes images via Idefics3. The vLLM server attempts to process the image, hits a tensor dimension mismatch on the anomalous image shape, throws an unhandled runtime exception, and terminates the entire server process. Total exploit complexity: generate a 1x1 image (trivial with any image library, or even by hand) and send one HTTP request. The attacker can repeat the request each time the server comes back up to sustain a denial-of-service condition against any unpatched deployment.
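For defenders who want to confirm that a gateway filter or a patched staging server handles this input safely, a test fixture can be generated without any image library. The sketch below builds a minimal grayscale PNG of arbitrary size (1x1 by default) using only `struct` and `zlib`. The helper names are hypothetical, and the fixture is meant for use against systems you operate.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, payload: bytes) -> bytes:
    """Encode one PNG chunk: length, type, payload, CRC over type + payload."""
    crc = zlib.crc32(kind + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + kind + payload + struct.pack(">I", crc)


def make_png(width: int = 1, height: int = 1) -> bytes:
    """Build a valid 8-bit grayscale PNG filled with white pixels."""
    # IHDR: width, height, bit depth 8, colour type 0 (grayscale),
    # compression 0, filter 0, interlace 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is a filter-type byte (0 = none) followed by the pixels.
    scanlines = b"".join(b"\x00" + b"\xff" * width for _ in range(height))
    return (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(scanlines))
        + _chunk(b"IEND", b"")
    )


if __name__ == "__main__":
    with open("tiny_fixture.png", "wb") as handle:
        handle.write(make_png(1, 1))
```

Feeding `make_png(1, 1)` through the input-validation layer in a staging environment is one way to verify that undersized images are rejected before they reach the model.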
Weaknesses (CWE)
CVSS Vector
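The vector string listed in this section can be converted into a numeric base score with the public CVSS v3.1 base-score equations. The sketch below implements those equations for base metrics only. The function names are hypothetical and temporal and environmental metrics are not handled. For the vector `CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H` it yields 7.5, which falls in the High band (7.0 to 8.9).

```python
import math

AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
# Privileges Required weights depend on whether Scope is changed.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score from a base-metric vector string."""
    prefix, *fields = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("only CVSS:3.1 vectors are supported in this sketch")
    m = dict(field.split(":") for field in fields)
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]

    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H"))
```

Reading the metrics individually: network attack vector, low complexity, no privileges, no user interaction, and a high availability impact with no confidentiality or integrity impact. This matches the unauthenticated single-request crash described in this advisory.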
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
References
- github.com/advisories/GHSA-grg2-63fw-f2qr
- github.com/vllm-project/vllm/commit/0ec84221718d920c3f46da879cc354f94b8fb59e
- github.com/vllm-project/vllm/pull/29881
- github.com/vllm-project/vllm/security/advisories/GHSA-grg2-63fw-f2qr
- nvd.nist.gov/vuln/detail/CVE-2026-22773