If your organization runs vLLM with multimodal (image) capabilities on any version from 0.8.3 through 0.14.0, patch to 0.14.1 immediately: this is an unauthenticated RCE via a two-stage exploit chain, requiring no credentials and no user interaction. An attacker sends a malformed image, extracts a heap address from the verbatim error response, then chains it with a JPEG2000 heap overflow for reliable code execution on your inference servers. If patching is blocked, disable image inputs at the API gateway or WAF as a stopgap; do not treat anything less than that as sufficient mitigation.
Risk Assessment
Critical. CVSS 9.8 with network-accessible attack vector, no authentication, and no user interaction makes this trivially weaponizable at scale. The two-stage exploit reduces ASLR entropy from ~32 bits to ~3 bits (8 guesses vs. 4 billion), converting what would normally be a probabilistic heap overflow into a near-deterministic RCE. vLLM is widely deployed as the inference backbone for production LLM APIs, often with direct internet exposure. Blast radius includes full server compromise: model weights, system prompts, API credentials, customer inference data, and internal network pivot. EPSS of 0.00084 reflects early scoring before exploit code becomes public — treat this as pre-weaponization, not low-risk.
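To make the entropy numbers concrete, here is a minimal illustrative sketch (a defensive illustration, not exploit code); the leaked pointer value, page alignment, and candidate count are hypothetical assumptions, not details from the advisory:

```python
# Illustrative only: how a single leaked heap pointer collapses the ASLR
# search space. All concrete values below are hypothetical.

full_entropy_bits = 32       # blind heap guessing, per the advisory
residual_entropy_bits = 3    # remaining uncertainty after the leak

print(f"Blind guesses:  {2**full_entropy_bits:,}")    # ~4.3 billion
print(f"With the leak:  {2**residual_entropy_bits}")  # 8

# A leaked pointer pins the heap region; only a few low-order layout
# possibilities remain. Assuming the target allocation sits at a fixed
# offset from a page-aligned base (0x1000 pages), the attacker enumerates
# a handful of base candidates instead of the whole address space:
leaked_ptr = 0x7F3A12C45B10  # hypothetical address parsed from an error body
page = 0x1000
base_guess = leaked_ptr & ~(page - 1)
candidates = [base_guess - i * page for i in range(2**residual_entropy_bits)]
print([hex(c) for c in candidates])
```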
Recommended Action
Six steps, in priority order:
1. PATCH (priority 1): Upgrade vLLM to 0.14.1 immediately; this both fixes the heap address leak in error responses and tightens exception handling.
2. WORKAROUND (if patching is blocked same-day): Disable multimodal endpoints via vLLM config or block image content-types (image/jpeg, image/png, image/jp2) at the API gateway or WAF layer (see the gateway middleware sketch after this list).
3. HARDEN: Ensure API error responses never return raw exception messages or stack traces to clients; implement a generic error handler that logs internally but returns sanitized responses externally (see the error-handler sketch after this list).
4. DETECT: Search existing logs and response bodies for memory address patterns (regex: 0x[0-9a-fA-F]{8,16}) in API error payloads; alert on unexpected outbound TCP connections from inference servers (see the log-scan sketch after this list).
5. NETWORK: Enforce egress filtering on inference servers to limit post-compromise pivot; segment them from internal networks and data stores.
6. AUDIT: Review access logs from your vLLM 0.8.3 deployment date onward for patterns of malformed image submissions, especially JPEG2000 (.jp2, .j2k) files with anomalous sizes (see the audit-scan sketch after this list).
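A minimal sketch of the step-2 workaround, written as a gateway-side middleware so it stays self-contained; the app and content-type list are placeholders for your own gateway sitting in front of vLLM, and the same logic can be translated to WAF or reverse-proxy rules:

```python
# Sketch: reject image uploads at the gateway before they reach vLLM.
# "app" is a placeholder gateway application, not vLLM itself.
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()

BLOCKED_CONTENT_TYPES = ("image/jpeg", "image/png", "image/jp2")

@app.middleware("http")
async def block_image_inputs(request: Request, call_next):
    content_type = request.headers.get("content-type", "")
    if any(ct in content_type for ct in BLOCKED_CONTENT_TYPES):
        return JSONResponse(status_code=415,
                            content={"error": "image inputs are disabled"})
    # Note: OpenAI-style multimodal requests often embed images as base64
    # data URLs inside a JSON body, which this header check will not catch;
    # pair it with body inspection at the gateway if your clients send those.
    return await call_next(request)
```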
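vLLM's OpenAI-compatible server is built on FastAPI, so the step-3 hardening pattern can be sketched in the same framework; this is a generic boundary handler under that assumption, not the project's actual patch:

```python
# Sketch of "sanitize errors at the boundary": log the full exception
# server-side, return nothing attacker-useful to the client.
import logging
import uuid

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

logger = logging.getLogger("api.errors")
app = FastAPI()

@app.exception_handler(Exception)
async def sanitized_error_handler(request: Request, exc: Exception) -> JSONResponse:
    incident_id = uuid.uuid4().hex  # correlate client reports with server logs
    logger.error("unhandled error %s on %s", incident_id,
                 request.url.path, exc_info=exc)
    # No exception message, no repr, no traceback: raw exception text is
    # exactly what leaked the heap address in this CVE.
    return JSONResponse(
        status_code=500,
        content={"error": "internal error", "incident_id": incident_id},
    )
```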
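A possible starting point for the step-4 log sweep; the 500-status heuristics and log format are placeholders to adapt to your own log schema:

```python
# Sketch: sweep stored API logs for pointer-like tokens in error responses.
# Usage: python scan_logs.py /var/log/gateway/access.log ...
import re
import sys

ADDR_RE = re.compile(r"0x[0-9a-fA-F]{8,16}")

def scan(path: str) -> None:
    with open(path, errors="replace") as fh:
        for lineno, line in enumerate(fh, 1):
            # Focus on error responses to cut noise; adjust to your schema.
            if '" 500 ' in line or '"status": 500' in line:
                for match in ADDR_RE.findall(line):
                    print(f"{path}:{lineno}: possible leaked address {match}")

if __name__ == "__main__":
    for log_file in sys.argv[1:]:
        scan(log_file)
```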
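And a rough sketch for the step-6 audit; the size threshold and numeric-field extraction are guesses to tune against your own traffic, not published indicators of compromise:

```python
# Sketch: flag JPEG2000-looking requests with unusually large sizes in
# access logs. Usage: python audit_jp2.py /var/log/gateway/access.log ...
import re
import sys

JP2_RE = re.compile(r"\.(jp2|j2k)\b", re.IGNORECASE)
SIZE_THRESHOLD = 5_000_000  # bytes; a tunable guess, not a published IoC

for path in sys.argv[1:]:
    with open(path, errors="replace") as fh:
        for lineno, line in enumerate(fh, 1):
            if not JP2_RE.search(line):
                continue
            # Pull any large numeric fields (request/body sizes) heuristically.
            sizes = [int(n) for n in re.findall(r"\b(\d{4,})\b", line)]
            if any(n > SIZE_THRESHOLD for n in sizes):
                print(f"{path}:{lineno}: large JPEG2000 request: "
                      f"{line.strip()[:200]}")
```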
Frequently Asked Questions
What is CVE-2026-22778?
CVE-2026-22778 is a critical (CVSS 9.8) unauthenticated remote code execution vulnerability in vLLM's multimodal image handling, affecting versions 0.8.3 through 0.14.0. A malformed image causes vLLM to echo a PIL exception containing a heap address back to the client, defeating ASLR; the attacker then chains this leak with a JPEG2000 heap overflow in OpenCV/FFmpeg to execute code on the inference server. The fix is to upgrade to 0.14.1; if that is blocked, disable image inputs at the API gateway or WAF as a stopgap.
Is CVE-2026-22778 actively exploited?
No confirmed active exploitation of CVE-2026-22778 has been reported, but organizations should still patch proactively.
How to fix CVE-2026-22778?
1. PATCH (priority 1): Upgrade vLLM to 0.14.1 immediately; this both fixes the heap address leak in error responses and tightens exception handling. 2. WORKAROUND (if patching is blocked same-day): Disable multimodal endpoints via vLLM config or block image content-types (image/jpeg, image/png, image/jp2) at the API gateway or WAF layer. 3. HARDEN: Ensure API error responses never return raw exception messages or stack traces to clients; implement a generic error handler that logs internally but returns sanitized responses externally. 4. DETECT: Search existing logs and response bodies for memory address patterns (regex: 0x[0-9a-fA-F]{8,16}) in API error payloads; alert on unexpected outbound TCP connections from inference servers. 5. NETWORK: Enforce egress filtering on inference servers to limit post-compromise pivot; segment them from internal networks and data stores. 6. AUDIT: Review access logs from your vLLM 0.8.3 deployment date onward for patterns of malformed image submissions, especially JPEG2000 (.jp2, .j2k) files with anomalous sizes.
What systems are affected by CVE-2026-22778?
This vulnerability affects the following AI/ML architecture patterns: LLM inference serving, Multimodal model serving, Vision-language model deployments, RAG pipelines with image processing, AI API gateways, AI agent frameworks with vision tools.
What is the CVSS score for CVE-2026-22778?
CVE-2026-22778 has a CVSS v3.1 base score of 9.8 (CRITICAL). The EPSS exploitation probability is 0.084% (score 0.00084).
Technical Details
NVD Description
vLLM is an inference and serving engine for large language models (LLMs). From version 0.8.3 to before 0.14.1, when an invalid image is sent to vLLM's multimodal endpoint, PIL throws an error. vLLM returns this error to the client, leaking a heap address. With this leak, ASLR is reduced from roughly 4 billion guesses to ~8. This vulnerability can be chained with a heap overflow in the JPEG2000 decoder in OpenCV/FFmpeg to achieve remote code execution. This vulnerability is fixed in 0.14.1.
Exploitation Scenario
An unauthenticated external attacker targets an organization's public-facing LLM API powered by vLLM with vision capabilities.

Stage 1 (ASLR defeat): The attacker sends a crafted invalid image (truncated or structurally malformed JPEG) to the multimodal endpoint. PIL raises an unhandled exception whose message contains a heap memory address; vLLM forwards this verbatim in the HTTP 500 error response. The attacker parses the JSON error body, extracts the address, and computes the heap base, reducing ASLR entropy from ~32 bits to ~3 bits.

Stage 2 (RCE): The attacker crafts a malicious JPEG2000 file engineered to trigger the heap overflow in OpenCV/FFmpeg's libopenjp2 decoder. With the heap layout now known, memory corruption offsets are reliable, and the attacker achieves remote code execution on the inference server.

Stage 3 (post-exploitation): A reverse shell or C2 beacon is installed. The attacker exfiltrates model weights, API keys stored in environment variables, system prompts, and customer inference logs. From the inference server, the attacker pivots to internal storage, databases, or adjacent AI infrastructure, all without ever authenticating to the API.
CVSS Vector
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
References
- github.com/advisories/GHSA-4r2x-xpjr-7cvv
- nvd.nist.gov/vuln/detail/CVE-2026-22778
- github.com/vllm-project/vllm/pull/31987 (Issue, Patch)
- github.com/vllm-project/vllm/pull/32319 (Issue, Patch)
- github.com/vllm-project/vllm/releases/tag/v0.14.1 (Release)
- github.com/vllm-project/vllm/security/advisories/GHSA-4r2x-xpjr-7cvv (Vendor Advisory)
Related Vulnerabilities
- CVE-2024-9053 (9.8, same package: vllm): RCE via unsafe pickle deserialization in RPC server
- CVE-2026-25960 (9.8, same package: vllm): SSRF allows internal network access
- CVE-2025-47277 (9.8, same package: vllm): RCE via exposed TCPStore in distributed inference
- CVE-2024-11041 (9.8, same package: vllm): RCE via unsafe pickle deserialization in MessageQueue
- CVE-2025-32444 (9.8, same package: vllm): RCE via pickle deserialization on ZeroMQ