CVE-2026-54236: vLLM heap address leak enables ASLR bypass

CISO Take

CVE-2026-54236 is an incomplete fix for the parent RCE chain CVE-2026-22778. vLLM's Anthropic API router and realtime speech-to-text WebSocket echo raw PIL exception messages, including heap object addresses, directly to callers: the HTTP routes catch exceptions internally and the WebSocket path never traverses FastAPI's HTTP exception handlers, so both bypass the global sanitizing handler added in the original patch. The leak is the documented Stage 1 primitive of a two-stage exploit that chains address disclosure with a libopenjp2 heap overflow for remote code execution; it requires no authentication and only a single malformed image payload against any exposed /v1/messages endpoint. vLLM has 130 downstream dependents, so exposure is broad, although there is no confirmed active exploitation at publication time and the EPSS score is low (higher than only 5% of all CVEs). Teams running vLLM ≤0.23.0 should apply the fix from PR #45119 as soon as it ships in a tagged release; as an interim measure, wrap str(e) with sanitize_message() at the five identified code sites and monitor for HTTP 500 responses containing hex address patterns in error bodies.

Sources: NVD, GitHub Advisory, EPSS, ATLAS

What is the risk?

The researcher-assigned CVSS 3.1 score of 5.3 (Medium) understates the chained risk. This leak is the documented Stage 1 primitive that reduces ASLR entropy from approximately 4 billion candidates to 8, directly enabling the libopenjp2 heap overflow RCE from the parent CVE. The attack surface is unauthenticated, network-accessible, and low-complexity: a single malformed JPEG or PNG in a multimodal API request triggers the address leak. Five separate code paths across HTTP and WebSocket transports are affected. They were introduced over a four-month window without the sanitizer being applied, which points to a systemic gap in the project's patch propagation process. There is no KEV listing and no public PoC beyond the researcher's controlled reproduction, which keeps near-term exploitation risk moderate, though elevated for inference servers with external exposure.
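The entropy reduction quoted above can be checked with simple arithmetic. The following is a minimal sketch, assuming that "approximately 4 billion" means 2^32 equally likely candidates; the advisory does not state the exact figure, so the variable values are illustrative.

```python
import math

# Assumption: "approximately 4 billion" is read as 2**32 equally likely candidates.
candidates_before = 2 ** 32
candidates_after = 8

# Entropy in bits is log2 of the number of equally likely candidates.
bits_before = math.log2(candidates_before)
bits_after = math.log2(candidates_after)
bits_removed = bits_before - bits_after

print(f"entropy before leak: {bits_before:.0f} bits")
print(f"entropy after leak:  {bits_after:.0f} bits")
print(f"bits removed by one leaked address: {bits_removed:.0f}")
```

Under that assumption, a single leaked address removes 29 of 32 bits of uncertainty, which is why a Medium-rated disclosure matters so much to the chained exploit.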

How does the attack unfold?

Initial Access

Attacker sends POST /v1/messages to any exposed vLLM Anthropic API endpoint with a base64-encoded malformed image in the content part — no authentication required.

AML.T0049

Exploitation

PIL's Image.open() raises UnidentifiedImageError whose message contains the BytesIO heap object address; the route's internal catch block returns str(e) without sanitization, bypassing the global handler.

AML.T0040

Address Disclosure

The raw heap address is returned verbatim in the JSON error.message field, reducing ASLR entropy from approximately 4 billion to 8 candidates.

AML.T0107

Chained RCE

Attacker uses the leaked address to calibrate offsets for a libopenjp2 heap overflow (from parent CVE-2026-22778), achieving remote code execution on the vLLM inference server.

AML.T0049
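The Exploitation and Address Disclosure steps above can be illustrated without vLLM or Pillow. This is a hedged sketch: `fake_image_open` and `route_local_handler` are hypothetical stand-ins that only mimic the message format and the catch-and-echo pattern described in the advisory, and `sanitize_message` is reconstructed from the advisory's description of the narrow pattern, not copied from vLLM's source.

```python
import io
import re


def fake_image_open(data: bytes):
    """Hypothetical stand-in for PIL.Image.open failing on malformed bytes."""
    buf = io.BytesIO(data)
    # Mimics the message format quoted in the advisory; repr(buf) embeds the
    # object's memory address on CPython.
    raise ValueError(f"cannot identify image file {buf!r}")


def route_local_handler(data: bytes) -> dict:
    """Catches the exception inside the route and echoes str(e) to the caller."""
    try:
        fake_image_open(data)
    except Exception as e:
        # Because the exception is handled here, no global handler ever sees it.
        return {"error": {"type": "internal_error", "message": str(e)}}
    return {"ok": True}


def sanitize_message(message: str) -> str:
    """Narrow sanitizer based on the pattern quoted in the advisory (assumed)."""
    return re.sub(r" at 0x[0-9a-f]+>", ">", message)


leaky = route_local_handler(b"not an image")["error"]["message"]
safe = sanitize_message(leaky)

print(leaky)  # contains "<_io.BytesIO object at 0x...>"
print(safe)   # address removed
```

The point of the sketch is that the sanitizer only helps where it is actually called; a route that builds its own error response has to call it explicitly.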

What systems are affected?

Package	Ecosystem	Vulnerable Range	Patched
vLLM	pip	<= 0.23.0	No patch
82.8K · 130 dependents · Pushed 3d ago · 35% patched · ~30d to patch

Do you use vLLM? You're affected.

How severe is it?

CVSS 3.1

N/A

EPSS

0.0%

chance of exploitation in 30 days

Higher than 5% of all CVEs

Source: EPSS v3 — FIRST.org

Exploitation Status

No known exploitation

Sophistication

Trivial

What should I do?

5 steps

Patch: Apply vLLM PR #45119 (commit 94923629) as soon as available in a tagged release.
Immediate workaround: In api_router.py lines 78 and 124, serving.py line 808, and connection.py lines 75 and 265, replace str(e) with sanitize_message(str(e)), importing sanitize_message from vllm.entrypoints.utils.
Strengthen the sanitize_message regex from the narrow pattern ' at 0x[0-9a-f]+>' to the broader '\b0x[0-9a-fA-F]{6,}\b' to capture future non-standard repr formats.
Network controls: If the Anthropic API router is not required externally, restrict /v1/messages and /v1/messages/count_tokens to trusted internal networks.
Detection: Alert on HTTP 500 responses or WebSocket error frames from vLLM containing 0x[0-9a-fA-F]{6,} patterns in the body; correlate with repeated multimodal requests from a single source IP.
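The detection step above can be prototyped as a small log scan. This is a sketch under stated assumptions: the `(source_ip, status_code, body)` record shape and the `find_leaks` helper are hypothetical, the sample records are made up for illustration, and WebSocket error frames (which carry no HTTP status) would need a separate feed.

```python
import re
from collections import Counter

# Pattern taken from the detection guidance: six or more hex digits after "0x".
HEX_ADDR = re.compile(r"0x[0-9a-fA-F]{6,}")


def find_leaks(records):
    """Count, per source IP, the 5xx responses whose body contains a hex address.

    `records` is an iterable of (source_ip, status_code, body) tuples,
    a log shape assumed for this sketch.
    """
    hits = Counter()
    for source_ip, status_code, body in records:
        if status_code >= 500 and HEX_ADDR.search(body):
            hits[source_ip] += 1
    return hits


# Illustrative sample data only.
sample = [
    ("203.0.113.7", 500, '{"error": {"message": "cannot identify image file <_io.BytesIO object at 0x7a95e299e750>"}}'),
    ("203.0.113.7", 500, '{"error": {"message": "cannot identify image file <_io.BytesIO object at 0x7a95e299e990>"}}'),
    ("198.51.100.4", 500, '{"error": {"message": "upstream timeout"}}'),
    ("198.51.100.4", 200, '{"id": "msg_0x1234567"}'),
]

hits = find_leaks(sample)
for ip, count in hits.items():
    print(f"{ip}: {count} error responses containing a hex address")
```

Repeated hits from one source are the signal worth alerting on, since an attacker typically needs more than one leaked address to map the heap reliably.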

How is it classified?

Data Leakage, Code Execution, Inference API

AML.T0040 - AI Model Inference API Access

AML.T0049 - Exploit Public-Facing Application

AML.T0107 - Exploitation for Defense Evasion

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act

Art. 15 - Accuracy, robustness and cybersecurity

ISO 42001

8.4 - AI system technical specifications

NIST AI RMF

MANAGE 2.4 - Identified AI risks are prioritized and managed

OWASP LLM Top 10

LLM02 - Sensitive Information Disclosure

Frequently Asked Questions

What is CVE-2026-54236?

CVE-2026-54236 is an incomplete fix for the parent RCE chain CVE-2026-22778. vLLM's Anthropic API router and realtime speech-to-text WebSocket echo raw PIL exception messages, including heap object addresses, directly to callers: the HTTP routes catch exceptions internally and the WebSocket path never traverses FastAPI's HTTP exception handlers, so both bypass the global sanitizing handler added in the original patch. The leak is the documented Stage 1 primitive of a two-stage exploit that chains address disclosure with a libopenjp2 heap overflow for remote code execution; it requires no authentication and only a single malformed image payload against any exposed /v1/messages endpoint. vLLM has 130 downstream dependents, so exposure is broad, although there is no confirmed active exploitation at publication time and the EPSS score is low (higher than only 5% of all CVEs). Teams running vLLM ≤0.23.0 should apply the fix from PR #45119 as soon as it ships in a tagged release; as an interim measure, wrap str(e) with sanitize_message() at the five identified code sites and monitor for HTTP 500 responses containing hex address patterns in error bodies.

Is CVE-2026-54236 actively exploited?

No confirmed active exploitation of CVE-2026-54236 has been reported, but organizations should still patch proactively.

How to fix CVE-2026-54236?

1. Patch: Apply vLLM PR #45119 (commit 94923629) as soon as available in a tagged release.
2. Immediate workaround: In api_router.py lines 78 and 124, serving.py line 808, and connection.py lines 75 and 265, replace str(e) with sanitize_message(str(e)), importing sanitize_message from vllm.entrypoints.utils.
3. Strengthen the sanitize_message regex from the narrow pattern ' at 0x[0-9a-f]+>' to the broader '\b0x[0-9a-fA-F]{6,}\b' to capture future non-standard repr formats.
4. Network controls: If the Anthropic API router is not required externally, restrict /v1/messages and /v1/messages/count_tokens to trusted internal networks.
5. Detection: Alert on HTTP 500 responses or WebSocket error frames from vLLM containing 0x[0-9a-fA-F]{6,} patterns in the body; correlate with repeated multimodal requests from a single source IP.

What systems are affected by CVE-2026-54236?

This vulnerability affects the following AI/ML architecture patterns: LLM inference servers, multimodal AI pipelines, model serving, real-time speech AI services.

What is the CVSS score for CVE-2026-54236?

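No official CVSS score has been assigned yet. The researcher's advisory rates the issue CVSS 3.1 5.3 (Medium), AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N.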

What is the AI security impact?

Affected AI Architectures

LLM inference servers, multimodal AI pipelines, model serving, real-time speech AI services

MITRE ATLAS Techniques

AML.T0040 AI Model Inference API Access

AML.T0049 Exploit Public-Facing Application

AML.T0107 Exploitation for Defense Evasion

Compliance Controls Affected

EU AI Act: Art. 15

ISO 42001: 8.4

NIST AI RMF: MANAGE 2.4

OWASP LLM Top 10: LLM02

What are the technical details?

Original Advisory

# vLLM: incomplete CVE-2026-22778 fix leaks PIL repr addresses via the Anthropic API router

**Researcher:** Kai Aizen, SnailSploit (@SnailSploit), Adversarial & Offensive Security Research

**Severity:** CVSS 3.1 5.3 (Medium) `AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N`

**Target:** https://github.com/vllm-project/vllm

---

## Summary

The fix for CVE-2026-22778 / GHSA-4r2x-xpjr-7cvv (PRs #31987 and #32319) introduced `sanitize_message` and applied it at four FastAPI exception-handling sites in the OpenAI router. The sanitizer strips object-repr memory addresses (`<_io.BytesIO object at 0x7a95e299e750>` → `<_io.BytesIO object>`) before error messages reach the client, defeating the ASLR-bypass primitive that CVE-2026-22778 chained with a libopenjp2 heap overflow for RCE.

The fix is incomplete: response paths added to vLLM at or after the same time as the fix continue to echo `str(exc)` directly to clients without `sanitize_message`. The original Stage 1 primitive (sending malformed image bytes so PIL raises `UnidentifiedImageError` whose message contains the BytesIO object repr) reaches all of them unmodified and leaks the heap address verbatim in the response body.

All five lines below are present in `main` HEAD (`771e1e48b`, 2026-05-26).

## Affected sites

Current `main` HEAD (`771e1e48b`, 2026-05-26):

| # | File | Line | Code |
|---|---|---|---|
| 1 | `vllm/entrypoints/anthropic/api_router.py` | 78 | `message=str(e),` (inside `POST /v1/messages` exception handler) |
| 2 | `vllm/entrypoints/anthropic/api_router.py` | 124 | `message=str(e),` (inside `POST /v1/messages/count_tokens`) |
| 3 | `vllm/entrypoints/anthropic/serving.py` | 808 | `error=AnthropicError(type="internal_error", message=str(e)),` (SSE streaming converter) |
| 4 | `vllm/entrypoints/speech_to_text/realtime/connection.py` | 75 | `await self.send_error(str(e), "processing_error")` (WebSocket event loop) |
| 5 | `vllm/entrypoints/speech_to_text/realtime/connection.py` | 265 | `await self.send_error(str(e), "processing_error")` (WebSocket generation loop) |

## Why the global exception handler does not save these paths

`api_server.py` registers a catch-all `app.exception_handler(Exception)(exception_handler)` at line 262, and that handler calls `create_error_response(exc)` which DOES apply `sanitize_message`. However, FastAPI exception handlers fire only on **unhandled** exceptions that propagate out of a route function. All affected HTTP paths catch `Exception` *inside* the route coroutine and construct the response themselves:

```python
# vllm/entrypoints/anthropic/api_router.py:71-81 (POST /v1/messages)
try:
    generator = await handler.create_messages(request, raw_request)
except Exception as e:
    logger.exception("Error in create_messages: %s", e)
    return JSONResponse(
        status_code=HTTPStatus.INTERNAL_SERVER_ERROR.value,
        content=AnthropicErrorResponse(
            error=AnthropicError(
                type="internal_error",
                message=str(e),  # <-- unsanitized
            )
        ).model_dump(),
    )
```

Because the exception is caught and a `JSONResponse` is returned in-route, every registered FastAPI exception handler, including the sanitizing global one, is bypassed.

The WebSocket path bypasses it for a different reason: WebSocket frames don't traverse FastAPI's HTTP exception handler chain at all.

## Reachability: the same primitive as the parent CVE

The Anthropic Messages API accepts image content parts in the request body (`type: "image"` with base64 `source.data` or `type: "image_url"`). Image bytes are passed to the same multimodal loader used by the OpenAI router. Malformed bytes cause `PIL.Image.open` to raise:

```
UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7a95e299e750>
```

The exception propagates up through `handler.create_messages` into the `except Exception as e:` at `api_router.py:75`. `str(e)` returns the exception message verbatim, including the address. The address ends up in the `error.message` field of the JSON response body returned to the attacker. ASLR entropy on the affected process drops from ~4 billion to ~8 candidates, identically to CVE-2026-22778 Stage 1.

The same primitive is reachable on `POST /v1/messages/count_tokens` (route #2), inside the SSE streaming converter when an exception is raised mid-stream (route #3), and over the realtime speech-to-text WebSocket when audio decoder or generation paths raise an exception containing any object repr (routes #4, #5).

## Chronology: these are scope misses, not legacy code

- **2026-01-09:** PR #31987 (`aa125ecf0`) introduces `sanitize_message` and applies it to OpenAI router HTTP exception handlers.
- **2026-01-15** (six days later): PR #32369 (`4c1c501a7`) adds `vllm/entrypoints/anthropic/api_router.py` containing line 78's `message=str(e)`. The fix was not applied to the new router.
- **2026-03-02** (~two months later): PR #35588 (`9a87b0578`) adds the Anthropic `count_tokens` endpoint, replicating the same `message=str(e)` pattern at line 124.
- **2026-05-12** (~four months later): PR #42370 (`d37e25ffb`) consolidates speech-to-text entrypoints and the realtime WebSocket uses `send_error(str(e), ...)` for both error paths.
- **2026-05-26:** current `main` HEAD, all five lines still present.

## Remediation

### 1. Apply `sanitize_message` symmetrically to the five sites

```python
# vllm/entrypoints/anthropic/api_router.py, add at top:
from vllm.entrypoints.utils import sanitize_message

# Line 78 (POST /v1/messages) and Line 124 (count_tokens):
message=sanitize_message(str(e)),
```

```python
# vllm/entrypoints/anthropic/serving.py, add at top:
from vllm.entrypoints.utils import sanitize_message

# Line 808:
error=AnthropicError(type="internal_error", message=sanitize_message(str(e))),
```

```python
# vllm/entrypoints/speech_to_text/realtime/connection.py, add at top:
from vllm.entrypoints.utils import sanitize_message

# Lines 75 and 265:
await self.send_error(sanitize_message(str(e)), "processing_error")
```

### 2. Tighten the regex (defense in depth)

The current regex `r" at 0x[0-9a-f]+>"` is narrow: it only matches the exact CPython builtin object-repr suffix in lowercase hex with a trailing `>`. Future Python versions, C extensions, or custom `__repr__` methods could produce non-matching formats that re-enable the leak:

```python
# vllm/entrypoints/utils.py
def sanitize_message(message: str) -> str:
    # Strip any standalone hex address; downstream observers don't need them.
    return re.sub(r"\b0x[0-9a-fA-F]{6,}\b", "0x?", message)
```

### 3. Future-proofing: consider a response middleware

Both the route-local exception handling pattern (Anthropic router) and the WebSocket path bypass FastAPI's exception handler chain. A response-level middleware that always invokes `sanitize_message` on outgoing error bodies would prevent this class of regression entirely.

## Affected versions

- All vLLM versions containing `vllm/entrypoints/anthropic/api_router.py` (introduced 2026-01-15 in PR #32369).
- All vLLM versions containing `vllm/entrypoints/speech_to_text/realtime/connection.py` (introduced 2026-05-12 in PR #42370).
- Confirmed present in `main` HEAD `771e1e48b` (2026-05-26).

## Steps to reproduce

1. Clone the target: `git clone --depth 1 https://github.com/vllm-project/vllm`
2. Run the proof of concept (`PoC.py`) against the cloned source.
3. Observe the result shown under *Verified result* below.

## Credit

Kai Aizen, SnailSploit (@SnailSploit). Adversarial & Offensive Security Research.

## Fix

A fix for this vulnerability was added here: https://github.com/vllm-project/vllm/pull/45119
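The advisory's third remediation idea, sanitizing every outgoing error body at one choke point, can be sketched independently of any web framework. This is an illustrative sketch, not vLLM's implementation: `sanitize_payload` is a hypothetical helper, and `sanitize_message` here uses the broader regex proposed in the advisory. Wiring it into an actual FastAPI middleware or WebSocket send path is left out and would need to be verified against the framework's API.

```python
import re

# Broader pattern proposed in the advisory: any standalone hex literal
# with six or more digits.
_ADDR = re.compile(r"\b0x[0-9a-fA-F]{6,}\b")


def sanitize_message(message: str) -> str:
    """Replace standalone hex addresses with a placeholder."""
    return _ADDR.sub("0x?", message)


def sanitize_payload(payload):
    """Recursively sanitize every string in a JSON-like error body.

    Hypothetical helper: meant to be called once, just before an error
    body is serialized, so individual routes cannot forget it.
    """
    if isinstance(payload, str):
        return sanitize_message(payload)
    if isinstance(payload, dict):
        return {key: sanitize_payload(value) for key, value in payload.items()}
    if isinstance(payload, list):
        return [sanitize_payload(item) for item in payload]
    return payload


body = {
    "error": {
        "type": "internal_error",
        "message": "cannot identify image file <_io.BytesIO object at 0x7a95e299e750>",
    }
}
clean = sanitize_payload(body)
print(clean["error"]["message"])
```

Sanitizing at serialization time trades a small per-response cost for not having to audit every new `except` block, which is the regression pattern the chronology above describes.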

Exploitation Scenario

An attacker targeting an organization's vLLM-powered Anthropic-compatible inference endpoint constructs a POST to /v1/messages containing a valid request body with an image content part bearing base64-encoded malformed bytes — for example, a truncated JPEG with an invalid header. The vLLM multimodal handler passes the payload to PIL's Image.open(), which raises UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7a95e299e750>. The Anthropic router's internal exception catch at api_router.py:75 returns this message verbatim in the JSON response field error.message, bypassing the global sanitizing handler. The attacker parses the hex address, maps the heap layout of the vLLM process, and uses this to calibrate offsets for a follow-on libopenjp2 heap overflow — completing the two-stage RCE chain from the parent CVE. The same primitive is reachable via the count_tokens endpoint and the speech-to-text WebSocket with analogous malformed payloads, providing multiple redundant channels for the leak.

Weaknesses (CWE)

CWE-532 Insertion of Sensitive Information into Log File Primary

CWE-532 — Insertion of Sensitive Information into Log File: The product writes sensitive information to a log file.

[Architecture and Design, Implementation] Consider seriously the sensitivity of the information written into log files. Do not write secrets into the log files.
[Distribution] Remove debug log files before deploying the application into production.

Source: MITRE CWE corpus.