CVE-2026-34755 — MEDIUM (CVSS 6.5)

Q: Is CVE-2026-34755 actively exploited?

No confirmed active exploitation of CVE-2026-34755 has been reported, but organizations should still patch proactively.

Q: How to fix CVE-2026-34755?

1. PATCH: Upgrade to vLLM 0.19.0 (PR #38636 adds frame count enforcement to the load_base64 video/jpeg path).
2. WORKAROUND (if patching is delayed): At the API gateway or reverse proxy, enforce a maximum Content-Length on POST requests to /v1/chat/completions (e.g., 50 MB) and reject requests whose body contains data:video/jpeg with more than N commas (approximate frame count check via regex).
3. RATE LIMIT: Apply per-user/per-token rate limits on multimodal endpoints to reduce blast radius.
4. MONITOR: Alert on RSS memory spikes on vLLM worker processes, or on requests containing data:video/jpeg URLs with large payloads.
5. NETWORK SEGMENTATION: Ensure vLLM endpoints are not publicly exposed without authentication; require valid tokens even for internal use.

Q: What systems are affected by CVE-2026-34755?

This vulnerability affects the following AI/ML architecture patterns: model serving, multimodal AI pipelines, agent frameworks, LLM inference API.

Q: What is the CVSS score for CVE-2026-34755?

CVE-2026-34755 has a CVSS v3.1 base score of 6.5 (MEDIUM). The EPSS exploitation probability is 0.05%.

CISO Take

Any API consumer with a valid token can crash your vLLM inference server by sending a single multimodal request with thousands of base64-encoded JPEG frames, bypassing the built-in frame limit. Patch to vLLM 0.19.0 immediately; if you can't patch, block or rate-limit multimodal endpoints at the API gateway. This is trivially exploitable and availability impact is complete.

What is the risk?

The CVSS rating is Medium, but the operational impact is high for teams running vLLM in production. The attack requires only low privileges: any authenticated API user can trigger it. The request is small relative to the memory it forces the server to allocate. In the scenario described under Exploitation Scenario, roughly 100 MB on the wire expands to about 4.6 GB of decoded frames, before counting the copy made by np.stack(). No complex technique is required; the attacker just crafts a data URL with thousands of comma-separated frames. In multi-tenant or public-facing vLLM deployments the availability impact is critical, since a single request can OOM-kill the process serving all tenants.

What systems are affected?

Package	Ecosystem	Vulnerable Range	Patched
vllm	pip	>= 0.7.0, < 0.19.0	`0.19.0`

If you run a vllm version in the range >= 0.7.0, < 0.19.0, you are affected.
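To check a given environment against the vulnerable range above, something like the following sketch may help. It uses only the standard library; the helper names (parse_release, is_vulnerable, installed_vllm_status) are illustrative and not part of vLLM. It compares only the leading X.Y.Z of the version string, so a pre-release such as 0.19.0rc1 would be reported as patched even though it may not contain the fix.

```python
import re
from importlib import metadata

VULNERABLE_MIN = (0, 7, 0)   # first affected release, per the table above
PATCHED = (0, 19, 0)         # first fixed release, per the table above


def parse_release(version: str) -> tuple:
    """Extract the leading X.Y[.Z] of a version string as a tuple of ints.

    Suffixes such as 'rc1', '.post1' or '+cu121' are ignored (an assumption
    that keeps this sketch free of third-party dependencies).
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_vulnerable(version: str) -> bool:
    """True if the version falls in >= 0.7.0, < 0.19.0."""
    return VULNERABLE_MIN <= parse_release(version) < PATCHED


def installed_vllm_status() -> str:
    """Report on the vllm package installed in the current environment."""
    try:
        version = metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return "vllm is not installed in this environment"
    state = "VULNERABLE" if is_vulnerable(version) else "not in the affected range"
    return f"vllm {version}: {state}"


if __name__ == "__main__":
    print(installed_vllm_status())
```

Run it inside each virtual environment or container image that serves models, since the installed version can differ between them.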

Severity & Risk

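CVSS 3.1: 6.5 / 10

EPSS: 0.1% chance of exploitation in 30 days (higher than 17% of all CVEs)

Source: EPSS v3, FIRST.org

Exploitation Status: No known exploitation

Sophistication: Trivial

Attack Surface

Attack Vector (AV): Network

Attack Complexity (AC): Low

Privileges Required (PR): Low

User Interaction (UI): None

Scope (S): Unchanged

Confidentiality (C): None

Integrity (I): None

Availability (A): High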
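The metric values listed above are what produce the 6.5 base score. As a rough cross-check, the sketch below applies the CVSS v3.1 base-score equations for the Scope: Unchanged case, using the metric weights published in the FIRST.org specification. The function and dictionary names are illustrative, and the Scope: Changed branch is deliberately left out.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged values).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_unchanged(av, ac, pr, ui, c, i, a) -> float:
    """Base score for a vector with Scope: Unchanged."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# AV:N / AC:L / PR:L / UI:N / S:U / C:N / I:N / A:H
score = base_score_unchanged("N", "L", "L", "N", "N", "N", "H")
print(score)
```

The impact term comes entirely from Availability: High (6.42 x 0.56, about 3.6), and the exploitability term is about 2.8, so the sum of roughly 6.43 rounds up to 6.5.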

What should I do?

1. PATCH: Upgrade to vLLM 0.19.0 (PR #38636 adds frame count enforcement to the load_base64 video/jpeg path).
2. WORKAROUND (if patching is delayed): At the API gateway or reverse proxy, enforce a maximum Content-Length on POST requests to /v1/chat/completions (e.g., 50 MB) and reject requests whose body contains data:video/jpeg with more than N commas (approximate frame count check via regex).
3. RATE LIMIT: Apply per-user/per-token rate limits on multimodal endpoints to reduce blast radius.
4. MONITOR: Alert on RSS memory spikes on vLLM worker processes, or on requests containing data:video/jpeg URLs with large payloads.
5. NETWORK SEGMENTATION: Ensure vLLM endpoints are not publicly exposed without authentication; require valid tokens even for internal use.
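The workaround step can be sketched as a small pre-filter that a gateway plugin or sidecar might run before forwarding a request. This is a minimal illustration under stated assumptions, not a vetted control. The function names are hypothetical. The 32-frame limit mirrors the default num_frames mentioned in the NVD description and may differ in your deployment. Unlike the regex approach mentioned in the workaround, the sketch parses the JSON body first, so that escaped commas inside JSON strings (for example \u002c) are still counted.

```python
import json

MAX_BODY_BYTES = 50 * 1024 * 1024  # example limit from the workaround step
MAX_FRAMES = 32                    # assumption: matches the default num_frames


def iter_strings(node):
    """Yield every string value found anywhere in a decoded JSON document."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for value in node:
            yield from iter_strings(value)


def frame_count(url: str) -> int:
    """Approximate frame count of a data:video/jpeg URL (0 for anything else)."""
    if not url.lower().startswith("data:video/jpeg"):
        return 0
    _header, sep, payload = url.partition(",")
    if not sep:
        return 0
    return payload.count(",") + 1


def should_reject(raw_body: bytes,
                  max_frames: int = MAX_FRAMES,
                  max_body_bytes: int = MAX_BODY_BYTES) -> bool:
    """Return True if the request should be blocked before reaching vLLM."""
    if len(raw_body) > max_body_bytes:
        return True
    try:
        document = json.loads(raw_body)
        return any(frame_count(s) > max_frames for s in iter_strings(document))
    except (ValueError, RecursionError):
        # Design choice for this sketch: fail closed on bodies that cannot be
        # parsed or are pathologically nested.
        return True
```

Failing closed on unparseable bodies is a deliberate choice here; a deployment that proxies non-JSON traffic on the same path would need a different policy.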

CISA SSVC Assessment

Decision: Track

Exploitation: none

Automatable: no

Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Classification

DoS; Framework; Inference API

AML.T0029 - Denial of AI Service

AML.T0034 - Cost Harvesting

AML.T0049 - Exploit Public-Facing Application

Compliance Impact

This CVE is relevant to:

EU AI Act

Article 15 - Accuracy, Robustness and Cybersecurity

ISO 42001

8.4 - AI System Operation — Input data controls

NIST AI RMF

MANAGE 2.2 - Mechanisms to sustain AI system value and availability are established

OWASP LLM Top 10

LLM04 - Model Denial of Service

Technical Details

NVD Description

vLLM is an inference and serving engine for large language models (LLMs). From 0.7.0 to before 0.19.0, the VideoMediaIO.load_base64() method at vllm/multimodal/media/video.py splits video/jpeg data URLs by comma to extract individual JPEG frames, but does not enforce a frame count limit. The num_frames parameter (default: 32), which is enforced by the load_bytes() code path, is completely bypassed in the video/jpeg base64 path. An attacker can send a single API request containing thousands of comma-separated base64-encoded JPEG frames, causing the server to decode all frames into memory and crash with OOM. This vulnerability is fixed in 0.19.0.
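The flaw described above follows a common pattern: a limit is enforced on one input path but not on a sibling path. The sketch below illustrates that pattern in isolation. It is not vLLM's code: both function names are hypothetical, frames are returned as raw bytes rather than decoded images, and rejecting oversized input is an assumption about one reasonable fix, not a description of what PR #38636 actually does.

```python
import base64


def decode_frames_unbounded(data: str) -> list:
    """Mirrors the flaw as described: every comma-separated frame is decoded."""
    return [base64.b64decode(frame) for frame in data.split(",")]


def decode_frames_bounded(data: str, num_frames: int = 32) -> list:
    """Enforce the frame limit before any decoding work is done.

    split(",", num_frames) performs at most num_frames splits, so it yields
    at most num_frames + 1 pieces no matter how many commas the input holds.
    """
    parts = data.split(",", num_frames)
    if len(parts) > num_frames:
        raise ValueError(f"more than {num_frames} frames in video/jpeg data URL")
    return [base64.b64decode(part) for part in parts]
```

Checking the count before decoding matters because the memory cost comes from the decoded frames, not from the base64 text itself.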

Exploitation Scenario

An attacker with a valid API token (employee, contractor, or compromised credential) crafts a POST to /v1/chat/completions containing a messages array with a content item of type video_url (vLLM's content type for video inputs) whose url is a data:video/jpeg;base64,<frame1>,<frame2>,...,<frame5000> string. Each frame is a small but valid base64-encoded JPEG (~20 KB compressed). The vLLM server decodes all 5000 frames into numpy arrays (~921 KB each for 640x480 RGB, about 4.6 GB in total), and np.stack() then allocates a combined array of the same size, so peak usage approaches twice that figure and the process is killed for running out of memory. The 5000-frame payload totals roughly 100 MB over the wire, which many deployments accept unless a lower body-size limit is configured. The crash denies service to all other users. In Kubernetes deployments the pod restarts automatically, but repeated requests keep it in a crash loop.
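The figures in this scenario can be reproduced with a few lines of arithmetic. The sketch below uses only the numbers stated above (5000 frames, 640x480 RGB, about 20 KB per encoded frame). It assumes 8-bit samples and, for the peak estimate, that the per-frame arrays and the stacked array are alive at the same time. It treats the 20 KB figure as the on-the-wire size per frame, which ignores any additional base64 overhead.

```python
FRAMES = 5000
WIDTH, HEIGHT, CHANNELS = 640, 480, 3
ENCODED_FRAME_BYTES = 20_000  # ~20 KB per frame on the wire, as stated above


def decoded_frame_bytes(width: int, height: int, channels: int = 3,
                        bytes_per_sample: int = 1) -> int:
    """Size of one decoded frame held as a dense uint8 array."""
    return width * height * channels * bytes_per_sample


per_frame = decoded_frame_bytes(WIDTH, HEIGHT, CHANNELS)
decoded_total = FRAMES * per_frame          # all per-frame arrays
peak_with_stack_copy = 2 * decoded_total    # plus the stacked copy (assumption)
wire_total = FRAMES * ENCODED_FRAME_BYTES
amplification = decoded_total / wire_total

print(f"per frame:      {per_frame / 1e3:.1f} KB")
print(f"decoded total:  {decoded_total / 1e9:.2f} GB")
print(f"peak estimate:  {peak_with_stack_copy / 1e9:.2f} GB")
print(f"on the wire:    {wire_total / 1e6:.0f} MB")
print(f"amplification:  {amplification:.1f}x")
```

Under these assumptions each request byte costs the server roughly 46 bytes of decoded frame data, and about twice that at peak.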