CVE-2026-24779 — HIGH (CVSS 7.1) AI Security Vulnerability

Q: Is CVE-2026-24779 actively exploited?

Proof-of-concept exploit code is publicly available for CVE-2026-24779, increasing the risk of exploitation.

Q: How to fix CVE-2026-24779?

1. PATCH: Upgrade vLLM to 0.14.1 or later. This is the authoritative fix.
2. WORKAROUND (if patching is not immediately possible): Block multimodal URL inputs at the API gateway; reject or pre-validate any user-supplied URL parameters before they reach the MediaConnector.
3. NETWORK CONTROLS: Apply strict egress rules on vLLM pods: deny outbound access to RFC 1918 ranges, cloud metadata endpoints (169.254.169.254, fd00:ec2::254), and the Kubernetes service CIDR.
4. DETECTION: Alert on outbound HTTP/S requests from vLLM pods to internal RFC 1918 addresses; monitor for access to /metadata, /api/v1, or management endpoints from AI inference pods.
5. AUDIT: Review who has access to multimodal inference endpoints; treat inference API access as privileged if multimodal is enabled.
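
As an illustration of the pre-validation mentioned in the workaround, here is a minimal Python sketch of a gateway-side check. It is not the vLLM patch. The function name `is_safe_media_url` and the `ALLOWED_HOSTS` set are hypothetical, and the approach (reject anything ambiguous rather than try to normalize it) is one possible design, not a documented recommendation from the advisory.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration; replace with your own media hosts.
ALLOWED_HOSTS = {"cdn.allowed-host.com"}


def is_safe_media_url(url: str) -> bool:
    """Return True only for unambiguous http(s) URLs whose host is allowlisted."""
    # Backslashes, whitespace and control characters are where URL parsers
    # tend to disagree, so reject them outright instead of normalizing.
    for ch in url:
        if ch == "\\" or ch.isspace() or ord(ch) < 0x20 or ord(ch) == 0x7F:
            return False
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return False
    if parts.scheme not in ("http", "https") or host is None:
        return False
    # Userinfo ("user@host") is a common way to disguise the real host.
    if parts.username is not None or parts.password is not None:
        return False
    # Literal IP addresses are rejected; only named, allowlisted hosts pass.
    try:
        ipaddress.ip_address(host)
        return False
    except ValueError:
        pass
    return host in ALLOWED_HOSTS
```

A check like this does not replace the patch or the egress rules: an allowlisted name can still resolve to an internal address, so the network controls in step 3 remain necessary.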

Q: What systems are affected by CVE-2026-24779?

This vulnerability affects the following AI/ML architecture patterns: model serving, multimodal AI pipelines, containerized LLM deployments, LLM inference endpoints, AI agent frameworks using vLLM as backend.

Q: What is the CVSS score for CVE-2026-24779?

CVE-2026-24779 has a CVSS v3.1 base score of 7.1 (HIGH). The EPSS exploitation probability is 0.02%.

CISO Take

vLLM servers that process multimodal inputs are vulnerable to SSRF through a URL parser inconsistency that bypasses host restrictions, so any user with inference API access can make the server send requests into your internal network. In Kubernetes-based AI serving stacks (llm-d, KServe), this enables pod-to-pod reconnaissance and service disruption. Patch to vLLM 0.14.1 or later immediately; if you cannot patch, block multimodal URL inputs at the API gateway.

Risk Assessment

Effective risk is higher than the CVSS 7.1 suggests in AI-native cloud environments. Attack complexity is low once an attacker understands the backslash parsing differential between Python URL libraries — this is documented in the advisory and reproducible. The low-privileges requirement means any user with access to the vLLM inference endpoint (including trial/free-tier customers in multi-tenant deployments) is a potential attacker. In containerized environments with flat internal networking, the blast radius extends well beyond the vLLM pod itself. EPSS is currently near-zero, but the advisory details are public and the technique is straightforward.

Affected Systems

Package	Ecosystem	Vulnerable Range	Patched
vllm	pip	—	No patch
vllm	pip	< 0.14.1	`0.14.1`

Severity & Risk

CVSS 3.1: 7.1 / 10

EPSS: 0.02% chance of exploitation in 30 days (higher than 5% of all CVEs)

Source: EPSS v3 — FIRST.org

Exploitation Status

Exploit Available

Exploitation: MEDIUM

Sophistication: Moderate

Exploitation Confidence: medium

○ CISA SSVC: Public PoC

○ Public PoC indexed (trickest/cve)

Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

AV: Network
AC: Low
PR: Low
UI: None
S: Unchanged
C: High
I: None
A: Low
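
To show how these metric values produce the 7.1 base score, the sketch below applies the CVSS v3.1 base-score formula for the scope-unchanged case. The numeric weights come from the CVSS v3.1 specification; the names `WEIGHTS`, `roundup`, and `base_score` are chosen here for illustration only.

```python
import math

# Metric weights from the CVSS v3.1 specification for the values listed above.
WEIGHTS = {
    "AV": 0.85,  # Attack Vector: Network
    "AC": 0.77,  # Attack Complexity: Low
    "PR": 0.62,  # Privileges Required: Low (scope unchanged)
    "UI": 0.85,  # User Interaction: None
    "C": 0.56,   # Confidentiality: High
    "I": 0.0,    # Integrity: None
    "A": 0.22,   # Availability: Low
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(w: dict) -> float:
    """Base score for the scope-unchanged case only."""
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score(WEIGHTS))  # 7.1
```

The intermediate values are roughly 4.22 for impact and 2.84 for exploitability; their sum of about 7.05 rounds up to 7.1.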

Recommended Action

PATCH

Upgrade vLLM to 0.14.1 or later. This is the authoritative fix.

WORKAROUND (if patching is not immediately possible)

Block multimodal URL inputs at the API gateway; reject or pre-validate any user-supplied URL parameters before they reach the MediaConnector.

NETWORK CONTROLS

Apply strict egress rules on vLLM pods: deny outbound access to RFC 1918 ranges, cloud metadata endpoints (169.254.169.254, fd00:ec2::254), and the Kubernetes service CIDR.

DETECTION

Alert on outbound HTTP/S requests from vLLM pods to internal RFC 1918 addresses; monitor for access to /metadata, /api/v1, or management endpoints from AI inference pods.

AUDIT

Review who has access to multimodal inference endpoints; treat inference API access as privileged if multimodal is enabled.
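
The detection step can be prototyped with a small filter over egress flow records. The sketch below is a hedged illustration: the record shape (dicts with `src_pod` and `dst_ip` keys) and the names `WATCHED_NETWORKS`, `is_suspicious_destination`, and `flag_flows` are assumptions, not the format of any particular flow-logging tool.

```python
import ipaddress

# Destinations the advisory suggests watching: RFC 1918 ranges plus the
# cloud metadata addresses. Add your cluster's service CIDR as needed.
WATCHED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8",
        "172.16.0.0/12",
        "192.168.0.0/16",
        "169.254.169.254/32",
        "fd00:ec2::254/128",
    )
]


def is_suspicious_destination(dest_ip: str) -> bool:
    """True if dest_ip is a valid address inside one of the watched networks."""
    try:
        addr = ipaddress.ip_address(dest_ip)
    except ValueError:
        return False  # not an IP literal; nothing to match against
    return any(
        addr.version == net.version and addr in net for net in WATCHED_NETWORKS
    )


def flag_flows(flows):
    """Return the flow records whose destination falls in a watched network."""
    return [flow for flow in flows if is_suspicious_destination(flow["dst_ip"])]
```

In most clusters pods legitimately talk to some RFC 1918 addresses, so a filter like this would need a baseline of expected destinations before it is useful for alerting.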

CISA SSVC Assessment

Decision: Track*
Exploitation: poc
Automatable: No
Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Classification

Auth Bypass, Data Extraction, DoS, Inference Framework, API

AML.T0006 - Active Scanning
AML.T0025 - Exfiltration via Cyber Means
AML.T0029 - Denial of AI Service
AML.T0040 - AI Model Inference API Access
AML.T0049 - Exploit Public-Facing Application
AML.T0075 - Cloud Service Discovery

Compliance Impact

This CVE is relevant to:

EU AI Act

Article 15 - Accuracy, Robustness and Cybersecurity
Article 9 - Risk Management System

ISO 42001

A.6.2 - Risk Assessment and Treatment for AI
A.8.3 - AI System Security
A.9.4 - Information security for AI systems

NIST AI RMF

GOVERN 6.2 - Policies for Vulnerability Management
MANAGE 2.2 - Mechanisms to respond to AI risks
MEASURE 2.6 - Information Security and Data Privacy Risk Monitoring

OWASP LLM Top 10

LLM02:2025 - Sensitive Information Disclosure
LLM06 - Excessive Agency
LLM06:2025 - Excessive Agency
LLM07 - Insecure Plugin Design

Technical Details

NVD Description

vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.14.1, a Server-Side Request Forgery (SSRF) vulnerability exists in the `MediaConnector` class within the vLLM project's multimodal feature set. The load_from_url and load_from_url_async methods obtain and process media from URLs provided by users, using different Python parsing libraries when restricting the target host. These two parsing libraries have different interpretations of backslashes, which allows the host name restriction to be bypassed. This allows an attacker to coerce the vLLM server into making arbitrary requests to internal network resources. This vulnerability is particularly critical in containerized environments like `llm-d`, where a compromised vLLM pod could be used to scan the internal network, interact with other pods, and potentially cause denial of service or access sensitive data. For example, an attacker could make the vLLM pod send malicious requests to an internal `llm-d` management endpoint, leading to system instability by falsely reporting metrics like the KV cache state. Version 0.14.1 contains a patch for the issue.

Exploitation Scenario

An attacker with standard user-level access to a multi-tenant vLLM deployment crafts an image URL that appears to target an allowed external host (e.g., cdn.allowed-host.com) but uses a backslash in the URL path component. The two Python URL parsing libraries used by MediaConnector interpret this differently: the allowlist check sees cdn.allowed-host.com, while the actual HTTP request resolves to an internal address — e.g., http://169.254.169.254/latest/meta-data/ (AWS IMDS) or http://10.96.0.1/api/v1/secrets (Kubernetes API). In an llm-d cluster, the attacker targets the llm-d management endpoint and submits falsified KV cache metrics, causing the scheduler to misroute requests or trigger unnecessary pod scaling, resulting in service degradation across the inference fleet.
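
The parsing differential in this scenario can be reproduced with the Python standard library alone. In the sketch below, `urllib.parse.urlsplit` plays the role of the allowlist check, and `host_if_backslash_ends_authority` is a hypothetical stand-in for an HTTP client that treats a backslash like a forward slash. It is not the code of any real client, and the example URL is illustrative rather than taken from the advisory.

```python
from urllib.parse import urlsplit


def host_seen_by_urllib(url: str):
    """Host as reported by the standard library parser."""
    return urlsplit(url).hostname


def host_if_backslash_ends_authority(url: str) -> str:
    """Host as seen by a parser that lets a backslash end the authority.

    Hypothetical helper for illustration; not the code of any real client.
    """
    rest = url.split("://", 1)[1]
    for i, ch in enumerate(rest):
        if ch in "/\\?#":
            rest = rest[:i]
            break
    # Drop any userinfo and port to leave just the host.
    host = rest.rpartition("@")[2]
    return host.split(":", 1)[0].lower()


url = "http://169.254.169.254\\@cdn.allowed-host.com/latest/meta-data/"
print(host_seen_by_urllib(url))               # cdn.allowed-host.com
print(host_if_backslash_ends_authority(url))  # 169.254.169.254
```

The standard library treats everything before the `@` as userinfo, so the check sees the allowed host, while a parser that ends the authority at the backslash connects to the metadata address instead.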