CVE-2025-30202: vLLM ZeroMQ socket exposure enables

Q: Is CVE-2025-30202 actively exploited?

CISA's SSVC assessment records no observed exploitation (Exploitation: none), but proof-of-concept exploit code is publicly available for CVE-2025-30202, which increases the risk of exploitation.

Q: How to fix CVE-2025-30202?

1. PATCH: Upgrade vLLM to 0.8.5 immediately; the fix restricts ZeroMQ socket binding to localhost/peer interfaces.
2. WORKAROUND (if patching delayed): Block the ZeroMQ XPUB port (default 5557 or as configured) at the firewall or security group for all vLLM primary nodes.
3. NETWORK SEGMENTATION: Isolate LLM inference nodes in a private network segment not accessible from untrusted hosts or the public internet.
4. DETECTION: Monitor for unexpected TCP connections to ZeroMQ ports on vLLM hosts; alert on connection counts from non-peer IPs.
5. AUDIT: Scan all vLLM deployments for exposed ZeroMQ ports using nmap or equivalent; verify multi-node topologies are using 0.8.5+.
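As a rough illustration of the detection step, the sketch below counts established TCP connections to the ZeroMQ port that come from addresses outside a peer allowlist. It assumes the input text uses the column layout printed by `ss -Htn` on Linux; the function name, the sample addresses, and the use of port 5557 are illustrative assumptions and not part of vLLM.

```python
from collections import Counter


def non_peer_connections(ss_output, zmq_port, peer_ips):
    """Count established connections to zmq_port from IPs not in peer_ips.

    ss_output is text in the layout of `ss -Htn` (assumed):
    State Recv-Q Send-Q Local-Address:Port Peer-Address:Port
    """
    peers = set(peer_ips)
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] != "ESTAB":
            continue  # skip blank, malformed, or non-established entries
        local, remote = fields[3], fields[4]
        local_port = local.rsplit(":", 1)[1]
        # Strip the brackets that ss puts around IPv6 addresses.
        remote_host = remote.rsplit(":", 1)[0].strip("[]")
        if local_port == str(zmq_port) and remote_host not in peers:
            counts[remote_host] += 1
    return counts


# Hypothetical sample: one legitimate peer, one unknown host connecting twice.
SAMPLE = """\
ESTAB 0 0 10.0.0.1:5557 10.0.0.2:40112
ESTAB 0 0 10.0.0.1:5557 203.0.113.7:51000
ESTAB 0 0 10.0.0.1:5557 203.0.113.7:51001
ESTAB 0 0 10.0.0.1:8000 203.0.113.9:52000
"""
```

An alerting rule could then fire whenever any count returned for a non-peer address exceeds a small threshold chosen for the environment.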

Q: What systems are affected by CVE-2025-30202?

This vulnerability affects the following AI/ML architecture patterns: Multi-node LLM inference serving, Distributed model serving, LLM inference infrastructure.

Q: What is the CVSS score for CVE-2025-30202?

CVE-2025-30202 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.49%.

CISO Take

Multi-node vLLM deployments (versions 0.5.2 through 0.8.4) expose an unauthenticated ZeroMQ XPUB socket on all network interfaces, allowing any attacker with network access to the primary host to open many connections and stall LLM inference. Upgrade to 0.8.5 immediately; if patching is delayed, block the ZeroMQ port at the firewall. This is an infrastructure hardening failure: production LLM serving nodes should never expose unfiltered sockets to untrusted networks.

What is the risk?

HIGH availability risk for organizations running distributed multi-node vLLM deployments. The attack requires no authentication, no prior access, and no AI/ML expertise, only TCP connectivity to the host. Confidentiality impact is limited to internal state data (not model weights or user prompts). The primary risk is DoS against production LLM inference infrastructure, which in enterprise settings cascades to all AI-powered applications depending on that serving layer. EPSS is currently low (0.49%), but the trivial attack complexity warrants proactive patching.

What systems are affected?

Package	Ecosystem	Vulnerable Range	Patched
vLLM	pip	>= 0.5.2, < 0.8.5	0.8.5
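The vulnerable range above can be checked mechanically. The following is a minimal sketch that compares only the numeric release components of a version string against the range >= 0.5.2, < 0.8.5; the helper names are hypothetical, and pre-release or post-release suffixes are ignored rather than ordered according to PEP 440.

```python
import re


def parse_version(version):
    """Return the leading numeric release parts, e.g. '0.8.4.post1' -> (0, 8, 4)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if not match:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_affected(version):
    """True if version falls in the advisory's range >= 0.5.2, < 0.8.5."""
    return (0, 5, 2) <= parse_version(version) < (0, 8, 5)
```

On a host where vLLM is installed, the version string could be obtained with `importlib.metadata.version("vllm")` and passed to `is_affected`.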

How severe is it?

CVSS 3.1: 7.5 / 10

EPSS: 0.5% chance of exploitation in 30 days (higher than 38% of all CVEs)

Source: EPSS v3, FIRST.org

Exploitation Status: Exploit Available

Exploitation: MEDIUM

Sophistication: Trivial

Exploitation Confidence: medium

Signal: Public PoC indexed (trickest/cve)

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

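Attack Vector (AV): Network

Attack Complexity (AC): Low

Privileges Required (PR): None

User Interaction (UI): None

Scope (S): Unchanged

Confidentiality (C): None

Integrity (I): None

Availability (A): High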
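The 7.5 base score follows from these metric values. As a worked check, the sketch below applies the CVSS v3.1 base-score equations for the Scope Unchanged case, using the numeric weights published in the FIRST.org specification; the constant and function names are illustrative.

```python
import math

# CVSS v3.1 weights for the metric values listed in this advisory.
AV_NETWORK = 0.85
AC_LOW = 0.77
PR_NONE = 0.85
UI_NONE = 0.85
C_NONE = 0.0
I_NONE = 0.0
A_HIGH = 0.56


def roundup(value):
    """Round up to one decimal place, as defined in the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a):
    """Base score for Scope: Unchanged vectors."""
    iss = 1 - (1 - c) * (1 - i) * (1 - a)      # impact sub-score
    impact = 6.42 * iss
    exploitability = 8.22 * av * ac * pr * ui
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score_scope_unchanged(
    AV_NETWORK, AC_LOW, PR_NONE, UI_NONE, C_NONE, I_NONE, A_HIGH
)
print(score)  # 7.5
```

Here the impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is about 3.89; their sum, roughly 7.48, rounds up to 7.5.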

What should I do?

5 steps

PATCH

Upgrade vLLM to 0.8.5 immediately; the fix restricts ZeroMQ socket binding to localhost/peer interfaces.

WORKAROUND (if patching delayed)

Block the ZeroMQ XPUB port (default 5557 or as configured) at the firewall or security group for all vLLM primary nodes.

NETWORK SEGMENTATION

Isolate LLM inference nodes in a private network segment not accessible from untrusted hosts or the public internet.

DETECTION

Monitor for unexpected TCP connections to ZeroMQ ports on vLLM hosts; alert on connection counts from non-peer IPs.

AUDIT

Scan all vLLM deployments for exposed ZeroMQ ports using nmap or equivalent; verify multi-node topologies are using 0.8.5+.
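Where nmap is not available, the audit step can be approximated with a plain TCP connect check. This is a minimal sketch using only the Python standard library; the function names are hypothetical, the host list and port must be supplied by the operator, and it should only be run against hosts you are authorized to test.

```python
import socket


def port_is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def exposed_hosts(hosts, port, timeout=2.0):
    """Return the subset of hosts that accept TCP connections on port."""
    return [host for host in hosts if port_is_reachable(host, port, timeout)]
```

Running `exposed_hosts` from a machine outside the inference network segment gives the most useful answer, since a successful connection from there means the firewall or security group is not blocking the port.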

What does CISA's SSVC say?

Decision: Track

Exploitation: none

Automatable: Yes

Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

How is it classified?

Tags: DoS, Data Leakage, Inference Framework

MITRE ATLAS: AML.T0006 - Active Scanning; AML.T0025 - Exfiltration via Cyber Means; AML.T0029 - Denial of AI Service; AML.T0049 - Exploit Public-Facing Application

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act: Article 9 - Risk management system

ISO 42001: A.6.2 - AI system security

NIST AI RMF: MANAGE-2.2 - Mechanisms to sustain AI system operation

OWASP LLM Top 10: LLM10:2025 - Unbounded Consumption

What is the AI security impact?

Affected AI Architectures

Multi-node LLM inference serving, Distributed model serving, LLM inference infrastructure

MITRE ATLAS Techniques

AML.T0006 Active Scanning

AML.T0025 Exfiltration via Cyber Means

AML.T0029 Denial of AI Service

AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 9

ISO 42001: A.6.2

NIST AI RMF: MANAGE-2.2

OWASP LLM Top 10: LLM10:2025

What are the technical details?

Original Advisory

vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.5.2 and prior to 0.8.5 are vulnerable to denial of service and data exposure via ZeroMQ on multi-node vLLM deployment. In a multi-node vLLM deployment, vLLM uses ZeroMQ for some multi-node communication purposes. The primary vLLM host opens an XPUB ZeroMQ socket and binds it to ALL interfaces. While the socket is always opened for a multi-node deployment, it is only used when doing tensor parallelism across multiple hosts. Any client with network access to this host can connect to this XPUB socket unless its port is blocked by a firewall. Once connected, these arbitrary clients will receive all of the same data broadcasted to all of the secondary vLLM hosts. This data is internal vLLM state information that is not useful to an attacker. By potentially connecting to this socket many times and not reading data published to them, an attacker can also cause a denial of service by slowing down or potentially blocking the publisher. This issue has been patched in version 0.8.5.
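The root cause described in the advisory is a bind to all interfaces. As an illustration only, the sketch below flags ZeroMQ `tcp://` bind endpoint strings whose host part is a wildcard; the function name is hypothetical, the port numbers are placeholders, and this is not code from vLLM or its patch.

```python
# Host parts that make a ZeroMQ tcp:// bind listen on every interface.
WILDCARD_HOSTS = {"*", "0.0.0.0", "[::]"}


def binds_all_interfaces(endpoint):
    """Return True if a tcp:// bind endpoint listens on all interfaces."""
    scheme, sep, address = endpoint.partition("://")
    if not sep or scheme != "tcp":
        return False  # ipc:// and inproc:// are not reachable over TCP
    host, _, _port = address.rpartition(":")
    return host in WILDCARD_HOSTS
```

For example, `binds_all_interfaces("tcp://*:5557")` is true, while `binds_all_interfaces("tcp://127.0.0.1:5557")` is false; a configuration review could apply the same check to any endpoint strings it finds.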

Exploitation Scenario

An attacker with network access to a corporate AI inference cluster (for example, via a compromised internal host or a misconfigured network segment) port-scans the primary vLLM node and discovers the ZeroMQ XPUB socket open on port 5557. Using a trivial ZeroMQ subscriber script (under 10 lines of Python), the attacker opens hundreds of simultaneous connections that subscribe but never read the published data. The publisher's send buffers fill; internal state broadcasts to legitimate secondary nodes slow and eventually block entirely. The multi-node inference cluster degrades, then stalls, taking down every AI-powered application that relies on that serving layer. No credentials, malware, or AI knowledge are required, only a ZeroMQ client library.

Weaknesses (CWE)

CWE-770 Allocation of Resources Without Limits or Throttling (Primary)

CWE-770 — Allocation of Resources Without Limits or Throttling: The product allocates a reusable resource or group of resources on behalf of an actor without imposing any intended restrictions on the size or number of resources that can be allocated.

[Requirements] Clearly specify the minimum and maximum expectations for capabilities, and dictate which behaviors are acceptable when resource allocation reaches limits.
[Architecture and Design] Limit the amount of resources that are accessible to unprivileged users. Set per-user limits for resources. Allow the system administrator to define these limits. Be careful to avoid CWE-410.
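To make the "per-user limits" mitigation concrete, here is a generic sketch of a per-client connection cap. It is not how vLLM addressed this CVE (the 0.8.5 fix restricts socket binding); the class and method names are hypothetical, and a real server would call them from its accept and close paths.

```python
from collections import defaultdict


class ConnectionLimiter:
    """Track open connections per client and refuse any beyond a fixed cap."""

    def __init__(self, max_per_client):
        self.max_per_client = max_per_client
        self._open = defaultdict(int)

    def try_acquire(self, client_id):
        """Register a new connection; return False if the client is at its cap."""
        if self._open[client_id] >= self.max_per_client:
            return False
        self._open[client_id] += 1
        return True

    def release(self, client_id):
        """Record that one of the client's connections has closed."""
        if self._open[client_id] > 0:
            self._open[client_id] -= 1
```

With a cap of 2, a third simultaneous connection from the same address is refused while other clients are unaffected, which bounds the resources any single unauthenticated client can hold.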

Source: MITRE CWE corpus.