CVE-2025-30202: vLLM: ZeroMQ socket exposure enables DoS in multi-node deployments

GHSA-9f8f-2vmf-885j HIGH PoC AVAILABLE
Published April 30, 2025

CISO Take

Multi-node vLLM deployments (versions 0.5.2–0.8.4) expose an unauthenticated ZeroMQ XPUB socket on all network interfaces, allowing any attacker with network access to the host to open many connections and stall LLM inference. Patch to 0.8.5 immediately; if patching is delayed, firewall the ZeroMQ port at the network perimeter. This is an infrastructure hardening failure: production LLM serving nodes should never have unfiltered external socket exposure.

What is the risk?

HIGH availability risk for organizations running distributed multi-node vLLM deployments. The attack requires no authentication, no prior access, and no AI/ML expertise, only TCP connectivity to the host. The data exposed over the socket is internal vLLM state, not model weights or user prompts. The primary risk is DoS against production LLM inference infrastructure, which in enterprise settings cascades to every AI-powered application that depends on that serving layer. EPSS is currently low (about 0.5%), but the trivial attack complexity warrants proactive patching.

What systems are affected?

Package   Ecosystem   Vulnerable Range     Patched
vLLM      pip         >= 0.5.2, < 0.8.5    0.8.5
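
As a rough illustration of the range in the table above, the following sketch checks whether a version string falls in >= 0.5.2, < 0.8.5. The helper names `parse_release` and `is_vulnerable` are hypothetical, and the parser only looks at the leading numeric components, so pre-release tags are not treated specially.

```python
import re

# Range taken from the table above: >= 0.5.2, < 0.8.5.
FIRST_VULNERABLE = (0, 5, 2)
FIRST_PATCHED = (0, 8, 5)


def parse_release(version: str) -> tuple[int, int, int]:
    """Extract the leading MAJOR.MINOR.PATCH numbers from a version string.

    Suffixes such as ".post1" or "+cu121" are ignored in this sketch.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_vulnerable(version: str) -> bool:
    """Return True if the version lies in the affected range."""
    return FIRST_VULNERABLE <= parse_release(version) < FIRST_PATCHED
```

On a host with vLLM installed, the input could come from `importlib.metadata.version("vllm")`. Only multi-node deployments are affected, so a match on version alone is a prompt to check the topology rather than proof of exposure.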

How severe is it?

CVSS 3.1: 7.5 / 10
EPSS: 0.5% chance of exploitation in 30 days (higher than 38% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Trivial
Exploitation Confidence: medium (public PoC indexed in trickest/cve)
Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

AV Network
AC Low
PR None
UI None
S Unchanged
C None
I None
A High

What should I do?

  1. PATCH

    Upgrade vLLM to 0.8.5 immediately—the fix restricts ZeroMQ socket binding to localhost/peer interfaces.

  2. WORKAROUND (if patching is delayed)

    Block the ZeroMQ XPUB port (default 5557 or as configured) at the firewall or security group for all vLLM primary nodes.

  3. NETWORK SEGMENTATION

    Isolate LLM inference nodes in a private network segment not accessible from untrusted hosts or the public internet.

  4. DETECTION

    Monitor for unexpected TCP connections to ZeroMQ ports on vLLM hosts; alert on connection counts from non-peer IPs.

  5. AUDIT

    Scan all vLLM deployments for exposed ZeroMQ ports using nmap or equivalent; verify multi-node topologies are using 0.8.5+.
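
For the audit step, a minimal reachability check can stand in for nmap when only a handful of hosts need testing. This is a sketch using the Python standard library; the function names are hypothetical, and the port to test is whatever the XPUB socket actually uses on your deployment. Run it from a vantage point that should NOT be able to reach the port, and only against hosts you are authorized to test.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def exposed_hosts(hosts: list[str], port: int) -> list[str]:
    """Return the subset of hosts on which the port accepts connections."""
    return [host for host in hosts if port_is_reachable(host, port)]
```

A successful connection only shows that something is listening on that port; it does not confirm that the listener is the vLLM XPUB socket.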

What does CISA's SSVC say?

Decision Track
Exploitation none
Automatable Yes
Technical Impact partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act: Article 9 - Risk management system
ISO 42001: A.6.2 - AI system security
NIST AI RMF: MANAGE-2.2 - Mechanisms to sustain AI system operation
OWASP LLM Top 10: LLM10:2025 - Unbounded Consumption

Frequently Asked Questions

What is CVE-2025-30202?

CVE-2025-30202 is a denial-of-service vulnerability in multi-node vLLM deployments (versions 0.5.2–0.8.4). The primary host exposes an unauthenticated ZeroMQ XPUB socket on all network interfaces, so any attacker with network access to the host can open many connections and stall LLM inference. It is fixed in 0.8.5; if patching is delayed, firewall the ZeroMQ port at the network perimeter.

Is CVE-2025-30202 actively exploited?

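No active exploitation has been reported: CISA's SSVC assessment lists exploitation as "none". However, proof-of-concept exploit code is publicly available for CVE-2025-30202, which increases the risk of exploitation.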

How to fix CVE-2025-30202?

1. PATCH: Upgrade vLLM to 0.8.5 immediately. The fix restricts ZeroMQ socket binding to localhost/peer interfaces.
2. WORKAROUND (if patching is delayed): Block the ZeroMQ XPUB port (default 5557 or as configured) at the firewall or security group for all vLLM primary nodes.
3. NETWORK SEGMENTATION: Isolate LLM inference nodes in a private network segment not accessible from untrusted hosts or the public internet.
4. DETECTION: Monitor for unexpected TCP connections to ZeroMQ ports on vLLM hosts; alert on connection counts from non-peer IPs.
5. AUDIT: Scan all vLLM deployments for exposed ZeroMQ ports using nmap or equivalent; verify multi-node topologies are using 0.8.5+.
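
The detection step (alerting on connection counts from non-peer IPs) could be sketched as follows. The input is assumed to be a list of remote IP addresses already extracted from connection logs or socket tables for the ZeroMQ port; how that list is collected is left out, and the function names and threshold are hypothetical.

```python
from collections import Counter
from ipaddress import ip_address, ip_network


def non_peer_connection_counts(remote_ips, peer_networks):
    """Count connections per remote IP, ignoring addresses inside peer networks."""
    peers = [ip_network(net) for net in peer_networks]
    counts = Counter()
    for raw in remote_ips:
        addr = ip_address(raw)
        if not any(addr in net for net in peers):
            counts[raw] += 1
    return counts


def alerts(remote_ips, peer_networks, threshold=1):
    """Return the non-peer IPs with at least `threshold` connections, sorted."""
    counts = non_peer_connection_counts(remote_ips, peer_networks)
    return sorted(ip for ip, n in counts.items() if n >= threshold)
```

With a threshold of 1, any connection from outside the listed peer networks raises an alert, which suits a port that only secondary vLLM hosts should ever reach.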

What systems are affected by CVE-2025-30202?

The vulnerability affects vLLM versions >= 0.5.2 and < 0.8.5 in multi-node deployments. In terms of AI/ML architecture patterns, that covers multi-node LLM inference serving, distributed model serving, and LLM inference infrastructure.

What is the CVSS score for CVE-2025-30202?

CVE-2025-30202 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.49%.

What is the AI security impact?

Affected AI Architectures

Multi-node LLM inference serving
Distributed model serving
LLM inference infrastructure

MITRE ATLAS Techniques

AML.T0006 Active Scanning
AML.T0025 Exfiltration via Cyber Means
AML.T0029 Denial of AI Service
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 9
ISO 42001: A.6.2
NIST AI RMF: MANAGE-2.2
OWASP LLM Top 10: LLM10:2025

What are the technical details?

Original Advisory

vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.5.2 and prior to 0.8.5 are vulnerable to denial of service and data exposure via ZeroMQ on multi-node vLLM deployment. In a multi-node vLLM deployment, vLLM uses ZeroMQ for some multi-node communication purposes. The primary vLLM host opens an XPUB ZeroMQ socket and binds it to ALL interfaces. While the socket is always opened for a multi-node deployment, it is only used when doing tensor parallelism across multiple hosts. Any client with network access to this host can connect to this XPUB socket unless its port is blocked by a firewall. Once connected, these arbitrary clients will receive all of the same data broadcasted to all of the secondary vLLM hosts. This data is internal vLLM state information that is not useful to an attacker. By potentially connecting to this socket many times and not reading data published to them, an attacker can also cause a denial of service by slowing down or potentially blocking the publisher. This issue has been patched in version 0.8.5.
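
The advisory's core point is that the XPUB socket is bound to all interfaces. In ZeroMQ's TCP endpoint syntax a wildcard bind is written with `*` as the interface (for example `tcp://*:5557`). The sketch below flags such endpoint strings when reviewing configuration or code; the function name is hypothetical, and the exact endpoint string used by vLLM is not given in this advisory.

```python
# Interface values treated as "all interfaces" in this sketch.
WILDCARD_HOSTS = {"*", "0.0.0.0", "::", "[::]"}


def binds_all_interfaces(endpoint: str) -> bool:
    """Return True if a ZeroMQ endpoint string binds TCP on every interface."""
    scheme, sep, rest = endpoint.partition("://")
    if sep == "" or scheme != "tcp":
        # ipc:// and inproc:// endpoints are not reachable over the network.
        return False
    host, sep, _port = rest.rpartition(":")
    if sep == "":
        raise ValueError(f"tcp endpoint without a port: {endpoint!r}")
    return host in WILDCARD_HOSTS
```

An endpoint such as `tcp://10.0.0.5:5557` or `tcp://127.0.0.1:5557` passes this check because it names a single interface, which is the direction the 0.8.5 fix takes according to the remediation notes.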

Exploitation Scenario

An attacker with network access to a corporate AI inference cluster—via a compromised internal host or misconfigured network segment—port-scans and discovers the ZeroMQ XPUB socket open on port 5557 of the primary vLLM node. Using a trivial ZeroMQ subscriber script (under 10 lines of Python), the attacker spawns hundreds of simultaneous connections that subscribe but never consume data. The publisher's send buffer fills; internal state broadcasts to legitimate secondary nodes slow and eventually block entirely. The multi-node LLM inference cluster degrades then stalls, taking down all AI-powered applications relying on that serving layer. No credentials, no malware, no AI knowledge required—just a ZeroMQ client library.

Weaknesses (CWE)

CWE-770 — Allocation of Resources Without Limits or Throttling: The product allocates a reusable resource or group of resources on behalf of an actor without imposing any intended restrictions on the size or number of resources that can be allocated.

  • [Requirements] Clearly specify the minimum and maximum expectations for capabilities, and dictate which behaviors are acceptable when resource allocation reaches limits.
  • [Architecture and Design] Limit the amount of resources that are accessible to unprivileged users. Set per-user limits for resources. Allow the system administrator to define these limits. Be careful to avoid CWE-410.
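
To make the CWE-770 mitigation guidance concrete, here is a generic sketch of per-peer and global connection limits. It is not how vLLM's fix works (the remediation notes describe restricting the socket binding); the class name and limits are hypothetical and only illustrate "set per-user limits for resources".

```python
class ConnectionLimiter:
    """Track active connections and refuse new ones beyond fixed limits."""

    def __init__(self, max_per_peer: int, max_total: int):
        self.max_per_peer = max_per_peer
        self.max_total = max_total
        self._active = {}  # peer address -> number of active connections
        self._total = 0

    def try_acquire(self, peer: str) -> bool:
        """Register a new connection for peer; return False if a limit is hit."""
        current = self._active.get(peer, 0)
        if self._total >= self.max_total or current >= self.max_per_peer:
            return False
        self._active[peer] = current + 1
        self._total += 1
        return True

    def release(self, peer: str) -> None:
        """Unregister one connection for peer, if any is recorded."""
        current = self._active.get(peer, 0)
        if current == 0:
            return
        if current == 1:
            del self._active[peer]
        else:
            self._active[peer] = current - 1
        self._total -= 1
```

A server's accept loop would call `try_acquire` before servicing a connection and `release` when it closes, so a single client cannot exhaust the pool by opening connections it never reads from.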

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
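
The 7.5 base score can be reproduced from this vector with the CVSS v3.1 base score equations. The sketch below uses the metric weights and rounding rule from the v3.1 specification; the function names are hypothetical, and temporal and environmental metrics are not handled.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS:3.1 base vector string."""
    prefix, *metrics = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("only CVSS:3.1 vectors are handled in this sketch")
    m = dict(item.split(":") for item in metrics)
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]

    # Impact sub-score: only availability contributes for this CVE.
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    return roundup(min(1.08 * total if changed else total, 10))
```

For this vector the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is 8.22 × 0.85 × 0.77 × 0.85 × 0.85 ≈ 3.89; their sum, about 7.48, rounds up to 7.5.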

Timeline

Published: April 30, 2025
Last Modified: May 14, 2025
First Seen: April 30, 2025
