CVE-2025-29783: vLLM: RCE via unsafe deserialization in Mooncake KV
GHSA-x3m8-f7g5-qhm7 | CRITICAL | PoC available | CISA SSVC: Track

Any vLLM deployment running Mooncake for distributed KV cache (v0.6.5–v0.7.x) is exposed to unauthenticated RCE from any adjacent-network host with zero user interaction. Patch to v0.8.0 immediately—this is trivial to exploit once network-adjacent and no special AI knowledge is required. If patching is blocked, disable Mooncake KV distribution and isolate ZMQ/TCP inference ports at the network layer until remediation is complete.
Risk Assessment
Critical operational risk despite the Adjacent (AV:A) attack vector. Distributed LLM inference clusters typically share flat internal network segments with CI/CD systems, data pipelines, and developer workstations—making "adjacent network" far easier to reach than perimeter controls suggest. Attack complexity is low, privileges required are minimal, and no user interaction is needed. A single compromised inference node provides full cluster access, model weights, cached inference data containing potentially sensitive prompts and responses, and lateral movement paths to adjacent infrastructure. Organizations running large-scale inference farms should treat this as P0.
Recommended Action
Six steps:
PATCH
Upgrade vLLM to >= 0.8.0 immediately. This is the only complete fix per the advisory.
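As a triage aid, the affected range can be checked against an installed version string. A minimal sketch, with illustrative helper names not taken from the advisory (for robust comparisons of arbitrary release strings, prefer `packaging.version.Version`):

```python
# Illustrative triage helper: flag vLLM versions in the range affected by
# this CVE (0.6.5 up to, but not including, the fixed 0.8.0).
def parse_version(version: str) -> tuple[int, ...]:
    # Keep only the leading numeric release segment
    # (e.g. "0.7.3+cu121" -> (0, 7, 3)); pre-release suffixes are not handled.
    release = version.split("+")[0]
    return tuple(int(part) for part in release.split(".")[:3])

def is_affected(version: str) -> bool:
    return (0, 6, 5) <= parse_version(version) < (0, 8, 0)
```

The installed version to feed in can be obtained with `importlib.metadata.version("vllm")` on the inference host.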
WORKAROUND (if patching is blocked)
Disable Mooncake KV distribution entirely; fall back to single-node inference or an alternative KV backend.
NETWORK SEGMENTATION
Apply strict firewall rules on ZMQ/TCP ports used by Mooncake. Only authenticated inference cluster nodes should reach these endpoints—block all other sources including developer and CI/CD networks.
ISOLATION
Place inference nodes in a dedicated network segment with no direct access from developer machines, containers, or build pipelines.
DETECTION
Monitor vLLM worker processes for anomalous outbound connections, unexpected child process spawns, or unauthorized file system access. Alert on any new listening sockets opened by inference processes.
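The "new listening sockets" signal above can be derived on Linux from `/proc/net/tcp` (or `/proc/<pid>/net/tcp` per worker), where socket state `0A` means LISTEN. A hedged sketch, with the function name and the diff-against-baseline glue left as an exercise:

```python
# Sketch: enumerate listening TCP ports from Linux /proc/net/tcp text.
# Column 2 is local_address as hex "ADDR:PORT"; column 4 is the socket
# state, where 0A == TCP LISTEN. Polling this periodically and diffing
# against a known-good baseline yields the "new listener" alert.
def listening_ports(proc_net_tcp_text: str) -> set[int]:
    ports: set[int] = set()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip header row
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        if state == "0A":  # TCP LISTEN
            ports.add(int(local_address.rsplit(":", 1)[1], 16))
    return ports
```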
AUDIT
Rotate any credentials, API keys, or tokens accessible from inference node environments as a precaution if exposure cannot be ruled out.
CISA SSVC Assessment
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
What is CVE-2025-29783?
CVE-2025-29783 is a critical (CVSS 9.0) remote code execution vulnerability in vLLM's Mooncake integration for distributed KV cache, affecting vLLM v0.6.5–v0.7.x. Unsafe deserialization is exposed directly over ZMQ/TCP on all network interfaces, so any adjacent-network host can execute code on distributed inference hosts with zero user interaction. It is fixed in vLLM 0.8.0.
Is CVE-2025-29783 actively exploited?
Active exploitation has not been reported (CISA's SSVC decision is Track), but proof-of-concept exploit code is publicly available for CVE-2025-29783, increasing the risk of exploitation.
How to fix CVE-2025-29783?
Upgrade vLLM to >= 0.8.0; per the advisory, this is the only complete fix. If patching is blocked: disable Mooncake KV distribution entirely (fall back to single-node inference or another KV backend), firewall the ZMQ/TCP ports used by Mooncake so that only cluster nodes can reach them, isolate inference nodes in a dedicated network segment, monitor worker processes for anomalous connections, child processes, and new listening sockets, and rotate any credentials reachable from inference nodes if exposure cannot be ruled out. See the six-step plan under Recommended Action.
What systems are affected by CVE-2025-29783?
This vulnerability affects the following AI/ML architecture patterns: distributed LLM inference, model serving, multi-node GPU inference clusters, LLM inference pipelines.
What is the CVSS score for CVE-2025-29783?
CVE-2025-29783 has a CVSS v3.1 base score of 9.0 (CRITICAL). The EPSS exploitation probability is 2.81%.
Technical Details
NVD Description
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. When vLLM is configured to use Mooncake, unsafe deserialization exposed directly over ZMQ/TCP on all network interfaces will allow attackers to execute remote code on distributed hosts. This is a remote code execution vulnerability impacting any deployments using Mooncake to distribute KV across distributed hosts. This vulnerability is fixed in 0.8.0.
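The advisory does not publish exploit details, but the vulnerability class is easy to illustrate: Python's pickle format lets a serialized object name an arbitrary callable (via `__reduce__`) that runs during deserialization. A minimal, benign sketch of that mechanism, using `eval` as a stand-in for a real payload such as `os.system`:

```python
import pickle

class Payload:
    # __reduce__ tells pickle what to call when the object is re-created.
    # An attacker would substitute something like (os.system, ("cmd",)).
    def __reduce__(self):
        return (eval, ("40 + 2",))  # benign stand-in for attacker code

blob = pickle.dumps(Payload())  # what an attacker would send over ZMQ/TCP
result = pickle.loads(blob)     # eval() runs during deserialization
print(result)                   # -> 42: attacker-chosen code already ran
```

This is why any endpoint that calls `pickle.loads` (or an equivalent unsafe deserializer) on network input is remote code execution by design: the payload executes before the application ever inspects the resulting "object".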
Exploitation Scenario
An attacker with access to the same internal network segment as a vLLM cluster—via a compromised developer laptop, a rogue container in the same Kubernetes namespace, or an insider—scans for open ZMQ/TCP ports on inference worker nodes. The attacker sends a crafted pickle payload or other malicious serialized object directly to the Mooncake ZMQ endpoint, and the unsafe deserialization triggers arbitrary code execution under the vLLM worker process. From there the attacker can establish a reverse shell for persistent access; harvest model weights and KV cache contents (which may include thousands of prior user prompts); extract API keys or cloud credentials from the process environment; pivot laterally to other cluster nodes; and backdoor the inference pipeline to manipulate LLM outputs for downstream applications without detection.
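For teams that cannot immediately remove pickle from an internal protocol, the standard-library mitigation pattern is an allow-listing `pickle.Unpickler` subclass. This is a hedged sketch only (the class name and allow-list are illustrative): the pickle documentation itself warns such restrictions are hard to make fully robust, and the only complete fix for this CVE remains upgrading to 0.8.0.

```python
import io
import pickle

class AllowListUnpickler(pickle.Unpickler):
    # Only these (module, name) pairs may be resolved during load;
    # anything else (os.system, builtins.eval, ...) is rejected.
    ALLOWED = {("builtins", "list"), ("builtins", "dict"), ("builtins", "set")}

    def find_class(self, module: str, name: str):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")

def restricted_loads(data: bytes):
    # Drop-in replacement for pickle.loads with an allow-list enforced.
    return AllowListUnpickler(io.BytesIO(data)).load()
```

Plain containers still round-trip through `restricted_loads`, while any pickle that tries to import a callable outside the allow-list fails at `find_class` before it can run.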
Weaknesses (CWE)
CWE-502: Deserialization of Untrusted Data
CVSS Vector
CVSS:3.1/AV:A/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H

References
- github.com/advisories/GHSA-x3m8-f7g5-qhm7
- github.com/pypa/advisory-database/tree/main/vulns/vllm/PYSEC-2025-63.yaml
- nvd.nist.gov/vuln/detail/CVE-2025-29783
- github.com/vllm-project/vllm/commit/288ca110f68d23909728627d3100e5a8db820aa2 (Patch)
- github.com/vllm-project/vllm/pull/14228 (Issue, Vendor)
- github.com/vllm-project/vllm/security/advisories/GHSA-x3m8-f7g5-qhm7 (Vendor)
- github.com/honysyang/eleaipoc (Exploit)
Related Vulnerabilities
- CVE-2024-9053 (9.8): vllm: RCE via unsafe pickle deserialization in RPC server (same package: vllm)
- CVE-2026-25960 (9.8): vllm: SSRF allows internal network access (same package: vllm)
- CVE-2025-47277 (9.8): vLLM: RCE via exposed TCPStore in distributed inference (same package: vllm)
- CVE-2024-11041 (9.8): vllm: RCE via unsafe pickle deserialization in MessageQueue (same package: vllm)
- CVE-2025-32444 (9.8): vLLM: RCE via pickle deserialization on ZeroMQ (same package: vllm)