CVE-2025-47277: vLLM: RCE via exposed TCPStore in distributed inference
GHSA-hjq4-87xh-g4fv CRITICAL PoC AVAILABLE CISA: TRACK*
If your organization runs vLLM in distributed mode with PyNcclPipe KV cache transfer and the V0 engine, you have an unauthenticated RCE vulnerability reachable from the network: patch to 0.8.5 immediately. The TCPStore socket was binding to all interfaces instead of the private KV network, meaning any host that can reach that port can send a malicious pickle payload and execute arbitrary code. This is a 9.8 CVSS fire drill for any team running distributed LLM inference at scale.
What is the risk?
Severity is effectively maximum for affected configurations: no authentication, no user interaction, network-exploitable, and CWE-502 deserialization means RCE is the likely outcome. Scope is narrow (only PyNcclPipe + V0 engine users), but that covers high-value targets: organizations running distributed multi-GPU or multi-node vLLM inference, which are typically the largest and most sensitive deployments. EPSS at 0.865% is a low predicted exploitation probability, and no exploitation has been observed yet, but the low technical barrier (find the port, send a pickle payload) means weaponization can be fast. The CVE is not in CISA's KEV catalog, but treat it as pre-KEV.
What systems are affected?
How severe is it?
What is the attack surface?
What should I do?
5 steps:
1. PATCH: Upgrade vLLM to 0.8.5 immediately; this is the only complete fix.
2. WORKAROUND (if patching is delayed): Use host-level firewall rules (iptables/security groups) to restrict access to the KV cache port (--kv-ip target) to only trusted inference nodes.
3. VERIFY: Confirm which nodes run PyNcclPipe + V0 engine by checking launch configs for --kv-transfer-config with PyNcclPipe and absence of --enable-v1.
4. DETECT: Scan for unexpected connections to the TCPStore port from non-inference-cluster IPs. Check for unusual process spawning from vLLM worker processes.
5. AUDIT: Review cloud security group rules. The vLLM docs warned about network isolation, but the default was insecure.
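As a starting point for the PATCH and VERIFY steps, the affected range stated in the advisory (0.6.5 through 0.8.4) can be checked programmatically. The following is a minimal sketch, not an official vLLM tool: the helper names `parse_release` and `is_affected` are illustrative, and the check covers only the version, not whether PyNcclPipe and the V0 engine are actually in use.

```python
import re

AFFECTED_MIN = (0, 6, 5)  # first affected release per the advisory
FIXED = (0, 8, 5)         # first release containing the fix


def parse_release(version: str) -> tuple:
    """Extract the leading X.Y[.Z] numbers, ignoring suffixes such as '.post1'.

    Simplification: pre-release tags (e.g. 'rc1') are ignored, so a
    release candidate is treated like its final release.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_affected(version: str) -> bool:
    """True when the version falls inside the advisory's affected range."""
    return AFFECTED_MIN <= parse_release(version) < FIXED


if __name__ == "__main__":
    from importlib.metadata import PackageNotFoundError, version

    try:
        installed = version("vllm")
    except PackageNotFoundError:
        print("vllm is not installed in this environment")
    else:
        verdict = "inside" if is_affected(installed) else "outside"
        print(f"vllm {installed} is {verdict} the affected range")
```

A version inside the range is only a concern when the node also matches the configuration described in the VERIFY step.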
What does CISA's SSVC say?
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
How is it classified?
Which compliance frameworks are affected?
This CVE is relevant to:
Frequently Asked Questions
What is CVE-2025-47277?
CVE-2025-47277 is an unauthenticated remote code execution vulnerability in vLLM versions 0.6.5 through 0.8.4. It affects only deployments that use the PyNcclPipe KV cache transfer integration with the V0 engine. The TCPStore socket listened on all interfaces instead of only the private KV network, so any host that can reach that port can send a malicious pickle payload and execute arbitrary code. Version 0.8.5 contains the fix.
Is CVE-2025-47277 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2025-47277, increasing the risk of exploitation.
How to fix CVE-2025-47277?
1. PATCH: Upgrade vLLM to 0.8.5 immediately; this is the only complete fix.
2. WORKAROUND (if patching is delayed): Use host-level firewall rules (iptables/security groups) to restrict access to the KV cache port (--kv-ip target) to only trusted inference nodes.
3. VERIFY: Confirm which nodes run PyNcclPipe + V0 engine by checking launch configs for --kv-transfer-config with PyNcclPipe and absence of --enable-v1.
4. DETECT: Scan for unexpected connections to the TCPStore port from non-inference-cluster IPs. Check for unusual process spawning from vLLM worker processes.
5. AUDIT: Review cloud security group rules. The vLLM docs warned about network isolation, but the default was insecure.
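The firewall workaround can be sketched as a small generator that prints iptables commands for review rather than applying them. This is an illustrative sketch under assumptions: the function name `kv_port_rules`, the port number, and the node addresses are placeholders, and the real port is whatever your deployment configures for the KV cache transfer.

```python
import ipaddress


def kv_port_rules(port: int, trusted_nodes: list) -> list:
    """Build iptables commands that allow only trusted nodes to reach a port.

    trusted_nodes holds IPv4 addresses or CIDR ranges. The commands are
    returned as strings so they can be reviewed before anyone runs them.
    """
    if not 0 < port < 65536:
        raise ValueError(f"invalid TCP port: {port}")
    rules = []
    for node in trusted_nodes:
        # ip_network validates the input and normalizes single hosts to /32.
        network = ipaddress.ip_network(node, strict=False)
        if network.version != 4:
            raise ValueError("iptables handles IPv4 only; use ip6tables for IPv6")
        rules.append(
            f"iptables -A INPUT -p tcp --dport {port} -s {network} -j ACCEPT"
        )
    # Everything not explicitly accepted above is dropped for this port.
    rules.append(f"iptables -A INPUT -p tcp --dport {port} -j DROP")
    return rules


if __name__ == "__main__":
    # Placeholder port and addresses; substitute your own values.
    for rule in kv_port_rules(14579, ["10.0.0.11", "10.0.1.0/24"]):
        print(rule)
```

Because `-A` appends to the INPUT chain, rule order relative to existing entries matters; check the resulting chain with `iptables -L INPUT -n --line-numbers` before relying on it.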
What systems are affected by CVE-2025-47277?
This vulnerability affects the following AI/ML architecture patterns: distributed LLM inference, multi-node model serving, disaggregated prefill/decode inference, multi-GPU inference clusters.
What is the CVSS score for CVE-2025-47277?
CVE-2025-47277 has a CVSS v3.1 base score of 9.8 (CRITICAL). The EPSS exploitation probability is 0.93%.
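As a worked check, the 9.8 base score can be reproduced from the vector listed under CVSS Vector (CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H) using the CVSS v3.1 base-score equations. The sketch below is a minimal calculator that handles only Scope: Unchanged vectors; the function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification (PR values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a Scope: Unchanged vector."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("only CVSS:3.1 vectors are supported")
    metrics = dict(part.split(":") for part in parts)
    if metrics["S"] != "U":
        raise ValueError("this sketch handles Scope: Unchanged only")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # prints: 9.8
```

Here the impact sub-score is about 5.87 and the exploitability sub-score about 3.89; their sum, about 9.76, rounds up to 9.8.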
What is the AI security impact?
Affected AI Architectures
MITRE ATLAS Techniques
- AML.T0010.001 AI Software
- AML.T0025 Exfiltration via Cyber Means
- AML.T0049 Exploit Public-Facing Application
- AML.T0072 Reverse Shell
Compliance Controls Affected
What are the technical details?
Original Advisory
vLLM, an inference and serving engine for large language models (LLMs), has an issue in versions 0.6.5 through 0.8.4 that ONLY impacts environments using the `PyNcclPipe` KV cache transfer integration with the V0 engine. No other configurations are affected.

vLLM supports the use of the `PyNcclPipe` class to establish a peer-to-peer communication domain for data transmission between distributed nodes. The GPU-side KV-Cache transmission is implemented through the `PyNcclCommunicator` class, while CPU-side control message passing is handled via the `send_obj` and `recv_obj` methods on the CPU side.

The intention was that this interface should only be exposed to a private network using the IP address specified by the `--kv-ip` CLI parameter. The vLLM documentation covers how this must be limited to a secured network. The default and intentional behavior from PyTorch is that the `TCPStore` interface listens on ALL interfaces, regardless of what IP address is provided. The IP address given was only used as a client-side address to use.

vLLM was fixed to use a workaround to force the `TCPStore` instance to bind its socket to a specified private interface. As of version 0.8.5, vLLM limits the `TCPStore` socket to the private interface as configured.
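The difference between listening on all interfaces and listening on one private interface can be illustrated with plain sockets. This is a conceptual sketch only: it does not use PyTorch's `TCPStore` or any vLLM code, the helper names are illustrative, and 127.0.0.1 stands in for a private KV-network address.

```python
import socket


def open_listener(bind_host: str) -> socket.socket:
    """Open a TCP listener on bind_host; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((bind_host, 0))
    sock.listen(1)
    return sock


def listens_on_all_interfaces(sock: socket.socket) -> bool:
    """True when the socket is bound to the IPv4 wildcard address."""
    return sock.getsockname()[0] == "0.0.0.0"


if __name__ == "__main__":
    # A wildcard bind accepts connections arriving on any interface of the
    # host; a bind to a specific address accepts them only on that interface.
    for host in ("0.0.0.0", "127.0.0.1"):
        with open_listener(host) as listener:
            address, port = listener.getsockname()
            print(
                f"requested {host!r}: bound to {address}:{port}, "
                f"all interfaces: {listens_on_all_interfaces(listener)}"
            )
```

On a running node, the same distinction shows up in tools such as `ss -ltn`, where a wildcard listener appears with a local address of `0.0.0.0` or `*`.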
Exploitation Scenario
An attacker looks for organizations running vLLM (job postings, GitHub repos, API fingerprinting) and identifies a distributed inference cluster where the TCPStore port is reachable, either through a misconfigured security group or through internal network access gained via another compromise. Using PyTorch's distributed communication protocol, they connect to the exposed TCPStore and send a maliciously crafted serialized Python object via the recv_obj/send_obj interface. PyTorch deserializes the pickle payload, executing arbitrary code on the inference worker node. From there, the attacker can exfiltrate model weights, read KV cache contents containing other users' prompts, establish persistence, or pivot deeper into the ML infrastructure.
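Why deserializing a pickle amounts to running code can be shown with a harmless payload. Python's pickle format lets an object name a callable to invoke at load time through `__reduce__`. The sketch below uses the built-in `sum` as a benign stand-in; the `Payload` class is hypothetical and is not taken from any exploit for this CVE.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle form asks the loader to call a function."""

    def __reduce__(self):
        # (callable, args): pickle.loads will call sum([40, 2]) on the receiver.
        return (sum, ([40, 2],))


blob = pickle.dumps(Payload())  # the bytes a sender would put on the wire
result = pickle.loads(blob)     # the receiver believes it is only loading data

# The receiver gets the return value of a call chosen by the sender,
# not a Payload instance.
print(type(result).__name__, result)  # prints: int 42
```

The receiver never calls `sum` itself; the call is encoded in the bytes, which is why unpickling data from an untrusted network peer is treated as code execution.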
Weaknesses (CWE)
CWE-502 — Deserialization of Untrusted Data: The product deserializes untrusted data without sufficiently ensuring that the resulting data will be valid.
- [Architecture and Design, Implementation] If available, use the signing/sealing features of the programming language to assure that deserialized data has not been tainted. For example, a hash-based message authentication code (HMAC) could be used to ensure that data has not been modified.
- [Implementation] When deserializing data, populate a new object rather than just deserializing. The result is that the data flows through safe input validation and that the functions are safe.
Source: MITRE CWE corpus.
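The HMAC mitigation mentioned above can be sketched as follows: the sender prepends an authentication tag, and the receiver verifies it before unpickling anything. This is an illustration of the general CWE-502 guidance, not the fix vLLM shipped (which restricts the `TCPStore` bind address). The `seal` and `unseal` names are illustrative, and the sketch assumes a secret key already shared between trusted nodes.

```python
import hashlib
import hmac
import pickle

TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def seal(obj, key: bytes) -> bytes:
    """Pickle obj and prepend an HMAC-SHA256 tag computed over the payload."""
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def unseal(message: bytes, key: bytes):
    """Verify the tag first; unpickle only if the message is authentic."""
    tag, payload = message[:TAG_SIZE], message[TAG_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking information through comparison timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("HMAC check failed; refusing to unpickle")
    return pickle.loads(payload)


if __name__ == "__main__":
    key = b"example-shared-secret"  # placeholder; use a randomly generated key
    message = seal({"op": "kv_ready", "rank": 0}, key)
    print(unseal(message, key))
    tampered = message[:-1] + bytes([message[-1] ^ 0x01])
    try:
        unseal(tampered, key)
    except ValueError as err:
        print("rejected:", err)
```

Signing authenticates the sender but does not make pickle safe in itself: any party holding the key can still send a payload that executes code, so network isolation remains necessary.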
CVSS Vector
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
References
- github.com/advisories/GHSA-hjq4-87xh-g4fv
- nvd.nist.gov/vuln/detail/CVE-2025-47277
- docs.vllm.ai/en/latest/deployment/security.html Technical
- github.com/vllm-project/vllm/commit/0d6e187e88874c39cda7409cf673f9e6546893e7 Patch
- github.com/vllm-project/vllm/pull/15988 Issue Patch
- github.com/vllm-project/vllm/security/advisories/GHSA-hjq4-87xh-g4fv Exploit Vendor
- github.com/ARPSyndicate/cve-scores Exploit
- github.com/Threekiii/CVE Exploit
- github.com/funscoietyxboyz/funscoietyxboyz Exploit
- github.com/honysyang/eleaipoc Exploit
- github.com/tanjiti/sec_profile Exploit
Timeline
Related Vulnerabilities
CVE-2025-32444 9.8 vLLM: RCE via pickle deserialization on ZeroMQ
Same package: vllm
CVE-2026-25960 9.8 vllm: SSRF allows internal network access
Same package: vllm
CVE-2024-9053 9.8 vllm: RCE via unsafe pickle deserialization in RPC server
Same package: vllm
CVE-2024-11041 9.8 vllm: RCE via unsafe pickle deserialization in MessageQueue
Same package: vllm
CVE-2026-22807 9.8 vllm: Code Injection enables RCE
Same package: vllm