CVE-2024-11041: vllm: RCE via unsafe pickle deserialization in MessageQueue
GHSA-5vqr-wprc-cpp7 · CRITICAL · PoC available · CISA SSVC: Attend

Any attacker with network access to a vllm v0.6.2 inference server can achieve full remote code execution with zero authentication required. This is trivially exploitable on one of the most widely deployed open-source LLM inference engines. Upgrade immediately; if patching is blocked, firewall the distributed MessageQueue ports to trusted hosts only.
Risk Assessment
Critical risk for organizations running vllm for on-premises or private cloud LLM inference. CVSS 9.8 reflects the worst-case profile: network-exploitable, no authentication, no user interaction, full C/I/A compromise. vllm powers LLaMA, Mistral, Qwen, and similar deployments at scale. Multi-node and multi-GPU distributed inference configurations are most exposed since the MessageQueue is used for inter-process communication across nodes. EPSS of 1.25% suggests exploitation is not yet widespread, but the barrier is extremely low—any standard pickle payload generator produces a working exploit.
Recommended Action
1. PATCH: Upgrade vllm beyond v0.6.2; verify the fix is present in the target release.
2. ISOLATE: Restrict network access to vllm inter-process communication ports via firewall rules, namespace isolation, or VPC security groups to trusted hosts only.
3. DETECT: Monitor inference servers for unexpected outbound connections, new listening ports, and anomalous process spawning from vllm worker processes.
4. AUDIT: Review who has network-level access to vllm serving infrastructure and enforce least-privilege networking.
5. WORKAROUND: If immediate patching is blocked, wrap MessageQueue transport in an authenticated/signed layer or replace pickle with a safe serialization format (JSON, msgpack).
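The workaround in step 5 can be sketched as follows. This is a minimal illustration, not vllm's actual API: the function names, framing scheme, and key-loading approach are all assumptions, and real key distribution is out of scope.

```python
import hashlib
import hmac
import json

# Illustrative sketch of step 5: replace pickle with JSON and authenticate
# every frame with an HMAC over a shared key, so peers without the key
# cannot inject messages. All names here are hypothetical, not vllm's API.
SECRET_KEY = b"rotate-me-out-of-band"  # example only; load from a secret store


def encode(message: dict) -> bytes:
    """Serialize a message as JSON and prepend a 32-byte HMAC-SHA256 tag."""
    body = json.dumps(message, separators=(",", ":")).encode()
    tag = hmac.new(SECRET_KEY, body, hashlib.sha256).digest()
    return tag + body


def decode(frame: bytes) -> dict:
    """Verify the HMAC tag before parsing; reject unauthenticated frames."""
    tag, body = frame[:32], frame[32:]
    expected = hmac.new(SECRET_KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad HMAC: dropping unauthenticated frame")
    return json.loads(body)  # safe: json.loads never executes code
```

Unlike pickle, `json.loads` cannot run code during parsing, and `hmac.compare_digest` gives constant-time tag comparison, so a forged or tampered frame is rejected before any parsing happens.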
CISA SSVC Assessment
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
What is CVE-2024-11041?
Any attacker with network access to a vllm v0.6.2 inference server can achieve full remote code execution with zero authentication required. This is trivially exploitable on one of the most widely deployed open-source LLM inference engines. Upgrade immediately; if patching is blocked, firewall the distributed MessageQueue ports to trusted hosts only.
Is CVE-2024-11041 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2024-11041, increasing the risk of exploitation.
How to fix CVE-2024-11041?
1. PATCH: Upgrade vllm beyond v0.6.2—verify the fix is present in the target release. 2. ISOLATE: Restrict network access to vllm inter-process communication ports via firewall rules, namespace isolation, or VPC security groups to trusted hosts only. 3. DETECT: Monitor inference servers for unexpected outbound connections, new listening ports, and anomalous process spawning from vllm worker processes. 4. AUDIT: Review who has network-level access to vllm serving infrastructure and enforce least-privilege networking. 5. WORKAROUND: If immediate patching is blocked, wrap MessageQueue transport in an authenticated/signed layer or replace pickle with a safe serialization format (JSON, msgpack).
What systems are affected by CVE-2024-11041?
This vulnerability affects the following AI/ML architecture patterns: LLM inference serving, distributed multi-GPU inference, multi-node inference clusters, on-premises model serving, AI serving infrastructure.
What is the CVSS score for CVE-2024-11041?
CVE-2024-11041 has a CVSS v3.0 base score of 9.8 (CRITICAL). The EPSS exploitation probability is 5.60%.
Technical Details
NVD Description
vllm-project vllm version v0.6.2 contains a vulnerability in the MessageQueue.dequeue() API function. The function passes data received over the socket directly to pickle.loads, leading to a remote code execution vulnerability. An attacker can exploit this by sending a malicious payload to the MessageQueue, causing the victim's machine to execute arbitrary code.
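The vulnerable pattern can be illustrated with a minimal sketch. The class and method names below are hypothetical stand-ins for the behavior the NVD description ascribes to MessageQueue.dequeue(), not vllm's actual code:

```python
import pickle


class UnsafeMessageQueue:
    """Illustrative sketch of the vulnerable pattern: bytes read from a
    network socket are handed straight to pickle.loads(), which will
    execute attacker-controlled code during deserialization."""

    def __init__(self, sock):
        self.sock = sock

    def dequeue(self):
        data = self.sock.recv(65536)  # untrusted bytes from the network
        return pickle.loads(data)     # RCE: unpickling can run arbitrary code
```

Because `pickle.loads` reconstructs arbitrary Python objects, any peer that can reach the socket controls what code runs in the worker process; no malformed input or memory corruption is needed.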
Exploitation Scenario
An adversary with internal network access (lateral movement from compromised workstation, or exposed vllm endpoint) scans for the vllm MessageQueue socket. Using standard Python tooling (pickletools, pwntools) they craft a malicious pickle payload that spawns a reverse shell. They send the payload directly to the MessageQueue. When the vllm worker calls dequeue(), pickle.loads() executes the payload without any checks. The attacker lands on a GPU server with access to model weights, internal APIs, and cloud credentials in environment variables—enabling model exfiltration, lateral movement through the AI serving cluster, or persistent backdoor installation.
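The core primitive in this scenario can be demonstrated with a benign, self-contained example: unpickling an object whose `__reduce__` returns a callable causes that callable to run during `pickle.loads`. A real attacker would substitute `os.system` or a subprocess spawning a reverse shell for the harmless `print` used here.

```python
import pickle


class Payload:
    # During pickling, __reduce__ tells pickle to reconstruct this object
    # by calling the returned callable with the given arguments. During
    # UNpickling, that call actually executes -- this is the RCE primitive.
    def __reduce__(self):
        return (print, ("code executed during pickle.loads()",))


malicious_bytes = pickle.dumps(Payload())
pickle.loads(malicious_bytes)  # the print call runs as a side effect
```

No bug in the receiver is required: this is documented pickle behavior, which is why the Python documentation warns never to unpickle data from untrusted sources.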
Weaknesses (CWE)
CWE-502: Deserialization of Untrusted Data
CVSS Vector
CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
Related Vulnerabilities
All in the same package (vllm):
- CVE-2025-32444 (9.8): vLLM: RCE via pickle deserialization on ZeroMQ
- CVE-2024-9053 (9.8): vllm: RCE via unsafe pickle deserialization in RPC server
- CVE-2026-25960 (9.8): vllm: SSRF allows internal network access
- CVE-2025-47277 (9.8): vLLM: RCE via exposed TCPStore in distributed inference
- CVE-2026-22807 (9.8): vllm: Code Injection enables RCE