CVE-2024-11041: vllm: RCE via unsafe pickle deserialization in MessageQueue

GHSA-5vqr-wprc-cpp7 CRITICAL PoC AVAILABLE CISA: ATTEND
Published March 20, 2025
CISO Take

Any attacker with network access to a vllm v0.6.2 inference server can achieve full remote code execution with zero authentication required. This is trivially exploitable on one of the most widely deployed open-source LLM inference engines. Upgrade immediately; if patching is blocked, firewall the distributed MessageQueue ports to trusted hosts only.

Risk Assessment

Critical risk for organizations running vllm for on-premises or private cloud LLM inference. CVSS 9.8 reflects the worst-case profile: network-exploitable, no authentication, no user interaction, full C/I/A compromise. vllm powers LLaMA, Mistral, Qwen, and similar deployments at scale. Multi-node and multi-GPU distributed inference configurations are most exposed since the MessageQueue is used for inter-process communication across nodes. EPSS of 5.6% suggests exploitation is not yet widespread, but the barrier is extremely low: any standard pickle payload generator produces a working exploit.

Affected Systems

Package vllm
Ecosystem pip
Vulnerable Range <= 0.6.2
Patched No patch available

Severity & Risk

CVSS 3.1 9.8 / 10
EPSS 5.6% chance of exploitation in 30 days (higher than 90% of all CVEs)
Exploitation Status Exploit Available
Exploitation Confidence Medium
Sophistication Trivial
CISA SSVC Public PoC (indexed in trickest/cve)

Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

Attack Vector (AV) Network
Attack Complexity (AC) Low
Privileges Required (PR) None
User Interaction (UI) None
Scope (S) Unchanged
Confidentiality (C) High
Integrity (I) High
Availability (A) High

Recommended Action

  1. PATCH

    Upgrade vllm to a release later than v0.6.2; since no patched version was listed at publication time, verify the fix is present in the target release before deploying.

  2. ISOLATE

    Restrict network access to vllm inter-process communication ports via firewall rules, namespace isolation, or VPC security groups to trusted hosts only.

  3. DETECT

    Monitor inference servers for unexpected outbound connections, new listening ports, and anomalous process spawning from vllm worker processes.

  4. AUDIT

    Review who has network-level access to vllm serving infrastructure and enforce least-privilege networking.

  5. WORKAROUND

    If immediate patching is blocked, wrap MessageQueue transport in an authenticated/signed layer or replace pickle with a safe serialization format (JSON, msgpack).
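
The workaround in step 5 can be sketched as follows. This is a minimal illustration, not vllm's actual transport code: it assumes a hypothetical shared secret distributed out of band, uses JSON instead of pickle (JSON parsing cannot execute code), and prepends an HMAC-SHA256 tag so forged or tampered frames are rejected before parsing.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret; in practice, distribute via a secret manager.
SECRET = b"replace-with-a-strong-shared-key"

def encode(message: dict) -> bytes:
    """Serialize with JSON (safe to parse) and prepend an HMAC-SHA256 tag."""
    body = json.dumps(message, separators=(",", ":")).encode()
    tag = hmac.new(SECRET, body, hashlib.sha256).digest()
    return tag + body

def decode(frame: bytes) -> dict:
    """Verify the tag in constant time before parsing; reject forged frames."""
    tag, body = frame[:32], frame[32:]
    expected = hmac.new(SECRET, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("rejected frame: bad HMAC")
    return json.loads(body)
```

Note that this only authenticates frames between holders of the shared key; it does not encrypt them, and it assumes all queue messages are JSON-representable.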

CISA SSVC Assessment

Decision Attend
Exploitation poc
Automatable Yes
Technical Impact total

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.2.6 - AI system security controls
NIST AI RMF
MANAGE 2.4 - Residual risks are managed and monitored
OWASP LLM Top 10
LLM05 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2024-11041?

CVE-2024-11041 is a critical remote code execution vulnerability in vllm v0.6.2. The MessageQueue.dequeue() function deserializes data received over the network with pickle.loads, so any attacker who can reach the MessageQueue socket can execute arbitrary code with no authentication. Upgrade immediately; if patching is blocked, firewall the distributed MessageQueue ports to trusted hosts only.

Is CVE-2024-11041 actively exploited?

Proof-of-concept exploit code is publicly available for CVE-2024-11041, increasing the risk of exploitation.

How to fix CVE-2024-11041?

1. PATCH: Upgrade vllm beyond v0.6.2; verify the fix is present in the target release. 2. ISOLATE: Restrict network access to vllm inter-process communication ports via firewall rules, namespace isolation, or VPC security groups to trusted hosts only. 3. DETECT: Monitor inference servers for unexpected outbound connections, new listening ports, and anomalous process spawning from vllm worker processes. 4. AUDIT: Review who has network-level access to vllm serving infrastructure and enforce least-privilege networking. 5. WORKAROUND: If immediate patching is blocked, wrap MessageQueue transport in an authenticated/signed layer or replace pickle with a safe serialization format (JSON, msgpack).

What systems are affected by CVE-2024-11041?

This vulnerability affects the following AI/ML architecture patterns: LLM inference serving, distributed multi-GPU inference, multi-node inference clusters, on-premises model serving, AI serving infrastructure.

What is the CVSS score for CVE-2024-11041?

CVE-2024-11041 has a CVSS v3.1 base score of 9.8 (CRITICAL). The EPSS exploitation probability is 5.60%.

Technical Details

NVD Description

vllm-project vllm version v0.6.2 contains a vulnerability in the MessageQueue.dequeue() API function. The function uses pickle.loads to deserialize data received directly from the socket, leading to a remote code execution vulnerability. An attacker can exploit this by sending a malicious payload to the MessageQueue, causing the victim's machine to execute arbitrary code.
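
The core mechanism is inherent to pickle: an object's `__reduce__` method names any importable callable to invoke during deserialization. The benign sketch below (not vllm code; the class name is invented) shows that merely calling `pickle.loads` on attacker-controlled bytes runs attacker-chosen code.

```python
import pickle

class NotReallyAMessage:
    """Benign stand-in for a malicious payload."""
    def __reduce__(self):
        # A real exploit would return something like (os.system, ("<shell cmd>",)).
        # Here we run a harmless expression to show code executes on load.
        return (eval, ("6 * 7",))

payload = pickle.dumps(NotReallyAMessage())

# "Victim" side: pickle.loads invokes eval("6 * 7") during deserialization,
# before any application code can inspect the message.
result = pickle.loads(payload)
print(result)  # 42
```

No flaw in the surrounding application is needed; the deserializer itself is the execution primitive, which is why the fix must remove or gate `pickle.loads` on untrusted input rather than validate the decoded object afterward.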

Exploitation Scenario

An adversary with internal network access (lateral movement from a compromised workstation, or an exposed vllm endpoint) scans for the vllm MessageQueue socket. Using standard Python tooling, they craft a malicious pickle payload (a class with a custom __reduce__ method) that spawns a reverse shell, then send it directly to the MessageQueue. When the vllm worker calls dequeue(), pickle.loads() executes the payload without any checks. The attacker lands on a GPU server with access to model weights, internal APIs, and cloud credentials in environment variables, enabling model exfiltration, lateral movement through the AI serving cluster, or persistent backdoor installation.
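
As a defense-in-depth measure where pickle cannot be removed immediately, the Python documentation's restricted-unpickler pattern refuses any global not on an explicit allowlist, which blocks `__reduce__`-style payloads that reference callables like `os.system` or `eval`. This sketch is illustrative; the allowlist contents here are placeholders, and it does not harden vllm itself.

```python
import io
import pickle

# Placeholder allowlist: only globals actually needed by your messages.
SAFE_GLOBALS = {("builtins", "range"), ("builtins", "list")}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global (class/function) referenced by the stream.
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")

def restricted_loads(data: bytes):
    """Drop-in replacement for pickle.loads with a global allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

This reduces, but does not eliminate, pickle's attack surface; authenticating the transport or moving to a data-only format remains the stronger fix.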

CVSS Vector

CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H

Timeline

Published
March 20, 2025
Last Modified
July 31, 2025
First Seen
March 20, 2025

Related Vulnerabilities