CVE-2025-48944: vLLM: input validation DoS crashes inference worker

GHSA-vrq3-r879-7m65 MEDIUM PoC AVAILABLE CISA: TRACK*
Published May 30, 2025
CISO Take

Any authenticated API user can crash a vLLM inference worker with a single malformed tools request, causing service outage until manually restarted. Low-privilege access requirement makes this a realistic insider or compromised-credential threat in multi-tenant LLM deployments. Patch to vLLM 0.9.0 immediately—there is no safe workaround short of disabling tools functionality entirely.

Risk Assessment

Medium-high operational risk despite the CVSS 6.5 score. Attack complexity is low and only low-privilege API credentials are required, making exploitation trivial for any user with API access. The impact is total disruption of the affected inference worker with no automatic recovery—manual intervention required each time. Organizations running vLLM 0.8.x with the tools API exposed face persistent availability risk from any authenticated principal, internal or external.

Affected Systems

Package: vllm
Ecosystem: pip
Vulnerable Range: >= 0.8.0, < 0.9.0
Patched: 0.9.0

Severity & Risk

CVSS 3.1: 6.5 / 10
EPSS: 0.32% chance of exploitation within 30 days (higher than 55% of all CVEs)
Exploitation Status: Exploit Available (exploitation likelihood: MEDIUM)
Sophistication: Trivial
Exploitation Confidence: Medium
CISA SSVC: Public PoC (indexed via trickest/cve)
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

Attack Vector (AV): Network
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High

Recommended Action

  1. Upgrade vLLM to 0.9.0 immediately; this is the only complete fix.
  2. If patching is blocked, restrict /v1/chat/completions tools functionality to trusted internal principals at the API gateway, or disable it entirely.
  3. Implement automatic worker restart via a systemd watchdog, Kubernetes liveness probes, or equivalent to minimize MTTR if exploitation occurs.
  4. Add API gateway or WAF rules to validate and reject structurally malformed "pattern" and "type" fields in tool definitions before they reach the inference worker.
  5. Alert on unexpected vLLM worker process terminations; treat crashes as potential exploitation indicators.
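The gateway-side validation in step 4 can be sketched as a small pre-check that compiles each tool parameter's "pattern" field before the request is forwarded to the worker. The "pattern" and "type" field names come from the NVD description, but the exact tool-definition schema and the helper below are illustrative assumptions, not vLLM's actual API.

```python
import re
from typing import Any

def reject_malformed_tools(body: dict[str, Any]) -> list[str]:
    """Return validation errors for a /v1/chat/completions request body.

    An empty list means the 'tools' array looks safe to forward.
    Hypothetical gateway pre-check: the field locations are assumptions
    based on the NVD description of CVE-2025-48944.
    """
    errors: list[str] = []
    for i, tool in enumerate(body.get("tools", [])):
        if not isinstance(tool, dict):
            errors.append(f"tools[{i}]: not an object")
            continue
        params = tool.get("function", {}).get("parameters", {})
        for name, prop in params.get("properties", {}).items():
            if not isinstance(prop, dict):
                errors.append(f"tools[{i}].{name}: not an object")
                continue
            # 'type' must be a plain JSON Schema type string.
            if "type" in prop and not isinstance(prop["type"], str):
                errors.append(f"tools[{i}].{name}: non-string 'type'")
            # 'pattern' must compile; a malformed regex is exactly the
            # class of input that crashes an unpatched worker.
            if "pattern" in prop:
                try:
                    re.compile(prop["pattern"])
                except (re.error, TypeError):
                    errors.append(f"tools[{i}].{name}: invalid 'pattern'")
    return errors

# A malformed pattern is caught at the gateway, not in the worker.
bad = {"tools": [{"function": {"parameters": {"properties": {
    "zip": {"type": "string", "pattern": "[unclosed"}}}}}]}
print(reject_malformed_tools(bad))
```

Rejecting the request with a 400 at the gateway keeps the malformed input out of the worker entirely; this reduces exposure but is not a substitute for upgrading to 0.9.0.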

CISA SSVC Assessment

Decision: Track*
Exploitation: PoC
Automatable: No
Technical Impact: Partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act: Art. 15 - Accuracy, robustness and cybersecurity
ISO 42001: 8.1 - Operational planning and control
NIST AI RMF: MANAGE 2.2 - Mechanisms to sustain AI system performance
OWASP LLM Top 10: LLM04 - Model Denial of Service

Frequently Asked Questions

What is CVE-2025-48944?

Any authenticated API user can crash a vLLM inference worker with a single malformed tools request, causing service outage until manually restarted. Low-privilege access requirement makes this a realistic insider or compromised-credential threat in multi-tenant LLM deployments. Patch to vLLM 0.9.0 immediately—there is no safe workaround short of disabling tools functionality entirely.

Is CVE-2025-48944 actively exploited?

Proof-of-concept exploit code is publicly available for CVE-2025-48944, increasing the risk of exploitation.

How to fix CVE-2025-48944?

1. Upgrade vLLM to 0.9.0 immediately; this is the only complete fix.
2. If patching is blocked, restrict /v1/chat/completions tools functionality to trusted internal principals at the API gateway, or disable it entirely.
3. Implement automatic worker restart via a systemd watchdog, Kubernetes liveness probes, or equivalent to minimize MTTR if exploitation occurs.
4. Add API gateway or WAF rules to validate and reject structurally malformed "pattern" and "type" fields in tool definitions before they reach the inference worker.
5. Alert on unexpected vLLM worker process terminations; treat crashes as potential exploitation indicators.

What systems are affected by CVE-2025-48944?

This vulnerability affects the following AI/ML architecture patterns: LLM inference serving, agent frameworks, function-calling pipelines, model serving APIs, RAG pipelines with tool use.

What is the CVSS score for CVE-2025-48944?

CVE-2025-48944 has a CVSS v3.1 base score of 6.5 (MEDIUM). The EPSS exploitation probability is 0.32%.

Technical Details

NVD Description

vLLM is an inference and serving engine for large language models (LLMs). In version 0.8.0 up to but excluding 0.9.0, the vLLM backend used with the /v1/chat/completions OpenAPI endpoint fails to validate unexpected or malformed input in the "pattern" and "type" fields when the tools functionality is invoked. These inputs are not validated before being compiled or parsed, causing a crash of the inference worker with a single request. The worker will remain down until it is restarted. Version 0.9.0 fixes the issue.

Exploitation Scenario

An attacker with basic API credentials—or a compromised low-privilege service account—sends a single POST request to /v1/chat/completions with a crafted 'tools' array containing a malformed 'pattern' field (e.g., an invalid regex that fails to compile) or a structurally invalid 'type' annotation. The vLLM backend attempts to compile or parse this input without prior validation, triggering an unhandled exception that terminates the inference worker process. The LLM service is fully unavailable until an operator manually restarts the worker. In a shared or multi-tenant LLM platform, a single request denies service to all downstream users of that worker. The attack requires no AI/ML expertise—only knowledge of the OpenAI-compatible tools API schema.
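The failure class described above can be illustrated in isolation: compiling an attacker-controlled regex with no guard raises an exception, and an unhandled exception in a worker process means the process terminates. This is a minimal sketch of the mechanism, not vLLM's actual code path.

```python
import re

# Structurally invalid regex, as might arrive in a crafted
# 'pattern' field of a tools definition (illustrative value).
attacker_pattern = "(?P<bad"

# Unpatched behavior (0.8.x, illustrative): the pattern is compiled
# without prior validation, so a single malformed value raises
# re.error. Unhandled in a server worker, this kills the process.
try:
    re.compile(attacker_pattern)
except re.error as exc:
    # Patched behavior (0.9.0) corresponds to catching this class of
    # failure and returning a client error instead of crashing.
    print(f"would be unhandled in 0.8.x: {exc}")
```

Because the exception fires on a single request and the worker does not restart itself, availability impact is High while confidentiality and integrity are untouched, matching the CVSS vector.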

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H

Timeline

Published: May 30, 2025
Last Modified: July 1, 2025
First Seen: May 30, 2025

Related Vulnerabilities