CVE-2025-48943: vLLM: ReDoS crashes inference server via malformed regex

GHSA-9hcf-v7m4-6m2j MEDIUM PoC AVAILABLE
Published May 30, 2025
CISO Take

Any authenticated user of vLLM 0.8.x can crash the entire inference server by submitting a malformed regex as a structured output constraint — no special skills required. This is a shared-resource risk: one bad request takes down inference for all downstream services and agents. Upgrade to vLLM 0.9.0 immediately; if blocked, add regex validation at the API gateway layer.

Risk Assessment

Operationally higher risk than CVSS 6.5 suggests for shared inference infrastructure. Low Complexity + Low Privileges means any authenticated API consumer — including internal developers, CI pipelines, or compromised service accounts — can trigger a full server crash with a single request. EPSS is negligible (0.00084), indicating no observed exploitation yet, but the attack is trivially reproducible once the advisory is public. The sibling vulnerability CVE-2025-48942 (same pattern, JSON schema instead of regex) confirms a systemic input validation gap in vLLM's structured output subsystem.

Affected Systems

Package   Ecosystem   Vulnerable Range      Patched
vllm      pip         >= 0.8.0, < 0.9.0     0.9.0

78.9K · 126 dependents · 56% patched · ~32d to patch

Severity & Risk

CVSS 3.1: 6.5 / 10
EPSS: 0.2% chance of exploitation in 30 days (higher than 47% of all CVEs)
Exploitation Status: Exploit Available (Exploitation: MEDIUM)
Sophistication: Trivial
Exploitation Confidence: medium (public PoC indexed in trickest/cve)
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

Attack Vector (AV): Network
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High

Recommended Action

  1. PATCH

    Upgrade vLLM to >= 0.9.0 immediately (pip install --upgrade vllm).

  2. WORKAROUND

    If upgrade is blocked, add a pre-validation layer that sanitizes regex patterns before forwarding to vLLM — reject patterns exceeding complexity thresholds (e.g., nested quantifiers).

  3. RATE-LIMIT

    Apply per-user rate limiting on structured output endpoints to slow down brute-force crash attempts.

  4. DETECT

    Monitor for sudden vLLM process restarts or spikes in 5xx errors on structured output endpoints. Alert on repeated server crashes from the same API key/user.

  5. ISOLATE

    Run vLLM behind an internal-only API gateway; avoid exposing the inference API directly to untrusted users.

  6. AUDIT

    Review whether structured output endpoints are exposed to external or low-trust identities.
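The gateway-side workaround in step 2 can be sketched as a pre-validation filter that runs before a request is forwarded to vLLM. This is a minimal sketch, not vLLM's own code: the function name, length cap, and the nested-quantifier heuristic are illustrative assumptions, and the heuristic catches only the classic `(a+)+`-style shapes rather than every dangerous pattern.

```python
import re

# A quantifier applied directly to a quantified group, e.g. (a+)+ or (a*)* --
# the classic catastrophic-backtracking shape. Illustrative, not exhaustive.
NESTED_QUANTIFIER = re.compile(r"\([^()]*[+*][^()]*\)\s*[+*{]")

MAX_PATTERN_LENGTH = 256  # illustrative complexity threshold

def is_safe_pattern(pattern: str) -> bool:
    """Reject regex patterns before they reach vLLM's structured output engine."""
    if len(pattern) > MAX_PATTERN_LENGTH:
        return False
    if NESTED_QUANTIFIER.search(pattern):
        return False
    try:
        re.compile(pattern)  # syntactically invalid patterns are rejected too
    except re.error:
        return False
    return True
```

A gateway would call `is_safe_pattern()` on any `guided_regex`-style parameter and return 400 instead of forwarding, so a malformed or pathological pattern never reaches the shared inference process.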
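The per-user rate limiting in step 3 is commonly implemented as a token bucket keyed by API key. A minimal sketch, with assumed rate and burst values (tune to your traffic):

```python
import time
from collections import defaultdict

class TokenBucket:
    """Per-API-key token bucket for structured output endpoints (illustrative)."""

    def __init__(self, rate: float = 5.0, burst: int = 10):
        self.rate = rate    # tokens refilled per second
        self.burst = burst  # maximum bucket size
        self.tokens = defaultdict(lambda: float(burst))
        self.last = defaultdict(time.monotonic)

    def allow(self, api_key: str) -> bool:
        """Return True if this key may make a request now, consuming one token."""
        now = time.monotonic()
        elapsed = now - self.last[api_key]
        self.last[api_key] = now
        self.tokens[api_key] = min(self.burst,
                                   self.tokens[api_key] + elapsed * self.rate)
        if self.tokens[api_key] >= 1.0:
            self.tokens[api_key] -= 1.0
            return True
        return False
```

Even with pre-validation in place, rate limiting caps how quickly a single compromised key can probe for patterns that slip past the filter.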

CISA SSVC Assessment

Decision: Track
Exploitation: none
Automatable: No
Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.2.8 - AI system availability and performance
NIST AI RMF
GOVERN-1.7 - Processes for AI risk identification and response
MANAGE-2.2 - Mechanisms to sustain AI system value and manage risks over time
OWASP LLM Top 10
LLM04 - Model Denial of Service

Frequently Asked Questions

What is CVE-2025-48943?

Any authenticated user of vLLM 0.8.x can crash the entire inference server by submitting a malformed regex as a structured output constraint — no special skills required. This is a shared-resource risk: one bad request takes down inference for all downstream services and agents. Upgrade to vLLM 0.9.0 immediately; if blocked, add regex validation at the API gateway layer.

Is CVE-2025-48943 actively exploited?

Proof-of-concept exploit code is publicly available for CVE-2025-48943, increasing the risk of exploitation.

How to fix CVE-2025-48943?

1. PATCH: Upgrade vLLM to >= 0.9.0 immediately (pip install --upgrade vllm). 2. WORKAROUND: If upgrade is blocked, add a pre-validation layer that sanitizes regex patterns before forwarding to vLLM — reject patterns exceeding complexity thresholds (e.g., nested quantifiers). 3. RATE-LIMIT: Apply per-user rate limiting on structured output endpoints to slow down brute-force crash attempts. 4. DETECT: Monitor for sudden vLLM process restarts or spikes in 5xx errors on structured output endpoints. Alert on repeated server crashes from the same API key/user. 5. ISOLATE: Run vLLM behind an internal-only API gateway; avoid exposing the inference API directly to untrusted users. 6. AUDIT: Review whether structured output endpoints are exposed to external or low-trust identities.

What systems are affected by CVE-2025-48943?

This vulnerability affects the following AI/ML architecture patterns: LLM inference serving, structured output pipelines, AI agent frameworks, multi-tenant LLM API gateways, RAG pipelines with constrained generation.

What is the CVSS score for CVE-2025-48943?

CVE-2025-48943 has a CVSS v3.1 base score of 6.5 (MEDIUM). The EPSS exploitation probability is 0.24%.

Technical Details

NVD Description

vLLM is an inference and serving engine for large language models (LLMs). Versions 0.8.0 up to but excluding 0.9.0 have a Denial of Service (ReDoS) vulnerability that causes the vLLM server to crash if an invalid regex is provided while using structured output. This vulnerability is similar to GHSA-6qc9-v4r8-22xg/CVE-2025-48942, but for regex instead of a JSON schema. Version 0.9.0 fixes the issue.

Exploitation Scenario

An adversary with low-privilege API access (e.g., a developer API key, a compromised service account, or a malicious internal user) sends a POST request to the vLLM inference endpoint with a carefully crafted regex pattern in the guided generation parameters — such as a pattern with catastrophic backtracking like `(a+)+$`. The vLLM server attempts to compile and validate the regex, triggering exponential backtracking that consumes all CPU and crashes the process. All concurrent inference requests fail. In a Kubernetes environment without proper liveness probes, the pod may hang rather than restart, causing a prolonged outage. An adversary can repeat this attack to maintain denial of service against a critical LLM serving layer.
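The catastrophic-backtracking mechanism described above can be reproduced locally with Python's own `re` engine. This is a standalone demonstration of the regex behavior, not vLLM's actual code path: on the non-matching input `'a'*n + '!'`, the pattern `(a+)+$` forces the matcher to try roughly 2^n ways of partitioning the run of `a`s, so the rejection time roughly doubles with each extra character.

```python
import re
import time

def redos_time(n: int) -> float:
    """Time how long the backtracking matcher spends rejecting 'a'*n + '!'
    against the pathological pattern (a+)+$."""
    start = time.perf_counter()
    re.match(r"(a+)+$", "a" * n + "!")  # never matches; backtracks exponentially
    return time.perf_counter() - start

if __name__ == "__main__":
    for n in (10, 16, 22):
        print(n, f"{redos_time(n):.4f}s")
```

At modest `n` the delay is already visible, and a few more characters push it from seconds into minutes, which is why a single request is enough to starve or crash a shared inference process.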

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H

Timeline

Published
May 30, 2025
Last Modified
June 27, 2025
First Seen
May 30, 2025

Related Vulnerabilities