CVE-2025-48942: vLLM: DoS via malformed JSON schema guided param

GHSA-6qc9-v4r8-22xg · MEDIUM · PoC available · CISA SSVC: Track*
Published May 30, 2025
CISO Take

Any authenticated user with API access to vLLM 0.8.x can crash the entire inference server by sending a malformed JSON schema as a guided completion parameter; no skill is required. Upgrade to vLLM 0.9.0 immediately: it patches both this issue and the companion regex DoS (CVE-2025-48943). In shared or multi-tenant inference environments, a single bad request takes the service down for all consumers.

Risk Assessment

Operational risk is higher than the CVSS 6.5 score suggests in AI serving contexts. The attack is low-complexity, network-accessible, and requires only authenticated access, making it exploitable by any internal user or compromised API token holder. In organizations running vLLM as a shared inference backend for RAG pipelines, AI assistants, or internal tooling, a single malicious or misconfigured request causes a full service outage. There is no confidentiality or integrity impact, but availability loss in AI inference is often business-critical.

Affected Systems

Package Ecosystem Vulnerable Range Patched
vllm pip >= 0.8.0, < 0.9.0 0.9.0

Severity & Risk

CVSS 3.1
6.5 / 10
EPSS
0.2%
chance of exploitation in 30 days
Higher than 43% of all CVEs
Exploitation Status
Exploit Available
Exploitation: MEDIUM
Sophistication
Trivial
Exploitation Confidence
medium
CISA SSVC: Public PoC
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

AV Network
AC Low
PR Low
UI None
S Unchanged
C None
I None
A High

Recommended Action

  1. Upgrade vLLM to 0.9.0 — patches both CVE-2025-48942 (JSON schema) and CVE-2025-48943 (regex).

  2. If immediate upgrade is not possible, enforce input validation at the API gateway or reverse proxy layer: reject guided_json parameters that fail JSON Schema validation before forwarding to vLLM.

  3. Implement API authentication hardening — minimize the number of principals with /v1/completions access.

  4. Deploy process supervision (systemd, supervisor, or k8s liveness probes) to auto-restart vLLM on crash and reduce MTTR.

  5. Monitor for repeated 5xx errors or process restarts on the inference server as a detection signal for exploitation attempts.
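Step 2 above can be sketched as a minimal pre-filter in front of vLLM. This is an illustrative example, not vLLM's own code: it only checks that the `guided_json` value parses as JSON and decodes to a JSON object, which is enough to screen out the syntactically invalid schemas this CVE describes. A production gateway would run a full JSON Schema validator instead.

```python
import json

def is_acceptable_guided_json(raw):
    """Minimal pre-filter for a gateway sitting in front of vLLM.

    Accepts either a JSON string or an already-decoded dict, and
    rejects anything that is not a syntactically valid JSON object.
    This is NOT a full JSON Schema validator; it only screens out
    the malformed-schema class of input described in the advisory.
    """
    if isinstance(raw, dict):
        # Already decoded by the client framework; structurally an object.
        return True
    try:
        schema = json.loads(raw)
    except (TypeError, ValueError):
        # Not a string/bytes value, or not well-formed JSON at all.
        return False
    # A JSON Schema used for guided decoding should decode to an object.
    return isinstance(schema, dict)
```

A gateway would call this on the `guided_json` field of each incoming completion request and return 400 on failure, so the malformed schema never reaches the vulnerable server.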

CISA SSVC Assessment

Decision Track*
Exploitation PoC
Automatable No
Technical Impact Partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 9 - Risk Management System
ISO 42001
A.6.2.6 - AI System Operational Continuity
NIST AI RMF
MANAGE 2.2 - AI Risk Treatment
OWASP LLM Top 10
LLM04 - Model Denial of Service

Frequently Asked Questions

What is CVE-2025-48942?

CVE-2025-48942 is a denial-of-service vulnerability in vLLM versions 0.8.0 up to, but not including, 0.9.0. Sending a malformed JSON schema as the guided_json parameter of the /v1/completions API crashes the vLLM server, taking down inference for all consumers. Any authenticated user with API access can trigger it; version 0.9.0 fixes this issue along with the companion regex DoS (CVE-2025-48943).

Is CVE-2025-48942 actively exploited?

Proof-of-concept exploit code is publicly available for CVE-2025-48942, increasing the risk of exploitation.

How to fix CVE-2025-48942?

  1. Upgrade vLLM to 0.9.0, which patches both CVE-2025-48942 (JSON schema) and CVE-2025-48943 (regex).
  2. If immediate upgrade is not possible, enforce input validation at the API gateway or reverse proxy layer: reject guided_json parameters that fail JSON Schema validation before forwarding to vLLM.
  3. Harden API authentication and minimize the number of principals with /v1/completions access.
  4. Deploy process supervision (systemd, supervisor, or k8s liveness probes) to auto-restart vLLM on crash and reduce MTTR.
  5. Monitor for repeated 5xx errors or process restarts on the inference server as a detection signal for exploitation attempts.

What systems are affected by CVE-2025-48942?

This vulnerability affects the following AI/ML architecture patterns: LLM inference APIs, model serving, RAG pipelines, agent frameworks.

What is the CVSS score for CVE-2025-48942?

CVE-2025-48942 has a CVSS v3.1 base score of 6.5 (MEDIUM). The EPSS exploitation probability is 0.21%.

Technical Details

NVD Description

vLLM is an inference and serving engine for large language models (LLMs). In versions 0.8.0 up to but excluding 0.9.0, hitting the /v1/completions API with an invalid json_schema as a Guided Param kills the vLLM server. This vulnerability is similar to GHSA-9hcf-v7m4-6m2j/CVE-2025-48943, which covers regex guided params instead of a JSON schema. Version 0.9.0 fixes the issue.

Exploitation Scenario

An internal developer or compromised API token holder sends a POST to /v1/completions with an intentionally malformed JSON schema in the guided_json field (e.g., a syntactically invalid schema or one with circular references). vLLM fails to validate or catch the exception when parsing the schema, causing an unhandled crash that terminates the server process. In a Kubernetes deployment without proper liveness probes, the pod restarts but the attacker can repeat the request on each recovery, sustaining a denial-of-service that disrupts all dependent AI applications until the cluster operator deploys 0.9.0.
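To make the request shape concrete, here is a hedged illustration of the payload class the scenario describes. Only the endpoint path and the guided_json field name come from the advisory; the model name and the specific malformed string are placeholders, not the published PoC.

```python
import json

# Endpoint path taken from the advisory text; everything else below
# is a placeholder for illustration only.
ENDPOINT = "/v1/completions"

body = {
    "model": "example-model",   # placeholder model name
    "prompt": "hello",
    # Syntactically invalid JSON passed where a JSON Schema is expected;
    # on unpatched 0.8.x this class of input crashed the server.
    "guided_json": "{ not valid json ",
}

payload = json.dumps(body)  # the body that would be POSTed to ENDPOINT
```

Note that the outer request body is perfectly valid JSON; only the nested guided_json value is malformed, which is why naive body-level JSON validation at a proxy does not catch it and the field must be inspected individually (as in step 2 of the recommended actions).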

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H

Timeline

Published
May 30, 2025
Last Modified
June 27, 2025
First Seen
May 30, 2025

Related Vulnerabilities