CVE-2025-48942: vLLM: DoS via malformed JSON schema guided param

GHSA-6qc9-v4r8-22xg MEDIUM PoC AVAILABLE CISA: TRACK*
Published May 30, 2025
CISO Take

Any authenticated user with API access to vLLM 0.8.x can crash the entire inference server by sending a malformed JSON schema as a guided completion parameter — no skill required. Upgrade to vLLM 0.9.0 immediately; this patches both this issue and the companion regex DoS (CVE-2025-48943). In shared or multi-tenant inference environments, a single bad request brings down the service for all consumers.

What is the risk?

Operational risk in AI serving contexts is higher than the CVSS score of 6.5 suggests. The attack is low complexity, network-accessible, and requires only authenticated access, so any internal user or holder of a compromised API token can exploit it. In organizations running vLLM as a shared inference backend for RAG pipelines, AI assistants, or internal tooling, a single malicious or malformed request causes a full service outage. There is no confidentiality or integrity impact, but availability loss in AI inference is often business-critical.

What systems are affected?

Package   Ecosystem   Vulnerable Range      Patched
vLLM      pip         -                     No patch
vLLM      pip         >= 0.8.0, < 0.9.0     0.9.0
83.4K 130 dependents Pushed 2d ago 34% patched ~32d to patch
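
To check a deployment against the vulnerable range listed above (>= 0.8.0, < 0.9.0), a minimal Python sketch follows. The helper names `parse_release`, `is_vulnerable`, and `installed_vllm_is_vulnerable` are hypothetical, and the parser is an assumption-laden simplification: it reads only the leading numeric release components, so a pre-release string such as a release candidate of 0.9.0 would be treated as 0.9.0.

```python
import re
from importlib import metadata


def parse_release(version):
    """Extract (major, minor, patch) from the start of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def is_vulnerable(version):
    """True if the version falls in the advisory's range: >= 0.8.0, < 0.9.0."""
    return (0, 8, 0) <= parse_release(version) < (0, 9, 0)


def installed_vllm_is_vulnerable():
    """Check the vLLM package installed in the current environment, if any."""
    try:
        return is_vulnerable(metadata.version("vllm"))
    except metadata.PackageNotFoundError:
        return False  # vLLM is not installed in this environment
```

Running `installed_vllm_is_vulnerable()` inside each serving image or virtual environment gives a quick inventory signal; it does not replace a proper dependency scan.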

How severe is it?

CVSS 3.1: 6.5 / 10
EPSS: 0.5% chance of exploitation in 30 days (higher than 36% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Trivial
Exploitation Confidence: medium
CISA SSVC: Public PoC
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

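Attack Vector (AV): Network
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High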

What should I do?

  1. Upgrade vLLM to 0.9.0 — patches both CVE-2025-48942 (JSON schema) and CVE-2025-48943 (regex).

  2. If immediate upgrade is not possible, enforce input validation at the API gateway or reverse proxy layer: reject guided_json parameters that fail JSON Schema validation before forwarding to vLLM.

  3. Implement API authentication hardening — minimize the number of principals with /v1/completions access.

  4. Deploy process supervision (systemd, supervisor, or k8s liveness probes) to auto-restart vLLM on crash and reduce MTTR.

  5. Monitor for repeated 5xx errors or process restarts on the inference server as a detection signal for exploitation attempts.
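
The gateway-side check in step 2 could be sketched as follows. This is a minimal, standard-library-only pre-filter, not vLLM's own validation and not a complete JSON Schema validator: it only rejects input that is not parseable JSON, is not an object, is nested too deeply, or declares an unknown `type`. The function names and the `MAX_DEPTH` and `MAX_BYTES` limits are illustrative assumptions. A full validator, such as the `check_schema` method on the validator classes of the third-party `jsonschema` package, would be stricter.

```python
import json

# Primitive type names defined by the JSON Schema specification.
VALID_TYPES = {"null", "boolean", "object", "array", "number", "string", "integer"}
MAX_DEPTH = 32         # assumed nesting limit, tune for your workloads
MAX_BYTES = 64 * 1024  # assumed size limit for string-encoded schemas


def _check_node(node, depth=0):
    """Recursively reject obviously malformed schema nodes."""
    if depth > MAX_DEPTH:
        raise ValueError("schema nesting too deep")
    if isinstance(node, bool):
        return  # `true` and `false` are valid sub-schemas
    if not isinstance(node, dict):
        raise ValueError("schema node must be an object or boolean")
    declared = node.get("type")
    if declared is not None:
        names = declared if isinstance(declared, list) else [declared]
        for name in names:
            if not isinstance(name, str) or name not in VALID_TYPES:
                raise ValueError(f"unknown type: {name!r}")
    properties = node.get("properties", {})
    if not isinstance(properties, dict):
        raise ValueError("'properties' must be an object")
    for child in properties.values():
        _check_node(child, depth + 1)
    if "items" in node:
        items = node["items"]
        # Older drafts allow `items` to be a list of schemas.
        for child in items if isinstance(items, list) else [items]:
            _check_node(child, depth + 1)


def is_acceptable_guided_json(value):
    """Return True if `value` passes the pre-filter and may be forwarded."""
    try:
        if isinstance(value, str):
            if len(value.encode("utf-8")) > MAX_BYTES:
                return False
            value = json.loads(value)
        if not isinstance(value, dict):
            return False
        _check_node(value)
    except (ValueError, RecursionError):
        return False
    return True
```

A gateway would call `is_acceptable_guided_json(body.get("guided_json"))` only when the field is present and return a 4xx response on `False`, so the malformed request never reaches the inference server.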

What does CISA's SSVC say?

Decision: Track*
Exploitation: poc
Automatable: No
Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

How is it classified?

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act: Article 9 - Risk Management System
ISO 42001: A.6.2.6 - AI System Operational Continuity
NIST AI RMF: MANAGE 2.2 - AI Risk Treatment
OWASP LLM Top 10: LLM04 - Model Denial of Service

Frequently Asked Questions

What is CVE-2025-48942?

CVE-2025-48942 is a denial-of-service vulnerability in vLLM versions 0.8.0 up to but excluding 0.9.0. Sending an invalid JSON schema as a guided completion parameter to the /v1/completions API raises an uncaught exception that kills the server, so any authenticated user with API access can take the service down for all consumers. Version 0.9.0 fixes this issue and the companion regex DoS (CVE-2025-48943).

Is CVE-2025-48942 actively exploited?

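Proof-of-concept exploit code for CVE-2025-48942 is publicly available (indexed in trickest/cve), and CISA's SSVC assessment records the exploitation status as "poc". The public PoC raises the risk of exploitation; the EPSS estimate is about 0.5% over 30 days.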

How to fix CVE-2025-48942?

  1. Upgrade vLLM to 0.9.0, which patches both CVE-2025-48942 (JSON schema) and CVE-2025-48943 (regex).

  2. If immediate upgrade is not possible, enforce input validation at the API gateway or reverse proxy layer: reject guided_json parameters that fail JSON Schema validation before forwarding to vLLM.

  3. Harden API authentication: minimize the number of principals with /v1/completions access.

  4. Deploy process supervision (systemd, supervisor, or k8s liveness probes) to auto-restart vLLM on crash and reduce MTTR.

  5. Monitor for repeated 5xx errors or process restarts on the inference server as a detection signal for exploitation attempts.

What systems are affected by CVE-2025-48942?

The vulnerability affects the vLLM package (pip) in versions >= 0.8.0 and < 0.9.0; version 0.9.0 is patched. The exposed AI/ML architecture patterns are LLM inference APIs, model serving, RAG pipelines, and agent frameworks.

What is the CVSS score for CVE-2025-48942?

CVE-2025-48942 has a CVSS v3.1 base score of 6.5 (MEDIUM). The EPSS exploitation probability is 0.45%.

What is the AI security impact?

Affected AI Architectures

LLM inference APIs, model serving, RAG pipelines, agent frameworks

MITRE ATLAS Techniques

AML.T0029 Denial of AI Service
AML.T0040 AI Model Inference API Access
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 9
ISO 42001: A.6.2.6
NIST AI RMF: MANAGE 2.2
OWASP LLM Top 10: LLM04

What are the technical details?

Original Advisory

vLLM is an inference and serving engine for large language models (LLMs). In versions 0.8.0 up to but excluding 0.9.0, hitting the /v1/completions API with an invalid json_schema as a Guided Param kills the vLLM server. This vulnerability is similar to GHSA-9hcf-v7m4-6m2j/CVE-2025-48943, which concerns a regex instead of a JSON schema. Version 0.9.0 fixes the issue.

Exploitation Scenario

An internal developer or the holder of a compromised API token sends a POST to /v1/completions with an intentionally malformed JSON schema in the guided_json field (e.g., a syntactically invalid schema or one with circular references). vLLM neither validates the schema nor catches the exception raised while parsing it, so the unhandled error terminates the server process. Even where Kubernetes or a process supervisor restarts the server automatically, the attacker can repeat the request after each recovery, sustaining a denial of service that disrupts all dependent AI applications until the operator deploys 0.9.0.
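
Because the scenario above produces a repeating crash-and-restart pattern, a burst of restarts inside a short window is a useful detection signal. The following sketch is one simple way to flag it; the class name `RestartBurstDetector` and both default thresholds are assumptions chosen for illustration, and the restart timestamps would come from whatever supervisor or orchestrator events are available.

```python
from collections import deque


class RestartBurstDetector:
    """Flags when too many restarts occur inside a sliding time window."""

    def __init__(self, max_restarts=3, window_seconds=300.0):
        # Both thresholds are illustrative assumptions, not recommended values.
        self.max_restarts = max_restarts
        self.window_seconds = window_seconds
        self._events = deque()

    def record_restart(self, timestamp):
        """Record a restart at `timestamp` (seconds).

        Returns True once the number of restarts still inside the window
        reaches `max_restarts`.
        """
        self._events.append(timestamp)
        # Drop restarts that have fallen out of the window.
        while self._events and timestamp - self._events[0] > self.window_seconds:
            self._events.popleft()
        return len(self._events) >= self.max_restarts
```

Isolated restarts far apart never trip the detector, while three crashes within the window do. Correlating an alert with the API token that issued the preceding requests helps identify the caller.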

Weaknesses (CWE)

CWE-248 — Uncaught Exception: An exception is thrown from a function, but it is not caught.

Source: MITRE CWE corpus.
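
The weakness class can be illustrated with a generic request handler. This is not vLLM's code: `compile_schema` is a hypothetical stand-in for any schema-processing step that raises on bad input, and `handle_request` shows the defensive pattern of catching the exception at the request boundary so the failure becomes a 400 response for one caller instead of a process-wide crash.

```python
import json


def compile_schema(raw_schema):
    """Hypothetical stand-in for a schema compiler that raises on bad input."""
    schema = json.loads(raw_schema)  # raises json.JSONDecodeError on bad syntax
    if not isinstance(schema, dict):
        raise TypeError("schema must be a JSON object")
    return schema


def handle_request(raw_schema):
    """Return (status_code, message) rather than letting the exception escape."""
    try:
        compile_schema(raw_schema)
    except (ValueError, TypeError) as exc:
        # The failure stays scoped to this one request.
        return 400, f"invalid guided_json: {exc}"
    return 200, "ok"
```

Without the `try`/`except`, the exception would propagate to whatever called `handle_request`; if nothing up the stack catches it, the process exits, which is the behavior CWE-248 describes.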

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
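
The 6.5 base score can be reproduced from this vector with the CVSS v3.1 base-score equations. The sketch below handles only Scope Unchanged vectors, which is all this CVE needs; the function names are illustrative, while the metric weights and the round-up rule follow the CVSS v3.1 specification.

```python
import math

# CVSS v3.1 metric weights (PR values are those for Scope Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, as defined in the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a Scope Unchanged CVSS v3.1 vector."""
    # Skip the "CVSS:3.1" prefix and map each metric to its value letter.
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    if parts["S"] != "U":
        raise NotImplementedError("this sketch covers Scope Unchanged only")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))
```

For this vector the impact sub-score is about 3.60 and the exploitability sub-score about 2.84; their sum of roughly 6.43 rounds up to 6.5. The entire impact comes from the Availability metric, which matches the denial-of-service nature of the bug.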

Timeline

Published: May 30, 2025
Last Modified: June 27, 2025
First Seen: May 30, 2025

Related Vulnerabilities