CVE-2025-48942: vLLM: DoS via malformed JSON schema guided param
GHSA-6qc9-v4r8-22xg | MEDIUM | PoC AVAILABLE | CISA: TRACK*
Any authenticated user with API access to vLLM 0.8.x can crash the entire inference server by sending a malformed JSON schema as a guided completion parameter. No special skill is required. Upgrade to vLLM 0.9.0 immediately; it patches both this issue and the companion regex DoS (CVE-2025-48943). In shared or multi-tenant inference environments, a single bad request brings down the service for all consumers.
What is the risk?
In AI serving contexts, the operational risk is higher than the CVSS score of 6.5 suggests. The attack has low complexity, is reachable over the network, and requires only authenticated access, so any internal user or holder of a compromised API token can exploit it. In organizations running vLLM as a shared inference backend for RAG pipelines, AI assistants, or internal tooling, a single malicious or misconfigured request causes a full service outage. There is no confidentiality or integrity impact, but loss of availability in AI inference is often business-critical.
What should I do?
- Upgrade vLLM to 0.9.0, which patches both CVE-2025-48942 (JSON schema) and CVE-2025-48943 (regex).
- If immediate upgrade is not possible, enforce input validation at the API gateway or reverse proxy layer: reject guided_json parameters that fail JSON Schema validation before forwarding to vLLM.
- Harden API authentication: minimize the number of principals with /v1/completions access.
- Deploy process supervision (systemd, supervisor, or k8s liveness probes) to auto-restart vLLM on crash and reduce MTTR.
- Monitor for repeated 5xx errors or process restarts on the inference server as a detection signal for exploitation attempts.
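The gateway-side validation suggested above could look roughly like the following. This is a minimal sketch using only the Python standard library, not a full JSON Schema validator and not vLLM's own fix. The function name `check_guided_json`, the `ALLOWED_TYPES` allow-list, and the `MAX_DEPTH` limit are all assumptions made for illustration. A production gateway would more likely use a dedicated JSON Schema library.

```python
import json

# Assumption: a conservative allow-list of JSON Schema "type" values.
ALLOWED_TYPES = {"string", "number", "integer", "boolean", "object", "array", "null"}
MAX_DEPTH = 16  # hypothetical nesting limit; tune for your own schemas


def check_guided_json(value, depth=0):
    """Return None if the schema passes the pre-filter, else a rejection reason."""
    if depth == 0 and isinstance(value, str):
        # The parameter may arrive as a JSON-encoded string.
        try:
            value = json.loads(value)
        except json.JSONDecodeError:
            return "guided_json is not valid JSON"
    if depth > MAX_DEPTH:
        return "schema nesting too deep"
    if depth > 0 and isinstance(value, bool):
        return None  # JSON Schema permits true/false as subschemas
    if not isinstance(value, dict):
        return "schema must be a JSON object"

    declared = value.get("type")
    if declared is not None:
        types = declared if isinstance(declared, list) else [declared]
        for t in types:
            if not isinstance(t, str) or t not in ALLOWED_TYPES:
                return f"unknown type: {t!r}"

    # Collect subschemas from the most common keywords and recurse into them.
    children = []
    for key in ("properties", "$defs", "definitions"):
        group = value.get(key, {})
        if not isinstance(group, dict):
            return f"{key} must be an object"
        children.extend(group.values())
    for key in ("anyOf", "oneOf", "allOf"):
        group = value.get(key, [])
        if not isinstance(group, list):
            return f"{key} must be an array"
        children.extend(group)
    items = value.get("items")
    if items is not None:
        children.extend(items if isinstance(items, list) else [items])

    for child in children:
        reason = check_guided_json(child, depth + 1)
        if reason:
            return reason
    return None
```

A gateway would call `check_guided_json(body.get("guided_json"))` when the field is present and return HTTP 400 with the reason instead of forwarding the request. The filter is deliberately strict: it may reject some valid schemas, which is usually an acceptable trade-off for a temporary mitigation.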
What does CISA's SSVC say?
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
What is CVE-2025-48942?
CVE-2025-48942 is a denial-of-service vulnerability in vLLM versions 0.8.0 up to but excluding 0.9.0. An authenticated user with API access can crash the entire inference server by sending an invalid JSON schema as a guided completion parameter to /v1/completions. Version 0.9.0 fixes this issue and the companion regex DoS (CVE-2025-48943).
Is CVE-2025-48942 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2025-48942, increasing the risk of exploitation.
How to fix CVE-2025-48942?
1. Upgrade vLLM to 0.9.0, which patches both CVE-2025-48942 (JSON schema) and CVE-2025-48943 (regex).
2. If immediate upgrade is not possible, enforce input validation at the API gateway or reverse proxy layer: reject guided_json parameters that fail JSON Schema validation before forwarding to vLLM.
3. Harden API authentication: minimize the number of principals with /v1/completions access.
4. Deploy process supervision (systemd, supervisor, or k8s liveness probes) to auto-restart vLLM on crash and reduce MTTR.
5. Monitor for repeated 5xx errors or process restarts on the inference server as a detection signal for exploitation attempts.
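The monitoring step can be approximated with a sliding-window counter over failure events (5xx responses or process restarts). The sketch below is illustrative only: the class name `FailureBurstDetector` and the default threshold and window are assumptions, and the event timestamps would have to come from your own logs or supervisor.

```python
from collections import deque


class FailureBurstDetector:
    """Flag a burst when `threshold` failures occur within `window_seconds`."""

    def __init__(self, threshold=3, window_seconds=300.0):
        # Both defaults are hypothetical; choose values that fit your traffic.
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events = deque()

    def record(self, timestamp):
        """Record one failure; return True if the alert threshold is reached."""
        self._events.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop events that have fallen out of the window.
        while self._events and self._events[0] < cutoff:
            self._events.popleft()
        return len(self._events) >= self.threshold
```

Feeding it one timestamp per vLLM restart, for example, lets an operator tell an isolated crash apart from a repeated crash pattern that may indicate someone replaying a malformed request.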
What systems are affected by CVE-2025-48942?
This vulnerability affects the following AI/ML architecture patterns: LLM inference APIs, model serving, RAG pipelines, agent frameworks.
What is the CVSS score for CVE-2025-48942?
CVE-2025-48942 has a CVSS v3.1 base score of 6.5 (MEDIUM). The EPSS exploitation probability is 0.45%.
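For readers who want to see where 6.5 comes from, the following sketch recomputes the base score from the vector CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H using the metric weights and rounding rule published in the CVSS v3.1 specification. The function names are my own, and the code handles base metrics only (no temporal or environmental metrics).

```python
import math

# Metric weights from the CVSS v3.1 specification (base metrics only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on Scope (U = unchanged, C = changed).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    parts = vector.split("/")
    if parts[0] != "CVSS:3.1":
        raise ValueError("only CVSS:3.1 vectors are supported")
    m = dict(p.split(":", 1) for p in parts[1:])
    scope = m["S"]
    w = {k: WEIGHTS[k][m[k]] for k in WEIGHTS}

    # Impact sub-score.
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15

    # Exploitability sub-score.
    exploitability = 8.22 * w["AV"] * w["AC"] * PR_WEIGHTS[scope][m["PR"]] * w["UI"]

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 6.5
```

Here the impact sub-score is 6.42 × 0.56 ≈ 3.60 (availability only) and the exploitability sub-score is 8.22 × 0.85 × 0.77 × 0.62 × 0.85 ≈ 2.84; their sum, about 6.43, rounds up to 6.5.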
What is the AI security impact?
Affected AI Architectures
MITRE ATLAS Techniques
- AML.T0029 Denial of AI Service
- AML.T0040 AI Model Inference API Access
- AML.T0049 Exploit Public-Facing Application
Compliance Controls Affected
What are the technical details?
Original Advisory
vLLM is an inference and serving engine for large language models (LLMs). In versions 0.8.0 up to but excluding 0.9.0, hitting the /v1/completions API with an invalid json_schema as a Guided Param kills the vLLM server. This vulnerability is similar to GHSA-9hcf-v7m4-6m2j/CVE-2025-48943, which concerns a regex instead of a JSON schema. Version 0.9.0 fixes the issue.
Exploitation Scenario
An internal developer or the holder of a compromised API token sends a POST request to /v1/completions with a deliberately malformed JSON schema in the guided_json field (for example, a syntactically invalid schema or one with circular references). vLLM neither validates the schema nor catches the exception raised while parsing it, so the unhandled error terminates the server process. Even in a Kubernetes deployment where the pod restarts automatically, the attacker can resend the request after each recovery, sustaining a denial of service that disrupts all dependent AI applications until the cluster operator deploys 0.9.0.
Weaknesses (CWE)
CWE-248 — Uncaught Exception: An exception is thrown from a function, but it is not caught.
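The general remedy for CWE-248 is to catch the exception at the request boundary and turn it into a client error instead of letting it terminate the process. The sketch below illustrates that pattern only; it is not vLLM's actual patch. `compile_schema`, `SchemaRejected`, and `handle_request` are hypothetical names standing in for whatever component compiles the guided schema.

```python
import json


class SchemaRejected(Exception):
    """Hypothetical error for a schema the backend cannot compile."""


def compile_schema(schema_text):
    # Hypothetical stand-in for a schema-to-grammar compiler.
    schema = json.loads(schema_text)  # may raise json.JSONDecodeError
    if not isinstance(schema, dict):
        raise SchemaRejected("schema must be a JSON object")
    return schema


def handle_request(schema_text):
    """Return (status_code, body); schema errors never propagate to the caller."""
    try:
        compile_schema(schema_text)
    except (json.JSONDecodeError, SchemaRejected) as exc:
        # A bad schema is the client's fault: answer 400 and keep serving.
        return 400, {"error": f"invalid guided_json: {exc}"}
    return 200, {"status": "accepted"}
```

With this shape, `handle_request('{oops')` yields a 400 response rather than an unhandled exception, so one malformed request affects only the caller that sent it.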
Source: MITRE CWE corpus.
CVSS Vector
CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
References
- github.com/advisories/GHSA-6qc9-v4r8-22xg
- github.com/pypa/advisory-database/tree/main/vulns/vllm/PYSEC-2025-54.yaml
- nvd.nist.gov/vuln/detail/CVE-2025-48942
- github.com/vllm-project/vllm/commit/08bf7840780980c7568c573c70a6a8db94fd45ff Patch
- github.com/vllm-project/vllm/issues/17248 Issue
- github.com/vllm-project/vllm/pull/17623 Issue Patch
- github.com/vllm-project/vllm/security/advisories/GHSA-6qc9-v4r8-22xg Exploit Vendor
- github.com/ARPSyndicate/cve-scores Exploit
Related Vulnerabilities
- CVE-2024-9053 (9.8) vllm: RCE via unsafe pickle deserialization in RPC server. Same package: vllm
- CVE-2024-11041 (9.8) vllm: RCE via unsafe pickle deserialization in MessageQueue. Same package: vllm
- CVE-2025-47277 (9.8) vLLM: RCE via exposed TCPStore in distributed inference. Same package: vllm
- CVE-2026-25960 (9.8) vllm: SSRF allows internal network access. Same package: vllm
- CVE-2025-32444 (9.8) vLLM: RCE via pickle deserialization on ZeroMQ. Same package: vllm