CVE-2025-48943: vLLM: ReDoS crashes inference server via malformed regex
GHSA-9hcf-v7m4-6m2j | MEDIUM | PoC AVAILABLE
Any authenticated user of vLLM 0.8.x can crash the entire inference server by submitting a malformed regex as a structured output constraint; no special skills are required. This is a shared-resource risk: one bad request takes down inference for all downstream services and agents. Upgrade to vLLM 0.9.0 immediately; if blocked, add regex validation at the API gateway layer.
What is the risk?
For shared inference infrastructure, the operational risk is higher than the CVSS score of 6.5 suggests. Low attack complexity combined with low required privileges means any authenticated API consumer (including internal developers, CI pipelines, or compromised service accounts) can trigger a full server crash with a single request. The EPSS score is low (0.00084), which reflects a low predicted likelihood of exploitation rather than confirmed absence of it, and the attack is trivially reproducible from the public advisory. The sibling vulnerability CVE-2025-48942 (same pattern, JSON schema instead of regex) points to a systemic input validation gap in vLLM's structured output subsystem.
What should I do?
- PATCH: Upgrade vLLM to >= 0.9.0 immediately (pip install --upgrade vllm).
- WORKAROUND: If the upgrade is blocked, add a pre-validation layer that sanitizes regex patterns before forwarding to vLLM; reject patterns exceeding complexity thresholds (e.g., nested quantifiers).
- RATE-LIMIT: Apply per-user rate limiting on structured output endpoints to slow down brute-force crash attempts.
- DETECT: Monitor for sudden vLLM process restarts or spikes in 5xx errors on structured output endpoints. Alert on repeated server crashes from the same API key/user.
- ISOLATE: Run vLLM behind an internal-only API gateway; avoid exposing the inference API directly to untrusted users.
- AUDIT: Review whether structured output endpoints are exposed to external or low-trust identities.
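The WORKAROUND step can be sketched as a small gateway-side check. This is a minimal illustration, not vLLM code: the function name `is_acceptable_regex`, the 512-character limit, and the nested-quantifier heuristic are all assumptions chosen for the example. It uses Python's `re` module as a stand-in parser, and the regex dialect accepted by vLLM's structured output backend may differ, so a pattern that passes here is not guaranteed to be safe there.

```python
import re

# Hypothetical limit; tune for your own workloads.
MAX_PATTERN_LENGTH = 512

# Heuristic for nested quantifiers such as (a+)+ or (\d*)*:
# a group containing + or * that is itself followed by a quantifier.
# It is deliberately conservative and can reject harmless patterns.
NESTED_QUANTIFIER = re.compile(r"\([^()]*[+*][^()]*\)[+*{]")


def is_acceptable_regex(pattern):
    """Return True if the pattern may be forwarded to the inference server."""
    if not isinstance(pattern, str) or len(pattern) > MAX_PATTERN_LENGTH:
        return False
    try:
        # Reject anything the stand-in parser cannot compile.
        re.compile(pattern)
    except (re.error, OverflowError, RecursionError):
        return False
    if NESTED_QUANTIFIER.search(pattern):
        return False
    return True
```

A gateway would call this on the regex constraint of each structured output request and answer with a 4xx response instead of forwarding the request when it returns False.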
What does CISA's SSVC say?
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
What is CVE-2025-48943?
Any authenticated user of vLLM 0.8.x can crash the entire inference server by submitting a malformed regex as a structured output constraint — no special skills required. This is a shared-resource risk: one bad request takes down inference for all downstream services and agents. Upgrade to vLLM 0.9.0 immediately; if blocked, add regex validation at the API gateway layer.
Is CVE-2025-48943 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2025-48943, increasing the risk of exploitation.
How to fix CVE-2025-48943?
1. PATCH: Upgrade vLLM to >= 0.9.0 immediately (pip install --upgrade vllm).
2. WORKAROUND: If the upgrade is blocked, add a pre-validation layer that sanitizes regex patterns before forwarding to vLLM; reject patterns exceeding complexity thresholds (e.g., nested quantifiers).
3. RATE-LIMIT: Apply per-user rate limiting on structured output endpoints to slow down brute-force crash attempts.
4. DETECT: Monitor for sudden vLLM process restarts or spikes in 5xx errors on structured output endpoints. Alert on repeated server crashes from the same API key/user.
5. ISOLATE: Run vLLM behind an internal-only API gateway; avoid exposing the inference API directly to untrusted users.
6. AUDIT: Review whether structured output endpoints are exposed to external or low-trust identities.
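The DETECT step could be approximated with a sliding-window counter over gateway logs. The sketch below is an assumption-laden illustration: the class name `CrashSuspectTracker`, the threshold of 3, and the 300-second window are hypothetical, and it treats any 5xx response on a structured output request as a possible crash signal. After a real crash, other users will also see 5xx responses, so a hit is a lead for investigation rather than proof of intent.

```python
from collections import defaultdict, deque


class CrashSuspectTracker:
    """Flag API keys with repeated 5xx responses inside a time window."""

    def __init__(self, threshold=3, window_seconds=300.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # api_key -> timestamps of 5xx responses

    def record(self, api_key, status_code, timestamp):
        """Record one response; return True when the key reaches the threshold."""
        if status_code < 500:
            return False
        events = self._events[api_key]
        events.append(timestamp)
        # Drop events that have fallen out of the window.
        while events and timestamp - events[0] > self.window_seconds:
            events.popleft()
        return len(events) >= self.threshold
```

Feeding it one call per logged response, with the timestamp in seconds, is enough to raise an alert when `record` returns True.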
What systems are affected by CVE-2025-48943?
This vulnerability affects the following AI/ML architecture patterns: LLM inference serving, structured output pipelines, AI agent frameworks, multi-tenant LLM API gateways, RAG pipelines with constrained generation.
What is the CVSS score for CVE-2025-48943?
CVE-2025-48943 has a CVSS v3.1 base score of 6.5 (MEDIUM). The EPSS exploitation probability is 0.40%.
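The 6.5 base score can be reproduced from the vector CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H using the CVSS v3.1 base score equations. The sketch below handles only the scope-unchanged case (S:U), which is all this vector needs; the function names are illustrative.

```python
import math

# CVSS v3.1 metric weights (PR values are the scope-unchanged ones).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a scope-unchanged CVSS v3.1 vector."""
    parts = dict(item.split(":") for item in vector.split("/"))
    if parts.get("CVSS") != "3.1" or parts.get("S") != "U":
        raise ValueError("only CVSS:3.1 vectors with S:U are supported here")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    return roundup(min(impact + exploitability, 10))
```

For this vector the impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is about 2.84; their sum, roughly 6.43, rounds up to 6.5. The score is driven entirely by availability impact (A:H), since confidentiality and integrity are both rated None.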
What is the AI security impact?
MITRE ATLAS Techniques
- AML.T0029 Denial of AI Service
- AML.T0034 Cost Harvesting
- AML.T0040 AI Model Inference API Access
- AML.T0049 Exploit Public-Facing Application
What are the technical details?
Original Advisory
vLLM is an inference and serving engine for large language models (LLMs). Version 0.8.0 up to but excluding 0.9.0 have a Denial of Service (ReDoS) that causes the vLLM server to crash if an invalid regex was provided while using structured output. This vulnerability is similar to GHSA-6qc9-v4r8-22xg/CVE-2025-48942, but for regex instead of a JSON schema. Version 0.9.0 fixes the issue.
Exploitation Scenario
An adversary with low-privilege API access (e.g., a developer API key, a compromised service account, or a malicious internal user) sends a POST request to the vLLM inference endpoint with a malformed regex pattern in the guided generation parameters. According to the original advisory, supplying an invalid regex is enough to crash the server, and the CWE-248 classification indicates that the crash comes from an exception that is never caught. A pattern prone to catastrophic backtracking, such as `(a+)+$`, is a related risk, because exponential backtracking may consume all available CPU. In either case the vLLM process stops serving and all concurrent inference requests fail. In a Kubernetes environment without proper liveness probes, the pod may hang rather than restart, causing a prolonged outage. An adversary can repeat the request to maintain denial of service against a critical LLM serving layer.
Weaknesses (CWE)
CWE-248 — Uncaught Exception: An exception is thrown from a function, but it is not caught.
Source: MITRE CWE corpus.
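To illustrate what CWE-248 means in this context, the following simulation contrasts an unguarded and a guarded request handler. It is not vLLM's code: `serve`, `handle_unguarded`, and `handle_guarded` are hypothetical names, and Python's `re.compile` stands in for whatever component parses the regex constraint in the real server.

```python
import re


def handle_unguarded(pattern):
    """Compile the pattern with no error handling; invalid input raises."""
    re.compile(pattern)
    return (200, "ok")


def handle_guarded(pattern):
    """Compile the pattern and turn a parse failure into a client error."""
    try:
        re.compile(pattern)
    except re.error:
        return (400, "invalid regex")
    return (200, "ok")


def serve(patterns, handler):
    """Process requests in order. An exception escaping the handler ends
    the loop, so later requests are never served (the simulated crash)."""
    results = []
    for pattern in patterns:
        results.append(handler(pattern))
    return results
```

With the guarded handler, a bad pattern costs one 400 response and the remaining requests are still served. With the unguarded handler, the same pattern raises out of `serve` and nothing after it is processed, which mirrors how a single request can take down a shared server.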
CVSS Vector
CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
References
- github.com/advisories/GHSA-9hcf-v7m4-6m2j
- github.com/pypa/advisory-database/tree/main/vulns/vllm/PYSEC-2025-55.yaml
- nvd.nist.gov/vuln/detail/CVE-2025-48943
- github.com/vllm-project/vllm/commit/08bf7840780980c7568c573c70a6a8db94fd45ff [Patch]
- github.com/vllm-project/vllm/issues/17313 [Issue]
- github.com/vllm-project/vllm/pull/17623 [Issue, Patch]
- github.com/vllm-project/vllm/security/advisories/GHSA-9hcf-v7m4-6m2j [Vendor]
- github.com/ARPSyndicate/cve-scores [Exploit]
Related Vulnerabilities
- CVE-2024-9053 (9.8) vllm: RCE via unsafe pickle deserialization in RPC server. Same package: vllm
- CVE-2024-11041 (9.8) vllm: RCE via unsafe pickle deserialization in MessageQueue. Same package: vllm
- CVE-2026-25960 (9.8) vllm: SSRF allows internal network access. Same package: vllm
- CVE-2025-47277 (9.8) vLLM: RCE via exposed TCPStore in distributed inference. Same package: vllm
- CVE-2025-32444 (9.8) vLLM: RCE via pickle deserialization on ZeroMQ. Same package: vllm