CVE-2025-57809: xgrammar: uncontrolled recursion in grammar parsing causes DoS

GHSA-5cmr-4px5-23pc HIGH PoC AVAILABLE
Published August 25, 2025
CISO Take

Any LLM inference service using xgrammar for structured/constrained generation (vLLM, MLC-LLM, llama.cpp wrappers) is vulnerable to a remote, unauthenticated DoS via a crafted user-supplied grammar. The attack requires zero privileges and no user interaction — a single malformed grammar crashes the inference server. Patch to xgrammar >= 0.1.21 immediately; if patching is blocked, disable structured output / grammar endpoints as a temporary workaround.

Risk Assessment

High severity (CVSS 7.5) with a trivially low exploitation barrier: network-accessible, no authentication, no user interaction, low complexity. The EPSS score (0.03%) reflects early-stage disclosure, but rapid exploitation should be expected given the attack's simplicity. Impact is limited to availability — no confidentiality or integrity loss — but for production AI inference infrastructure serving multiple tenants or downstream agents, repeated DoS is operationally critical. It is particularly dangerous in multi-tenant LLM API platforms where grammar endpoints are externally exposed.

Affected Systems

Package     Ecosystem    Vulnerable Range    Patched
xgrammar    pip          < 0.1.21            0.1.21


Severity & Risk

CVSS 3.1: 7.5 / 10 (HIGH)
EPSS: 0.03% chance of exploitation in 30 days (higher than 9% of all CVEs)
Exploitation Status: Exploit Available
Exploitation Signal: MEDIUM
Sophistication: Trivial
Exploitation Confidence: Medium (public PoC indexed in trickest/cve)

Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

AV: Network
AC: Low
PR: None
UI: None
S: Unchanged
C: None
I: None
A: High

Recommended Action

5 steps
  1. PATCH

    Upgrade xgrammar to >= 0.1.21 (fix commit b943feac). Check transitive dependencies — vLLM and similar frameworks pin xgrammar internally.

  2. WORKAROUND (if patching delayed)

    Disable or gate structured output endpoints behind authentication; reject user-supplied grammars at the API gateway layer.

  3. DETECTION

    Monitor for unexpected inference worker crashes or stack overflow signals (SIGSEGV, SIGABRT) in process logs. Alert on abnormal grammar submission rates.

  4. VALIDATION

    Add grammar size/depth limits at the application layer before passing to xgrammar as defense-in-depth.

  5. VERIFY

    Run pip show xgrammar across all inference environments to confirm version.
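Steps 4 and 5 can be combined into a short pre-flight check. The sketch below is illustrative only, not code from the advisory: `MAX_BYTES`, `MAX_DEPTH`, and the function names are hypothetical choices, and the thresholds should be tuned to the grammars your service legitimately accepts.

```python
# Illustrative defense-in-depth checks before handing a user-supplied
# JSON Schema grammar to xgrammar. Thresholds are example values.
import json
from importlib.metadata import version, PackageNotFoundError

MAX_BYTES = 64 * 1024   # reject oversized grammars outright
MAX_DEPTH = 64          # reject pathologically nested grammars

def schema_depth(node, depth=0):
    """Return the maximum nesting depth of a parsed JSON value."""
    if isinstance(node, dict):
        return max((schema_depth(v, depth + 1) for v in node.values()),
                   default=depth)
    if isinstance(node, list):
        return max((schema_depth(v, depth + 1) for v in node),
                   default=depth)
    return depth

def validate_grammar(raw: str) -> dict:
    """Raise ValueError instead of passing a risky grammar downstream."""
    if len(raw.encode()) > MAX_BYTES:
        raise ValueError("grammar exceeds size limit")
    schema = json.loads(raw)
    if schema_depth(schema) > MAX_DEPTH:
        raise ValueError("grammar exceeds nesting-depth limit")
    return schema

def xgrammar_is_patched() -> bool:
    """True if the installed xgrammar is >= 0.1.21 (naive tuple compare)."""
    try:
        parts = tuple(int(p) for p in version("xgrammar").split(".")[:3])
    except PackageNotFoundError:
        return False
    return parts >= (0, 1, 21)
```

A gate like `validate_grammar` belongs at the API layer, before the grammar ever reaches xgrammar; it does not replace the upstream fix, which adds proper recursion limits inside the parser itself.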

CISA SSVC Assessment

Decision: Track
Exploitation: none
Automatable: no
Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.9.3 - Operational processes for AI systems
NIST AI RMF
MANAGE-2.2 - Mechanisms to address AI risks and harms
OWASP LLM Top 10
LLM04 - Model Denial of Service

Frequently Asked Questions

What is CVE-2025-57809?

CVE-2025-57809 is an uncontrolled-recursion vulnerability in xgrammar's grammar parser: a crafted, deeply nested or self-referential user-supplied grammar triggers a stack overflow that crashes the inference process. Any LLM inference service using xgrammar for structured/constrained generation (vLLM, MLC-LLM, llama.cpp wrappers) that accepts user-supplied grammars is exposed to a remote, unauthenticated DoS requiring zero privileges and no user interaction. The fix is to patch to xgrammar >= 0.1.21; if patching is blocked, disable structured output / grammar endpoints as a temporary workaround.

Is CVE-2025-57809 actively exploited?

Proof-of-concept exploit code is publicly available for CVE-2025-57809, increasing the risk of exploitation.

How to fix CVE-2025-57809?

1. PATCH: Upgrade xgrammar to >= 0.1.21 (fix commit b943feac). Check transitive dependencies — vLLM and similar frameworks pin xgrammar internally. 2. WORKAROUND (if patching delayed): Disable or gate structured output endpoints behind authentication; reject user-supplied grammars at the API gateway layer. 3. DETECTION: Monitor for unexpected inference worker crashes or stack overflow signals (SIGSEGV, SIGABRT) in process logs. Alert on abnormal grammar submission rates. 4. VALIDATION: Add grammar size/depth limits at the application layer before passing to xgrammar as defense-in-depth. 5. VERIFY: Run `pip show xgrammar` across all inference environments to confirm version.

What systems are affected by CVE-2025-57809?

This vulnerability affects the following AI/ML architecture patterns: LLM inference serving, model serving, agent frameworks, structured output pipelines.

What is the CVSS score for CVE-2025-57809?

CVE-2025-57809 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.03%.

Technical Details

NVD Description

Summary (from the upstream advisory): the issue tracked at http://github.com/mlc-ai/xgrammar/issues/250 should have its own security advisory. Since several tools accept and pass user-supplied grammars to xgrammar, and the flaw is so easy to trigger, it rates as High severity.

Exploitation Scenario

Attacker discovers an LLM API endpoint supporting structured output (e.g., OpenAI-compatible `/v1/chat/completions` with `response_format: {type: json_schema}`). They submit a crafted JSON Schema or EBNF grammar containing deeply nested or self-referential recursive rules. xgrammar's parser processes the grammar without recursion depth limits, triggering a stack overflow that crashes the inference worker process. On vLLM or similar multi-worker setups, the attacker repeats the request to exhaust all worker processes, causing full service unavailability. No authentication, prior access, or AI/ML expertise required — a single HTTP request suffices.
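To make the failure mode concrete, the sketch below builds a deeply nested JSON Schema of the general shape described above (deep nesting rather than `$ref` self-reference). This is not the published PoC; `deeply_nested_schema` is a hypothetical helper, and nothing is sent to any endpoint — it only shows how few bytes of JSON suffice to create hundreds of nesting levels for a recursive-descent parser with no depth limit to chew through.

```python
# Illustrative only: the general shape of a grammar payload that
# abuses unbounded recursion in a parser. Not the published PoC.
import json

def deeply_nested_schema(depth: int) -> str:
    """Build a JSON Schema whose 'properties' nest `depth` levels deep."""
    schema = {"type": "string"}
    for _ in range(depth):
        schema = {"type": "object", "properties": {"a": schema}}
    return json.dumps(schema)

# ~12 KB of JSON already produces hundreds of nesting levels; a parser
# that recurses once per level with no depth check will exhaust its
# stack long before any size-based request limit is hit.
payload = deeply_nested_schema(300)
print(len(payload), "bytes")
```

This is why the application-layer depth limit recommended above is effective: the payload is tiny by byte count, so only a structural (depth) check catches it before the parser recurses.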

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H

Timeline

Published
August 25, 2025
Last Modified
September 10, 2025
First Seen
March 24, 2026
