CVE-2025-57809: xgrammar: uncontrolled recursion in grammar parsing causes DoS

GHSA-5cmr-4px5-23pc HIGH PoC AVAILABLE
Published August 25, 2025
CISO Take

Any LLM inference service running xgrammar below 0.1.21 for structured/constrained generation (vLLM, MLC-LLM, llama.cpp wrappers) is vulnerable to a remote, unauthenticated DoS via a crafted user-supplied grammar. The attack requires zero privileges and no user interaction: a single malformed grammar crashes the inference server. Patch to xgrammar >= 0.1.21 immediately; if patching is blocked, disable structured output / grammar endpoints as a temporary workaround.

What is the risk?

High severity (CVSS 7.5) with a trivially low exploitation barrier: network-accessible, no authentication, no user interaction, low complexity. The EPSS score is still low (0.44%), which reflects early-stage disclosure, but exploitation attempts are plausible given how simple the attack is. Impact is limited to availability, with no confidentiality or integrity loss, but for production AI inference infrastructure serving multiple tenants or downstream agents, repeated DoS is operationally critical. The risk is highest on multi-tenant LLM API platforms where grammar endpoints are externally exposed.

What systems are affected?

Package    Ecosystem   Vulnerable Range   Patched
XGrammar   pip         < 0.1.21           0.1.21

Do you use XGrammar below 0.1.21? You're affected.

How severe is it?

CVSS 3.1: 7.5 / 10
EPSS: 0.4% chance of exploitation in 30 days (higher than 35% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Trivial
Exploitation Confidence: medium (public PoC indexed in trickest/cve)

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

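Attack Vector (AV): Network
Attack Complexity (AC): Low
Privileges Required (PR): None
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High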

What should I do?

  1. PATCH: Upgrade xgrammar to >= 0.1.21 (fix commit b943feac). Check transitive dependencies, since vLLM and similar frameworks pin xgrammar internally.

  2. WORKAROUND (if patching is delayed): Disable or gate structured output endpoints behind authentication; reject user-supplied grammars at the API gateway layer.

  3. DETECTION: Monitor for unexpected inference worker crashes or stack overflow signals (SIGSEGV, SIGABRT) in process logs. Alert on abnormal grammar submission rates.

  4. VALIDATION: Add grammar size/depth limits at the application layer before passing grammars to xgrammar, as defense-in-depth.

  5. VERIFY: Run `pip show xgrammar` across all inference environments to confirm the installed version.
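The VERIFY step can also be scripted so it runs in every environment. The following is a minimal sketch, assuming Python 3.9+ and that the package is installed under the distribution name `xgrammar`; the helper names (`parse_release`, `is_patched`, `installed_status`) are hypothetical and only the numeric release part of the version string is compared.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# First fixed release according to this advisory.
PATCHED = (0, 1, 21)


def parse_release(v: str) -> tuple[int, ...]:
    """Return the leading numeric release, e.g. '0.1.21.post1' -> (0, 1, 21).

    Simplification: pre-release suffixes such as 'rc1' are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", v)
    if match is None:
        raise ValueError(f"unrecognized version string: {v!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_patched(v: str) -> bool:
    """True if the version is at or above the first fixed release."""
    return parse_release(v) >= PATCHED


def installed_status(package: str = "xgrammar") -> str:
    """Describe the installed version of a package in the current environment."""
    try:
        v = version(package)
    except PackageNotFoundError:
        return "not installed"
    return f"{v} (patched)" if is_patched(v) else f"{v} (VULNERABLE)"
```

Calling `print(installed_status())` inside each inference image or virtual environment reports whether that environment still carries a vulnerable build. Because frameworks may pin xgrammar transitively, the check should run in the environment the server actually uses, not just the one where the application was developed.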

What does CISA's SSVC say?

Decision: Track
Exploitation: none
Automatable: No
Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act: Article 15 - Accuracy, robustness and cybersecurity
ISO 42001: A.9.3 - Operational processes for AI systems
NIST AI RMF: MANAGE-2.2 - Mechanisms to address AI risks and harms
OWASP LLM Top 10: LLM04 - Model Denial of Service

Frequently Asked Questions

What is CVE-2025-57809?

CVE-2025-57809 is an uncontrolled recursion flaw (CWE-674) in xgrammar's grammar parsing. A crafted grammar with deeply nested or self-referential rules triggers a stack overflow that crashes the inference process, giving a remote, unauthenticated attacker a denial of service against any service that passes user-supplied grammars to xgrammar below 0.1.21. Patch to xgrammar >= 0.1.21; if patching is blocked, disable structured output / grammar endpoints as a temporary workaround.

Is CVE-2025-57809 actively exploited?

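No active exploitation has been recorded: CISA's SSVC assessment lists exploitation as "none". However, proof-of-concept exploit code is publicly available for CVE-2025-57809, which increases the risk of exploitation.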

How to fix CVE-2025-57809?

  1. PATCH: Upgrade xgrammar to >= 0.1.21 (fix commit b943feac). Check transitive dependencies, since vLLM and similar frameworks pin xgrammar internally.
  2. WORKAROUND (if patching is delayed): Disable or gate structured output endpoints behind authentication; reject user-supplied grammars at the API gateway layer.
  3. DETECTION: Monitor for unexpected inference worker crashes or stack overflow signals (SIGSEGV, SIGABRT) in process logs. Alert on abnormal grammar submission rates.
  4. VALIDATION: Add grammar size/depth limits at the application layer before passing grammars to xgrammar, as defense-in-depth.
  5. VERIFY: Run `pip show xgrammar` across all inference environments to confirm the installed version.
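For the DETECTION step, a supervisor that launches workers with Python's `subprocess` module can tell a crash from a normal exit by the return code: on POSIX a negative return code `-N` means the child was terminated by signal `N`. The sketch below is an illustration under that assumption, not how vLLM or any other server supervises its workers; `classify_exit` and `CRASH_SIGNALS` are hypothetical names.

```python
import signal

# Signals this advisory associates with a stack overflow crash.
CRASH_SIGNALS = {signal.SIGSEGV, signal.SIGABRT}


def classify_exit(returncode: int) -> str:
    """Classify a subprocess return code as ok, crash, killed, or error.

    On POSIX, subprocess reports termination by signal N as returncode -N.
    """
    if returncode == 0:
        return "ok"
    if returncode < 0:
        try:
            sig = signal.Signals(-returncode)
        except ValueError:
            return f"killed by unknown signal {-returncode}"
        if sig in CRASH_SIGNALS:
            return f"crash: {sig.name}"
        return f"killed: {sig.name}"
    return f"error exit {returncode}"
```

A supervisor could call `classify_exit(proc.wait())` after each worker exits and raise an alert whenever the result starts with `crash:`, ideally correlating the crash with the most recent grammar submitted to that worker.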

What systems are affected by CVE-2025-57809?

The vulnerable package is xgrammar (pip) below 0.1.21; version 0.1.21 is patched. The vulnerability affects the following AI/ML architecture patterns: LLM inference serving, model serving, agent frameworks, structured output pipelines.

What is the CVSS score for CVE-2025-57809?

CVE-2025-57809 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.44%.

What is the AI security impact?

Affected AI Architectures

LLM inference serving, model serving, agent frameworks, structured output pipelines

MITRE ATLAS Techniques

AML.T0010.001 AI Software
AML.T0029 Denial of AI Service
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 15
ISO 42001: A.9.3
NIST AI RMF: MANAGE-2.2
OWASP LLM Top 10: LLM04

What are the technical details?

Original Advisory

### Summary
This issue, http://github.com/mlc-ai/xgrammar/issues/250, should have its own security advisory. Several tools accept user-supplied grammars and pass them to xgrammar, and the bug is so easy to trigger that it seems like a High.

Exploitation Scenario

An attacker discovers an LLM API endpoint supporting structured output (e.g., OpenAI-compatible `/v1/chat/completions` with `response_format: {type: json_schema}`). They submit a crafted JSON Schema or EBNF grammar containing deeply nested or self-referential recursive rules. xgrammar's parser processes the grammar without recursion depth limits, triggering a stack overflow that crashes the inference worker process. On vLLM or similar multi-worker setups, the attacker repeats the request to exhaust all worker processes, causing full service unavailability. No authentication, prior access, or AI/ML expertise is required; a single HTTP request is enough to crash a worker.
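A gateway-side guard against this scenario can reject oversized or deeply nested JSON Schemas before they reach xgrammar. The following is a minimal sketch, not part of xgrammar or any framework: `MAX_BYTES`, `MAX_DEPTH`, `GrammarRejected`, `nesting_depth`, and `check_schema` are hypothetical names and values chosen for illustration. The depth walk is iterative so that the guard itself cannot be driven into deep recursion.

```python
import json

MAX_BYTES = 64 * 1024  # hypothetical size cap; tune per deployment
MAX_DEPTH = 32         # hypothetical nesting cap; tune per deployment


class GrammarRejected(ValueError):
    """Raised when a submitted schema exceeds the configured limits."""


def nesting_depth(node: object) -> int:
    """Return the maximum dict/list nesting depth, using an explicit stack."""
    max_depth = 0
    stack = [(node, 1)]
    while stack:
        current, depth = stack.pop()
        if isinstance(current, dict):
            children = list(current.values())
        elif isinstance(current, list):
            children = current
        else:
            continue  # scalars do not add depth
        max_depth = max(max_depth, depth)
        stack.extend((child, depth + 1) for child in children)
    return max_depth


def check_schema(raw: str, max_bytes: int = MAX_BYTES,
                 max_depth: int = MAX_DEPTH) -> object:
    """Parse a JSON Schema string and enforce size and depth limits."""
    if len(raw.encode("utf-8")) > max_bytes:
        raise GrammarRejected("schema too large")
    try:
        schema = json.loads(raw)
    except RecursionError as exc:
        # The JSON decoder itself refuses extremely deep input.
        raise GrammarRejected("schema nesting exceeds parser limits") from exc
    except json.JSONDecodeError as exc:
        raise GrammarRejected("schema is not valid JSON") from exc
    depth = nesting_depth(schema)
    if depth > max_depth:
        raise GrammarRejected(f"schema depth {depth} exceeds {max_depth}")
    return schema
```

This check only bounds structural nesting. It does not detect self-referential rules (for example `$ref` cycles) and does not apply to EBNF grammars, so it complements upgrading xgrammar rather than replacing it.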

Weaknesses (CWE)

CWE-674 — Uncontrolled Recursion: The product does not properly control the amount of recursion that takes place, consuming excessive resources, such as allocated memory or the program stack.

  • [Implementation] Ensure that an end condition will be reached under all logic conditions. The end condition may include checking against the depth of recursion and exiting with an error if the recursion goes too deep. The complexity of the end condition contributes to the effectiveness of this action.
  • [Implementation] Increase the stack size.
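The first mitigation, checking recursion depth and exiting with an error, can be illustrated with a toy recursive-descent parser for nested parentheses. This is a generic sketch of the technique and makes no claim about how the xgrammar fix is implemented; `MAX_DEPTH`, `NestingTooDeep`, `parse_seq`, and `validate` are hypothetical names.

```python
MAX_DEPTH = 100  # hypothetical limit, well below Python's default recursion limit


class NestingTooDeep(ValueError):
    """Raised when input nests deeper than the configured limit."""


def parse_seq(text: str, pos: int, depth: int, max_depth: int) -> int:
    """Parse characters and balanced '(...)' groups; return the stop index.

    Parsing stops at an unmatched ')' or at the end of the input.
    """
    if depth > max_depth:
        # End condition: refuse to recurse further instead of exhausting the stack.
        raise NestingTooDeep(f"nesting deeper than {max_depth}")
    while pos < len(text) and text[pos] != ")":
        if text[pos] == "(":
            pos = parse_seq(text, pos + 1, depth + 1, max_depth)
            if pos >= len(text) or text[pos] != ")":
                raise ValueError("unbalanced '('")
        pos += 1  # consume an ordinary character or the closing ')'
    return pos


def validate(text: str, max_depth: int = MAX_DEPTH) -> bool:
    """Return True if parentheses are balanced and within the depth limit."""
    end = parse_seq(text, 0, 0, max_depth)
    if end != len(text):
        raise ValueError("unbalanced ')'")
    return True
```

With the guard in place, `validate("(" * 5000 + ")" * 5000)` fails fast with `NestingTooDeep` instead of recursing 5000 levels. In Python an unguarded version would raise `RecursionError`; in native code the equivalent unguarded recursion can overflow the stack and kill the process, which is the failure mode CWE-674 describes.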

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
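The 7.5 base score follows from this vector under the CVSS v3.1 base score equations. The sketch below reproduces the arithmetic for the Scope Unchanged case only, using the metric weights and the round-up rule from the CVSS v3.1 specification; the function names are hypothetical.

```python
import math

# CVSS v3.1 metric weights (PR values are those for Scope Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a Scope Unchanged vector."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    m = dict(part.split(":") for part in parts)
    if m["S"] != "U":
        raise NotImplementedError("this sketch handles Scope Unchanged only")
    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[m["C"]]) * (1 - cia[m["I"]]) * (1 - cia[m["A"]])
    impact = 6.42 * iss
    exploitability = (8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
                      * WEIGHTS["PR"][m["PR"]] * WEIGHTS["UI"][m["UI"]])
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))
```

For this CVE the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is 8.22 × 0.85 × 0.77 × 0.85 × 0.85 ≈ 3.89, which sum to about 7.48 and round up to 7.5.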

Timeline

Published: August 25, 2025
Last Modified: September 10, 2025
First Seen: March 24, 2026

Related Vulnerabilities