CVE-2025-57809 — HIGH (CVSS 7.5) AI Security Vulnerability

CISO Take

Any LLM inference service that uses xgrammar < 0.1.21 for structured/constrained generation (vLLM, MLC-LLM, llama.cpp wrappers) and accepts user-supplied grammars is vulnerable to a remote, unauthenticated DoS via a crafted grammar. The attack requires zero privileges and no user interaction: a single malformed grammar crashes the inference server. Patch to xgrammar >= 0.1.21 immediately; if patching is blocked, disable structured output / grammar endpoints as a temporary workaround.

Risk Assessment

High severity (CVSS 7.5) with a trivially low exploitation barrier: network-accessible, no authentication, no user interaction, low complexity. The low EPSS score (0.031%) reflects early-stage disclosure, but expect rapid exploitation given the simplicity. Impact is limited to availability, with no confidentiality or integrity loss, but for production AI inference infrastructure serving multiple tenants or downstream agents, repeated DoS is operationally critical. The issue is particularly dangerous on multi-tenant LLM API platforms where grammar endpoints are externally exposed.

Affected Systems

Package	Ecosystem	Vulnerable Range	Patched
xgrammar	pip	< 0.1.21	`0.1.21`
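
The vulnerable range in the table can be checked programmatically. The following is a minimal sketch, assuming a plain `major.minor.patch` version string; the helper names (`parse_version`, `is_vulnerable`, `installed_xgrammar_status`) are hypothetical and not part of xgrammar. Pre-release suffixes such as `rc1` are ignored, so treat results for pre-release builds with caution.

```python
from importlib import metadata

# First fixed release according to the advisory table.
PATCHED = (0, 1, 21)


def parse_version(text: str) -> tuple:
    """Turn '0.1.20' or '0.1.21.post1' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component
        parts.append(int(digits))
    return tuple(parts)


def is_vulnerable(version_text: str) -> bool:
    """True when the version falls in the advisory's '< 0.1.21' range."""
    return parse_version(version_text) < PATCHED


def installed_xgrammar_status() -> str:
    """Report the state of the xgrammar install in this environment."""
    try:
        found = metadata.version("xgrammar")
    except metadata.PackageNotFoundError:
        return "xgrammar not installed"
    state = "VULNERABLE" if is_vulnerable(found) else "patched"
    return f"xgrammar {found}: {state}"


if __name__ == "__main__":
    print(installed_xgrammar_status())
```

Run it inside each inference environment (including container images), since a framework may carry its own pinned copy of xgrammar.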

Do you use xgrammar? Versions before 0.1.21 are affected.

Severity & Risk

CVSS 3.1

7.5 / 10

EPSS

0.03%

chance of exploitation in 30 days

Higher than 9% of all CVEs

Source: EPSS v3 — FIRST.org

Exploitation Status

Exploit Available

Exploitation: MEDIUM

Sophistication

Trivial

Exploitation Confidence

medium

○ Public PoC indexed (trickest/cve)

Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

AV Network

AC Low

PR None

UI None

S Unchanged

C None

I None

A High
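
As a worked check, the metrics above (AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H) reproduce the 7.5 base score under the CVSS v3.1 base equations. The sketch below covers only the Scope Unchanged case that applies here; the function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}  # weights when Scope is Unchanged
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest one-decimal number >= value."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a) -> float:
    """Base score for a vector with Scope: Unchanged."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR_UNCHANGED[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score_scope_unchanged("N", "L", "N", "N", "N", "N", "H")
print(score)  # 7.5
```

The impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is 8.22 × 0.85 × 0.77 × 0.85 × 0.85 ≈ 3.89; their sum, about 7.48, rounds up to 7.5.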

Recommended Action

5 steps

1. PATCH: Upgrade xgrammar to >= 0.1.21 (fix commit b943feac). Check transitive dependencies, because vLLM and similar frameworks pin xgrammar internally.
2. WORKAROUND (if patching delayed): Disable or gate structured output endpoints behind authentication; reject user-supplied grammars at the API gateway layer.
3. DETECTION: Monitor for unexpected inference worker crashes or stack overflow signals (SIGSEGV, SIGABRT) in process logs. Alert on abnormal grammar submission rates.
4. VALIDATION: Add grammar size/depth limits at the application layer before passing to xgrammar as defense-in-depth.
5. VERIFY: Run `pip show xgrammar` across all inference environments to confirm version.
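
The size/depth limits in the validation step could look roughly like the sketch below, which pre-screens a user-supplied JSON Schema before it is handed to xgrammar. Everything here is an assumption for illustration: the limits `MAX_BYTES` and `MAX_DEPTH`, the `GrammarRejected` exception, and the function names are hypothetical, and the check does not call xgrammar at all. It measures literal nesting only, so it does not detect recursion expressed through `$ref`, and EBNF grammars would need a separate check.

```python
import json

MAX_BYTES = 64 * 1024  # hypothetical size limit; tune per deployment
MAX_DEPTH = 32         # hypothetical nesting limit; tune per deployment


class GrammarRejected(ValueError):
    """Raised when a user-supplied schema fails the pre-screen."""


def nesting_depth(node, limit: int) -> int:
    """Measure dict/list nesting iteratively (no recursion in the guard itself).

    Stops early and returns a value above `limit` once the limit is exceeded.
    """
    deepest = 0
    stack = [(node, 1)]
    while stack:
        current, depth = stack.pop()
        if isinstance(current, dict):
            children = list(current.values())
        elif isinstance(current, list):
            children = current
        else:
            continue  # scalars add no nesting
        deepest = max(deepest, depth)
        if deepest > limit:
            return deepest
        stack.extend((child, depth + 1) for child in children)
    return deepest


def check_schema(raw: str):
    """Return the parsed schema, or raise GrammarRejected."""
    if len(raw.encode("utf-8")) > MAX_BYTES:
        raise GrammarRejected("schema too large")
    try:
        schema = json.loads(raw)
    except (ValueError, RecursionError) as exc:
        # json.JSONDecodeError is a ValueError; very deep input can also
        # exhaust Python's own recursion limit inside the JSON parser.
        raise GrammarRejected(f"schema not parseable: {exc}") from exc
    if nesting_depth(schema, MAX_DEPTH) > MAX_DEPTH:
        raise GrammarRejected("schema nested too deeply")
    return schema
```

A gateway or request handler would call `check_schema` on the incoming schema and return a 4xx response on `GrammarRejected`. This is defense-in-depth only and does not replace the upgrade.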

CISA SSVC Assessment

Decision Track

Exploitation none

Automatable No

Technical Impact partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Classification

DoS Supply Chain Framework Inference

AML.T0010.001 - AI Software

AML.T0029 - Denial of AI Service

AML.T0049 - Exploit Public-Facing Application

Compliance Impact

This CVE is relevant to:

EU AI Act

Article 15 - Accuracy, robustness and cybersecurity

ISO 42001

A.9.3 - Operational processes for AI systems

NIST AI RMF

MANAGE-2.2 - Mechanisms to address AI risks and harms

OWASP LLM Top 10

LLM04 - Model Denial of Service

Frequently Asked Questions

What is CVE-2025-57809?

CVE-2025-57809 is a remote, unauthenticated denial-of-service vulnerability in xgrammar versions before 0.1.21. A crafted user-supplied grammar crashes the inference server of any LLM service that uses xgrammar for structured/constrained generation (vLLM, MLC-LLM, llama.cpp wrappers). The fix is to upgrade to xgrammar >= 0.1.21; if patching is blocked, disable structured output / grammar endpoints as a temporary workaround.

Is CVE-2025-57809 actively exploited?

The CISA SSVC assessment records no observed exploitation of CVE-2025-57809. However, proof-of-concept exploit code is publicly available, which increases the risk of exploitation.

How to fix CVE-2025-57809?

1. PATCH: Upgrade xgrammar to >= 0.1.21 (fix commit b943feac). Check transitive dependencies, because vLLM and similar frameworks pin xgrammar internally.
2. WORKAROUND (if patching delayed): Disable or gate structured output endpoints behind authentication; reject user-supplied grammars at the API gateway layer.
3. DETECTION: Monitor for unexpected inference worker crashes or stack overflow signals (SIGSEGV, SIGABRT) in process logs. Alert on abnormal grammar submission rates.
4. VALIDATION: Add grammar size/depth limits at the application layer before passing to xgrammar as defense-in-depth.
5. VERIFY: Run `pip show xgrammar` across all inference environments to confirm version.

What systems are affected by CVE-2025-57809?

The affected package is xgrammar (pip) in versions before 0.1.21. The vulnerability affects the following AI/ML architecture patterns: LLM inference serving, model serving, agent frameworks, structured output pipelines.

What is the CVSS score for CVE-2025-57809?

CVE-2025-57809 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.03%.

Technical Details

NVD Description

Summary: This issue (http://github.com/mlc-ai/xgrammar/issues/250) should have its own security advisory. Several tools accept user-supplied grammars and pass them to xgrammar, and the issue is so easy to trigger that it seems like a High.

Exploitation Scenario

An attacker discovers an LLM API endpoint supporting structured output (e.g., OpenAI-compatible `/v1/chat/completions` with `response_format: {type: json_schema}`). They submit a crafted JSON Schema or EBNF grammar containing deeply nested or self-referential recursive rules. xgrammar's parser processes the grammar without recursion depth limits, triggering a stack overflow that crashes the inference worker process. On vLLM or similar multi-worker setups, the attacker repeats the request to exhaust all worker processes, causing full service unavailability. No authentication, prior access, or AI/ML expertise is required, and a single HTTP request is enough to crash a worker.
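
The crash signature in this scenario (a worker killed by SIGSEGV or SIGABRT) can be recognized from the worker's exit status. The sketch below assumes a POSIX supervisor that starts workers through Python's `subprocess` module, where a process killed by signal N reports a return code of -N. The `classify_exit` helper and its labels are hypothetical, and a crash signal alone does not prove that a grammar caused it.

```python
import signal

# Signals the advisory associates with a stack-overflow crash.
CRASH_SIGNALS = {signal.SIGSEGV, signal.SIGABRT}


def classify_exit(returncode: int) -> str:
    """Map a subprocess return code to a short label for alerting."""
    if returncode == 0:
        return "clean exit"
    if returncode > 0:
        return f"error exit {returncode}"
    # Negative values mean the process was terminated by a signal (POSIX).
    try:
        sig = signal.Signals(-returncode)
    except ValueError:
        return f"killed by unknown signal {-returncode}"
    if sig in CRASH_SIGNALS:
        return f"crash ({sig.name}): possible grammar-triggered stack overflow"
    return f"killed by {sig.name}"


if __name__ == "__main__":
    for code in (0, 1, -int(signal.SIGSEGV), -int(signal.SIGTERM)):
        print(code, "->", classify_exit(code))
```

Feeding `Popen.returncode` values through a classifier like this, and correlating crash labels with recent grammar submissions, gives a starting point for the alerting described in the detection step.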