CVE-2026-7669: SGLang: deserialization in tokenizer loader enables RCE

GHSA-6m5f-673f-5vh7 | MEDIUM | PoC available | CISA: Track*
Published May 2, 2026
CISO Take

SGLang up to version 0.5.9 contains an unsafe deserialization flaw in its HuggingFace tokenizer loading path (the get_tokenizer function). The attack can be launched remotely without authentication and may lead to arbitrary code execution when crafted serialized payloads reach an exposed inference endpoint. The CVSS score is medium (5.6) and EPSS estimates only a 0.37% probability of exploitation within 30 days, but a public PoC exists and, with 7,841 downstream dependents, the blast radius across LLM inference deployments is substantial: a compromised SGLang server typically has access to loaded models, inference secrets, and internal network segments. Compounding the risk, the vendor did not respond to disclosure, no patch is available (all versions ≤ 0.5.9 are affected and no patched version is listed), and the OpenSSF scorecard of 4.9/10 signals a weak supply chain security posture for this package. Immediate action: audit all SGLang deployments, restrict network access to inference endpoints to trusted sources only, and evaluate moving to an alternative inference server until an official fix is released.

Sources: NVD, EPSS, GitHub Advisory, OpenSSF, ATLAS

What is the risk?

Despite a medium CVSS score (5.6) and a low EPSS probability (0.37%), the combination of no available patch, vendor non-response, and 7,841 downstream dependents elevates operational risk for organizations running SGLang inference infrastructure. Attack complexity is rated High (AC:H), which reduces the risk of immediate commodity exploitation, but a public PoC repository (github.com/gouldnicholas/CVE-2026-7669-PoC) exists and lowers the bar for targeted attacks. LLM inference servers are high-value targets with broad internal access. The package's history of 28 CVEs and its 4.9/10 OpenSSF score indicate systemic security debt.

How does the attack unfold?

  1. Endpoint Discovery (AML.T0006)

    Adversary identifies an exposed SGLang inference API endpoint via network scanning or OSINT on the target organization's ML infrastructure.

  2. Payload Crafting (AML.T0016.000)

    Using the public PoC as reference, adversary crafts a malicious serialized Python object designed to trigger arbitrary code execution when deserialized by SGLang's get_tokenizer function.

  3. Deserialization Exploitation (AML.T0049)

    Adversary submits the crafted payload to the SGLang inference endpoint; the HuggingFace tokenizer handler deserializes it without validation, executing the embedded malicious code on the inference server.

  4. Inference Host Compromise (AML.T0112)

    With code execution on the inference server, adversary exfiltrates model weights, API keys, and environment credentials, then pivots to connected internal systems.

What systems are affected?

Package   Ecosystem   Vulnerable Range   Patched
sglang    pip         <= 0.5.9           No patch

161.8K | OpenSSF 6.4 | 8.3K dependents | Pushed 4d ago | 39% patched | ~97d to patch

Do you use SGLang 0.5.9 or earlier? You're affected.

How severe is it?

CVSS 3.1: 5.6 / 10
EPSS: 0.4% chance of exploitation in 30 days (higher than 28% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Advanced
Exploitation Confidence: medium
CISA SSVC: Public PoC
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

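Attack Vector (AV): Network
Attack Complexity (AC): High
Privileges Required (PR): None
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): Low
Integrity (I): Low
Availability (A): Low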

What should I do?

6 steps
  1. PATCH

    No official patch exists for sglang ≤ 0.5.9. Monitor the package repository for a patched release and upgrade immediately upon availability.

  2. NETWORK CONTROLS

    Restrict SGLang inference API endpoints to trusted internal IP ranges via firewall rules or service mesh policies — this CVE requires network access.

  3. ISOLATION

    Run SGLang in isolated containers with minimal filesystem and network permissions; avoid storing credentials or API keys in the same environment.

  4. INPUT VALIDATION

    If modifying source is feasible, add strict allow-listing of tokenizer sources to reject untrusted serialized inputs before they reach get_tokenizer.

  5. DETECTION

    Monitor for anomalous process spawning from SGLang worker processes, unexpected outbound network connections from inference hosts, and unusual file system writes — these are indicators of post-exploitation activity.

  6. INVENTORY

    Use the advisory identifier (GHSA-6m5f-673f-5vh7) and the affected version range (sglang ≤ 0.5.9) to identify affected deployments across your environment.
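
The inventory step can be approximated on each host with a version check. The following is a minimal sketch, assuming the package is installed under the distribution name sglang and that comparing the numeric release segment of the version string is sufficient; the helper names are illustrative and not part of any SGLang tooling.

```python
import re
from importlib import metadata

# Per the advisory, every release up to and including 0.5.9 is affected.
LAST_AFFECTED = (0, 5, 9)


def parse_release(version: str) -> tuple:
    """Return the leading numeric release segment, e.g. '0.5.9.post1' -> (0, 5, 9)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_affected(version: str) -> bool:
    """True if the release segment is <= 0.5.9 (suffixes such as .post1 are ignored)."""
    release = parse_release(version)
    width = max(len(release), len(LAST_AFFECTED))
    # Pad with zeros so that '0.5.9.0' compares equal to '0.5.9'.
    padded = release + (0,) * (width - len(release))
    limit = LAST_AFFECTED + (0,) * (width - len(LAST_AFFECTED))
    return padded <= limit


def installed_sglang_version():
    """Return the installed sglang version string, or None if it is not installed."""
    try:
        return metadata.version("sglang")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_sglang_version()
    if found is None:
        print("sglang is not installed in this environment")
    elif is_affected(found):
        print(f"sglang {found} is within the affected range (<= 0.5.9)")
    else:
        print(f"sglang {found} is outside the affected range")
```

Run it inside each virtual environment or container image that may bundle SGLang, since a host-level check will not see packages installed in isolated environments.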

What does CISA's SSVC say?

Decision: Track*
Exploitation: poc
Automatable: No
Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
  • Article 15 - Accuracy, Robustness and Cybersecurity
  • Article 9 - Risk Management System
ISO 42001
  • 6.1.2 - AI Risk Assessment
  • 8.2 - AI System Design and Development
NIST AI RMF
  • MANAGE 2.2 - Mechanisms for AI Risk Treatment
  • MAP 1.6 - Third-Party Risk
OWASP LLM Top 10
  • LLM03 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2026-7669?

CVE-2026-7669 is an unsafe deserialization flaw in SGLang versions up to 0.5.9, located in the get_tokenizer function of the HuggingFace tokenizer loading path. The attack can be launched remotely without authentication and may lead to arbitrary code execution on the inference server when crafted serialized payloads reach an exposed endpoint. The CVSS score is medium (5.6) and the EPSS exploitation probability is 0.37%, but a public PoC exists, the vendor did not respond to disclosure, and no patched version is listed. With 7,841 downstream dependents and an OpenSSF scorecard of 4.9/10, the potential blast radius across LLM inference deployments is substantial.

Is CVE-2026-7669 actively exploited?

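The sources cited here do not report active exploitation; CISA's SSVC assessment lists the exploitation status as "poc". Proof-of-concept exploit code is publicly available for CVE-2026-7669, which increases the risk of exploitation.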

How to fix CVE-2026-7669?

1. PATCH: No official patch exists for sglang ≤ 0.5.9. Monitor the package repository for a patched release and upgrade immediately upon availability.
2. NETWORK CONTROLS: Restrict SGLang inference API endpoints to trusted internal IP ranges via firewall rules or service mesh policies, since this CVE requires network access.
3. ISOLATION: Run SGLang in isolated containers with minimal filesystem and network permissions; avoid storing credentials or API keys in the same environment.
4. INPUT VALIDATION: If modifying source is feasible, add strict allow-listing of tokenizer sources to reject untrusted serialized inputs before they reach get_tokenizer.
5. DETECTION: Monitor for anomalous process spawning from SGLang worker processes, unexpected outbound network connections from inference hosts, and unusual file system writes. These are indicators of post-exploitation activity.
6. INVENTORY: Use the advisory identifier (GHSA-6m5f-673f-5vh7) and the affected version range (sglang ≤ 0.5.9) to identify affected deployments across your environment.
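
The input-validation step can be sketched as an exact-match allow-list applied before any tokenizer source is handed to the loader. This is a minimal sketch under stated assumptions: the identifiers in ALLOWED_TOKENIZER_SOURCES are placeholders, and validate_tokenizer_source is a hypothetical wrapper, not an SGLang API. Where such a check would be wired into a deployment depends on how that deployment passes tokenizer identifiers.

```python
# Placeholder identifiers; replace with the tokenizer sources your team has vetted.
ALLOWED_TOKENIZER_SOURCES = frozenset(
    {
        "example-org/approved-model",
        "/models/internal/approved-tokenizer",
    }
)


class UntrustedTokenizerSource(ValueError):
    """Raised when a tokenizer source is not on the allow-list."""


def validate_tokenizer_source(source, allowed=ALLOWED_TOKENIZER_SOURCES):
    """Return the source unchanged if it is allow-listed, otherwise raise.

    Matching is exact on purpose: no prefix matching, no path normalization,
    so inputs such as '/models/internal/../other' are rejected outright.
    """
    if not isinstance(source, str):
        raise UntrustedTokenizerSource(f"tokenizer source must be a string, got {type(source).__name__}")
    if source not in allowed:
        raise UntrustedTokenizerSource(f"tokenizer source not allow-listed: {source!r}")
    return source


if __name__ == "__main__":
    print(validate_tokenizer_source("example-org/approved-model"))
    try:
        validate_tokenizer_source("unknown-user/unreviewed-tokenizer")
    except UntrustedTokenizerSource as err:
        print(f"rejected: {err}")
```

An allow-list of this kind narrows which tokenizer artifacts can be loaded at all; it does not make loading an untrusted artifact safe, so it complements rather than replaces the network and isolation controls.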

What systems are affected by CVE-2026-7669?

This vulnerability affects the following AI/ML architecture patterns: LLM inference serving, Model serving pipelines, Multi-model inference backends, RAG pipelines using SGLang as inference layer, Fine-tuned model deployment workflows.

What is the CVSS score for CVE-2026-7669?

CVE-2026-7669 has a CVSS v3.1 base score of 5.6 (MEDIUM). The EPSS exploitation probability is 0.37%.

What is the AI security impact?

Affected AI Architectures

  • LLM inference serving
  • Model serving pipelines
  • Multi-model inference backends
  • RAG pipelines using SGLang as inference layer
  • Fine-tuned model deployment workflows

MITRE ATLAS Techniques

AML.T0010.001 AI Software
AML.T0049 Exploit Public-Facing Application
AML.T0112 Machine Compromise

Compliance Controls Affected

EU AI Act: Article 15, Article 9
ISO 42001: 6.1.2, 8.2
NIST AI RMF: MANAGE 2.2, MAP 1.6
OWASP LLM Top 10: LLM03

What are the technical details?

Original Advisory

A vulnerability was detected in sgl-project SGLang up to 0.5.9. Impacted is the function get_tokenizer of the file python/sglang/srt/utils/hf_transformers_utils.py of the component HuggingFace Transformer Handler. The manipulation results in deserialization. The attack can be executed remotely. A high complexity level is associated with this attack. The exploitability is considered difficult. The vendor was contacted early about this disclosure but did not respond in any way.
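
The advisory does not name the serialization format involved. Assuming a Python pickle-based path, which is an assumption and not something the advisory confirms, one general way to constrain deserialization is the allow-listing unpickler pattern described in the Python pickle documentation. The sketch below is illustrative only and is not SGLang's code; ALLOWED_GLOBALS and restricted_loads are hypothetical names.

```python
import io
import pickle

# Only these (module, name) pairs may be resolved during unpickling.
# Plain ints, strings, lists, and dicts need no globals and always load.
ALLOWED_GLOBALS = frozenset({("collections", "OrderedDict")})


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to import anything outside ALLOWED_GLOBALS."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Deserialize bytes while rejecting any global that is not allow-listed."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


if __name__ == "__main__":
    import datetime

    print(restricted_loads(pickle.dumps({"ids": [1, 2, 3]})))
    try:
        # datetime.date is not allow-listed, so this load is refused.
        restricted_loads(pickle.dumps(datetime.date(2026, 5, 2)))
    except pickle.UnpicklingError as err:
        print(f"rejected: {err}")
```

The Python documentation is explicit that pickle should not be used on untrusted data; restricting globals reduces the attack surface but is not a substitute for avoiding untrusted serialized input altogether.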

Exploitation Scenario

An adversary identifies an organization's SGLang inference endpoint exposed on an internal network (or internet-facing) through scanning or OSINT. They craft a malicious serialized Python object targeting the get_tokenizer deserialization path — leveraging the public PoC as a reference — and submit it as a model or tokenizer identifier via the SGLang API. Upon deserialization, the payload executes arbitrary code in the context of the SGLang inference process. The attacker then establishes persistence on the inference host, exfiltrates loaded model weights and environment-stored API keys (HuggingFace tokens, cloud credentials), and pivots to adjacent internal systems using the compromised server's network access.

Weaknesses (CWE)

CWE-20 — Improper Input Validation: The product receives input or data, but it does not validate or incorrectly validates that the input has the properties that are required to process the data safely and correctly.

  • [Architecture and Design] Consider using language-theoretic security (LangSec) techniques that characterize inputs using a formal language and build "recognizers" for that language. This effectively requires parsing to be a distinct layer that effectively enforces a boundary between raw input and internal data representations, instead of allowing parser code to be scattered throughout the program, where it could be subject to errors or inconsistencies that create weaknesses. [REF-1109] [REF-1110] [REF-1111]
  • [Architecture and Design] Use an input validation framework such as Struts or the OWASP ESAPI Validation API. Note that using a framework does not automatically address all input validation problems; be mindful of weaknesses that could arise from misusing the framework itself (CWE-1173).

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:L/I:L/A:L
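
The 5.6 base score follows from this vector under the CVSS v3.1 base-score equations. The sketch below reproduces the calculation using the metric weights and rounding rule from the CVSS v3.1 specification; the function names are illustrative, and only base metrics are handled.

```python
import math

# Base metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required weights depend on Scope (U = unchanged, C = changed).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a base-metric vector string."""
    prefix, *pairs = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError(f"unsupported vector prefix: {prefix!r}")
    m = dict(pair.split(":") for pair in pairs)
    scope = m["S"]

    # Impact sub-score.
    iss = 1 - (
        (1 - WEIGHTS["C"][m["C"]])
        * (1 - WEIGHTS["I"][m["I"]])
        * (1 - WEIGHTS["A"][m["A"]])
    )
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15

    exploitability = (
        8.22
        * WEIGHTS["AV"][m["AV"]]
        * WEIGHTS["AC"][m["AC"]]
        * PR_WEIGHTS[scope][m["PR"]]
        * WEIGHTS["UI"][m["UI"]]
    )

    if impact <= 0:
        return 0.0
    if scope == "U":
        return roundup(min(impact + exploitability, 10))
    return roundup(min(1.08 * (impact + exploitability), 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:L/I:L/A:L"))  # 5.6
```

For this vector the impact sub-score is about 3.37 and the exploitability sub-score about 2.22, giving 5.59 before rounding up to 5.6. The High attack complexity (weight 0.44 instead of 0.77) and the Low impact ratings are what keep the score in the medium band.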

Timeline

Published: May 2, 2026
Last Modified: May 7, 2026
First Seen: May 2, 2026
