CVE-2026-7669: SGLang HuggingFace tokenizer deserialization

CISO Take

SGLang up to version 0.5.9 contains an unsafe deserialization flaw in its HuggingFace tokenizer loading path that allows unauthenticated remote attackers to execute arbitrary code by sending crafted serialized payloads to exposed inference endpoints. The CVSS score is medium (5.6) and the EPSS exploitation probability is low (0.37%), but with 7,841 downstream dependents the blast radius across LLM inference deployments is substantial: a compromised SGLang server typically has access to loaded models, inference secrets, and internal network segments. Compounding the risk, the vendor did not respond to disclosure, no patch is available (the affected range is all versions ≤ 0.5.9, with no patched version listed), and the OpenSSF scorecard of 4.9/10 signals a weak supply chain security posture for this package. Immediate action: audit all SGLang deployments, restrict network access to inference endpoints to trusted sources only, and evaluate replacing SGLang with an alternative inference server until an official fix is released.

Sources: NVD, EPSS, GitHub Advisory, OpenSSF, ATLAS

What is the risk?

Despite a medium CVSS (5.6) and a low EPSS probability (0.37%), the combination of no available patch, vendor non-response, a public PoC, and 7,841 downstream dependents elevates operational risk for organizations running SGLang inference infrastructure. Attack complexity is rated HIGH (AC:H), which reduces the immediate risk of commodity exploitation, but the public PoC repository (github.com/gouldnicholas/CVE-2026-7669-PoC) lowers the bar for targeted attacks. LLM inference servers are high-value targets with broad internal access. The package's history of 28 CVEs and its 4.9/10 OpenSSF score indicate systemic security debt.

How does the attack unfold?

Endpoint Discovery

Adversary identifies an exposed SGLang inference API endpoint via network scanning or OSINT on the target organization's ML infrastructure.

AML.T0006

Payload Crafting

Using the public PoC as reference, adversary crafts a malicious serialized Python object designed to trigger arbitrary code execution when deserialized by SGLang's get_tokenizer function.

AML.T0016.000

Deserialization Exploitation

Adversary submits the crafted payload to the SGLang inference endpoint; the HuggingFace tokenizer handler deserializes it without validation, executing the embedded malicious code on the inference server.

AML.T0049

Inference Host Compromise

With code execution on the inference server, adversary exfiltrates model weights, API keys, and environment credentials, then pivots to connected internal systems.

AML.T0112

What systems are affected?

Package	Ecosystem	Vulnerable Range	Patched
sglang	pip	<= 0.5.9	No patch

Do you use sglang 0.5.9 or earlier? You're affected.
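One way to check a Python environment against the vulnerable range above is to compare the installed version with 0.5.9. The following is a minimal sketch, not an official detection tool. It assumes the pip distribution is named sglang, and the helper names (parse_release, is_affected, installed_sglang_version) are hypothetical. Pre-release and post-release suffixes are ignored, so 0.5.9rc1 and 0.5.9.post1 are both treated as affected, which errs on the side of caution.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Last version listed as vulnerable in the advisory (<= 0.5.9).
LAST_AFFECTED = (0, 5, 9)


def parse_release(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric release segment, e.g. '0.5.9rc1' -> (0, 5, 9)."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_affected(version: str) -> bool:
    """Return True when the release segment is <= 0.5.9."""
    release = parse_release(version)
    width = max(len(release), len(LAST_AFFECTED))
    # Pad with zeros so that '0.5' compares as (0, 5, 0).
    padded = release + (0,) * (width - len(release))
    limit = LAST_AFFECTED + (0,) * (width - len(LAST_AFFECTED))
    return padded <= limit


def installed_sglang_version() -> Optional[str]:
    """Return the installed version of the 'sglang' distribution, or None."""
    try:
        return metadata.version("sglang")  # distribution name is an assumption
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_sglang_version()
    if found is None:
        print("sglang is not installed in this environment")
    else:
        print(f"sglang {found}: affected={is_affected(found)}")
```

Running the script inside each inference image or virtual environment gives a quick per-host answer; it does not find copies vendored into other packages.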

How severe is it?

CVSS 3.1: 5.6 / 10

EPSS: 0.4% chance of exploitation in 30 days (higher than 28% of all CVEs)

Source: EPSS v3 — FIRST.org

Exploitation Status: Exploit Available

Exploitation: MEDIUM

Sophistication: Advanced

Exploitation Confidence: medium

○ CISA SSVC: Public PoC

○ Public PoC indexed (trickest/cve)

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

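Attack Vector (AV): Network

Attack Complexity (AC): High

Privileges Required (PR): None

User Interaction (UI): None

Scope (S): Unchanged

Confidentiality (C): Low

Integrity (I): Low

Availability (A): Low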
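The metrics above are enough to reproduce the 5.6 base score. The sketch below applies the CVSS v3.1 base-score equations and metric weights as published by FIRST. The vector string is assembled here from the listed metrics, since the page does not print it, and the function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required weights depend on Scope (U = unchanged, C = changed).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a vector such as 'CVSS:3.1/AV:N/...'."""
    # Skip the leading 'CVSS:3.1' element and map metric -> value.
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    scope = metrics["S"]

    iss = 1 - (
        (1 - WEIGHTS["C"][metrics["C"]])
        * (1 - WEIGHTS["I"][metrics["I"]])
        * (1 - WEIGHTS["A"][metrics["A"]])
    )
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15

    exploitability = (
        8.22
        * WEIGHTS["AV"][metrics["AV"]]
        * WEIGHTS["AC"][metrics["AC"]]
        * PR_WEIGHTS[scope][metrics["PR"]]
        * WEIGHTS["UI"][metrics["UI"]]
    )

    if impact <= 0:
        return 0.0
    if scope == "U":
        return roundup(min(impact + exploitability, 10))
    return roundup(min(1.08 * (impact + exploitability), 10))


if __name__ == "__main__":
    # Vector assembled from the metrics listed for CVE-2026-7669.
    print(base_score("CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:L/I:L/A:L"))  # 5.6
```

For this CVE the impact sub-score is about 3.37 and the exploitability sub-score about 2.22; their sum, 5.59, rounds up to 5.6. The High attack complexity weight (0.44 instead of 0.77) accounts for most of the gap to a higher rating.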

What should I do?

1. PATCH: No official patch exists for sglang ≤ 0.5.9. Monitor the package repository for a patched release and upgrade immediately upon availability.
2. NETWORK CONTROLS: Restrict SGLang inference API endpoints to trusted internal IP ranges via firewall rules or service mesh policies; exploiting this CVE requires network access.
3. ISOLATION: Run SGLang in isolated containers with minimal filesystem and network permissions; avoid storing credentials or API keys in the same environment.
4. INPUT VALIDATION: If modifying source is feasible, add strict allow-listing of tokenizer sources to reject untrusted serialized inputs before they reach get_tokenizer.
5. DETECTION: Monitor for anomalous process spawning from SGLang worker processes, unexpected outbound network connections from inference hosts, and unusual file system writes. These are indicators of post-exploitation activity.
6. INVENTORY: Use the GitHub advisory (GHSA-6m5f-673f-5vh7) as the reference for identifying affected deployments across your environment.
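The input validation step could look roughly like the sketch below. It is an illustration under stated assumptions, not a patch for SGLang: the advisory does not describe how get_tokenizer receives its argument, so the check assumes the tokenizer source arrives as a string that is either a hub repository ID or a local path. The allow-list contents, the /opt/models root, and the function name are all hypothetical.

```python
from pathlib import Path

# Hypothetical allow-list: replace with the repository IDs you actually serve.
ALLOWED_REPO_IDS = frozenset({"example-org/example-model"})

# Hypothetical directory that holds vetted local tokenizers.
ALLOWED_LOCAL_ROOT = Path("/opt/models")


def is_allowed_tokenizer_source(source: str) -> bool:
    """Accept only allow-listed repo IDs or paths under the vetted root."""
    if source in ALLOWED_REPO_IDS:
        return True

    candidate = Path(source)
    if not candidate.is_absolute():
        # Unknown repo IDs and relative paths are rejected outright.
        return False

    # Resolve '..' segments and symlinks before comparing against the root.
    resolved = candidate.resolve()
    root = ALLOWED_LOCAL_ROOT.resolve()
    return resolved == root or root in resolved.parents


def checked_source(source: str) -> str:
    """Return the source unchanged, or raise before any loading is attempted."""
    if not is_allowed_tokenizer_source(source):
        raise ValueError(f"tokenizer source not allow-listed: {source!r}")
    return source
```

A deployment would call checked_source on the incoming value before handing it to the tokenizer loader. Rejecting by default keeps the check useful even when the exact injection route is unknown.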

What does CISA's SSVC say?

Decision: Track*

Exploitation: poc

Automatable: No

Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

How is it classified?

Supply Chain, Code Execution, Framework, Inference

AML.T0010.001 - AI Software
AML.T0049 - Exploit Public-Facing Application
AML.T0112 - Machine Compromise

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act

Article 15 - Accuracy, Robustness and Cybersecurity
Article 9 - Risk Management System

ISO 42001

6.1.2 - AI Risk Assessment
8.2 - AI System Design and Development

NIST AI RMF

MANAGE 2.2 - Mechanisms for AI Risk Treatment
MAP 1.6 - Third-Party Risk

OWASP LLM Top 10

LLM03 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2026-7669?

SGLang up to version 0.5.9 contains an unsafe deserialization flaw in its HuggingFace tokenizer loading path that allows unauthenticated remote attackers to execute arbitrary code by sending crafted serialized payloads to exposed inference endpoints. The CVSS score is medium (5.6) and the EPSS exploitation probability is low (0.37%), but with 7,841 downstream dependents the blast radius across LLM inference deployments is substantial: a compromised SGLang server typically has access to loaded models, inference secrets, and internal network segments. Compounding the risk, the vendor did not respond to disclosure, no patch is available (the affected range is all versions ≤ 0.5.9, with no patched version listed), and the OpenSSF scorecard of 4.9/10 signals a weak supply chain security posture for this package. Immediate action: audit all SGLang deployments, restrict network access to inference endpoints to trusted sources only, and evaluate replacing SGLang with an alternative inference server until an official fix is released.

Is CVE-2026-7669 actively exploited?

CISA's SSVC assessment rates exploitation as "poc" rather than active. Proof-of-concept exploit code is publicly available for CVE-2026-7669, which increases the risk of exploitation.

How to fix CVE-2026-7669?

1. PATCH: No official patch exists for sglang ≤ 0.5.9. Monitor the package repository for a patched release and upgrade immediately upon availability.
2. NETWORK CONTROLS: Restrict SGLang inference API endpoints to trusted internal IP ranges via firewall rules or service mesh policies; exploiting this CVE requires network access.
3. ISOLATION: Run SGLang in isolated containers with minimal filesystem and network permissions; avoid storing credentials or API keys in the same environment.
4. INPUT VALIDATION: If modifying source is feasible, add strict allow-listing of tokenizer sources to reject untrusted serialized inputs before they reach get_tokenizer.
5. DETECTION: Monitor for anomalous process spawning from SGLang worker processes, unexpected outbound network connections from inference hosts, and unusual file system writes. These are indicators of post-exploitation activity.
6. INVENTORY: Use the GitHub advisory (GHSA-6m5f-673f-5vh7) as the reference for identifying affected deployments across your environment.
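The process-spawning part of the detection step can be sketched as a comparison against a baseline. The example below is a simplified illustration, not a monitoring product. It assumes a process snapshot (PID, parent PID, name) is already being collected by some other tool, and that the legitimate children of an SGLang worker are known. The record type, the baseline set, and the function name are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class ProcessRecord(NamedTuple):
    """One row of a process snapshot collected elsewhere (hypothetical shape)."""
    pid: int
    parent_pid: int
    name: str


# Hypothetical baseline: process names a worker is expected to spawn.
EXPECTED_CHILD_NAMES = frozenset({"python", "python3"})


def unexpected_children(
    worker_pids: Iterable[int], snapshot: Iterable[ProcessRecord]
) -> List[ProcessRecord]:
    """Return direct children of the given workers that are not in the baseline."""
    workers = set(worker_pids)
    return [
        proc
        for proc in snapshot
        if proc.parent_pid in workers and proc.name not in EXPECTED_CHILD_NAMES
    ]


if __name__ == "__main__":
    snapshot = [
        ProcessRecord(100, 1, "python3"),    # the worker itself
        ProcessRecord(101, 100, "python3"),  # expected child
        ProcessRecord(102, 100, "sh"),       # shell spawned by the worker
    ]
    for proc in unexpected_children([100], snapshot):
        print(f"alert: worker spawned {proc.name} (pid {proc.pid})")
```

A shell or download utility appearing under an inference worker is the kind of signal worth alerting on. The sketch only inspects direct children, so a real deployment would also need to follow grandchildren and pair this with network and file-write monitoring.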

What systems are affected by CVE-2026-7669?

This vulnerability affects the following AI/ML architecture patterns: LLM inference serving, Model serving pipelines, Multi-model inference backends, RAG pipelines using SGLang as inference layer, Fine-tuned model deployment workflows.

What is the CVSS score for CVE-2026-7669?

CVE-2026-7669 has a CVSS v3.1 base score of 5.6 (MEDIUM). The EPSS exploitation probability is 0.37%.

What is the AI security impact?

Affected AI Architectures

LLM inference serving, Model serving pipelines, Multi-model inference backends, RAG pipelines using SGLang as inference layer, Fine-tuned model deployment workflows

MITRE ATLAS Techniques

AML.T0010.001 AI Software

AML.T0049 Exploit Public-Facing Application

AML.T0112 Machine Compromise

Compliance Controls Affected

EU AI Act: Article 15, Article 9

ISO 42001: 6.1.2, 8.2

NIST AI RMF: MANAGE 2.2, MAP 1.6

OWASP LLM Top 10: LLM03

What are the technical details?

Original Advisory

A vulnerability was detected in sgl-project SGLang up to 0.5.9. Impacted is the function get_tokenizer of the file python/sglang/srt/utils/hf_transformers_utils.py of the component HuggingFace Transformer Handler. The manipulation results in deserialization. The attack can be executed remotely. A high complexity level is associated with this attack. The exploitability is considered difficult. The vendor was contacted early about this disclosure but did not respond in any way.

Exploitation Scenario

An adversary identifies an organization's SGLang inference endpoint, exposed either on an internal network or to the internet, through scanning or OSINT. Using the public PoC as a reference, they craft a malicious serialized Python object targeting the get_tokenizer deserialization path and submit it as a model or tokenizer identifier via the SGLang API. Upon deserialization, the payload executes arbitrary code in the context of the SGLang inference process. The attacker then establishes persistence on the inference host, exfiltrates loaded model weights and environment-stored API keys (HuggingFace tokens, cloud credentials), and pivots to adjacent internal systems using the compromised server's network access.
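The scenario above depends on a deserializer that will resolve and call arbitrary code referenced by the payload. The advisory does not name the serialization format, so the sketch below assumes Python's pickle purely for illustration. It shows the defensive pattern from the standard library documentation: subclass pickle.Unpickler and override find_class so that no global (class or function) can be resolved, leaving only plain data types loadable. The names DataOnlyUnpickler and safe_loads are hypothetical.

```python
import io
import pickle


class DataOnlyUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global name.

    Plain containers and scalars (dict, list, str, int, ...) never go through
    find_class, so they still load; anything that references a class or a
    function is rejected before it can be called.
    """

    def find_class(self, module, name):
        raise pickle.UnpicklingError(
            f"global '{module}.{name}' is forbidden in untrusted input"
        )


def safe_loads(data: bytes):
    """Deserialize data while blocking every global lookup."""
    return DataOnlyUnpickler(io.BytesIO(data)).load()


if __name__ == "__main__":
    # Plain data round-trips normally.
    print(safe_loads(pickle.dumps({"vocab": ["a", "b"], "size": 2})))

    # A payload that references a callable (here the harmless built-in
    # print) is rejected instead of being resolved.
    try:
        safe_loads(pickle.dumps(print))
    except pickle.UnpicklingError as exc:
        print("rejected:", exc)
```

Restricting globals narrows what pickle can do but is not a substitute for avoiding it on untrusted input; a data-only format such as JSON removes the code-resolution step entirely.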

Weaknesses (CWE)

CWE-20 Improper Input Validation (Primary)
CWE-502 Deserialization of Untrusted Data (Primary)
CWE-74 Improper Neutralization of Special Elements in Output Used by a Downstream Component ('Injection') (Primary)

CWE-20 — Improper Input Validation: The product receives input or data, but it does not validate or incorrectly validates that the input has the properties that are required to process the data safely and correctly.

[Architecture and Design] Consider using language-theoretic security (LangSec) techniques that characterize inputs using a formal language and build "recognizers" for that language. This effectively requires parsing to be a distinct layer that effectively enforces a boundary between raw input and internal data representations, instead of allowing parser code to be scattered throughout the program, where it could be subject to errors or inconsistencies that create weaknesses. [REF-1109] [REF-1110] [REF-1111]
[Architecture and Design] Use an input validation framework such as Struts or the OWASP ESAPI Validation API. Note that using a framework does not automatically address all input validation problems; be mindful of weaknesses that could arise from misusing the framework itself (CWE-1173).

Source: MITRE CWE corpus.