CVE-2021-41225: TensorFlow Grappler: uninitialized var, local priv-esc

HIGH PoC AVAILABLE
Published November 5, 2021
CISO Take

A crafted TensorFlow SavedModel lacking a Dequeue node triggers an uninitialized pointer in the Grappler optimizer, enabling local privilege escalation with minimal access requirements. Patch all TensorFlow instances to 2.7.0, 2.6.1, 2.5.2, or 2.4.4—shared GPU training clusters are highest risk due to multi-tenant exposure. Restrict model loading from untrusted sources as an immediate compensating control.

What is the risk?

CVSS 7.8 (HIGH): a local attack vector, low attack complexity, and a low privilege requirement make exploitation realistic in ML infrastructure. Local access is required (AV:L), but shared GPU clusters and Jupyter notebook environments are common in enterprise ML teams, so an attacker who has compromised an ML engineer's account could use this flaw to escalate privileges or move laterally. Full CIA impact (C:H/I:H/A:H) means successful exploitation risks training data exposure, model tampering, and host compromise. The CVE is not in CISA KEV, which suggests no confirmed in-the-wild exploitation to date.

What systems are affected?

Package | Ecosystem | Vulnerable Range | Patched
TensorFlow | pip | releases prior to the patched version on each branch | 2.7.0, 2.6.1, 2.5.2, 2.4.4

If you run a TensorFlow release older than 2.7.0, 2.6.1, 2.5.2, or 2.4.4 on its respective branch, you are affected.

How severe is it?

CVSS 3.1
7.8 / 10
EPSS
0.2%
chance of exploitation in 30 days
Higher than 9% of all CVEs
Exploitation Status
Exploit Available
Exploitation: MEDIUM
Sophistication
Moderate
Exploitation Confidence
medium
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

AV Local
AC Low
PR Low
UI None
S Unchanged
C High
I High
A High

What should I do?

5 steps
  1. PATCH

    Upgrade TensorFlow to 2.7.0, 2.6.1, 2.5.2, or 2.4.4 across all training servers, notebook environments, and serving infrastructure.

  2. AUDIT

    Inventory all TF deployments including transitive dependencies in ML pipelines (TFX, Keras, TF Serving).

  3. RESTRICT

    Enforce least-privilege access to training infrastructure and internal model registries—prevent loading models from untrusted sources.

  4. VALIDATE

    Gate SavedModel ingestion from external or untrusted origins; inspect graph structure before Grappler optimization.

  5. DETECT

    Monitor for anomalous privilege escalation events on ML training hosts and alert on unexpected process spawning from TF optimizer processes.
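
The PATCH and AUDIT steps above can be partly automated. The following is a minimal sketch, not an official tool, that classifies a TensorFlow version string against the patched releases named in this advisory (2.7.0, 2.6.1, 2.5.2, 2.4.4). The names `PATCHED_MIN`, `parse_version`, and `is_patched` are hypothetical. It assumes branches older than 2.4 received no backport and should be treated as unpatched, and it rejects pre-release or local version strings so they get manual review.

```python
import re

# Minimum patched patch-level per (major, minor) branch, taken from this advisory.
PATCHED_MIN = {(2, 4): 4, (2, 5): 2, (2, 6): 1}
# First minor release line that ships the fix from its initial release.
FIRST_FIXED_BRANCH = (2, 7)


def parse_version(version: str) -> tuple:
    """Parse a plain 'X.Y.Z' string; anything else (rc, dev, +local) is rejected."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(version: str) -> bool:
    """Return True if the version is at or above a patched release on its branch."""
    major, minor, patch = parse_version(version)
    if (major, minor) >= FIRST_FIXED_BRANCH:
        return True
    min_patch = PATCHED_MIN.get((major, minor))
    if min_patch is None:
        # Assumption: branches older than 2.4 got no backported fix.
        return False
    return patch >= min_patch


if __name__ == "__main__":
    for candidate in ["2.4.3", "2.5.2", "2.6.0", "2.7.0"]:
        status = "patched" in str(is_patched(candidate)) or is_patched(candidate)
        print(candidate, "patched" if status else "VULNERABLE")
```

In an inventory script, the version string would typically come from `tensorflow.__version__` or from package metadata on each host; how it is collected is left open here.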

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act: Article 9 - Risk Management System
ISO 42001: A.8.1 - AI System Security Controls
NIST AI RMF: MANAGE-2.2 - Risk Response and Treatment
OWASP LLM Top 10: LLM03 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2021-41225?

CVE-2021-41225 is a use of an uninitialized variable (CWE-908) in TensorFlow's Grappler optimizer. If the `train_nodes` vector obtained from the saved model being optimized does not contain a `Dequeue` node, `dequeue_node` is left uninitialized. A crafted SavedModel can trigger this condition. The fix ships in TensorFlow 2.7.0, 2.6.1, 2.5.2, and 2.4.4.

Is CVE-2021-41225 actively exploited?

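No confirmed in-the-wild exploitation is recorded: the CVE is not listed in CISA KEV. However, proof-of-concept exploit code is publicly available (indexed in trickest/cve), which increases the risk of exploitation.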

How to fix CVE-2021-41225?

1. PATCH: Upgrade TensorFlow to 2.7.0, 2.6.1, 2.5.2, or 2.4.4 across all training servers, notebook environments, and serving infrastructure.
2. AUDIT: Inventory all TF deployments including transitive dependencies in ML pipelines (TFX, Keras, TF Serving).
3. RESTRICT: Enforce least-privilege access to training infrastructure and internal model registries, and prevent loading models from untrusted sources.
4. VALIDATE: Gate SavedModel ingestion from external or untrusted origins; inspect graph structure before Grappler optimization.
5. DETECT: Monitor for anomalous privilege escalation events on ML training hosts and alert on unexpected process spawning from TF optimizer processes.

What systems are affected by CVE-2021-41225?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, ML infrastructure, shared GPU clusters, MLOps platforms.

What is the CVSS score for CVE-2021-41225?

CVE-2021-41225 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.19%.

What is the AI security impact?

Affected AI Architectures

training pipelines, model serving, ML infrastructure, shared GPU clusters, MLOps platforms

MITRE ATLAS Techniques

AML.T0010.001 AI Software
AML.T0011.000 Unsafe AI Artifacts
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 9
ISO 42001: A.8.1
NIST AI RMF: MANAGE-2.2
OWASP LLM Top 10: LLM03

What are the technical details?

Original Advisory

TensorFlow is an open source platform for machine learning. In affected versions TensorFlow's Grappler optimizer has a use of uninitialized variable. If the `train_nodes` vector (obtained from the saved model that gets optimized) does not contain a `Dequeue` node, then `dequeue_node` is left uninitialized. The fix will be included in TensorFlow 2.7.0. We will also cherrypick this commit on TensorFlow 2.6.1, TensorFlow 2.5.2, and TensorFlow 2.4.4, as these are also affected and still in supported range.

Exploitation Scenario

An adversary with low-privilege access to a shared GPU training cluster crafts a malicious TensorFlow SavedModel whose `train_nodes` intentionally omit a `Dequeue` node. When a privileged training pipeline or model optimization job loads this artifact through Grappler, the uninitialized `dequeue_node` pointer is used. The result is undefined behavior: at minimum a crash, and potentially memory corruption that an attacker could develop into code execution with the privileges of the optimizing process. In multi-tenant ML environments, which are common in enterprise data science teams, this could let one user's workload compromise the host or other users' training jobs and expose model weights, training data, or credentials stored on the system.

Weaknesses (CWE)

CWE-908 — Use of Uninitialized Resource: The product uses or accesses a resource that has not been initialized.

  • [Implementation] Explicitly initialize the resource before use. If this is performed through an API function or standard procedure, follow all required steps.
  • [Implementation] Pay close attention to complex conditionals that affect initialization, since some branches might not perform the initialization.

Source: MITRE CWE corpus.
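
The two mitigations above (initialize explicitly, and watch conditionals that may skip initialization) can be illustrated with a small Python analogue of the pattern the advisory describes. This is a sketch for illustration only, not TensorFlow's code or its actual fix. The `Node` type, the function names, and the rule that any op type containing "Dequeue" counts as a Dequeue node are all assumptions.

```python
from typing import NamedTuple, Optional, Sequence


class Node(NamedTuple):
    """Hypothetical stand-in for a graph node: a name and an op type."""
    name: str
    op: str


def find_dequeue_node(train_nodes: Sequence[Node]) -> Optional[Node]:
    """Search train_nodes for a Dequeue-style op.

    The result is initialized to None up front, so the "not found" branch
    yields a well-defined value instead of whatever was left uninitialized.
    """
    dequeue_node: Optional[Node] = None  # explicit initialization
    for node in train_nodes:
        # Assumption: an op type containing "Dequeue" is treated as a Dequeue node.
        if "Dequeue" in node.op:
            dequeue_node = node
            break
    return dequeue_node


def require_dequeue_node(train_nodes: Sequence[Node]) -> Node:
    """Fail loudly when no Dequeue node exists, rather than continuing."""
    node = find_dequeue_node(train_nodes)
    if node is None:
        raise ValueError("no Dequeue node found in train_nodes")
    return node
```

A caller that needs the node would use `require_dequeue_node` and handle the `ValueError`, so a graph without a `Dequeue` node is rejected before any code relies on the result.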

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
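
To show how this vector yields the 7.8 base score, here is a minimal sketch of the CVSS v3.1 base score arithmetic using the metric weights and rounding rule from the CVSS v3.1 specification. It handles base metrics only (no temporal or environmental metrics), and the function names `roundup` and `base_score` are chosen here for illustration.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a 'CVSS:3.1/...' base vector string."""
    parts = vector.split("/")
    if parts[0] != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    metrics = dict(part.split(":") for part in parts[1:])
    changed = metrics["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]

    # Impact sub-score: combines the C, I, and A weights.
    iss = 1 - (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    exploitability = 8.22 * AV[metrics["AV"]] * AC[metrics["AC"]] * pr * UI[metrics["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H"))  # 7.8
```

For this vector the impact sub-score is about 5.87 and the exploitability sub-score about 1.83; their sum, about 7.71, rounds up to 7.8.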

Timeline

Published
November 5, 2021
Last Modified
November 21, 2024
First Seen
November 5, 2021
