CVE-2021-37690: TensorFlow: use-after-free crashes training processes

MEDIUM
Published August 13, 2021
CISO Take

A use-after-free in TensorFlow's shape inference engine allows a local attacker with minimal privileges to crash TF processes via segfault. Patch to TF 2.6.0, 2.5.1, 2.4.3, or 2.3.4 immediately on training infrastructure. No active exploitation known, but unpatched training clusters are exposed to denial-of-service against long-running jobs.

Risk Assessment

Medium risk in practice. The CVSS 3.1 base score is 6.6, and the local attack vector (AV:L) rules out direct remote exploitation: an attacker needs a foothold on the machine running TensorFlow. Availability impact is high (A:H) because a crash terminates training runs, while confidentiality and integrity impact are both low (C:L, I:L). The CVE is not in the CISA KEV catalog and no public exploit code has been observed. Risk rises on multi-tenant ML platforms (e.g., shared Jupyter environments, Kubeflow clusters) where low-privileged users share hosts with production training workloads.

Affected Systems

Package     Ecosystem  Vulnerable Range                 Patched
tensorflow  pip        < 2.3.4; 2.4.0 to 2.4.2; 2.5.0   2.3.4, 2.4.3, 2.5.1, 2.6.0

If you run a tensorflow release that predates the fix on its branch (2.3.4, 2.4.3, 2.5.1, or 2.6.0), you are affected.

Severity & Risk

CVSS 3.1: 6.6 / 10
EPSS: 0.02% chance of exploitation in 30 days (higher than 7% of all CVEs)
Exploitation Status: No known exploitation
Sophistication: Moderate

Attack Surface

AV Local
AC Low
PR Low
UI None
S Unchanged
C Low
I Low
A High

Recommended Action

  1. Patch: Upgrade to TensorFlow 2.6.0, 2.5.1, 2.4.3, or 2.3.4 — all include commit ee119d4a.

  2. No viable workaround exists short of patching; avoid running untrusted TF graphs as a defense-in-depth measure.

  3. Isolate training environments: restrict who can submit training jobs to multi-tenant clusters.

  4. For containerized workloads, rebuild and redeploy ML containers with patched base images.

  5. Detection: monitor for unexpected TF process crashes (SIGSEGV) in training logs — repeated segfaults on hash table ops may indicate active probing.
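Step 1 can be checked mechanically. The following is a minimal sketch, assuming the package is installed under the distribution name `tensorflow` (variants such as `tensorflow-gpu` would need their own lookup) and that versions look like plain `X.Y.Z`. The helper names are illustrative, and anything that is not a plain release string (for example a release candidate) is conservatively reported as unpatched.

```python
from importlib import metadata

# First fixed release on each maintained branch, per the advisory text.
PATCHED_FLOORS = {(2, 3): (2, 3, 4), (2, 4): (2, 4, 3), (2, 5): (2, 5, 1)}


def parse_release(text):
    """Return (major, minor, patch) for a plain 'X.Y.Z' string, else None."""
    parts = text.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        return None
    return tuple(int(p) for p in parts)


def is_patched(version_text):
    """True if the version carries the fix; unknown formats count as unpatched."""
    version = parse_release(version_text)
    if version is None:
        return False
    if version >= (2, 6, 0):
        return True
    floor = PATCHED_FLOORS.get(version[:2])
    return floor is not None and version >= floor


def installed_tensorflow_version():
    """Return the installed 'tensorflow' distribution version, or None."""
    try:
        return metadata.version("tensorflow")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_tensorflow_version()
    if found is None:
        print("tensorflow is not installed in this environment")
    else:
        status = "patched" if is_patched(found) else "VULNERABLE or unknown"
        print(f"tensorflow {found}: {status}")
```

Running the script inside each training image or virtual environment gives a quick per-environment answer; it does not inspect statically linked or vendored copies of TensorFlow.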

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 9 - Risk management system
ISO 42001
A.6.2.6 - AI system security
NIST AI RMF
GOVERN 5.2 - Organizational teams are committed to a culture that considers and communicates AI risk
MANAGE 2.2 - Mechanisms to address identified AI risks

Frequently Asked Questions

What is CVE-2021-37690?

CVE-2021-37690 is a use-after-free in TensorFlow's shape inference code. Some shape functions (such as `MutableHashTableShape`) return `ShapeAndType` data owned by an inference context that is cleaned up almost immediately; reading that data afterwards can segfault the process. It is fixed in TensorFlow 2.6.0, 2.5.1, 2.4.3, and 2.3.4.

Is CVE-2021-37690 actively exploited?

No confirmed active exploitation of CVE-2021-37690 has been reported, but organizations should still patch proactively.

How to fix CVE-2021-37690?

  1. Patch: Upgrade to TensorFlow 2.6.0, 2.5.1, 2.4.3, or 2.3.4; all include commit ee119d4a.

  2. No viable workaround exists short of patching; avoid running untrusted TF graphs as a defense-in-depth measure.

  3. Isolate training environments: restrict who can submit training jobs to multi-tenant clusters.

  4. For containerized workloads, rebuild and redeploy ML containers with patched base images.

  5. Detection: monitor for unexpected TF process crashes (SIGSEGV) in training logs; repeated segfaults on hash table ops may indicate active probing.

What systems are affected by CVE-2021-37690?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, MLOps platforms.

What is the CVSS score for CVE-2021-37690?

CVE-2021-37690 has a CVSS v3.1 base score of 6.6 (MEDIUM). The EPSS exploitation probability is 0.02%.

Technical Details

NVD Description

TensorFlow is an end-to-end open source platform for machine learning. In affected versions when running shape functions, some functions (such as `MutableHashTableShape`) produce extra output information in the form of a `ShapeAndType` struct. The shapes embedded in this struct are owned by an inference context that is cleaned up almost immediately; if the upstream code attempts to access this shape information, it can trigger a segfault. `ShapeRefiner` is mitigating this for normal output shapes by cloning them (and thus putting the newly created shape under ownership of an inference context that will not die), but we were not doing the same for shapes and types. This commit fixes that by doing similar logic on output shapes and types. We have patched the issue in GitHub commit ee119d4a498979525046fba1c3dd3f13a039fbb1. The fix will be included in TensorFlow 2.6.0. We will also cherrypick this commit on TensorFlow 2.5.1, TensorFlow 2.4.3, and TensorFlow 2.3.4, as these are also affected and still in supported range.
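The ownership problem in the description can be illustrated with a toy model. This is not TensorFlow code: `InferenceContext`, `ShapeHandle`, and `infer` are hypothetical names, and a Python exception stands in for what would be undefined behaviour (typically a segfault) in C++. The sketch only shows why cloning the extra output into a longer-lived context before the short-lived one is destroyed avoids the dangling reference.

```python
class InferenceContext:
    """Toy stand-in for a context that owns every shape it creates."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self._shapes = []

    def make_shape(self, dims):
        self._shapes.append(tuple(dims))
        return ShapeHandle(self, len(self._shapes) - 1)

    def destroy(self):
        # Cleaning up the context invalidates every handle it issued.
        self._shapes.clear()
        self.alive = False


class ShapeHandle:
    """Reference to a shape stored inside its owning context."""

    def __init__(self, owner, index):
        self.owner = owner
        self.index = index

    def dims(self):
        if not self.owner.alive:
            # In C++ this read would be a use-after-free.
            raise RuntimeError("dangling shape handle: owner was destroyed")
        return self.owner._shapes[self.index]


def infer(long_lived, clone_extra_output):
    """Run one hypothetical shape function in a short-lived context."""
    temp = InferenceContext("per-node")
    extra = temp.make_shape([None, 8])  # stands in for the ShapeAndType output
    if clone_extra_output:
        # The fix, in spirit: copy the data under the long-lived owner.
        extra = long_lived.make_shape(extra.dims())
    temp.destroy()
    return extra


if __name__ == "__main__":
    refiner_ctx = InferenceContext("refiner")
    print(infer(refiner_ctx, clone_extra_output=True).dims())
    try:
        infer(refiner_ctx, clone_extra_output=False).dims()
    except RuntimeError as err:
        print("unpatched path:", err)
```

The cloned handle stays readable because its owner outlives the per-node context, which mirrors what the description says `ShapeRefiner` already did for normal output shapes.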

Exploitation Scenario

An attacker with low-privileged access to a shared ML training cluster submits a TensorFlow graph containing a MutableHashTable operation with crafted shape inputs. When TensorFlow's ShapeRefiner evaluates the graph during session initialization or shape inference, the MutableHashTableShape function writes ShapeAndType structs referencing an inference context that is immediately freed. Accessing those dangling shape pointers triggers a segfault, killing the training process. On a Kubeflow or MLflow multi-tenant platform, this would terminate co-located training jobs, causing data loss for unfinished runs and potential GPU resource waste.
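Because the crash in this scenario ends the process with SIGSEGV, a job supervisor can tell it apart from an ordinary non-zero exit. The sketch below assumes a POSIX host and a job launched directly with `subprocess.run`; `classify_exit` and `run_job` are illustrative names, not part of any ML platform. A segfault alone does not prove exploitation, so the result is only a signal worth alerting on.

```python
import signal
import subprocess
import sys


def classify_exit(returncode):
    """Map a child process return code to a coarse status string."""
    if returncode == 0:
        return "ok"
    # subprocess reports death-by-signal as a negative signal number;
    # a job wrapped in a shell usually reports 128 + signal number instead.
    if returncode in (-signal.SIGSEGV, 128 + signal.SIGSEGV):
        return "segfault"
    if returncode < 0:
        return f"killed by signal {-returncode}"
    return "error"


def run_job(cmd):
    """Run a command to completion and classify how it ended."""
    return classify_exit(subprocess.run(cmd).returncode)


if __name__ == "__main__":
    # Stand-in for a training job: a child that raises SIGSEGV on itself.
    crash = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    print(run_job([sys.executable, "-c", crash]))
```

Counting "segfault" results per submitting user or per graph over time is one way to surface the repeated crashes that the detection step describes.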

Weaknesses (CWE)

CWE-416: Use After Free

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:L/I:L/A:H
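The 6.6 base score can be reproduced from this vector. The sketch below applies the CVSS v3.1 base-score equations for the scope-unchanged case only, using the metric weights from the CVSS v3.1 specification; the function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification (scope unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a scope-unchanged CVSS v3.1 vector."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:L/I:L/A:H"))
```

For this vector the impact sub-score is about 4.70 and the exploitability sub-score about 1.83; their sum, about 6.54, rounds up to 6.6.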

Timeline

Published: August 13, 2021
Last Modified: November 21, 2024
First Seen: August 13, 2021
