CVE-2021-37690: TensorFlow: use-after-free crashes training processes
MEDIUM
A use-after-free in TensorFlow's shape inference engine allows a local attacker with minimal privileges to crash TF processes via segfault. Upgrade to TF 2.6.0, 2.5.1, 2.4.3, or 2.3.4 immediately on training infrastructure. No active exploitation is known, but unpatched training clusters are exposed to denial-of-service against long-running jobs.
What is the risk?
Medium risk in practice. CVSS 6.6 with local access vector limits remote exploitation; an attacker needs a foothold on the machine running TensorFlow. The availability impact is high (A:H) — a crash terminates training runs — but confidentiality and integrity impact are low. Not in CISA KEV and no public exploit code observed. Risk elevates in multi-tenant ML platforms (e.g., shared Jupyter environments, Kubeflow clusters) where low-privileged users co-exist with production training workloads.
What systems are affected?
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| TensorFlow | pip | — | 2.3.4, 2.4.3, 2.5.1, 2.6.0 |
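You are affected if you run a TensorFlow release without the fix, that is, one older than the patched release on its line (2.3.4, 2.4.3, 2.5.1) and older than 2.6.0.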
What should I do?
1. Patch: upgrade to TensorFlow 2.6.0, 2.5.1, 2.4.3, or 2.3.4; all include commit ee119d4a.
2. No viable workaround exists short of patching; avoid running untrusted TF graphs as a defense-in-depth measure.
3. Isolate training environments: restrict who can submit training jobs to multi-tenant clusters.
4. For containerized workloads, rebuild and redeploy ML containers with patched base images.
5. Detection: monitor for unexpected TF process crashes (SIGSEGV) in training logs; repeated segfaults on hash table ops may indicate active probing.
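To audit hosts or container images against the patched releases named in the steps above, a version check along these lines may help. This is a minimal sketch, not an official tool: the function names `is_patched` and `installed_tensorflow_version` are hypothetical, pre-release suffixes such as `rc1` are not inspected, and treating release lines older than 2.3 as unpatched is an assumption based on the advisory only mentioning cherry-picks for 2.3, 2.4, and 2.5.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Lowest patch level that carries the fix on each cherry-picked release line,
# taken from the advisory text (2.3.4, 2.4.3, 2.5.1).
PATCHED_MIN = {(2, 3): 4, (2, 4): 3, (2, 5): 1}
# 2.6.0 is the first release line that shipped with the fix included.
FIRST_FIXED_LINE = (2, 6)


def is_patched(version_string):
    """Return True if the version should contain the fix for CVE-2021-37690."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    major, minor, patch = (int(group) for group in match.groups())
    if (major, minor) >= FIRST_FIXED_LINE:
        return True
    floor = PATCHED_MIN.get((major, minor))
    # Assumption: lines older than 2.3 never received the cherry-pick.
    return floor is not None and patch >= floor


def installed_tensorflow_version():
    """Return the installed 'tensorflow' distribution version, or None."""
    try:
        return version("tensorflow")
    except PackageNotFoundError:
        return None
```

Calling `is_patched(installed_tensorflow_version())` on each image is one way to use it, after handling the `None` case. Environments that install a differently named distribution (for example a CPU-only or GPU-only build) would need the distribution name adjusted.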
Frequently Asked Questions
Is CVE-2021-37690 actively exploited?
No confirmed active exploitation of CVE-2021-37690 has been reported, but organizations should still patch proactively.
What systems are affected by CVE-2021-37690?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, MLOps platforms.
What is the CVSS score for CVE-2021-37690?
CVE-2021-37690 has a CVSS v3.1 base score of 6.6 (MEDIUM). The EPSS exploitation probability is 0.16%.
What is the AI security impact?
MITRE ATLAS Techniques
- AML.T0010.001 AI Software
- AML.T0029 Denial of AI Service
- AML.T0049 Exploit Public-Facing Application
What are the technical details?
Original Advisory
TensorFlow is an end-to-end open source platform for machine learning. In affected versions when running shape functions, some functions (such as `MutableHashTableShape`) produce extra output information in the form of a `ShapeAndType` struct. The shapes embedded in this struct are owned by an inference context that is cleaned up almost immediately; if the upstream code attempts to access this shape information, it can trigger a segfault. `ShapeRefiner` is mitigating this for normal output shapes by cloning them (and thus putting the newly created shape under ownership of an inference context that will not die), but we were not doing the same for shapes and types. This commit fixes that by doing similar logic on output shapes and types. We have patched the issue in GitHub commit ee119d4a498979525046fba1c3dd3f13a039fbb1. The fix will be included in TensorFlow 2.6.0. We will also cherrypick this commit on TensorFlow 2.5.1, TensorFlow 2.4.3, and TensorFlow 2.3.4, as these are also affected and still in supported range.
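The ownership problem described in the advisory can be illustrated with a toy model. The sketch below is not TensorFlow code: `Context`, `read_shape`, and `clone_into` are hypothetical names, and Python raises an exception where the real C++ code would read freed memory and may segfault. It only shows why a shape handle must be cloned into a long-lived owner before the short-lived one is destroyed.

```python
class Context:
    """Toy stand-in for an inference context that owns shape objects."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self._shapes = []

    def make_shape(self, dims):
        """Store a shape and return a handle (owner, index) to it."""
        self._shapes.append(tuple(dims))
        return (self, len(self._shapes) - 1)

    def destroy(self):
        """Release every shape this context owns."""
        self._shapes.clear()
        self.alive = False


def read_shape(handle):
    """Dereference a handle; fails if its owning context is gone."""
    owner, index = handle
    if not owner.alive:
        raise RuntimeError("dangling shape handle: owner was destroyed")
    return owner._shapes[index]


def clone_into(handle, long_lived):
    """Copy the shape so that the long-lived context owns the copy."""
    return long_lived.make_shape(read_shape(handle))


outer = Context("long-lived")
temp = Context("short-lived")

raw = temp.make_shape([2, 8])    # owned by the short-lived context
safe = clone_into(raw, outer)    # cloned before the owner goes away
temp.destroy()

print(read_shape(safe))          # still valid: owned by `outer`
```

Calling `read_shape(raw)` after `temp.destroy()` raises in this model. According to the advisory, the fix applies the cloning step to the shapes inside `ShapeAndType` output, as was already done for normal output shapes.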
Exploitation Scenario
An attacker with low-privileged access to a shared ML training cluster submits a TensorFlow graph containing a MutableHashTable operation with crafted shape inputs. When TensorFlow's ShapeRefiner evaluates the graph during session initialization or shape inference, the MutableHashTableShape function writes ShapeAndType structs referencing an inference context that is immediately freed. Accessing those dangling shape pointers triggers a segfault, killing the training process. On a Kubeflow or MLflow multi-tenant platform, this would terminate co-located training jobs, causing data loss for unfinished runs and potential GPU resource waste.
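Because the scenario ends with the training process dying from a segfault, a job supervisor can flag that exit condition explicitly. The following is a minimal sketch under stated assumptions: `died_from_segfault` and `run_and_flag` are hypothetical helper names, and the `128 + signal` convention covers exit codes as reported by shells and many container runtimes rather than by `subprocess` itself.

```python
import signal
import subprocess

SEGV = int(signal.SIGSEGV)


def died_from_segfault(returncode):
    """Return True if an exit status indicates termination by SIGSEGV."""
    # On POSIX, subprocess reports death by signal as a negative signal
    # number; shells and many container runtimes report 128 + signal.
    return returncode in (-SEGV, 128 + SEGV)


def run_and_flag(command):
    """Run a command to completion; return (returncode, segfaulted)."""
    completed = subprocess.run(command)
    return completed.returncode, died_from_segfault(completed.returncode)
```

A scheduler wrapper could count flagged exits per submitting user and alert when one user's jobs segfault repeatedly, which matches the probing pattern described in the detection guidance.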
Weaknesses (CWE)
CWE-416 — Use After Free: The product reuses or references memory after it has been freed. At some point afterward, the memory may be allocated again and saved in another pointer, while the original pointer references a location somewhere within the new allocation. Any operations using the original pointer are no longer valid because the memory "belongs" to the code that operates on the new pointer.
- [Architecture and Design] Choose a language that provides automatic memory management.
- [Implementation] When freeing pointers, be sure to set them to NULL once they are freed. However, the utilization of multiple or complex data structures may lower the usefulness of this strategy.
Source: MITRE CWE corpus.
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:L/I:L/A:H
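The 6.6 base score can be reproduced from this vector. The sketch below follows the base-score equations and metric weights of the CVSS v3.1 specification as best recalled here, so treat the constants as something to verify against the specification; `parse_vector`, `roundup`, and `base_score` are illustrative names, and temporal and environmental metrics are not handled.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
# Privileges Required is weighted differently when Scope is changed.
PR = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value):
    """Round up to one decimal place, avoiding floating-point drift."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def parse_vector(vector):
    """Split 'CVSS:3.1/AV:L/...' into a {metric: value} dict."""
    parts = vector.split("/")
    if parts[0] != "CVSS:3.1":
        raise ValueError("only CVSS:3.1 vectors are handled in this sketch")
    return dict(part.split(":") for part in parts[1:])


def base_score(vector):
    """Compute the CVSS v3.1 base score for a base-metric vector."""
    m = parse_vector(vector)
    scope = m["S"]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = (
        8.22 * AV[m["AV"]] * AC[m["AC"]] * PR[scope][m["PR"]] * UI[m["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:L/I:L/A:H"))  # 6.6
```

For this vector the impact term is about 4.70 and the exploitability term about 1.83, which sums to roughly 6.54 and rounds up to 6.6. The high availability weight (A:H) contributes most of the impact, while the local attack vector (AV:L) keeps exploitability low.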
Related Vulnerabilities
| CVE | CVSS | Summary | Relation |
|---|---|---|---|
| CVE-2020-15196 | 9.9 | TensorFlow: heap OOB read in sparse/ragged count ops | Same package: tensorflow |
| CVE-2020-15205 | 9.8 | TensorFlow: heap overflow in StringNGrams, ASLR bypass | Same package: tensorflow |
| CVE-2020-15208 | 9.8 | TFLite: OOB read/write via tensor dimension mismatch | Same package: tensorflow |
| CVE-2019-16778 | 9.8 | TensorFlow: heap overflow in UnsortedSegmentSum op | Same package: tensorflow |
| CVE-2022-23587 | 9.8 | TensorFlow: integer overflow in Grappler enables RCE | Same package: tensorflow |