CVE-2021-37659: TensorFlow: heap OOB in cwise ops enables local RCE
Severity: HIGH
Upgrade TensorFlow to 2.6.0, 2.5.1, 2.4.3, or 2.3.4 on all training and inference infrastructure immediately. Local access is required, but shared ML platforms (Jupyter hubs, GPU clusters, containerized MLOps pipelines) are routine attack surfaces where any low-privileged user can trigger this. Heap corruption may enable privilege escalation beyond model code isolation boundaries, threatening host-level compromise.
What is the risk?
Effective risk is moderate-to-high in shared ML compute environments despite the local attack vector. CVSS 7.8 reflects full CIA impact (C:H/I:H/A:H) with low complexity and low privileges: any user who can submit a TensorFlow job can exploit this. Shared GPU clusters, notebook platforms, and containerized training workers weaken the local-access barrier, because many users already hold the access the attack needs. The CVE is not in CISA KEV and there is no confirmed active exploitation, but the patch has been public since 2021, so unpatched deployments carry an avoidable residual risk.
What systems are affected?
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| TensorFlow | pip | < 2.3.4; 2.4.x < 2.4.3; 2.5.x < 2.5.1 | 2.3.4, 2.4.3, 2.5.1, 2.6.0 |
If you run a TensorFlow release older than the patched versions, you are affected.
What should I do?
1. Patch immediately: upgrade to TensorFlow 2.6.0 or later, or to the patched point releases 2.5.1, 2.4.3, or 2.3.4 (fix commit 93f428fd1768df147171ed674fee1fc5ab8309ec).
2. Audit all TF deployments: scan CI/CD runners, Jupyter environments, and container images with `pip show tensorflow` or `pip3 show tensorflow`.
3. Enforce tensor shape validation at pipeline ingestion points before ops execute to reduce attack surface.
4. Run training jobs under dedicated least-privilege service accounts to contain blast radius if exploited.
5. Detection: monitor for SIGSEGV/SIGABRT in TF worker logs and unexpected core dumps from training processes; heap OOB often manifests as intermittent crashes before controlled exploitation.
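The audit step can be sketched in Python as a check of the installed version against the patched releases named in the advisory. This is a minimal sketch, not an official tool: the helper names (`parse_version`, `is_patched`, `installed_tensorflow_status`) are hypothetical, release series older than 2.3 are assumed to be unpatched, and pre-release or local suffixes such as `rc1` or `+cpu` are ignored and should be reviewed by hand.

```python
from importlib import metadata

# Patched point releases named in the advisory, keyed by (major, minor) series.
PATCHED = {(2, 3): (2, 3, 4), (2, 4): (2, 4, 3), (2, 5): (2, 5, 1)}
FIRST_FIXED_SERIES = (2, 6)  # 2.6.0 and later include the fix


def parse_version(text):
    """Parse the leading 'X.Y.Z' of a version string into a tuple of ints."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1' or '+cpu'
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_patched(version_text):
    """True if the version is at or above the fixed release for its series."""
    version = parse_version(version_text)
    series = version[:2]
    if series >= FIRST_FIXED_SERIES:
        return True
    fixed = PATCHED.get(series)
    # Assumption: series without a listed fix (older than 2.3) are unpatched.
    return fixed is not None and version >= fixed


def installed_tensorflow_status(
    dist_names=("tensorflow", "tensorflow-cpu", "tensorflow-gpu"),
):
    """Map each installed TensorFlow distribution to (version, patched?)."""
    results = {}
    for name in dist_names:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # distribution not installed in this environment
        results[name] = (version, is_patched(version))
    return results


if __name__ == "__main__":
    for name, (version, ok) in installed_tensorflow_status().items():
        print(f"{name} {version}: {'patched' if ok else 'VULNERABLE'}")
```

Run inside each environment or container image being audited, the script prints one line per installed TensorFlow distribution and nothing if none is installed.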
Frequently Asked Questions
What is CVE-2021-37659?
CVE-2021-37659 is a heap out-of-bounds read and null-pointer reference binding in TensorFlow's binary cwise (element-wise) operations that do not require broadcasting, such as gradients of binary cwise operations. The kernel assumes both inputs have the same number of elements but does not check it. The issue is fixed in TensorFlow 2.6.0, 2.5.1, 2.4.3, and 2.3.4.
Is CVE-2021-37659 actively exploited?
No confirmed active exploitation of CVE-2021-37659 has been reported, but organizations should still patch proactively.
How to fix CVE-2021-37659?
Upgrade to TensorFlow 2.6.0 or later, or to the patched point releases 2.5.1, 2.4.3, or 2.3.4 (fix commit 93f428fd1768df147171ed674fee1fc5ab8309ec). Then audit all TF deployments (CI/CD runners, Jupyter environments, container images) with `pip show tensorflow` or `pip3 show tensorflow`, enforce tensor shape validation at pipeline ingestion points, run training jobs under dedicated least-privilege service accounts, and monitor TF worker logs for SIGSEGV/SIGABRT and unexpected core dumps.
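For the detection step, a job launcher can flag workers that die from SIGSEGV or SIGABRT. The sketch below relies on the documented behavior of Python's `subprocess` module on POSIX, where a child killed by signal N reports a return code of -N. The function names `crash_signal_name` and `run_and_flag` are hypothetical, and a crash flagged this way is only a hint to investigate, not proof of exploitation.

```python
import signal
import subprocess

# Signals that commonly accompany memory-safety crashes.
CRASH_SIGNALS = {signal.SIGSEGV, signal.SIGABRT}


def crash_signal_name(returncode):
    """Return the signal name if a child exit code indicates SIGSEGV/SIGABRT.

    On POSIX, subprocess reports death by signal N as returncode == -N.
    Returns None for normal exits and for other signals.
    """
    if returncode is None or returncode >= 0:
        return None
    try:
        sig = signal.Signals(-returncode)
    except ValueError:
        return None  # not a signal number known on this platform
    return sig.name if sig in CRASH_SIGNALS else None


def run_and_flag(cmd):
    """Run a worker command and return the crash signal name, or None."""
    proc = subprocess.run(cmd)
    return crash_signal_name(proc.returncode)
```

A scheduler wrapper could call `run_and_flag(["python", "train.py"])` (the script name is a placeholder) and raise an alert whenever the result is not `None`.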
What systems are affected by CVE-2021-37659?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, MLOps platforms, Jupyter/notebook environments.
What is the CVSS score for CVE-2021-37659?
CVE-2021-37659 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.18%.
What is the AI security impact?
Affected AI Architectures
Training pipelines, model serving, MLOps platforms, Jupyter/notebook environments.
MITRE ATLAS Techniques
- AML.T0010.001 AI Software
- AML.T0044 Full AI Model Access
- AML.T0049 Exploit Public-Facing Application
What are the technical details?
Original Advisory
TensorFlow is an end-to-end open source platform for machine learning. In affected versions an attacker can cause undefined behavior via binding a reference to null pointer in all binary cwise operations that don't require broadcasting (e.g., gradients of binary cwise operations). The [implementation](https://github.com/tensorflow/tensorflow/blob/84d053187cb80d975ef2b9684d4b61981bca0c41/tensorflow/core/kernels/cwise_ops_common.h#L264) assumes that the two inputs have exactly the same number of elements but does not check that. Hence, when the eigen functor executes it triggers heap OOB reads and undefined behavior due to binding to nullptr. We have patched the issue in GitHub commit 93f428fd1768df147171ed674fee1fc5ab8309ec. The fix will be included in TensorFlow 2.6.0. We will also cherrypick this commit on TensorFlow 2.5.1, TensorFlow 2.4.3, and TensorFlow 2.3.4, as these are also affected and still in supported range.
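The precondition the advisory describes (both inputs must hold the same number of elements) can be enforced before tensors reach an op. The following is a minimal pure-Python sketch of such a guard over shape tuples. It illustrates the missing check only; it is not the code from the fix commit, and the names `num_elements` and `check_same_num_elements` are hypothetical.

```python
import math


def num_elements(shape):
    """Number of elements implied by a shape; a scalar shape () has 1."""
    for dim in shape:
        if not isinstance(dim, int) or dim < 0:
            raise ValueError(f"invalid dimension {dim!r} in shape {shape!r}")
    return math.prod(shape)


def check_same_num_elements(shape_a, shape_b):
    """Raise ValueError unless both shapes hold the same number of elements."""
    count_a = num_elements(shape_a)
    count_b = num_elements(shape_b)
    if count_a != count_b:
        raise ValueError(
            f"element count mismatch: {shape_a} has {count_a}, "
            f"{shape_b} has {count_b}"
        )
    return count_a
```

A pipeline ingestion step could call `check_same_num_elements(tuple(x.shape), tuple(y.shape))` on the two operands and reject the request on `ValueError`, so mismatched inputs never reach the kernel.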
Exploitation Scenario
An attacker with low-privilege access to a shared GPU training cluster—e.g., a compromised data scientist account or a malicious CI pipeline contribution—submits a TensorFlow training job invoking a binary element-wise operation (such as a custom gradient layer) with two tensors of deliberately mismatched sizes. Because TF's cwise kernel assumes shape equality without validating it, the Eigen functor binds a reference to a null pointer and executes heap reads beyond allocated tensor memory. This leaks adjacent heap contents (model weights, auth tokens, neighboring tenant data on a multi-tenant cluster) and can be chained with heap grooming to achieve code execution on the training host, potentially escaping a containerized ML workload to compromise the underlying node.
Weaknesses (CWE)
CWE-125 — Out-of-bounds Read: The product reads data past the end, or before the beginning, of the intended buffer.
- [Implementation] Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does. When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, "boat" may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as "red" or "blue." Do not rely exclusively on looking for malicious or malformed inputs. This is likely to miss at least one undesirable input, especially if the code's environment changes. This can give attackers enough room to bypass the intended validation.
- [Architecture and Design] Use a language that provides appropriate memory abstractions.
Source: MITRE CWE corpus.
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
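The 7.8 base score can be reproduced from this vector with the CVSS v3.1 base-score formulas. The sketch below uses the metric weights from the CVSS v3.1 specification and, as a simplifying assumption, handles only vectors with unchanged scope (S:U), which is the case here. The function names are hypothetical.

```python
import math

# Metric weights from the CVSS v3.1 specification (scope unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """CVSS v3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the CVSS v3.1 base score for a scope-unchanged vector."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles S:U vectors")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H"))  # 7.8
```

Here the impact sub-score is about 5.87 and the exploitability sub-score about 1.83; their sum of roughly 7.71 rounds up to 7.8.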
Related Vulnerabilities
| CVE | CVSS | Summary | Relation |
|---|---|---|---|
| CVE-2020-15196 | 9.9 | TensorFlow: heap OOB read in sparse/ragged count ops | Same package: tensorflow |
| CVE-2020-15205 | 9.8 | TensorFlow: heap overflow in StringNGrams, ASLR bypass | Same package: tensorflow |
| CVE-2020-15208 | 9.8 | TFLite: OOB read/write via tensor dimension mismatch | Same package: tensorflow |
| CVE-2019-16778 | 9.8 | TensorFlow: heap overflow in UnsortedSegmentSum op | Same package: tensorflow |
| CVE-2022-23587 | 9.8 | TensorFlow: integer overflow in Grappler enables RCE | Same package: tensorflow |