CVE-2021-37635: TensorFlow: heap OOB read in sparse reduction ops
HIGH
TensorFlow's sparse reduction kernel fails to validate tensor index bounds, enabling heap out-of-bounds reads that can expose in-memory data (C:H) or crash the process (A:H). TensorFlow deployments that process sparse tensors are vulnerable on any release older than 2.3.4, on 2.4.0 through 2.4.2, and on 2.5.0; the fix ships in 2.3.4, 2.4.3, 2.5.1, and 2.6.0. Patch immediately: shared ML infrastructure faces elevated risk from low-privilege insiders or compromised pipeline accounts that can submit crafted workloads.
What is the risk?
CVSS 7.1 (High), with low attack complexity and low privilege requirements: any user who can run code on a TensorFlow host can trigger it. The confidentiality impact is High, meaning exposed heap memory can leak co-tenant data, model weights, or in-memory credentials. The availability impact is also High, via process crash. The CVE is not listed in CISA KEV and there is no confirmed active exploitation, but the low trigger barrier makes opportunistic exploitation plausible in multi-tenant ML environments.
What systems are affected?
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| TensorFlow | pip | < 2.3.4; 2.4.0 to 2.4.2; 2.5.0 | 2.3.4, 2.4.3, 2.5.1, 2.6.0 |
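To check an installed version against these releases, a minimal sketch follows. The helper names `parse_version` and `is_vulnerable` are hypothetical, and the ranges are derived from the fixed releases named in the advisory (2.3.4, 2.4.3, 2.5.1, 2.6.0). The sketch ignores pre-release suffixes such as `rc0`, so release candidates should be treated with caution.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a string such as '2.4.1'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def is_vulnerable(version: str) -> bool:
    """Return True if the version predates the fix on its release branch.

    Assumption: ranges are inferred from the fixed releases listed in the
    advisory; pre-release suffixes (for example 'rc0') are ignored.
    """
    v = parse_version(version)
    if v < (2, 3, 4):                   # 2.3 branch and anything older
        return True
    if (2, 4, 0) <= v < (2, 4, 3):      # 2.4 branch before the backport
        return True
    if (2, 5, 0) <= v < (2, 5, 1):      # 2.5 branch before the backport
        return True
    return False                        # 2.3.4+, 2.4.3+, 2.5.1+, 2.6.0+
```

On a host with TensorFlow installed, the check could be run as `is_vulnerable(tf.__version__)` after `import tensorflow as tf`.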
What should I do?
- Patch: Upgrade to TensorFlow 2.6.0, or to the backported patch releases 2.5.1, 2.4.3, or 2.3.4 on older supported branches.
- Workaround: Validate sparse tensor shapes and indices before passing them to reduction ops; reject inputs where indices exceed the declared dense shape.
- Harden: Isolate TF workloads per tenant using containers or VMs to prevent cross-tenant memory exposure.
- Detect: Alert on unexpected OOM errors or segfaults in TF processes; monitor for anomalous sparse op usage patterns in shared training environments.
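The workaround step can be sketched as a pre-flight check on the three components of a sparse tensor. This is an illustration and not part of TensorFlow: `validate_sparse_input` is a hypothetical helper that works on plain Python lists, on the assumption that the caller has already converted the tensor's `indices`, `values`, and `dense_shape` (for example with `.numpy().tolist()` in eager mode).

```python
from numbers import Integral
from typing import Sequence


def validate_sparse_input(
    indices: Sequence[Sequence[int]],
    values: Sequence[float],
    dense_shape: Sequence[int],
) -> None:
    """Raise ValueError unless every index fits inside dense_shape.

    Hypothetical helper: run it before handing untrusted sparse input
    to a reduction op on an unpatched TensorFlow build.
    """
    rank = len(dense_shape)
    for dim in dense_shape:
        if not isinstance(dim, Integral) or dim < 0:
            raise ValueError("dense_shape must hold non-negative integers")
    if len(values) != len(indices):
        raise ValueError("values must hold exactly one entry per index row")
    for row_number, row in enumerate(indices):
        if len(row) != rank:
            raise ValueError(
                f"index row {row_number} has {len(row)} coordinates, expected {rank}"
            )
        for axis, (coord, dim) in enumerate(zip(row, dense_shape)):
            if not isinstance(coord, Integral) or not 0 <= coord < dim:
                raise ValueError(
                    f"index row {row_number}, axis {axis}: {coord!r} is outside [0, {dim})"
                )
```

Rejecting the input outright, rather than clamping indices into range, keeps a crafted tensor from reaching the kernel at all.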
Frequently Asked Questions
What is CVE-2021-37635?
TensorFlow's sparse reduction kernel fails to validate tensor index bounds, enabling heap out-of-bounds reads that can expose in-memory data (C:H) or crash the process (A:H). TensorFlow deployments that process sparse tensors are vulnerable on any release older than 2.3.4, on 2.4.0 through 2.4.2, and on 2.5.0; the fix ships in 2.3.4, 2.4.3, 2.5.1, and 2.6.0. Patch immediately: shared ML infrastructure faces elevated risk from low-privilege insiders or compromised pipeline accounts that can submit crafted workloads.
Is CVE-2021-37635 actively exploited?
No confirmed active exploitation of CVE-2021-37635 has been reported, but organizations should still patch proactively.
How to fix CVE-2021-37635?
1. Patch: Upgrade to TensorFlow 2.6.0, or to the backported patch releases 2.5.1, 2.4.3, or 2.3.4 on older supported branches.
2. Workaround: Validate sparse tensor shapes and indices before passing them to reduction ops; reject inputs where indices exceed the declared dense shape.
3. Harden: Isolate TF workloads per tenant using containers or VMs to prevent cross-tenant memory exposure.
4. Detect: Alert on unexpected OOM errors or segfaults in TF processes; monitor for anomalous sparse op usage patterns in shared training environments.
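For the detection step, one rough way to surface segfaults is to have the job launcher inspect how each TF process ended. The sketch below assumes a POSIX host where jobs are started directly through Python's `subprocess` module; `run_and_classify` is a hypothetical wrapper, not an existing tool.

```python
import signal
import subprocess
import sys
from typing import Sequence


def run_and_classify(command: Sequence[str]) -> str:
    """Run a job and report 'ok', 'failed:<code>', or 'killed:<signal>'."""
    completed = subprocess.run(command, check=False)
    code = completed.returncode
    if code == 0:
        return "ok"
    if code < 0:
        # On POSIX, a return code of -N means the child was killed by signal N.
        try:
            name = signal.Signals(-code).name
        except ValueError:
            name = f"signal {-code}"
        # Assumption: stderr stands in for a real alerting channel.
        print(f"ALERT: {list(command)!r} terminated by {name}", file=sys.stderr)
        return f"killed:{name}"
    return f"failed:{code}"
```

A result of `killed:SIGSEGV` from a job that uses sparse reductions is a reasonable trigger for review. When the job runs under a shell or container runtime, the same event usually appears as exit status 128 + N (139 for SIGSEGV) instead of a negative return code.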
What systems are affected by CVE-2021-37635?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, shared ML infrastructure, recommendation systems.
What is the CVSS score for CVE-2021-37635?
CVE-2021-37635 has a CVSS v3.1 base score of 7.1 (HIGH). The EPSS exploitation probability is 0.17%.
What is the AI security impact?
MITRE ATLAS Techniques
- AML.T0010.001 AI Software
- AML.T0011.000 Unsafe AI Artifacts
- AML.T0043 Craft Adversarial Data
What are the technical details?
Original Advisory
TensorFlow is an end-to-end open source platform for machine learning. In affected versions the implementation of sparse reduction operations in TensorFlow can trigger accesses outside of bounds of heap allocated data. The [implementation](https://github.com/tensorflow/tensorflow/blob/a1bc56203f21a5a4995311825ffaba7a670d7747/tensorflow/core/kernels/sparse_reduce_op.cc#L217-L228) fails to validate that each reduction group does not overflow and that each corresponding index does not point to outside the bounds of the input tensor. We have patched the issue in GitHub commit 87158f43f05f2720a374f3e6d22a7aaa3a33f750. The fix will be included in TensorFlow 2.6.0. We will also cherrypick this commit on TensorFlow 2.5.1, TensorFlow 2.4.3, and TensorFlow 2.3.4, as these are also affected and still in supported range.
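The two checks the advisory names can be illustrated with a simplified pure-Python model of a sparse sum reduction. This is not the TensorFlow kernel and does not reproduce the patch; `checked_sparse_reduce_sum` is a hypothetical function that only shows where a per-index bounds check and a per-group output check would sit. Unlike TensorFlow, this model does not accept negative axes.

```python
from math import prod
from typing import List, Sequence


def checked_sparse_reduce_sum(
    indices: Sequence[Sequence[int]],
    values: Sequence[float],
    dense_shape: Sequence[int],
    reduction_axes: Sequence[int],
) -> List[float]:
    """Sum sparse values over reduction_axes; return the flattened dense result."""
    rank = len(dense_shape)
    axes = set(reduction_axes)
    if any(not 0 <= axis < rank for axis in axes):
        raise ValueError("reduction axis out of range")
    if len(indices) != len(values):
        raise ValueError("values must hold one entry per index row")

    kept_axes = [axis for axis in range(rank) if axis not in axes]
    out_shape = [dense_shape[axis] for axis in kept_axes]
    # Row-major strides of the output buffer.
    strides = [1] * len(out_shape)
    for i in range(len(out_shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * out_shape[i + 1]
    out = [0.0] * prod(out_shape)

    for row, value in zip(indices, values):
        # Check 1: every coordinate must lie inside the input tensor.
        if len(row) != rank or any(not 0 <= c < d for c, d in zip(row, dense_shape)):
            raise ValueError("index outside the declared dense shape")
        # Check 2: the reduction group must map inside the output buffer.
        flat = sum(row[axis] * stride for axis, stride in zip(kept_axes, strides))
        if not 0 <= flat < len(out):
            raise ValueError("reduction group maps outside the output buffer")
        out[flat] += value
    return out
```

Without the two checks, the write to `out[flat]` in this model would fail with an `IndexError`; in native code the equivalent access touches heap memory outside the buffer.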
Exploitation Scenario
An adversary with low-privilege access to a shared ML training cluster (e.g., via a compromised CI/CD service account or as a rogue insider) submits a TF training job containing deliberately crafted sparse reduction ops. The script constructs a SparseTensor with reduction group indices that overflow, causing TensorFlow to read heap memory outside the allocated buffer. The attacker captures out-of-bounds data via TF error output or a side channel, potentially recovering adjacent heap contents such as other tenants' model weights, hyperparameters, or cached authentication tokens. On single-tenant systems, the same technique achieves denial of service by crashing the training run.
Weaknesses (CWE)
CWE-125 — Out-of-bounds Read: The product reads data past the end, or before the beginning, of the intended buffer.
- [Implementation] Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does. When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, "boat" may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as "red" or "blue." Do not rely exclusively on looking for malicious or malformed inputs. This is likely to miss at least one undesirable input, especially if the code's environment changes. This can give attackers enough room to bypass the intended validation. However, denylis
- [Architecture and Design] Use a language that provides appropriate memory abstractions.
Source: MITRE CWE corpus.
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:H
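The 7.1 base score can be reproduced from this vector with the CVSS v3.1 base-score equations. The sketch below covers only the scope-unchanged case that applies here; the metric weights and rounding rule follow the CVSS v3.1 specification, while the function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification (scope unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score of a scope-unchanged CVSS v3.1 vector."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles scope-unchanged vectors")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:H"))  # 7.1
```

For this vector the impact sub-score is about 5.18 and the exploitability sub-score about 1.83; their sum, roughly 7.01, rounds up to 7.1.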
Related Vulnerabilities
| CVE | CVSS | Summary | Relation |
|---|---|---|---|
| CVE-2020-15196 | 9.9 | TensorFlow: heap OOB read in sparse/ragged count ops | Same package: tensorflow |
| CVE-2020-15205 | 9.8 | TensorFlow: heap overflow in StringNGrams, ASLR bypass | Same package: tensorflow |
| CVE-2020-15208 | 9.8 | TFLite: OOB read/write via tensor dimension mismatch | Same package: tensorflow |
| CVE-2019-16778 | 9.8 | TensorFlow: heap overflow in UnsortedSegmentSum op | Same package: tensorflow |
| CVE-2022-23587 | 9.8 | TensorFlow: integer overflow in Grappler enables RCE | Same package: tensorflow |