CVE-2021-37635: TensorFlow: heap OOB read in sparse reduction ops

HIGH
Published August 12, 2021
CISO Take

TensorFlow's sparse reduction kernel fails to validate tensor index bounds, enabling heap out-of-bounds reads that can expose in-memory data (C:H) or crash the process (A:H). Any TensorFlow deployment that processes sparse tensors is vulnerable unless it runs 2.6.0 or one of the patch releases 2.5.1, 2.4.3, or 2.3.4. Patch immediately: shared ML infrastructure faces elevated risk from low-privilege insiders or compromised pipeline accounts that can submit crafted workloads.

Risk Assessment

CVSS 7.1 (High), with low attack complexity and low privilege requirements: any user who can run code on a TF host can trigger this. The confidentiality impact is High, meaning heap memory exposure can leak co-tenant data, model weights, or in-memory credentials. The availability impact is also High, via process crash. The CVE is not in CISA KEV and there is no confirmed active exploitation, but the low trigger barrier makes opportunistic exploitation plausible in multi-tenant ML environments.

Affected Systems

Package | Ecosystem | Vulnerable Range | Patched
tensorflow | pip | < 2.3.4; 2.4.0 to 2.4.2; 2.5.0 | 2.3.4, 2.4.3, 2.5.1, 2.6.0

Do you use tensorflow? You're affected if your installed version predates the patched releases.

Severity & Risk

CVSS 3.1: 7.1 / 10
EPSS: 0.04% chance of exploitation in 30 days (higher than 11% of all CVEs)
Exploitation Status: No known exploitation
Sophistication: Moderate

Attack Surface

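Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): High
Integrity (I): None
Availability (A): High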

Recommended Action

  1. Patch: Upgrade to TensorFlow 2.6.0 or later. If you must stay on an older supported release line, upgrade to the patch release that carries the backported fix: 2.5.1, 2.4.3, or 2.3.4.

  2. Workaround: Validate sparse tensor shapes and indices before passing to reduction ops; reject inputs where indices exceed the declared dense shape.

  3. Harden: Isolate TF workloads per tenant using containers or VMs to prevent cross-tenant memory exposure.

  4. Detect: Alert on unexpected OOM errors or segfaults in TF processes; monitor for anomalous sparse op usage patterns in shared training environments.
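For the detection step, one possible starting point is to classify how a TF job process ended and alert on crash signals. The sketch below is a minimal illustration, not a complete monitoring solution. It assumes a POSIX host, and the names `classify_exit`, `run_and_classify`, and `CRASH_SIGNALS` are hypothetical helpers introduced here, not part of TensorFlow or any monitoring product.

```python
import signal
import subprocess
import sys

# Signals that commonly indicate a memory-safety fault in native code.
# Assumption: POSIX host (SIGBUS is not available on Windows).
CRASH_SIGNALS = {signal.SIGSEGV, signal.SIGBUS, signal.SIGABRT}


def classify_exit(returncode):
    """Map a process exit status to 'ok', 'error', or 'crash:<SIGNAL>'."""
    if returncode == 0:
        return "ok"
    # subprocess reports death-by-signal as a negative return code;
    # shells and many container runtimes report it as 128 + signal number.
    if returncode < 0:
        signum = -returncode
    elif returncode > 128:
        signum = returncode - 128
    else:
        return "error"
    try:
        sig = signal.Signals(signum)
    except ValueError:
        # Not a known signal number on this platform.
        return "error"
    return f"crash:{sig.name}" if sig in CRASH_SIGNALS else "error"


def run_and_classify(argv):
    """Run a job command and classify how it ended (hypothetical wrapper)."""
    completed = subprocess.run(argv, check=False)
    return classify_exit(completed.returncode)
```

A scheduler hook could call `run_and_classify([sys.executable, "train.py"])` (the script name is a placeholder) and raise an alert whenever the result starts with `crash:`. A crash alone does not prove exploitation, so it is best treated as a trigger for reviewing the job's sparse op inputs.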

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.2 - Information security for AI systems
NIST AI RMF
MANAGE 2.2 - Mechanisms for addressing AI risks and vulnerabilities

Frequently Asked Questions

What is CVE-2021-37635?

CVE-2021-37635 is a heap out-of-bounds read in TensorFlow's sparse reduction kernel, which fails to validate tensor index bounds. The flaw can expose in-memory data (C:H) or crash the process (A:H). Any TensorFlow deployment that processes sparse tensors is vulnerable unless it runs 2.6.0 or one of the patch releases 2.5.1, 2.4.3, or 2.3.4. Shared ML infrastructure faces elevated risk from low-privilege insiders or compromised pipeline accounts that can submit crafted workloads.

Is CVE-2021-37635 actively exploited?

No confirmed active exploitation of CVE-2021-37635 has been reported, but organizations should still patch proactively.

How to fix CVE-2021-37635?

  1. Patch: Upgrade to TensorFlow 2.6.0 or later. If you must stay on an older supported release line, upgrade to the patch release that carries the backported fix: 2.5.1, 2.4.3, or 2.3.4.
  2. Workaround: Validate sparse tensor shapes and indices before passing to reduction ops; reject inputs where indices exceed the declared dense shape.
  3. Harden: Isolate TF workloads per tenant using containers or VMs to prevent cross-tenant memory exposure.
  4. Detect: Alert on unexpected OOM errors or segfaults in TF processes; monitor for anomalous sparse op usage patterns in shared training environments.

What systems are affected by CVE-2021-37635?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, shared ML infrastructure, recommendation systems.

What is the CVSS score for CVE-2021-37635?

CVE-2021-37635 has a CVSS v3.1 base score of 7.1 (HIGH). The EPSS exploitation probability is 0.04%.

Technical Details

NVD Description

TensorFlow is an end-to-end open source platform for machine learning. In affected versions the implementation of sparse reduction operations in TensorFlow can trigger accesses outside of bounds of heap allocated data. The [implementation](https://github.com/tensorflow/tensorflow/blob/a1bc56203f21a5a4995311825ffaba7a670d7747/tensorflow/core/kernels/sparse_reduce_op.cc#L217-L228) fails to validate that each reduction group does not overflow and that each corresponding index does not point to outside the bounds of the input tensor. We have patched the issue in GitHub commit 87158f43f05f2720a374f3e6d22a7aaa3a33f750. The fix will be included in TensorFlow 2.6.0. We will also cherrypick this commit on TensorFlow 2.5.1, TensorFlow 2.4.3, and TensorFlow 2.3.4, as these are also affected and still in supported range.
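The fixed releases named in the description can be turned into a simple version check. The following is a minimal sketch under stated assumptions: `is_vulnerable`, `parse_version`, and the lookup table are hypothetical helpers written for this illustration; release lines older than 2.3 are assumed to be unpatched because the description lists no backport for them; and pre-release suffixes such as `rc0` are ignored, so release candidates should be reviewed manually.

```python
import re

# Patch releases named in the advisory, keyed by (major, minor) release line.
FIXED_BY_LINE = {(2, 3): (2, 3, 4), (2, 4): (2, 4, 3), (2, 5): (2, 5, 1)}
# From this release line onward the fix ships in every release.
FIRST_FIXED_LINE = (2, 6)


def parse_version(version):
    """Extract (major, minor, patch) from a string such as '2.5.0'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_vulnerable(version):
    """Return True if the version predates the fix for CVE-2021-37635."""
    parsed = parse_version(version)
    line = parsed[:2]
    if line >= FIRST_FIXED_LINE:
        return False
    fixed = FIXED_BY_LINE.get(line)
    if fixed is None:
        # Assumption: older lines received no backport and remain affected.
        return True
    return parsed < fixed
```

In an environment where TensorFlow is installed, passing `tf.__version__` to `is_vulnerable` gives a quick inventory check.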

Exploitation Scenario

An adversary with low-privilege access to a shared ML training cluster (e.g., via a compromised CI/CD service account or as a rogue insider) submits a TF training job containing deliberately crafted sparse reduction ops. The script constructs a SparseTensor with reduction group indices that overflow, causing TensorFlow to read heap memory outside the allocated buffer. The attacker captures out-of-bounds data via TF error output or a side channel, potentially recovering adjacent heap contents such as other tenants' model weights, hyperparameters, or cached authentication tokens. On single-tenant systems, the same technique achieves denial of service by crashing the training run.
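Malformed input of this kind can be rejected before it reaches the reduction kernel, in line with the workaround listed under Recommended Action. The sketch below is a defensive pre-check on plain Python sequences, written as an illustration rather than a reproduction of the upstream fix. The function name `validate_sparse_input` and its parameters are hypothetical. With TensorFlow in eager mode, the arguments could be obtained from a `tf.SparseTensor` via `sp.indices.numpy().tolist()`, `sp.values.numpy().tolist()`, and `sp.dense_shape.numpy().tolist()`.

```python
def validate_sparse_input(indices, values, dense_shape, reduction_axes=None):
    """Raise ValueError if sparse-tensor components are inconsistent.

    indices:        list of index rows, one per stored value
    values:         list of stored values
    dense_shape:    list of dimension sizes of the dense tensor
    reduction_axes: optional list of axes to reduce over
    """
    rank = len(dense_shape)
    if any(dim < 0 for dim in dense_shape):
        raise ValueError("dense_shape contains a negative dimension")
    if len(indices) != len(values):
        raise ValueError("indices and values have different lengths")
    for row, index in enumerate(indices):
        if len(index) != rank:
            raise ValueError(f"index row {row} has rank {len(index)}, expected {rank}")
        for axis, (position, dim) in enumerate(zip(index, dense_shape)):
            # Every coordinate must fall inside the declared dense shape.
            if not 0 <= position < dim:
                raise ValueError(
                    f"index row {row}, axis {axis}: {position} is outside [0, {dim})"
                )
    for axis in reduction_axes or ():
        # Negative axes count from the end, as in NumPy-style indexing.
        if not -rank <= axis < rank:
            raise ValueError(f"reduction axis {axis} is out of range for rank {rank}")
```

Calling this check before ops such as `tf.sparse.reduce_sum` adds a linear pass over the indices. It is a stopgap for hosts that cannot be upgraded yet, not a substitute for the patched releases.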

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:H
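The 7.1 base score can be reproduced from this vector with the CVSS v3.1 base score equations. The sketch below uses the metric weights from the CVSS v3.1 specification; the function names `base_score` and `roundup` are chosen here for illustration. For this vector, the impact sub-score is about 5.18 and the exploitability sub-score about 1.83, and their sum of about 7.01 rounds up to 7.1.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """Round up to one decimal place, following the v3.1 definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the CVSS v3.1 base score for a vector string."""
    parts = vector.split("/")
    if parts[0] != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    metrics = dict(part.split(":") for part in parts[1:])
    changed = metrics["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]

    # Impact sub-score from the three impact metrics.
    iss = 1 - (
        (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    )
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    exploitability = 8.22 * AV[metrics["AV"]] * AC[metrics["AC"]] * pr * UI[metrics["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:H"))  # prints 7.1
```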

Timeline

Published
August 12, 2021
Last Modified
November 21, 2024
First Seen
August 12, 2021

Related Vulnerabilities