CVE-2021-37651: TensorFlow: heap OOB r/w in FractionalAvgPoolGrad op

HIGH
Published August 12, 2021
CISO Take

A local attacker with low privileges can trigger a heap out-of-bounds read/write in TensorFlow's fractional average pooling gradient op (`tf.raw_ops.FractionalAvgPoolGrad`) by passing an empty tensor, potentially leading to arbitrary code execution on training infrastructure. Upgrade immediately to TF 2.6.0, 2.5.1, 2.4.3, or 2.3.4: any shared ML training server or Jupyter environment running an older TensorFlow is at risk from a malicious insider or a compromised user account. There is no known exploitation in the wild, but the bug is easy to trigger once an attacker has local access; turning the memory corruption into reliable code execution takes additional exploit development.

What is the risk?

The CVSS score is 7.8 (High). The local attack vector limits internet-facing exposure, but shared ML infrastructure (GPU clusters, Jupyter hubs, MLflow servers) routinely grants low-privileged shell access to multiple users, which makes this highly relevant in enterprise AI/ML environments. Low attack complexity (AC:L) and no required user interaction (UI:N) mean exploitation is straightforward once local access is obtained. No public exploit code has been confirmed, and the CVE is not in the CISA KEV catalog. Risk rises significantly on multi-tenant ML platforms.

What systems are affected?

Package | Ecosystem | Vulnerable Range | Patched
TensorFlow | pip | versions before 2.3.4; 2.4.0 through 2.4.2; 2.5.0 | 2.3.4, 2.4.3, 2.5.1, 2.6.0

Do you use TensorFlow? You're affected if your version predates the patched release for its branch (2.3.4, 2.4.3, 2.5.1, or 2.6.0).

How severe is it?

CVSS 3.1
7.8 / 10
EPSS
0.2%
chance of exploitation in 30 days
Higher than 7% of all CVEs
Exploitation Status
No known exploitation
Sophistication
Moderate

What is the attack surface?

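AV (Attack Vector): Local
AC (Attack Complexity): Low
PR (Privileges Required): Low
UI (User Interaction): None
S (Scope): Unchanged
C (Confidentiality): High
I (Integrity): High
A (Availability): High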

What should I do?

6 steps
  1. PATCH

    Upgrade TensorFlow to 2.6.0, or to the patch release for your branch (2.5.1, 2.4.3, or 2.3.4); each includes the fix from commit 0f931751.

  2. INVENTORY

    Identify all systems running TensorFlow — training servers, CI/CD pipelines, Jupyter hubs, MLflow/Kubeflow instances.

  3. ISOLATE

    Run training workloads in containers with least-privilege service accounts; do not share the host network or PID namespaces with them.

  4. VALIDATE

    Add input validation at pipeline entry points — reject empty or malformed tensors before they reach native ops.

  5. DETECT

    Monitor for abnormal process crashes (SIGSEGV, heap corruption dumps) in TF training processes as a potential exploitation indicator.

  6. VERSION PIN

    Enforce approved TF versions via dependency policies (pip constraints, conda envs, Docker base image scanning).
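
Steps 1 and 6 above can be supported by a small version check. The following is a minimal sketch, not an official tool: the names `PATCHED`, `parse_version`, and `is_vulnerable` are hypothetical, the patched versions come from the advisory, and treating branches older than 2.3 as vulnerable is an assumption (the advisory only says they are outside the supported range). Pre-release suffixes such as `rc0` are ignored.

```python
import re

# Patched releases named in the advisory, keyed by (major, minor) branch.
PATCHED = {(2, 3): (2, 3, 4), (2, 4): (2, 4, 3), (2, 5): (2, 5, 1)}
FIRST_FIXED_BRANCH = (2, 6)  # 2.6.0 and later include the fix


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from strings like '2.5.0' or '2.4.1+cu110'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_vulnerable(version: str) -> bool:
    """Return True if the version predates the fix for its branch."""
    parsed = parse_version(version)
    branch = parsed[:2]
    if branch >= FIRST_FIXED_BRANCH:
        return False
    if branch in PATCHED:
        return parsed < PATCHED[branch]
    # Assumption: branches older than 2.3 received no backport, so flag them.
    return True
```

In an inventory script, the input could come from `importlib.metadata.version("tensorflow")`, which reads the installed package version without importing TensorFlow itself.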

How is it classified?

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.10.1 - Security of AI systems
NIST AI RMF
MANAGE-2.2 - Mechanisms to sustain AI risk management
OWASP LLM Top 10
LLM08 - Excessive Agency / Vulnerable Components

Frequently Asked Questions

What is CVE-2021-37651?

CVE-2021-37651 is a heap out-of-bounds read/write in TensorFlow's `tf.raw_ops.FractionalAvgPoolGrad` op. A local attacker with low privileges can trigger it by passing an empty tensor, potentially leading to arbitrary code execution on training infrastructure. It is fixed in TensorFlow 2.6.0, 2.5.1, 2.4.3, and 2.3.4.

Is CVE-2021-37651 actively exploited?

No confirmed active exploitation of CVE-2021-37651 has been reported, but organizations should still patch proactively.

How do I fix CVE-2021-37651?

1. PATCH: Upgrade TensorFlow to 2.6.0, or to the patch release for your branch (2.5.1, 2.4.3, or 2.3.4); each includes the fix from commit 0f931751.
2. INVENTORY: Identify all systems running TensorFlow, including training servers, CI/CD pipelines, Jupyter hubs, and MLflow/Kubeflow instances.
3. ISOLATE: Run training workloads in containers with least-privilege service accounts; do not share the host network or PID namespaces with them.
4. VALIDATE: Add input validation at pipeline entry points so that empty or malformed tensors are rejected before they reach native ops.
5. DETECT: Monitor for abnormal process crashes (SIGSEGV, heap corruption dumps) in TF training processes as a potential exploitation indicator.
6. VERSION PIN: Enforce approved TF versions via dependency policies (pip constraints, conda envs, Docker base image scanning).
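
The DETECT step can be approximated by classifying how a training process exits. This is a rough sketch assuming a POSIX host, where `subprocess` reports death by signal N as return code -N; the helper names `classify_exit` and `run_and_classify` and the choice of signals are illustrative, and a crash signal is only a hint worth investigating, not proof of exploitation.

```python
import signal
import subprocess
import sys

# Signals that commonly accompany memory corruption in native code.
SUSPICIOUS = {
    getattr(signal, name)
    for name in ("SIGSEGV", "SIGABRT", "SIGBUS")
    if hasattr(signal, name)
}


def classify_exit(returncode: int) -> str:
    """Map a subprocess return code to 'ok', 'error', or 'crash:<SIGNAL>'."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        try:
            sig = signal.Signals(-returncode)
        except ValueError:
            return "error"
        if sig in SUSPICIOUS:
            return f"crash:{sig.name}"
    return "error"


def run_and_classify(argv: list) -> str:
    """Run a command to completion and classify how it exited."""
    completed = subprocess.run(argv, check=False)
    return classify_exit(completed.returncode)
```

A wrapper around the training entry point could call `run_and_classify([sys.executable, "train.py"])` (where `train.py` is a placeholder) and forward any `crash:` result to the alerting system.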

What systems are affected by CVE-2021-37651?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, MLOps platforms.

What is the CVSS score for CVE-2021-37651?

CVE-2021-37651 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.17%.

What is the AI security impact?

Affected AI Architectures

training pipelines, model serving, MLOps platforms

MITRE ATLAS Techniques

AML.T0010.001 AI Software
AML.T0011.000 Unsafe AI Artifacts
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 15
ISO 42001: A.10.1
NIST AI RMF: MANAGE-2.2
OWASP LLM Top 10: LLM08

What are the technical details?

Original Advisory

TensorFlow is an end-to-end open source platform for machine learning. In affected versions the implementation for `tf.raw_ops.FractionalAvgPoolGrad` can be tricked into accessing data outside of bounds of heap allocated buffers. The [implementation](https://github.com/tensorflow/tensorflow/blob/f24faa153ad31a4b51578f8181d3aaab77a1ddeb/tensorflow/core/kernels/fractional_avg_pool_op.cc#L205) does not validate that the input tensor is non-empty. Thus, code constructs an empty `EigenDoubleMatrixMap` and then accesses this buffer with indices that are outside of the empty area. We have patched the issue in GitHub commit 0f931751fb20f565c4e94aa6df58d54a003cdb30. The fix will be included in TensorFlow 2.6.0. We will also cherrypick this commit on TensorFlow 2.5.1, TensorFlow 2.4.3, and TensorFlow 2.3.4, as these are also affected and still in supported range.
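
Where upgrading is delayed, callers can apply the missing non-empty check themselves before invoking the op. The sketch below is a hypothetical caller-side guard, not the upstream fix: the function name `check_pool_grad_shapes` is invented for illustration, and the requirement of four positive dimensions assumes the usual `[batch, height, width, channels]` layout used by fractional pooling.

```python
from typing import Sequence


def check_pool_grad_shapes(
    orig_input_tensor_shape: Sequence[int],
    out_backprop_shape: Sequence[int],
) -> None:
    """Raise ValueError if either shape is not 4-D or describes an empty tensor."""
    for name, shape in (
        ("orig_input_tensor_shape", orig_input_tensor_shape),
        ("out_backprop", out_backprop_shape),
    ):
        dims = [int(d) for d in shape]
        if len(dims) != 4:
            raise ValueError(f"{name} must have 4 dimensions, got {len(dims)}")
        # A zero (or negative) dimension means the tensor holds no elements.
        if any(d <= 0 for d in dims):
            raise ValueError(f"{name} describes an empty tensor: {dims}")
```

The guard would run immediately before a call to `tf.raw_ops.FractionalAvgPoolGrad`, with both shapes passed as lists of integers. It reduces exposure from untrusted inputs but does not replace the patched releases.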

Exploitation Scenario

A data scientist with shared access to a team GPU training server crafts a Python script that calls `tf.raw_ops.FractionalAvgPoolGrad` with an empty input tensor. On vulnerable TensorFlow versions, the kernel constructs an empty `EigenDoubleMatrixMap` and then indexes it out of bounds. An attacker who has studied the memory layout (feasible, since TensorFlow is open source) could potentially choose tensor dimensions that turn this into a controlled write, then use heap exploitation to overwrite function pointers and execute code as the training service account. That account typically has access to model artifacts, training data, and cloud credentials stored in the environment.

Weaknesses (CWE)

CWE-787 — Out-of-bounds Write: The product writes data past the end, or before the beginning, of the intended buffer.

  • [Requirements] Use a language that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid. For example, many languages that perform their own memory management, such as Java and Perl, are not subject to buffer overflows. Other languages, such as Ada and C#, typically provide overflow protection, but the protection can be disabled by the programmer. Be wary that a language's interface to native code may still be subject to overflows, even if the language itself is theoretically safe.
  • [Architecture and Design] Use a vetted library or framework that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid. Examples include the Safe C String Library (SafeStr) by Messier and Viega [REF-57], and the Strsafe.h library from Microsoft [REF-56]. These libraries provide safer versions of overflow-prone string-handling functions.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
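
The 7.8 base score can be reproduced from this vector with the CVSS v3.1 base-score equations. The sketch below is illustrative only: it handles Scope: Unchanged vectors (the case here), and the names `WEIGHTS`, `roundup`, and `base_score` are chosen for this example rather than taken from any library.

```python
import math

# CVSS v3.1 metric weights for Scope: Unchanged.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS v3.1 vector with Scope: Unchanged."""
    prefix, *pairs = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("only CVSS:3.1 vectors are handled")
    metrics = dict(pair.split(":") for pair in pairs)
    if metrics["S"] != "U":
        raise ValueError("this sketch only covers Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H"))  # 7.8
```

For this vector the impact sub-score is about 5.87 and the exploitability sub-score about 1.83; their sum, about 7.71, rounds up to 7.8.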

Timeline

Published
August 12, 2021
Last Modified
November 21, 2024
First Seen
August 12, 2021

Related Vulnerabilities