CVE-2021-37652: TensorFlow: double-free in BoostedTrees, code exec

HIGH
Published August 12, 2021
CISO Take

Upgrade TensorFlow to 2.6.0, or to one of the patched maintenance releases (2.5.1, 2.4.3, or 2.3.4), immediately if your ML pipelines use BoostedTrees ops. A local attacker with low privileges can trigger memory corruption that may lead to arbitrary code execution inside the TF runtime, which is particularly dangerous in shared GPU clusters or MLOps platforms where multiple teams share compute. No workaround exists beyond patching; prioritize any system where BoostedTrees training or inference is exposed to untrusted inputs.

Risk Assessment

Despite the local attack vector, this carries real risk in enterprise ML environments. CVSS 7.8 with low complexity and no user interaction means exploitation is straightforward once access is obtained. In shared training clusters (Kubernetes, Slurm, SageMaker multi-tenant), the effective barrier drops to any low-privilege user with pod or job access. The double-free in a reference-counted resource can corrupt heap state, which a determined attacker could use to gain code execution in the TensorFlow process and, where the isolation boundary is weak, pivot toward the host. The CVE is not in CISA KEV and there is no known active exploitation, but a double-free is a well-understood corruption primitive and a patch is available, so there is no reason to remain unpatched.

Affected Systems

Package | Ecosystem | Vulnerable Range | Patched
tensorflow | pip | before 2.3.4; 2.4.0 to 2.4.2; 2.5.0 | 2.3.4, 2.4.3, 2.5.1, 2.6.0

Do you use tensorflow? You're affected unless you run one of the patched releases.

Severity & Risk

CVSS 3.1
7.8 / 10
EPSS
0.02%
chance of exploitation in 30 days
Higher than 4% of all CVEs
Exploitation Status
No known exploitation
Sophistication
Moderate

Attack Surface

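Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): High
Integrity (I): High
Availability (A): High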

Recommended Action

6 steps
  1. Patch immediately: upgrade to TensorFlow 2.6.0+, or to a maintenance release that includes the cherry-picked fix (2.3.4, 2.4.3, or 2.5.1). If you build from source, apply commit 5ecec9c6fbdbc6be03295685190a45e7eee726ab.

  2. Inventory: identify all services, notebooks, and training jobs invoking BoostedTreesCreateEnsemble or tf.estimator.BoostedTrees APIs.

  3. Input validation: if user-supplied arguments reach BoostedTrees ops (hyperparameter tuning APIs, AutoML pipelines), add validation layers to reject malformed ensemble configurations before they reach the op.

  4. Isolation: run untrusted training workloads in isolated containers with restricted capabilities (no CAP_SYS_ADMIN, seccomp profiles).

  5. Detection: monitor for abnormal process crashes or heap corruption signals (SIGABRT, SIGSEGV) in TF serving processes—these may indicate exploitation attempts.

  6. Verify TF version in all Docker images and ML pipeline dependencies including transitive references via requirements.txt and conda environments.
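
The version checks in steps 1 and 6 can be automated. The following is a minimal sketch, assuming that version strings follow the usual `major.minor.patch` form and that the only fixed releases are the ones this advisory names (2.3.4, 2.4.3, 2.5.1, and 2.6.0 onward). The function names are illustrative and not part of any TensorFlow tooling.

```python
import re

# Fixed patch level per release series, taken from the advisory text.
FIXED_PATCH_BY_SERIES = {(2, 3): 4, (2, 4): 3, (2, 5): 1}
# Every release from this series onward is assumed to carry the fix.
FIRST_FIXED_SERIES = (2, 6)


def parse_version(version):
    """Extract (major, minor, patch) from strings such as '2.5.0' or '2.4.1+cpu'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(version):
    """Return True if the version is at or above a fixed release for its series.

    Assumption: suffixes such as 'rc0' are ignored, so release candidates
    should be reviewed by hand.
    """
    major, minor, patch = parse_version(version)
    series = (major, minor)
    if series >= FIRST_FIXED_SERIES:
        return True
    fixed_patch = FIXED_PATCH_BY_SERIES.get(series)
    # Series without a listed backport (for example 2.2.x) count as unpatched.
    return fixed_patch is not None and patch >= fixed_patch


if __name__ == "__main__":
    for candidate in ("2.3.3", "2.4.3", "2.5.0", "2.6.0"):
        print(candidate, "patched" if is_patched(candidate) else "VULNERABLE")
```

To check an installed environment, the string returned by `importlib.metadata.version("tensorflow")` can be passed to `is_patched`; pinned versions from `requirements.txt` or a conda export can be fed through the same function.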

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Art.15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.2.6 - AI system security
NIST AI RMF
GOVERN 6.2 - Policies and procedures for AI risk management
MANAGE 2.2 - Treatments, responses, and recovery plans for risks from AI systems
OWASP LLM Top 10
LLM03:2025 - Supply Chain

Frequently Asked Questions

What is CVE-2021-37652?

CVE-2021-37652 is a memory-safety flaw in TensorFlow's `tf.raw_ops.BoostedTreesCreateEnsemble` op. When initialization of the ensemble resource fails, the code decrements the resource's refcount manually, and the smart pointer that owns the resource then frees it a second time on leaving scope. The result is a double-free (NVD describes it as a use after free) that a local, low-privileged attacker can trigger with specially crafted arguments. It is fixed in TensorFlow 2.6.0, 2.5.1, 2.4.3, and 2.3.4.

Is CVE-2021-37652 actively exploited?

No confirmed active exploitation of CVE-2021-37652 has been reported, but organizations should still patch proactively.

How to fix CVE-2021-37652?

1. Patch immediately: upgrade to TensorFlow 2.6.0+, or to a maintenance release that includes the cherry-picked fix (2.3.4, 2.4.3, or 2.5.1). If you build from source, apply commit 5ecec9c6fbdbc6be03295685190a45e7eee726ab.
2. Inventory: identify all services, notebooks, and training jobs invoking BoostedTreesCreateEnsemble or tf.estimator.BoostedTrees APIs.
3. Input validation: if user-supplied arguments reach BoostedTrees ops (hyperparameter tuning APIs, AutoML pipelines), add validation layers to reject malformed ensemble configurations before they reach the op.
4. Isolation: run untrusted training workloads in isolated containers with restricted capabilities (no CAP_SYS_ADMIN, seccomp profiles).
5. Detection: monitor for abnormal process crashes or heap corruption signals (SIGABRT, SIGSEGV) in TF serving processes, since these may indicate exploitation attempts.
6. Verify the TF version in all Docker images and ML pipeline dependencies, including transitive references via requirements.txt and conda environments.
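
The inventory step can be approximated with a plain source scan. The sketch below assumes that a textual match on `BoostedTrees` is a good enough first pass and that code lives in `.py` and `.ipynb` files; both are assumptions, and the function names are hypothetical. It will miss ops reached indirectly, for example through a SavedModel.

```python
import re
from pathlib import Path

# Matches BoostedTreesCreateEnsemble, BoostedTreesClassifier, and similar names.
PATTERN = re.compile(r"BoostedTrees\w*")


def scan_text(text):
    """Return (line_number, matched_name) pairs for every BoostedTrees reference."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((line_no, match.group(0)))
    return hits


def scan_tree(root, suffixes=(".py", ".ipynb")):
    """Walk a directory and report (path, line_number, matched_name) for each hit."""
    results = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for line_no, name in scan_text(text):
            results.append((str(path), line_no, name))
    return results


if __name__ == "__main__":
    for path, line_no, name in scan_tree("."):
        print(f"{path}:{line_no}: {name}")
```

Each hit is a lead for manual review, not proof of exposure; what matters is whether untrusted input can reach the op.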

What systems are affected by CVE-2021-37652?

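The affected package is `tensorflow` (pip); the fix ships in 2.6.0, 2.5.1, 2.4.3, and 2.3.4. Within AI/ML architectures, the exposure covers training pipelines, model serving, and ML notebooks that invoke BoostedTrees ops.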

What is the CVSS score for CVE-2021-37652?

CVE-2021-37652 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.02%.

Technical Details

NVD Description

TensorFlow is an end-to-end open source platform for machine learning. In affected versions the implementation for `tf.raw_ops.BoostedTreesCreateEnsemble` can result in a use after free error if an attacker supplies specially crafted arguments. The [implementation](https://github.com/tensorflow/tensorflow/blob/f24faa153ad31a4b51578f8181d3aaab77a1ddeb/tensorflow/core/kernels/boosted_trees/resource_ops.cc#L55) uses a reference counted resource and decrements the refcount if the initialization fails, as it should. However, when the code was written, the resource was represented as a naked pointer but later refactoring has changed it to be a smart pointer. Thus, when the pointer leaves the scope, a subsequent `free`-ing of the resource occurs, but this fails to take into account that the refcount has already reached 0, thus the resource has been already freed. During this double-free process, members of the resource object are accessed for cleanup but they are invalid as the entire resource has been freed. We have patched the issue in GitHub commit 5ecec9c6fbdbc6be03295685190a45e7eee726ab. The fix will be included in TensorFlow 2.6.0. We will also cherrypick this commit on TensorFlow 2.5.1, TensorFlow 2.4.3, and TensorFlow 2.3.4, as these are also affected and still in supported range.
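
The ownership mistake described above can be illustrated with a toy model. This is a Python simulation written for this page, not TensorFlow code: `Resource`, `ScopedRef`, and `create_ensemble` are hypothetical stand-ins for the reference-counted resource, the smart pointer, and the op's creation path. It assumes that on success the op hands ownership to a resource manager, modeled here by `release()`.

```python
class Resource:
    """Toy reference-counted resource that detects a second free."""

    def __init__(self):
        self.refs = 1
        self.freed = False

    def unref(self):
        if self.freed:
            # In native code this is where freed memory would be touched.
            raise RuntimeError("double free: resource was already released")
        self.refs -= 1
        if self.refs == 0:
            self.freed = True


class ScopedRef:
    """Stand-in for a smart pointer: drops its reference when the scope ends."""

    def __init__(self, resource):
        self._resource = resource

    def release(self):
        # Give up ownership without dropping the reference.
        resource, self._resource = self._resource, None
        return resource

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        if self._resource is not None:
            self._resource.unref()
        return False


def create_ensemble(init_ok, manual_unref_on_failure):
    """Model the creation path; the manual unref is the leftover from the naked-pointer era."""
    resource = Resource()
    with ScopedRef(resource) as guard:
        if not init_ok:
            if manual_unref_on_failure:
                resource.unref()  # first free; the guard then frees again
            return None
        return guard.release()
```

Calling `create_ensemble(False, True)` raises the simulated double-free error, while `create_ensemble(False, False)` cleans up exactly once. In this model the fix amounts to dropping the manual `unref()` and letting the scoped owner do the cleanup.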

Exploitation Scenario

An adversary with access to a shared MLOps platform (e.g., a data scientist account on a multi-tenant Kubeflow or SageMaker environment) submits a training job for a BoostedTrees model with crafted initialization arguments that cause the ensemble resource initialization to fail. The double-free corrupts heap metadata in the TensorFlow process. With a prepared heap layout, which may be achievable by submitting prior jobs to groom allocations, the adversary gains write primitives and escalates to code execution within the TF runtime process. In a containerized environment, that code execution is a foothold for a container escape if it can be chained with a separate weakness; in a bare-metal shared GPU cluster, it may directly compromise the host. The attack requires no special ML knowledge beyond knowing the target uses TensorFlow BoostedTrees, which is discoverable from job logs or model registry metadata.
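
A failed or partial attempt at the scenario above may surface as an abnormal termination of the TensorFlow process. The following is a minimal sketch of flagging such exits, assuming a POSIX host where Python's `subprocess` reports death by signal N as return code -N. The helper names are illustrative.

```python
import signal
import subprocess

# Signals that commonly accompany heap corruption or invalid memory access.
CRASH_SIGNALS = {signal.SIGABRT, signal.SIGSEGV}


def classify_returncode(returncode):
    """Return the crash signal name if the exit looks like memory corruption, else None."""
    if returncode >= 0:
        return None  # normal exit or ordinary error code
    try:
        sig = signal.Signals(-returncode)
    except ValueError:
        return None  # not a signal number known on this platform
    return sig.name if sig in CRASH_SIGNALS else None


def run_and_flag(command):
    """Run a job and report the crash signal name, if any."""
    completed = subprocess.run(command, check=False)
    return classify_returncode(completed.returncode)
```

Container runtimes usually report the same events as exit codes 134 (SIGABRT) and 139 (SIGSEGV), so a check against orchestrator job status would compare against those values instead. A crash alone does not prove an attack; it is a signal to review what the job submitted.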

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
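
The 7.8 base score can be reproduced from this vector. The sketch below follows the CVSS v3.1 base-score equations and metric weights as published in the FIRST specification; it covers base metrics only, and the function names are our own.

```python
import math

AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
# Privileges Required weights depend on Scope (U = unchanged, C = changed).
PR = {"U": {"N": 0.85, "L": 0.62, "H": 0.27},
      "C": {"N": 0.85, "L": 0.68, "H": 0.5}}


def roundup(value):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    int_input = round(value * 100000)
    if int_input % 10000 == 0:
        return int_input / 100000.0
    return (math.floor(int_input / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the CVSS v3.1 base score from a vector string."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    m = dict(part.split(":") for part in parts)
    scope = m["S"]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * PR[scope][m["PR"]] * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    if scope == "U":
        return roundup(min(impact + exploitability, 10))
    return roundup(min(1.08 * (impact + exploitability), 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H"))  # 7.8
```

For this vector the impact sub-score is about 5.87 and the exploitability sub-score about 1.83; their sum of roughly 7.71 rounds up to 7.8. The local attack vector (weight 0.55) is what keeps the score below the 9.8 that the same impact would receive with a network vector and no privileges required.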

Timeline

Published
August 12, 2021
Last Modified
November 21, 2024
First Seen
August 12, 2021

Related Vulnerabilities