CVE-2021-37661: TensorFlow integer sign conversion

CISO Take

A local attacker with low privileges can crash TensorFlow training jobs by passing a negative value to `boosted_trees_create_quantile_stream_resource`. The value is implicitly converted to a huge unsigned integer and used as an allocation size, which crashes the process. In shared ML platforms, Jupyter environments, or CI/CD training pipelines, this can disrupt availability for all users on the system. Upgrade to TensorFlow 2.6.0, 2.5.1, 2.4.3, or 2.3.4 immediately and add input validation for numeric parameters passed to TF ops.

What is the risk?

Medium risk overall, but elevated in shared ML infrastructure contexts. CVSS 5.5 (Local/Low-complexity/Low-privilege) reflects a pure availability impact with no confidentiality or integrity exposure. The primary amplifier is multi-tenancy: in shared Jupyter hubs, MLflow experiment servers, or Kubeflow pipelines where multiple users can submit training jobs, a single malicious or misconfigured job can DoS the entire training workload. No public PoC or active exploitation evidence as of the advisory date, but the exploit is trivial to reproduce — any caller who can invoke TF ops can trigger it.

What systems are affected?

Package	Ecosystem	Vulnerable Range	Patched
TensorFlow	pip	—	2.3.4, 2.4.3, 2.5.1, 2.6.0

Do you use TensorFlow? Any release that predates the fixes (2.6.0, 2.5.1, 2.4.3, 2.3.4) is affected.

How severe is it?

CVSS 3.1

5.5 / 10

EPSS

0.2%

chance of exploitation in 30 days

Higher than 5% of all CVEs

Source: EPSS v3 — FIRST.org

Exploitation Status

No known exploitation

Sophistication

Trivial

What is the attack surface?

AV Local

AC Low

PR Low

UI None

S Unchanged

C None

I None

A High
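
The metrics above combine into the 5.5 base score through the CVSS v3.1 base-score equations. The following is a minimal sketch of that arithmetic, assuming the standard v3.1 metric weights and handling only Scope: Unchanged; the names `WEIGHTS`, `roundup`, and `base_score` are illustrative and not part of any library.

```python
import math

# CVSS v3.1 metric weights for Scope: Unchanged (assumed from the public spec).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place using the integer method from the spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS:3.1 vector with Scope: Unchanged."""
    # Skip the "CVSS:3.1" prefix and split the remaining "KEY:VALUE" pairs.
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    if parts["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")

    w = {key: WEIGHTS[key][parts[key]] for key in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


VECTOR = "CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"
print(base_score(VECTOR))  # 5.5
```

With only availability affected, the impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is about 1.83, so the sum of roughly 5.43 rounds up to 5.5.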

What should I do?

5 steps

Patch immediately

Upgrade to TensorFlow 2.6.0, or to the patched maintenance releases 2.5.1, 2.4.3, or 2.3.4. Commit 8a84f7a2 is the authoritative fix.
Input validation

Add explicit validation that `num_streams` and similar numeric arguments are non-negative before invoking boosted tree ops. Reject negative values at the application layer before they reach TF kernels.
Namespace isolation

In shared ML platforms, isolate training job processes so a crash in one job cannot affect others (e.g., per-job containers or separate worker pods in Kubernetes).
Detection

Monitor for unexpected TF process crashes in training infrastructure; alert on `std::bad_alloc` or `std::length_error` exceptions originating from boosted tree quantile ops.
Inventory

Audit all TensorFlow versions in use across training, experimentation, and CI environments — this vulnerability is present across a wide version range.
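
The input validation step above can be sketched as a small application-layer guard. This is a minimal illustration, not TensorFlow's own fix: `checked_num_streams` is a hypothetical helper name, and where it is called depends on how your code reaches the boosted tree ops.

```python
import numbers


def checked_num_streams(num_streams):
    """Return num_streams as an int, or raise ValueError if it is unsafe.

    Rejects non-integers and negative values so that they never reach the
    TensorFlow kernel, where a negative value would be converted to a very
    large unsigned size.
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(num_streams, bool) or not isinstance(num_streams, numbers.Integral):
        raise ValueError(f"num_streams must be an integer, got {num_streams!r}")
    value = int(num_streams)
    if value < 0:
        raise ValueError(f"num_streams must be non-negative, got {value}")
    return value


# Hypothetical usage: validate first, then pass the result as the num_streams
# argument of the boosted tree op (for example
# tf.raw_ops.BoostedTreesCreateQuantileStreamResource).
print(checked_num_streams(4))
try:
    checked_num_streams(-1)
except ValueError as exc:
    print("rejected:", exc)
```

A guard like this complements patching rather than replacing it, since it only covers call sites that you control.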

How is it classified?

DoS Framework

AML.T0010.001 - AI Software

AML.T0029 - Denial of AI Service

AML.T0049 - Exploit Public-Facing Application

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act

Article 15 - Accuracy, robustness, and cybersecurity

ISO 42001

A.6.1.2 - AI risk assessment

NIST AI RMF

MANAGE 2.2 - Mechanisms to address AI risks

MAP 5.1 - Likelihood and magnitude of AI risk estimation

Frequently Asked Questions

What is CVE-2021-37661?

CVE-2021-37661 is a denial-of-service vulnerability in TensorFlow's `boosted_trees_create_quantile_stream_resource` op. A negative `num_streams` value is implicitly converted to a very large unsigned integer and used as an allocation size, which crashes the process. It is fixed in TensorFlow 2.6.0, 2.5.1, 2.4.3, and 2.3.4.

Is CVE-2021-37661 actively exploited?

No confirmed active exploitation of CVE-2021-37661 has been reported, but organizations should still patch proactively.

How to fix CVE-2021-37661?

1. **Patch immediately**: Upgrade to TensorFlow 2.6.0, or to the patched maintenance releases 2.5.1, 2.4.3, or 2.3.4. Commit 8a84f7a2 is the authoritative fix.
2. **Input validation**: Add explicit validation that `num_streams` and similar numeric arguments are non-negative before invoking boosted tree ops. Reject negative values at the application layer before they reach TF kernels.
3. **Namespace isolation**: In shared ML platforms, isolate training job processes so a crash in one job cannot affect others (e.g., per-job containers or separate worker pods in Kubernetes).
4. **Detection**: Monitor for unexpected TF process crashes in training infrastructure; alert on `std::bad_alloc` or `std::length_error` exceptions originating from boosted tree quantile ops.
5. **Inventory**: Audit all TensorFlow versions in use across training, experimentation, and CI environments; this vulnerability is present across a wide version range.
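
For the inventory step, a version string can be checked against the patched releases named in this advisory. The sketch below is an assumption-laden illustration: `is_patched` is a hypothetical helper, pre-release suffixes such as `rc0` are ignored, and release lines older than 2.3 are treated as unpatched because no fix is listed for them.

```python
import re

# First patched release on each affected release line, per the advisory text.
PATCHED_ON_LINE = {(2, 3): 4, (2, 4): 3, (2, 5): 1}


def is_patched(version: str) -> bool:
    """Return True if the TensorFlow version includes the fix for CVE-2021-37661."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if not match:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())

    # 2.6.0 and every later release line ship the fix.
    if (major, minor) >= (2, 6):
        return True
    # Maintenance lines are patched from a specific patch release onward.
    if (major, minor) in PATCHED_ON_LINE:
        return patch >= PATCHED_ON_LINE[(major, minor)]
    # Assumption: older lines received no fix, so report them as unpatched.
    return False


for candidate in ("2.3.3", "2.4.3", "2.5.0", "2.6.0"):
    print(candidate, "patched" if is_patched(candidate) else "VULNERABLE")
```

In a live environment the installed version string could come from `importlib.metadata.version("tensorflow")`, run once per virtual environment or container image.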

What systems are affected by CVE-2021-37661?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, notebook environments, AutoML platforms, CI/CD ML pipelines.

What is the CVSS score for CVE-2021-37661?

CVE-2021-37661 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.15%.

What is the AI security impact?

Affected AI Architectures

training pipelines

notebook environments

AutoML platforms

CI/CD ML pipelines

MITRE ATLAS Techniques

AML.T0010.001 AI Software

AML.T0029 Denial of AI Service

AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 15

ISO 42001: A.6.1.2

NIST AI RMF: MANAGE 2.2, MAP 5.1

What are the technical details?

Original Advisory

TensorFlow is an end-to-end open source platform for machine learning. In affected versions an attacker can cause a denial of service in `boosted_trees_create_quantile_stream_resource` by using negative arguments. The [implementation](https://github.com/tensorflow/tensorflow/blob/84d053187cb80d975ef2b9684d4b61981bca0c41/tensorflow/core/kernels/boosted_trees/quantile_ops.cc#L96) does not validate that `num_streams` only contains non-negative numbers. In turn, [this results in using this value to allocate memory](https://github.com/tensorflow/tensorflow/blob/84d053187cb80d975ef2b9684d4b61981bca0c41/tensorflow/core/kernels/boosted_trees/quantiles/quantile_stream_resource.h#L31-L40). However, `reserve` receives an unsigned integer so there is an implicit conversion from a negative value to a large positive unsigned. This results in a crash from the standard library. We have patched the issue in GitHub commit 8a84f7a2b5a2b27ecf88d25bad9ac777cd2f7992. The fix will be included in TensorFlow 2.6.0. We will also cherrypick this commit on TensorFlow 2.5.1, TensorFlow 2.4.3, and TensorFlow 2.3.4, as these are also affected and still in supported range.
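
The implicit conversion described in the advisory can be reproduced with plain integer arithmetic. This sketch assumes a 64-bit unsigned size type, where conversion from a signed value is reduction modulo 2^64; `as_unsigned_64` is an illustrative helper, not a TensorFlow function.

```python
UINT64_MODULUS = 2 ** 64


def as_unsigned_64(value: int) -> int:
    """Return the value a signed 64-bit integer takes when converted to unsigned 64-bit."""
    if not -(2 ** 63) <= value < 2 ** 63:
        raise OverflowError(f"{value} does not fit in a signed 64-bit integer")
    # C and C++ define signed-to-unsigned conversion as reduction modulo 2**N.
    return value % UINT64_MODULUS


for num_streams in (4, -1, -49):
    print(num_streams, "->", as_unsigned_64(num_streams))
# -1 maps to 18446744073709551615, the largest 64-bit unsigned value.
```

Non-negative values pass through unchanged, which is why the bug only appears when the caller supplies a negative `num_streams`.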

Exploitation Scenario

An adversary with access to a shared ML training platform (e.g., a data scientist account on a JupyterHub or a CI pipeline that executes user-submitted notebooks) crafts a TensorFlow training script that calls `tf.raw_ops.BoostedTreesCreateQuantileStreamResource` with `num_streams=-1`. TensorFlow passes this value to C++ internals, where it is implicitly converted from a negative signed integer to a very large `size_t` value (18446744073709551615 on 64-bit systems). The `reserve()` call on the underlying vector then requests capacity for that many elements, far more than any system can provide, and the TF runtime crashes immediately with a standard library exception. On a shared platform, this terminates the training server process, disrupting all concurrent training jobs and denying service to legitimate users until the process restarts.
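
A crash of this kind usually leaves the exception name in the process's stderr or job log, which makes it a candidate for simple log-based detection. The sketch below scans log lines for the two exception names mentioned in this advisory; `find_crash_lines` is a hypothetical helper and the sample log text is invented for illustration, so real log formats may differ.

```python
import re

# Exception names this advisory suggests alerting on.
CRASH_PATTERN = re.compile(r"std::(bad_alloc|length_error)")


def find_crash_lines(log_lines):
    """Return (line_number, text) pairs for lines that mention a matching exception."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        if CRASH_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


# Invented sample log, for illustration only.
sample_log = [
    "INFO starting training job 17\n",
    "terminate called after throwing an instance of 'std::length_error'\n",
    "INFO worker restarted\n",
]

for number, text in find_crash_lines(sample_log):
    print(f"line {number}: {text}")
```

Matching on the exception name alone will also flag unrelated out-of-memory crashes, so an alert rule would likely need to correlate hits with jobs that use boosted tree quantile ops.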

Weaknesses (CWE)

CWE-681 Incorrect Conversion between Numeric Types (Primary)

CWE-681 — Incorrect Conversion between Numeric Types: When converting from one data type to another, such as long to integer, data can be omitted or translated in a way that produces unexpected values. If the resulting values are used in a sensitive context, then dangerous behaviors may occur.