CVE-2021-37640: TensorFlow: SparseReshape div-by-zero crashes ML pipelines

MEDIUM
Published August 12, 2021
CISO Take

A local attacker with low privileges can crash a TensorFlow 2.5 process (releases before 2.5.1) that invokes `tf.raw_ops.SparseReshape` on a sparse tensor with at least one index and a zero-dimension input or target shape. The resulting integer division by zero terminates the process. Upgrade to TensorFlow 2.5.1 or 2.6.0 if you run on-prem training or serving infrastructure. Risk is limited to availability: there is no data exfiltration or code execution vector.

Risk Assessment

Medium severity in isolation, but contextually elevated for organizations running TensorFlow in shared training clusters or multi-tenant ML serving environments where a malicious tenant or malformed input dataset could trigger the crash repeatedly. CVSS 5.5 (local) reflects the prerequisite of local access, but in containerized ML pipelines, 'local' often means any user who can submit a training job. No evidence of active exploitation. Patched since August 2021.

Affected Systems

| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| tensorflow | pip | 2.5.x before 2.5.1 | 2.5.1, 2.6.0 |

You are affected if you run a TensorFlow 2.5 release earlier than 2.5.1.
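A quick way to triage an environment is to compare the installed version string against the fixed releases named in this advisory. The sketch below is illustrative only: `parse_version` and `is_affected` are hypothetical helper names, and the check assumes that only the 2.5 line is in scope, as stated above.

```python
import re

# Fixed release for the 2.5 line, taken from the advisory text.
# 2.6.0 and later already contain the fix.
FIXED_IN_2_5 = (2, 5, 1)


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from strings such as '2.5.0' or '2.5.0rc1'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version: str) -> bool:
    """Return True for 2.5 releases older than 2.5.1 (assumption: no other line is in scope)."""
    return (2, 5, 0) <= parse_version(version) < FIXED_IN_2_5


for candidate in ("2.5.0", "2.5.1", "2.6.0"):
    print(candidate, "affected" if is_affected(candidate) else "not affected")
```

In a live environment the same function could be called with `tf.__version__`. Pre-release suffixes are ignored here, so a 2.5.0 release candidate is treated like 2.5.0.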

Severity & Risk

CVSS 3.1: 5.5 / 10
EPSS: 0.03% chance of exploitation in 30 days (higher than 10% of all CVEs)
Exploitation Status: No known exploitation
Sophistication: Trivial

Attack Surface

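Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High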

Recommended Action

  1. Upgrade to TensorFlow 2.5.1 (if you must stay on the 2.5 line) or to 2.6.0 or later. The fix is patch commit 4923de56ec94.

  2. If immediate patching is not possible, validate inputs before they reach TF ops: reject `SparseReshape` calls that carry at least one index while the input shape or the target shape contains a zero dimension.

  3. In TF Serving: audit served model graphs for SparseReshape ops and consider input shape validation at the serving layer.

  4. Monitor ML worker process crash rates — repeated unexpected terminations may indicate exploitation attempts.

  5. In multi-tenant training clusters, isolate user workloads to prevent one tenant's crash from affecting others.
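
Step 2 can be sketched as a small pre-check that runs before any arguments are handed to `tf.raw_ops.SparseReshape`. This is a minimal illustration, not the upstream fix: `check_sparse_reshape_args` is a hypothetical helper, and it is deliberately stricter than necessary, rejecting any zero dimension whenever at least one index is present.

```python
from typing import Sequence


def check_sparse_reshape_args(num_indices: int,
                              input_shape: Sequence[int],
                              new_shape: Sequence[int]) -> None:
    """Raise ValueError for the argument combinations described in this advisory.

    Hypothetical guard: call it before passing the same arguments to
    tf.raw_ops.SparseReshape on an unpatched TensorFlow build.
    """
    if num_indices == 0:
        # Per the NVD description, the reshaping functor only runs
        # when the input holds at least one index.
        return
    if any(dim == 0 for dim in input_shape):
        raise ValueError(f"input_shape {list(input_shape)} has a zero dimension")
    if any(dim == 0 for dim in new_shape):
        raise ValueError(f"new_shape {list(new_shape)} has a zero dimension")


# A well-formed request passes silently; -1 (an inferred dimension) is allowed.
check_sparse_reshape_args(1, [2, 3], [-1, 2])

try:
    check_sparse_reshape_args(1, [1, 1, 0], [1, 0])
except ValueError as exc:
    print("rejected:", exc)
```

Rejecting at the application boundary turns a process crash into an ordinary error that the caller can log and handle.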

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Annex I - 1.3 - Robustness, accuracy, and cybersecurity requirements for high-risk AI
ISO 42001
8.4 - AI system operation and monitoring
NIST AI RMF
GV-1.4 - Organizational teams commit to governance and risk management for AI
MS-2.5 - Manage AI risks with appropriate response and recovery

Frequently Asked Questions

What is CVE-2021-37640?

CVE-2021-37640 is a division-by-zero flaw in TensorFlow's implementation of `tf.raw_ops.SparseReshape`. A local attacker with low privileges can crash a TensorFlow 2.5 process (releases before 2.5.1) by calling the op on a sparse tensor with at least one index and a zero-dimension input or target shape. The impact is limited to availability: there is no data exfiltration or code execution vector. The issue is fixed in TensorFlow 2.5.1 and 2.6.0.

Is CVE-2021-37640 actively exploited?

No confirmed active exploitation of CVE-2021-37640 has been reported, but organizations should still patch proactively.

How to fix CVE-2021-37640?

  1. Upgrade to TensorFlow 2.5.1 (if you must stay on the 2.5 line) or to 2.6.0 or later. The fix is patch commit 4923de56ec94.

  2. If immediate patching is not possible, add input validation to reject sparse tensors with zero-dimension input or target shapes before they reach TF ops.

  3. In TF Serving: audit served model graphs for SparseReshape ops and consider input shape validation at the serving layer.

  4. Monitor ML worker process crash rates, since repeated unexpected terminations may indicate exploitation attempts.

  5. In multi-tenant training clusters, isolate user workloads to prevent one tenant's crash from affecting others.
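
The graph audit in step 3 amounts to scanning node definitions for the `SparseReshape` op type. The sketch below assumes node objects that expose `.op` and `.name`, which is how `NodeDef` entries in a TensorFlow `GraphDef` are laid out. `find_sparse_reshape_nodes` is a hypothetical helper, and the demo uses stand-in objects so that it runs without TensorFlow installed.

```python
from types import SimpleNamespace
from typing import Iterable, List


def find_sparse_reshape_nodes(nodes: Iterable) -> List[str]:
    """Return the names of all nodes whose op type is SparseReshape.

    `nodes` can be any iterable of objects with `.op` and `.name`
    attributes, for example the entries of `graph_def.node`.
    """
    return [node.name for node in nodes if node.op == "SparseReshape"]


# Stand-in nodes (hypothetical names) in place of a real GraphDef.
demo_nodes = [
    SimpleNamespace(name="embedding/ids", op="Placeholder"),
    SimpleNamespace(name="embedding/reshape", op="SparseReshape"),
    SimpleNamespace(name="logits", op="MatMul"),
]

print(find_sparse_reshape_nodes(demo_nodes))
```

With a real model, the same function would be applied to `graph_def.node` and to the `node_def` list of every function in `graph_def.library.function`, because TensorFlow 2 SavedModels keep most ops inside function definitions.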

What systems are affected by CVE-2021-37640?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, data preprocessing pipelines.

What is the CVSS score for CVE-2021-37640?

CVE-2021-37640 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.03%.

Technical Details

NVD Description

TensorFlow is an end-to-end open source platform for machine learning. In affected versions the implementation of `tf.raw_ops.SparseReshape` can be made to trigger an integral division by 0 exception. The [implementation](https://github.com/tensorflow/tensorflow/blob/8d72537c6abf5a44103b57b9c2e22c14f5f49698/tensorflow/core/kernels/reshape_util.cc#L176-L181) calls the reshaping functor whenever there is at least an index in the input but does not check that shape of the input or the target shape have both a non-zero number of elements. The [reshape functor](https://github.com/tensorflow/tensorflow/blob/8d72537c6abf5a44103b57b9c2e22c14f5f49698/tensorflow/core/kernels/reshape_util.cc#L40-L78) blindly divides by the dimensions of the target shape. Hence, if this is not checked, code will result in a division by 0. We have patched the issue in GitHub commit 4923de56ec94fff7770df259ab7f2288a74feb41. The fix will be included in TensorFlow 2.6.0. We will also cherrypick this commit on TensorFlow 2.5.1 as this is the other affected version.
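
To make the failure mode concrete, the following is a simplified Python model of the index arithmetic the description refers to. It is not the TensorFlow source. The row-major stride computation and the names `row_major_strides` and `reshape_index` are assumptions made for illustration. Each sparse index is flattened using the input strides and then unflattened by dividing by strides derived from the target shape, so a zero dimension in the target shape can produce a zero divisor.

```python
from typing import List, Sequence


def row_major_strides(shape: Sequence[int]) -> List[int]:
    """Stride of dimension d is the product of all later dimension sizes."""
    strides = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    return strides


def reshape_index(index: Sequence[int],
                  input_shape: Sequence[int],
                  new_shape: Sequence[int]) -> List[int]:
    """Map one sparse index from input_shape coordinates to new_shape coordinates."""
    flat = sum(i * s for i, s in zip(index, row_major_strides(input_shape)))
    out = []
    for stride in row_major_strides(new_shape):
        out.append(flat // stride)  # fails when a later target dimension is 0
        flat %= stride
    return out


# Well-formed case: a 2x3 tensor reshaped to 3x2.
print(reshape_index([1, 2], [2, 3], [3, 2]))

# Zero-dimension case: the stride for the first target dimension is 0.
try:
    reshape_index([0, 0, 0], [1, 1, 0], [1, 0])
except ZeroDivisionError as exc:
    print("division by zero:", exc)
```

Python raises a catchable `ZeroDivisionError` here. In native code, an integer division by zero typically raises a fatal signal instead, which is why the unpatched kernel takes down the whole process.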

Exploitation Scenario

An adversary with access to a shared Jupyter or ML training environment submits a training job that constructs a sparse tensor with at least one index and calls `tf.raw_ops.SparseReshape` with a target shape that contains a zero dimension, such as `[N, 0]`. The TensorFlow process hits an unhandled integer division by zero in `reshape_util.cc` and crashes. In a Kubernetes ML cluster without job isolation, this can disrupt co-located training jobs. Against a TF Serving endpoint, if a deployed model's computation graph internally routes through SparseReshape (e.g., a sparse embedding lookup), crafted inference requests with adversarial tensor shapes could repeatedly crash the serving pod, achieving denial of service against the inference API.
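
Crashes of this kind can be told apart from ordinary failures by inspecting how a worker process ended. The sketch below assumes a POSIX host, where Python's `subprocess` reports death by signal as a negative return code. It further assumes that the division by zero surfaces as `SIGFPE`, as integer division by zero typically does on x86 Linux. `describe_exit` and `run_worker` are hypothetical helper names.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Translate a subprocess return code into a short label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative value is the negated number of the fatal signal.
        try:
            return f"killed by {signal.Signals(-returncode).name}"
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"


def run_worker(argv) -> str:
    """Run a worker command and report how it ended."""
    completed = subprocess.run(argv)
    return describe_exit(completed.returncode)


# Demo with a trivial child process that exits cleanly.
print(run_worker([sys.executable, "-c", "pass"]))
```

A supervisor could count "killed by SIGFPE" results per tenant or per input source and raise an alert when they repeat. Shells and container runtimes usually report the same event as exit status 128 plus the signal number instead of a negative value.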

Weaknesses (CWE)

CWE-369: Divide By Zero

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
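
The 5.5 base score can be reproduced from this vector with the CVSS v3.1 base-score equations. The sketch below is a partial calculator that only handles Scope: Unchanged, which is all this vector needs. The metric weights are the values published in the CVSS v3.1 specification, and `base_score` and `roundup` are local helper names.

```python
import math

# CVSS v3.1 metric weights (Privileges Required values are for Scope: Unchanged).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the v3.1 specification's method."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS:3.1 vector with Scope: Unchanged."""
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    if parts["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    iss = 1 - (1 - CIA[parts["C"]]) * (1 - CIA[parts["I"]]) * (1 - CIA[parts["A"]])
    impact = 6.42 * iss
    exploitability = (8.22 * AV[parts["AV"]] * AC[parts["AC"]]
                      * PR[parts["PR"]] * UI[parts["UI"]])
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))
```

For this vector the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is 8.22 × 0.55 × 0.77 × 0.62 × 0.85 ≈ 1.83. Their sum, about 5.43, rounds up to 5.5.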

Timeline

Published: August 12, 2021
Last Modified: November 21, 2024
First Seen: August 12, 2021
