CVE-2021-29524: TensorFlow: div-by-zero DoS in Conv2D backprop op

MEDIUM PoC AVAILABLE
Published May 14, 2021
CISO Take

A local attacker with low privileges can crash a TensorFlow process by calling the Conv2DBackpropFilter raw op with arguments that drive a caller-controlled modulus divisor to zero, causing a divide-by-zero and process termination. The primary risk is disruption of CNN training jobs in shared ML environments or multi-tenant training platforms. Patch to TF 2.5.0 / 2.4.2 / 2.3.3 / 2.2.3 / 2.1.4 immediately; no workaround exists short of blocking access to raw ops.

Risk Assessment

MEDIUM-LOW in isolated environments, MEDIUM in shared or multi-tenant ML infrastructure. The CVSS score of 5.5 reflects the local attack vector, but in practice any user who can submit ops to a TensorFlow runtime can terminate its process deterministically. There is no confidentiality or integrity impact, but the availability impact is HIGH: a single malformed op call kills the process. Risk is higher in Jupyter/notebook environments, shared HPC clusters, and MLaaS platforms where multiple teams share a TF runtime.

Affected Systems

Package      Ecosystem   Vulnerable Range   Patched
tensorflow   pip         (not listed)       2.5.0, 2.4.2, 2.3.3, 2.2.3, 2.1.4

Severity & Risk

CVSS 3.1: 5.5 / 10
EPSS: 0.01% chance of exploitation in 30 days (higher than 1% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Trivial
Exploitation Confidence: medium
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

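Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High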

Recommended Action

  1. PATCH

    Upgrade to TensorFlow 2.5.0, or apply backports 2.4.2 / 2.3.3 / 2.2.3 / 2.1.4. Commit fca9874 resolves the issue.

  2. RESTRICT

    Audit who can submit arbitrary tf.raw_ops calls in your environment — restrict to trusted identities.

  3. ISOLATE

    Run training jobs in separate containers/processes per user/team to limit blast radius.

  4. DETECT

    Monitor for unexpected TF process crashes or SIGFPE signals in training infrastructure logs.

  5. HARDEN

    Disable direct tf.raw_ops exposure in any public-facing model-serving API.
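
The PATCH step can be checked mechanically. The following is a minimal sketch, assuming plain `X.Y.Z` release strings and using only the fixed versions listed above; the names `FIXED_BACKPORTS`, `parse_release`, and `is_patched` are hypothetical helpers, not part of TensorFlow. Treating branches without a listed backport (for example 2.0.x) as unpatched is an assumption of this sketch.

```python
import re

# First fixed release per maintained branch, taken from the versions listed above.
FIXED_BACKPORTS = {
    (2, 1): (2, 1, 4),
    (2, 2): (2, 2, 3),
    (2, 3): (2, 3, 3),
    (2, 4): (2, 4, 2),
}
FIRST_FIXED_MAINLINE = (2, 5, 0)


def parse_release(version: str) -> tuple:
    """Parse a plain 'X.Y.Z' release string into a tuple of ints."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        # Assumption: pre-releases and local builds are sent to manual review.
        raise ValueError(f"unsupported version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(version: str) -> bool:
    """Return True if the release is at or above the fixed version on its branch."""
    release = parse_release(version)
    if release >= FIRST_FIXED_MAINLINE:
        return True
    fixed = FIXED_BACKPORTS.get(release[:2])
    # Assumption: branches with no listed backport are treated as unpatched.
    return fixed is not None and release >= fixed
```

In an environment with TensorFlow installed, the check could be driven by `is_patched(tf.__version__)` for release builds.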

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.2.6 - AI system robustness and security
NIST AI RMF
GOVERN 1.7 - Processes for decommissioning and patching AI systems
MANAGE 2.2 - Residual risk tracking and treatment

Frequently Asked Questions

Is CVE-2021-29524 actively exploited?

Proof-of-concept exploit code is publicly available for CVE-2021-29524, increasing the risk of exploitation.

What systems are affected by CVE-2021-29524?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, ML development environments, shared compute clusters.

What is the CVSS score for CVE-2021-29524?

CVE-2021-29524 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.01%.

Technical Details

NVD Description

TensorFlow is an end-to-end open source platform for machine learning. An attacker can trigger a division by 0 in `tf.raw_ops.Conv2DBackpropFilter`. This is because the implementation (https://github.com/tensorflow/tensorflow/blob/496c2630e51c1a478f095b084329acedb253db6b/tensorflow/core/kernels/conv_grad_shape_utils.cc#L130) does a modulus operation where the divisor is controlled by the caller. The fix will be included in TensorFlow 2.5.0. We will also cherry-pick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
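
The bug class described here, a modulus whose divisor comes straight from caller input, is usually closed by validating the divisor before the operation. The sketch below illustrates that pattern in Python; it is not the actual TensorFlow patch, and `checked_mod` is a hypothetical name. Note that Python itself raises `ZeroDivisionError` on a zero divisor, whereas native integer code typically faults at the hardware level, which is why the guard matters in the kernel.

```python
def checked_mod(dividend: int, divisor: int) -> int:
    """Return dividend % divisor, rejecting a non-positive divisor up front.

    Hypothetical helper: it shows the validate-before-modulus pattern,
    not the code used in the TensorFlow fix.
    """
    if divisor <= 0:
        # Fail with a descriptive error instead of letting the operation fault.
        raise ValueError(f"divisor must be positive, got {divisor}")
    return dividend % divisor
```

A shape check built on this would report an invalid-argument error to the caller rather than terminating the process.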

Exploitation Scenario

A malicious insider or attacker who has compromised a shared ML development environment calls tf.raw_ops.Conv2DBackpropFilter() with input shape parameters engineered to produce a zero-valued modulus divisor. The kernel performs the modulus without validation, triggers a SIGFPE/division-by-zero exception, and kills the TensorFlow process. In a shared Jupyter server or HPC training cluster, this can be used to repeatedly abort other teams' training runs — effective sabotage of ML pipelines without leaving obvious traces beyond process crash logs.
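
Because the crash surfaces as a process killed by SIGFPE, a job launcher can flag it from the exit status alone. This is a minimal sketch, assuming jobs are started as child processes on a POSIX system; `died_from_sigfpe` and `run_job` are hypothetical names. Python's `subprocess` reports a child killed by signal N as return code -N, and shells conventionally report 128 + N, so both forms are checked.

```python
import signal
import subprocess


def died_from_sigfpe(returncode: int) -> bool:
    """Return True if an exit status indicates termination by SIGFPE."""
    sigfpe = int(signal.SIGFPE)
    # -N: direct child killed by signal N (subprocess convention on POSIX).
    # 128 + N: status reported when the job was wrapped by a shell.
    return returncode in (-sigfpe, 128 + sigfpe)


def run_job(argv: list) -> tuple:
    """Run a job and return (returncode, sigfpe_flag) for alerting or logging."""
    completed = subprocess.run(argv)
    return completed.returncode, died_from_sigfpe(completed.returncode)
```

Repeated SIGFPE exits from jobs that share a runtime are the pattern worth alerting on, since a single occurrence could also be an ordinary bug.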

Weaknesses (CWE)

CWE-369: Divide By Zero

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
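
To show how this vector yields the 5.5 base score, here is a short sketch of the CVSS v3.1 base-score arithmetic. It handles only Scope Unchanged vectors, which is a simplification of this sketch; the function names are hypothetical, and the weights and rounding rule follow the CVSS v3.1 specification.

```python
import math

# Metric weights from the CVSS v3.1 specification (PR values are for Scope Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def parse_vector(vector: str) -> dict:
    """Split 'CVSS:3.1/AV:L/...' into a {metric: value} mapping."""
    prefix, *metrics = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError(f"expected a CVSS:3.1 vector, got {prefix!r}")
    return dict(metric.split(":", 1) for metric in metrics)


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a Scope Unchanged CVSS v3.1 vector."""
    m = parse_vector(vector)
    if m["S"] != "U":
        raise NotImplementedError("this sketch covers Scope Unchanged only")
    w = {name: WEIGHTS[name][m[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))
```

For this CVE the impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is 8.22 × 0.55 × 0.77 × 0.62 × 0.85 ≈ 1.83, which sum to about 5.43 and round up to 5.5.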

Timeline

Published: May 14, 2021
Last Modified: November 21, 2024
First Seen: May 14, 2021

Related Vulnerabilities