CVE-2021-29526: TensorFlow: Conv2D divide-by-zero crashes ML workloads

MEDIUM PoC AVAILABLE
Published May 14, 2021
CISO Take

A crafted input to TensorFlow's Conv2D operation triggers a division by zero and crashes the process running the op, which amounts to a local denial of service against ML workloads. In shared environments (multi-tenant Jupyter servers, ML platforms, CI/CD pipelines), a low-privileged user can disrupt training jobs or inference services. If you run an affected version in shared compute, upgrade to TensorFlow 2.5.0 or one of the patched backports (2.4.2, 2.3.3, 2.2.3, 2.1.4) as a priority.

Risk Assessment

Medium overall, but context-dependent. The local attack vector and low privilege requirement limit exposure in isolated single-user setups. Risk escalates significantly in shared ML infrastructure: collaborative Jupyter hubs, AutoML platforms, or model-serving endpoints that accept user-defined computation graphs. Impact is purely availability — no data exfiltration or code execution — but crashing a long-running training job has real operational cost. No evidence of active exploitation in the wild.

Affected Systems

Package      Ecosystem   Vulnerable Range   Patched
tensorflow   pip         (not listed)       2.5.0; backports 2.4.2, 2.3.3, 2.2.3, 2.1.4

You are affected if you run a tensorflow release that predates the fix (2.5.0, or the 2.4.2, 2.3.3, 2.2.3, and 2.1.4 backports).
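Whether a given install is affected comes down to its version number. A minimal sketch of that check, assuming only the patched releases named in the NVD description; `is_patched` is a hypothetical helper, not part of TensorFlow. It treats release lines older than 2.1 as unpatched because no backport is listed for them, and it rejects pre-release or dev version strings rather than guessing.

```python
import re

# First patched release on each maintained line, per the NVD description.
PATCHED_BACKPORTS = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}
FIRST_FIXED_MINOR = (2, 5)  # 2.5.0 and later ship the fix


def is_patched(version: str) -> bool:
    """Return True if `version` includes the CVE-2021-29526 fix."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        # Pre-release / dev strings are not classified by this sketch.
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if (major, minor) >= FIRST_FIXED_MINOR:
        return True
    # Older lines are patched only from their listed backport onward.
    floor = PATCHED_BACKPORTS.get((major, minor))
    return floor is not None and patch >= floor
```

In practice the input would be the value of `tensorflow.__version__` or the version reported by `pip show tensorflow`.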

Severity & Risk

CVSS 3.1
5.5 / 10
EPSS
0.01%
chance of exploitation in 30 days
Higher than 1% of all CVEs
Exploitation Status
Exploit Available
Exploitation: MEDIUM
Sophistication
Trivial
Exploitation Confidence
medium
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

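Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High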

Recommended Action

5 steps
  1. Upgrade to TensorFlow 2.5.0 or a patched backport: 2.4.2, 2.3.3, 2.2.3, or 2.1.4.
  2. If you cannot patch immediately, restrict access to tf.raw_ops in multi-tenant environments, for example by sandboxing graph execution.
  3. Validate Conv2D kernel and stride parameters (ensure no zero-valued dimensions) at application boundaries before passing them to TF.
  4. In shared Jupyter or ML platform deployments, audit user-submitted notebooks and graphs for calls to tf.raw_ops.
  5. Monitor for unexpected TF process crashes as a low-noise detection signal.
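Step 3 can be sketched as a pre-flight check on the shapes and strides an application is about to hand to Conv2D. `validate_conv2d_args` is a hypothetical helper, not a TensorFlow API. It assumes the raw-op form (4-entry input shape, filter shape, and strides) and conservatively rejects any zero-sized dimension or zero stride rather than targeting the specific divisor in `conv_ops.cc`.

```python
from typing import Optional, Sequence


def validate_conv2d_args(
    input_shape: Sequence[Optional[int]],
    filter_shape: Sequence[Optional[int]],
    strides: Sequence[Optional[int]],
) -> None:
    """Raise ValueError unless every shape and stride entry is a positive int."""
    for name, values in (
        ("input_shape", input_shape),
        ("filter_shape", filter_shape),
        ("strides", strides),
    ):
        if len(values) != 4:
            raise ValueError(f"{name} must have 4 entries, got {len(values)}")
        for index, value in enumerate(values):
            # Unknown (None) dimensions are rejected too: they cannot be
            # checked ahead of time, so this sketch errs on the safe side.
            if value is None or value <= 0:
                raise ValueError(f"{name}[{index}] must be positive, got {value!r}")
```

A call such as `validate_conv2d_args((1, 28, 28, 3), (3, 3, 3, 16), (1, 1, 1, 1))` returns silently, while an all-zero shape raises `ValueError` before anything reaches TensorFlow.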

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.9.4 - AI system security
NIST AI RMF
MANAGE 2.4 - Residual risks from third-party dependencies
OWASP LLM Top 10
LLM05 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2021-29526?

CVE-2021-29526 is a division-by-zero flaw in TensorFlow's Conv2D operation (tf.raw_ops.Conv2D). A crafted input makes the kernel divide by a caller-controlled zero, which crashes the process and denies service to any workload it hosts. The fix ships in TensorFlow 2.5.0 and in the 2.4.2, 2.3.3, 2.2.3, and 2.1.4 backports.

Is CVE-2021-29526 actively exploited?

There is no evidence of active exploitation in the wild. Proof-of-concept exploit code is publicly available for CVE-2021-29526 (indexed in trickest/cve), however, which lowers the barrier to exploitation.

How to fix CVE-2021-29526?

  1. Upgrade to TensorFlow 2.5.0 or a patched backport: 2.4.2, 2.3.3, 2.2.3, or 2.1.4.
  2. If you cannot patch immediately, restrict access to tf.raw_ops in multi-tenant environments, for example by sandboxing graph execution.
  3. Validate Conv2D kernel and stride parameters (ensure no zero-valued dimensions) at application boundaries before passing them to TF.
  4. In shared Jupyter or ML platform deployments, audit user-submitted notebooks and graphs for calls to tf.raw_ops.
  5. Monitor for unexpected TF process crashes as a low-noise detection signal.
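The audit in step 4 can be partly automated. A minimal sketch, assuming notebooks have already been exported to valid plain Python source (no cell magics); `find_raw_ops_calls` is a hypothetical helper that uses the standard-library `ast` module to flag attribute access on anything named `raw_ops`. It is a static heuristic and will miss dynamic lookups such as `getattr(tf, "raw_ops")`.

```python
import ast
from typing import List, Tuple


def find_raw_ops_calls(source: str) -> List[Tuple[int, str]]:
    """Return (line number, op name) for every `<expr>.raw_ops.<Op>` reference."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Attribute):
            continue
        # Match the outer attribute in `tf.raw_ops.Conv2D`: the object it is
        # read from is an attribute (or a bare imported name) called `raw_ops`.
        inner = node.value
        is_raw_ops = (
            isinstance(inner, ast.Attribute) and inner.attr == "raw_ops"
        ) or (isinstance(inner, ast.Name) and inner.id == "raw_ops")
        if is_raw_ops:
            hits.append((node.lineno, node.attr))
    return sorted(hits)
```

Each hit is a lead for manual review, not proof of malicious intent, since legitimate code may also call raw ops.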

What systems are affected by CVE-2021-29526?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, shared ML platforms, CI/CD ML pipelines.

What is the CVSS score for CVE-2021-29526?

CVE-2021-29526 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.01%.

Technical Details

NVD Description

TensorFlow is an end-to-end open source platform for machine learning. An attacker can trigger a division by 0 in `tf.raw_ops.Conv2D`. This is because the implementation (https://github.com/tensorflow/tensorflow/blob/988087bd83f144af14087fe4fecee2d250d93737/tensorflow/core/kernels/conv_ops.cc#L261-L263) does a division by a quantity that is controlled by the caller. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

Exploitation Scenario

An adversary with access to a shared ML platform (e.g., a data scientist account on a corporate Jupyter hub or an AutoML service that accepts custom TF code) submits a notebook or training script that calls tf.raw_ops.Conv2D with a zero-valued stride or filter dimension. The Conv2D kernel then divides by the attacker-controlled zero in native code. This is an arithmetic fault rather than a catchable Python exception, so the whole process terminates (typically with SIGFPE on Linux) and takes any co-located training jobs in that process down with it. In a continuous training pipeline with automated restarts, the adversary can resubmit the job to keep denying GPU compute to legitimate workloads.
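One way to contain this scenario is to run each untrusted job in its own child process, so a native crash ends only that job, and to treat death-by-signal as the detection signal. A minimal POSIX-oriented sketch using only the standard library; `run_isolated` is a hypothetical helper, and the child below simulates the crash by sending itself SIGFPE rather than calling TensorFlow.

```python
import signal
import subprocess
import sys
from typing import List, Optional


def run_isolated(argv: List[str], timeout: float = 60.0) -> Optional[str]:
    """Run a job in a child process; return the fatal signal name, or None."""
    result = subprocess.run(argv, timeout=timeout)
    # On POSIX, a return code of -N means the child was killed by signal N.
    if result.returncode < 0:
        return signal.Signals(-result.returncode).name
    return None


if __name__ == "__main__":
    # Stand-in for an untrusted job: the child sends itself SIGFPE, the signal
    # an integer division by zero in native code typically produces on Linux.
    crash = "import os, signal; os.kill(os.getpid(), signal.SIGFPE)"
    print(run_isolated([sys.executable, "-c", crash]))
```

Process isolation does not prevent the crash itself; it limits the blast radius to the submitting job and gives the platform a clear signal to log or rate-limit on.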

Weaknesses (CWE)

CWE-369: Divide By Zero

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
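The 5.5 base score follows from this vector through the CVSS v3.1 base-score equations. A minimal sketch restricted to Scope: Unchanged vectors (the case here), with metric weights taken from the CVSS v3.1 specification; `base_score` and `roundup` are hypothetical helper names.

```python
import math

# CVSS v3.1 metric weights (Privileges Required values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score of a Scope: Unchanged CVSS v3.1 vector."""
    # Skip the "CVSS:3.1" prefix, then map metric abbreviations to values.
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise NotImplementedError("only Scope: Unchanged is handled in this sketch")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 5.5
```

Here the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is about 1.83; their sum, roughly 5.43, rounds up to 5.5.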

Timeline

Published
May 14, 2021
Last Modified
November 21, 2024
First Seen
May 14, 2021
