CVE-2021-29524: TensorFlow: div-by-zero DoS in Conv2D backprop op
MEDIUM | PoC AVAILABLE
A local attacker with low privileges can crash a TensorFlow process by passing inputs to the Conv2DBackpropFilter raw op that make a modulus divisor zero, causing a divide-by-zero and process termination. Primary risk is disruption of CNN training jobs in shared ML environments or multi-tenant training platforms. Patch to TF 2.5.0 / 2.4.2 / 2.3.3 / 2.2.3 / 2.1.4 immediately; no workaround exists short of blocking access to raw ops.
What is the risk?
MEDIUM-LOW in isolated environments, MEDIUM in shared or multi-tenant ML infrastructure. CVSS 5.5 reflects local vector, but in practice any user with access to the TensorFlow runtime can terminate training jobs deterministically. No confidentiality or integrity impact, but availability impact is HIGH — a single malformed op call kills the process. Risk elevates in Jupyter/notebook environments, shared HPC clusters, and MLaaS platforms where multiple teams share a TF runtime.
What systems are affected?
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| TensorFlow | pip | Supported releases before 2.1.4, 2.2.3, 2.3.3, and 2.4.2 | 2.5.0, 2.4.2, 2.3.3, 2.2.3, 2.1.4 |
Do you use TensorFlow? You're affected unless you run one of the patched releases.
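A quick way to check an installation against the patched releases is to compare its version string with the first fixed release of each line. The sketch below is illustrative only: `is_patched` and the two constants are hypothetical names, the version floors are taken from the advisory text in this document, release lines older than 2.1 are assumed to be unpatched because no backport is listed for them, and pre-release suffixes such as `rc0` are ignored.

```python
import re

# First patched release per backported release line (from the advisory text).
PATCHED_FLOOR = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}
# The fix ships in 2.5.0, so 2.5 and later lines are treated as patched.
FIRST_FIXED_LINE = (2, 5)


def is_patched(version: str) -> bool:
    """Return True if a TensorFlow version string is at or above a fixed release."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if (major, minor) >= FIRST_FIXED_LINE:
        return True
    floor = PATCHED_FLOOR.get((major, minor))
    # Assumption: lines with no listed backport (older than 2.1) are unpatched.
    return floor is not None and patch >= floor
```

In an environment where TensorFlow is installed, `is_patched(tf.__version__)` after `import tensorflow as tf` gives the answer for the running interpreter.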
What should I do?
5 steps
- PATCH: Upgrade to TensorFlow 2.5.0, or apply backports 2.4.2 / 2.3.3 / 2.2.3 / 2.1.4. Commit fca9874 resolves the issue.
- RESTRICT: Audit who can submit arbitrary tf.raw_ops calls in your environment and restrict that ability to trusted identities.
- ISOLATE: Run training jobs in separate containers/processes per user/team to limit blast radius.
- DETECT: Monitor for unexpected TF process crashes or SIGFPE signals in training infrastructure logs.
- HARDEN: Disable direct tf.raw_ops exposure in any public-facing model-serving API.
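The DETECT step can be sketched with the standard library alone. On POSIX systems, `subprocess` reports a child killed by a signal as a negative return code, so a job wrapper can flag SIGFPE explicitly. The names `describe_exit` and `run_and_report` are hypothetical, and the command in the `__main__` block is a placeholder for a real training entry point.

```python
import signal
import subprocess
import sys
from typing import Sequence


def describe_exit(returncode: int) -> str:
    """Classify a child's exit status as reported by subprocess on POSIX."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # subprocess encodes death-by-signal as the negated signal number.
        sig = -returncode
        try:
            name = signal.Signals(sig).name
        except ValueError:
            name = f"signal {sig}"
        if sig == signal.SIGFPE:
            return f"killed by {name}: possible divide-by-zero in native code"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_and_report(cmd: Sequence[str]) -> str:
    """Run a training command and describe how it ended."""
    completed = subprocess.run(list(cmd), check=False)
    return describe_exit(completed.returncode)


if __name__ == "__main__":
    # Placeholder command; substitute the real training entry point.
    print(run_and_report([sys.executable, "-c", "print('training step')"]))
```

A SIGFPE exit is only a hint: other native arithmetic faults raise the same signal, so a match should prompt a look at the job's inputs rather than be treated as proof of this CVE.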
Frequently Asked Questions
What is CVE-2021-29524?
A local attacker with low privileges can crash any TensorFlow process by passing a zero-divisor to the Conv2DBackpropFilter raw op, causing a divide-by-zero and process termination. Primary risk is disruption of CNN training jobs in shared ML environments or multi-tenant training platforms. Patch to TF 2.5.0 / 2.4.2 / 2.3.3 / 2.2.3 / 2.1.4 immediately; no workaround exists short of blocking access to raw ops.
Is CVE-2021-29524 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2021-29524, increasing the risk of exploitation.
How to fix CVE-2021-29524?
1. PATCH: Upgrade to TensorFlow 2.5.0, or apply backports 2.4.2 / 2.3.3 / 2.2.3 / 2.1.4. Commit fca9874 resolves the issue.
2. RESTRICT: Audit who can submit arbitrary tf.raw_ops calls in your environment and restrict that ability to trusted identities.
3. ISOLATE: Run training jobs in separate containers/processes per user/team to limit blast radius.
4. DETECT: Monitor for unexpected TF process crashes or SIGFPE signals in training infrastructure logs.
5. HARDEN: Disable direct tf.raw_ops exposure in any public-facing model-serving API.
What systems are affected by CVE-2021-29524?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, ML development environments, shared compute clusters.
What is the CVSS score for CVE-2021-29524?
CVE-2021-29524 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.19%.
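The 5.5 base score can be reproduced from this CVE's vector, CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H, using the CVSS v3.1 base-score equations. The sketch below is a minimal calculator, not a full implementation: it handles only scope-unchanged vectors, and the function names are illustrative.

```python
import math

# CVSS v3.1 metric weights (PR values are the scope-unchanged ones).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a scope-unchanged CVSS v3.1 vector."""
    # Skip the leading "CVSS:3.1" element and map metric names to values.
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise NotImplementedError("this sketch only covers scope-unchanged vectors")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # prints 5.5
```

Almost all of the score comes from the high availability impact (about 3.6) combined with a local, low-complexity attack path (about 1.8); confidentiality and integrity contribute nothing.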
What is the AI security impact?
MITRE ATLAS Techniques
- AML.T0010.001: AI Software
- AML.T0029: Denial of AI Service
- AML.T0049: Exploit Public-Facing Application
What are the technical details?
Original Advisory
TensorFlow is an end-to-end open source platform for machine learning. An attacker can trigger a division by 0 in `tf.raw_ops.Conv2DBackpropFilter`. This is because the implementation (https://github.com/tensorflow/tensorflow/blob/496c2630e51c1a478f095b084329acedb253db6b/tensorflow/core/kernels/conv_grad_shape_utils.cc#L130) does a modulus operation where the divisor is controlled by the caller. The fix will be included in TensorFlow 2.5.0. We will also cherry-pick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
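The general defensive pattern for this bug class (CWE-369) is to validate a caller-controlled divisor before using it. The Python sketch below illustrates that pattern only; `check_depths` and its parameter names are hypothetical and do not reproduce TensorFlow's C++ code or the exact change in the linked commit. Note one difference between the languages: Python raises a catchable `ZeroDivisionError` on modulus by zero, whereas in native C++ an integer modulus by zero is undefined behavior and typically terminates the process with SIGFPE.

```python
def check_depths(in_depth: int, filter_depth: int) -> None:
    """Validate a caller-supplied divisor before taking a modulus.

    Hypothetical helper that mirrors the general pattern, not TensorFlow's code.
    """
    # Guard first: a zero or negative divisor is rejected with a clear error
    # instead of reaching the modulus below.
    if filter_depth <= 0:
        raise ValueError("filter depth must be strictly greater than zero")
    if in_depth % filter_depth != 0:
        raise ValueError("input depth must be evenly divisible by filter depth")


if __name__ == "__main__":
    check_depths(6, 3)  # valid: 6 is divisible by 3
    try:
        check_depths(6, 0)  # rejected before any division happens
    except ValueError as err:
        print(f"rejected: {err}")
```

Returning a validation error keeps the failure inside the op's normal error path, so one malformed call fails on its own instead of taking the whole process down.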
Exploitation Scenario
A malicious insider or attacker who has compromised a shared ML development environment calls tf.raw_ops.Conv2DBackpropFilter() with input shape parameters engineered to produce a zero-valued modulus divisor. The kernel performs the modulus without validation, triggers a SIGFPE/division-by-zero exception, and kills the TensorFlow process. In a shared Jupyter server or HPC training cluster, this can be used to repeatedly abort other teams' training runs — effective sabotage of ML pipelines without leaving obvious traces beyond process crash logs.
Weaknesses (CWE)
CWE-369 — Divide By Zero: The product divides a value by zero.
Source: MITRE CWE corpus.
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
References
- github.com/tensorflow/tensorflow/commit/fca9874a9b42a2134f907d2fb46ab774a831404a (Patch, Third Party Advisory)
- github.com/tensorflow/tensorflow/security/advisories/GHSA-r4pj-74mg-8868 (Exploit, Patch, Third Party Advisory)
Related Vulnerabilities
- CVE-2020-15196 (9.9) TensorFlow: heap OOB read in sparse/ragged count ops. Same package: tensorflow
- CVE-2020-15205 (9.8) TensorFlow: heap overflow in StringNGrams, ASLR bypass. Same package: tensorflow
- CVE-2020-15208 (9.8) TFLite: OOB read/write via tensor dimension mismatch. Same package: tensorflow
- CVE-2019-16778 (9.8) TensorFlow: heap overflow in UnsortedSegmentSum op. Same package: tensorflow
- CVE-2022-23587 (9.8) TensorFlow: integer overflow in Grappler enables RCE. Same package: tensorflow