CVE-2021-29524: TensorFlow: div-by-zero DoS in Conv2D backprop op
MEDIUM | PoC AVAILABLE
A local attacker with low privileges can crash a TensorFlow process they are able to submit ops to by passing arguments that yield a zero divisor to the Conv2DBackpropFilter raw op, causing a divide-by-zero and process termination. The primary risk is disruption of CNN training jobs in shared ML environments or multi-tenant training platforms. Patch to TF 2.5.0 / 2.4.2 / 2.3.3 / 2.2.3 / 2.1.4 immediately; no workaround exists short of blocking access to raw ops.
Risk Assessment
MEDIUM-LOW in isolated environments, MEDIUM in shared or multi-tenant ML infrastructure. CVSS 5.5 reflects local vector, but in practice any user with access to the TensorFlow runtime can terminate training jobs deterministically. No confidentiality or integrity impact, but availability impact is HIGH — a single malformed op call kills the process. Risk elevates in Jupyter/notebook environments, shared HPC clusters, and MLaaS platforms where multiple teams share a TF runtime.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| tensorflow | pip | < 2.1.4; 2.2.0 to 2.2.2; 2.3.0 to 2.3.2; 2.4.0 to 2.4.1 | 2.1.4, 2.2.3, 2.3.3, 2.4.2, 2.5.0 |
If you run tensorflow older than 2.5.0 without one of the backport releases (2.4.2, 2.3.3, 2.2.3, 2.1.4), you are affected.
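To check an installed version against the patched releases named in this advisory, a small helper along these lines may be enough. It is a minimal sketch: `is_patched` and `PATCHED_FLOOR` are hypothetical names, and the logic assumes plain `X.Y.Z` version strings. Anything with a suffix (for example a release candidate) raises `ValueError` so it can be reviewed by hand.

```python
import re

# Lowest patched patch-level for each backported minor series (from the advisory).
PATCHED_FLOOR = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}
# First minor series that ships with the fix from its initial release.
FIRST_FIXED_SERIES = (2, 5)


def parse_version(version: str) -> tuple:
    """Parse a plain 'X.Y.Z' string; reject suffixes such as 'rc0'."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unsupported version format: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(version: str) -> bool:
    """Return True if the version contains the fix for CVE-2021-29524."""
    major, minor, patch = parse_version(version)
    if (major, minor) >= FIRST_FIXED_SERIES:
        return True
    floor = PATCHED_FLOOR.get((major, minor))
    # Series with no backport (for example 2.0.x) are treated as unpatched.
    return floor is not None and patch >= floor


# Typical use, assuming TensorFlow is importable in the environment:
#   import tensorflow as tf
#   print(is_patched(tf.__version__))
```

Treating unknown series as unpatched is a deliberately conservative choice; adjust it if your environment carries vendor builds with their own backports.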
Recommended Action
1. PATCH: Upgrade to TensorFlow 2.5.0, or apply backports 2.4.2 / 2.3.3 / 2.2.3 / 2.1.4. Commit fca9874 resolves the issue.
2. RESTRICT: Audit who can submit arbitrary tf.raw_ops calls in your environment and restrict that ability to trusted identities.
3. ISOLATE: Run training jobs in separate containers/processes per user/team to limit blast radius.
4. DETECT: Monitor for unexpected TF process crashes or SIGFPE signals in training infrastructure logs.
5. HARDEN: Disable direct tf.raw_ops exposure in any public-facing model-serving API.
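The ISOLATE and DETECT steps can be combined by launching each job as its own child process and inspecting how it exited. The following is a minimal sketch, not a monitoring product: `describe_exit` and `run_isolated` are hypothetical helper names, and the signal decoding relies on the POSIX convention that `subprocess` reports a negative return code when the child is killed by a signal.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a short, log-friendly description."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative value -N means the child was killed by signal N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_isolated(cmd: list) -> str:
    """Run one job in its own process so a crash cannot take down its siblings."""
    proc = subprocess.run(cmd, check=False)
    return describe_exit(proc.returncode)


if __name__ == "__main__":
    # Hypothetical usage: wrap a training script and log the outcome.
    print(run_isolated([sys.executable, "-c", "pass"]))
```

If the job is started through a shell or a container runtime, a SIGFPE death commonly surfaces as exit status 136 (128 + 8) rather than a negative code, so alerting rules may need to match both forms.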
Frequently Asked Questions
What is CVE-2021-29524?
A local attacker with low privileges can crash a TensorFlow process they are able to submit ops to by passing arguments that yield a zero divisor to the Conv2DBackpropFilter raw op, causing a divide-by-zero and process termination. The primary risk is disruption of CNN training jobs in shared ML environments or multi-tenant training platforms. Patch to TF 2.5.0 / 2.4.2 / 2.3.3 / 2.2.3 / 2.1.4 immediately; no workaround exists short of blocking access to raw ops.
Is CVE-2021-29524 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2021-29524, increasing the risk of exploitation.
How to fix CVE-2021-29524?
1. PATCH: Upgrade to TensorFlow 2.5.0, or apply backports 2.4.2 / 2.3.3 / 2.2.3 / 2.1.4. Commit fca9874 resolves the issue.
2. RESTRICT: Audit who can submit arbitrary tf.raw_ops calls in your environment and restrict that ability to trusted identities.
3. ISOLATE: Run training jobs in separate containers/processes per user/team to limit blast radius.
4. DETECT: Monitor for unexpected TF process crashes or SIGFPE signals in training infrastructure logs.
5. HARDEN: Disable direct tf.raw_ops exposure in any public-facing model-serving API.
What systems are affected by CVE-2021-29524?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, ML development environments, shared compute clusters.
What is the CVSS score for CVE-2021-29524?
CVE-2021-29524 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.01%.
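The 5.5 base score can be reproduced from the vector string CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H using the CVSS v3.1 base-score equations. The sketch below is illustrative only: `base_score` and `roundup` are names chosen here, the weights are the published v3.1 metric values, and temporal and environmental metrics are not handled.

```python
import math

# CVSS v3.1 base metric weights.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on whether Scope is Unchanged (U) or Changed (C).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a 'CVSS:3.1/...' vector string."""
    # Skip the leading 'CVSS:3.1' element and map metric name to value.
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    scope = metrics["S"]

    iss = 1 - (
        (1 - WEIGHTS["C"][metrics["C"]])
        * (1 - WEIGHTS["I"][metrics["I"]])
        * (1 - WEIGHTS["A"][metrics["A"]])
    )
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15

    exploitability = (
        8.22
        * WEIGHTS["AV"][metrics["AV"]]
        * WEIGHTS["AC"][metrics["AC"]]
        * PR_WEIGHTS[scope][metrics["PR"]]
        * WEIGHTS["UI"][metrics["UI"]]
    )

    if impact <= 0:
        return 0.0
    if scope == "U":
        return roundup(min(impact + exploitability, 10))
    return roundup(min(1.08 * (impact + exploitability), 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 5.5
```

For this vector the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is about 1.83; their sum, roughly 5.43, rounds up to 5.5. The availability metric alone drives the impact term, which is why the score is MEDIUM despite a HIGH availability impact.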
Technical Details
NVD Description
TensorFlow is an end-to-end open source platform for machine learning. An attacker can trigger a division by 0 in `tf.raw_ops.Conv2DBackpropFilter`. This is because the implementation (https://github.com/tensorflow/tensorflow/blob/496c2630e51c1a478f095b084329acedb253db6b/tensorflow/core/kernels/conv_grad_shape_utils.cc#L130) does a modulus operation where the divisor is controlled by the caller. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
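The bug class described above, a modulus whose divisor comes straight from caller input, can be illustrated without TensorFlow. The sketch below is not the kernel code: the function names, the `InvalidArgument` class, and the choice of "input depth modulo filter depth" as the check are assumptions made for illustration. Note also that Python raises a catchable `ZeroDivisionError`, whereas the same integer operation in native code typically delivers SIGFPE and terminates the process.

```python
class InvalidArgument(ValueError):
    """Hypothetical stand-in for an 'invalid argument' status returned to the caller."""


def depth_divides_unvalidated(in_depth: int, filter_depth: int) -> bool:
    # Vulnerable pattern: the divisor is taken from caller-supplied shape data
    # and used without checking that it is non-zero.
    return in_depth % filter_depth == 0


def depth_divides_validated(in_depth: int, filter_depth: int) -> bool:
    # Hardened pattern: reject a non-positive divisor before the modulus runs.
    if filter_depth <= 0:
        raise InvalidArgument("filter depth must be strictly greater than zero")
    if in_depth % filter_depth != 0:
        raise InvalidArgument("input depth must be evenly divisible by filter depth")
    return True


if __name__ == "__main__":
    print(depth_divides_validated(6, 3))
    try:
        depth_divides_unvalidated(6, 0)
    except ZeroDivisionError as exc:
        print("unvalidated path failed:", exc)
    try:
        depth_divides_validated(6, 0)
    except InvalidArgument as exc:
        print("validated path rejected input:", exc)
```

The design point is that validation turns a process-ending fault into an ordinary error the caller can handle, which is the general shape of fix one would expect for this kind of issue.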
Exploitation Scenario
A malicious insider or attacker who has compromised a shared ML development environment calls tf.raw_ops.Conv2DBackpropFilter() with input shape parameters engineered to produce a zero-valued modulus divisor. The kernel performs the modulus without validation, triggers a SIGFPE/division-by-zero exception, and kills the TensorFlow process. In a shared Jupyter server or HPC training cluster, this can be used to repeatedly abort other teams' training runs — effective sabotage of ML pipelines without leaving obvious traces beyond process crash logs.
Weaknesses (CWE)
CWE-369: Divide By Zero
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
References
- github.com/tensorflow/tensorflow/commit/fca9874a9b42a2134f907d2fb46ab774a831404a Patch 3rd Party
- github.com/tensorflow/tensorflow/security/advisories/GHSA-r4pj-74mg-8868 Exploit Patch 3rd Party
Related Vulnerabilities
- CVE-2020-15196 (CVSS 9.9) TensorFlow: heap OOB read in sparse/ragged count ops. Same package: tensorflow
- CVE-2020-15205 (CVSS 9.8) TensorFlow: heap overflow in StringNGrams, ASLR bypass. Same package: tensorflow
- CVE-2020-15208 (CVSS 9.8) TFLite: OOB read/write via tensor dimension mismatch. Same package: tensorflow
- CVE-2019-16778 (CVSS 9.8) TensorFlow: heap overflow in UnsortedSegmentSum op. Same package: tensorflow
- CVE-2022-23587 (CVSS 9.8) TensorFlow: integer overflow in Grappler enables RCE. Same package: tensorflow