CVE-2021-29526: TensorFlow: Conv2D divide-by-zero crashes ML workloads
MEDIUM | PoC AVAILABLE
A crafted input to TensorFlow's Conv2D operation causes a division by zero, crashing any process that runs the affected op. The result is a local denial of service against ML workloads. In shared environments (multi-tenant Jupyter servers, ML platforms, CI/CD pipelines), a low-privileged user can disrupt training jobs or inference services. If you run an affected TensorFlow version in a shared compute environment, upgrade promptly to 2.5.0 or one of the backported releases (2.4.2, 2.3.3, 2.2.3, 2.1.4).
Risk Assessment
Medium overall, but context-dependent. The local attack vector and low privilege requirement limit exposure in isolated single-user setups. Risk escalates significantly in shared ML infrastructure: collaborative Jupyter hubs, AutoML platforms, or model-serving endpoints that accept user-defined computation graphs. Impact is purely availability — no data exfiltration or code execution — but crashing a long-running training job has real operational cost. No evidence of active exploitation in the wild.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| tensorflow | pip | Releases before 2.5.0 that lack the backported fix | 2.5.0; backports 2.4.2, 2.3.3, 2.2.3, 2.1.4 |
You are affected only if your installed tensorflow version predates the fix.
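As a rough aid for checking an installed version against the fixed releases named in the advisory, the sketch below compares a version string with those releases. The function name `is_patched` and the lookup table are illustrative, not part of TensorFlow. It assumes that release lines older than 2.1 never received a backport, and it ignores pre-release suffixes such as `rc0`.

```python
import re

# First patched patch-level for each backported release line (from the advisory text).
PATCHED_BACKPORTS = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}
# Every release line from 2.5 onward ships with the fix.
FIRST_FIXED_LINE = (2, 5)


def is_patched(version: str) -> bool:
    """Return True if `version` should include the fix for CVE-2021-29526."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if (major, minor) >= FIRST_FIXED_LINE:
        return True
    # Assumption: lines without an entry (for example 2.0.x) were never patched.
    min_patch = PATCHED_BACKPORTS.get((major, minor))
    return min_patch is not None and patch >= min_patch
```

In practice the input would typically be `tensorflow.__version__` or the version reported by `pip show tensorflow`.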
Recommended Action
1) Upgrade to TensorFlow 2.5.0 or a patched backport: 2.4.2, 2.3.3, 2.2.3, or 2.1.4.
2) If you cannot patch immediately, restrict access to tf.raw_ops in multi-tenant environments by sandboxing graph execution.
3) Validate Conv2D kernel and stride parameters (ensure no zero-valued dimensions) at application boundaries before passing them to TF.
4) In shared Jupyter or ML platform deployments, audit user-submitted notebooks and graphs for calls to tf.raw_ops.
5) Monitor for unexpected TF process crashes as a low-noise detection signal.
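The parameter validation step could look roughly like the sketch below. It works on plain shape and stride sequences, so it does not depend on TensorFlow itself. The function name `check_conv2d_args` is hypothetical, and the check is deliberately conservative: it rejects every zero-valued dimension, including an empty batch, which some applications may legitimately want to allow.

```python
from typing import Sequence


def check_conv2d_args(
    input_shape: Sequence[int],
    filter_shape: Sequence[int],
    strides: Sequence[int],
) -> None:
    """Raise ValueError if a Conv2D shape or stride value is zero or negative."""
    named = (("input", input_shape), ("filter", filter_shape), ("strides", strides))
    for name, values in named:
        # Conv2D takes a rank-4 input, a rank-4 filter, and four stride values.
        if len(values) != 4:
            raise ValueError(f"{name} must have 4 entries, got {len(values)}")
        # Any zero here is a candidate divisor in the unpatched kernel.
        if any(v <= 0 for v in values):
            raise ValueError(f"{name} contains a non-positive value: {list(values)}")
```

A caller would run this at the application boundary, for example on the static shapes of the tensors it is about to pass to a convolution, and refuse the request if it raises.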
Frequently Asked Questions
What is CVE-2021-29526?
CVE-2021-29526 is a division-by-zero flaw in TensorFlow's `tf.raw_ops.Conv2D`. A crafted input crashes the process that runs the op, which gives a low-privileged local user a denial of service against shared training jobs or inference services.
Is CVE-2021-29526 actively exploited?
There is no evidence of active exploitation in the wild. Proof-of-concept exploit code is publicly available, however, which lowers the barrier to exploitation.
How to fix CVE-2021-29526?
Upgrade to TensorFlow 2.5.0 or a patched backport (2.4.2, 2.3.3, 2.2.3, or 2.1.4). If you cannot patch immediately, restrict and audit use of tf.raw_ops in multi-tenant environments, validate Conv2D kernel and stride parameters at application boundaries, and monitor for unexpected TF process crashes. The full list of steps is under Recommended Action.
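Auditing submitted code for tf.raw_ops calls can be approximated with a static scan. The sketch below uses Python's standard `ast` module to list references of the form `tf.raw_ops.<Op>`. The function name `find_raw_ops_calls` and the alias list are illustrative. Treat it as a heuristic only: it misses imports such as `from tensorflow import raw_ops` and dynamic lookups via `getattr`, and notebook files would first need their code cells extracted.

```python
import ast
from typing import List, Sequence, Tuple


def find_raw_ops_calls(
    source: str, tf_aliases: Sequence[str] = ("tf", "tensorflow")
) -> List[Tuple[int, str]]:
    """Return (line number, op name) for each `<alias>.raw_ops.<op>` reference."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Match the attribute chain <alias>.raw_ops.<op_name>.
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Attribute)
            and node.value.attr == "raw_ops"
            and isinstance(node.value.value, ast.Name)
            and node.value.value.id in tf_aliases
        ):
            hits.append((node.lineno, node.attr))
    return sorted(hits)
```

A platform could run this over each submitted script and flag any hit for review rather than blocking outright, since raw ops also have legitimate uses.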
What systems are affected by CVE-2021-29526?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, shared ML platforms, CI/CD ML pipelines.
What is the CVSS score for CVE-2021-29526?
CVE-2021-29526 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.01%.
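For readers who want to see where the 5.5 comes from, the sketch below recomputes the base score from the vector CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H using the CVSS v3.1 base-score equations. It only handles vectors with Scope: Unchanged, which is enough for this CVE, and the function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification (PR values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest number with one decimal that is >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a Scope: Unchanged CVSS v3.1 vector."""
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))
```

For this vector the impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is about 1.83, so the sum of roughly 5.43 rounds up to 5.5.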
Technical Details
NVD Description
TensorFlow is an end-to-end open source platform for machine learning. An attacker can trigger a division by 0 in `tf.raw_ops.Conv2D`. This is because the implementation (https://github.com/tensorflow/tensorflow/blob/988087bd83f144af14087fe4fecee2d250d93737/tensorflow/core/kernels/conv_ops.cc#L261-L263) does a division by a quantity that is controlled by the caller. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
Exploitation Scenario
An adversary with access to a shared ML platform (e.g., a data scientist account on a corporate Jupyter hub or an AutoML service that accepts custom TF code) submits a notebook or training script that calls tf.raw_ops.Conv2D with a zero-valued stride or filter dimension. The TF kernel divides by the attacker-controlled zero. Because the fault occurs in native code, the process terminates rather than raising a catchable Python exception, and any co-located training jobs sharing that process are aborted with it. In a continuous training pipeline with automated restarts, the adversary can repeat the request to persistently deny GPU compute resources to legitimate workloads.
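The scenario depends on untrusted code sharing a process with legitimate work. One way to limit the blast radius, sketched below under the assumption of a POSIX host, is to run each untrusted snippet in its own child interpreter and inspect how it exited. The helper name `run_isolated` is hypothetical. It isolates crashes only and is not a security sandbox on its own.

```python
import subprocess
import sys
from typing import Optional, TypedDict


class RunResult(TypedDict):
    returncode: int
    killed_by_signal: Optional[int]
    stdout: str


def run_isolated(code: str, timeout: float = 60.0) -> RunResult:
    """Run `code` in a child Python interpreter and report how it ended."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
    )
    # On POSIX, a negative return code means the child died from that signal number.
    signal_number = -proc.returncode if proc.returncode < 0 else None
    return {
        "returncode": proc.returncode,
        "killed_by_signal": signal_number,
        "stdout": proc.stdout,
    }
```

A non-empty `killed_by_signal` value is also a convenient hook for the crash monitoring suggested under Recommended Action, since the parent survives and can log which submission caused the failure.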
Weaknesses (CWE)
CWE-369: Divide By Zero
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
References
- github.com/tensorflow/tensorflow/commit/b12aa1d44352de21d1a6faaf04172d8c2508b42b Patch 3rd Party
- github.com/tensorflow/tensorflow/security/advisories/GHSA-4vf2-4xcg-65cx Exploit Patch 3rd Party
- github.com/ARPSyndicate/cvemon Exploit
Related Vulnerabilities
| CVE | CVSS | Summary | Relation |
|---|---|---|---|
| CVE-2020-15196 | 9.9 | TensorFlow: heap OOB read in sparse/ragged count ops | Same package: tensorflow |
| CVE-2020-15205 | 9.8 | TensorFlow: heap overflow in StringNGrams, ASLR bypass | Same package: tensorflow |
| CVE-2020-15208 | 9.8 | TFLite: OOB read/write via tensor dimension mismatch | Same package: tensorflow |
| CVE-2019-16778 | 9.8 | TensorFlow: heap overflow in UnsortedSegmentSum op | Same package: tensorflow |
| CVE-2022-23587 | 9.8 | TensorFlow: integer overflow in Grappler enables RCE | Same package: tensorflow |