CVE-2021-29577: TensorFlow: heap overflow in AvgPool3DGrad op
Severity: HIGH. PoC available.
Upgrade TensorFlow to 2.5.0 or to a patched maintenance release (2.1.4, 2.2.3, 2.3.3, or 2.4.2) immediately. This heap buffer overflow can enable local code execution within ML training and serving environments, a real threat on shared GPU clusters, Jupyter hubs, or MLOps platforms where multiple users submit workloads. Audit any multi-tenant ML infrastructure for exposure before assuming low risk.
Risk Assessment
CVSS 7.8 (High). The local attack vector, combined with low attack complexity and low privilege requirements, means any authenticated user or compromised process on shared ML infrastructure can trigger the overflow. Although the flaw is not directly exploitable over the network, real-world ML environments (Kubeflow clusters, shared Jupyter servers, TF Serving deployments) frequently expose TensorFlow ops to multiple principals, which raises effective exposure beyond what the local attack vector (AV:L) suggests. No active exploitation was known at the time of publication.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| tensorflow | pip | < 2.1.4; >= 2.2.0, < 2.2.3; >= 2.3.0, < 2.3.3; >= 2.4.0, < 2.4.2 | 2.1.4, 2.2.3, 2.3.3, 2.4.2, 2.5.0 |
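
A version check along these lines may help when triaging installations. It is a minimal sketch built on the patched releases named in the NVD description (2.1.4, 2.2.3, 2.3.3, 2.4.2, 2.5.0). The helper names `parse_version` and `is_vulnerable` are hypothetical, and treating release lines older than 2.1 as vulnerable is an assumption, since no backport is listed for them.

```python
import re
from typing import Tuple

# First fixed patch level for each maintained release line,
# taken from the patched versions listed in the NVD description.
FIXED_PATCH = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}
FIRST_FIXED_MINOR = (2, 5)  # 2.5.0 and later carry the fix


def parse_version(version: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch); pre-release suffixes such as 'rc1' are ignored."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version.strip())
    if not match:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def is_vulnerable(version: str) -> bool:
    """Return True if this TensorFlow version predates the fix for CVE-2021-29577."""
    major, minor, patch = parse_version(version)
    if (major, minor) >= FIRST_FIXED_MINOR:
        return False
    fixed = FIXED_PATCH.get((major, minor))
    if fixed is None:
        # Assumption: release lines older than 2.1 received no backport.
        return True
    return patch < fixed
```

In practice the input could come from `tf.__version__` or `importlib.metadata.version("tensorflow")`; the sketch does not inspect custom builds that cherry-picked the fix.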
Recommended Action
- Patch: Upgrade to TensorFlow 2.5.0 or to a patched maintenance release (2.1.4, 2.2.3, 2.3.3, or 2.4.2). For custom builds, cherry-pick commit 6fc9141 onto the corresponding release branch.
- Workaround: Add input validation to enforce matching first and last dimensions of orig_input_shape and grad before invoking AvgPool3DGrad.
- Isolation: Ensure ML training workloads from untrusted users run in isolated containers with dropped capabilities and no host-level privilege.
- Detection: Monitor TF worker processes for unexpected crashes or heap corruption signals; review core dumps if available.
- Inventory: Identify all TensorFlow versions in use across training, inference, and CI/CD pipelines, both containerized and bare-metal.
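The workaround step can be sketched as a pre-flight shape check that runs before the op is invoked. This is an illustration under stated assumptions, not the upstream patch: the helper name `check_avgpool3d_grad_shapes` is hypothetical, and it only enforces the rank-5 layout and the matching first and last dimensions described in the advisory.

```python
from typing import Sequence


def check_avgpool3d_grad_shapes(
    orig_input_shape: Sequence[int], grad_shape: Sequence[int]
) -> None:
    """Raise ValueError unless the shapes are safe to pass to AvgPool3DGrad.

    orig_input_shape: the five values of the original input shape.
    grad_shape: the shape of the incoming gradient tensor.
    """
    if len(orig_input_shape) != 5:
        raise ValueError(
            f"orig_input_shape must have 5 entries, got {len(orig_input_shape)}"
        )
    if len(grad_shape) != 5:
        raise ValueError(f"grad must be rank 5, got rank {len(grad_shape)}")
    if orig_input_shape[0] != grad_shape[0]:
        raise ValueError(
            "first dimension mismatch: "
            f"orig_input_shape[0]={orig_input_shape[0]}, grad[0]={grad_shape[0]}"
        )
    if orig_input_shape[-1] != grad_shape[-1]:
        raise ValueError(
            "last dimension mismatch: "
            f"orig_input_shape[4]={orig_input_shape[-1]}, grad[4]={grad_shape[-1]}"
        )


# Hypothetical call site (requires TensorFlow; not executed here):
# check_avgpool3d_grad_shapes(orig_input_shape, grad.shape)
# tf.raw_ops.AvgPool3DGrad(orig_input_shape=orig_input_shape, grad=grad,
#                          ksize=ksize, strides=strides, padding=padding)
```

A check like this only protects call sites that route through it, so it complements the patch rather than replacing it.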
Frequently Asked Questions
What is CVE-2021-29577?
CVE-2021-29577 is a heap buffer overflow in TensorFlow's tf.raw_ops.AvgPool3DGrad op. The implementation assumes that the orig_input_shape and grad tensors have matching first and last dimensions but never checks that assumption. The fix ships in TensorFlow 2.5.0 and in the patched maintenance releases 2.1.4, 2.2.3, 2.3.3, and 2.4.2.
Is CVE-2021-29577 actively exploited?
No active exploitation was known at the time of publication. However, proof-of-concept exploit code is publicly available for CVE-2021-29577, which increases the risk of exploitation.
How to fix CVE-2021-29577?
Upgrade to TensorFlow 2.5.0 or to a patched maintenance release (2.1.4, 2.2.3, 2.3.3, or 2.4.2); custom builds can cherry-pick commit 6fc9141. If you cannot upgrade immediately, apply the workaround, isolation, detection, and inventory steps listed under Recommended Action.
What systems are affected by CVE-2021-29577?
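TensorFlow installations older than 2.5.0 that are not on a patched maintenance release (2.1.4, 2.2.3, 2.3.3, or 2.4.2) are affected. In terms of AI/ML architecture patterns, this covers training pipelines, model serving, and MLOps platforms.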
What is the CVSS score for CVE-2021-29577?
CVE-2021-29577 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.01%.
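The 7.8 figure can be reproduced from the vector string listed under CVSS Vector. The following is a minimal sketch of the CVSS v3.1 base-score arithmetic. The metric weights and rounding rule come from the CVSS v3.1 specification, while the function names `roundup` and `base_score` are illustrative choices and temporal and environmental metrics are not handled.

```python
import math

# Base-metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required is weighted differently when Scope is changed (C).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place using the integer method from the spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS:3.1 vector string."""
    prefix, *metrics = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    m = dict(item.split(":") for item in metrics)
    scope = m["S"]

    exploitability = (
        8.22
        * WEIGHTS["AV"][m["AV"]]
        * WEIGHTS["AC"][m["AC"]]
        * PR_WEIGHTS[scope][m["PR"]]
        * WEIGHTS["UI"][m["UI"]]
    )
    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[m["C"]]) * (1 - cia[m["I"]]) * (1 - cia[m["A"]])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


score = base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H")
```

For this CVE the exploitability sub-score is roughly 1.83 and the impact sub-score roughly 5.87; their sum of about 7.71 rounds up to 7.8.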
Technical Details
NVD Description
TensorFlow is an end-to-end open source platform for machine learning. The implementation of `tf.raw_ops.AvgPool3DGrad` is vulnerable to a heap buffer overflow. The implementation (https://github.com/tensorflow/tensorflow/blob/d80ffba9702dc19d1fac74fc4b766b3fa1ee976b/tensorflow/core/kernels/pooling_ops_3d.cc#L376-L450) assumes that the `orig_input_shape` and `grad` tensors have similar first and last dimensions but does not check that this assumption is validated. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
Exploitation Scenario
An attacker with data scientist-level access to a shared Kubeflow cluster submits a crafted training job that calls tf.raw_ops.AvgPool3DGrad with intentionally mismatched tensor shapes, for example orig_input_shape with batch size 4 but grad with batch size 16. Because the shapes are never checked against each other, the op overflows a heap buffer and corrupts adjacent memory. With a carefully shaped payload, the attacker may achieve arbitrary code execution within the TensorFlow process, which could enable exfiltration of co-tenants' model checkpoints, training data, or environment secrets, or serve as a foothold for escaping the container to the host node.
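On platforms that accept user-submitted Python jobs, one coarse screening step is to flag source files that reference the raw op directly. The sketch below uses the standard-library `ast` module. The function name `find_raw_op_references` is hypothetical, and the check is a heuristic only: dynamic lookups such as `getattr` or calls made from compiled code would not be caught.

```python
import ast
from typing import List

TARGET_OP = "AvgPool3DGrad"


def find_raw_op_references(source: str, op_name: str = TARGET_OP) -> List[int]:
    """Return sorted line numbers where `<module>.raw_ops.<op_name>` appears.

    The source is parsed, never executed.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Attribute)
            and node.attr == op_name
            and isinstance(node.value, ast.Attribute)
            and node.value.attr == "raw_ops"
        ):
            hits.append(node.lineno)
    return sorted(hits)


# Example job source, held as a string so that nothing here imports TensorFlow.
job_source = (
    "import tensorflow as tf\n"
    "out = tf.raw_ops.AvgPool3DGrad(orig_input_shape=s, grad=g,\n"
    "                               ksize=k, strides=st, padding='VALID')\n"
)
flagged_lines = find_raw_op_references(job_source)
```

A hit is a prompt for review, not proof of malicious intent, since legitimate custom training code may also call raw ops.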
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
References
- github.com/tensorflow/tensorflow/commit/6fc9141f42f6a72180ecd24021c3e6b36165fe0d (tags: Patch, 3rd Party)
- github.com/tensorflow/tensorflow/security/advisories/GHSA-v6r6-84gr-92rm (tags: Exploit, Patch, 3rd Party)
Related Vulnerabilities
- CVE-2020-15196 (CVSS 9.9): TensorFlow: heap OOB read in sparse/ragged count ops. Same package: tensorflow
- CVE-2020-15205 (CVSS 9.8): TensorFlow: heap overflow in StringNGrams, ASLR bypass. Same package: tensorflow
- CVE-2020-15208 (CVSS 9.8): TFLite: OOB read/write via tensor dimension mismatch. Same package: tensorflow
- CVE-2019-16778 (CVSS 9.8): TensorFlow: heap overflow in UnsortedSegmentSum op. Same package: tensorflow
- CVE-2022-23587 (CVSS 9.8): TensorFlow: integer overflow in Grappler enables RCE. Same package: tensorflow