CVE-2021-29576: TensorFlow: heap buffer overflow in MaxPool3DGradGrad op
HIGH | PoC AVAILABLE
A heap buffer overflow in TensorFlow's MaxPool3DGradGrad operation can lead to arbitrary code execution by a local, low-privileged user. Shared ML training infrastructure and multi-tenant Jupyter/GPU environments carry the highest exposure. Upgrade to TensorFlow 2.5.0 or one of the patched backport releases immediately; as a compensating control, sandbox the execution of untrusted TF computation graphs.
Risk Assessment
CVSS 7.8 High with local attack vector and low privilege requirement. Real-world risk is concentrated in multi-tenant ML training environments—shared GPU clusters, internal Jupyter hubs, MLOps platforms (Kubeflow, Vertex AI Workbench). The low attack complexity once local access is obtained means a moderately skilled attacker can reliably trigger the overflow. Isolated single-user workstations carry lower urgency but still warrant patching given the C:H/I:H/A:H impact triad.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
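| tensorflow | pip | < 2.1.4; >= 2.2.0, < 2.2.3; >= 2.3.0, < 2.3.3; >= 2.4.0, < 2.4.2 | 2.1.4, 2.2.3, 2.3.3, 2.4.2, 2.5.0 |
If you run a tensorflow release older than the patched version for its release line, you are affected.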
Recommended Action
1. Upgrade TensorFlow to 2.5.0+, or apply backports: 2.4.2, 2.3.3, 2.2.3, 2.1.4 (patch commit: 63c6a29d0f2d).
2. Audit all TF versions across training servers, Docker images, and CI/CD pipelines, and pin to patched versions.
3. Restrict execution of untrusted or user-submitted TF computation graphs via containerization and seccomp/AppArmor profiles.
4. In multi-tenant ML platforms, enforce least-privilege for workload runners; avoid running training jobs as root.
5. Monitor TF workload processes for anomalous behavior (unexpected child processes, unusual memory access patterns).
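The audit step can be partly automated. The following is a minimal sketch, assuming version strings of the form `X.Y.Z` with optional suffixes; the helper names `parse_version` and `is_patched` are hypothetical, and the patched-version table is taken only from the release numbers listed above. Pre-release suffixes such as `rc1` are ignored, so treat results for release candidates with caution.

```python
from typing import Tuple

# Minimum patched patch-level per release line, from the advisory text above.
PATCHED_MIN_PATCH = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}
FIRST_FIXED_LINE = (2, 5)  # 2.5.0 and later include the fix


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'X.Y.Z' into integers, ignoring suffixes such as 'rc1' or '+cpu'."""
    parts = []
    for chunk in version.split(".")[:3]:
        digits = ""
        for ch in chunk:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return parts[0], parts[1], parts[2]


def is_patched(version: str) -> bool:
    """Return True if the version is at or above the fix for its release line."""
    major, minor, patch = parse_version(version)
    if (major, minor) >= FIRST_FIXED_LINE:
        return True
    minimum = PATCHED_MIN_PATCH.get((major, minor))
    # Release lines with no listed backport (e.g. 2.0.x, 1.x) count as vulnerable.
    return minimum is not None and patch >= minimum
```

On a given host or image, the installed version string can be read with `importlib.metadata.version("tensorflow")` and passed to `is_patched`.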
Frequently Asked Questions
What is CVE-2021-29576?
CVE-2021-29576 is a heap buffer overflow in TensorFlow's `tf.raw_ops.MaxPool3DGradGrad` operation. The kernel does not check whether `Pool3dParameters` was initialized successfully, so a failed validation leaves the parameters holding invalid data, and the computation that follows can overflow a heap buffer. The issue is fixed in TensorFlow 2.5.0 and in the 2.4.2, 2.3.3, 2.2.3, and 2.1.4 backports.
Is CVE-2021-29576 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2021-29576, increasing the risk of exploitation.
How to fix CVE-2021-29576?
1. Upgrade TensorFlow to 2.5.0+, or apply backports: 2.4.2, 2.3.3, 2.2.3, 2.1.4 (patch commit: 63c6a29d0f2d).
2. Audit all TF versions across training servers, Docker images, and CI/CD pipelines, and pin to patched versions.
3. Restrict execution of untrusted or user-submitted TF computation graphs via containerization and seccomp/AppArmor profiles.
4. In multi-tenant ML platforms, enforce least-privilege for workload runners; avoid running training jobs as root.
5. Monitor TF workload processes for anomalous behavior (unexpected child processes, unusual memory access patterns).
What systems are affected by CVE-2021-29576?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, shared ML platforms.
What is the CVSS score for CVE-2021-29576?
CVE-2021-29576 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.01%.
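For readers who want to see where 7.8 comes from, the sketch below recomputes the base score from this CVE's vector, `CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H`. The metric weights and the rounding rule follow the CVSS v3.1 specification as best understood here; the function names are hypothetical, and only the Scope: Unchanged case is handled.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged values for PR).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place using the spec's integer-based rule."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a Scope: Unchanged vector."""
    fields = dict(item.split(":") for item in vector.split("/")[1:])
    if fields["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][fields[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))
```

With this vector, the impact term is about 5.87 and the exploitability term about 1.83; their sum rounds up to 7.8. The local attack vector is what keeps the score below the 9.8 that the same impact would receive over the network.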
Technical Details
NVD Description
TensorFlow is an end-to-end open source platform for machine learning. The implementation of `tf.raw_ops.MaxPool3DGradGrad` is vulnerable to a heap buffer overflow. The implementation (https://github.com/tensorflow/tensorflow/blob/596c05a159b6fbb9e39ca10b3f7753b7244fa1e9/tensorflow/core/kernels/pooling_ops_3d.cc#L694-L696) does not check that the initialization of `Pool3dParameters` completes successfully. Since the constructor (https://github.com/tensorflow/tensorflow/blob/596c05a159b6fbb9e39ca10b3f7753b7244fa1e9/tensorflow/core/kernels/pooling_ops_3d.cc#L48-L88) uses `OP_REQUIRES` to validate conditions, the first assertion that fails interrupts the initialization of `params`, making it contain invalid data. In turn, this might cause a heap buffer overflow, depending on default initialized values. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
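The failure pattern described above can be illustrated with a small Python analogy. This is not TensorFlow code: `OpContext`, `PoolParams`, and `compute` are hypothetical names, and the placeholder defaults stand in for whatever a C++ struct holds before its constructor finishes. The sketch only shows why a caller must check the recorded status after constructing a parameter object whose validation exits early.

```python
from typing import Optional, Sequence


class OpContext:
    """Hypothetical stand-in for a kernel context that records the first error."""

    def __init__(self) -> None:
        self.error: Optional[str] = None

    def ok(self) -> bool:
        return self.error is None

    def fail(self, message: str) -> None:
        if self.error is None:
            self.error = message


class PoolParams:
    """Toy parameter object whose constructor can stop halfway."""

    def __init__(self, ctx: OpContext, shape: Sequence[int], window: int) -> None:
        # Placeholder defaults model the struct's contents before
        # initialization completes.
        self.planes = 0
        self.out_planes = 0
        if len(shape) != 5:
            ctx.fail("input must be 5-dimensional")
            return  # early exit, like a failed validation macro
        if window <= 0:
            ctx.fail("window must be positive")
            return
        self.planes = shape[1]
        self.out_planes = shape[1] // window


def compute(ctx: OpContext, shape: Sequence[int], window: int,
            check_status: bool) -> Optional[int]:
    params = PoolParams(ctx, shape, window)
    if check_status and not ctx.ok():
        return None  # stop before using half-initialized params
    # Without the check, execution reaches this line even after a failure.
    return params.out_planes
```

Calling `compute` with a four-dimensional shape and `check_status=False` returns the placeholder value even though the context already holds an error. In C++ the analogous fields may hold arbitrary data, which is what makes the out-of-bounds access possible.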
Exploitation Scenario
An attacker with shell access on a shared GPU training server (e.g., a compromised data scientist account) crafts a Python script calling tf.raw_ops.MaxPool3DGradGrad with parameters designed to fail Pool3dParameters initialization. The constructor's OP_REQUIRES check aborts initialization, leaving the params struct containing invalid data. When the op proceeds with corrupted params, a heap buffer overflow occurs—giving the attacker the opportunity to overwrite heap metadata and achieve code execution under the TF process owner. In common MLOps environments where training jobs run as privileged service accounts or inside containers with host mounts, this can escalate to full host or cluster compromise.
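On platforms that accept user-submitted scripts, one inexpensive screening step is to flag direct references to the affected op before a job is scheduled. The sketch below uses Python's standard `ast` module; the function name `find_raw_op_references` and the `WATCHED_OPS` set are hypothetical. It is a heuristic only: dynamic lookups such as `getattr` or serialized graphs bypass it, so it complements patching and sandboxing rather than replacing them.

```python
import ast
from typing import List, Tuple

# Op names to flag; extend as needed for other advisories.
WATCHED_OPS = {"MaxPool3DGradGrad"}


def find_raw_op_references(source: str) -> List[Tuple[int, str]]:
    """Return (line, op name) pairs for attribute accesses naming a watched op."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Attribute) and node.attr in WATCHED_OPS:
            hits.append((node.lineno, node.attr))
    return sorted(hits)
```

A job submission hook could call this on each uploaded script and route any hits to manual review.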
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
References
- github.com/tensorflow/tensorflow/commit/63c6a29d0f2d692b247f7bf81f8732d6442fad09 Patch 3rd Party
- github.com/tensorflow/tensorflow/security/advisories/GHSA-7cqx-92hp-x6wh Exploit Patch 3rd Party
Related Vulnerabilities
- CVE-2020-15196 9.9 TensorFlow: heap OOB read in sparse/ragged count ops (same package: tensorflow)
- CVE-2020-15205 9.8 TensorFlow: heap overflow in StringNGrams, ASLR bypass (same package: tensorflow)
- CVE-2020-15208 9.8 TFLite: OOB read/write via tensor dimension mismatch (same package: tensorflow)
- CVE-2019-16778 9.8 TensorFlow: heap overflow in UnsortedSegmentSum op (same package: tensorflow)
- CVE-2022-23587 9.8 TensorFlow: integer overflow in Grappler enables RCE (same package: tensorflow)