CVE-2021-29569: TensorFlow: OOB heap read in MaxPoolGradWithArgmax op
HIGH | PoC AVAILABLE
A heap out-of-bounds read in TensorFlow's MaxPoolGradWithArgmax op allows any local user with low privileges to leak heap memory or crash the TF runtime by passing empty tensors. In shared ML environments (multi-user Jupyter servers, training clusters, model serving endpoints), any tenant who can invoke TF ops can trigger it. Upgrade immediately to TF 2.5.0 or to a backported release (2.1.4, 2.2.3, 2.3.3, or 2.4.2); the only workaround is input validation at the application layer.
What is the risk?
Moderate in isolated single-user environments; elevated in shared ML infrastructure. The local attack vector limits internet-facing exposure, but multi-tenant GPU servers, JupyterHub deployments, and MLOps platforms running shared TF sessions significantly widen the blast radius. The CVSS v3.1 base score of 7.1 reflects high confidentiality impact (heap data leakage) and high availability impact (process crash). The CVE is not in CISA KEV and no weaponized exploit has been observed, but the primitive is trivial to construct: any user who can call TF ops can trigger it.
What systems are affected?
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
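| TensorFlow | pip | < 2.1.4; 2.2.0 to 2.2.2; 2.3.0 to 2.3.2; 2.4.0 to 2.4.1 | 2.1.4, 2.2.3, 2.3.3, 2.4.2, 2.5.0 |
If you run a TensorFlow release in the vulnerable range, you are affected.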
What should I do?
- Patch: Upgrade to TensorFlow 2.5.0 or to a backported release (2.4.2, 2.3.3, 2.2.3, or 2.1.4); the fix is commit ef0c008ee84bad91ec6725ddc42091e19a30cf0e.
- Input validation: Enforce tensor shape and element-count checks at API boundaries before ops execute; reject empty tensors for ops that require at least one element.
- Network segmentation: If using TF Serving, restrict access to trusted networks; do not expose raw-op endpoints publicly.
- Isolation: Run training jobs in dedicated containers or VMs per user or team to contain the blast radius on shared infrastructure.
- Detection: Alert on unexpected SIGSEGV or other crashes of TF worker processes; investigate anomalous crash dumps from training jobs.
- Audit: Inventory all TF versions deployed across training, serving, and notebook infrastructure; shadow AI deployments are a common blind spot.
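The patch and audit steps can be sketched as a small version check. This is a minimal sketch, not an official tool: `FIRST_FIXED`, `parse_version`, and `is_patched` are hypothetical names, the fixed releases are the ones named in the patch step, and treating series older than 2.1 as unpatched is an assumption.

```python
import re

# First fixed patch release per minor series, taken from the patch step.
# Series older than 2.1 are not listed there, so this sketch treats them
# as unpatched (an assumption).
FIRST_FIXED = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}


def parse_version(version: str) -> tuple:
    """Parse a plain 'X.Y.Z' string; pre-release suffixes are rejected."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(version: str) -> bool:
    """Return True if the release is 2.5.0+ or a listed backport release."""
    major, minor, patch = parse_version(version)
    if (major, minor) >= (2, 5):
        return True
    first_fixed = FIRST_FIXED.get((major, minor))
    return first_fixed is not None and patch >= first_fixed


if __name__ == "__main__":
    for candidate in ("2.4.1", "2.4.2", "2.5.0"):
        print(candidate, "patched" if is_patched(candidate) else "VULNERABLE")
```

In an inventory script, the version string could come from `tensorflow.__version__` or `importlib.metadata.version("tensorflow")` in each environment being audited.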
Frequently Asked Questions
What is CVE-2021-29569?
A heap out-of-bounds read in TensorFlow's MaxPoolGradWithArgmax op allows any local user with low privileges to leak heap memory or crash the TF runtime by passing empty tensors. In shared ML environments — multi-user Jupyter servers, training clusters, model serving endpoints — this is trivially exploitable by any tenant. Patch immediately to TF 2.5.0 or the backported fixes in 2.1.4–2.4.2; there is no workaround short of input validation at the application layer.
Is CVE-2021-29569 actively exploited?
No active exploitation has been reported, and the CVE is not listed in CISA KEV. However, proof-of-concept exploit code is publicly available for CVE-2021-29569, which increases the risk of exploitation.
How to fix CVE-2021-29569?
1. Patch: Upgrade to TensorFlow 2.5.0 or to a backported release (2.4.2, 2.3.3, 2.2.3, or 2.1.4); the fix is commit ef0c008ee84bad91ec6725ddc42091e19a30cf0e.
2. Input validation: Enforce tensor shape and element-count checks at API boundaries before ops execute; reject empty tensors for ops that require at least one element.
3. Network segmentation: If using TF Serving, restrict access to trusted networks; do not expose raw-op endpoints publicly.
4. Isolation: Run training jobs in dedicated containers or VMs per user or team to contain the blast radius on shared infrastructure.
5. Detection: Alert on unexpected SIGSEGV or other crashes of TF worker processes; investigate anomalous crash dumps from training jobs.
6. Audit: Inventory all TF versions deployed across training, serving, and notebook infrastructure; shadow AI deployments are a common blind spot.
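The detection step could be approximated with a thin wrapper around the worker process. The following is a minimal sketch that assumes a POSIX host, where a negative return code from `subprocess` means the child was terminated by a signal; `run_and_classify` and `FAULT_SIGNALS` are hypothetical names, and the choice of which signals raise an alert is an assumption.

```python
import signal
import subprocess
import sys

# Signals that commonly indicate a memory-safety fault (assumed alert policy).
FAULT_SIGNALS = {"SIGSEGV", "SIGBUS", "SIGABRT"}


def run_and_classify(cmd: list) -> dict:
    """Run a worker command and report whether it died from a fault signal."""
    proc = subprocess.run(cmd, capture_output=True)
    result = {"returncode": proc.returncode, "signal": None, "alert": False}
    if proc.returncode < 0:
        # On POSIX, a return code of -N means "terminated by signal N".
        try:
            result["signal"] = signal.Signals(-proc.returncode).name
        except ValueError:
            result["signal"] = f"signal {-proc.returncode}"
        result["alert"] = result["signal"] in FAULT_SIGNALS
    return result


if __name__ == "__main__":
    # Stand-in for a TF worker: a child process that segfaults on purpose.
    demo = [sys.executable, "-c",
            "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
    print(run_and_classify(demo))
```

A real deployment would more likely hook this into the job scheduler or container runtime's exit-status reporting; the wrapper only illustrates which exit conditions are worth alerting on.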
What systems are affected by CVE-2021-29569?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, MLOps platforms, shared Jupyter environments.
What is the CVSS score for CVE-2021-29569?
CVE-2021-29569 has a CVSS v3.1 base score of 7.1 (HIGH). The EPSS exploitation probability is 0.20%.
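The 7.1 base score can be reproduced from the vector listed under "CVSS Vector" in this advisory. The sketch below applies the CVSS v3.1 base-score equations for the scope-unchanged case only; `WEIGHTS`, `roundup`, and `base_score` are illustrative names, and the metric weights are the published v3.1 values.

```python
import math

# CVSS v3.1 metric weights (PR values are the scope-unchanged ones).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Base score for a CVSS:3.1 vector; this sketch supports S:U only."""
    prefix, *parts = vector.split("/")
    metrics = dict(part.split(":") for part in parts)
    if prefix != "CVSS:3.1" or metrics.get("S") != "U":
        raise ValueError("only CVSS:3.1 vectors with S:U are supported")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:H"))
```

For this vector the impact sub-score is about 5.18 and the exploitability sub-score about 1.83; their sum, roughly 7.01, rounds up to 7.1.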
What is the AI security impact?
MITRE ATLAS Techniques
- AML.T0010.001: AI Software
- AML.T0043: Craft Adversarial Data
- AML.T0049: Exploit Public-Facing Application
What are the technical details?
Original Advisory
TensorFlow is an end-to-end open source platform for machine learning. The implementation of `tf.raw_ops.MaxPoolGradWithArgmax` can cause reads outside of bounds of heap allocated data if an attacker supplies specially crafted inputs. The implementation (https://github.com/tensorflow/tensorflow/blob/ac328eaa3870491ababc147822cd04e91a790643/tensorflow/core/kernels/requantization_range_op.cc#L49-L50) assumes that the `input_min` and `input_max` tensors have at least one element, as it accesses the first element in two arrays. If the tensors are empty, `.flat<T>()` is an empty object, backed by an empty array. Hence, accessing even the 0th element is a read outside the bounds. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
Note: the linked file, `requantization_range_op.cc`, and the `input_min`/`input_max` inputs belong to the `RequantizationRange` kernel, so the op name in the advisory text appears to be a copy error.
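The root cause, reading element 0 of a tensor that has no elements, suggests the kind of guard an application can apply before handing tensors to an op. This is a minimal sketch of such an application-layer check, not the upstream fix: `num_elements` and `require_non_empty` are hypothetical helpers, and shapes are assumed to be fully defined tuples of non-negative integers.

```python
from math import prod


def num_elements(shape) -> int:
    """Element count for a fully defined shape; a scalar shape () has 1."""
    return prod(shape)


def require_non_empty(**shapes) -> None:
    """Raise ValueError if any named tensor shape describes zero elements."""
    for name, shape in shapes.items():
        if num_elements(shape) == 0:
            raise ValueError(
                f"{name} must not be empty, got shape {tuple(shape)}"
            )


if __name__ == "__main__":
    # Accepted: both tensors hold at least one element.
    require_non_empty(input_min=(1,), input_max=())
    # Rejected: input_min has shape (0,), i.e. no element at index 0.
    try:
        require_non_empty(input_min=(0,), input_max=(1,))
    except ValueError as exc:
        print("rejected:", exc)
```

With TensorFlow tensors, the shape tuples could be obtained from `tensor.shape.as_list()` at the API boundary, so the check runs before any kernel touches the data.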
Exploitation Scenario
A malicious insider or compromised data scientist account on a shared ML training cluster opens a notebook and calls tf.raw_ops.MaxPoolGradWithArgmax with empty input_min and input_max tensors. TF accesses index 0 of the empty flat arrays, reading beyond heap bounds. In the best case for the attacker, adjacent heap memory is returned, potentially containing model weights, training data batches, or API tokens cached in the same process. In an alternative scenario targeting TF Serving, an external attacker submits a crafted gRPC inference request with empty tensors to a publicly exposed serving endpoint; if the served graph passes those tensors to the vulnerable op, the heap OOB read crashes the server or leaks data from co-located requests. Either path requires no special AI/ML knowledge, only familiarity with the TF op API.
Weaknesses (CWE)
CWE-125 — Out-of-bounds Read: The product reads data past the end, or before the beginning, of the intended buffer.
- [Implementation] Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does. When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, "boat" may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as "red" or "blue." Do not rely exclusively on looking for malicious or malformed inputs. This is likely to miss at least one undesirable input, especially if the code's environment changes. This can give attackers enough room to bypass the intended validation.
- [Architecture and Design] Use a language that provides appropriate memory abstractions.
Source: MITRE CWE corpus.
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:H
References
- github.com/tensorflow/tensorflow/commit/ef0c008ee84bad91ec6725ddc42091e19a30cf0e (Patch, 3rd Party)
- github.com/tensorflow/tensorflow/security/advisories/GHSA-3h8m-483j-7xxm (Exploit, Patch, 3rd Party)
Related Vulnerabilities
| CVE | CVSS | Summary | Relation |
|---|---|---|---|
| CVE-2020-15196 | 9.9 | TensorFlow: heap OOB read in sparse/ragged count ops | Same package: tensorflow |
| CVE-2020-15205 | 9.8 | TensorFlow: heap overflow in StringNGrams, ASLR bypass | Same package: tensorflow |
| CVE-2020-15208 | 9.8 | TFLite: OOB read/write via tensor dimension mismatch | Same package: tensorflow |
| CVE-2019-16778 | 9.8 | TensorFlow: heap overflow in UnsortedSegmentSum op | Same package: tensorflow |
| CVE-2022-23587 | 9.8 | TensorFlow: integer overflow in Grappler enables RCE | Same package: tensorflow |