CVE-2021-29612: TensorFlow: heap overflow in linalg op, RCE risk
HIGH | PoC AVAILABLE
A heap buffer overflow in TensorFlow's BandedTriangularSolve kernel allows a low-privileged local user to execute code, with full confidentiality, integrity, and availability impact. Patch immediately to TF 2.5.0, 2.4.2, 2.3.3, 2.2.3, or 2.1.4. Shared ML platforms (Jupyter, Kubeflow, MLflow) where users submit arbitrary model code are at highest risk.
Risk Assessment
The CVSS score is 7.8 (High). The local attack vector limits direct internet exposure, but shared ML training infrastructure substantially raises real-world risk. Attack complexity is low and no user interaction is required. The root cause is a double failure: the kernel does not validate that its input tensors are non-empty, and it does not check the status set by OP_REQUIRES in its validation helper, which makes exploitation straightforward. There is no evidence of active exploitation in the wild, but the GitHub advisory includes an exploit reference.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| tensorflow | pip | < 2.1.4; 2.2.0 to < 2.2.3; 2.3.0 to < 2.3.3; 2.4.0 to < 2.4.2 | 2.1.4, 2.2.3, 2.3.3, 2.4.2, 2.5.0 |
Do you use tensorflow? Any installation older than the patched releases listed above is affected.
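A quick way to check an environment against the patched releases is a small version comparison. The sketch below is illustrative only: `is_vulnerable` and `check_installed` are hypothetical helper names, the patched versions are taken from this advisory, and treating branches older than 2.1 as affected is a conservative assumption because no backport is listed for them.

```python
import re
from importlib import metadata

# Patched releases named in the advisory, keyed by (major, minor) branch.
PATCHED = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}
FIRST_FIXED_BRANCH = (2, 5)  # 2.5.0 and later include the fix


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from strings such as '2.4.1'.

    Pre-release suffixes (for example 'rc1') are ignored, so release
    candidates should be checked by hand.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_vulnerable(version: str) -> bool:
    """Return True if the version predates the patched release on its branch."""
    major, minor, patch = parse_version(version)
    branch = (major, minor)
    if branch >= FIRST_FIXED_BRANCH:
        return False
    if branch in PATCHED:
        return patch < PATCHED[branch]
    # Assumption: branches older than 2.1 received no backport, so treat them as affected.
    return True


def check_installed(package: str = "tensorflow"):
    """Return True/False for the installed package, or None if it is not installed."""
    try:
        return is_vulnerable(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None
```

Calling `check_installed()` inside each training or serving image reports whether that image still carries an unpatched build; variant distributions published under other package names would need to be checked by passing their name explicitly.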
Recommended Action
1) Patch: upgrade to TF 2.5.0, or to one of the backport releases 2.4.2, 2.3.3, 2.2.3, or 2.1.4.
2) Immediate workaround if patching is delayed: restrict access to raw TF ops in multi-tenant environments, and validate that tensors are non-empty before invoking BandedTriangularSolve.
3) Architecture: sandbox ML workload execution with process isolation (containers, VMs) to limit blast radius.
4) Detection: monitor for anomalous process behavior or unexpected memory errors from ML workers.
5) Inventory: list all TF versions across training and inference environments; containerized deployments are easy to miss.
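The non-empty check in the workaround can be sketched as a thin wrapper around the op. This is a minimal illustration, not part of TensorFlow: `require_non_empty` and `guarded_banded_triangular_solve` are hypothetical names, and the solver is passed in as a parameter so the guard can be exercised without TensorFlow installed. It only protects code paths that go through the wrapper; it does nothing against a user who calls the raw op directly.

```python
def require_non_empty(name, tensor):
    """Raise ValueError if any dimension of tensor.shape is zero.

    Works with anything exposing an iterable `.shape` (for example NumPy
    arrays or eager tensors). Unknown dimensions (None) pass unchecked.
    """
    shape = tuple(tensor.shape)
    if any(dim == 0 for dim in shape):
        raise ValueError(f"{name} must be non-empty, got shape {shape}")


def guarded_banded_triangular_solve(solve, matrix, rhs, **kwargs):
    """Validate both inputs, then delegate to the supplied solver callable."""
    require_non_empty("matrix", matrix)
    require_non_empty("rhs", rhs)
    return solve(matrix=matrix, rhs=rhs, **kwargs)
```

With TensorFlow available, the call would look like `guarded_banded_triangular_solve(tf.raw_ops.BandedTriangularSolve, matrix, rhs, lower=True, adjoint=False)`.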
Frequently Asked Questions
Is CVE-2021-29612 actively exploited?
There is no evidence of active exploitation of CVE-2021-29612 in the wild. However, proof-of-concept exploit code is publicly available, which increases the risk of exploitation.
What systems are affected by CVE-2021-29612?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, ML platforms, notebook environments.
What is the CVSS score for CVE-2021-29612?
CVE-2021-29612 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.03%.
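The 7.8 base score can be reproduced from this CVE's vector, CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H. The sketch below applies the CVSS v3.1 base-score equations with the specification's metric weights; it is deliberately limited to Scope: Unchanged vectors, and `base_score` is an illustrative helper name rather than a library function.

```python
import math

# CVSS v3.1 metric weights (PR values are the Scope: Unchanged ones).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a Scope: Unchanged CVSS v3.1 vector."""
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {key: WEIGHTS[key][metrics[key]] for key in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H"))  # 7.8
```

For this vector the exploitability sub-score is about 1.83 and the impact sub-score about 5.87; their sum, about 7.71, rounds up to 7.8.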
Technical Details
NVD Description
TensorFlow is an end-to-end open source platform for machine learning. An attacker can trigger a heap buffer overflow in Eigen implementation of `tf.raw_ops.BandedTriangularSolve`. The implementation (https://github.com/tensorflow/tensorflow/blob/eccb7ec454e6617738554a255d77f08e60ee0808/tensorflow/core/kernels/linalg/banded_triangular_solve_op.cc#L269-L278) calls `ValidateInputTensors` for input validation but fails to validate that the two tensors are not empty. Furthermore, since `OP_REQUIRES` macro only stops execution of current function after setting `ctx->status()` to a non-OK value, callers of helper functions that use `OP_REQUIRES` must check value of `ctx->status()` before continuing. This doesn't happen in this op's implementation (https://github.com/tensorflow/tensorflow/blob/eccb7ec454e6617738554a255d77f08e60ee0808/tensorflow/core/kernels/linalg/banded_triangular_solve_op.cc#L219), hence the validation that is present is also not effective. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
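The second half of the root cause, a helper that records an error and returns while its caller carries on, can be illustrated with a small Python analogy. This is not TensorFlow source: `OpContext`, `validate_inputs`, `run_kernel`, and the two `compute_*` functions are hypothetical stand-ins, and the helper here already includes the empty-tensor check so that only the unchecked-status problem is on display.

```python
class OpContext:
    """Hypothetical stand-in for the kernel context that carries a status."""

    def __init__(self):
        self.status = "OK"


def validate_inputs(ctx, matrix, rhs):
    # Like OP_REQUIRES: on failure, set the status and return from this helper only.
    if len(matrix) == 0 or len(rhs) == 0:
        ctx.status = "INVALID_ARGUMENT: inputs must be non-empty"
        return


def run_kernel(matrix, rhs, trace):
    # In the C++ kernel, this is the point where the input buffers get indexed.
    trace.append("kernel ran")


def compute_unchecked(ctx, matrix, rhs, trace):
    validate_inputs(ctx, matrix, rhs)
    run_kernel(matrix, rhs, trace)  # bug: ctx.status is never consulted


def compute_checked(ctx, matrix, rhs, trace):
    validate_inputs(ctx, matrix, rhs)
    if ctx.status != "OK":  # the caller must stop once validation has failed
        return
    run_kernel(matrix, rhs, trace)
```

Given empty inputs, `compute_unchecked` still reaches `run_kernel` even though the status reports an error, while `compute_checked` stops before it.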
Exploitation Scenario
An adversary with access to a shared ML training platform (internal Jupyter hub, Kubeflow pipeline, or MLflow experiment server) submits crafted TensorFlow code that invokes tf.raw_ops.BandedTriangularSolve with an empty input tensor. Because the kernel neither rejects empty tensors nor checks the status left by its validation helper, the Eigen implementation runs on the empty buffers and accesses heap memory outside their bounds. If the overflow is exploited successfully, the attacker gains code execution as the training worker process, which typically holds cloud storage credentials and has access to training datasets and internal ML infrastructure.
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
References
- github.com/tensorflow/tensorflow/commit/0ab290774f91a23bebe30a358fde4e53ab4876a0 Patch 3rd Party
- github.com/tensorflow/tensorflow/commit/ba6822bd7b7324ba201a28b2f278c29a98edbef2 Patch 3rd Party
- github.com/tensorflow/tensorflow/security/advisories/GHSA-2xgj-xhgf-ggjv Exploit Patch 3rd Party
Related Vulnerabilities
- CVE-2020-15196 (CVSS 9.9) TensorFlow: heap OOB read in sparse/ragged count ops. Same package: tensorflow
- CVE-2020-15205 (CVSS 9.8) TensorFlow: heap overflow in StringNGrams, ASLR bypass. Same package: tensorflow
- CVE-2020-15208 (CVSS 9.8) TFLite: OOB read/write via tensor dimension mismatch. Same package: tensorflow
- CVE-2019-16778 (CVSS 9.8) TensorFlow: heap overflow in UnsortedSegmentSum op. Same package: tensorflow
- CVE-2022-23587 (CVSS 9.8) TensorFlow: integer overflow in Grappler enables RCE. Same package: tensorflow