CVE-2021-29612: TensorFlow: heap overflow in linalg op, RCE risk

HIGH PoC AVAILABLE
Published May 14, 2021
CISO Take

A heap buffer overflow in TensorFlow's BandedTriangularSolve kernel lets a low-privileged local attacker execute code, with high impact on confidentiality, integrity, and availability. Upgrade immediately to TF 2.5.0, 2.4.2, 2.3.3, 2.2.3, or 2.1.4. Shared ML platforms (Jupyter, Kubeflow, MLflow) where users submit arbitrary model code are at highest risk.

Risk Assessment

The CVSS score is 7.8 (High). The local attack vector limits direct internet exposure, but shared ML training infrastructure substantially raises real-world risk. Attack complexity is low, no user interaction is required, and the root cause is a double failure: the kernel never validates that its input tensors are non-empty, and it never checks the status that OP_REQUIRES sets inside its validation helper. Together these make exploitation straightforward. There is no evidence of active exploitation in the wild, but the GitHub advisory includes an exploit reference.

Affected Systems

Package     Ecosystem  Vulnerable Range                       Patched
tensorflow  pip        releases before each patched version   2.5.0, 2.4.2, 2.3.3, 2.2.3, 2.1.4
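
For inventory work, a version string can be checked against the patched releases named in this advisory. The following is a minimal sketch, not an official tool: `is_patched` and the two constants are hypothetical names, release lines older than 2.1 are assumed to be unpatched, and pre-release suffixes such as `rc0` are ignored.

```python
import re

# Minimum patched patch-level for each backported release line (from the advisory).
PATCHED_MINIMUM = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}
# Every release on or after this line is assumed to contain the fix.
FIRST_FIXED_LINE = (2, 5)


def is_patched(version):
    """Return True if a TensorFlow version string is at or above a patched release."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if (major, minor) >= FIRST_FIXED_LINE:
        return True
    minimum = PATCHED_MINIMUM.get((major, minor))
    # Lines with no backport (assumption: anything below 2.1) count as unpatched.
    return minimum is not None and patch >= minimum


if __name__ == "__main__":
    for candidate in ("2.1.3", "2.4.1", "2.4.2", "2.5.0"):
        print(candidate, "patched" if is_patched(candidate) else "VULNERABLE")
```

In a live environment, the installed version could be read with `importlib.metadata.version("tensorflow")` and passed to the same function.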

Severity & Risk

CVSS 3.1
7.8 / 10
EPSS
0.03%
chance of exploitation in 30 days
Higher than 7% of all CVEs
Exploitation Status
Exploit Available
Exploitation: MEDIUM
Sophistication
Moderate
Exploitation Confidence
medium
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

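Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): High
Integrity (I): High
Availability (A): High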

Recommended Action

5 steps
  1. Patch: upgrade to TF 2.5.0, or to one of the backport releases 2.4.2, 2.3.3, 2.2.3, or 2.1.4.
  2. Immediate workaround if patching is delayed: restrict access to raw TF ops in multi-tenant environments, and validate that tensors are non-empty before invoking BandedTriangularSolve.
  3. Architecture: sandbox ML workload execution with process isolation (containers, VMs) to limit blast radius.
  4. Detection: monitor for anomalous process behavior or unexpected memory errors from ML workers.
  5. Inventory: record all TF versions across training and inference environments. Containerized deployments are easy to miss.
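
The non-empty check in the workaround can be sketched as a small guard that runs before the op is invoked. This is an illustration under stated assumptions, not a replacement for patching: `is_empty_shape` and `check_banded_solve_inputs` are hypothetical helper names, and the shapes are assumed to be fully known sequences of integers (no unknown dimensions).

```python
from math import prod


def is_empty_shape(shape):
    """Return True when a tensor of this shape holds zero elements."""
    # prod(()) == 1, so a scalar shape is correctly treated as non-empty.
    return prod(shape) == 0


def check_banded_solve_inputs(matrix_shape, rhs_shape):
    """Raise ValueError if either input to the solve would be empty."""
    for name, shape in (("matrix", matrix_shape), ("rhs", rhs_shape)):
        if is_empty_shape(shape):
            raise ValueError(
                f"{name} is empty (shape {tuple(shape)}); "
                "refusing to call BandedTriangularSolve"
            )


# Intended use (requires TensorFlow, shown as a comment only):
#   check_banded_solve_inputs(tuple(matrix.shape), tuple(rhs.shape))
#   tf.raw_ops.BandedTriangularSolve(matrix=matrix, rhs=rhs)
```

A guard like this only helps where all calls go through trusted wrapper code. It does nothing against a user who can call raw ops directly, which is why restricting raw-op access is listed alongside it.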

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Art.15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.1 - AI system lifecycle management
NIST AI RMF
MANAGE-2.2 - Treatments, responses, and prioritization for identified AI risks
OWASP LLM Top 10
LLM03:2025 - Supply Chain

Frequently Asked Questions

What is CVE-2021-29612?

CVE-2021-29612 is a heap buffer overflow in the Eigen implementation of TensorFlow's tf.raw_ops.BandedTriangularSolve kernel. It lets a low-privileged local attacker execute code, with high impact on confidentiality, integrity, and availability. It is fixed in TF 2.5.0, 2.4.2, 2.3.3, 2.2.3, and 2.1.4. Shared ML platforms (Jupyter, Kubeflow, MLflow) where users submit arbitrary model code are at highest risk.

Is CVE-2021-29612 actively exploited?

There is no evidence of active exploitation in the wild. However, proof-of-concept exploit code for CVE-2021-29612 is publicly available (indexed in trickest/cve), which increases the risk of exploitation.

How to fix CVE-2021-29612?

  1. Patch: upgrade to TF 2.5.0, or to one of the backport releases 2.4.2, 2.3.3, 2.2.3, or 2.1.4.
  2. Immediate workaround if patching is delayed: restrict access to raw TF ops in multi-tenant environments, and validate that tensors are non-empty before invoking BandedTriangularSolve.
  3. Architecture: sandbox ML workload execution with process isolation (containers, VMs) to limit blast radius.
  4. Detection: monitor for anomalous process behavior or unexpected memory errors from ML workers.
  5. Inventory: record all TF versions across training and inference environments. Containerized deployments are easy to miss.
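
For the detection step, one cheap signal is a worker process that dies from a memory-fault signal. The sketch below assumes workers are launched in a way that exposes a POSIX-style return code (for example through Python's `subprocess`, where a negative value `-N` means the child was terminated by signal `N`). `fault_signal_name` is a hypothetical helper, and the choice of signals to flag is an assumption rather than guidance from the advisory.

```python
import signal

# Signals that commonly accompany memory corruption or an abort on POSIX systems.
_FAULT_SIGNALS = {signal.SIGSEGV, signal.SIGABRT}
if hasattr(signal, "SIGBUS"):  # not defined on every platform
    _FAULT_SIGNALS.add(signal.SIGBUS)


def fault_signal_name(returncode):
    """Return the signal name if the exit code indicates a memory-fault signal, else None."""
    if returncode is None or returncode >= 0:
        return None  # still running, or exited normally with a status code
    try:
        sig = signal.Signals(-returncode)
    except ValueError:
        return None  # not a signal number known on this platform
    return sig.name if sig in _FAULT_SIGNALS else None
```

A job scheduler could call this on each finished worker and raise an alert when it returns a name. A crash of this kind is not proof of exploitation, only a prompt to inspect the job that caused it.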

What systems are affected by CVE-2021-29612?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, ML platforms, notebook environments.

What is the CVSS score for CVE-2021-29612?

CVE-2021-29612 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.03%.

Technical Details

NVD Description

TensorFlow is an end-to-end open source platform for machine learning. An attacker can trigger a heap buffer overflow in Eigen implementation of `tf.raw_ops.BandedTriangularSolve`. The implementation (https://github.com/tensorflow/tensorflow/blob/eccb7ec454e6617738554a255d77f08e60ee0808/tensorflow/core/kernels/linalg/banded_triangular_solve_op.cc#L269-L278) calls `ValidateInputTensors` for input validation but fails to validate that the two tensors are not empty. Furthermore, since `OP_REQUIRES` macro only stops execution of current function after setting `ctx->status()` to a non-OK value, callers of helper functions that use `OP_REQUIRES` must check value of `ctx->status()` before continuing. This doesn't happen in this op's implementation (https://github.com/tensorflow/tensorflow/blob/eccb7ec454e6617738554a255d77f08e60ee0808/tensorflow/core/kernels/linalg/banded_triangular_solve_op.cc#L219), hence the validation that is present is also not effective. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
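
The `OP_REQUIRES` behavior described above can be illustrated with a rough Python analogy. This is not TensorFlow code: `Context`, `validate_inputs`, `compute_unchecked`, and `compute_checked` are hypothetical names that only mimic the control flow, and the helper here already includes the non-empty check that the vulnerable kernel lacked.

```python
class Context:
    """Hypothetical stand-in for the kernel context that carries a status."""

    def __init__(self):
        self.status = "OK"


def validate_inputs(ctx, matrix, rhs):
    # Mimics OP_REQUIRES: on failure, record the error and return from
    # THIS function only. The caller keeps running unless it checks ctx.status.
    if len(matrix) == 0 or len(rhs) == 0:
        ctx.status = "INVALID_ARGUMENT: inputs must be non-empty"
        return


def compute_unchecked(ctx, matrix, rhs):
    """Bug pattern: the status set by the helper is never inspected."""
    validate_inputs(ctx, matrix, rhs)
    return True  # the solver would run here, even on invalid inputs


def compute_checked(ctx, matrix, rhs):
    """Fixed pattern: stop as soon as the helper reports a failure."""
    validate_inputs(ctx, matrix, rhs)
    if ctx.status != "OK":
        return False  # solver is skipped
    return True  # the solver would run here on validated inputs
```

With empty inputs, `compute_unchecked` records an error and still reaches the solver, while `compute_checked` stops. In the C++ kernel, reaching the solver with invalid inputs is what leads to the out-of-bounds heap access.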

Exploitation Scenario

An adversary with access to a shared ML training platform (internal Jupyter hub, Kubeflow pipeline, or MLflow experiment server) submits a crafted TensorFlow model that invokes tf.raw_ops.BandedTriangularSolve with an empty input tensor. Because empty tensors are not rejected and the status set by OP_REQUIRES is never checked, the Eigen implementation runs the solve on the empty buffers and accesses heap memory out of bounds. On a successful exploit, the attacker gains code execution as the training worker process, which typically has access to cloud storage credentials, training datasets, and internal ML infrastructure over the network.

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
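
To show how this vector produces the 7.8 base score, the sketch below applies the base-score equations and metric weights from the CVSS v3.1 specification. It handles base metrics only, and `base_score` and `roundup` are illustrative names rather than part of any official library.

```python
# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on Scope (U = unchanged, C = changed).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value):
    """Round up to one decimal place using the spec's integer-based method."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (scaled // 10000 + 1) / 10.0


def base_score(vector):
    """Compute the CVSS v3.1 base score for a base-metrics vector string."""
    prefix, *pairs = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("only CVSS:3.1 vectors are supported")
    m = dict(pair.split(":") for pair in pairs)
    scope = m["S"]
    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[m["C"]]) * (1 - cia[m["I"]]) * (1 - cia[m["A"]])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = (
        8.22
        * WEIGHTS["AV"][m["AV"]]
        * WEIGHTS["AC"][m["AC"]]
        * PR_WEIGHTS[scope][m["PR"]]
        * WEIGHTS["UI"][m["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H"))  # 7.8
```

For this vector the impact sub-score is about 5.87 and the exploitability sub-score about 1.83. Their sum, about 7.71, rounds up to 7.8.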

Timeline

Published
May 14, 2021
Last Modified
November 21, 2024
First Seen
May 14, 2021

Related Vulnerabilities