CVE-2021-29583: TensorFlow: heap overflow in FusedBatchNorm risks RCE

HIGH PoC AVAILABLE
Published May 14, 2021
CISO Take

A heap buffer overflow in TensorFlow's FusedBatchNorm op lets low-privileged users trigger out-of-bounds heap reads or null pointer dereferences via malformed tensor inputs, with potential for code execution. Upgrade to TensorFlow 2.5.0 or to a patched backport release (2.4.2, 2.3.3, 2.2.3, or 2.1.4) immediately. Multi-tenant ML training clusters and shared inference infrastructure are the highest-risk environments.

Risk Assessment

The CVSS 3.1 base score is 7.8 (HIGH), with a local attack vector and low privileges required. Although the attack vector is local, shared ML training clusters and multi-tenant GPU platforms expose this surface to non-admin users who can submit arbitrary tensor inputs. The combination of C:H/I:H/A:H impact and low attack complexity makes this a priority patch wherever TensorFlow runs with multi-user access. The CVE is not in the CISA KEV catalog and dates from 2021, so active targeting is unlikely, but unpatched legacy TensorFlow deployments remain at risk.

Affected Systems

Package | Ecosystem | Vulnerable Range | Patched
tensorflow | pip | releases before 2.5.0 that lack the backported fix | 2.5.0; backports 2.4.2, 2.3.3, 2.2.3, 2.1.4

Do you use tensorflow? You're affected if you run a release earlier than 2.5.0 that does not include the backported fix.

Severity & Risk

CVSS 3.1
7.8 / 10
EPSS
0.01%
chance of exploitation in 30 days
Higher than 2% of all CVEs
Exploitation Status
Exploit Available
Exploitation: MEDIUM
Sophistication
Moderate
Exploitation Confidence
medium
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

AV Local
AC Low
PR Low
UI None
S Unchanged
C High
I High
A High

Recommended Action

5 steps
  1. Patch: Upgrade to TensorFlow 2.5.0, or to a release on a supported branch that carries the cherry-picked fix (2.4.2, 2.3.3, 2.2.3, 2.1.4), per commit 6972f9d.

  2. Isolate: Run training workers and TF Serving in containers with seccomp/AppArmor profiles and minimal privileges.

  3. Input validation: Assert tensor dimension consistency (channel counts of scale, offset, mean, variance match x) before executing FusedBatchNorm ops.

  4. Monitor: Alert on TensorFlow process crashes (SIGSEGV/SIGABRT) as possible exploitation indicators.

  5. Network isolation: Restrict TF Serving endpoints to internal networks; never expose raw op execution to untrusted external callers.
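For step 1, a fleet inventory script could flag unpatched installs by version string. The following is a minimal sketch, not an official TensorFlow utility: the function names `parse_version` and `is_patched` are hypothetical, and the fixed versions are the ones listed in this advisory. It ignores pre-release suffixes such as `rc0`, which is an assumption you may want to tighten.

```python
import re

# Fixed releases named in the advisory: 2.5.0 plus one backport per branch.
# Keys are (major, minor); values are the first patched patch-level.
BACKPORTED_FIXES = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}
FIRST_FIXED_MAINLINE = (2, 5, 0)


def parse_version(version):
    """Extract (major, minor, patch) from a string such as '2.4.1'.

    Assumption: anything after the third number (e.g. 'rc0') is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(version):
    """Return True if the version carries the fix according to the advisory."""
    major, minor, patch = parse_version(version)
    if (major, minor, patch) >= FIRST_FIXED_MAINLINE:
        return True
    # Branches without a listed backport (e.g. 2.0.x, 1.x) are treated as unpatched.
    min_patch = BACKPORTED_FIXES.get((major, minor))
    return min_patch is not None and patch >= min_patch
```

On a host with TensorFlow installed, `is_patched(tf.__version__)` would give a quick yes/no answer; a version string collected from `pip freeze` works the same way.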

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 9 - Risk management system
ISO 42001
A.9.1 - Information security policies for AI systems
NIST AI RMF
MANAGE-2.2 - Mechanisms to sustain the value of deployed AI
OWASP LLM Top 10
LLM04 - Model Denial of Service

Frequently Asked Questions

What is CVE-2021-29583?

CVE-2021-29583 is a heap buffer overflow in TensorFlow's `tf.raw_ops.FusedBatchNorm` implementation. The op does not check that `scale`, `offset`, `mean`, and `variance` have as many elements as `x` has channels, so malformed inputs cause out-of-bounds heap reads, and empty tensors cause null pointer dereferences. The fix ships in TensorFlow 2.5.0 and in the backport releases 2.4.2, 2.3.3, 2.2.3, and 2.1.4.

Is CVE-2021-29583 actively exploited?

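CVE-2021-29583 is not listed in the CISA KEV catalog, and this page records no confirmed exploitation in the wild. Proof-of-concept exploit code is publicly available, however, which increases the risk of exploitation.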

How to fix CVE-2021-29583?

  1. Patch: Upgrade to TensorFlow 2.5.0, or to a release on a supported branch that carries the cherry-picked fix (2.4.2, 2.3.3, 2.2.3, 2.1.4), per commit 6972f9d.

  2. Isolate: Run training workers and TF Serving in containers with seccomp/AppArmor profiles and minimal privileges.

  3. Input validation: Assert tensor dimension consistency (channel counts of scale, offset, mean, variance match x) before executing FusedBatchNorm ops.

  4. Monitor: Alert on TensorFlow process crashes (SIGSEGV/SIGABRT) as possible exploitation indicators.

  5. Network isolation: Restrict TF Serving endpoints to internal networks; never expose raw op execution to untrusted external callers.
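The monitoring step could be approximated by a small supervisor that launches the TensorFlow job and inspects how it exited. This is a sketch under assumptions: `classify_exit` and `run_and_check` are hypothetical names, and it relies on the POSIX behavior of Python's `subprocess` module, where a child killed by signal N reports a return code of -N. A crash signal is only a hint, not proof of exploitation.

```python
import signal
import subprocess

# Signals that typically accompany memory-safety crashes.
CRASH_SIGNALS = {signal.SIGSEGV, signal.SIGABRT}


def classify_exit(returncode):
    """Return the crash signal name for a subprocess return code, else None.

    On POSIX, subprocess reports death by signal N as return code -N.
    """
    if returncode is None or returncode >= 0:
        return None
    try:
        sig = signal.Signals(-returncode)
    except ValueError:
        return None  # not a signal number known on this platform
    return sig.name if sig in CRASH_SIGNALS else None


def run_and_check(cmd):
    """Run a command and return the crash signal name, or None if no crash."""
    completed = subprocess.run(cmd)
    return classify_exit(completed.returncode)
```

A wrapper such as `run_and_check(["python", "train.py"])` (the script name is illustrative) could forward a non-None result to whatever alerting system the cluster already uses.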

What systems are affected by CVE-2021-29583?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving.

What is the CVSS score for CVE-2021-29583?

CVE-2021-29583 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.01%.

Technical Details

NVD Description

TensorFlow is an end-to-end open source platform for machine learning. The implementation of `tf.raw_ops.FusedBatchNorm` is vulnerable to a heap buffer overflow. If the tensors are empty, the same implementation can trigger undefined behavior by dereferencing null pointers. The implementation (https://github.com/tensorflow/tensorflow/blob/57d86e0db5d1365f19adcce848dfc1bf89fdd4c7/tensorflow/core/kernels/fused_batch_norm_op.cc) fails to validate that `scale`, `offset`, `mean` and `variance` (the last two only when required) all have the same number of elements as the number of channels of `x`. This results in heap out of bounds reads when the buffers backing these tensors are indexed past their boundary. If the tensors are empty, the validation mentioned in the above paragraph would also trigger and prevent the undefined behavior. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
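The missing check described above can be illustrated on shapes alone, without calling TensorFlow. The sketch below is not the upstream fix: `channel_count` and `validate_fused_batch_norm_shapes` are hypothetical helper names, `x` is assumed to be 4-D in either NHWC or NCHW layout, and the `check_mean_variance` flag stands in for the advisory's "only when required" condition.

```python
import math


def channel_count(x_shape, data_format="NHWC"):
    """Return the number of channels of a 4-D input shape."""
    if len(x_shape) != 4:
        raise ValueError(f"x must be 4-D, got shape {tuple(x_shape)}")
    if data_format == "NHWC":
        return x_shape[3]
    if data_format == "NCHW":
        return x_shape[1]
    raise ValueError(f"unsupported data_format: {data_format!r}")


def validate_fused_batch_norm_shapes(x_shape, scale_shape, offset_shape,
                                     mean_shape, variance_shape,
                                     data_format="NHWC",
                                     check_mean_variance=True):
    """Raise ValueError unless each parameter has one element per channel."""
    channels = channel_count(x_shape, data_format)
    required = {"scale": scale_shape, "offset": offset_shape}
    if check_mean_variance:
        # Assumption: the caller knows whether mean/variance are consumed.
        required["mean"] = mean_shape
        required["variance"] = variance_shape
    for name, shape in required.items():
        elements = math.prod(shape)  # an empty tensor, shape (0,), has 0 elements
        if elements != channels:
            raise ValueError(
                f"{name} has {elements} elements, expected {channels}")
```

Calling this on the `.shape` of each argument before invoking the op rejects both the channel-mismatched and the empty-tensor cases, as long as `x` itself has at least one channel.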

Exploitation Scenario

An adversary with data scientist access to a shared Kubernetes GPU cluster submits a training job containing a model with FusedBatchNorm layers fed empty or channel-mismatched tensors. Because the dimension check is missing, the kernel reads past the end of the heap buffers backing those tensors (or dereferences a null pointer when they are empty), which can disclose adjacent heap contents or crash the worker. The CVSS vector also rates integrity impact as High: if the adversary can shape the heap layout and turn the flaw into code execution within the training worker pod, lateral movement becomes possible, exposing other tenants' model artifacts, training data, environment credentials, or cloud IAM tokens mounted in the pod.

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
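To show how this vector yields the 7.8 base score, the sketch below applies the CVSS v3.1 base-score equations and metric weights from the FIRST specification. The function names `roundup` and `base_score` are illustrative, and the code handles base metrics only (no temporal or environmental metrics).

```python
import math

# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    int_input = round(value * 100000)
    if int_input % 10000 == 0:
        return int_input / 100000.0
    return (math.floor(int_input / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a 'CVSS:3.1/...' vector string."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    changed = metrics["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]
    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[metrics["C"]]) * (1 - cia[metrics["I"]]) * (1 - cia[metrics["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (8.22 * WEIGHTS["AV"][metrics["AV"]]
                      * WEIGHTS["AC"][metrics["AC"]] * pr
                      * WEIGHTS["UI"][metrics["UI"]])
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    return roundup(min(1.08 * total if changed else total, 10))
```

For this CVE, the impact sub-score is about 5.87 and the exploitability sub-score about 1.83; their sum, roughly 7.71, rounds up to 7.8.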

Timeline

Published
May 14, 2021
Last Modified
November 21, 2024
First Seen
May 14, 2021

Related Vulnerabilities