CVE-2021-29583: TensorFlow: heap overflow in FusedBatchNorm risks RCE
HIGH | PoC AVAILABLE
A heap buffer overflow in TensorFlow's FusedBatchNorm op lets low-privileged users trigger out-of-bounds reads or null pointer dereferences via malformed tensor inputs, with code execution potential. Upgrade to TensorFlow 2.5.0 or a patched backport release (2.4.2, 2.3.3, 2.2.3 or 2.1.4) immediately. Multi-tenant ML training clusters and shared inference infrastructure are the highest-risk environments.
Risk Assessment
The CVSS score is 7.8 (HIGH), with a local attack vector and a low privilege requirement. Although the attack vector is local, shared ML training clusters and multi-tenant GPU platforms expose it to non-admin users who can submit arbitrary tensor inputs. The combination of high confidentiality, integrity and availability impact (C:H/I:H/A:H) and low attack complexity makes this a priority patch wherever TensorFlow runs with multi-user access. The CVE is not in CISA KEV and dates from 2021, so it is unlikely to be actively targeted, but unpatched legacy TensorFlow deployments remain at risk.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| tensorflow | pip | < 2.5.0, excluding the patched backport releases | 2.5.0; backports 2.4.2, 2.3.3, 2.2.3, 2.1.4 |
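Do you use tensorflow? You are affected if your installed version is older than 2.5.0 and is not one of the patched backport releases (2.4.2, 2.3.3, 2.2.3, 2.1.4).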
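A quick way to triage an environment is to compare the installed version string against the fixed releases named in the advisory. The sketch below is illustrative only: `is_vulnerable` and `parse_version` are hypothetical helper names, pre-release suffixes such as `rc1` are ignored, and branches that received no backport are assumed to be vulnerable.

```python
# Fixed releases named in the advisory: 2.5.0 plus the cherry-picked
# backports 2.4.2, 2.3.3, 2.2.3 and 2.1.4.
PATCHED_BY_BRANCH = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}
FIRST_FIXED_MAINLINE = (2, 5, 0)


def parse_version(version):
    """Parse 'major.minor.patch' into a tuple of ints, ignoring suffixes like 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_vulnerable(version):
    """Return True if the version predates the fix for CVE-2021-29583."""
    major, minor, patch = parse_version(version)
    if (major, minor, patch) >= FIRST_FIXED_MAINLINE:
        return False
    fixed_patch = PATCHED_BY_BRANCH.get((major, minor))
    if fixed_patch is None:
        # Assumption: branches without a listed backport stay vulnerable.
        return True
    return patch < fixed_patch
```

In a live environment the input would typically be `tensorflow.__version__`, for example `is_vulnerable(tf.__version__)` after `import tensorflow as tf`.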
Recommended Action
1. Patch: Upgrade to TensorFlow 2.5.0 or apply cherrypicks for supported branches (2.4.2, 2.3.3, 2.2.3, 2.1.4) per commit 6972f9d.
2. Isolate: Run training workers and TF Serving in containers with seccomp/AppArmor profiles and minimal privileges.
3. Input validation: Assert tensor dimension consistency (channel counts of scale, offset, mean, variance match x) before executing FusedBatchNorm ops.
4. Monitor: Alert on TensorFlow process crashes (SIGSEGV/SIGABRT) as exploitation indicators.
5. Network isolation: Restrict TF Serving endpoints to internal networks; never expose raw op execution to untrusted external callers.
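The input validation step can be sketched as a shape check that runs before any call into FusedBatchNorm. This is a minimal sketch, not the upstream fix: `check_fused_batch_norm_shapes` is a hypothetical helper that works on plain shape tuples, it assumes `mean` and `variance` only need checking in inference mode, and it conservatively rejects empty `x` tensors.

```python
from math import prod


def check_fused_batch_norm_shapes(x_shape, scale_shape, offset_shape,
                                  mean_shape, variance_shape,
                                  data_format="NHWC", is_training=True):
    """Raise ValueError unless the per-channel inputs match x's channel count."""
    if len(x_shape) != 4:
        raise ValueError(f"x must be 4-D, got shape {tuple(x_shape)}")
    if prod(x_shape) == 0:
        # Conservative choice for this sketch: refuse empty inputs outright.
        raise ValueError("x must not be empty")

    if data_format == "NHWC":
        channels = x_shape[3]
    elif data_format == "NCHW":
        channels = x_shape[1]
    else:
        raise ValueError(f"unsupported data_format {data_format!r}")

    required = {"scale": scale_shape, "offset": offset_shape}
    if not is_training:
        # Assumption: mean and variance are only consumed in inference mode.
        required["mean"] = mean_shape
        required["variance"] = variance_shape

    for name, shape in required.items():
        if len(shape) != 1 or shape[0] != channels:
            raise ValueError(
                f"{name} must be 1-D with {channels} elements, "
                f"got shape {tuple(shape)}"
            )
    return channels
```

With real tensors, the shape tuples could come from `tuple(t.shape)`; the check is cheap enough to run on every request in front of a serving endpoint.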
Frequently Asked Questions
What is CVE-2021-29583?
CVE-2021-29583 is a heap buffer overflow in TensorFlow's `tf.raw_ops.FusedBatchNorm` op. The op does not verify that `scale`, `offset`, `mean` and `variance` have as many elements as `x` has channels, so malformed inputs cause heap out-of-bounds reads and empty tensors cause null pointer dereferences. It is fixed in TensorFlow 2.5.0 and in the 2.4.2, 2.3.3, 2.2.3 and 2.1.4 backport releases.
Is CVE-2021-29583 actively exploited?
CVE-2021-29583 is not listed in CISA KEV. However, proof-of-concept exploit code is publicly available, which increases the risk of exploitation.
How to fix CVE-2021-29583?
1. Patch: Upgrade to TensorFlow 2.5.0 or apply cherrypicks for supported branches (2.4.2, 2.3.3, 2.2.3, 2.1.4) per commit 6972f9d.
2. Isolate: Run training workers and TF Serving in containers with seccomp/AppArmor profiles and minimal privileges.
3. Input validation: Assert tensor dimension consistency (channel counts of scale, offset, mean, variance match x) before executing FusedBatchNorm ops.
4. Monitor: Alert on TensorFlow process crashes (SIGSEGV/SIGABRT) as exploitation indicators.
5. Network isolation: Restrict TF Serving endpoints to internal networks; never expose raw op execution to untrusted external callers.
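For the monitoring step, one low-effort signal is the exit status of the worker process. The sketch below assumes a POSIX host, where Python's `subprocess` reports a child killed by signal N as return code -N; `crash_signal` and `run_and_flag` are hypothetical names, and a real deployment would more likely read this from the container runtime or orchestrator.

```python
import signal
import subprocess

# Signals that a memory-safety bug in a native op typically produces.
CRASH_SIGNALS = {signal.SIGSEGV, signal.SIGABRT}


def crash_signal(returncode):
    """Return the signal name if the exit code means SIGSEGV/SIGABRT, else None."""
    if returncode is None or returncode >= 0:
        return None
    try:
        sig = signal.Signals(-returncode)
    except ValueError:
        # Not a signal number known on this platform.
        return None
    return sig.name if sig in CRASH_SIGNALS else None


def run_and_flag(cmd):
    """Run a worker command and report the crash signal that killed it, if any."""
    result = subprocess.run(cmd)
    return crash_signal(result.returncode)
```

A non-None result from `run_and_flag` is only an indicator; crashes also happen for benign reasons, so alerts are best correlated with who submitted the job and what inputs it carried.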
What systems are affected by CVE-2021-29583?
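Any TensorFlow installation older than 2.5.0 that is not one of the patched backport releases (2.4.2, 2.3.3, 2.2.3, 2.1.4) is affected. The AI/ML architecture patterns most exposed are training pipelines and model serving.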
What is the CVSS score for CVE-2021-29583?
CVE-2021-29583 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.01%.
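The 7.8 figure can be reproduced from the vector CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H using the CVSS v3.1 base score equations. The sketch below is a reduced calculator that only handles Scope: Unchanged vectors; `base_score` and `roundup` are illustrative names, and the weights are the published v3.1 metric values.

```python
import math

# CVSS v3.1 metric weights (PR values are the Scope: Unchanged ones).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    int_input = round(value * 100000)
    if int_input % 10000 == 0:
        return int_input / 100000.0
    return (math.floor(int_input / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a Scope: Unchanged CVSS v3.1 vector."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")

    c, i, a = (WEIGHTS["CIA"][metrics[k]] for k in ("C", "I", "A"))
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    impact = 6.42 * iss
    exploitability = (8.22 * WEIGHTS["AV"][metrics["AV"]]
                      * WEIGHTS["AC"][metrics["AC"]]
                      * WEIGHTS["PR"][metrics["PR"]]
                      * WEIGHTS["UI"][metrics["UI"]])
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))
```

For this CVE the impact sub-score is about 5.87 and the exploitability sub-score about 1.83; their sum, roughly 7.71, rounds up to 7.8.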
Technical Details
NVD Description
TensorFlow is an end-to-end open source platform for machine learning. The implementation of `tf.raw_ops.FusedBatchNorm` is vulnerable to a heap buffer overflow. If the tensors are empty, the same implementation can trigger undefined behavior by dereferencing null pointers.
The implementation (https://github.com/tensorflow/tensorflow/blob/57d86e0db5d1365f19adcce848dfc1bf89fdd4c7/tensorflow/core/kernels/fused_batch_norm_op.cc) fails to validate that `scale`, `offset`, `mean` and `variance` (the last two only when required) all have the same number of elements as the number of channels of `x`. This results in heap out of bounds reads when the buffers backing these tensors are indexed past their boundary.
If the tensors are empty, the validation mentioned in the above paragraph would also trigger and prevent the undefined behavior.
The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
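To see why the missing check matters, consider a simplified pure-Python model of inference-mode batch normalization. This is not the TensorFlow kernel code; it is a sketch that indexes the per-channel parameters by channel, which is the access pattern the description refers to. Where unchecked C++ would read past the end of a short buffer, Python lists raise `IndexError` instead.

```python
import math


def fused_batch_norm_inference(x, scale, offset, mean, variance, epsilon=1e-3):
    """Normalize x, given as a list of rows with one value per channel."""
    out = []
    for row in x:
        out_row = []
        for c, value in enumerate(row):
            # Each parameter is read at index c. If a parameter has fewer
            # elements than x has channels, this read runs past its end:
            # an IndexError here, an out-of-bounds heap read in native code.
            normalized = (value - mean[c]) / math.sqrt(variance[c] + epsilon)
            out_row.append(normalized * scale[c] + offset[c])
        out.append(out_row)
    return out
```

Calling this model with a two-channel `x` and a one-element `scale` fails on the second channel, which is exactly the case a channel-count check is meant to reject up front.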
Exploitation Scenario
An adversary with data scientist access to a shared Kubernetes GPU cluster submits a training job containing a model with FusedBatchNorm layers fed empty or channel-mismatched tensors. The missing dimension validation triggers a heap out-of-bounds read, which can disclose adjacent heap memory or crash the worker; empty tensors instead trigger a null pointer dereference. If the adversary can craft the heap layout well enough to escalate this to code execution within the training worker pod, lateral movement becomes possible: access to other tenants' model artifacts, training data, environment credentials, or cloud IAM tokens mounted in the pod.
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
References
- github.com/tensorflow/tensorflow/commit/6972f9dfe325636b3db4e0bc517ee22a159365c0 (Patch, 3rd Party)
- github.com/tensorflow/tensorflow/security/advisories/GHSA-9xh4-23q4-v6wr (Exploit, Patch, 3rd Party)
Related Vulnerabilities
| CVE | CVSS | Summary | Relation |
|---|---|---|---|
| CVE-2020-15196 | 9.9 | TensorFlow: heap OOB read in sparse/ragged count ops | Same package: tensorflow |
| CVE-2020-15205 | 9.8 | TensorFlow: heap overflow in StringNGrams, ASLR bypass | Same package: tensorflow |
| CVE-2020-15208 | 9.8 | TFLite: OOB read/write via tensor dimension mismatch | Same package: tensorflow |
| CVE-2019-16778 | 9.8 | TensorFlow: heap overflow in UnsortedSegmentSum op | Same package: tensorflow |
| CVE-2022-23587 | 9.8 | TensorFlow: integer overflow in Grappler enables RCE | Same package: tensorflow |