CVE-2021-29544: TensorFlow: DoS via missing tensor rank validation
MEDIUM | PoC AVAILABLE
A local attacker can crash TensorFlow processes by passing tensors with invalid rank to the QuantizeAndDequantizeV4Grad op, triggering a CHECK-fail abort in the C++ runtime. Exploitation requires local access, so the issue is most dangerous in shared ML compute environments such as multi-tenant Jupyter servers or GPU clusters where untrusted users can submit jobs. Upgrade to TensorFlow 2.4.2 or 2.5.0; the only workaround is input sanitization at the application layer.
Risk Assessment
Medium risk overall, but highly context-dependent. In isolated single-user training environments the blast radius is minimal and the threat is largely theoretical. Risk escalates substantially in multi-tenant ML platforms where untrusted users can submit training or inference jobs, since a single malformed tensor call can crash the entire TF process and disrupt co-located workloads. No remote exploitation vector exists per the CVSS (AV:L), which limits exposure compared to network-reachable vulnerabilities.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| tensorflow | pip | 2.4.x before 2.4.2 | 2.4.2, 2.5.0 |
Do you use tensorflow? You're affected if you run a 2.4.x release earlier than 2.4.2.
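A quick way to check an installed version against the affected range is sketched below. It assumes the range is 2.4.0 up to, but not including, 2.4.2, which follows from the NVD description naming 2.4.2 as the only backport target. The names `parse_release` and `is_affected` are hypothetical helpers, not part of TensorFlow.

```python
import re

# Assumed affected range, derived from the NVD description:
# the 2.4 line before the 2.4.2 backport. 2.5.0 and later carry the fix.
FIRST_AFFECTED = (2, 4, 0)
FIRST_PATCHED_2_4 = (2, 4, 2)


def parse_release(version):
    """Return the numeric (major, minor, patch) prefix of a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version):
    """True if the version falls inside the assumed affected range.

    Pre-release suffixes such as "rc0" are ignored, so treat release
    candidates with care and prefer a final patched release.
    """
    return FIRST_AFFECTED <= parse_release(version) < FIRST_PATCHED_2_4
```

With TensorFlow installed, the check could be called as `is_affected(tf.__version__)`.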
Recommended Action
- Upgrade TensorFlow to 2.4.2 (cherry-picked backport) or 2.5.0+.
- If immediate patching is blocked, enforce input tensor shape validation at the application boundary before tensors reach raw TF ops.
- Implement process supervision (systemd, supervisord, Kubernetes restartPolicy) for TF serving processes to auto-recover from crashes.
- Audit multi-tenant ML platforms for user isolation: restrict who can invoke tf.raw_ops directly and enforce job sandboxing.
- Monitor for unexpected TF process crashes in serving infrastructure as a detection signal.
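The shape-validation step can be sketched as a rank check that runs before any tensor is handed to the raw op. This is a minimal sketch, not the upstream fix: it assumes the per-channel path needs `input_min` and `input_max` to be rank 1 (the NVD description states that `vec<T>` requires rank 1), and the names `TensorRankError`, `require_rank`, and `validate_per_channel_min_max` are hypothetical. It works on plain shape tuples so it does not depend on TensorFlow itself.

```python
from typing import Sequence


class TensorRankError(ValueError):
    """Raised when a tensor shape does not have the expected rank."""


def require_rank(name: str, shape: Sequence[int], expected_rank: int) -> None:
    """Reject a shape whose number of dimensions differs from expected_rank."""
    rank = len(shape)
    if rank != expected_rank:
        raise TensorRankError(
            f"{name} must have rank {expected_rank}, "
            f"got rank {rank} (shape {tuple(shape)})"
        )


def validate_per_channel_min_max(
    input_min_shape: Sequence[int], input_max_shape: Sequence[int]
) -> None:
    """Assumed precondition for the per-channel gradient path: rank-1 min/max."""
    require_rank("input_min", input_min_shape, 1)
    require_rank("input_max", input_max_shape, 1)
```

In application code the shapes would come from the tensors themselves, for example `tuple(t.shape)` for a tensor with a fully known shape. Raising a Python exception here keeps the failure recoverable, whereas the CHECK failure aborts the whole process.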
Frequently Asked Questions
What is CVE-2021-29544?
CVE-2021-29544 is a denial-of-service vulnerability in TensorFlow. The QuantizeAndDequantizeV4Grad op does not validate the rank of its input tensors, so a local attacker can pass a tensor of invalid rank, trigger a CHECK failure in the C++ runtime, and abort the process. The fix ships in TensorFlow 2.4.2 and 2.5.0.
Is CVE-2021-29544 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2021-29544, increasing the risk of exploitation.
How to fix CVE-2021-29544?
1. Upgrade TensorFlow to 2.4.2 (cherry-picked backport) or 2.5.0+.
2. If immediate patching is blocked, enforce input tensor shape validation at the application boundary before tensors reach raw TF ops.
3. Implement process supervision (systemd, supervisord, Kubernetes restartPolicy) for TF serving processes to auto-recover from crashes.
4. Audit multi-tenant ML platforms for user isolation: restrict who can invoke tf.raw_ops directly and enforce job sandboxing.
5. Monitor for unexpected TF process crashes in serving infrastructure as a detection signal.
What systems are affected by CVE-2021-29544?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, edge/mobile deployment pipelines.
What is the CVSS score for CVE-2021-29544?
CVE-2021-29544 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.03%.
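The 5.5 base score can be reproduced from the vector listed under CVSS Vector (AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H). The sketch below applies the CVSS v3.1 base-score formulas for the Scope: Unchanged case only; the function names are hypothetical, and the metric weights are the published v3.1 values.

```python
import math

# CVSS v3.1 metric weights; PR weights shown are for Scope: Unchanged.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, as defined in the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(vector):
    """Compute the base score for a CVSS v3.1 vector with S:U."""
    # Skip the "CVSS:3.1" prefix and split the rest into metric:value pairs.
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    if parts["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))
```

For this vector, impact is 6.42 × 0.56 ≈ 3.60 and exploitability is 8.22 × 0.55 × 0.77 × 0.62 × 0.85 ≈ 1.83, which sums to about 5.43 and rounds up to 5.5.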
Technical Details
NVD Description
TensorFlow is an end-to-end open source platform for machine learning. An attacker can trigger a denial of service via a `CHECK`-fail in `tf.raw_ops.QuantizeAndDequantizeV4Grad`. This is because the implementation does not validate the rank of the `input_*` tensors. In turn, this results in the tensors being passed as they are to `QuantizeAndDequantizePerChannelGradientImpl`. However, the `vec<T>` method requires the rank to be 1 and triggers a `CHECK` failure otherwise. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2 as this is the only other affected version.
Exploitation Scenario
An attacker with local access to a shared ML compute node, such as a data scientist account on a multi-tenant Jupyter server, writes a script calling tf.raw_ops.QuantizeAndDequantizeV4Grad with input tensors whose rank is not 1. The TensorFlow C++ runtime's vec<T>() method expects rank 1, triggers a CHECK failure, and aborts the entire TF process. In a shared inference server environment, this takes down all concurrent inference requests. In a training cluster without job isolation, the crash can interrupt other users' active training runs and discard any progress that has not yet been checkpointed.
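Because the failure mode is a process abort, one way to contain it is to run each untrusted job in its own child process and treat an abort signal as a detection event. The sketch below is a hedged illustration using only the Python standard library on a POSIX system; `run_isolated` is a hypothetical helper, and a real platform would add resource limits and sandboxing on top.

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 60.0) -> dict:
    """Run Python source in a child interpreter and report how it exited.

    On POSIX, subprocess reports death by signal as a negative return
    code, so an abort() in the child shows up as -SIGABRT while the
    parent keeps running.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    returncode = proc.returncode
    aborted = returncode < 0 and -returncode == signal.SIGABRT
    return {
        "returncode": returncode,
        "aborted": aborted,
        # Keep only the tail of stderr for crash logs or alerts.
        "stderr_tail": proc.stderr[-500:],
    }
```

A supervisor could log or alert whenever `aborted` is true, which maps to the monitoring recommendation, while other tenants' work in separate processes continues unaffected.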
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
References
- github.com/tensorflow/tensorflow/blob/95078c145b5a7a43ee046144005f733092756ab5/tensorflow/core/kernels/quantize_and_dequantize_op.cc
- github.com/tensorflow/tensorflow/blob/95078c145b5a7a43ee046144005f733092756ab5/tensorflow/core/kernels/quantize_and_dequantize_op.h
- github.com/tensorflow/tensorflow/commit/20431e9044cf2ad3c0323c34888b192f3289af6b (Patch, 3rd Party)
- github.com/tensorflow/tensorflow/security/advisories/GHSA-6g85-3hm8-83f9 (Exploit, Patch, 3rd Party)
Related Vulnerabilities
| CVE | CVSS | Summary | Relation |
|---|---|---|---|
| CVE-2020-15196 | 9.9 | TensorFlow: heap OOB read in sparse/ragged count ops | Same package: tensorflow |
| CVE-2020-15205 | 9.8 | TensorFlow: heap overflow in StringNGrams, ASLR bypass | Same package: tensorflow |
| CVE-2020-15208 | 9.8 | TFLite: OOB read/write via tensor dimension mismatch | Same package: tensorflow |
| CVE-2019-16778 | 9.8 | TensorFlow: heap overflow in UnsortedSegmentSum op | Same package: tensorflow |
| CVE-2022-23587 | 9.8 | TensorFlow: integer overflow in Grappler enables RCE | Same package: tensorflow |