CVE-2021-41198: TensorFlow: tf.tile integer overflow crashes ML process
Severity: MEDIUM. PoC available.
A local attacker with minimal privileges can crash a TensorFlow process by calling tf.tile with arguments large enough that the output element count overflows int64, which triggers a CHECK failure and aborts the process. Upgrade to TensorFlow 2.4.4, 2.5.2, 2.6.1, or 2.7.0 (or a later release on the same line). Risk is bounded to availability: no data exfiltration or code execution path exists.
Risk Assessment
Medium risk in isolation. The local attack vector limits exposure to multi-tenant training infrastructure, shared ML workspaces, and systems that accept untrusted model or graph inputs. In Jupyter-based environments or on shared GPU clusters, a malicious notebook can crash co-tenant TF sessions. The flaw is not directly network-exploitable, but if TF is wrapped in a serving API that processes user-supplied tensor specs, the effective attack surface extends to the network.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| tensorflow | pip | < 2.4.4, 2.5.0 to 2.5.1, 2.6.0 | 2.4.4, 2.5.2, 2.6.1, 2.7.0 |
If you run tensorflow at a version below the patched release for its line, you are affected.
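A quick way to triage an environment is to compare the installed version string against the patched releases named in this advisory. The following is a minimal sketch, not an official checker: the helper names `parse_version` and `is_vulnerable` are hypothetical, and it assumes that release lines older than 2.4 (which are out of the supported range) never received the fix.

```python
import re
from typing import Tuple

# First fixed patch release for each supported minor line (from this advisory).
FIXED_PATCH = {(2, 4): 4, (2, 5): 2, (2, 6): 1}
# From this minor line onward the fix ships in every release.
FIRST_FIXED_LINE = (2, 7)


def parse_version(version: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a string such as '2.5.1'.

    Suffixes such as 'rc0' are ignored, so pre-releases are treated like
    the final release of the same number (a simplification).
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def is_vulnerable(version: str) -> bool:
    """Return True if the version predates the fix for CVE-2021-41198."""
    major, minor, patch = parse_version(version)
    if (major, minor) >= FIRST_FIXED_LINE:
        return False
    fixed_patch = FIXED_PATCH.get((major, minor))
    if fixed_patch is None:
        # Assumption: lines older than 2.4 are unsupported and unpatched.
        return True
    return patch < fixed_patch
```

In a live environment the input would typically be `tf.__version__`, for example `is_vulnerable(tf.__version__)`.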
Recommended Action
- Upgrade to TensorFlow 2.4.4, 2.5.2, 2.6.1, or 2.7.0 (fix commit 9294094df6fea79271778eb7e7ae1bad8b5ef98f).
- If patching is not immediately possible, add input validation to reject tensor shapes whose product exceeds INT64_MAX before passing to tf.tile.
- In multi-tenant environments, enforce resource quotas and process isolation so a crashed session cannot affect others.
- Audit any serving layer that accepts external tensor dimensions, and reject inputs where multiples of repeated dimensions would overflow int64.
- Detection: monitor for unexpected TF process exits or CHECK-failure log lines containing 'tile'.
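The input-validation step can be sketched as a guard that runs before any call to tf.tile. This is an illustrative sketch under stated assumptions: `tile_output_elements` and `check_tile_args` are hypothetical helper names, and the output element count is taken to be the product of each input dimension times its multiple. Python integers have arbitrary precision, so the check itself cannot overflow.

```python
from math import prod
from typing import Sequence

INT64_MAX = 2**63 - 1


def tile_output_elements(input_shape: Sequence[int], multiples: Sequence[int]) -> int:
    """Number of elements tiling would produce: prod(dim_i * multiple_i)."""
    if len(input_shape) != len(multiples):
        raise ValueError("multiples must have one entry per input dimension")
    if any(d < 0 for d in input_shape) or any(m < 0 for m in multiples):
        raise ValueError("dimensions and multiples must be non-negative")
    return prod(d * m for d, m in zip(input_shape, multiples))


def check_tile_args(
    input_shape: Sequence[int],
    multiples: Sequence[int],
    max_elements: int = INT64_MAX,
) -> int:
    """Raise ValueError if the tiled output would exceed max_elements.

    Returns the element count when the arguments are acceptable.
    """
    count = tile_output_elements(input_shape, multiples)
    if count > max_elements:
        raise ValueError(
            f"tile output would have {count} elements, limit is {max_elements}"
        )
    return count
```

For example, a shape of (1, 1, 1) with multiples of 10**8 in each dimension gives 10**24 elements, well above INT64_MAX (about 9.22 * 10**18), so the guard rejects it. In practice a much lower `max_elements`, sized to available memory, is likely the more useful limit for a serving layer.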
Frequently Asked Questions
What is CVE-2021-41198?
A local attacker with minimal privileges can crash any TensorFlow process by passing an oversized tensor to tf.tile, causing a CHECK-failure due to int64 overflow. Patch immediately to TensorFlow 2.4.4+, 2.5.2+, 2.6.1+, or 2.7.0+. Risk is bounded to availability — no data exfiltration or code execution path exists.
Is CVE-2021-41198 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2021-41198, increasing the risk of exploitation.
How to fix CVE-2021-41198?
1. Upgrade to TensorFlow 2.4.4, 2.5.2, 2.6.1, or 2.7.0 (fix commit 9294094df6fea79271778eb7e7ae1bad8b5ef98f).
2. If patching is not immediately possible, add input validation to reject tensor shapes whose product exceeds INT64_MAX before passing to tf.tile.
3. In multi-tenant environments, enforce resource quotas and process isolation so a crashed session cannot affect others.
4. Audit any serving layer that accepts external tensor dimensions, and reject inputs where multiples of repeated dimensions would overflow int64.
5. Detection: monitor for unexpected TF process exits or CHECK-failure log lines containing 'tile'.
What systems are affected by CVE-2021-41198?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, data preprocessing pipelines, shared ML workspaces.
What is the CVSS score for CVE-2021-41198?
CVE-2021-41198 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.05%.
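The 5.5 base score can be reproduced from the vector string listed under CVSS Vector. The sketch below applies the CVSS v3.1 base-score formula with the metric weights from the specification; it handles only scope-unchanged vectors (S:U), which is all this CVE needs, and the function names are illustrative.

```python
import math

# CVSS v3.1 metric weights (PR values are the scope-unchanged ones).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a scope-unchanged CVSS v3.1 vector."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only supports S:U vectors")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))
```

For this CVE the impact sub-score is 6.42 * 0.56 = 3.5952 and the exploitability sub-score is about 1.8346; their sum, about 5.43, rounds up to 5.5.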
Technical Details
NVD Description
TensorFlow is an open source platform for machine learning. In affected versions if `tf.tile` is called with a large input argument then the TensorFlow process will crash due to a `CHECK`-failure caused by an overflow. The number of elements in the output tensor is too much for the `int64_t` type and the overflow is detected via a `CHECK` statement. This aborts the process. The fix will be included in TensorFlow 2.7.0. We will also cherrypick this commit on TensorFlow 2.6.1, TensorFlow 2.5.2, and TensorFlow 2.4.4, as these are also affected and still in supported range.
Exploitation Scenario
An adversary with access to a shared ML training cluster (e.g., a compromised notebook user or rogue data scientist) submits a training job that calls tf.tile with a tensor shaped to produce an output with more than INT64_MAX elements. The TF process hits the CHECK assertion, crashes, and takes down any co-located training runs or serving replicas sharing that process. In a Kubernetes-based MLOps environment, this triggers repeated pod restarts, disrupting production inference serving during an outage window the adversary can time for maximum impact.
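One way to keep a CHECK-failure abort from taking down co-located work is to run each untrusted job in its own OS process and treat a non-zero exit as a contained failure. The sketch below is a generic illustration using only the Python standard library; `run_isolated` and `crashed` are hypothetical names, and a real deployment would more likely rely on container or pod-level isolation with resource quotas.

```python
import subprocess
import sys


def run_isolated(snippet: str, timeout: float = 60.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a separate interpreter process.

    If the child aborts (for example on a CHECK failure in native code),
    only the child dies; the caller receives its exit status.
    """
    return subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def crashed(result: subprocess.CompletedProcess) -> bool:
    """True if the child exited abnormally (non-zero or killed by a signal)."""
    return result.returncode != 0


if __name__ == "__main__":
    # os.abort() stands in for a native abort inside the child process.
    outcome = run_isolated("import os; os.abort()")
    print("child crashed:", crashed(outcome))
```

The parent keeps running after the child aborts, so a supervisor can log the failure, apply back-off, and decline to resubmit the same job rather than entering a restart loop.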
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
References
- github.com/tensorflow/tensorflow/commit/9294094df6fea79271778eb7e7ae1bad8b5ef98f Patch 3rd Party
- github.com/tensorflow/tensorflow/issues/46911 Exploit 3rd Party
- github.com/tensorflow/tensorflow/security/advisories/GHSA-2p25-55c9-h58q Exploit 3rd Party
- github.com/ARPSyndicate/cvemon Exploit
- github.com/adwisatya/SnykVulndb Exploit
Related Vulnerabilities
- CVE-2020-15196 9.9 TensorFlow: heap OOB read in sparse/ragged count ops (same package: tensorflow)
- CVE-2020-15205 9.8 TensorFlow: heap overflow in StringNGrams, ASLR bypass (same package: tensorflow)
- CVE-2020-15208 9.8 TFLite: OOB read/write via tensor dimension mismatch (same package: tensorflow)
- CVE-2019-16778 9.8 TensorFlow: heap overflow in UnsortedSegmentSum op (same package: tensorflow)
- CVE-2022-23587 9.8 TensorFlow: integer overflow in Grappler enables RCE (same package: tensorflow)