CVE-2021-29608: TensorFlow: heap OOB in RaggedTensorToTensor op
HIGH | PoC AVAILABLE
Any TensorFlow deployment below 2.5.0 (or unpatched 2.1.x–2.4.x) is vulnerable to heap out-of-bounds access via malformed ragged tensor inputs, enabling local privilege escalation to full system compromise. Patch to TF 2.5.0 or the respective cherrypick releases (2.1.4, 2.2.3, 2.3.3, 2.4.2) immediately. Prioritize ML training clusters and multi-tenant inference servers where low-privileged users can submit ops.
What is the risk?
High risk for shared or multi-tenant ML infrastructure. The CVSS score of 7.8, with a local, low-complexity, low-privilege vector, means any authenticated user on a shared training node or Jupyter environment can exploit this. DCHECK guards are compiled out in release builds, removing the only defensive layer. The CVE is not listed in the KEV catalog, which reduces urgency for internet-exposed systems, but internal threat actors or compromised ML user accounts pose a credible path to host takeover.
What systems are affected?
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| TensorFlow | pip | < 2.1.4, 2.2.0–2.2.2, 2.3.0–2.3.2, 2.4.0–2.4.1 | 2.1.4, 2.2.3, 2.3.3, 2.4.2, 2.5.0 |
Do you use TensorFlow? If you run a release below 2.5.0 without one of the backported fixes (2.1.4, 2.2.3, 2.3.3, 2.4.2), you're affected.
What should I do?
5 steps:
- Patch: Upgrade to TensorFlow 2.5.0 or backport releases 2.1.4/2.2.3/2.3.3/2.4.2.
- Workaround: Restrict access to tf.raw_ops.RaggedTensorToTensor via op allowlisting if running custom serving infrastructure.
- Network isolation: Ensure TF Serving endpoints are not directly reachable by untrusted users.
- Detection: Audit for anomalous process spawning or privilege escalation events on ML training hosts; monitor for empty-tensor inputs passed to RaggedTensor ops in serving logs.
- Inventory: Scan all ML environments (notebooks, CI/CD pipelines, serving containers) for vulnerable TF versions using package managers or SBOM tooling.
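The inventory step can be approximated with a small version check. The following is a minimal sketch, not an official tool: the helper names `parse_version` and `is_vulnerable` are hypothetical, the patched versions are taken from the advisory text, and pre-release suffixes (such as `rc0`) are ignored as a simplifying assumption.

```python
import re
from typing import Tuple

# First fixed patch level per minor release line, as named in the advisory.
PATCHED = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}


def parse_version(version: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch); any suffix after the patch is ignored."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def is_vulnerable(version: str) -> bool:
    """Return True if the TensorFlow version predates the fix for CVE-2021-29608."""
    major, minor, patch = parse_version(version)
    if (major, minor) >= (2, 5):
        return False  # 2.5.0 and later include the fix
    fixed_patch = PATCHED.get((major, minor))
    if fixed_patch is None:
        return True  # release line older than 2.1: no backport listed
    return patch < fixed_patch


if __name__ == "__main__":
    for candidate in ("2.0.0", "2.3.2", "2.4.2", "2.5.0"):
        print(candidate, "vulnerable" if is_vulnerable(candidate) else "patched")
```

In a live environment, the installed version string could be obtained with `importlib.metadata.version("tensorflow")` and passed to `is_vulnerable`.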
Frequently Asked Questions
What is CVE-2021-29608?
CVE-2021-29608 is a heap out-of-bounds access in TensorFlow's tf.raw_ops.RaggedTensorToTensor op, triggered by empty input arguments that the kernel does not validate. It affects TensorFlow below 2.5.0, except the backported fix releases 2.1.4, 2.2.3, 2.3.3, and 2.4.2. On shared ML training clusters and multi-tenant inference servers, a low-privileged user who can submit ops may use it for local privilege escalation.
Is CVE-2021-29608 actively exploited?
CVE-2021-29608 is not listed in the KEV catalog of known exploited vulnerabilities. However, proof-of-concept exploit code is publicly available, which increases the risk of exploitation.
How to fix CVE-2021-29608?
1. Patch: Upgrade to TensorFlow 2.5.0 or backport releases 2.1.4/2.2.3/2.3.3/2.4.2.
2. Workaround: Restrict access to tf.raw_ops.RaggedTensorToTensor via op allowlisting if running custom serving infrastructure.
3. Network isolation: Ensure TF Serving endpoints are not directly reachable by untrusted users.
4. Detection: Audit for anomalous process spawning or privilege escalation events on ML training hosts; monitor for empty-tensor inputs passed to RaggedTensor ops in serving logs.
5. Inventory: Scan all ML environments (notebooks, CI/CD pipelines, serving containers) for vulnerable TF versions using package managers or SBOM tooling.
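For the op-restriction workaround, one possible approach is to scan the op types a submitted graph uses before it is accepted. The sketch below is illustrative only: `DENIED_OPS` and `find_denied_ops` are hypothetical names, and the function works on plain op-type strings rather than calling into TensorFlow.

```python
from typing import FrozenSet, Iterable, List

# Op types to reject until the host is patched (assumed policy, adjust as needed).
DENIED_OPS: FrozenSet[str] = frozenset({"RaggedTensorToTensor"})


def find_denied_ops(
    op_types: Iterable[str], denied: FrozenSet[str] = DENIED_OPS
) -> List[str]:
    """Return the sorted, de-duplicated op types that appear on the deny list."""
    return sorted({op for op in op_types if op in denied})


if __name__ == "__main__":
    submitted = ["Placeholder", "RaggedTensorToTensor", "MatMul"]
    hits = find_denied_ops(submitted)
    if hits:
        print("rejecting graph, denied ops:", hits)
```

With a TensorFlow `GraphDef`, the op types could be collected as `[node.op for node in graph_def.node]`; ops nested inside the graph's function library would need to be walked separately, which this sketch does not do.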
What systems are affected by CVE-2021-29608?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, notebook environments, data preprocessing pipelines.
What is the CVSS score for CVE-2021-29608?
CVE-2021-29608 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.23%.
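The 7.8 base score can be reproduced from the vector string listed under CVSS Vector. The sketch below applies the CVSS v3.1 base-score equations for the Scope: Unchanged case only; the function names are illustrative, and the metric weights are the ones published in the CVSS v3.1 specification.

```python
import math

# Metric weights from the CVSS v3.1 specification (PR values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, using the integer method from the spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a Scope: Unchanged vector."""
    metrics = dict(item.split(":") for item in vector.split("/"))
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: table[metrics[name]] for name, table in WEIGHTS.items()}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H"))  # 7.8
```

Here the impact sub-score is about 5.87 and the exploitability sub-score about 1.83; their sum, 7.71, rounds up to 7.8.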
What is the AI security impact?
Affected AI Architectures
MITRE ATLAS Techniques
- AML.T0010.001 AI Software
- AML.T0037 Data from Local System
- AML.T0049 Exploit Public-Facing Application
What are the technical details?
Original Advisory
TensorFlow is an end-to-end open source platform for machine learning. Due to lack of validation in `tf.raw_ops.RaggedTensorToTensor`, an attacker can exploit an undefined behavior if input arguments are empty. The implementation (https://github.com/tensorflow/tensorflow/blob/656e7673b14acd7835dc778867f84916c6d1cac2/tensorflow/core/kernels/ragged_tensor_to_tensor_op.cc#L356-L360) only checks that one of the tensors is not empty, but does not check for the other ones. There are multiple `DCHECK` validations to prevent heap OOB, but these are no-op in release builds, hence they don't prevent anything. The fix will be included in TensorFlow 2.5.0. We will also cherrypick these commits on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
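Where patching has to wait, a caller-side guard can reject empty arguments before they reach the op. The sketch below is not the upstream fix: `count_elements` and `check_ragged_to_tensor_inputs` are hypothetical helpers that operate on plain nested Python lists, and the policy is deliberately conservative, so it may also reject some inputs that a patched TensorFlow would accept.

```python
from typing import Any, Sequence


def count_elements(value: Any) -> int:
    """Count leaf elements in a nested list/tuple; a bare scalar counts as 1."""
    if isinstance(value, (list, tuple)):
        return sum(count_elements(item) for item in value)
    return 1


def check_ragged_to_tensor_inputs(
    values: Any,
    row_partition_tensors: Sequence[Any],
    row_partition_types: Sequence[str],
) -> None:
    """Raise ValueError if any argument is empty (conservative, assumed policy)."""
    if count_elements(values) == 0:
        raise ValueError("values must not be empty")
    if len(row_partition_types) == 0:
        raise ValueError("row_partition_types must not be empty")
    if len(row_partition_tensors) == 0:
        raise ValueError("row_partition_tensors must not be empty")
    for index, tensor in enumerate(row_partition_tensors):
        if count_elements(tensor) == 0:
            raise ValueError(f"row_partition_tensors[{index}] must not be empty")


if __name__ == "__main__":
    # Passes silently: every argument carries at least one element.
    check_ragged_to_tensor_inputs([1, 2, 3], [[0, 1, 3]], ["ROW_SPLITS"])
    try:
        check_ragged_to_tensor_inputs([1, 2, 3], [[]], ["ROW_SPLITS"])
    except ValueError as error:
        print("rejected:", error)
```

A wrapper of this kind would run in the serving or job-submission layer before the arguments are converted to tensors and passed to `tf.raw_ops.RaggedTensorToTensor`.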
Exploitation Scenario
An adversary with a low-privileged account on a shared ML training cluster submits a TensorFlow job containing a crafted call to tf.raw_ops.RaggedTensorToTensor with an intentionally empty input tensor. Because the kernel does not validate every input and the DCHECK guards are compiled out of release builds, the call triggers undefined behavior and heap OOB access. On a vulnerable host, this can give the attacker a memory corruption primitive for overwriting adjacent heap structures and escalating to the privileges of the TensorFlow process, often a service account with access to training data, model artifacts, and cloud credentials stored in environment variables.
Weaknesses (CWE)
CWE-131 — Incorrect Calculation of Buffer Size: The product does not correctly calculate the size to be used when allocating a buffer, which could lead to a buffer overflow.
- [Implementation] When allocating a buffer for the purpose of transforming, converting, or encoding an input, allocate enough memory to handle the largest possible encoding. For example, in a routine that converts "&" characters to "&amp;" for HTML entity encoding, the output buffer needs to be at least 5 times as large as the input buffer.
- [Implementation] Understand the programming language's underlying representation and how it interacts with numeric calculation (CWE-681). Pay close attention to byte size discrepancies, precision, signed/unsigned distinctions, truncation, conversion and casting between types, "not-a-number" calculations, and how the language handles numbers that are too large or too small for its underlying representation. [REF-7] Also be careful to account for 32-bit, 64-bit, and other potential differences that may affect the numeric representation.
Source: MITRE CWE corpus.
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
References
- github.com/tensorflow/tensorflow/commit/b761c9b652af2107cfbc33efd19be0ce41daa33e Patch 3rd Party
- github.com/tensorflow/tensorflow/commit/c4d7afb6a5986b04505aca4466ae1951686c80f6 Patch 3rd Party
- github.com/tensorflow/tensorflow/commit/f94ef358bb3e91d517446454edff6535bcfe8e4a Patch 3rd Party
- github.com/tensorflow/tensorflow/security/advisories/GHSA-rgvq-pcvf-hx75 Exploit Patch 3rd Party
Related Vulnerabilities
- CVE-2020-15196 (9.9) TensorFlow: heap OOB read in sparse/ragged count ops. Same package: tensorflow
- CVE-2020-15205 (9.8) TensorFlow: heap overflow in StringNGrams, ASLR bypass. Same package: tensorflow
- CVE-2020-15208 (9.8) TFLite: OOB read/write via tensor dimension mismatch. Same package: tensorflow
- CVE-2019-16778 (9.8) TensorFlow: heap overflow in UnsortedSegmentSum op. Same package: tensorflow
- CVE-2022-23587 (9.8) TensorFlow: integer overflow in Grappler enables RCE. Same package: tensorflow