CVE-2021-37638: TensorFlow null pointer dereference in RaggedTensorToTensor

CISO Take

Any TensorFlow deployment (2.3.x–2.5.x) accepting user-controlled tensor inputs is exposed to process crash or potential code execution via a malformed RaggedTensor API call. Patch immediately to TF 2.6.0, 2.5.1, 2.4.3, or 2.3.4. In shared ML environments (JupyterHub, model serving APIs), treat this as high-priority since low-privilege local access is all that's needed.

What is the risk?

A CVSS score of 7.8 with a local attack vector and a low privilege requirement makes this practically exploitable in any multi-tenant ML infrastructure: shared training clusters, Jupyter servers, or internal model serving endpoints. The undefined behavior from CWE-476 can escalate beyond denial-of-service into memory corruption and potential RCE, so the effective risk is higher than a simple crash. There is no evidence of active exploitation, but exploit complexity is trivial once the access threshold is met.

What systems are affected?

Package	Ecosystem	Vulnerable Range	Patched
TensorFlow	pip	2.3.x–2.5.x (releases before the patched versions)	2.3.4, 2.4.3, 2.5.1, 2.6.0

Do you use TensorFlow 2.3.x–2.5.x without one of the patched releases? You're affected.
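The affected and fixed releases named on this page can be checked mechanically. The following is a minimal sketch, not an official tool: the function name `classify_tf_version` and the three result labels are hypothetical, and the version table is taken only from the fixed releases listed in this advisory (2.3.4, 2.4.3, 2.5.1, 2.6.0). Releases older than 2.3 are reported as `"unknown"` because this page does not state their status, and pre-release suffixes such as `rc1` are ignored.

```python
import re

# Minimum patched patch-level per affected minor series, per this advisory.
FIXED_PATCH_LEVEL = {(2, 3): 4, (2, 4): 3, (2, 5): 1}


def classify_tf_version(version: str) -> str:
    """Return 'patched', 'vulnerable', or 'unknown' for a TensorFlow version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())

    # 2.6.0 and later ship with the fix.
    if (major, minor) >= (2, 6):
        return "patched"

    # Affected series: compare against the cherry-picked fix release.
    fixed = FIXED_PATCH_LEVEL.get((major, minor))
    if fixed is not None:
        return "patched" if patch >= fixed else "vulnerable"

    # Older series are not covered by the text of this page.
    return "unknown"
```

In practice the input would typically come from `tf.__version__` or from a dependency lock file; the sketch deliberately takes a plain string so it can run in an inventory script without TensorFlow installed.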

How severe is it?

CVSS 3.1

7.8 / 10

EPSS

0.2%

chance of exploitation in 30 days

Higher than 6% of all CVEs

Source: EPSS v3 — FIRST.org

Exploitation Status

No known exploitation

Sophistication

Trivial

What is the attack surface?

AV Local

AC Low

PR Low

UI None

S Unchanged

C High

I High

A High
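The 7.8 base score follows from these metric values. As a worked check, the sketch below applies the CVSS v3.1 base score equations for the scope-unchanged case, using the numeric weights from the CVSS v3.1 specification. The function names are illustrative, and the scope-changed branch is intentionally left out because this CVE does not need it.

```python
import math

# Metric weights from the CVSS v3.1 specification (scope unchanged).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a) -> float:
    """Compute the CVSS v3.1 base score when Scope is Unchanged."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# AV:L / AC:L / PR:L / UI:N / S:U / C:H / I:H / A:H
score = base_score_scope_unchanged("L", "L", "L", "N", "H", "H", "H")
print(score)  # 7.8
```

The impact term is about 5.87 and the exploitability term about 1.83; their sum of roughly 7.71 rounds up to 7.8.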

What should I do?

5 steps

Patch

Upgrade to TensorFlow 2.6.0, 2.5.1, 2.4.3, or 2.3.4 (commit 301ae88b).

Input validation

Add server-side validation to reject empty row_partition_types before passing to tf.raw_ops.RaggedTensorToTensor.

Restrict API surface

If using TF Serving or custom endpoints, disable or restrict access to raw tf.raw_ops calls from untrusted callers.

Isolate

Run model serving and training workloads in sandboxed containers with minimal privileges to limit blast radius.

Detect

Monitor for unexpected TensorFlow process crashes (OOMKilled, segfaults) in your ML infrastructure as an indicator of exploitation attempts.
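The input validation step can be sketched as a thin guard in front of the raw op. This is an illustrative wrapper, not part of TensorFlow: the names `validate_row_partition_types` and `guarded_ragged_to_tensor` are hypothetical, and the allowed set reflects the partition types the op documents (`FIRST_DIM_SIZE`, `VALUE_ROWIDS`, `ROW_SPLITS`). TensorFlow is imported lazily so the validation itself can run and be tested without it.

```python
from typing import Any, List, Sequence

# Partition type names accepted by RaggedTensorToTensor.
ALLOWED_ROW_PARTITION_TYPES = frozenset(
    {"FIRST_DIM_SIZE", "VALUE_ROWIDS", "ROW_SPLITS"}
)


def validate_row_partition_types(row_partition_types: Sequence[str]) -> List[str]:
    """Reject empty or unrecognized row_partition_types before they reach the kernel."""
    if isinstance(row_partition_types, (str, bytes)):
        raise ValueError("row_partition_types must be a list of strings")
    types = list(row_partition_types)
    if not types:
        # The empty list is the input that triggers CVE-2021-37638.
        raise ValueError("row_partition_types must not be empty")
    unknown = [
        t for t in types
        if not isinstance(t, str) or t not in ALLOWED_ROW_PARTITION_TYPES
    ]
    if unknown:
        raise ValueError(f"unsupported row partition types: {unknown!r}")
    return types


def guarded_ragged_to_tensor(
    shape: Any,
    values: Any,
    default_value: Any,
    row_partition_tensors: Sequence[Any],
    row_partition_types: Sequence[str],
) -> Any:
    """Validate untrusted arguments, then call the raw op."""
    checked = validate_row_partition_types(row_partition_types)
    import tensorflow as tf  # lazy import: only needed once validation passes

    return tf.raw_ops.RaggedTensorToTensor(
        shape=shape,
        values=values,
        default_value=default_value,
        row_partition_tensors=row_partition_tensors,
        row_partition_types=checked,
    )
```

A guard like this is a stopgap for callers that cannot upgrade immediately. It does not replace the patch, since any other code path that reaches the raw op directly remains exposed.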

How is it classified?

Code Execution, DoS, Framework, Inference, AML.T0010.001 - AI Software, AML.T0029 - Denial of AI Service, AML.T0049 - Exploit Public-Facing Application

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act

Art.15 - Accuracy, robustness and cybersecurity

ISO 42001

A.6.2.6 - Cybersecurity of AI systems

NIST AI RMF

MANAGE-2.2 - Mechanisms are in place to respond to and recover from vulnerabilities

OWASP LLM Top 10

LLM06 - Sensitive Information Disclosure

Frequently Asked Questions

What is CVE-2021-37638?

CVE-2021-37638 is a null pointer dereference (CWE-476) in TensorFlow's `tf.raw_ops.RaggedTensorToTensor` API: the kernel reads the first element of the user-supplied `row_partition_types` list without checking that the list is non-empty. Any TensorFlow deployment (2.3.x–2.5.x) that accepts user-controlled tensor inputs is exposed to a process crash or potential code execution. The issue is fixed in TF 2.6.0, 2.5.1, 2.4.3, and 2.3.4.

Is CVE-2021-37638 actively exploited?

No confirmed active exploitation of CVE-2021-37638 has been reported, but organizations should still patch proactively.

How to fix CVE-2021-37638?

1. **Patch**: Upgrade to TensorFlow 2.6.0, 2.5.1, 2.4.3, or 2.3.4 (commit 301ae88b).
2. **Input validation**: Add server-side validation to reject empty row_partition_types before passing to tf.raw_ops.RaggedTensorToTensor.
3. **Restrict API surface**: If using TF Serving or custom endpoints, disable or restrict access to raw tf.raw_ops calls from untrusted callers.
4. **Isolate**: Run model serving and training workloads in sandboxed containers with minimal privileges to limit blast radius.
5. **Detect**: Monitor for unexpected TensorFlow process crashes (OOMKilled, segfaults) in your ML infrastructure as an indicator of exploitation attempts.
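For the Detect step, a supervisor script can tell a segmentation fault apart from other exits by inspecting the worker's exit status. The sketch below assumes a POSIX host: Python's `subprocess` reports death by signal as a negative signal number, while shells and many container runtimes report 128 plus the signal number. The function name `looks_like_segfault` is hypothetical.

```python
import signal


def looks_like_segfault(returncode: int) -> bool:
    """Return True if an exit status indicates the process died from SIGSEGV."""
    # -11: subprocess convention for a child killed by SIGSEGV.
    # 139: shell/container convention (128 + 11).
    return returncode in (-signal.SIGSEGV, 128 + signal.SIGSEGV)


# Example statuses a supervisor might collect from TensorFlow workers.
for status in (0, -11, 139, 137):
    print(status, looks_like_segfault(status))
```

Status 137 (128 + SIGKILL, which is what an OOM kill usually produces) is deliberately not flagged here, since a null pointer dereference is expected to surface as a segfault. Whether to alert on it as well is a policy choice.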

What systems are affected by CVE-2021-37638?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, inference APIs, shared ML notebooks.

What is the CVSS score for CVE-2021-37638?

CVE-2021-37638 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.17%.

What is the AI security impact?

Affected AI Architectures

training pipelines, model serving, inference APIs, shared ML notebooks

MITRE ATLAS Techniques

AML.T0010.001 AI Software

AML.T0029 Denial of AI Service

AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Art.15

ISO 42001: A.6.2.6

NIST AI RMF: MANAGE-2.2

OWASP LLM Top 10: LLM06

What are the technical details?

Original Advisory

TensorFlow is an end-to-end open source platform for machine learning. Sending invalid argument for `row_partition_types` of `tf.raw_ops.RaggedTensorToTensor` API results in a null pointer dereference and undefined behavior. The [implementation](https://github.com/tensorflow/tensorflow/blob/47a06f40411a69c99f381495f490536972152ac0/tensorflow/core/kernels/ragged_tensor_to_tensor_op.cc#L328) accesses the first element of a user supplied list of values without validating that the provided list is not empty. We have patched the issue in GitHub commit 301ae88b331d37a2a16159b65b255f4f9eb39314. The fix will be included in TensorFlow 2.6.0. We will also cherrypick this commit on TensorFlow 2.5.1, TensorFlow 2.4.3, and TensorFlow 2.3.4, as these are also affected and still in supported range.

Exploitation Scenario

An attacker with low-privilege access to a shared ML training cluster or JupyterHub server submits a crafted notebook or script that calls `tf.raw_ops.RaggedTensorToTensor(shape=..., values=..., default_value=..., row_partition_tensors=..., row_partition_types=[])` with an empty list for row_partition_types. This triggers a null pointer dereference in the kernel implementation at ragged_tensor_to_tensor_op.cc:328, crashing the TensorFlow process. In environments without memory protection or with misconfigured heap allocators, the undefined behavior may be leveraged for memory corruption and escalation to code execution — allowing the attacker to pivot within the ML infrastructure or exfiltrate model weights.

Weaknesses (CWE)

CWE-476 NULL Pointer Dereference

CWE-476 — NULL Pointer Dereference: The product dereferences a pointer that it expects to be valid but is NULL.

[Implementation] For any pointers that could have been modified or provided from a function that can return NULL, check the pointer for NULL before use. When working with a multithreaded or otherwise asynchronous environment, ensure that proper locking APIs are used to lock before the check, and unlock when it has finished [REF-1484].
[Requirements] Select a programming language that is not susceptible to these issues.

Source: MITRE CWE corpus.