CVE-2021-37656: TensorFlow null pointer reference binding in RaggedTensorToSparse

CISO Take

This TensorFlow vulnerability lets a local attacker with low privileges trigger undefined behavior, potentially crashing or corrupting ML processes, by passing a RaggedTensor whose splits are not in increasing order to `tf.raw_ops.RaggedTensorToSparse`. On shared ML platforms (JupyterHub, training clusters, TF Serving), this is a realistic DoS vector and a possible stepping stone for lateral movement. Upgrade to TF 2.6.0, 2.5.1, 2.4.3, or 2.3.4 immediately; until then, the only mitigation is to validate or block untrusted input to raw ops.

What is the risk?

A CVSS score of 7.8 (High) with a local attack vector, low complexity, and low privilege requirements makes this dangerous in any multi-tenant ML environment. Although it is not remotely exploitable in a default setup, many production ML platforms (shared Jupyter environments, MLOps pipelines accepting user-submitted jobs, or TF Serving endpoints that accept raw tensor inputs) effectively expose the vulnerable op to network-reachable users. There is no evidence of active exploitation and no CISA KEV listing; the primary risk is insider threat or compromised low-privilege accounts targeting ML infrastructure.

What systems are affected?

Package	Ecosystem	Vulnerable Range	Patched
TensorFlow	pip	< 2.3.4; 2.4.0 to 2.4.2; 2.5.0	2.3.4, 2.4.3, 2.5.1, 2.6.0

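Do you use TensorFlow? You're affected if your installed version is older than the patched release for its branch (2.3.4, 2.4.3, 2.5.1, or 2.6.0).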
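A quick way to answer that question for a given installation is to compare its version string against the patched releases named in the advisory. The sketch below is illustrative only: the helper names `parse_version` and `is_vulnerable` are hypothetical, and it assumes that branches older than 2.3 never received a backport and that pre-release suffixes can be ignored.

```python
import re

# Patched releases named in the advisory, keyed by (major, minor) branch.
PATCHED = {(2, 3): (2, 3, 4), (2, 4): (2, 4, 3), (2, 5): (2, 5, 1)}
FIRST_FIXED_BRANCH = (2, 6)  # 2.6.0 and later include the fix


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from strings such as '2.4.1' or '2.5.0rc1'.

    Pre-release suffixes are ignored, so a release candidate is treated
    like the final release of the same number (an assumption of this sketch).
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_vulnerable(version: str) -> bool:
    """Return True if the version predates the fix on its release branch."""
    parsed = parse_version(version)
    branch = parsed[:2]
    if branch >= FIRST_FIXED_BRANCH:
        return False
    # Assumption: branches older than 2.3 have no patched release at all.
    fixed = PATCHED.get(branch)
    return fixed is None or parsed < fixed
```

In an environment where TensorFlow is installed, the input would typically be `tensorflow.__version__`.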

How severe is it?

CVSS 3.1: 7.8 / 10 (High)

EPSS: 0.2% chance of exploitation in 30 days (higher than 6% of all CVEs). Source: EPSS v3, FIRST.org

Exploitation Status: No known exploitation

Sophistication: Moderate

What is the attack surface?

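Metric	Value
Attack Vector (AV)	Local
Attack Complexity (AC)	Low
Privileges Required (PR)	Low
User Interaction (UI)	None
Scope (S)	Unchanged
Confidentiality (C)	High
Integrity (I)	High
Availability (A)	High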
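The 7.8 base score follows directly from these metrics. As a worked check, the sketch below applies the CVSS v3.1 base-score equations for the scope-unchanged case; the function names are illustrative, and the vector string is assembled from the values listed above.

```python
import math

# Metric weights from the CVSS v3.1 specification (scope-unchanged values).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(vector: str) -> float:
    """Compute the base score for a vector with S:U (other scopes unsupported)."""
    metrics = dict(item.split(":") for item in vector.split("/"))
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles S:U vectors")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score_scope_unchanged("AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H")
print(score)  # 7.8
```

Here the impact sub-score is about 5.87 and the exploitability sub-score about 1.83; their sum, 7.71, rounds up to 7.8.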

What should I do?

5 steps

1. Patch: Upgrade to TensorFlow 2.6.0 (primary fix) or to 2.5.1, 2.4.3, or 2.3.4, which carry the cherry-picked fix. Commit 1071f554 is the authoritative fix.
2. Input validation: Until patched, validate that every splits array in RaggedTensor inputs starts at 0 and is non-decreasing before passing it to raw ops. Equal neighboring values are legitimate (they denote empty rows), so a strictly increasing check would reject valid tensors.
3. Isolation: Run TF worker processes under least-privilege accounts and in sandboxed environments (containers, VMs) to limit the blast radius of memory corruption.
4. Detection: Monitor for process crashes in ML workers and unexpected OOM or segfault signals, which may indicate exploitation attempts.
5. Audit: Identify all services accepting external RaggedTensor inputs (especially TF Serving endpoints) and prioritize patching those first.
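The input-validation step above can be sketched as a small pre-check that runs before any splits reach a raw op. This is a minimal illustration, not the upstream fix: `validate_row_splits` and `validate_nested_splits` are hypothetical helper names, and the sketch assumes the splits are available as plain Python integer sequences.

```python
from typing import Iterable, Sequence


def validate_row_splits(splits: Sequence[int]) -> None:
    """Raise ValueError unless `splits` is a well-formed row-splits vector.

    Well-formed here means: non-empty, first element 0, and non-decreasing.
    Equal neighbors are allowed because they encode empty rows.
    """
    if len(splits) == 0:
        raise ValueError("splits must contain at least one element")
    if splits[0] != 0:
        raise ValueError(f"splits must start at 0, got {splits[0]}")
    for index in range(1, len(splits)):
        if splits[index] < splits[index - 1]:
            raise ValueError(
                f"splits must be non-decreasing, but splits[{index}]="
                f"{splits[index]} < splits[{index - 1}]={splits[index - 1]}"
            )


def validate_nested_splits(nested: Iterable[Sequence[int]]) -> None:
    """Validate every level of a nested-splits list (hypothetical helper)."""
    for splits in nested:
        validate_row_splits(splits)
```

Calling `validate_row_splits([0, 3, 3, 5])` returns silently, while `validate_row_splits([0, 3, 1, 5])` raises `ValueError`. Rejecting the input before the op runs keeps malformed splits away from the unpatched kernel.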

How is it classified?

Code Execution DoS Framework Training Data Inference

AML.T0001 - Search Open AI Vulnerability Analysis

AML.T0010.001 - AI Software

AML.T0049 - Exploit Public-Facing Application

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act

Art. 15 - Accuracy, robustness and cybersecurity

ISO 42001

A.6.2.6 - AI system vulnerability management

NIST AI RMF

MANAGE-2.4 - Residual risks are managed

OWASP LLM Top 10

LLM09:2025 - Misinformation

Frequently Asked Questions

What is CVE-2021-37656?

CVE-2021-37656 is a flaw in TensorFlow's `tf.raw_ops.RaggedTensorToSparse` kernel, which does not check that the splits values of its RaggedTensor input are in increasing order. A local attacker with low privileges can supply out-of-order splits to bind a reference to a null pointer, causing undefined behavior that may crash or corrupt ML processes. The fix ships in TF 2.6.0, 2.5.1, 2.4.3, and 2.3.4.

Is CVE-2021-37656 actively exploited?

No confirmed active exploitation of CVE-2021-37656 has been reported, but organizations should still patch proactively.

How to fix CVE-2021-37656?

1. Patch: Upgrade to TensorFlow 2.6.0 (primary fix) or to 2.5.1, 2.4.3, or 2.3.4, which carry the cherry-picked fix. Commit 1071f554 is the authoritative fix.
2. Input validation: Until patched, validate that every splits array in RaggedTensor inputs starts at 0 and is non-decreasing before passing it to raw ops.
3. Isolation: Run TF worker processes under least-privilege accounts and in sandboxed environments (containers, VMs) to limit the blast radius of memory corruption.
4. Detection: Monitor for process crashes in ML workers and unexpected OOM or segfault signals, which may indicate exploitation attempts.
5. Audit: Identify all services accepting external RaggedTensor inputs (especially TF Serving endpoints) and prioritize patching those first.
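For the detection step, one low-effort signal is the exit status of worker processes. The sketch below is a hedged illustration that assumes workers are launched through Python's `subprocess` module on a POSIX system, where a negative return code `-N` means the child was terminated by signal `N`; `classify_exit` is a hypothetical helper name.

```python
import signal


def classify_exit(returncode: int) -> str:
    """Map a POSIX subprocess return code to a coarse crash category."""
    if returncode == 0:
        return "ok"
    if returncode == -signal.SIGSEGV:
        return "segfault"
    if returncode == -signal.SIGABRT:
        return "abort"
    if returncode == -signal.SIGKILL:
        # The kernel OOM killer delivers SIGKILL, but so can operators
        # and orchestrators, so treat this as a hint rather than proof.
        return "killed"
    return "error"
```

A supervisor could pass `subprocess.run(...).returncode` to this function and raise an alert when segfaults or aborts cluster around jobs from a single user.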

What systems are affected by CVE-2021-37656?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, ML workstations, shared notebook environments, batch inference pipelines.

What is the CVSS score for CVE-2021-37656?

CVE-2021-37656 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.17%.

What is the AI security impact?

Affected AI Architectures

training pipelines, model serving, ML workstations, shared notebook environments, batch inference pipelines

MITRE ATLAS Techniques

AML.T0001 Search Open AI Vulnerability Analysis

AML.T0010.001 AI Software

AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Art. 15

ISO 42001: A.6.2.6

NIST AI RMF: MANAGE-2.4

OWASP LLM Top 10: LLM09:2025

What are the technical details?

Original Advisory

TensorFlow is an end-to-end open source platform for machine learning. In affected versions an attacker can cause undefined behavior via binding a reference to null pointer in `tf.raw_ops.RaggedTensorToSparse`. The [implementation](https://github.com/tensorflow/tensorflow/blob/f24faa153ad31a4b51578f8181d3aaab77a1ddeb/tensorflow/core/kernels/ragged_tensor_to_sparse_kernel.cc#L30) has an incomplete validation of the splits values: it does not check that they are in increasing order. We have patched the issue in GitHub commit 1071f554dbd09f7e101324d366eec5f4fe5a3ece. The fix will be included in TensorFlow 2.6.0. We will also cherrypick this commit on TensorFlow 2.5.1, TensorFlow 2.4.3, and TensorFlow 2.3.4, as these are also affected and still in supported range.

Exploitation Scenario

An attacker with access to a shared Jupyter notebook environment, or to an MLOps pipeline that accepts user-submitted training jobs, crafts a TensorFlow RaggedTensor whose splits array is not in increasing order (e.g., [0, 3, 1, 5]). When the job executes `tf.raw_ops.RaggedTensorToSparse`, the incomplete validation lets the kernel bind a reference to a null pointer, triggering undefined behavior. In a Kubernetes-based training cluster, the most likely outcome is a crash of the ML worker, and repeated crashes can push the pod into restart loops and disrupt co-located training jobs sharing the node. If memory corruption is achievable, the attacker could pivot to code execution within the ML worker's process space, potentially accessing model weights, training data, or credentials stored in environment variables.
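To see why out-of-order splits are malformed, recall that in the row-splits encoding row `i` covers `values[splits[i]:splits[i + 1]]`, so the difference between neighboring splits is that row's length. The pure-Python sketch below (no TensorFlow required; `row_lengths` is an illustrative helper) shows that the example splits from the scenario imply a negative row length, a state the kernel was not written to handle.

```python
def row_lengths(splits):
    """Return the length of each row implied by a row-splits vector."""
    return [splits[i + 1] - splits[i] for i in range(len(splits) - 1)]


print(row_lengths([0, 3, 5]))     # [3, 2]      valid: two rows
print(row_lengths([0, 3, 1, 5]))  # [3, -2, 4]  malformed: negative length
```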