CVE-2021-37667: TensorFlow: UnicodeEncode null deref, local code exec

HIGH
Published August 12, 2021
CISO Take

A low-privileged local attacker can trigger undefined behavior (null pointer dereference) in TensorFlow's UnicodeEncode op by passing an empty input_splits tensor, potentially leading to process crash or arbitrary code execution. In shared ML platforms—Jupyter hubs, Kubeflow, ML training infrastructure—any tenant with op execution access is a viable threat actor. Patch immediately to TF 2.6.0, 2.5.1, 2.4.3, or 2.3.4.

What is the risk?

A CVSS score of 7.8 (High) with a local, low-complexity, low-privilege profile makes this a credible insider or lateral-movement vector in shared ML environments. The CVE is not in the CISA KEV catalog and there is no evidence of active exploitation, which reduces urgency for air-gapped or single-tenant deployments. Multi-tenant ML platforms (shared notebooks, model training clusters) carry significantly more risk, because any authenticated user who can run ops is a potential attacker. NLP preprocessing pipelines that let user-controlled data reach raw ops are at highest risk.

What systems are affected?

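Package: TensorFlow
Ecosystem: pip
Vulnerable range: releases earlier than the patched version on each supported branch (2.3.x, 2.4.x, 2.5.x)
Patched: 2.3.4, 2.4.3, 2.5.1, 2.6.0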

Do you use TensorFlow? If you run a release that predates the patched versions, you are affected.
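A quick way to triage an installed release is to compare it against the fixed versions named in the advisory. The sketch below is illustrative only: `is_patched` and `FIXED_PATCH_LEVEL` are hypothetical names, only plain `X.Y.Z` version strings are accepted, and branches older than 2.3 are treated as unpatched on the assumption that they received no fix.

```python
import re

# First fixed patch level on each branch named in the advisory.
# 2.6.0 and later are assumed to contain the fix.
FIXED_PATCH_LEVEL = {(2, 3): 4, (2, 4): 3, (2, 5): 1}


def parse_version(version):
    """Return (major, minor, patch) for a plain 'X.Y.Z' version string."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        # Pre-release or local suffixes are rejected rather than guessed at.
        raise ValueError(f"unsupported version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(version):
    """Return True if the version is at or past the fixed release on its branch."""
    major, minor, patch = parse_version(version)
    if (major, minor) >= (2, 6):
        return True
    fixed = FIXED_PATCH_LEVEL.get((major, minor))
    # Assumption: branches with no listed fix (older than 2.3) stay vulnerable.
    return fixed is not None and patch >= fixed
```

In a live environment the argument would typically be `tf.__version__`; a `ValueError` for a release candidate or nightly build is a prompt to check that build by hand.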

How severe is it?

CVSS 3.1
7.8 / 10
EPSS
0.2%
chance of exploitation in 30 days
Higher than 7% of all CVEs
Exploitation Status
No known exploitation
Sophistication
Moderate

What is the attack surface?

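Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): High
Integrity (I): High
Availability (A): High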

What should I do?

  1. Patch: Upgrade to TensorFlow 2.6.0, 2.5.1, 2.4.3, or 2.3.4 (cherry-picked fix in all supported branches).

  2. Input validation: Add explicit shape/size validation on input_splits tensors before passing to UnicodeEncode—reject empty or zero-dimension tensors at application layer.

  3. Least privilege: Restrict access to tf.raw_ops in multi-tenant environments, for example with op allowlisting where the platform supports it. Disabling eager execution does not restrict which ops a user can run, so it is not a mitigation.

  4. Detection: Monitor for segfaults or abnormal process crashes in TF serving pods/containers—unexpected exits in inference services may indicate exploitation attempts.

  5. Container isolation: Ensure TF processes run in isolated namespaces with no host-level privileges to limit blast radius.
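The input validation in step 2 can be sketched in plain Python, applied before the values are handed to `tf.raw_ops.UnicodeEncode`. This is a minimal sketch, not TensorFlow's own fix: `validate_input_splits` is a hypothetical helper, and only the emptiness check corresponds to the advisory. The remaining checks (start at 0, non-decreasing, last entry equal to the number of values) are extra hardening based on the usual row-splits convention.

```python
def validate_input_splits(input_splits, num_values):
    """Return input_splits as a list of ints, or raise ValueError if malformed."""
    try:
        splits = [int(s) for s in input_splits]
    except TypeError:
        # Scalars (zero-dimension input) and nested sequences end up here.
        raise ValueError("input_splits must be a 1-D sequence of integers") from None

    # The check the advisory is about: never pass an empty splits tensor.
    if not splits:
        raise ValueError("input_splits must not be empty")

    # Extra hardening (assumption, not taken from the advisory).
    if splits[0] != 0:
        raise ValueError("input_splits must start at 0")
    if any(later < earlier for earlier, later in zip(splits, splits[1:])):
        raise ValueError("input_splits must be non-decreasing")
    if splits[-1] != num_values:
        raise ValueError("last split must equal the number of input values")
    return splits
```

For example, `validate_input_splits([0, 2, 5], 5)` passes, while `validate_input_splits([], 0)` raises before any op is executed. Application-layer checks like this reduce exposure but do not replace upgrading.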

How is it classified?

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.2.6 - Management of AI system vulnerabilities
NIST AI RMF
MANAGE 2.2 - Mechanisms to sustain the value of deployed AI and manage risks
OWASP LLM Top 10
LLM05:2025 - Insecure Output Handling / Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2021-37667?

A low-privileged local attacker can trigger undefined behavior (null pointer dereference) in TensorFlow's UnicodeEncode op by passing an empty input_splits tensor, potentially leading to process crash or arbitrary code execution. In shared ML platforms—Jupyter hubs, Kubeflow, ML training infrastructure—any tenant with op execution access is a viable threat actor. Patch immediately to TF 2.6.0, 2.5.1, 2.4.3, or 2.3.4.

Is CVE-2021-37667 actively exploited?

No confirmed active exploitation of CVE-2021-37667 has been reported, but organizations should still patch proactively.

How to fix CVE-2021-37667?

  1. Patch: Upgrade to TensorFlow 2.6.0, 2.5.1, 2.4.3, or 2.3.4 (cherry-picked fix in all supported branches).

  2. Input validation: Add explicit shape/size validation on input_splits tensors before passing to UnicodeEncode, and reject empty or zero-dimension tensors at the application layer.

  3. Least privilege: Restrict access to tf.raw_ops in multi-tenant environments, for example with op allowlisting where the platform supports it. Disabling eager execution does not restrict which ops a user can run, so it is not a mitigation.

  4. Detection: Monitor for segfaults or abnormal process crashes in TF serving pods/containers. Unexpected exits in inference services may indicate exploitation attempts.

  5. Container isolation: Ensure TF processes run in isolated namespaces with no host-level privileges to limit blast radius.
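For the detection step, one simple signal is the exit status of the TensorFlow process. The sketch below assumes the common conventions that Python's `subprocess` reports death by signal N as return code `-N`, and that shells and container runtimes report it as `128 + N`. `died_from_segfault` is a hypothetical helper name. Undefined behavior does not always end in a segfault, so this catches only one symptom.

```python
import signal


def died_from_segfault(exit_code):
    """Return True if an exit status indicates termination by SIGSEGV."""
    segv = int(signal.SIGSEGV)
    # -N: subprocess.Popen.returncode on POSIX; 128 + N: shell/container convention.
    return exit_code in (-segv, 128 + segv)
```

A monitoring job could apply this to the exit codes of restarted containers and raise an alert when a worker that executes user-supplied ops crashes this way.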

What systems are affected by CVE-2021-37667?

This vulnerability affects the following AI/ML architecture patterns: NLP training pipelines, model serving, shared ML platforms / multi-tenant notebooks, text preprocessing pipelines, inference infrastructure.

What is the CVSS score for CVE-2021-37667?

CVE-2021-37667 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.17%.

What is the AI security impact?

Affected AI Architectures

NLP training pipelines
model serving
shared ML platforms / multi-tenant notebooks
text preprocessing pipelines
inference infrastructure

MITRE ATLAS Techniques

AML.T0010.001 AI Software
AML.T0043 Craft Adversarial Data
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 15
ISO 42001: A.6.2.6
NIST AI RMF: MANAGE 2.2
OWASP LLM Top 10: LLM05:2025

What are the technical details?

Original Advisory

TensorFlow is an end-to-end open source platform for machine learning. In affected versions an attacker can cause undefined behavior via binding a reference to null pointer in `tf.raw_ops.UnicodeEncode`. The [implementation](https://github.com/tensorflow/tensorflow/blob/460e000de3a83278fb00b61a16d161b1964f15f4/tensorflow/core/kernels/unicode_ops.cc#L533-L539) reads the first dimension of the `input_splits` tensor before validating that this tensor is not empty. We have patched the issue in GitHub commit 2e0ee46f1a47675152d3d865797a18358881d7a6. The fix will be included in TensorFlow 2.6.0. We will also cherrypick this commit on TensorFlow 2.5.1, TensorFlow 2.4.3, and TensorFlow 2.3.4, as these are also affected and still in supported range.

Exploitation Scenario

An attacker with access to a shared Jupyter notebook environment or ML training platform submits a job that calls tf.raw_ops.UnicodeEncode with an empty input_splits tensor (shape [0]). The vulnerable kernel reads the first dimension of the tensor before checking that it is non-empty, which binds a reference to a null pointer. The most likely outcome is a crash of the TensorFlow process, which is a denial of service for every workload sharing that process. Code execution is a more speculative outcome: the advisory confirms only undefined behavior, so the attacker would have to shape memory so that the invalid access redirects control flow inside the TF worker process. If that succeeded, the attacker would run code under the service account that owns the TensorFlow process and could reach model weights, training data, or downstream ML pipeline credentials.

Weaknesses (CWE)

CWE-824 — Access of Uninitialized Pointer: The product accesses or uses a pointer that has not been initialized.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
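The 7.8 base score can be reproduced from this vector with the CVSS v3.1 base-score equations. The sketch below covers base metrics only and uses the metric weights and rounding rule from the CVSS v3.1 specification; `base_score` and `roundup` are names chosen for this illustration.

```python
import math

WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required weights depend on Scope (U = unchanged, C = changed).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value):
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the CVSS v3.1 base score for a base-metrics vector string."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("only CVSS:3.1 vectors are supported")
    m = dict(part.split(":") for part in parts)
    scope = m["S"]

    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[m["C"]]) * (1 - cia[m["I"]]) * (1 - cia[m["A"]])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15

    exploitability = (8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
                      * PR_WEIGHTS[scope][m["PR"]] * WEIGHTS["UI"][m["UI"]])

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H"))  # 7.8
```

For this vector the impact sub-score is about 5.87 and the exploitability sub-score about 1.83. Their sum, about 7.71, rounds up to 7.8.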

Timeline

Published
August 12, 2021
Last Modified
November 21, 2024
First Seen
August 12, 2021

Related Vulnerabilities