CVE-2021-41212: TensorFlow: heap OOB read in ragged.cross shape inference

HIGH PoC AVAILABLE
Published November 5, 2021
CISO Take

Any TensorFlow deployment (training, serving, notebooks) running versions prior to 2.4.4/2.5.2/2.6.1/2.7.0 that processes ragged tensors is exposed to a local heap out-of-bounds read triggerable by crafted inputs to tf.ragged.cross. Patch immediately — if you run multi-tenant Jupyter/Colab environments or expose TF serving endpoints that accept user-controlled tensor shapes, the risk escalates significantly. Prioritize patching ML infrastructure where untrusted inputs can reach ragged tensor operations.

What is the risk?

Effective risk is moderate-to-high for organizations with shared or externally accessible ML infrastructure. The CVSS vector scores the attack vector as local (AV:L), but TensorFlow Serving, Jupyter hubs, and ML pipeline APIs can expose tensor shape processing to untrusted inputs, which in practice makes the flaw network-reachable in those deployments. Low attack complexity (AC:L) means exploitation requires minimal skill once a vulnerable endpoint is identified. The CVE is not listed in CISA KEV and no weaponized exploit was known as of disclosure, but the GitHub advisory is tagged 'Exploit' and a public PoC is indexed.

What systems are affected?

Package      Ecosystem   Vulnerable Range                          Patched
TensorFlow   pip         < 2.4.4; 2.5.x < 2.5.2; 2.6.x < 2.6.1     2.4.4, 2.5.2, 2.6.1, 2.7.0

If you run a TensorFlow release earlier than 2.4.4, 2.5.2, 2.6.1, or 2.7.0 (for your release line), you are affected.

How severe is it?

CVSS 3.1: 7.1 / 10
EPSS: 0.2% chance of exploitation in 30 days (higher than 10% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Moderate
Exploitation Confidence: medium (public PoC indexed in trickest/cve)
Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

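Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): High
Integrity (I): None
Availability (A): High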

What should I do?

4 steps
  1. Patch: Upgrade to TensorFlow 2.7.0 or later, or to the patched maintenance release for your release line (2.4.4, 2.5.2, or 2.6.1). Pin versions in requirements.txt/conda envs and enforce in CI.

  2. Immediate workaround: If patching is delayed, disable or sandbox endpoints that accept user-controlled ragged tensor shapes. Validate input tensor rank and shape bounds before passing to tf.ragged.cross.

  3. Detection: Monitor for unexpected TF process crashes (SIGSEGV/SIGABRT) in serving infrastructure — repeated crashes against ragged ops are an indicator. Enable AddressSanitizer in dev/staging builds to catch OOB access during testing.

  4. Inventory: Audit ML pipelines for use of tf.ragged.cross and related ragged ops; prioritize multi-tenant or externally-facing deployments.
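
The inventory step can be approximated with a plain text scan of Python sources. The sketch below is only an illustration: the function names `find_ragged_cross_uses` and `scan_tree` are hypothetical, and the pattern only looks for the literal spellings `ragged.cross`, `ragged.cross_hashed`, and `RaggedCross`. It will miss aliased imports (for example `from tensorflow.ragged import cross`), notebooks, and serialized graphs, so treat a clean result as a starting point rather than proof of absence.

```python
import re
from pathlib import Path

# Literal spellings of the ragged cross ops; aliased imports are NOT detected.
PATTERN = re.compile(r"\bragged\.cross(_hashed)?\b|\bRaggedCross\b")


def find_ragged_cross_uses(source: str):
    """Return (line_number, stripped_line) pairs that mention a ragged cross op."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if PATTERN.search(line)
    ]


def scan_tree(root: str):
    """Scan every *.py file under root and map file path -> list of hits."""
    hits = {}
    for path in Path(root).rglob("*.py"):
        text = path.read_text(encoding="utf-8", errors="ignore")
        found = find_ragged_cross_uses(text)
        if found:
            hits[str(path)] = found
    return hits
```

Calling `scan_tree(".")` from a repository root returns a dictionary keyed by file path, which can be used to rank pipelines by exposure (multi-tenant or externally facing first).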

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Article 9 - Risk management system for high-risk AI
ISO 42001
8.2 - AI risk assessment
8.4 - AI system operation and monitoring
NIST AI RMF
GOVERN-1.1 - Policies and processes for AI risk management
MANAGE-2.2 - Mechanisms to sustain AI system reliability and safety

Frequently Asked Questions

What is CVE-2021-41212?

CVE-2021-41212 is a heap out-of-bounds read (CWE-125) in the shape inference code for tf.ragged.cross in TensorFlow. It affects versions prior to 2.4.4/2.5.2/2.6.1/2.7.0 and can be triggered by crafted inputs wherever untrusted data reaches ragged tensor operations, including training, serving, and notebook deployments.

Is CVE-2021-41212 actively exploited?

CVE-2021-41212 is not listed in CISA KEV. Proof-of-concept exploit code is publicly available, however, which increases the risk of exploitation.

How to fix CVE-2021-41212?

1. Patch: Upgrade to TensorFlow 2.7.0 or later, or to the patched maintenance release for your release line (2.4.4, 2.5.2, or 2.6.1). Pin versions in requirements.txt/conda envs and enforce in CI.
2. Immediate workaround: If patching is delayed, disable or sandbox endpoints that accept user-controlled ragged tensor shapes. Validate input tensor rank and shape bounds before passing to tf.ragged.cross.
3. Detection: Monitor for unexpected TF process crashes (SIGSEGV/SIGABRT) in serving infrastructure; repeated crashes against ragged ops are an indicator. Enable AddressSanitizer in dev/staging builds to catch OOB access during testing.
4. Inventory: Audit ML pipelines for use of tf.ragged.cross and related ragged ops; prioritize multi-tenant or externally-facing deployments.
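
The detection step can be sketched as a small classifier over worker exit codes. This is a minimal illustration, not a monitoring product: the names `crash_signal` and `should_alert` and the threshold of three crashes are assumptions made for the example. It handles the two common conventions for reporting a signal death: a negative return code (as reported by Python's `subprocess`) and `128 + signal number` (as reported by most shells).

```python
import signal
from collections import Counter

# Signals that typically accompany a memory-safety crash.
CRASH_SIGNALS = {signal.SIGSEGV, signal.SIGABRT}


def crash_signal(returncode: int):
    """Return 'SIGSEGV' or 'SIGABRT' if the exit code indicates such a crash, else None."""
    if returncode < 0:
        signum = -returncode          # subprocess convention: -N
    elif returncode > 128:
        signum = returncode - 128     # shell convention: 128 + N
    else:
        return None
    try:
        sig = signal.Signals(signum)
    except ValueError:
        return None                   # not a known signal number
    return sig.name if sig in CRASH_SIGNALS else None


def should_alert(returncodes, threshold: int = 3) -> bool:
    """Alert when at least `threshold` exits were SIGSEGV/SIGABRT (threshold is hypothetical)."""
    counts = Counter(name for name in map(crash_signal, returncodes) if name)
    return sum(counts.values()) >= threshold
```

In practice the exit codes would come from whatever supervises the serving workers (a process manager or container runtime); correlating them with the requests being handled at crash time is what ties a crash back to ragged ops.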

What systems are affected by CVE-2021-41212?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, shared ML notebooks, feature engineering pipelines.

What is the CVSS score for CVE-2021-41212?

CVE-2021-41212 has a CVSS v3.1 base score of 7.1 (HIGH). The EPSS exploitation probability is 0.20%.

What is the AI security impact?

Affected AI Architectures

training pipelines
model serving
shared ML notebooks
feature engineering pipelines

MITRE ATLAS Techniques

AML.T0010.001 AI Software
AML.T0029 Denial of AI Service
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 9
ISO 42001: 8.2, 8.4
NIST AI RMF: GOVERN-1.1, MANAGE-2.2

What are the technical details?

Original Advisory

TensorFlow is an open source platform for machine learning. In affected versions the shape inference code for `tf.ragged.cross` can trigger a read outside of bounds of heap allocated array. The fix will be included in TensorFlow 2.7.0. We will also cherrypick this commit on TensorFlow 2.6.1, TensorFlow 2.5.2, and TensorFlow 2.4.4, as these are also affected and still in supported range.
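
The fixed releases named in the advisory can be turned into a simple version check. The sketch below is an illustration under stated assumptions: the helper names `parse_version` and `is_vulnerable` are hypothetical, pre-release suffixes such as `rc0` are ignored, and release lines older than 2.4 are treated as vulnerable because the advisory lists no fixed release for them.

```python
import re

# First fixed patch level per supported release line, as named in the advisory.
FIXED_PATCH = {(2, 4): 4, (2, 5): 2, (2, 6): 1}
FIRST_FIXED_MINOR = (2, 7)  # 2.7.0 and later include the fix


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from strings like '2.6.0' or '2.5.1rc0'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_vulnerable(version: str) -> bool:
    """Return True if the version predates the fixed release for its line."""
    major, minor, patch = parse_version(version)
    if (major, minor) >= FIRST_FIXED_MINOR:
        return False
    fixed_patch = FIXED_PATCH.get((major, minor))
    if fixed_patch is None:
        # Assumption: lines without a listed fixed release are treated as vulnerable.
        return True
    return patch < fixed_patch
```

A CI job could call `is_vulnerable(tf.__version__)` after `import tensorflow as tf` and fail the build when it returns `True`.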

Exploitation Scenario

An adversary with access to a TensorFlow Serving endpoint or shared Jupyter environment submits inference requests or notebook code containing tf.ragged.cross calls with maliciously shaped input tensors. During shape inference — before any computation runs — the vulnerable code reads beyond the bounds of a heap-allocated array. In a multi-tenant ML platform, this crashes the TF worker process (denying service to other users) and may leak heap contents including portions of loaded model weights or co-located inference request buffers. In a training pipeline context, an insider or compromised CI/CD job injects the malformed op into a training graph, crashing distributed workers and potentially exfiltrating memory from the training coordinator process.
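
One way to narrow the scenario above is to reject malformed ragged inputs at the service boundary, before they reach TensorFlow. The sketch below is not the upstream fix and does not claim to cover the exact trigger; it only checks the usual row-partition invariants for a ragged component given as flat values plus `row_splits`, and that all components have the same number of rows. The function names and the size limits `MAX_ROWS` and `MAX_VALUES` are hypothetical.

```python
from typing import Sequence, Tuple

MAX_ROWS = 10_000        # hypothetical per-request limit
MAX_VALUES = 1_000_000   # hypothetical per-request limit


def validate_ragged_component(values: Sequence, row_splits: Sequence[int]) -> None:
    """Raise ValueError unless (values, row_splits) is a well-formed ragged component."""
    if len(row_splits) == 0:
        raise ValueError("row_splits must contain at least one element")
    if row_splits[0] != 0:
        raise ValueError("row_splits must start at 0")
    if any(b < a for a, b in zip(row_splits, row_splits[1:])):
        raise ValueError("row_splits must be non-decreasing")
    if row_splits[-1] != len(values):
        raise ValueError("last row_split must equal the number of values")
    if len(row_splits) - 1 > MAX_ROWS or len(values) > MAX_VALUES:
        raise ValueError("input exceeds configured size limits")


def validate_cross_inputs(components: Sequence[Tuple[Sequence, Sequence[int]]]) -> None:
    """Validate every component and require a common row count across them."""
    if not components:
        raise ValueError("at least one input is required")
    for values, row_splits in components:
        validate_ragged_component(values, row_splits)
    row_counts = {len(row_splits) - 1 for _, row_splits in components}
    if len(row_counts) != 1:
        raise ValueError("all inputs must have the same number of rows")
```

Validation of this kind reduces exposure but is a stopgap; it complements, and does not replace, upgrading to a fixed release.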

Weaknesses (CWE)

CWE-125 — Out-of-bounds Read: The product reads data past the end, or before the beginning, of the intended buffer.

  • [Implementation] Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does. When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, "boat" may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as "red" or "blue." Do not rely exclusively on looking for malicious or malformed inputs. This is likely to miss at least one undesirable input, especially if the code's environment changes. This can give attackers enough room to bypass the intended validation.
  • [Architecture and Design] Use a language that provides appropriate memory abstractions.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:H
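
The 7.1 base score can be reproduced from this vector with the CVSS v3.1 base-score equations. The sketch below implements only the base metrics (no temporal or environmental scoring); the function names `roundup` and `base_score` are chosen for the example, and the metric weights are those published in the CVSS v3.1 specification.

```python
# Metric weights from the CVSS v3.1 specification (base metrics only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place using the integer method from the v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (scaled // 10000 + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a vector such as 'CVSS:3.1/AV:L/...'."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("only CVSS:3.1 vectors are supported")
    m = dict(part.split(":") for part in parts)
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - (
        (1 - WEIGHTS["C"][m["C"]])
        * (1 - WEIGHTS["I"][m["I"]])
        * (1 - WEIGHTS["A"][m["A"]])
    )
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]] * pr * WEIGHTS["UI"][m["UI"]]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))
```

For this CVE the impact sub-score is about 5.18 and the exploitability sub-score about 1.83; their sum, roughly 7.01, rounds up to 7.1. The local attack vector and required privileges are what keep exploitability low.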

Timeline

Published: November 5, 2021
Last Modified: November 21, 2024
First Seen: November 5, 2021

Related Vulnerabilities