CVE-2021-41211: TensorFlow: heap OOB read in QuantizeV2 shape inference

Severity: HIGH. Public PoC available.
Published November 5, 2021

CISO Take

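TensorFlow 2.6.0 deployments that run quantization workflows are exposed to a heap out-of-bounds read in `QuantizeV2` shape inference that can leak memory contents or crash the process. The original advisory names the 2.6 line as the only affected release line besides the then-upcoming 2.7.0. Upgrade to TensorFlow 2.7.0 or 2.6.1 immediately; the upgrade is straightforward, and no workaround is available. In shared GPU/TPU training clusters, this becomes a lateral-movement risk when jobs from different tenants coexist, because leaked credentials or data can be reused elsewhere.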

What is the risk?

CVSS 7.1 High. Local attack vector with low complexity and low privileges limits internet-facing exposure, but shared ML training infrastructure (Kubernetes GPU clusters, Jupyter hubs, MLflow-managed compute) dramatically widens the blast radius. An insider or compromised CI/CD job can trigger this with a single crafted TensorFlow script. Confidentiality impact is high — heap OOB reads on ML systems can expose model weights, training data batches, or API keys loaded into process memory.

What systems are affected?

Package      Ecosystem   Vulnerable Range   Patched
TensorFlow   pip         2.6.0              2.6.1, 2.7.0


How severe is it?

CVSS 3.1: 7.1 / 10
EPSS: 0.2% chance of exploitation in 30 days (higher than 10% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Trivial
Exploitation Confidence: Medium. Public PoC indexed (trickest/cve).

The exploitation rating is a composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

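Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): High
Integrity (I): None
Availability (A): High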

What should I do?

  1. Patch: Upgrade TensorFlow to 2.7.0 or 2.6.1 — the fix validates axis bounds before heap access.

  2. Inventory: Identify all services using TensorFlow quantization (QuantizeV2, quantize_and_dequantize ops) in training and serving pipelines.

  3. Isolate: In multi-tenant ML platforms, ensure training jobs run in separate namespaces/pods with restricted inter-process memory access.

  4. Detect: Monitor for TensorFlow SIGABRT/segfault crashes in training workers — repeated crashes on quantization ops are a signal.

  5. CI/CD: Add TensorFlow version pinning checks in dependency scanning (Snyk, Dependabot) to catch this class of library vulnerability at build time.
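The patch and CI/CD steps above can be backed by a small version gate. The following is a minimal sketch, not an official tool: the names `parse_version` and `is_affected` are hypothetical, and the affected range (2.6.0 only, fixed in 2.6.1 and 2.7.0) is an assumption taken from the original advisory text, which names the 2.6 line as the only affected release line. Pre-release strings such as `2.6.0rc1` are conservatively treated as 2.6.0.

```python
import re

# Assumption from the advisory text: only the 2.6 line is affected,
# and 2.6.1 / 2.7.0 carry the fix.
FIRST_AFFECTED = (2, 6, 0)
FIRST_FIXED = (2, 6, 1)


def parse_version(version: str) -> tuple:
    """Return (major, minor, patch) from a string such as '2.6.0' or '2.6.0rc1'."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version: str) -> bool:
    """True if the version falls in the assumed affected range [2.6.0, 2.6.1)."""
    return FIRST_AFFECTED <= parse_version(version) < FIRST_FIXED


if __name__ == "__main__":
    from importlib import metadata

    try:
        installed = metadata.version("tensorflow")
    except metadata.PackageNotFoundError:
        print("tensorflow is not installed in this environment")
    else:
        status = "AFFECTED" if is_affected(installed) else "outside the assumed affected range"
        print(f"tensorflow {installed}: {status}")
```

The lookup only covers the distribution named `tensorflow`; environments that install TensorFlow under another distribution name would need that name checked as well.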

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity for high-risk AI systems
ISO 42001
A.6.1.4 - AI risk assessment — technical vulnerabilities
NIST AI RMF
GOVERN 1.4 - Organizational teams are committed to transparency and accountability
MANAGE 2.2 - Mechanisms to sustain treatment of AI risks

Frequently Asked Questions

What is CVE-2021-41211?

CVE-2021-41211 is a heap out-of-bounds read (CWE-125) in the shape inference code for TensorFlow's `QuantizeV2` op. It is triggered when `axis` is less than `-1`, which makes the code read memory before the start of a heap buffer; the result can be leaked memory contents or a process crash. The fix ships in TensorFlow 2.7.0 and 2.6.1, and no workaround is available.

Is CVE-2021-41211 actively exploited?

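Proof-of-concept exploit code is publicly available for CVE-2021-41211 (indexed in trickest/cve), which increases the risk of exploitation. This page rates exploitation as medium with an EPSS of 0.2%; it does not report confirmed exploitation in the wild.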

How to fix CVE-2021-41211?

Upgrade TensorFlow to 2.7.0 or 2.6.1; the fix validates axis bounds before heap access. Beyond patching, inventory the services that use TensorFlow quantization ops, isolate tenants on shared ML platforms, monitor training workers for repeated crashes on quantization ops, and add TensorFlow version checks to dependency scanning. The full five-step list is under "What should I do?" above.

What systems are affected by CVE-2021-41211?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, model compression workflows.

What is the CVSS score for CVE-2021-41211?

CVE-2021-41211 has a CVSS v3.1 base score of 7.1 (HIGH). The EPSS exploitation probability is 0.20%.

What is the AI security impact?

Affected AI Architectures

training pipelines, model serving, model compression workflows

MITRE ATLAS Techniques

AML.T0010.001 AI Software
AML.T0037 Data from Local System
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 15
ISO 42001: A.6.1.4
NIST AI RMF: GOVERN 1.4, MANAGE 2.2

What are the technical details?

Original Advisory

TensorFlow is an open source platform for machine learning. In affected versions the shape inference code for `QuantizeV2` can trigger a read outside of bounds of heap allocated array. This occurs whenever `axis` is a negative value less than `-1`. In this case, we are accessing data before the start of a heap buffer. The code allows `axis` to be an optional argument (`s` would contain an `error::NOT_FOUND` error code). Otherwise, it assumes that `axis` is a valid index into the dimensions of the `input` tensor. If `axis` is less than `-1` then this results in a heap OOB read. The fix will be included in TensorFlow 2.7.0. We will also cherrypick this commit on TensorFlow 2.6.1, as this version is the only one that is also affected.
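The condition the advisory describes can be modeled outside TensorFlow. The sketch below is a pure-Python illustration, not TensorFlow's shape inference code or its patch; `check_quantize_axis` is a hypothetical name. It treats `-1` as "no axis given", as the advisory implies, and rejects anything below `-1` before it could be used as an index. The upper-bound check against the input rank is an extra assumption added for illustration.

```python
def check_quantize_axis(axis: int, input_rank: int) -> None:
    """Raise ValueError unless axis is -1 (unset) or a valid dimension index."""
    if axis < -1:
        # The case from the advisory: an index before the start of the dims array.
        raise ValueError(f"axis should be at least -1, got {axis}")
    if axis != -1 and axis >= input_rank:
        # Illustrative assumption: also reject indices past the last dimension.
        raise ValueError(f"axis {axis} is out of range for rank {input_rank}")


check_quantize_axis(-1, 2)  # accepted: axis not set
check_quantize_axis(1, 2)   # accepted: valid dimension index

try:
    check_quantize_axis(-2, 2)
except ValueError as err:
    print("rejected:", err)
```

An explicit check matters here because an unchecked negative index into a native array is not an error in C++; it simply reads whatever precedes the buffer.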

Exploitation Scenario

An attacker with access to a shared ML training platform (e.g., via a compromised service account or insider threat) submits a training job containing a TensorFlow graph that calls QuantizeV2 with axis set to -2 or lower. The shape inference code reads memory before the start of the heap-allocated dimensions array. Depending on heap layout, this leaks adjacent heap contents — potentially including model weights from another tenant's job, environment variables with API keys, or training batch data. The attacker extracts this via TensorFlow's own logging/output channels without triggering obvious network-based detections.
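One way to act on this scenario is to screen submitted graphs before they run. The sketch below rests on stated assumptions: it represents each graph node as a plain dict with hypothetical `name`, `op`, and `attrs` keys rather than using TensorFlow's own graph types, and `find_suspicious_quantize_nodes` is a made-up helper name.

```python
from typing import Iterable, List


def find_suspicious_quantize_nodes(nodes: Iterable[dict]) -> List[str]:
    """Return the names of QuantizeV2 nodes whose axis attribute is below -1."""
    flagged = []
    for node in nodes:
        if node.get("op") != "QuantizeV2":
            continue
        # A missing axis is treated as -1, i.e. "not set".
        axis = node.get("attrs", {}).get("axis", -1)
        if axis < -1:
            flagged.append(node.get("name", "<unnamed>"))
    return flagged


# Hypothetical job graph, reduced to the fields the check needs.
job_graph = [
    {"name": "q_ok", "op": "QuantizeV2", "attrs": {"axis": 0}},
    {"name": "q_bad", "op": "QuantizeV2", "attrs": {"axis": -2}},
    {"name": "relu", "op": "Relu", "attrs": {}},
]
print(find_suspicious_quantize_nodes(job_graph))
```

A static screen like this only catches literal attribute values in the submitted graph, so it complements the upgrade rather than replacing it.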

Weaknesses (CWE)

CWE-125 — Out-of-bounds Read: The product reads data past the end, or before the beginning, of the intended buffer.

  • [Implementation] Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does. When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, "boat" may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as "red" or "blue." Do not rely exclusively on looking for malicious or malformed inputs. This is likely to miss at least one undesirable input, especially if the code's environment changes. This can give attackers enough room to bypass the intended validation. However, denylis
  • [Architecture and Design] Use a language that provides appropriate memory abstractions.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:H
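The 7.1 base score can be reproduced from this vector. The following is a minimal sketch of the CVSS v3.1 base-score arithmetic, limited to Scope: Unchanged vectors like this one; the function names are illustrative, and the weights are the published v3.1 metric values.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def parse_vector(vector: str) -> dict:
    """Split 'CVSS:3.1/AV:L/...' into a {'AV': 'L', ...} mapping."""
    prefix, *metrics = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    return dict(metric.split(":", 1) for metric in metrics)


def roundup(value: float) -> float:
    """Round up to one decimal place using the spec's integer-based method."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    m = parse_vector(vector)
    if m["S"] != "U":
        raise NotImplementedError("this sketch only handles Scope: Unchanged")
    w = {key: WEIGHTS[key][m[key]] for key in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:H"))  # 7.1
```

For this vector the impact sub-score is about 5.18 and the exploitability sub-score about 1.83; their sum, roughly 7.01, rounds up to 7.1.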

Timeline

Published: November 5, 2021
Last Modified: November 21, 2024
First Seen: November 5, 2021

Related Vulnerabilities