CVE-2021-41204: TensorFlow: DoS via Grappler constant folding segfault

Severity: MEDIUM. PoC available.
Published November 5, 2021

CISO Take

A local attacker with low privileges can crash TensorFlow training jobs by triggering a segfault in the Grappler optimizer. The impact is limited to availability; no data is exposed or modified. Upgrade to TensorFlow 2.7.0 or one of the backported fix releases (2.6.1, 2.5.2, 2.4.4). Risk is highest in shared ML compute environments, where one crash can disrupt co-tenant training jobs. Patching is not urgent for single-user or air-gapped training setups.

Risk Assessment

Medium risk with limited blast radius. The attack vector is local and requires low privileges, so an attacker needs a prior foothold on the host before the flaw can be triggered. Impact is availability-only (A:H), with no confidentiality or integrity compromise. Risk is elevated in multi-tenant ML platforms such as JupyterHub, shared GPU clusters, or Kubeflow, where a malicious or compromised user can deny service to co-tenants by deliberately triggering the crash.

Affected Systems

Package      Ecosystem   Vulnerable Range                       Patched
tensorflow   pip         < 2.4.4; 2.5.x < 2.5.2; 2.6.x < 2.6.1  2.4.4, 2.5.2, 2.6.1, 2.7.0

Do you use tensorflow? You are affected if your installed version predates the patched releases (2.4.4, 2.5.2, 2.6.1, 2.7.0).
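A quick way to answer that question on a given host is to compare the installed version against the fixed releases named in this advisory. The following is a minimal sketch, not an official checker: the helper names `is_patched` and `installed_tensorflow_version` are hypothetical, the `tensorflow-cpu` and `tensorflow-gpu` distribution names are an assumption (this page lists only `tensorflow`), and series older than 2.4 are assumed vulnerable because no fixed release is listed for them.

```python
import re
from importlib import metadata

# First fixed patch release for each affected minor series, per this advisory.
FIRST_FIXED = {(2, 4): 4, (2, 5): 2, (2, 6): 1}
FIRST_FIXED_SERIES = (2, 7)  # 2.7.0 and later include the fix

_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(.*)$")
_PRERELEASE_RE = re.compile(r"^[.\-]?(a|b|rc|dev)")


def is_patched(version: str) -> bool:
    """Return True if `version` is at or above the fixed release for its series."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.group(1, 2, 3))
    # Assumption: a pre-release (rc, a, b, dev) of a fixed release is treated
    # as unpatched. Local tags such as "+cpu" are not pre-releases.
    prerelease = bool(_PRERELEASE_RE.match(match.group(4)))
    series = (major, minor)
    if series >= FIRST_FIXED_SERIES:
        return not (series == FIRST_FIXED_SERIES and patch == 0 and prerelease)
    if series in FIRST_FIXED:
        fixed = FIRST_FIXED[series]
        return patch > fixed or (patch == fixed and not prerelease)
    # Older series: no fixed release is listed, so assume vulnerable.
    return False


def installed_tensorflow_version():
    """Return the installed TensorFlow version string, or None if absent."""
    # Only "tensorflow" appears in this advisory; the other names are assumed.
    for dist in ("tensorflow", "tensorflow-cpu", "tensorflow-gpu"):
        try:
            return metadata.version(dist)
        except metadata.PackageNotFoundError:
            continue
    return None


if __name__ == "__main__":
    found = installed_tensorflow_version()
    if found is None:
        print("TensorFlow is not installed in this environment")
    else:
        status = "patched" if is_patched(found) else "VULNERABLE to CVE-2021-41204"
        print(f"TensorFlow {found}: {status}")
```

Running the script in each virtual environment or container image gives a per-environment verdict, which can feed the version inventory recommended below.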

Severity & Risk

CVSS 3.1: 5.5 / 10
EPSS: 0.02% chance of exploitation in 30 days (higher than 5% of all CVEs)
Exploitation Status: Exploit Available (Exploitation: MEDIUM)
Sophistication: Moderate
Exploitation Confidence: Medium. Public PoC indexed (trickest/cve).
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

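Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High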

Recommended Action

  1. Patch: Upgrade to TensorFlow 2.7.0, 2.6.1, 2.5.2, or 2.4.4 (fix: commit 7731e8d).

  2. Workaround (performance trade-off): Disable constant folding via tf.config.optimizer.set_experimental_options({'constant_folding': False}).

  3. Detection: Monitor for abnormal SIGSEGV signals in TensorFlow process logs on training workers.

  4. Multi-tenant hardening: Enforce process-level isolation (containers, namespaces) so a crash on one worker cannot propagate to adjacent training jobs.

  5. Inventory all TF versions in your ML infrastructure before patching.
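The workaround in step 2 can be applied at process start-up, before any model graph is built. The sketch below wraps the call from step 2 in a hypothetical helper, `disable_constant_folding`, and reads the setting back with `tf.config.optimizer.get_experimental_options()`. It assumes the option is set early enough to affect every graph optimized in the process, and it returns `False` rather than failing when TensorFlow is not installed.

```python
def disable_constant_folding() -> bool:
    """Turn off Grappler constant folding for this process.

    Returns True if the option was applied and reads back as disabled,
    False if TensorFlow is not importable in this environment.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return False  # nothing to configure without TensorFlow

    # The workaround from step 2. It trades some graph-optimization
    # performance for avoiding the vulnerable constant folding pass.
    tf.config.optimizer.set_experimental_options({"constant_folding": False})

    # Read the option back to confirm it is now disabled.
    options = tf.config.optimizer.get_experimental_options()
    return not options.get("constant_folding", True)


if __name__ == "__main__":
    if disable_constant_folding():
        print("Grappler constant folding disabled for this process")
    else:
        print("TensorFlow not available or option not applied")
```

The setting is per process, so it has to be repeated in every worker entry point. It is a stopgap until the upgrade in step 1 is rolled out.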

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act: Article 9 - Risk management system
ISO 42001: A.6.2 - AI system lifecycle management
NIST AI RMF: MANAGE 2.2 - Mechanisms are in place and applied to sustain the value of deployed AI systems

Frequently Asked Questions

What is CVE-2021-41204?

CVE-2021-41204 is a denial-of-service vulnerability in TensorFlow. During the Grappler optimizer phase, constant folding may attempt to deep copy a resource tensor, which causes a segfault. A local attacker with low privileges can use this to crash TensorFlow training jobs. The impact is limited to availability, and the fix ships in TensorFlow 2.7.0, 2.6.1, 2.5.2, and 2.4.4.

Is CVE-2021-41204 actively exploited?

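This page records no confirmed in-the-wild exploitation of CVE-2021-41204. Proof-of-concept exploit code is publicly available (indexed in trickest/cve), which increases the risk of exploitation; the EPSS exploitation probability is 0.02%.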

How to fix CVE-2021-41204?

  1. Patch: Upgrade to TensorFlow 2.7.0, 2.6.1, 2.5.2, or 2.4.4 (fix: commit 7731e8d).

  2. Workaround (performance trade-off): Disable constant folding via tf.config.optimizer.set_experimental_options({'constant_folding': False}).

  3. Detection: Monitor for abnormal SIGSEGV signals in TensorFlow process logs on training workers.

  4. Multi-tenant hardening: Enforce process-level isolation (containers, namespaces) so a crash on one worker cannot propagate to adjacent training jobs.

  5. Inventory all TF versions in your ML infrastructure before patching.

What systems are affected by CVE-2021-41204?

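The vulnerability affects the tensorflow pip package in versions that predate the fixed releases (2.7.0, 2.6.1, 2.5.2, 2.4.4). It is relevant to the following AI/ML architecture patterns: training pipelines, model optimization, and shared ML compute.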

What is the CVSS score for CVE-2021-41204?

CVE-2021-41204 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.02%.

Technical Details

NVD Description

TensorFlow is an open source platform for machine learning. In affected versions during TensorFlow's Grappler optimizer phase, constant folding might attempt to deep copy a resource tensor. This results in a segfault, as these tensors are supposed to not change. The fix will be included in TensorFlow 2.7.0. We will also cherrypick this commit on TensorFlow 2.6.1, TensorFlow 2.5.2, and TensorFlow 2.4.4, as these are also affected and still in supported range.

Exploitation Scenario

An adversary with shell access to a shared ML training server crafts a TensorFlow model graph that routes resource tensors through constant-foldable subgraphs. When the model is submitted for training or warm-up optimization in a shared JupyterHub or Kubeflow pipeline environment, the Grappler optimizer attempts to deep-copy the resource tensor during constant folding, triggering a SIGSEGV that terminates the TF worker process. On shared nodes without process isolation, the crash can also take down other users' in-flight training jobs. Those users lose any progress not yet checkpointed and their pipelines are disrupted, while the attacker leaves few obvious artifacts.
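The crash in this scenario is visible to a supervising process as a child that died from a signal. The sketch below illustrates one way to detect that on POSIX systems, where Python's `subprocess` reports death by signal as the negated signal number. The helper name `run_and_classify` is hypothetical, and the demo child that signals itself is only a stand-in for a training job, not a reproduction of the vulnerability.

```python
import signal
import subprocess
import sys


def run_and_classify(cmd):
    """Run `cmd` as a child process and report whether SIGSEGV killed it.

    Returns (returncode, segfaulted). POSIX only: subprocess reports a child
    killed by signal N with returncode -N.
    """
    completed = subprocess.run(cmd)
    code = completed.returncode
    segfaulted = code == -signal.SIGSEGV
    return code, segfaulted


if __name__ == "__main__":
    # Stand-in for a training job: a child that sends SIGSEGV to itself.
    crashing_job = [
        sys.executable,
        "-c",
        "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)",
    ]
    code, segfaulted = run_and_classify(crashing_job)
    if segfaulted:
        # A real supervisor might log this, alert, and restart from a checkpoint.
        print(f"job terminated by SIGSEGV (returncode {code})")
    else:
        print(f"job exited with returncode {code}")
```

Because the job runs in its own process, the segfault ends only the child; the supervisor survives and can record the event. When a job is launched through a shell instead, the same crash typically appears as exit status 139 (128 plus signal 11).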

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
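The 5.5 base score follows from this vector by the CVSS v3.1 base-score equations. The sketch below shows the arithmetic, using the metric weights and the round-up rule from the CVSS v3.1 specification. It covers base metrics only, and the function names `roundup` and `base_score` are illustrative choices rather than any library's API.

```python
import math

# Base-metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on Scope (Unchanged vs Changed).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a vector string."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError(f"unsupported vector prefix: {prefix!r}")
    metrics = dict(part.split(":") for part in parts)
    scope = metrics["S"]

    c, i, a = (WEIGHTS[k][metrics[k]] for k in ("C", "I", "A"))
    iss = 1 - (1 - c) * (1 - i) * (1 - a)  # impact sub-score
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15

    exploitability = (
        8.22
        * WEIGHTS["AV"][metrics["AV"]]
        * WEIGHTS["AC"][metrics["AC"]]
        * PR_WEIGHTS[scope][metrics["PR"]]
        * WEIGHTS["UI"][metrics["UI"]]
    )

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 5.5
```

For this vector the impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is 8.22 × 0.55 × 0.77 × 0.62 × 0.85 ≈ 1.83. Their sum, about 5.43, rounds up to 5.5.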

Timeline

Published: November 5, 2021
Last Modified: November 21, 2024
First Seen: November 5, 2021
