CVE-2021-29563: TensorFlow: DoS via RFFT empty matrix assertion crash

MEDIUM PoC AVAILABLE
Published May 14, 2021
CISO Take

A low-privileged local attacker can crash any TensorFlow process by passing an empty matrix to the RFFT operation, terminating ML workers or inference servers. Patch immediately to TF 2.5.0 or available backports (2.4.2, 2.3.3, 2.2.3, 2.1.4). In shared ML environments (Jupyter hubs, MLflow, SageMaker notebooks), this is a viable insider or multi-tenant disruption vector.

What is the risk?

Medium risk overall, but elevated in multi-tenant ML platforms. The local attack vector (AV:L) limits exposure for standalone deployments, but shared compute environments—where multiple users or workloads share a TensorFlow process or ML inference server—make this trivially exploitable. No confidentiality or integrity impact; availability is fully compromised for the affected process. Not in CISA KEV and no evidence of active exploitation.

What systems are affected?

Package      Ecosystem   Vulnerable Range                      Patched
TensorFlow   pip         Releases before the fixed versions    2.5.0 (backports: 2.4.2, 2.3.3, 2.2.3, 2.1.4)

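Do you use TensorFlow? You're affected if you run a release that predates the fix: anything earlier than 2.5.0 that does not include the backport (2.4.2, 2.3.3, 2.2.3, 2.1.4, or a later patch release in those series).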
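To check a deployed version against the fixed releases named in this advisory, a minimal sketch follows. The helper names (`parse_version`, `is_patched`) are hypothetical, pre-release suffixes such as `rc0` are ignored, and release series without a listed backport are assumed to be unpatched.

```python
import re
from typing import Tuple

# First patched patch level per minor series, as listed in the advisory.
FIXED_PATCH_BY_SERIES = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}
# Every series from 2.5 onward ships with the fix.
FIRST_FIXED_SERIES = (2, 5)


def parse_version(version: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from the start of a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(group) for group in match.groups())
    return major, minor, patch


def is_patched(version: str) -> bool:
    """Return True if the version contains the fix for CVE-2021-29563."""
    major, minor, patch = parse_version(version)
    series = (major, minor)
    if series >= FIRST_FIXED_SERIES:
        return True
    fixed_patch = FIXED_PATCH_BY_SERIES.get(series)
    # Assumption: series with no listed backport (e.g. 2.0.x) count as unpatched.
    return fixed_patch is not None and patch >= fixed_patch
```

In practice the version string could come from `tf.__version__` or `importlib.metadata.version("tensorflow")`, for example `is_patched(tf.__version__)`.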

How severe is it?

CVSS 3.1: 5.5 / 10
EPSS: 0.2% chance of exploitation in 30 days (higher than 9% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Trivial
Exploitation Confidence: medium (public PoC indexed in trickest/cve)
Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

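Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High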

What should I do?

5 steps
  1. Patch: Upgrade to TensorFlow 2.5.0 or apply cherrypicks to 2.4.2, 2.3.3, 2.2.3, or 2.1.4. Reference commit: 31bd5026304677faa8a0b77602c6154171b9aec1.

  2. Validate inputs: Add shape assertions before any RFFT call—reject empty tensors at the application layer.

  3. Isolate workloads: Run untrusted or user-submitted TF graphs in sandboxed processes (separate containers or VMs) to contain crash blast radius.

  4. Monitor: Alert on unexpected TensorFlow process exits in ML workers/serving infrastructure.

  5. Audit exposure: Identify any public-facing APIs that accept user-controlled tensors passed to FFT operations.
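Step 2 can be sketched as a small guard that runs before any RFFT call. This is a minimal illustration, not the upstream fix: `check_rfft_args` is a hypothetical helper, it works on a plain shape tuple so it does not depend on TensorFlow, and it assumes a 1-D RFFT over the innermost axis with a single integer `fft_length`.

```python
from typing import Optional, Sequence


def check_rfft_args(input_shape: Sequence[Optional[int]], fft_length: int) -> None:
    """Raise ValueError if an RFFT call would operate on empty data.

    input_shape: static shape of the tensor, innermost axis last.
    fft_length: requested FFT size along the innermost axis.
    """
    if len(input_shape) < 1:
        raise ValueError("RFFT input must have rank >= 1")
    for axis, dim in enumerate(input_shape):
        if dim is None:
            # Unknown dimensions cannot be proven non-empty, so reject them.
            raise ValueError(f"dimension {axis} is unknown; cannot validate")
        if dim <= 0:
            raise ValueError(f"dimension {axis} is empty (size {dim})")
    if fft_length <= 0:
        raise ValueError(f"fft_length must be positive, got {fft_length}")
```

An application would call this with the tensor's static shape, for example `check_rfft_args(tuple(x.shape), n)`, immediately before `tf.signal.rfft(x, fft_length=[n])`, and return an error to the caller instead of forwarding the request.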

How is it classified?

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.2.6 - AI system operational monitoring and control
NIST AI RMF
GOVERN 1.7 - Risk tolerance and AI risk policies
MANAGE 2.4 - Incident response and recovery for AI risks

Frequently Asked Questions

What is CVE-2021-29563?

CVE-2021-29563 is a denial-of-service vulnerability in TensorFlow's `tf.raw_ops.RFFT` implementation. When the underlying Eigen code ends up operating on an empty matrix, an assertion (`CHECK`) failure terminates the process, so a low-privileged local attacker can crash ML workers or inference servers. The fix ships in TensorFlow 2.5.0, with backports in 2.4.2, 2.3.3, 2.2.3, and 2.1.4.

Is CVE-2021-29563 actively exploited?

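There is no evidence of active exploitation, and the CVE is not listed in CISA KEV. However, proof-of-concept exploit code is publicly available (indexed in trickest/cve), which increases the risk of exploitation.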

How to fix CVE-2021-29563?

  1. Patch: Upgrade to TensorFlow 2.5.0 or apply cherrypicks to 2.4.2, 2.3.3, 2.2.3, or 2.1.4. Reference commit: 31bd5026304677faa8a0b77602c6154171b9aec1.

  2. Validate inputs: Add shape assertions before any RFFT call and reject empty tensors at the application layer.

  3. Isolate workloads: Run untrusted or user-submitted TF graphs in sandboxed processes (separate containers or VMs) to contain crash blast radius.

  4. Monitor: Alert on unexpected TensorFlow process exits in ML workers/serving infrastructure.

  5. Audit exposure: Identify any public-facing APIs that accept user-controlled tensors passed to FFT operations.

What systems are affected by CVE-2021-29563?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, ML notebooks, batch inference pipelines, audio/signal preprocessing.

What is the CVSS score for CVE-2021-29563?

CVE-2021-29563 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.19%.

What is the AI security impact?

Affected AI Architectures

training pipelines, model serving, ML notebooks, batch inference pipelines, audio/signal preprocessing

MITRE ATLAS Techniques

AML.T0010.001 AI Software
AML.T0029 Denial of AI Service
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 15
ISO 42001: A.6.2.6
NIST AI RMF: GOVERN 1.7, MANAGE 2.4

What are the technical details?

Original Advisory

TensorFlow is an end-to-end open source platform for machine learning. An attacker can cause a denial of service by exploiting a `CHECK`-failure coming from the implementation of `tf.raw_ops.RFFT`. Eigen code operating on an empty matrix can trigger on an assertion and will cause program termination. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

Exploitation Scenario

An attacker with authenticated access to a shared ML notebook environment (e.g., JupyterHub, Databricks, SageMaker Studio) imports TensorFlow and calls tf.raw_ops.RFFT with an empty input tensor. This triggers an Eigen assertion failure that terminates the Python kernel or TF serving process—disrupting other users sharing the same runtime. In a model-serving context, an adversary who can submit inference requests containing crafted empty inputs to an endpoint backed by TF RFFT operations achieves remote process termination, causing service outage.
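Because the failure mode is an abort of the whole process, one way to contain it is to run untrusted work in a child process and treat a signal-terminated exit as an alertable event. The sketch below illustrates that idea with the standard library only. `run_isolated` is a hypothetical helper, the negative-return-code convention applies to POSIX systems, and a production setup would more likely rely on containers or VMs as recommended above.

```python
import signal
import subprocess
import sys
from typing import Any, Dict


def run_isolated(code: str, timeout: float = 30.0) -> Dict[str, Any]:
    """Run Python source in a child interpreter and report how it exited."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # On POSIX, a negative return code means the child was killed by a signal
    # (an assertion abort typically surfaces as SIGABRT).
    crashed = proc.returncode < 0
    signal_name = None
    if crashed:
        try:
            signal_name = signal.Signals(-proc.returncode).name
        except ValueError:
            signal_name = f"signal {-proc.returncode}"
    return {
        "returncode": proc.returncode,
        "crashed": crashed,
        "signal": signal_name,
        "stderr_tail": proc.stderr[-500:],
    }
```

The parent process survives the child's abort and can log or alert on `result["crashed"]`, which covers both the isolation and the monitoring recommendations in a small way.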

Weaknesses (CWE)

CWE-617 — Reachable Assertion: The product contains an assert() or similar statement that can be triggered by an attacker, which leads to an application exit or other behavior that is more severe than necessary.

  • [Implementation] Make sensitive open/close operation non reachable by directly user-controlled data (e.g. open/close resources)
  • [Implementation] Perform input validation on user data.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
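The 5.5 base score can be reproduced from this vector with the CVSS v3.1 base-score equations. The sketch below is a compact illustration of those equations rather than a full CVSS library: it handles base metrics only, and the function names are hypothetical.

```python
import math

# CVSS v3.1 base metric weights.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on Scope (U = unchanged, C = changed).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a 'CVSS:3.1/...' vector string."""
    # Skip the leading "CVSS:3.1" prefix and map metric -> value.
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    scope = metrics["S"]
    iss = 1 - (
        (1 - WEIGHTS["C"][metrics["C"]])
        * (1 - WEIGHTS["I"][metrics["I"]])
        * (1 - WEIGHTS["A"][metrics["A"]])
    )
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = (
        8.22
        * WEIGHTS["AV"][metrics["AV"]]
        * WEIGHTS["AC"][metrics["AC"]]
        * PR_WEIGHTS[scope][metrics["PR"]]
        * WEIGHTS["UI"][metrics["UI"]]
    )
    if impact <= 0:
        return 0.0
    if scope == "U":
        return roundup(min(impact + exploitability, 10))
    return roundup(min(1.08 * (impact + exploitability), 10))
```

For this CVE the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is 8.22 × 0.55 × 0.77 × 0.62 × 0.85 ≈ 1.83; their sum, about 5.43, rounds up to 5.5.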

Timeline

Published: May 14, 2021
Last Modified: November 21, 2024
First Seen: May 14, 2021
