CVE-2021-29556: TensorFlow: DoS via divide-by-zero in Reverse op

MEDIUM PoC AVAILABLE
Published May 14, 2021
CISO Take

A local attacker with low privileges can crash a TensorFlow process by passing tf.raw_ops.Reverse a tensor whose first dimension is zero: the kernel divides by that dimension, and the process dies with a floating point exception (SIGFPE). In model serving contexts where user-supplied inputs reach this op, this becomes a remote availability risk. Upgrade to TensorFlow 2.5.0 (or to a backported release: 2.4.2, 2.3.3, 2.2.3, or 2.1.4) and validate tensor shapes at ingestion boundaries.

Risk Assessment

Effective risk is low-to-medium and context-dependent. CVSS 5.5 (Local) understates the threat in cloud ML serving architectures where inference endpoints accept raw tensors from untrusted callers. No confidentiality or integrity impact—pure availability. Not in CISA KEV and no evidence of in-the-wild exploitation. Urgency is low for offline training workloads, moderate for exposed inference APIs.

Affected Systems

Package      Ecosystem   Vulnerable Range                      Patched
tensorflow   pip         < 2.5.0 without the backported fix    2.5.0 (backports: 2.4.2, 2.3.3, 2.2.3, 2.1.4)

If you run a TensorFlow release earlier than 2.5.0 that lacks the backported fix, you are affected.
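A pinned version can be checked against the fixed releases named in this advisory. The following is a minimal sketch, assuming plain X.Y.Z version strings; the names `has_fix` and `MIN_PATCH_BY_SERIES` are hypothetical, and release series older than 2.1 are treated as unfixed because the advisory lists no backport for them.

```python
import re
from typing import Tuple

# Lowest patch level per release series that carries the fix, as listed in the
# advisory text (2.1.4, 2.2.3, 2.3.3, 2.4.2). Everything from 2.5.0 on is fixed.
MIN_PATCH_BY_SERIES = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}
FIRST_FIXED_SERIES = (2, 5)


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse a plain X.Y.Z string; pre-release suffixes are rejected on purpose."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unsupported version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(version: str) -> bool:
    """Return True if this TensorFlow version includes the fix per the advisory."""
    major, minor, patch = parse_version(version)
    series = (major, minor)
    if series >= FIRST_FIXED_SERIES:
        return True
    minimum_patch = MIN_PATCH_BY_SERIES.get(series)
    # Series without a listed backport (assumption: older than 2.1) count as unfixed.
    return minimum_patch is not None and patch >= minimum_patch


if __name__ == "__main__":
    for pinned in ["2.4.1", "2.4.2", "2.5.0", "2.0.4"]:
        print(pinned, "fixed" if has_fix(pinned) else "vulnerable")
```

Release candidates and local builds are deliberately rejected rather than guessed at, since a pre-release may or may not contain the fix.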

Severity & Risk

CVSS 3.1
5.5 / 10
EPSS
0.01%
chance of exploitation in 30 days
Higher than 1% of all CVEs
Exploitation Status
Exploit Available
Exploitation: MEDIUM
Sophistication
Trivial
Exploitation Confidence
medium
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

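Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High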

Recommended Action

  1. Upgrade to TensorFlow 2.5.0, or to a release that carries the backported fix: 2.4.2, 2.3.3, 2.2.3, or 2.1.4 (fix commit 4071d8e).

  2. If patching is blocked, add server-side input validation to reject tensors with any dimension equal to zero before passing to Reverse.

  3. Run TF serving processes under process supervisors (systemd, Kubernetes restartPolicy=Always) to auto-recover from crashes.

  4. Monitor for SIGFPE / abnormal process exits in TF inference pods—repeated crashes may signal probing.

  5. Audit downstream dependencies (TFX pipelines, TF Serving images) for pinned vulnerable versions.
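The input validation in step 2 can be sketched as a small shape check that runs before a request reaches the model. This is a minimal illustration, not part of TensorFlow or TF Serving: `validate_shape`, `EmptyDimensionError`, and the `max_rank` limit are hypothetical names and values, and the shape is assumed to have already been extracted from the request as a list of integers.

```python
from typing import Sequence, Tuple


class EmptyDimensionError(ValueError):
    """Raised when a tensor shape contains a dimension of size zero or less."""


def validate_shape(shape: Sequence[int], max_rank: int = 8) -> Tuple[int, ...]:
    """Reject shapes with an empty dimension before they reach the model.

    max_rank is an arbitrary illustrative limit, not a TensorFlow constant.
    """
    dims = tuple(shape)
    if len(dims) > max_rank:
        raise ValueError(f"rank {len(dims)} exceeds limit {max_rank}")
    for axis, size in enumerate(dims):
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(size, bool) or not isinstance(size, int):
            raise TypeError(f"dimension {axis} is not an integer: {size!r}")
        if size <= 0:
            raise EmptyDimensionError(f"dimension {axis} has size {size}")
    return dims


if __name__ == "__main__":
    print(validate_shape([1, 128]))
    try:
        validate_shape([0, 128])
    except EmptyDimensionError as exc:
        print("rejected:", exc)
```

Rejecting every zero-sized dimension is stricter than the bug requires and will also refuse legitimately empty batches, so a deployment that needs those would have to narrow the check.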

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, Robustness and Cybersecurity
ISO 42001
A.6.2.6 - Security of AI Systems
NIST AI RMF
MG-3.2 - Risk Response: Availability and Resilience
MS-2.5 - AI Robustness and Reliability Testing
OWASP LLM Top 10
LLM04 - Model Denial of Service

Frequently Asked Questions

What is CVE-2021-29556?

CVE-2021-29556 is a denial-of-service vulnerability in TensorFlow's tf.raw_ops.Reverse. The kernel divides by the size of the input tensor's first dimension, so a tensor whose first dimension is zero crashes the process with a floating point exception (SIGFPE). The fix ships in TensorFlow 2.5.0, with backports in 2.4.2, 2.3.3, 2.2.3, and 2.1.4.

Is CVE-2021-29556 actively exploited?

There is no evidence of in-the-wild exploitation, and the CVE is not listed in CISA KEV. Proof-of-concept exploit code is publicly available, however, which raises the risk of exploitation.

How to fix CVE-2021-29556?

  1. Upgrade to TensorFlow 2.5.0, or to a release that carries the backported fix: 2.4.2, 2.3.3, 2.2.3, or 2.1.4 (fix commit 4071d8e).

  2. If patching is blocked, add server-side input validation to reject tensors with any dimension equal to zero before passing to Reverse.

  3. Run TF serving processes under process supervisors (systemd, Kubernetes restartPolicy=Always) to auto-recover from crashes.

  4. Monitor for SIGFPE / abnormal process exits in TF inference pods; repeated crashes may signal probing.

  5. Audit downstream dependencies (TFX pipelines, TF Serving images) for pinned vulnerable versions.

What systems are affected by CVE-2021-29556?

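TensorFlow releases earlier than 2.5.0 that lack the backported fix (shipped in 2.4.2, 2.3.3, 2.2.3, and 2.1.4) are affected. In terms of AI/ML architecture patterns, the exposure covers model serving, inference pipelines, and training pipelines.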

What is the CVSS score for CVE-2021-29556?

CVE-2021-29556 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.01%.

Technical Details

NVD Description

TensorFlow is an end-to-end open source platform for machine learning. An attacker can cause a denial of service via a FPE runtime error in `tf.raw_ops.Reverse`. This is because the implementation (https://github.com/tensorflow/tensorflow/blob/36229ea9e9451dac14a8b1f4711c435a1d84a594/tensorflow/core/kernels/reverse_op.cc#L75-L76) performs a division based on the first dimension of the tensor argument. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
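The arithmetic behind the crash can be modeled in a few lines. This is a sketch of the division the description refers to, not the kernel's code: the function names are hypothetical, and the guarded variant only illustrates the general idea of checking the divisor and is not claimed to match the upstream patch.

```python
import math
from typing import Sequence


def cost_per_unit_unchecked(shape: Sequence[int]) -> int:
    """Divide the total element count by the first dimension, with no guard."""
    num_elements = math.prod(shape)
    first_dim = shape[0]
    # Python raises ZeroDivisionError here. Native integer division by zero is
    # undefined behavior in C++ and typically delivers SIGFPE on x86 Linux.
    return num_elements // first_dim


def cost_per_unit_checked(shape: Sequence[int]) -> int:
    """Same computation, but an empty first dimension short-circuits to zero."""
    if len(shape) == 0 or shape[0] == 0:
        return 0
    return math.prod(shape) // shape[0]


if __name__ == "__main__":
    print(cost_per_unit_unchecked([4, 128]))
    print(cost_per_unit_checked([0, 128]))
    try:
        cost_per_unit_unchecked([0, 128])
    except ZeroDivisionError as exc:
        print("unchecked division failed:", exc)
```

The difference that matters is the failure mode: Python turns the bad division into a catchable exception, whereas the native kernel has no such safety net and the whole process terminates.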

Exploitation Scenario

An adversary targets a model serving endpoint backed by TF 2.4.x that accepts serialized tensors (e.g., via TF Serving gRPC or a custom REST wrapper). They craft a protobuf request containing a tensor with shape [0, 128] and invoke a model that internally calls tf.reverse(). The kernel performs an integer division by the first dimension (0), which raises SIGFPE and kills the inference worker. In a deployment where nothing restarts the worker automatically, this takes the endpoint offline. Repeated at scale or timed for peak inference load, it amounts to a targeted denial of service against a production ML system.
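A worker killed this way leaves a recognizable exit status, which is what crash monitoring can key on. The sketch below is one possible classifier, with hypothetical function names: it assumes the two common conventions for reporting a fatal signal, a negative return code from Python's `subprocess` module and a `128 + signal number` exit code from shells and container runtimes.

```python
import signal
from typing import Optional


def fatal_signal(returncode: int) -> Optional[signal.Signals]:
    """Map an exit status to the signal that ended the process, if any."""
    if returncode < 0:
        # subprocess convention: -N means the child was killed by signal N.
        number = -returncode
    elif returncode > 128:
        # Shell and container convention: exit code 128 + N. This is a
        # heuristic, since a program may also choose such a code itself.
        number = returncode - 128
    else:
        return None
    try:
        return signal.Signals(number)
    except ValueError:
        return None  # not a signal number known on this platform


def is_sigfpe_crash(returncode: int) -> bool:
    """True if the exit status indicates death by SIGFPE."""
    return fatal_signal(returncode) is signal.SIGFPE


if __name__ == "__main__":
    for code in (0, 1, -int(signal.SIGFPE), 128 + int(signal.SIGFPE)):
        print(code, is_sigfpe_crash(code))
```

On Linux SIGFPE is signal 8, so a container terminated this way would typically report exit code 136; a run of such exits from inference pods is the pattern worth alerting on.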

Weaknesses (CWE)

CWE-369: Divide By Zero

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
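The 5.5 base score can be re-derived from this vector string. The sketch below applies the CVSS v3.1 base score equations and metric weights from the FIRST specification; the function names are illustrative, and temporal and environmental metrics are not handled.

```python
import math

# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required is weighted differently when Scope is Changed.
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a vector string."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError(f"unsupported vector prefix: {prefix!r}")
    m = dict(part.split(":", 1) for part in parts)
    scope = m["S"]

    iss = 1 - (
        (1 - WEIGHTS["C"][m["C"]])
        * (1 - WEIGHTS["I"][m["I"]])
        * (1 - WEIGHTS["A"][m["A"]])
    )
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15

    exploitability = (
        8.22
        * WEIGHTS["AV"][m["AV"]]
        * WEIGHTS["AC"][m["AC"]]
        * PR_WEIGHTS[scope][m["PR"]]
        * WEIGHTS["UI"][m["UI"]]
    )

    if impact <= 0:
        return 0.0
    if scope == "U":
        return roundup(min(impact + exploitability, 10))
    return roundup(min(1.08 * (impact + exploitability), 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 5.5
```

For this vector the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is 8.22 × 0.55 × 0.77 × 0.62 × 0.85 ≈ 1.83; their sum, about 5.43, rounds up to 5.5. Most of the score comes from the High availability impact, while the Local attack vector keeps exploitability low.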

Timeline

Published
May 14, 2021
Last Modified
November 21, 2024
First Seen
May 14, 2021

Related Vulnerabilities