CVE-2021-29575: TensorFlow: stack overflow DoS in ReverseSequence op

MEDIUM PoC AVAILABLE
Published May 14, 2021
CISO Take

Patch TensorFlow to 2.5.0 (or to the backported releases 2.4.2, 2.3.3, 2.2.3, or 2.1.4) immediately if you run sequence-based models. Risk is elevated on shared ML platforms, such as multi-tenant Jupyter environments or shared GPU clusters, where any user can trigger a TF runtime crash. No active exploitation has been reported, but the crash is trivially reproducible with a single negative integer argument.

Risk Assessment

Medium overall, but context-dependent. The local attack vector limits exposure for dedicated, isolated inference servers. Risk escalates significantly in shared ML environments (data science platforms, Jupyter hubs, Kubeflow pipelines) where untrusted or semi-trusted users execute TF operations. CVSS 5.5 is appropriate for isolated deployments; organizations running multi-tenant AI infrastructure should treat this closer to high due to blast radius on co-located workloads.

Affected Systems

Package: tensorflow
Ecosystem: pip
Vulnerable Range: releases earlier than the patched version on each supported branch
Patched: 2.5.0, 2.4.2, 2.3.3, 2.2.3, 2.1.4

If you run a tensorflow release earlier than 2.5.0, 2.4.2, 2.3.3, 2.2.3, or 2.1.4 on its respective branch, you are affected.
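As a rough way to check an installed version string against the fixed releases named in this advisory, a minimal sketch follows. The function name `is_patched` and the branch table are illustrative, not part of any TensorFlow or scanner API. The sketch assumes that branches older than 2.1 never received a fix and that pre-release suffixes (for example `rc0`) can be ignored; verify both assumptions before relying on it.

```python
import re

# Fixed patch level per supported branch, taken from the advisory text.
FIXED_PATCH_BY_BRANCH = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}
# First minor release line that ships with the fix included.
FIRST_FIXED_BRANCH = (2, 5)


def is_patched(version: str) -> bool:
    """Return True if `version` is at or above the fixed release for its branch.

    Assumptions (not from the advisory): branches below 2.1 are treated as
    unpatched, and any pre-release suffix after MAJOR.MINOR.PATCH is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if (major, minor) >= FIRST_FIXED_BRANCH:
        return True
    fixed_patch = FIXED_PATCH_BY_BRANCH.get((major, minor))
    return fixed_patch is not None and patch >= fixed_patch
```

In practice the version string could come from `tensorflow.__version__` or from a container image's package manifest.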

Severity & Risk

CVSS 3.1
5.5 / 10
EPSS
0.01%
chance of exploitation in 30 days
Higher than 1% of all CVEs
Exploitation Status
Exploit Available
Exploitation: MEDIUM
Sophistication
Trivial
Exploitation Confidence
medium
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

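Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High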

Recommended Action

  1. Patch: Upgrade TensorFlow to 2.5.0, 2.4.2, 2.3.3, 2.2.3, or 2.1.4 per your branch.

  2. Workaround (if patching is delayed): Add explicit input validation — assert seq_dim >= 0 and batch_dim >= 0 and both within tensor rank bounds before calling ReverseSequence.

  3. Detection: Monitor for abnormal TF process crashes or CHECK-failure stack traces in inference/training logs.

  4. Access control: In shared environments, restrict direct access to tf.raw_ops namespace for untrusted users.

  5. Dependency scanning: Add CVE-2021-29575 to your SCA tooling detection rules so that unpatched TF versions in container images are flagged. Do not allowlist the CVE, since allowlisting suppresses the finding.
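The workaround in step 2 can be sketched as a thin wrapper that rejects invalid dimensions before the op is reached. The names `validate_reverse_sequence_dims` and `safe_reverse_sequence` are hypothetical helpers, not TensorFlow APIs. The wrapper assumes it can obtain a static rank via `tf.convert_to_tensor`, and it only protects call sites that actually go through it.

```python
def validate_reverse_sequence_dims(rank, seq_dim, batch_dim=0):
    """Raise ValueError unless both dims are integers in [0, rank)."""
    if rank is None:
        raise ValueError("tensor rank must be statically known")
    for name, value in (("seq_dim", seq_dim), ("batch_dim", batch_dim)):
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name} must be an int, got {type(value).__name__}")
        if not 0 <= value < rank:
            raise ValueError(f"{name}={value} is outside [0, {rank})")


def safe_reverse_sequence(input, seq_lengths, seq_dim, batch_dim=0):
    """Hypothetical wrapper: validate dims, then call the raw op."""
    # Imported lazily so the validator above can be used without TensorFlow.
    import tensorflow as tf

    tensor = tf.convert_to_tensor(input)
    validate_reverse_sequence_dims(tensor.shape.rank, seq_dim, batch_dim)
    return tf.raw_ops.ReverseSequence(
        input=tensor,
        seq_lengths=seq_lengths,
        seq_dim=seq_dim,
        batch_dim=batch_dim,
    )
```

A check like this is a stopgap only. It does nothing for code that calls `tf.raw_ops.ReverseSequence` directly, so upgrading remains the actual fix.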

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.9.3 - AI system security and resilience
NIST AI RMF
MANAGE 2.2 - AI risk treatment and response mechanisms
OWASP LLM Top 10
LLM04 - Model Denial of Service

Frequently Asked Questions

What is CVE-2021-29575?

CVE-2021-29575 is a denial-of-service vulnerability in TensorFlow's tf.raw_ops.ReverseSequence op. The implementation does not validate its seq_dim and batch_dim arguments, so a negative or otherwise invalid value can cause a stack overflow or a CHECK-failure that crashes the TF runtime process. It is fixed in TensorFlow 2.5.0 and in the backported releases 2.4.2, 2.3.3, 2.2.3, and 2.1.4.

Is CVE-2021-29575 actively exploited?

No active exploitation has been reported. However, proof-of-concept exploit code for CVE-2021-29575 is publicly available (indexed in trickest/cve), which increases the risk of exploitation.

How to fix CVE-2021-29575?

  1. Patch: Upgrade TensorFlow to 2.5.0, 2.4.2, 2.3.3, 2.2.3, or 2.1.4 per your branch.

  2. Workaround (if patching is delayed): Add explicit input validation. Assert that seq_dim >= 0 and batch_dim >= 0 and that both are within tensor rank bounds before calling ReverseSequence.

  3. Detection: Monitor for abnormal TF process crashes or CHECK-failure stack traces in inference/training logs.

  4. Access control: In shared environments, restrict direct access to tf.raw_ops namespace for untrusted users.

  5. Dependency scanning: Add CVE-2021-29575 to your SCA tooling detection rules so that unpatched TF versions in container images are flagged.
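For the detection step, one possible starting point is a simple log scan. This is a sketch under assumptions: the patterns below are guesses at what a CHECK-failure or stack-overflow crash leaves in process output, and `find_crash_lines` is a hypothetical helper name. Tune the patterns against logs from your own environment.

```python
import re

# Assumed crash indicators; adjust to match what your runtime actually logs.
CRASH_PATTERNS = [
    re.compile(r"Check failed:"),
    re.compile(r"Segmentation fault"),
    re.compile(r"stack overflow", re.IGNORECASE),
]


def find_crash_lines(lines):
    """Return (line_number, text) pairs for lines matching any crash pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(pattern.search(line) for pattern in CRASH_PATTERNS):
            hits.append((number, line.rstrip("\n")))
    return hits
```

The function accepts any iterable of strings, so an open log file can be passed directly. Matches indicate a crash worth investigating, not proof that this CVE was triggered.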

What systems are affected by CVE-2021-29575?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, multi-tenant ML platforms.

What is the CVSS score for CVE-2021-29575?

CVE-2021-29575 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.01%.

Technical Details

NVD Description

TensorFlow is an end-to-end open source platform for machine learning. The implementation of `tf.raw_ops.ReverseSequence` allows for stack overflow and/or `CHECK`-fail based denial of service. The implementation (https://github.com/tensorflow/tensorflow/blob/5b3b071975e01f0d250c928b2a8f901cd53b90a7/tensorflow/core/kernels/reverse_sequence_op.cc#L114-L118) fails to validate that `seq_dim` and `batch_dim` arguments are valid. Negative values for `seq_dim` can result in stack overflow or `CHECK`-failure, depending on the version of Eigen code used to implement the operation. Similar behavior can be exhibited by invalid values of `batch_dim`. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

Exploitation Scenario

An adversary with local access to a shared ML platform (for example, a data scientist on a multi-tenant Jupyter environment) executes a single notebook cell that calls tf.raw_ops.ReverseSequence with seq_dim=-1 on an arbitrary tensor. Depending on the Eigen version in use, this triggers a stack overflow or a CHECK-failure, either of which crashes the TF runtime process. In a Kubernetes-based ML serving environment, the crash causes pod restarts and temporary inference service disruption for all users sharing the node. A malicious insider could use this to disrupt a competing team's training runs or to mask other malicious activity during the outage window.

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
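To show how this vector yields the 5.5 base score, the sketch below applies the CVSS v3.1 base-score equations with the metric weights from the published specification. The function names are illustrative, and only base metrics are handled (no temporal or environmental metrics).

```python
import math

WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required weights depend on whether Scope is changed.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """Round up to one decimal place, as defined in the CVSS v3.1 spec."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the CVSS v3.1 base score from a vector string."""
    metrics = dict(item.split(":") for item in vector.split("/"))
    changed = metrics["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]
    c, i, a = (WEIGHTS[k][metrics[k]] for k in ("C", "I", "A"))
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (
        8.22 * WEIGHTS["AV"][metrics["AV"]] * WEIGHTS["AC"][metrics["AC"]]
        * pr * WEIGHTS["UI"][metrics["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 5.5
```

For this vector the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is about 1.83; their sum, about 5.43, rounds up to 5.5. The score is driven entirely by the High availability impact, since confidentiality and integrity are unaffected.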

Timeline

Published
May 14, 2021
Last Modified
November 21, 2024
First Seen
May 14, 2021

Related Vulnerabilities