CVE-2021-29575: TensorFlow stack overflow DoS

CISO Take

Patch TensorFlow to 2.5.0 (or backport versions 2.4.2/2.3.3/2.2.3/2.1.4) immediately if running sequence-based models. Risk is elevated in shared ML platforms — multi-tenant Jupyter environments or shared GPU clusters — where any user can trigger a TF runtime crash. Not actively exploited, but trivially reproducible with a single negative integer argument.

What is the risk?

Medium overall, but context-dependent. The local attack vector limits exposure for dedicated, isolated inference servers. Risk escalates significantly in shared ML environments (data science platforms, Jupyter hubs, Kubeflow pipelines) where untrusted or semi-trusted users execute TF operations. CVSS 5.5 is appropriate for isolated deployments; organizations running multi-tenant AI infrastructure should treat this closer to high due to blast radius on co-located workloads.

What systems are affected?

Package	Ecosystem	Vulnerable Range	Patched
TensorFlow	pip	< 2.1.4; 2.2.0 to 2.2.2; 2.3.0 to 2.3.2; 2.4.0 to 2.4.1	2.1.4, 2.2.3, 2.3.3, 2.4.2, 2.5.0

Do you use TensorFlow? You are affected if you run a release older than the patched version for your branch.

How severe is it?

CVSS 3.1

5.5 / 10

EPSS

0.2%

chance of exploitation in 30 days

Higher than 10% of all CVEs

Source: EPSS v3 — FIRST.org

Exploitation Status

Exploit Available

Exploitation: MEDIUM

Sophistication

Trivial

Exploitation Confidence

medium

○ Public PoC indexed (trickest/cve)

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

AV Local

AC Low

PR Low

UI None

S Unchanged

C None

I None

A High
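
The 5.5 base score can be reproduced from the metrics listed above. The following is a minimal sketch of the CVSS v3.1 base score formula for the Scope Unchanged case, using the numeric weights from the FIRST.org specification for the values this CVE uses; the function names are illustrative, not part of any library.

```python
def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest number with one decimal that is >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (scaled // 10000 + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a):
    """Base score for Scope: Unchanged, given numeric metric weights."""
    iss = 1 - (1 - c) * (1 - i) * (1 - a)          # impact sub-score
    impact = 6.42 * iss
    exploitability = 8.22 * av * ac * pr * ui
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# AV:L=0.55, AC:L=0.77, PR:L=0.62 (scope unchanged), UI:N=0.85,
# C:N=0.0, I:N=0.0, A:H=0.56
score = base_score_scope_unchanged(
    av=0.55, ac=0.77, pr=0.62, ui=0.85, c=0.0, i=0.0, a=0.56
)
print(score)  # 5.5
```

Only the availability impact contributes to the impact sub-score here, which is why the result stays in the medium band despite the low attack complexity.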

What should I do?

5 steps

Patch: Upgrade TensorFlow to 2.5.0, 2.4.2, 2.3.3, 2.2.3, or 2.1.4 per your branch.
Workaround (if patching is delayed): Add explicit input validation — assert seq_dim >= 0 and batch_dim >= 0 and both within tensor rank bounds before calling ReverseSequence.
Detection: Monitor for abnormal TF process crashes or CHECK-failure stack traces in inference/training logs.
Access control: In shared environments, restrict direct access to tf.raw_ops namespace for untrusted users.
Dependency scanning: Make sure your SCA tooling flags CVE-2021-29575 (and that it is not suppressed on an ignore list or allowlist) so that unpatched TF versions in container images are reported.
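
The workaround step above can be sketched as a small guard that runs before the op is invoked. This is a hedged illustration only: the helper name check_reverse_sequence_dims is hypothetical, it takes the input tensor's rank as a plain integer so it does not depend on TensorFlow, and it covers just the two conditions named in the workaround (non-negative, and within the tensor rank).

```python
def check_reverse_sequence_dims(input_rank: int, seq_dim: int, batch_dim: int) -> None:
    """Raise ValueError unless both dims are integers in [0, input_rank).

    Hypothetical helper: call it before tf.raw_ops.ReverseSequence on
    unpatched TensorFlow builds so bad values never reach the kernel.
    """
    for name, dim in (("seq_dim", seq_dim), ("batch_dim", batch_dim)):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(dim, bool) or not isinstance(dim, int):
            raise ValueError(f"{name} must be an integer, got {dim!r}")
        if dim < 0:
            raise ValueError(f"{name} must be >= 0, got {dim}")
        if dim >= input_rank:
            raise ValueError(
                f"{name}={dim} is out of range for a rank-{input_rank} tensor"
            )


# A rank-3 input with seq_dim=1 and batch_dim=0 passes silently.
check_reverse_sequence_dims(input_rank=3, seq_dim=1, batch_dim=0)

# The value used in the public reproduction (seq_dim=-1) is rejected.
try:
    check_reverse_sequence_dims(input_rank=3, seq_dim=-1, batch_dim=0)
    rejected = False
except ValueError:
    rejected = True
```

In a real code path the rank would typically come from the input tensor's shape (for example len(x.shape)), and the guard would sit in whatever wrapper accepts untrusted arguments.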

How is it classified?

DoS, Framework, Inference

AML.T0010.001 - AI Software

AML.T0029 - Denial of AI Service

AML.T0049 - Exploit Public-Facing Application

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act

Article 15 - Accuracy, robustness and cybersecurity

ISO 42001

A.9.3 - AI system security and resilience

NIST AI RMF

MANAGE 2.2 - AI risk treatment and response mechanisms

OWASP LLM Top 10

LLM04 - Model Denial of Service

Frequently Asked Questions

What is CVE-2021-29575?

CVE-2021-29575 is a denial-of-service vulnerability in TensorFlow. The implementation of tf.raw_ops.ReverseSequence fails to validate its seq_dim and batch_dim arguments, so a negative value can cause a stack overflow or a CHECK failure that crashes the TF runtime process. It is fixed in TensorFlow 2.5.0, with backports in 2.4.2, 2.3.3, 2.2.3, and 2.1.4.

Is CVE-2021-29575 actively exploited?

No active exploitation has been reported. However, proof-of-concept exploit code is publicly available for CVE-2021-29575 (indexed in trickest/cve), which increases the risk of exploitation.

How to fix CVE-2021-29575?

1. Patch: Upgrade TensorFlow to 2.5.0, 2.4.2, 2.3.3, 2.2.3, or 2.1.4 per your branch.
2. Workaround (if patching is delayed): Add explicit input validation: assert seq_dim >= 0 and batch_dim >= 0, and that both are within tensor rank bounds, before calling ReverseSequence.
3. Detection: Monitor for abnormal TF process crashes or CHECK-failure stack traces in inference/training logs.
4. Access control: In shared environments, restrict direct access to the tf.raw_ops namespace for untrusted users.
5. Dependency scanning: Make sure your SCA tooling flags CVE-2021-29575 (and that it is not suppressed on an ignore list or allowlist) so that unpatched TF versions in container images are reported.
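
For the detection step, a minimal log-scanning sketch might look like the following. The marker strings are assumptions about how a CHECK failure or a stack-overflow crash tends to surface in logs, the sample lines are synthetic placeholders rather than real TensorFlow output, and find_crash_lines is a hypothetical helper name.

```python
from typing import Iterable, List, Tuple

# Assumed crash signatures; adjust to what your runtime actually logs.
CRASH_MARKERS = ("Check failed:", "Segmentation fault")


def find_crash_lines(log_lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line containing a crash marker."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        if any(marker in line for marker in CRASH_MARKERS):
            hits.append((number, line.rstrip("\n")))
    return hits


# Synthetic sample lines for illustration only.
sample_log = [
    "INFO request 17 served\n",
    "F <file>:<line>] Check failed: <condition>\n",
    "Segmentation fault (core dumped)\n",
    "INFO worker restarted\n",
]

hits = find_crash_lines(sample_log)
for number, text in hits:
    print(f"line {number}: {text}")
```

A production setup would more likely express the same patterns as alert rules in the existing log pipeline; the point is only that both failure modes named in the advisory leave a recognizable trace.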

What systems are affected by CVE-2021-29575?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, multi-tenant ML platforms.

What is the CVSS score for CVE-2021-29575?

CVE-2021-29575 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.20%.

What is the AI security impact?

Affected AI Architectures

training pipelines, model serving, multi-tenant ML platforms

MITRE ATLAS Techniques

AML.T0010.001 AI Software

AML.T0029 Denial of AI Service

AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 15

ISO 42001: A.9.3

NIST AI RMF: MANAGE 2.2

OWASP LLM Top 10: LLM04

What are the technical details?

Original Advisory

TensorFlow is an end-to-end open source platform for machine learning. The implementation of `tf.raw_ops.ReverseSequence` allows for stack overflow and/or `CHECK`-fail based denial of service. The implementation (https://github.com/tensorflow/tensorflow/blob/5b3b071975e01f0d250c928b2a8f901cd53b90a7/tensorflow/core/kernels/reverse_sequence_op.cc#L114-L118) fails to validate that `seq_dim` and `batch_dim` arguments are valid. Negative values for `seq_dim` can result in stack overflow or `CHECK`-failure, depending on the version of Eigen code used to implement the operation. Similar behavior can be exhibited by invalid values of `batch_dim`. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
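
The release information in the advisory can be turned into a simple version check. The sketch below is illustrative: is_vulnerable is a hypothetical helper, it assumes that releases older than the 2.1 branch are also affected (the advisory only names the supported branches), and it ignores pre-release suffixes such as rc tags.

```python
import re

# First fixed patch release on each supported branch, per the advisory.
FIRST_FIXED_PATCH = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}


def is_vulnerable(version: str) -> bool:
    """Return True if a TensorFlow version string predates the fix."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    branch = (major, minor)
    if branch >= (2, 5):
        return False                      # 2.5.0 and later carry the fix
    if branch in FIRST_FIXED_PATCH:
        return patch < FIRST_FIXED_PATCH[branch]
    return True                           # assumption: older branches are affected


for candidate in ("2.4.1", "2.4.2", "2.3.0", "2.5.0"):
    print(candidate, is_vulnerable(candidate))
```

In practice the installed version would come from the environment being audited (for example the package metadata in a container image) rather than from a hard-coded list.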

Exploitation Scenario

An adversary with local access to a shared ML platform (for example, a data scientist on a multi-tenant Jupyter environment) executes a single notebook cell calling tf.raw_ops.ReverseSequence with seq_dim=-1 on an arbitrary tensor. Depending on the Eigen version in use, this triggers a stack overflow or a CHECK failure, either of which crashes the TF runtime process. In a Kubernetes-based ML serving environment, this causes pod restarts and temporary inference service disruption for all users served by the affected pods. A malicious insider could use this to disrupt another team's training runs or to mask other malicious activity during the outage window.

Weaknesses (CWE)

CWE-787 Out-of-bounds Write (Primary)

CWE-119 Improper Restriction of Operations within the Bounds of a Memory Buffer

CWE-787 — Out-of-bounds Write: The product writes data past the end, or before the beginning, of the intended buffer.

[Requirements] Use a language that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid. For example, many languages that perform their own memory management, such as Java and Perl, are not subject to buffer overflows. Other languages, such as Ada and C#, typically provide overflow protection, but the protection can be disabled by the programmer. Be wary that a language's interface to native code may still be subject to overflows, even if the language itself is theoretically safe.
[Architecture and Design] Use a vetted library or framework that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid. Examples include the Safe C String Library (SafeStr) by Messier and Viega [REF-57], and the Strsafe.h library from Microsoft [REF-56]. These libraries provide safer versions of overflow-prone string-handling functions.

Source: MITRE CWE corpus.