CVE-2021-29611: TensorFlow DoS — MEDIUM

CISO Take

This medium-severity local DoS in TensorFlow's SparseReshape op allows any user with local execution access to crash the TensorFlow process by supplying a malformed sparse tensor. For ML platforms where users submit training jobs or inference requests (shared compute, Jupyter, Vertex AI, SageMaker), this is a reliable availability disruption vector. Upgrade affected deployments to TensorFlow 2.5.0, 2.4.2, or 2.3.3 immediately.

What is the risk?

Risk is low to moderate in isolated environments and higher on multi-tenant ML platforms. The local attack vector (AV:L) limits remote exploitability, but in practice most ML training infrastructure exposes code execution via notebooks, job schedulers, or REST APIs that accept user-defined graphs, which lowers the bar to exploitation. There is no confidentiality or integrity impact; this is purely an availability issue. The CVE is not in CISA KEV and no active exploitation has been reported. CVSS 5.5 accurately reflects the limited blast radius in properly segmented deployments.

What systems are affected?

Package	Ecosystem	Vulnerable Range	Patched
TensorFlow	pip	2.3.x before 2.3.3, 2.4.x before 2.4.2	2.3.3, 2.4.2, 2.5.0

Do you use TensorFlow 2.3.x or 2.4.x? Releases before 2.3.3 and 2.4.2 are affected.
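
Whether a given installation falls in the affected range can be checked against the patched releases named in the advisory (2.3.3, 2.4.2, 2.5.0). The following is a minimal sketch, not an official tool: the helper names are hypothetical, only the `tensorflow` distribution name is looked up, and release lines other than 2.3.x and 2.4.x are reported as unaffected because the advisory names those two lines as the only affected ones.

```python
import re
from importlib import metadata

# First fixed patch level per affected release line, taken from the advisory text.
# Assumption: any other release line is treated as not affected.
FIRST_FIXED_PATCH = {(2, 3): 3, (2, 4): 2}


def parse_version(text):
    """Extract (major, minor, patch) from strings such as '2.4.1' or '2.4.0rc1'.

    Pre-release suffixes are ignored in this sketch.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version_text):
    """Return True if the version is on an affected line and below its fix."""
    major, minor, patch = parse_version(version_text)
    fixed = FIRST_FIXED_PATCH.get((major, minor))
    return fixed is not None and patch < fixed


def installed_tensorflow_version():
    """Return the installed 'tensorflow' distribution version, or None."""
    try:
        return metadata.version("tensorflow")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_tensorflow_version()
    if version is None:
        print("tensorflow is not installed in this environment")
    else:
        status = "affected" if is_affected(version) else "not affected"
        print(f"tensorflow {version}: {status}")
```

Variant distributions installed under a different package name would need their own lookup.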

How severe is it?

CVSS 3.1

5.5 / 10

EPSS

0.2%

chance of exploitation in 30 days

Higher than 10% of all CVEs

Source: EPSS v3 — FIRST.org

Exploitation Status

Exploit Available

Exploitation: MEDIUM

Sophistication

Trivial

Exploitation Confidence

medium

○ Public PoC indexed (trickest/cve)

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

AV Local

AC Low

PR Low

UI None

S Unchanged

C None

I None

A High
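
The 5.5 base score follows from these metric values under the CVSS v3.1 base score equations. The sketch below reproduces the arithmetic for illustration; it only handles Scope: Unchanged, and the vector string is assembled here from the metrics listed above rather than copied from a scoring authority.

```python
import math

# CVSS v3.1 metric weights (Privileges Required values are for Scope: Unchanged).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value):
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a vector with Scope: Unchanged."""
    m = dict(part.split(":") for part in vector.split("/"))
    if m["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    # Impact sub-score: only Availability contributes for this CVE.
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    impact = 6.42 * iss
    exploitability = (
        8.22 * AV[m["AV"]] * AC[m["AC"]] * PR_UNCHANGED[m["PR"]] * UI[m["UI"]]
    )
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


VECTOR = "AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"
print(base_score(VECTOR))  # 5.5
```

Impact is about 3.60 and exploitability about 1.83; their sum of about 5.43 rounds up to 5.5.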

What should I do?

4 steps

Patch: Upgrade to TensorFlow 2.5.0, 2.4.2, or 2.3.3. Verify via pip show tensorflow.
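Workaround: Validate sparse tensor components at API/pipeline ingestion boundaries before passing them to SparseReshape. Reject inputs whose indices are not an [nnz, rank] matrix, whose values count differs from nnz, whose dense_shape length differs from the index rank, or whose indices fall outside dense_shape.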
Isolation: Run TensorFlow model servers in separate processes per tenant; use process-level isolation (containers) so a crash does not affect other users.
Detection: Monitor for abnormal TF process exits (SIGABRT) and CHECK-failure log messages containing 'SparseReshape'. Alert on unexpected serving process restarts.
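
The workaround step can be sketched as a small pre-check that runs on plain Python lists before any tensor is built. This is an illustrative sketch under assumptions, not TensorFlow's own validation: the function name and the exact set of checks are hypothetical, chosen to cover the conditions a well-formed sparse tensor must satisfy.

```python
def validate_sparse_components(indices, values, dense_shape):
    """Return a list of problems; an empty list means the input looks well-formed.

    indices:     list of rows, one per non-zero entry, each a list of ints
    values:      list with one entry per non-zero element
    dense_shape: list of non-negative ints, one per dimension
    """
    problems = []
    rank = len(dense_shape)
    if rank == 0:
        problems.append("dense_shape must have at least one dimension")
    if any(not isinstance(d, int) or d < 0 for d in dense_shape):
        problems.append("dense_shape entries must be non-negative integers")
    if problems:
        # Later checks compare against dense_shape, so stop early.
        return problems

    if len(indices) != len(values):
        problems.append(
            f"{len(indices)} index rows but {len(values)} values"
        )

    for row_number, row in enumerate(indices):
        if len(row) != rank:
            problems.append(
                f"index row {row_number} has {len(row)} columns, expected {rank}"
            )
            continue
        for axis, (index, dim) in enumerate(zip(row, dense_shape)):
            if not isinstance(index, int) or not 0 <= index < dim:
                problems.append(
                    f"index row {row_number}, axis {axis}: {index!r} "
                    f"is outside [0, {dim})"
                )
    return problems


# A consistent 2D input passes; an out-of-bounds index is reported.
print(validate_sparse_components([[0, 0], [1, 2]], [1.0, 2.0], [10, 10]))
print(validate_sparse_components([[0, 10]], [1.0], [10, 10]))
```

Rejecting the request when the returned list is non-empty keeps malformed inputs away from the op on unpatched versions; it does not replace upgrading.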

How is it classified?

DoS Framework

AML.T0029 - Denial of AI Service

AML.T0043 - Craft Adversarial Data

AML.T0049 - Exploit Public-Facing Application

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act

Article 9 - Risk management system

ISO 42001

8.4 - AI system operation and monitoring

NIST AI RMF

MANAGE 2.2 - Mechanisms to sustain AI system availability and reliability

OWASP LLM Top 10

LLM04 - Model Denial of Service

Frequently Asked Questions

What is CVE-2021-29611?

This medium-severity local DoS in TensorFlow's SparseReshape op allows any user with local execution access to crash the TensorFlow process by supplying a malformed sparse tensor. For ML platforms where users submit training jobs or inference requests (shared compute, Jupyter, Vertex AI, SageMaker), this is a reliable availability disruption vector. Upgrade affected deployments to TensorFlow 2.5.0, 2.4.2, or 2.3.3 immediately.

Is CVE-2021-29611 actively exploited?

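No active exploitation of CVE-2021-29611 has been reported, and the CVE is not in CISA KEV. Proof-of-concept exploit code is publicly available (indexed in trickest/cve), which increases the risk of exploitation.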

How to fix CVE-2021-29611?

1. Patch: Upgrade to TensorFlow 2.5.0, 2.4.2, or 2.3.3. Verify via `pip show tensorflow`.
2. Workaround: Validate sparse tensor components at API/pipeline ingestion boundaries before passing them to SparseReshape. Reject inputs whose indices are not an [nnz, rank] matrix, whose values count differs from nnz, whose dense_shape length differs from the index rank, or whose indices fall outside dense_shape.
3. Isolation: Run TensorFlow model servers in separate processes per tenant; use process-level isolation (containers) so a crash does not affect other users.
4. Detection: Monitor for abnormal TF process exits (SIGABRT) and CHECK-failure log messages containing 'SparseReshape'. Alert on unexpected serving process restarts.
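
The detection step can be approximated with a simple log scan. The sketch below is a heuristic under assumptions: the function name is hypothetical, the sample log lines are made up for illustration, and the wording of a real abort message may differ, so the patterns should be tuned against actual logs.

```python
import re

# Patterns are assumptions: a CHECK-failure marker plus the op name,
# matched case-insensitively (also catches 'sparse_reshape' in file paths).
CHECK_FAILED = re.compile(r"check failed", re.IGNORECASE)
OP_NAME = re.compile(r"sparse_?reshape", re.IGNORECASE)


def find_suspect_lines(lines):
    """Return log lines that mention both a CHECK failure and SparseReshape."""
    return [
        line.rstrip("\n")
        for line in lines
        if CHECK_FAILED.search(line) and OP_NAME.search(line)
    ]


# Hypothetical log lines, invented for this example only.
SAMPLE_LOG = [
    "I model server started\n",
    "F example/path.cc:40] Check failed: invalid input to SparseReshape\n",
    "W SparseReshape took longer than expected\n",
]

for hit in find_suspect_lines(SAMPLE_LOG):
    print("ALERT:", hit)
```

Pairing such a scan with alerts on unexpected process restarts covers crashes whose log output is lost.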

What systems are affected by CVE-2021-29611?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, ML data preprocessing, shared compute / multi-tenant ML platforms.

What is the CVSS score for CVE-2021-29611?

CVE-2021-29611 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.20%.

What is the AI security impact?

Affected AI Architectures

training pipelines

model serving

ML data preprocessing

shared compute / multi-tenant ML platforms

MITRE ATLAS Techniques

AML.T0029 Denial of AI Service

AML.T0043 Craft Adversarial Data

AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 9

ISO 42001: 8.4

NIST AI RMF: MANAGE 2.2

OWASP LLM Top 10: LLM04

What are the technical details?

Original Advisory

TensorFlow is an end-to-end open source platform for machine learning. Incomplete validation in `SparseReshape` results in a denial of service based on a `CHECK`-failure. The implementation (https://github.com/tensorflow/tensorflow/blob/e87b51ce05c3eb172065a6ea5f48415854223285/tensorflow/core/kernels/sparse_reshape_op.cc#L40) has no validation that the input arguments specify a valid sparse tensor. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2 and TensorFlow 2.3.3, as these are the only affected versions.

Exploitation Scenario

An adversary with access to a shared ML platform (e.g., a data scientist submitting jobs, or an external user querying a TensorFlow Serving endpoint that accepts sparse tensor inputs) crafts a SparseTensor with an inconsistent shape—e.g., declaring a 2D shape [10, 10] but providing indices that exceed those bounds, or mismatching the dense_shape dimensions against actual values. When the model pipeline calls SparseReshape on this input, the incomplete validation triggers a CHECK-failure abort, crashing the TensorFlow process. On a shared training cluster, this kills co-located training jobs. On a serving deployment, it causes a service outage until the process is restarted.

Weaknesses (CWE)

CWE-20 Improper Input Validation (Primary)

CWE-665 Improper Initialization

CWE-20 — Improper Input Validation: The product receives input or data, but it does not validate or incorrectly validates that the input has the properties that are required to process the data safely and correctly.

[Architecture and Design] Consider using language-theoretic security (LangSec) techniques that characterize inputs using a formal language and build "recognizers" for that language. This effectively requires parsing to be a distinct layer that effectively enforces a boundary between raw input and internal data representations, instead of allowing parser code to be scattered throughout the program, where it could be subject to errors or inconsistencies that create weaknesses. [REF-1109] [REF-1110] [REF-1111]
[Architecture and Design] Use an input validation framework such as Struts or the OWASP ESAPI Validation API. Note that using a framework does not automatically address all input validation problems; be mindful of weaknesses that could arise from misusing the framework itself (CWE-1173).

Source: MITRE CWE corpus.