CVE-2021-29615: TensorFlow uncontrolled recursion

CISO Take

A crafted TensorFlow graph attribute triggers unbounded recursion in ParseAttrValue, overflowing the stack and crashing any TF process that parses it: training jobs, model servers, or pipelines loading untrusted graphs. Impact is availability-only (no confidentiality or integrity loss). Upgrade to TF 2.5.0, or to the patched point release on an older line (2.4.2, 2.3.3, 2.2.3, or 2.1.4); if patching is delayed, restrict TF graph parsing to signed, trusted sources.

What is the risk?

Medium operational risk. The local attack vector limits direct remote exploitability, but ML pipelines routinely process model files from semi-trusted sources such as artifact registries, collaborative training, and third-party checkpoints. Any system that loads external TensorFlow graphs or SavedModels is exposed. Attack complexity is low, so exploitation is straightforward once an attacker can influence parsed inputs. The CVE is not known to be actively exploited and is not in CISA KEV.

What systems are affected?

Package	Ecosystem	Vulnerable Range	Patched
TensorFlow	pip	Before 2.5.0, excluding the patched point releases	2.5.0, 2.4.2, 2.3.3, 2.2.3, 2.1.4

Do you use TensorFlow? You're affected if any deployment runs a release that predates the fixes (2.5.0, 2.4.2, 2.3.3, 2.2.3, 2.1.4).

How severe is it?

CVSS 3.1: 5.5 / 10

EPSS: 0.2% chance of exploitation in 30 days (higher than 10% of all CVEs)

Source: EPSS v3, FIRST.org

Exploitation Status

Status: Exploit available

Exploitation: Medium

Sophistication: Moderate

Exploitation confidence: Medium

Signal: Public PoC indexed (trickest/cve)

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

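Attack Vector (AV): Local

Attack Complexity (AC): Low

Privileges Required (PR): Low

User Interaction (UI): None

Scope (S): Unchanged

Confidentiality (C): None

Integrity (I): None

Availability (A): High

Vector: CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H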
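
The 5.5 base score follows from these metrics under the CVSS v3.1 base equations. The sketch below is a minimal illustration that handles only Scope: Unchanged; the function and table names are illustrative, and the weights are the published v3.1 metric values.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged only).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_unchanged(av, ac, pr, ui, c, i, a) -> float:
    """Base score for a vector whose Scope is Unchanged."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR_UNCHANGED[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# AV:L / AC:L / PR:L / UI:N / C:N / I:N / A:H
print(base_score_unchanged("L", "L", "L", "N", "N", "N", "H"))
```

With availability as the only impacted property, the impact sub-score is about 3.6 and the exploitability sub-score about 1.8, which rounds up to 5.5.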

What should I do?

Patch: Upgrade to TF 2.5.0 or later, or to the patched point release on your current line (2.4.2, 2.3.3, 2.2.3, or 2.1.4).
Input trust boundary: Reject untrusted or unsigned TF graphs before they reach ParseAttrValue; validate model provenance via cryptographic signing.
Process isolation: Run model loading in sandboxed child processes with a ulimit stack cap, so that a stack overflow takes down only the loader and not the serving or training process.
Detection: Alert on unexpected TF process crashes or SIGSEGV signals in serving and training infrastructure logs.
Inventory: Audit all TF versions in use via SBOM, pip freeze, or container image scans.
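
The process isolation and detection steps can be combined in a small wrapper. The following is a minimal POSIX-only sketch, not a hardened sandbox: `run_isolated` and the 8 MiB default are illustrative choices, and the command being run (for example a hypothetical `load_model.py` script) is an assumption.

```python
import resource
import signal
import subprocess
import sys


def run_isolated(argv, stack_bytes=8 * 1024 * 1024, timeout=300):
    """Run argv in a child process with a capped stack; report how it ended."""

    def limit_stack():
        # Executed in the child just before exec (POSIX only).
        _, hard = resource.getrlimit(resource.RLIMIT_STACK)
        if hard == resource.RLIM_INFINITY:
            soft = stack_bytes
        else:
            soft = min(stack_bytes, hard)
        resource.setrlimit(resource.RLIMIT_STACK, (soft, hard))

    proc = subprocess.run(
        argv, preexec_fn=limit_stack, timeout=timeout, capture_output=True
    )
    if proc.returncode < 0:
        # A negative return code means the child was terminated by a signal.
        return {"ok": False, "signal": signal.Signals(-proc.returncode).name}
    return {"ok": proc.returncode == 0, "signal": None}


if __name__ == "__main__":
    # Stand-in for a loader script; replace with the real model-loading command.
    print(run_isolated([sys.executable, "-c", "pass"]))
```

A result with `signal` set to `SIGSEGV` is the case worth alerting on: the parent survives, and the artifact that caused the crash can be quarantined. A hung child raises `subprocess.TimeoutExpired`, which the caller would need to handle.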

How is it classified?

Tags: DoS, Supply Chain, Framework, Inference

AML.T0011.000 - Unsafe AI Artifacts
AML.T0029 - Denial of AI Service
AML.T0049 - Exploit Public-Facing Application

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act

Article 15 - Accuracy, robustness and cybersecurity

ISO 42001

A.10.1 - AI system supply chain management

NIST AI RMF

MANAGE 2.2 - Mechanisms are in place to respond to and recover from AI risks

OWASP LLM Top 10

LLM05 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2021-29615?

CVE-2021-29615 is an uncontrolled recursion flaw in TensorFlow's ParseAttrValue. A crafted graph attribute overflows the stack and crashes any TF process that parses it: training jobs, model servers, or pipelines loading untrusted graphs. Impact is availability-only (no confidentiality or integrity loss). The fix ships in TF 2.5.0 and in the point releases 2.4.2, 2.3.3, 2.2.3, and 2.1.4.

Is CVE-2021-29615 actively exploited?

No active exploitation has been reported, and CVE-2021-29615 is not in CISA KEV. Proof-of-concept exploit code is publicly available, however, which increases the risk of exploitation.

How to fix CVE-2021-29615?

1. Patch: Upgrade to TF 2.5.0 or later, or to the patched point release on your current line (2.4.2, 2.3.3, 2.2.3, or 2.1.4).
2. Input trust boundary: Reject untrusted or unsigned TF graphs before they reach ParseAttrValue; validate model provenance via cryptographic signing.
3. Process isolation: Run model loading in sandboxed child processes with a ulimit stack cap, so that a stack overflow takes down only the loader.
4. Detection: Alert on unexpected TF process crashes or SIGSEGV signals in serving and training infrastructure logs.
5. Inventory: Audit all TF versions in use via SBOM, pip freeze, or container image scans.
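
For the inventory step, a version string from pip freeze or an SBOM can be checked against the patched releases named in the advisory. This is a rough sketch: `is_patched` and `installed_tensorflow_is_patched` are illustrative helpers, release lines older than 2.1 are assumed unpatched because the advisory lists no fix for them, and pre-release suffixes such as `rc0` are ignored, so release candidates deserve a manual check.

```python
import re
from importlib import metadata

# First patched release on each maintained line, per the advisory.
FIRST_PATCHED = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}


def is_patched(version: str) -> bool:
    """Return True if the version includes the fix for CVE-2021-29615."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if not match:
        raise ValueError(f"unrecognized version: {version!r}")
    major, minor, patch = map(int, match.groups())
    if (major, minor) >= (2, 5):
        return True
    fixed = FIRST_PATCHED.get((major, minor))
    # Lines with no listed fix (for example 2.0.x) are treated as unpatched.
    return fixed is not None and patch >= fixed


def installed_tensorflow_is_patched(package: str = "tensorflow"):
    """Check the locally installed package; None means it is not installed."""
    try:
        return is_patched(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None
```

TensorFlow is also distributed under other package names (for example `tensorflow-cpu`), so an audit would need to query each name in use.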

What systems are affected by CVE-2021-29615?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, model registries, CI/CD for ML.

What is the CVSS score for CVE-2021-29615?

CVE-2021-29615 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.20%.

What is the AI security impact?

Affected AI Architectures

training pipelines, model serving, model registries, CI/CD for ML

MITRE ATLAS Techniques

AML.T0011.000 Unsafe AI Artifacts

AML.T0029 Denial of AI Service

AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 15

ISO 42001: A.10.1

NIST AI RMF: MANAGE 2.2

OWASP LLM Top 10: LLM05

What are the technical details?

Original Advisory

TensorFlow is an end-to-end open source platform for machine learning. The implementation of `ParseAttrValue` (https://github.com/tensorflow/tensorflow/blob/c22d88d6ff33031aa113e48aa3fc9aa74ed79595/tensorflow/core/framework/attr_value_util.cc#L397-L453) can be tricked into stack overflow due to recursion by giving in a specially crafted input. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

Exploitation Scenario

An adversary crafts a malicious TensorFlow SavedModel or GraphDef containing deeply nested attribute value structures specifically designed to exhaust the call stack in ParseAttrValue. They upload this to a shared model registry used by the victim's ML platform, or inject it into a federated learning aggregation pipeline. When the victim's infrastructure loads and parses the graph—during model validation in CI, deployment to a TF Serving endpoint, or training checkpoint loading—the TF process crashes with a stack overflow. In a production model serving context, this results in inference downtime. In a training context, it forces job restarts and wastes GPU compute budget.
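
One way to blunt the registry scenario is to load only artifacts whose digest appears on an allowlist published through a trusted channel. The sketch below assumes the model is shipped as a single file (for example an archive of a SavedModel directory); `sha256_file` and `is_trusted_artifact` are illustrative names, and digest pinning is a simpler stand-in for full cryptographic signing.

```python
import hashlib
import hmac


def sha256_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted_artifact(path, allowed_digests):
    """Return True only if the file's digest matches an allowlisted value."""
    actual = sha256_file(path)
    return any(
        hmac.compare_digest(actual, expected.lower())
        for expected in allowed_digests
    )
```

The check should run before any TensorFlow code touches the file, so that an unlisted artifact is rejected without being parsed.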

Weaknesses (CWE)

CWE-674 Uncontrolled Recursion

CWE-674 — Uncontrolled Recursion: The product does not properly control the amount of recursion that takes place, consuming excessive resources, such as allocated memory or the program stack.

Potential mitigations (CWE-674):

[Implementation] Ensure that an end condition will be reached under all logic conditions. The end condition may include checking against the depth of recursion and exiting with an error if the recursion goes too deep. The complexity of the end condition contributes to the effectiveness of this action.
[Implementation] Increase the stack size.
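
The first mitigation, a depth check that exits with an error, can be illustrated with an iterative pre-scan that rejects over-nested text before a recursive parser sees it. This is a generic sketch and not TensorFlow's actual fix: the limit of 100, the bracket set, and the function names are assumptions, and brackets inside quoted strings are counted too, which a real parser would need to handle.

```python
MAX_DEPTH = 100  # hypothetical limit; tune to the parser's real stack budget


class NestingTooDeep(ValueError):
    """Raised when input nests more deeply than the configured limit."""


def max_nesting_depth(text, openers="{<[", closers="}>]"):
    """Scan once, without recursion, and return the deepest bracket nesting."""
    depth = deepest = 0
    for ch in text:
        if ch in openers:
            depth += 1
            deepest = max(deepest, depth)
        elif ch in closers:
            depth = max(depth - 1, 0)
    return deepest


def check_nesting(text, limit=MAX_DEPTH):
    """Reject text whose nesting exceeds the limit; otherwise return the depth."""
    depth = max_nesting_depth(text)
    if depth > limit:
        raise NestingTooDeep(f"nesting depth {depth} exceeds limit {limit}")
    return depth
```

Because the scan is a single loop, its own stack use is constant regardless of how deeply the input nests.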

Source: MITRE CWE corpus.