CVE-2021-29615: TensorFlow: uncontrolled recursion DoS in ParseAttrValue

Severity: MEDIUM | PoC available
Published May 14, 2021

CISO Take

A crafted TensorFlow graph attribute triggers unbounded recursion in ParseAttrValue, which overflows the stack and crashes any TF process that parses it: training jobs, model servers, or pipelines loading untrusted graphs. Impact is availability-only (no confidentiality or integrity loss). Upgrade to TF 2.5.0, or to the patched maintenance release on your branch (2.4.2, 2.3.3, 2.2.3, or 2.1.4). If patching is delayed, restrict TF graph parsing to signed, trusted sources.

What is the risk?

Medium operational risk. The local attack vector limits direct remote exploitability, but ML pipelines routinely process model files from semi-trusted sources such as artifact registries, collaborative training setups, and third-party checkpoints. Any system that loads external TensorFlow graphs or SavedModels on a vulnerable version is exposed. Low attack complexity means exploitation is straightforward once the attacker can influence parsed inputs. No active exploitation has been reported, and the CVE is not in CISA KEV.

What systems are affected?

Package | Ecosystem | Vulnerable Range | Patched
TensorFlow | pip | not listed | 2.5.0, 2.4.2, 2.3.3, 2.2.3, 2.1.4

Do you use TensorFlow? You are affected if you run a release without the fix (see the patched versions in the advisory).

How severe is it?

CVSS 3.1: 5.5 / 10
EPSS: 0.2% chance of exploitation in 30 days (higher than 10% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Moderate
Exploitation Confidence: medium
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

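Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High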

What should I do?

5 steps
  1. Patch: Upgrade to TF 2.5.0 or later, or to the patched maintenance release on your branch: 2.4.2, 2.3.3, 2.2.3, or 2.1.4.

  2. Input trust boundary: Reject untrusted or unsigned TF graphs before they reach ParseAttrValue, and validate model provenance via cryptographic signing.

  3. Process isolation: Run model loading in sandboxed child processes with ulimit stack constraints to bound crash blast radius.

  4. Detection: Alert on unexpected TF process crashes or SIGSEGV signals in serving and training infrastructure logs.

  5. Inventory: Audit all TF versions in use via SBOM, pip freeze, or container image scans.
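
Steps 3 and 4 above can be sketched together: run the loader in a child process with a capped stack, then classify the exit so that a signal-induced crash can be alerted on. This is a minimal POSIX-only sketch, not a hardened sandbox. The 8 MiB cap and the `load_model.py` script name are assumptions for illustration, not values from the advisory.

```python
import resource
import signal
import subprocess
import sys

STACK_LIMIT_BYTES = 8 * 1024 * 1024  # assumed cap; tune per workload


def _limit_stack():
    """Runs in the child before exec: cap the soft stack limit, never raise it."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    finite = [v for v in (soft, hard) if v != resource.RLIM_INFINITY]
    new_soft = min([STACK_LIMIT_BYTES] + finite)
    resource.setrlimit(resource.RLIMIT_STACK, (new_soft, hard))


def run_isolated(argv, timeout=300):
    """Run argv in a child process and classify how it ended.

    Returns ("ok", 0), ("error", exit_code) or ("crash", signal_name).
    """
    proc = subprocess.run(argv, preexec_fn=_limit_stack, timeout=timeout)
    if proc.returncode < 0:
        # A negative return code means the child was killed by a signal.
        return "crash", signal.Signals(-proc.returncode).name
    if proc.returncode != 0:
        return "error", proc.returncode
    return "ok", 0


def load_model_isolated(model_path):
    """Hypothetical wrapper: load_model.py stands in for your own loader script."""
    return run_isolated([sys.executable, "load_model.py", model_path])
```

A result of `("crash", "SIGSEGV")` is the signal worth alerting on. The stack cap does not prevent the overflow; it only keeps the crash inside the child so the parent service stays up.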

How is it classified?

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.10.1 - AI system supply chain management
NIST AI RMF
MANAGE 2.2 - Mechanisms are in place to respond to and recover from AI risks
OWASP LLM Top 10
LLM05 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2021-29615?

CVE-2021-29615 is an uncontrolled recursion flaw (CWE-674) in TensorFlow's ParseAttrValue. A crafted graph attribute drives the function into unbounded recursion, overflowing the stack and crashing any TF process that parses it. Impact is availability-only (no confidentiality or integrity loss). The fix ships in TF 2.5.0 and in maintenance releases 2.4.2, 2.3.3, 2.2.3, and 2.1.4.

Is CVE-2021-29615 actively exploited?

No active exploitation has been reported, and CVE-2021-29615 is not listed in CISA KEV. Proof-of-concept exploit code is publicly available, however, which increases the risk of exploitation.

How to fix CVE-2021-29615?

  1. Patch: Upgrade to TF 2.5.0 or later, or to the patched maintenance release on your branch: 2.4.2, 2.3.3, 2.2.3, or 2.1.4.

  2. Input trust boundary: Reject untrusted or unsigned TF graphs before they reach ParseAttrValue, and validate model provenance via cryptographic signing.

  3. Process isolation: Run model loading in sandboxed child processes with ulimit stack constraints to bound crash blast radius.

  4. Detection: Alert on unexpected TF process crashes or SIGSEGV signals in serving and training infrastructure logs.

  5. Inventory: Audit all TF versions in use via SBOM, pip freeze, or container image scans.

What systems are affected by CVE-2021-29615?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, model registries, CI/CD for ML.

What is the CVSS score for CVE-2021-29615?

CVE-2021-29615 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.20%.

What is the AI security impact?

Affected AI Architectures

training pipelines, model serving, model registries, CI/CD for ML

MITRE ATLAS Techniques

AML.T0011.000 Unsafe AI Artifacts
AML.T0029 Denial of AI Service
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 15
ISO 42001: A.10.1
NIST AI RMF: MANAGE 2.2
OWASP LLM Top 10: LLM05

What are the technical details?

Original Advisory

TensorFlow is an end-to-end open source platform for machine learning. The implementation of `ParseAttrValue` (https://github.com/tensorflow/tensorflow/blob/c22d88d6ff33031aa113e48aa3fc9aa74ed79595/tensorflow/core/framework/attr_value_util.cc#L397-L453) can be tricked into a stack overflow due to recursion when given a specially crafted input. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
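
The fixed releases named in the advisory can be turned into a simple inventory check. The sketch below is illustrative: the helper names are hypothetical, it accepts only plain `X.Y.Z` version strings, and it assumes (conservatively) that branches older than 2.1 never received the fix, since the advisory lists no patch for them.

```python
import re

# First fixed patch release per maintained branch, as listed in the advisory.
FIXED_PATCH_BY_BRANCH = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}
FIRST_FIXED_BRANCH = (2, 5)  # 2.5.0 and later include the fix


def parse_version(text):
    """Parse a plain 'X.Y.Z' string; pre-release or local suffixes are rejected."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unsupported version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(version_text):
    """Return True if the version contains the CVE-2021-29615 fix."""
    major, minor, patch = parse_version(version_text)
    if (major, minor) >= FIRST_FIXED_BRANCH:
        return True
    fixed_patch = FIXED_PATCH_BY_BRANCH.get((major, minor))
    # Assumption: branches absent from the table (older than 2.1) are unpatched.
    return fixed_patch is not None and patch >= fixed_patch
```

With TensorFlow installed, `is_patched(tf.__version__)` gives a quick per-environment answer. Pre-release strings such as release candidates raise `ValueError` here on purpose, so they get reviewed by hand rather than guessed at.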

Exploitation Scenario

An adversary crafts a malicious TensorFlow SavedModel or GraphDef containing deeply nested attribute value structures designed to exhaust the call stack in ParseAttrValue. They upload it to a shared model registry used by the victim's ML platform, or inject it into a federated learning aggregation pipeline. When the victim's infrastructure loads and parses the graph (during model validation in CI, deployment to a TF Serving endpoint, or training checkpoint loading), the TF process crashes with a stack overflow. In production model serving, this causes inference downtime. In training, it forces job restarts and wastes GPU compute budget.
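
One way to gate the registry path described above is to refuse any artifact whose digest is not already known. The sketch below uses a SHA-256 allowlist as a simpler stand-in for full cryptographic signature verification; the function names and the idea of a digest allowlist are assumptions for illustration, not part of the advisory.

```python
import hashlib
import hmac


def sha256_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted_artifact(path, trusted_digests):
    """Return True only if the file's digest appears in the allowlist.

    trusted_digests is an iterable of hex SHA-256 strings that would come
    from a source the attacker cannot write to (hypothetical here).
    """
    actual = sha256_file(path)
    return any(
        hmac.compare_digest(actual, expected.lower())
        for expected in trusted_digests
    )
```

The check has to run before any TensorFlow call touches the file; once the graph reaches the parser on a vulnerable version, it is too late.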

Weaknesses (CWE)

CWE-674 — Uncontrolled Recursion: The product does not properly control the amount of recursion that takes place, consuming excessive resources, such as allocated memory or the program stack.

  • [Implementation] Ensure that an end condition will be reached under all logic conditions. The end condition may include checking against the depth of recursion and exiting with an error if the recursion goes too deep. The complexity of the end condition contributes to the effectiveness of this action.
  • [Implementation] Increase the stack size.

Source: MITRE CWE corpus.
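
The first mitigation above (checking recursion depth and exiting with an error) can be illustrated with a toy recursive-descent parser. This is a generic sketch of the technique, not TensorFlow's actual fix: the bracketed-integer grammar, the `NestingTooDeep` error, and the limit of 100 are all assumptions chosen for the example.

```python
MAX_DEPTH = 100  # assumed limit for illustration


class NestingTooDeep(ValueError):
    """Raised when the input nests deeper than the configured limit."""


def parse_nested(text, max_depth=MAX_DEPTH):
    """Parse a bracketed list of integers such as '[1,[2,3]]'."""
    value, pos = _parse_value(text, 0, 0, max_depth)
    if pos != len(text):
        raise ValueError(f"unexpected trailing input at offset {pos}")
    return value


def _parse_value(text, pos, depth, max_depth):
    # The depth check is the end condition that bounds the recursion.
    if depth > max_depth:
        raise NestingTooDeep(f"nesting exceeds {max_depth} levels")
    if pos < len(text) and text[pos] == "[":
        items = []
        pos += 1
        if pos < len(text) and text[pos] == "]":
            return items, pos + 1
        while True:
            item, pos = _parse_value(text, pos, depth + 1, max_depth)
            items.append(item)
            if pos < len(text) and text[pos] == ",":
                pos += 1
                continue
            if pos < len(text) and text[pos] == "]":
                return items, pos + 1
            raise ValueError(f"expected ',' or ']' at offset {pos}")
    end = pos
    while end < len(text) and text[end].isdigit():
        end += 1
    if end == pos:
        raise ValueError(f"expected a digit or '[' at offset {pos}")
    return int(text[pos:end]), end
```

Because the limit is enforced before each descent, a pathological input fails with a catchable error after a bounded number of frames instead of exhausting the stack.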

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
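
As a cross-check, the 5.5 base score can be recomputed from this vector with the CVSS v3.1 base-score equations. The sketch below covers only Scope: Unchanged vectors, which is all this CVE needs; the function names are illustrative.

```python
import math

# CVSS v3.1 metric weights (PR values are those for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a CVSS:3.1 vector with Scope: Unchanged."""
    prefix, *parts = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    metrics = dict(part.split(":") for part in parts)
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))
```

For this vector the impact sub-score is about 3.60 and the exploitability sub-score about 1.83; their sum of roughly 5.43 rounds up to 5.5. Nearly all of the score comes from the High availability impact, while the Local attack vector keeps exploitability low.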

Timeline

Published
May 14, 2021
Last Modified
November 21, 2024
First Seen
May 14, 2021

Related Vulnerabilities