CVE-2021-29589: TFLite GatherNd: divide-by-zero crashes inference runtime

HIGH PoC AVAILABLE
Published May 14, 2021
CISO Take

Any pipeline ingesting externally-sourced .tflite models is exposed—the 'local' attack vector is misleading in ML contexts where model files routinely cross trust boundaries through registries, artifact stores, and third-party sharing. An attacker who can substitute a crafted model (supply chain, poisoned registry, compromised bucket) crashes the TFLite runtime immediately on inference. Patch to TF 2.5.0 / 2.4.2 / 2.3.3 / 2.2.3 / 2.1.4 and enforce cryptographic model provenance before loading any artifact.

Risk Assessment

CVSS 7.8 (High), with a local attack vector and low attack complexity. In ML environments the effective risk is higher than the raw score suggests: model files routinely move across organizational trust boundaries via MLOps pipelines, model hubs, and update channels, so the 'local' prerequisite can be met remotely. The CVE is not in CISA KEV and dates from 2021, which reduces urgency for patched organizations. Unpatched TFLite deployments, common in embedded/edge contexts with infrequent update cycles, remain a realistic target. EPSS is very low (0.01%).

Affected Systems

Package      Ecosystem   Vulnerable Range                                           Patched
tensorflow   pip         < 2.1.4; 2.2.0 to 2.2.2; 2.3.0 to 2.3.2; 2.4.0 to 2.4.1    2.1.4, 2.2.3, 2.3.3, 2.4.2, 2.5.0


Severity & Risk

CVSS 3.1
7.8 / 10
EPSS
0.01%
chance of exploitation in 30 days
Higher than 1% of all CVEs
Exploitation Status
Exploit Available
Exploitation: MEDIUM
Sophistication
Moderate
Exploitation Confidence
medium
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

AV Local
AC Low
PR Low
UI None
S Unchanged
C High
I High
A High

Recommended Action

5 steps
  1. PATCH

    Upgrade to TensorFlow 2.5.0, or to the patched backport release on your branch: 2.4.2, 2.3.3, 2.2.3, or 2.1.4. Prioritize embedded/edge fleets, where patch cadence is slow.

  2. MODEL PROVENANCE

    Implement cryptographic signing (e.g., Sigstore/TUF) and verification for all .tflite artifacts before runtime loading. Reject unsigned or unverified models.

  3. INPUT VALIDATION

    Add pre-load validation that rejects models in which the GatherNd params input is an empty tensor (any dimension of size zero) before execution.

  4. SANDBOXING

    Run TFLite inference in isolated processes with minimal privileges to contain blast radius of runtime crashes.

  5. DETECTION

    Alert on unexpected crashes (SIGFPE/SIGILL) in inference processes, anomalous model file changes in registries or artifact stores, and model files sourced outside approved repositories.
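
The sandboxing and detection steps above can be approximated together by running inference in a child process and classifying how it exited. The following is a minimal Python sketch, not a hardened sandbox: `run_isolated` and `classify_exit` are hypothetical helper names, the worker command is a placeholder, and the sketch assumes a POSIX host where a signal-killed child reports a negative return code. Privilege reduction (a dedicated user, containers, syscall filtering) would still need to be layered on top.

```python
import signal
import subprocess
import sys

# Signals that usually indicate a native crash in the inference runtime.
CRASH_SIGNALS = {signal.SIGFPE, signal.SIGILL, signal.SIGSEGV, signal.SIGABRT}


def classify_exit(returncode):
    """Return 'ok', 'crash:<SIGNAL>', 'signal:<n or name>' or 'error:<code>'."""
    if returncode == 0:
        return "ok"
    if returncode < 0:  # POSIX: child was terminated by signal -returncode
        try:
            sig = signal.Signals(-returncode)
        except ValueError:
            return f"signal:{-returncode}"
        kind = "crash" if sig in CRASH_SIGNALS else "signal"
        return f"{kind}:{sig.name}"
    return f"error:{returncode}"


def run_isolated(cmd, timeout=30):
    """Run an inference worker (hypothetical command) in a separate process."""
    try:
        proc = subprocess.run(cmd, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return "timeout"
    return classify_exit(proc.returncode)


if __name__ == "__main__":
    # Placeholder worker; a real one would load the model and run inference.
    status = run_isolated([sys.executable, "-c", "pass"])
    print(status)
    if status.startswith("crash:"):
        print("ALERT: inference worker crashed", file=sys.stderr)
```

Because the crash happens in the child, the parent survives and can raise an alert or quarantine the model file instead of going down with the runtime.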

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Art. 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.1.6 - AI system risk treatment and controls
NIST AI RMF
MANAGE 2.2 - Mechanisms to sustain AI system trustworthiness are applied
OWASP LLM Top 10
LLM05 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2021-29589?

CVE-2021-29589 is a division-by-zero vulnerability in the reference implementation of the TensorFlow Lite `GatherNd` operator. A crafted model whose `params` input is an empty tensor makes the kernel divide by a dimension of size zero, which crashes the TFLite runtime on inference. Any pipeline ingesting externally sourced .tflite models is exposed: the 'local' attack vector is misleading in ML contexts, where model files routinely cross trust boundaries through registries, artifact stores, and third-party sharing. An attacker who can substitute a crafted model (supply chain, poisoned registry, compromised bucket) can trigger the crash. Patch to TF 2.5.0 / 2.4.2 / 2.3.3 / 2.2.3 / 2.1.4 and enforce cryptographic model provenance before loading any artifact.

Is CVE-2021-29589 actively exploited?

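CVE-2021-29589 is not listed in CISA KEV, so there is no confirmed in-the-wild exploitation recorded there. However, proof-of-concept exploit code is publicly available (indexed in trickest/cve), which increases the risk of exploitation.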

How to fix CVE-2021-29589?

1. PATCH: Upgrade to TensorFlow 2.5.0, or to the patched backport release on your branch: 2.4.2, 2.3.3, 2.2.3, or 2.1.4. Prioritize embedded/edge fleets, where patch cadence is slow.
2. MODEL PROVENANCE: Implement cryptographic signing (e.g., Sigstore/TUF) and verification for all .tflite artifacts before runtime loading. Reject unsigned or unverified models.
3. INPUT VALIDATION: Add pre-load validation that rejects models in which the GatherNd params input is an empty tensor (any dimension of size zero) before execution.
4. SANDBOXING: Run TFLite inference in isolated processes with minimal privileges to contain the blast radius of runtime crashes.
5. DETECTION: Alert on unexpected crashes (SIGFPE/SIGILL) in inference processes, anomalous model file changes in registries or artifact stores, and model files sourced outside approved repositories.

What systems are affected by CVE-2021-29589?

This vulnerability affects the following AI/ML architecture patterns: edge AI / on-device inference (TFLite), mobile ML applications (Android/iOS), model serving pipelines (TFLite runtime), MLOps artifact stores and model registries, embedded/IoT inference systems.

What is the CVSS score for CVE-2021-29589?

CVE-2021-29589 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.01%.

Technical Details

NVD Description

TensorFlow is an end-to-end open source platform for machine learning. The reference implementation of the `GatherNd` TFLite operator is vulnerable to a division by zero error (https://github.com/tensorflow/tensorflow/blob/0d45ea1ca641b21b73bcf9c00e0179cda284e7e7/tensorflow/lite/kernels/internal/reference/reference_ops.h#L966). An attacker can craft a model such that the `params` input would be an empty tensor. In turn, `params_shape.Dims(.)` would be zero in at least one dimension. The fix will be included in TensorFlow 2.5.0. We will also cherry-pick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
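
To make the failure mode concrete, here is a simplified Python model of the arithmetic described above. It is a sketch, not the TFLite source: it assumes the kernel derives per-dimension strides by repeatedly dividing the remaining flat size by each leading dimension of `params`, and the function names are hypothetical. Python raises `ZeroDivisionError` where the C++ kernel would typically crash the process. The `find_empty_tensors` helper assumes input shaped like the list of dicts returned by `tf.lite.Interpreter.get_tensor_details()` (each with `name` and `shape` keys), and is deliberately conservative: it flags every empty tensor, not only those feeding `GatherNd`.

```python
def gather_nd_strides(params_shape, indices_nd):
    """Simplified model of the stride computation (not the TFLite source)."""
    remain = 1
    for d in params_shape:
        remain *= d  # flat size of params; 0 if any dimension is 0
    strides = []
    for i in range(indices_nd):
        # Integer division by a dimension size: fails when that size is 0.
        remain = remain // params_shape[i]
        strides.append(remain)
    return strides


def has_empty_dim(shape):
    """True if any dimension has size zero, i.e. the tensor is empty."""
    return any(int(d) == 0 for d in shape)


def find_empty_tensors(tensor_details):
    """Names of tensors with a zero-sized dimension.

    tensor_details: list of dicts with 'name' and 'shape' keys, e.g. the
    result of tf.lite.Interpreter(...).get_tensor_details() (assumption:
    TensorFlow is installed and the model parses on a patched runtime).
    """
    return [t["name"] for t in tensor_details if has_empty_dim(t["shape"])]


if __name__ == "__main__":
    print(gather_nd_strides((2, 3, 4), 2))  # well-formed params
    try:
        gather_nd_strides((0, 3), 1)  # empty params tensor
    except ZeroDivisionError:
        print("empty params tensor triggers division by zero")
```

A pre-load check along these lines would reject the model before any operator runs, which is the intent of the input-validation recommendation.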

Exploitation Scenario

An adversary targeting an organization's mobile AI application discovers that the app downloads updated .tflite models from an object storage bucket without verifying file integrity or provenance. The attacker compromises the storage bucket (via a leaked access key or misconfigured IAM) and replaces a legitimate model with a crafted variant containing a GatherNd operator whose params input is an empty tensor. On the next model update cycle, devices load the poisoned model; the TFLite runtime performs a division by params_shape.Dims(.), which is zero, causing an immediate crash. On embedded IoT devices with no watchdog or auto-restart, this results in a persistent denial of service requiring manual intervention. For broader impact, the adversary could inject the crafted model at the CI/CD export stage, affecting all downstream consumers of a shared model artifact.
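
The scenario above hinges on the client loading whatever bytes the bucket serves. A minimal Python sketch of a digest-pinning check follows. It is an integrity check only, weaker than the signing schemes recommended earlier, and it assumes the expected SHA-256 digest reaches the device through a channel the attacker does not control (for example, shipped inside the signed app). `load_verified_model` is a hypothetical helper name.

```python
import hashlib
import hmac


def load_verified_model(path, expected_sha256_hex):
    """Read a model file once and return its bytes only if the digest matches.

    expected_sha256_hex is assumed to come from a trusted channel that is
    separate from the storage bucket serving the model.
    """
    with open(path, "rb") as f:
        data = f.read()
    actual = hashlib.sha256(data).hexdigest()
    # Constant-time comparison of the two hex strings.
    if not hmac.compare_digest(actual, expected_sha256_hex.lower()):
        raise ValueError(f"model digest mismatch for {path}")
    return data


# Usage (assumption: TensorFlow is installed; model_content accepts raw bytes):
#   model_bytes = load_verified_model("model.tflite", PINNED_DIGEST)
#   interpreter = tf.lite.Interpreter(model_content=model_bytes)
```

Hashing and returning the same in-memory bytes avoids a window in which the file could be swapped between verification and loading.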

Weaknesses (CWE)

CWE-369: Divide By Zero

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
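
For readers who want to check how this vector produces the 7.8 score, the following Python sketch applies the CVSS v3.1 base-score equations to a vector string. It covers base metrics only (no temporal or environmental metrics), and `base_score` and `roundup` are names chosen here for illustration.

```python
import math

WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on whether Scope is Unchanged (U) or Changed (C).
PR_WEIGHTS = {"U": {"N": 0.85, "L": 0.62, "H": 0.27},
              "C": {"N": 0.85, "L": 0.68, "H": 0.5}}


def roundup(x):
    """Round up to one decimal place, as defined in the CVSS v3.1 spec."""
    i = round(x * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the CVSS v3.1 base score from a vector string."""
    parts = dict(p.split(":") for p in vector.split("/")[1:])
    scope = parts["S"]
    iss = 1 - ((1 - WEIGHTS["C"][parts["C"]])
               * (1 - WEIGHTS["I"][parts["I"]])
               * (1 - WEIGHTS["A"][parts["A"]]))
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = (8.22 * WEIGHTS["AV"][parts["AV"]]
                      * WEIGHTS["AC"][parts["AC"]]
                      * PR_WEIGHTS[scope][parts["PR"]]
                      * WEIGHTS["UI"][parts["UI"]])
    if impact <= 0:
        return 0.0
    if scope == "U":
        return roundup(min(impact + exploitability, 10))
    return roundup(min(1.08 * (impact + exploitability), 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H"))  # 7.8
```

Most of the score comes from the High confidentiality, integrity, and availability ratings; the local attack vector is what keeps exploitability low.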

Timeline

Published
May 14, 2021
Last Modified
November 21, 2024
First Seen
May 14, 2021

Related Vulnerabilities