CVE-2021-29517: TensorFlow Conv3D div-by-zero

CISO Take

A malicious user with local access can crash any TensorFlow process using Conv3D by passing a crafted filter tensor with a zero fifth element. Upgrade immediately to TF 2.5.0 or to a patched maintenance release (2.4.2, 2.3.3, 2.2.3, or 2.1.4); if running untrusted tensor inputs through 3D CNN inference endpoints, add input validation as a defense-in-depth layer. No active exploitation has been reported and the CVE is not in CISA KEV, but unpatched training or serving infrastructure accepting external input is directly at risk.

What is the risk?

Medium severity (CVSS 5.5) with high availability impact locally. The low attack complexity and low privileges required make exploitation trivial for anyone with access to a TF-backed service or shared training environment. Blast radius is limited to process availability — no confidentiality or integrity impact. Organizations running TF inference APIs accessible to multiple internal users or tenants face the highest exposure; production model serving with strict input validation has lower practical risk.

What systems are affected?

Package	Ecosystem	Vulnerable Range	Patched
TensorFlow	pip	< 2.1.4; 2.2.0 to 2.2.2; 2.3.0 to 2.3.2; 2.4.0 to 2.4.1	2.1.4, 2.2.3, 2.3.3, 2.4.2, 2.5.0

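Do you use TensorFlow? You're affected if you run a release older than 2.5.0 that lacks the backported fix (2.4.2, 2.3.3, 2.2.3, and 2.1.4 include it).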
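
To check an installed version against the fixed releases named in the advisory, something like the following could work. This is a minimal sketch: `has_conv3d_fix` and `FIRST_FIXED_PATCH` are hypothetical names, and the handling of branches older than 2.1 and of pre-release suffixes is an assumption, not something the advisory states.

```python
import re

# First fixed patch release on each maintenance branch, as listed in the
# advisory. 2.5.0 and later carry the fix from the start.
FIRST_FIXED_PATCH = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}


def parse_version(version: str) -> tuple:
    """Parse 'major.minor.patch', ignoring suffixes such as 'rc1' or '+cpu'."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def has_conv3d_fix(version: str) -> bool:
    """Return True if the version is expected to contain the Conv3D fix."""
    major, minor, patch = parse_version(version)
    if (major, minor) >= (2, 5):
        # Assumption: a release candidate is treated like its final release.
        return True
    first_fixed = FIRST_FIXED_PATCH.get((major, minor))
    # Assumption: branches older than 2.1 never received the backport.
    return first_fixed is not None and patch >= first_fixed
```

In a real environment the argument would typically come from `tensorflow.__version__`.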

How severe is it?

CVSS 3.1

5.5 / 10

EPSS

0.2%

chance of exploitation in 30 days

Higher than 9% of all CVEs

Source: EPSS v3 — FIRST.org

Exploitation Status

Exploit Available

Exploitation: MEDIUM

Sophistication

Trivial

Exploitation Confidence

medium

○ Public PoC indexed (trickest/cve)

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

AV Local

AC Low

PR Low

UI None

S Unchanged

C None

I None

A High
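
For readers who want to see how these metrics produce the 5.5 base score, here is a small sketch of the CVSS v3.1 base-score arithmetic for the scope-unchanged case. The weights and the round-up rule follow the CVSS v3.1 specification; the function and constant names are illustrative only.

```python
import math

# Metric weights from the CVSS v3.1 specification (scope unchanged).
AV_LOCAL = 0.55
AC_LOW = 0.77
PR_LOW = 0.62
UI_NONE = 0.85
IMPACT_WEIGHT = {"N": 0.0, "L": 0.22, "H": 0.56}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in CVSS v3.1 Appendix A."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(conf: str, integ: str, avail: str) -> float:
    """Base score for AV:L/AC:L/PR:L/UI:N/S:U with the given C/I/A values."""
    iss = 1 - (
        (1 - IMPACT_WEIGHT[conf])
        * (1 - IMPACT_WEIGHT[integ])
        * (1 - IMPACT_WEIGHT[avail])
    )
    impact = 6.42 * iss
    exploitability = 8.22 * AV_LOCAL * AC_LOW * PR_LOW * UI_NONE
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# C:N / I:N / A:H, as listed above.
score = base_score("N", "N", "H")
```

With these inputs the impact sub-score is about 3.60 and the exploitability sub-score about 1.83; their sum rounds up to 5.5.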

What should I do?

5 steps

Patch: Upgrade to TensorFlow 2.5.0, or to a maintenance release that includes the cherrypicked fix (commit 799f835): 2.4.2, 2.3.3, 2.2.3, or 2.1.4.
Validate inputs server-side before passing to Conv3D: reject filters with any zero-sized dimension (at minimum, assert filter.shape[4] > 0) and verify that the input and filter shapes are compatible before op execution.
Run inference workers as unprivileged, isolated processes (containers/VMs) so a crash doesn't cascade to the host.
Implement request rate limiting and input size/shape bounds on any public or internal TF Serving endpoints.
Detection: monitor for abrupt TF process exits (SIGABRT from Eigen assertion) or crash loops in serving pods — alert on repeated abnormal terminations.
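
The input-validation step above might be sketched as a pre-flight shape check that runs before any tensor reaches Conv3D. This is an illustration under stated assumptions, not TensorFlow's own check: `validate_conv3d_shapes` and `InvalidConv3DShape` are hypothetical names, the filter layout is the documented `[depth, height, width, in_channels, out_channels]`, and the divisibility rule is inferred from the advisory's mention of a modulo operation.

```python
from typing import Sequence


class InvalidConv3DShape(ValueError):
    """Raised when shapes should not be handed to Conv3D."""


def validate_conv3d_shapes(
    input_shape: Sequence[int],
    filter_shape: Sequence[int],
    data_format: str = "NDHWC",
) -> None:
    """Reject shape combinations before they reach the Conv3D kernel."""
    if len(input_shape) != 5 or len(filter_shape) != 5:
        raise InvalidConv3DShape("input and filter must both be 5-D")

    # Reject every zero-sized filter dimension, not only filter_shape[4],
    # so the check does not depend on which element the kernel divides by.
    if any(dim <= 0 for dim in filter_shape):
        raise InvalidConv3DShape(f"filter has a non-positive dimension: {list(filter_shape)}")

    if data_format == "NDHWC":
        in_channels = input_shape[4]
    elif data_format == "NCDHW":
        in_channels = input_shape[1]
    else:
        raise InvalidConv3DShape(f"unsupported data_format: {data_format!r}")

    filter_in_channels = filter_shape[3]
    # Assumption: input channels must be a multiple of the filter's
    # in_channels. Use strict equality if the model has no grouped convs.
    if in_channels % filter_in_channels != 0:
        raise InvalidConv3DShape(
            f"input channels ({in_channels}) not divisible by "
            f"filter in_channels ({filter_in_channels})"
        )
```

A serving wrapper could call this on the declared shapes of each request and return a client error instead of invoking the op.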

How is it classified?

DoS Framework Inference

AML.T0010.001 - AI Software

AML.T0029 - Denial of AI Service

AML.T0049 - Exploit Public-Facing Application

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act

Article 15 - Accuracy, robustness and cybersecurity for high-risk AI

ISO 42001

A.6.2.6 - AI system robustness and resilience

NIST AI RMF

GOVERN 6.2 - Organizational practices for AI risk management

MANAGE 2.4 - Risks from third-party AI components

Frequently Asked Questions

What is CVE-2021-29517?

CVE-2021-29517 is a division-by-zero flaw in TensorFlow's Conv3D implementation. The kernel performs a modulo operation on user-controlled input, so a filter tensor with a 0 as the fifth element triggers a division by zero; invalid input and filter shapes can also trigger an Eigen assertion. Either path crashes the TensorFlow process. The fix ships in TensorFlow 2.5.0 and in the 2.4.2, 2.3.3, 2.2.3, and 2.1.4 maintenance releases.

Is CVE-2021-29517 actively exploited?

No active exploitation of CVE-2021-29517 has been reported, and it is not listed in CISA KEV. Proof-of-concept exploit code is publicly available, however, which increases the risk of exploitation.

How to fix CVE-2021-29517?

1. Patch: Upgrade to TensorFlow 2.5.0, or to a maintenance release that includes the cherrypicked fix (commit 799f835): 2.4.2, 2.3.3, 2.2.3, or 2.1.4.
2. Validate inputs server-side before passing to Conv3D: assert filter.shape[4] > 0 and verify tensor shape compatibility before op execution.
3. Run inference workers as unprivileged, isolated processes (containers/VMs) so a crash doesn't cascade to the host.
4. Implement request rate limiting and input size/shape bounds on any public or internal TF Serving endpoints.
5. Detection: monitor for abrupt TF process exits (SIGABRT from Eigen assertion) or crash loops in serving pods, and alert on repeated abnormal terminations.

What systems are affected by CVE-2021-29517?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, inference, shared ML compute environments.

What is the CVSS score for CVE-2021-29517?

CVE-2021-29517 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.19%.

What is the AI security impact?

Affected AI Architectures

training pipelines, model serving, inference, shared ML compute environments

MITRE ATLAS Techniques

AML.T0010.001 AI Software

AML.T0029 Denial of AI Service

AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 15

ISO 42001: A.6.2.6

NIST AI RMF: GOVERN 6.2, MANAGE 2.4

What are the technical details?

Original Advisory

TensorFlow is an end-to-end open source platform for machine learning. A malicious user could trigger a division by 0 in the `Conv3D` implementation. The implementation (https://github.com/tensorflow/tensorflow/blob/42033603003965bffac51ae171b51801565e002d/tensorflow/core/kernels/conv_ops_3d.cc#L143-L145) does a modulo operation based on user controlled input. Thus, when `filter` has a 0 as the fifth element, this results in a division by 0. Additionally, if the shape of the two tensors is not valid, an Eigen assertion can be triggered, resulting in a program crash. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

Exploitation Scenario

An adversary with access to an internal model serving API (e.g., TensorFlow Serving or a custom Flask/FastAPI wrapper) submits a Conv3D inference request with a filter tensor whose fifth dimension is 0. The Conv3D kernel at conv_ops_3d.cc performs a modulo on this user-controlled value, triggering an integer division by zero and crashing the worker process. In a Kubernetes deployment without proper pod restart backoff, the adversary repeats the request to keep the replica in a crash loop, effectively taking the inference endpoint offline. In a shared GPU cluster, the same technique kills a co-tenant's training job, causing data loss for that run.
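
The crash-loop pattern in this scenario can be caught with a simple sliding-window counter over abnormal worker exits. The sketch below is one possible approach, not part of any TensorFlow or Kubernetes API: `CrashLoopDetector` is a hypothetical class, the thresholds are arbitrary, and it assumes exit codes follow Python's `subprocess` convention of a negative signal number. On Linux an integer division by zero typically surfaces as SIGFPE, while an Eigen assertion aborts with SIGABRT, so both are counted.

```python
import signal
from collections import deque

# subprocess reports death-by-signal as a negative return code.
ABNORMAL_EXITS = {-signal.SIGABRT, -signal.SIGFPE}


class CrashLoopDetector:
    """Flags a crash loop when too many abnormal exits fall in one window."""

    def __init__(self, max_crashes: int = 3, window_seconds: float = 300.0):
        self.max_crashes = max_crashes
        self.window_seconds = window_seconds
        self._crash_times = deque()

    def record_exit(self, return_code: int, timestamp: float) -> bool:
        """Record a worker exit; return True if an alert should fire."""
        if return_code not in ABNORMAL_EXITS:
            return False
        self._crash_times.append(timestamp)
        # Drop crashes that have aged out of the sliding window.
        while timestamp - self._crash_times[0] > self.window_seconds:
            self._crash_times.popleft()
        return len(self._crash_times) >= self.max_crashes
```

A supervisor would call `record_exit` with each worker's return code and the current time, and page an operator or block the offending client when it returns True.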