CVE-2021-29543: TensorFlow DoS via CHECK assertion failure

CISO Take

A low-privileged local attacker can crash any TensorFlow process that uses CTCGreedyDecoder by supplying malformed inputs that trigger an unhandled CHECK assertion, which aborts the process. In shared ML inference or training infrastructure (Jupyter servers, multi-tenant GPU clusters, or exposed TF Serving endpoints) this is a realistic availability threat. Upgrade to TensorFlow 2.5.0, 2.4.2, 2.3.3, 2.2.3, or 2.1.4 immediately; patching is the only complete fix, and the isolation and input-validation steps under "What should I do?" only limit the impact until then.

What is the risk?

Medium risk in isolated single-tenant environments; elevated in shared or multi-tenant ML infrastructure. The local attack vector limits internet-reachable exposure, but 'local' in ML contexts often means a shared notebook server, containerized training job, or internal API endpoint—all reachable by semi-trusted insiders or compromised co-tenants. Exploitability is trivial (craft out-of-range tensor dimensions), but the impact is limited to availability with no confidentiality or integrity breach. CVSS 5.5 is appropriate; contextual risk can reach HIGH in production inference pipelines with no process isolation.

What systems are affected?

Package	Ecosystem	Vulnerable Range	Patched
TensorFlow	pip	< 2.1.4; 2.2.0 to 2.2.2; 2.3.0 to 2.3.2; 2.4.0 to 2.4.1	2.1.4, 2.2.3, 2.3.3, 2.4.2, 2.5.0

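Do you use TensorFlow? You are affected if you run a release without the fix, that is, anything older than 2.5.0, 2.4.2, 2.3.3, 2.2.3, or 2.1.4 on its release line.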

How severe is it?

CVSS 3.1

5.5 / 10

EPSS

0.2%

chance of exploitation in 30 days

Higher than 9% of all CVEs

Source: EPSS v3 — FIRST.org

Exploitation Status

Exploit Available

Exploitation: MEDIUM

Sophistication

Trivial

Exploitation Confidence

medium

○ Public PoC indexed (trickest/cve)

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

AV Local

AC Low

PR Low

UI None

S Unchanged

C None

I None

A High
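
The 5.5 base score can be reproduced from the metrics above. The following is a minimal sketch of the CVSS v3.1 base-score arithmetic for the Scope: Unchanged case only; the function and table names are illustrative, and the weights are the ones published in the CVSS v3.1 specification.

```python
import math

# Numeric weights from the CVSS v3.1 specification (Scope: Unchanged only).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.20}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a) -> float:
    """Base score for a vector whose Scope metric is Unchanged."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# AV:L / AC:L / PR:L / UI:N / S:U / C:N / I:N / A:H, as listed above.
score = base_score_scope_unchanged("L", "L", "L", "N", "N", "N", "H")
print(score)  # 5.5
```

Only the availability metric contributes to impact here, which is why the score stays in the medium band despite the trivial exploitability.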

What should I do?

6 steps

PATCH

Upgrade to TensorFlow 2.5.0, 2.4.2, 2.3.3, 2.2.3, or 2.1.4; all contain the fix (commit ea3b43e).

VERIFY

Run `python -c "import tensorflow as tf; print(tf.__version__)"` across all ML infrastructure nodes.

ISOLATE

If patching is delayed, run TF inference processes in separate containers/pods so a crash does not cascade.

VALIDATE INPUT

Add bounds checks on sequence length inputs before passing to CTCGreedyDecoder.

MONITOR

Alert on abnormal TF process terminations (exit code != 0) in serving infrastructure.

RESTRICT

Limit which users can submit raw ops to TF Serving endpoints; do not expose `tf.raw_ops` to untrusted callers.
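
For the VERIFY step, the version string still has to be compared against the patched releases. A minimal sketch of that comparison follows; `is_patched` and the lookup table are hypothetical helpers written for this page, not part of TensorFlow. It assumes that release lines older than 2.1 never received a fix and that pre-release suffixes such as `rc0` can be ignored, which may not hold for release candidates of 2.5.0.

```python
import re

# Patched releases named in the advisory, keyed by (major, minor) release line.
PATCHED = {
    (2, 1): (2, 1, 4),
    (2, 2): (2, 2, 3),
    (2, 3): (2, 3, 3),
    (2, 4): (2, 4, 2),
}
FIRST_FIXED_LINE = (2, 5)  # 2.5.0 and every later release line include the fix


def is_patched(version: str) -> bool:
    """Return True if the version string is at or above the fix for its release line."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if (major, minor) >= FIRST_FIXED_LINE:
        return True
    fixed = PATCHED.get((major, minor))
    if fixed is None:
        # Assumption: release lines before 2.1 have no patched release.
        return False
    return (major, minor, patch) >= fixed


for candidate in ("2.4.1", "2.4.2", "2.5.0", "1.15.5"):
    print(candidate, "patched" if is_patched(candidate) else "VULNERABLE")
```

On a real node the argument would be `tf.__version__`, as printed by the command in the VERIFY step.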

How is it classified?

DoS, Framework, Inference

AML.T0029 - Denial of AI Service

AML.T0034 - Cost Harvesting

AML.T0049 - Exploit Public-Facing Application

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act

Article 15 - Accuracy, robustness and cybersecurity for high-risk AI

ISO 42001

A.9.7 - Robustness and reliability of AI systems

NIST AI RMF

MS-2.5 - Measure: AI system availability and reliability

OWASP LLM Top 10

LLM10:2025 - Unbounded Consumption (LLM04 - Model Denial of Service in the 2023 list)

Frequently Asked Questions

What is CVE-2021-29543?

CVE-2021-29543 is a denial-of-service vulnerability in TensorFlow's `tf.raw_ops.CTCGreedyDecoder`. Malformed inputs trip a `CHECK_LT` assertion in the kernel, and the process aborts instead of returning an error to the caller. A low-privileged local attacker can use this to crash shared inference or training processes. The fix is included in TensorFlow 2.5.0, 2.4.2, 2.3.3, 2.2.3, and 2.1.4.

Is CVE-2021-29543 actively exploited?

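This page cites no confirmed in-the-wild exploitation of CVE-2021-29543. Proof-of-concept exploit code is publicly available (indexed in trickest/cve), which raises the risk of exploitation, and the EPSS score puts the 30-day exploitation probability at about 0.2%.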

How to fix CVE-2021-29543?

1. PATCH: Upgrade to TensorFlow 2.5.0, 2.4.2, 2.3.3, 2.2.3, or 2.1.4; all contain the fix (commit ea3b43e).
2. VERIFY: Run `python -c "import tensorflow as tf; print(tf.__version__)"` across all ML infrastructure nodes.
3. ISOLATE: If patching is delayed, run TF inference processes in separate containers/pods so a crash does not cascade.
4. VALIDATE INPUT: Add bounds checks on sequence length inputs before passing to CTCGreedyDecoder.
5. MONITOR: Alert on abnormal TF process terminations (exit code != 0) in serving infrastructure.
6. RESTRICT: Limit which users can submit raw ops to TF Serving endpoints; do not expose `tf.raw_ops` to untrusted callers.
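
The ISOLATE and MONITOR steps can be combined at the process level as well as the container level. The sketch below is illustrative only: `run_isolated` is a hypothetical helper, and `os.abort()` stands in for a decode call that hits the CHECK failure so that the example runs without TensorFlow installed.

```python
import subprocess
import sys


def run_isolated(code: str, timeout: float = 30.0) -> dict:
    """Run Python source in a child interpreter and report how it exited."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return {
        "returncode": proc.returncode,
        # Any non-zero exit (including death by signal) counts as abnormal.
        "crashed": proc.returncode != 0,
        "stdout": proc.stdout,
    }


# Stand-in for a decode that aborts: the child dies, the parent keeps running.
crash = run_isolated("import os; os.abort()")
ok = run_isolated("print('decoded')")

for name, result in (("crash", crash), ("ok", ok)):
    status = "ALERT: abnormal termination" if result["crashed"] else "clean exit"
    print(name, status)
```

The parent process survives the child's abort and can raise an alert or retry, which is the same property a separate container or pod provides at a coarser granularity.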

What systems are affected by CVE-2021-29543?

This vulnerability affects the following AI/ML architecture patterns: model serving, training pipelines, inference.

What is the CVSS score for CVE-2021-29543?

CVE-2021-29543 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.19%.

What is the AI security impact?

Affected AI Architectures

model serving, training pipelines, inference

MITRE ATLAS Techniques

AML.T0029 Denial of AI Service

AML.T0034 Cost Harvesting

AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 15

ISO 42001: A.9.7

NIST AI RMF: MS-2.5

OWASP LLM Top 10: LLM10:2025 (LLM04 in the 2023 list)

What are the technical details?

Original Advisory

TensorFlow is an end-to-end open source platform for machine learning. An attacker can trigger a denial of service via a `CHECK`-fail in `tf.raw_ops.CTCGreedyDecoder`. This is because the implementation (https://github.com/tensorflow/tensorflow/blob/1615440b17b364b875eb06f43d087381f1460a65/tensorflow/core/kernels/ctc_decoder_ops.cc#L37-L50) has a `CHECK_LT` inserted to validate some invariants. When this condition is false, the program aborts, instead of returning a valid error to the user. This abnormal termination can be weaponized in denial of service attacks. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.

Exploitation Scenario

An insider or compromised ML engineer with access to a shared Jupyter server or internal TF Serving endpoint crafts a tensor with invalid dimensions (e.g., sequence_length values exceeding the input tensor's actual time dimension) and submits it to `tf.raw_ops.CTCGreedyDecoder`. The CHECK_LT assertion fails, the TF runtime calls `abort()`, and the entire serving process terminates. In a Kubernetes deployment, the pod restarts but any in-flight requests are dropped. If the attacker loops this request, they can keep the serving endpoint continuously unavailable—a low-sophistication, high-persistence denial of service against production ASR or OCR AI services.
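
A pre-flight check in the calling code can reject inputs like those in the scenario above before they reach the op. The following is a minimal sketch, not the upstream fix: `validate_ctc_inputs` is a hypothetical helper, it assumes the documented layout of `inputs` as (max_time, batch_size, num_classes) with one `sequence_length` entry per batch item, and it is deliberately conservative because this page does not state the exact invariant that the `CHECK_LT` enforces.

```python
from typing import Sequence


def validate_ctc_inputs(inputs_shape: Sequence[int], sequence_length: Sequence[int]) -> None:
    """Raise ValueError if the shapes look unsafe to pass to CTCGreedyDecoder."""
    if len(inputs_shape) != 3:
        raise ValueError(f"inputs must be rank 3, got rank {len(inputs_shape)}")
    max_time, batch_size, num_classes = inputs_shape
    if min(max_time, batch_size, num_classes) <= 0:
        raise ValueError(f"every inputs dimension must be positive, got {tuple(inputs_shape)}")
    if len(sequence_length) != batch_size:
        raise ValueError(
            f"sequence_length has {len(sequence_length)} entries, expected {batch_size}"
        )
    for index, length in enumerate(sequence_length):
        # Each sequence must fit inside the time dimension of inputs.
        if not 0 <= length <= max_time:
            raise ValueError(
                f"sequence_length[{index}]={length} is outside [0, {max_time}]"
            )


validate_ctc_inputs((18, 2, 5), [10, 17])  # passes silently

try:
    validate_ctc_inputs((18, 2, 5), [40, 17])  # 40 exceeds max_time=18
except ValueError as error:
    print("rejected:", error)
```

In eager mode the arguments could be taken from `tuple(inputs.shape)` and `sequence_length.numpy().tolist()`. A check like this complements patching and does not replace it.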

Weaknesses (CWE)

CWE-617 Reachable Assertion

CWE-617 — Reachable Assertion: The product contains an assert() or similar statement that can be triggered by an attacker, which leads to an application exit or other behavior that is more severe than necessary.

[Implementation] Make sensitive open/close operation non reachable by directly user-controlled data (e.g. open/close resources)
[Implementation] Perform input validation on user data.

Source: MITRE CWE corpus.