CVE-2021-29545: TensorFlow: heap OOB write in sparse tensor DoS
MEDIUM | PoC AVAILABLE
A local attacker with minimal privileges can crash TensorFlow processes by submitting malformed sparse tensors, triggering an out-of-bounds heap write via the sparse-to-CSR matrix conversion kernel. The local-only attack vector limits broad exposure, but multi-tenant ML platforms and shared data science environments (Jupyter hubs, model serving endpoints accepting sparse input) carry real denial-of-service risk. Upgrade to TensorFlow 2.5.0, or to a maintenance release on the 2.1.x–2.4.x branches that includes the cherry-picked fix, immediately.
Risk Assessment
Medium risk in isolation (CVSS 5.5, AV:L/PR:L), but elevated in shared or multi-tenant ML infrastructure. Exploitability within local scope is trivial—crafting a malformed sparse tensor index requires no deep ML knowledge. Impact is confined to availability; no confidentiality or integrity risk. Organizations running TensorFlow in Jupyter environments, managed notebook services, or model serving APIs that accept sparse matrix inputs should treat this as higher operational priority than the base score suggests.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| tensorflow | pip | < 2.1.4; 2.2.0–2.2.2; 2.3.0–2.3.2; 2.4.0–2.4.1 | 2.1.4, 2.2.3, 2.3.3, 2.4.2, 2.5.0 |
Recommended Action
1) Upgrade TensorFlow to 2.5.0, or to a patched maintenance release (2.1.4, 2.2.3, 2.3.3, or 2.4.2) that includes the cherry-picked fix, commit 1e922ccdf6bf46a3a52641f99fd47d54c1decd13.
2) As a workaround, validate sparse tensor indices server-side before passing them to CSR conversion: reject any input where max(indices[:,0]) >= expected_num_rows.
3) Run TensorFlow serving workers in isolated containers with restart policies to limit how long a DoS lasts.
4) Alert on unexpected TensorFlow process exits in serving infrastructure as a detection signal.
5) Audit production code for use of tf.raw_ops.SparseToCsrSparseMatrix and related sparse conversion APIs exposed to external input.
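The index check in step 2 could look roughly like the following. This is a minimal sketch in plain Python, not TensorFlow code: the function name validate_sparse_indices and its arguments are hypothetical, and it assumes the serving layer holds the indices and dense shape as ordinary Python sequences before they reach the conversion op. It also rejects negative indices, which the step above does not spell out.

```python
from typing import Sequence


def validate_sparse_indices(
    indices: Sequence[Sequence[int]],
    dense_shape: Sequence[int],
) -> None:
    """Raise ValueError if any sparse index falls outside dense_shape.

    Hypothetical helper: call it on untrusted input before the indices
    are handed to any sparse-to-CSR conversion.
    """
    rank = len(dense_shape)
    for n, index in enumerate(indices):
        # Every index entry must have one coordinate per dimension.
        if len(index) != rank:
            raise ValueError(
                f"index {n} has {len(index)} coordinates, expected {rank}"
            )
        # Each coordinate must satisfy 0 <= value < dimension size.
        for axis, (value, bound) in enumerate(zip(index, dense_shape)):
            if not 0 <= value < bound:
                raise ValueError(
                    f"index {n}, axis {axis}: {value} is outside [0, {bound})"
                )
```

Checking every axis rather than only the row axis is a deliberate choice here: it costs little and does not depend on knowing which coordinate the kernel uses to address csr_row_ptr.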
Frequently Asked Questions
What is CVE-2021-29545?
CVE-2021-29545 is a denial-of-service vulnerability in TensorFlow's sparse-to-CSR matrix conversion kernel. A local attacker with minimal privileges can submit a malformed sparse tensor whose row index falls outside the bounds of csr_row_ptr, triggering an out-of-bounds heap write that crashes the process. Multi-tenant ML platforms and shared data science environments (Jupyter hubs, model serving endpoints accepting sparse input) are the most exposed. The fix ships in TensorFlow 2.5.0 and is cherry-picked to 2.1.4, 2.2.3, 2.3.3, and 2.4.2.
Is CVE-2021-29545 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2021-29545, increasing the risk of exploitation.
How to fix CVE-2021-29545?
1) Upgrade TensorFlow to 2.5.0, or to a patched maintenance release (2.1.4, 2.2.3, 2.3.3, or 2.4.2) that includes the cherry-picked fix, commit 1e922ccdf6bf46a3a52641f99fd47d54c1decd13.
2) As a workaround, validate sparse tensor indices server-side before passing them to CSR conversion: reject any input where max(indices[:,0]) >= expected_num_rows.
3) Run TensorFlow serving workers in isolated containers with restart policies to limit how long a DoS lasts.
4) Alert on unexpected TensorFlow process exits in serving infrastructure as a detection signal.
5) Audit production code for use of tf.raw_ops.SparseToCsrSparseMatrix and related sparse conversion APIs exposed to external input.
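To confirm whether an installed build already carries the fix, a version check along these lines may help. It is a sketch only: the helper is_patched is hypothetical, it encodes the patched releases named in this advisory (2.1.4, 2.2.3, 2.3.3, 2.4.2, 2.5.0), it conservatively treats branches older than 2.1 as unpatched, and it ignores pre-release suffixes such as "rc0".

```python
import re

# First patch level carrying the fix on each maintenance branch,
# taken from the advisory text. Branches not listed here and older
# than 2.5 are treated as unpatched (an assumption, chosen to be safe).
_FIRST_PATCHED = {(2, 1): 4, (2, 2): 3, (2, 3): 3, (2, 4): 2}


def is_patched(version: str) -> bool:
    """Return True if a TensorFlow version string includes the fix."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    # 2.5.0 and everything after it ships with the fix.
    if (major, minor) >= (2, 5):
        return True
    first_patched = _FIRST_PATCHED.get((major, minor))
    return first_patched is not None and patch >= first_patched
```

In an environment where TensorFlow is installed, the string to pass would typically be tf.__version__.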
What systems are affected by CVE-2021-29545?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, shared ML notebooks.
What is the CVSS score for CVE-2021-29545?
CVE-2021-29545 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.01%.
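The 5.5 base score follows from the published vector (AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H). The sketch below reproduces the arithmetic with the metric weights and rounding rule from the CVSS v3.1 specification. It covers only the Scope: Unchanged case, and the function and variable names are illustrative.

```python
import math


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


# CVSS v3.1 weights for this vector (Scope: Unchanged).
AV_LOCAL = 0.55       # Attack Vector: Local
AC_LOW = 0.77         # Attack Complexity: Low
PR_LOW = 0.62         # Privileges Required: Low (unchanged scope)
UI_NONE = 0.85        # User Interaction: None
CONF_NONE = 0.0       # Confidentiality: None
INTEG_NONE = 0.0      # Integrity: None
AVAIL_HIGH = 0.56     # Availability: High


def base_score_scope_unchanged() -> float:
    # Impact sub-score: only availability contributes for this CVE.
    iss = 1 - (1 - CONF_NONE) * (1 - INTEG_NONE) * (1 - AVAIL_HIGH)
    impact = 6.42 * iss
    exploitability = 8.22 * AV_LOCAL * AC_LOW * PR_LOW * UI_NONE
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score_scope_unchanged())  # 5.5
```

The impact term is about 3.60 and the exploitability term about 1.83, so the sum of roughly 5.43 rounds up to 5.5.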
Technical Details
NVD Description
TensorFlow is an end-to-end open source platform for machine learning. An attacker can trigger a denial of service via a `CHECK`-fail in converting sparse tensors to CSR Sparse matrices. This is because the implementation (https://github.com/tensorflow/tensorflow/blob/800346f2c03a27e182dd4fba48295f65e7790739/tensorflow/core/kernels/sparse/kernels.cc#L66) does a double redirection to access an element of an array allocated on the heap. If the value at `indices(i, 0)` is such that `indices(i, 0) + 1` is outside the bounds of `csr_row_ptr`, this results in writing outside of bounds of heap allocated data. The fix will be included in TensorFlow 2.5.0. We will also cherrypick this commit on TensorFlow 2.4.2, TensorFlow 2.3.3, TensorFlow 2.2.3 and TensorFlow 2.1.4, as these are also affected and still in supported range.
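The access pattern described above can be modeled in a few lines. This is a simplified Python illustration, not the TensorFlow source: build_csr_row_ptr is a hypothetical name, and the list csr_row_ptr stands in for the heap buffer of num_rows + 1 entries. The bounds check before the write is the step whose absence the description identifies as the flaw. In C++ the unchecked write lands outside the allocation, whereas Python would raise IndexError or silently wrap a negative index.

```python
from typing import List, Sequence, Tuple


def build_csr_row_ptr(
    indices: Sequence[Tuple[int, int]], num_rows: int
) -> List[int]:
    """Build CSR row pointers from (row, col) indices with bounds checks."""
    # One slot per row plus a leading zero, as in the CSR format.
    csr_row_ptr = [0] * (num_rows + 1)
    for row, _col in indices:
        # The row value read from the index data selects the slot to write:
        # this is the double indirection the description refers to.
        if not 0 <= row < num_rows:
            raise ValueError(f"row index {row} is outside [0, {num_rows})")
        csr_row_ptr[row + 1] += 1
    # Prefix sum turns per-row counts into row start offsets.
    for r in range(1, num_rows + 1):
        csr_row_ptr[r] += csr_row_ptr[r - 1]
    return csr_row_ptr


print(build_csr_row_ptr([(0, 1), (0, 2), (2, 0)], num_rows=3))  # [0, 2, 2, 3]
```

With num_rows=3, an index such as (3, 0) would target slot 4 of a 4-element buffer, which is exactly the out-of-range case; the sketch rejects it instead of writing.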
Exploitation Scenario
An adversary with access to a shared ML platform—such as a tenant on a multi-user Jupyter hub or a client of a model serving endpoint that accepts sparse feature inputs—constructs a sparse tensor where indices(i, 0) + 1 exceeds the allocated bounds of the csr_row_ptr array. Submitting this tensor to any operation invoking the SparseToCsrSparseMatrix kernel triggers the out-of-bounds heap write, causing an immediate CHECK-fail and process crash. In a model serving context (e.g., TensorFlow Serving behind an API), the attacker can repeatedly submit these payloads to keep the inference worker down, achieving sustained denial of service against the ML endpoint with no authentication bypass required beyond API access.
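For detection, repeated crashes of this kind tend to show up as abnormal worker exits. The sketch below assumes a POSIX host where a failed CHECK aborts the process with SIGABRT, and it assumes exit codes arrive either in Python's subprocess convention (negative signal number) or in the shell and container convention (128 plus the signal number). The helper name looks_like_abort is hypothetical.

```python
import signal


def looks_like_abort(returncode: int) -> bool:
    """Return True if an exit status indicates termination by SIGABRT."""
    if returncode < 0:
        # subprocess.Popen.returncode is -N when killed by signal N.
        signum = -returncode
    elif returncode > 128:
        # Shells and container runtimes commonly report 128 + N.
        signum = returncode - 128
    else:
        # Ordinary exit status, clean or error: not a signal death.
        return False
    return signum == signal.SIGABRT
```

A supervisor could count how often this returns True for a serving worker within a time window and raise an alert when the rate spikes, since a single abort can have many unrelated causes.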
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
References
- github.com/tensorflow/tensorflow/commit/1e922ccdf6bf46a3a52641f99fd47d54c1decd13 (Patch, 3rd Party)
- github.com/tensorflow/tensorflow/security/advisories/GHSA-hmg3-c7xj-6qwm (Exploit, Patch, 3rd Party)
Related Vulnerabilities
- CVE-2020-15196 (9.9) TensorFlow: heap OOB read in sparse/ragged count ops. Same package: tensorflow
- CVE-2020-15205 (9.8) TensorFlow: heap overflow in StringNGrams, ASLR bypass. Same package: tensorflow
- CVE-2020-15208 (9.8) TFLite: OOB read/write via tensor dimension mismatch. Same package: tensorflow
- CVE-2019-16778 (9.8) TensorFlow: heap overflow in UnsortedSegmentSum op. Same package: tensorflow
- CVE-2022-23587 (9.8) TensorFlow: integer overflow in Grappler enables RCE. Same package: tensorflow