CVE-2021-41203: TensorFlow: malformed checkpoint triggers overflow/crash
Severity: HIGH | PoC AVAILABLE
Attackers who can modify TensorFlow checkpoint files on disk can crash training or inference processes via integer overflows and undefined behavior. Upgrade to TensorFlow 2.7.0, 2.6.1, 2.5.2, or 2.4.4 immediately: any shared storage or model registry accessible to low-privileged users is a viable attack path. Treat checkpoint files as untrusted inputs and enforce integrity checks (checksums, access controls) before loading.
Risk Assessment
CVSS 7.8 High with local attack vector and low complexity/privileges. Risk is elevated in MLOps environments with shared storage (NFS, S3, NAS) where checkpoints are written by one process and loaded by another — a compromised low-privilege account or insider threat can trigger crashes or undefined behavior across the ML stack. Not in CISA KEV and no known active exploitation, but the attack primitive (craft malicious file → crash ML process) is trivial once filesystem access is obtained.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| tensorflow | pip | < 2.4.4; 2.5.0 to 2.5.1; 2.6.0 | 2.4.4, 2.5.2, 2.6.1, 2.7.0 |
Any deployment running a TensorFlow release older than these patched versions is affected.
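As a rough aid for the audit, the patched releases named in this advisory can be turned into a version check. The sketch below is illustrative only: the names `PATCHED_BY_BRANCH`, `parse_version`, and `is_patched` are hypothetical, pre-release suffixes are ignored, and it assumes that branches older than 2.4 received no backport.

```python
import re

# Patched releases named in the advisory, keyed by (major, minor) branch.
PATCHED_BY_BRANCH = {
    (2, 4): (2, 4, 4),
    (2, 5): (2, 5, 2),
    (2, 6): (2, 6, 1),
}
FIRST_FIXED_BRANCH = (2, 7)  # 2.7.0 and later include the fixes


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from a string such as '2.6.0'.

    Any suffix after the patch number is ignored (a simplification).
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(version: str) -> bool:
    """Return True if the version is at or above a patched release.

    Assumption: branches older than 2.4 have no backport, so they are
    reported as unpatched.
    """
    parsed = parse_version(version)
    branch = parsed[:2]
    if branch >= FIRST_FIXED_BRANCH:
        return True
    minimum = PATCHED_BY_BRANCH.get(branch)
    return minimum is not None and parsed >= minimum
```

On a host with TensorFlow installed, `is_patched(tf.__version__)` after `import tensorflow as tf` would report whether that installation needs upgrading.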
Recommended Action
- Patch: Upgrade to TensorFlow 2.7.0, 2.6.1, 2.5.2, or 2.4.4 immediately.
- Restrict filesystem permissions: checkpoint directories should be writable only by the process that creates them; separate write/read service accounts.
- Integrity verification: implement SHA-256 checksums on checkpoint files and validate before loading; reject any checkpoint that fails verification.
- Immutable storage: use write-once/append-only storage policies for checkpoint artifacts in production.
- Detection: monitor for unexpected process crashes (segfaults, OOM) in TF training/inference workloads; repeated crashes against checkpoint-loading paths may indicate active exploitation.
- Audit: inventory all systems running unpatched TF versions, prioritize those with shared checkpoint storage.
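The integrity-verification step can be sketched with the Python standard library. This is a minimal illustration, not a complete control: the function names and the manifest format (file name mapped to hex digest) are assumptions, and the manifest itself must live somewhere the attacker cannot write, otherwise it can be replaced along with the checkpoint.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large checkpoint shards are not read at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(checkpoint_dir: Path) -> dict:
    """Map each file name in the directory to its SHA-256 hex digest."""
    return {
        p.name: sha256_of(p)
        for p in sorted(checkpoint_dir.iterdir())
        if p.is_file()
    }


def verify_manifest(checkpoint_dir: Path, manifest: dict) -> list:
    """Return names of files that are missing, altered, or unexpected.

    An empty list means every file matches the manifest.
    """
    problems = []
    for name, expected in manifest.items():
        path = checkpoint_dir / name
        if not path.is_file() or not hmac.compare_digest(sha256_of(path), expected):
            problems.append(name)
    present = {p.name for p in checkpoint_dir.iterdir() if p.is_file()}
    problems.extend(sorted(present - set(manifest)))
    return problems
```

A loader would call `verify_manifest` before handing the directory to TensorFlow and refuse to proceed if the returned list is non-empty.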
Frequently Asked Questions
What is CVE-2021-41203?
CVE-2021-41203 is a vulnerability in TensorFlow's checkpoint-loading code, which does not validate malformed file formats. An attacker who can modify saved checkpoints on disk can trigger integer overflows, undefined behavior, segfaults, and CHECK-fail crashes in training or inference processes. It is fixed in TensorFlow 2.7.0, 2.6.1, 2.5.2, and 2.4.4.
Is CVE-2021-41203 actively exploited?
No active exploitation is known, and the CVE is not listed in CISA KEV. However, proof-of-concept exploit code is publicly available for CVE-2021-41203, which increases the risk of exploitation.
How to fix CVE-2021-41203?
1. Patch: Upgrade to TensorFlow 2.7.0, 2.6.1, 2.5.2, or 2.4.4 immediately.
2. Restrict filesystem permissions: checkpoint directories should be writable only by the process that creates them; separate write/read service accounts.
3. Integrity verification: implement SHA-256 checksums on checkpoint files and validate before loading; reject any checkpoint that fails verification.
4. Immutable storage: use write-once/append-only storage policies for checkpoint artifacts in production.
5. Detection: monitor for unexpected process crashes (segfaults, OOM) in TF training/inference workloads; repeated crashes against checkpoint-loading paths may indicate active exploitation.
6. Audit: inventory all systems running unpatched TF versions, prioritize those with shared checkpoint storage.
What systems are affected by CVE-2021-41203?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, MLOps CI/CD pipelines, transfer learning workflows, distributed training infrastructure.
What is the CVSS score for CVE-2021-41203?
CVE-2021-41203 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.02%.
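The 7.8 figure can be reproduced from the vector string CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H using the base-score equations of the CVSS v3.1 specification. The sketch below is a simplified calculator that handles only scope-unchanged (S:U) vectors; the names `WEIGHTS`, `roundup`, and `base_score` are chosen for illustration.

```python
import math

# Metric weights from the CVSS v3.1 specification (scope-unchanged values).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place using the spec's integer-based method."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a scope-unchanged (S:U) CVSS v3.1 vector."""
    # Skip the leading "CVSS:3.1" element and split the rest into metric:value.
    fields = dict(item.split(":") for item in vector.split("/")[1:])
    if fields["S"] != "U":
        raise ValueError("this sketch only handles S:U vectors")
    w = {name: WEIGHTS[name][fields[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))
```

For this CVE the impact term is about 5.87 and the exploitability term about 1.83; their sum of roughly 7.71 rounds up to 7.8.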
Technical Details
NVD Description
TensorFlow is an open source platform for machine learning. In affected versions an attacker can trigger undefined behavior, integer overflows, segfaults and `CHECK`-fail crashes if they can change saved checkpoints from outside of TensorFlow. This is because the checkpoints loading infrastructure is missing validation for invalid file formats. The fixes will be included in TensorFlow 2.7.0. We will also cherrypick these commits on TensorFlow 2.6.1, TensorFlow 2.5.2, and TensorFlow 2.4.4, as these are also affected and still in supported range.
Exploitation Scenario
An adversary with low-privilege access to a shared MLOps environment (e.g., a compromised data scientist account, a malicious insider, or a supply chain compromise of a model registry) locates the checkpoint storage directory for a production training or fine-tuning job. They craft a malformed checkpoint file, manipulating file format fields to trigger integer overflow conditions, and replace the legitimate file or inject the crafted one at the expected checkpoint path. When the TensorFlow training process resumes from the checkpoint (e.g., in a nightly scheduled training job), the unpatched loader does not validate the file format, and loading triggers undefined behavior, segfaults, or CHECK-fail crashes. In a Kubernetes-based ML training cluster, this could repeatedly crash pods and disrupt model delivery pipelines. In the worst case, the undefined behavior might be exploitable for code execution under the training process's service account.
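One way to contain and detect the crash described in this scenario is to probe a checkpoint in a child process before the main job loads it, so that a segfault or CHECK failure ends the child rather than the training process. The following is a hedged sketch for POSIX systems: `describe_exit`, `probe`, and `tf_probe_command` are hypothetical helpers, and the probe only calls `tf.train.load_checkpoint`, which may not exercise every code path that a full restore would. It limits the impact of crashes but does not defend against code execution.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Classify a child exit status; on POSIX a negative code means a signal."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        try:
            return f"killed by {signal.Signals(-returncode).name}"
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"


def probe(command: list, timeout: float = 300.0) -> str:
    """Run a checkpoint-loading command in a child process and report how it ended."""
    try:
        result = subprocess.run(command, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return "timed out"
    return describe_exit(result.returncode)


def tf_probe_command(checkpoint_path: str) -> list:
    """Hypothetical probe: open the checkpoint with TensorFlow in a fresh interpreter."""
    script = "import sys, tensorflow as tf; tf.train.load_checkpoint(sys.argv[1])"
    return [sys.executable, "-c", script, checkpoint_path]
```

A scheduler could call `probe(tf_probe_command(path))` and alert on any result other than `"ok"`, which also gives the crash monitoring mentioned under detection a concrete signal to watch.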
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H
References
- github.com/tensorflow/tensorflow/commit/368af875869a204b4ac552b9ddda59f6a46a56ec (Patch, Third Party)
- github.com/tensorflow/tensorflow/commit/abcced051cb1bd8fb05046ac3b6023a7ebcc4578 (Patch, Third Party)
- github.com/tensorflow/tensorflow/commit/b619c6f865715ca3b15ef1842b5b95edbaa710ad (Patch, Third Party)
- github.com/tensorflow/tensorflow/commit/e8dc63704c88007ee4713076605c90188d66f3d2 (Patch, Third Party)
- github.com/tensorflow/tensorflow/security/advisories/GHSA-7pxj-m4jf-r6h2 (Third Party)
- github.com/ARPSyndicate/cvemon (Exploit)
- github.com/adwisatya/SnykVulndb (Exploit)
Related Vulnerabilities
- CVE-2020-15196 (CVSS 9.9): TensorFlow: heap OOB read in sparse/ragged count ops. Same package: tensorflow
- CVE-2020-15205 (CVSS 9.8): TensorFlow: heap overflow in StringNGrams, ASLR bypass. Same package: tensorflow
- CVE-2020-15208 (CVSS 9.8): TFLite: OOB read/write via tensor dimension mismatch. Same package: tensorflow
- CVE-2019-16778 (CVSS 9.8): TensorFlow: heap overflow in UnsortedSegmentSum op. Same package: tensorflow
- CVE-2022-23587 (CVSS 9.8): TensorFlow: integer overflow in Grappler enables RCE. Same package: tensorflow