CVE-2024-0520: MLflow: path traversal enables RCE via dataset loading

HIGH | PoC AVAILABLE | CISA: TRACK*
Published June 6, 2024
CISO Take

Any ML team running MLflow older than 2.9.0 and loading datasets from external HTTP URLs is exposed to arbitrary file write and remote code execution — no authentication required, just a crafted HTTP response. Patch to 2.9.0 immediately; if patching is blocked, restrict MLflow to internal-only dataset sources and block outbound HTTP dataset loading at the network level. Treat any MLflow host as a potential pivot point into training infrastructure, model artifacts, and credentials.

Risk Assessment

HIGH. CVSS 8.8 reflects the real-world severity: network-accessible, low complexity, no privileges needed on MLflow itself. The only friction is user interaction — a data scientist must load a dataset from an attacker-controlled URL, which is trivially achievable via social engineering (Slack message, shared notebook, poisoned dataset registry). MLflow instances are often deployed inside corporate networks with broad access to training data, model registries, and cloud credentials, making post-exploitation impact severe.

Affected Systems

Package   Ecosystem   Vulnerable Range   Patched
mlflow    pip         < 2.9.0            2.9.0
25.7K | OpenSSF 4.5 | 624 dependents | Pushed 7d ago | 24% patched | ~64d to patch

Severity & Risk

CVSS 3.1: 8.8 / 10
EPSS: 4.9% chance of exploitation in 30 days (higher than 90% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Moderate
Exploitation Confidence: medium
CISA SSVC: Public PoC
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

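Attack Vector (AV): Network
Attack Complexity (AC): Low
Privileges Required (PR): None
User Interaction (UI): Required
Scope (S): Unchanged
Confidentiality (C): High
Integrity (I): High
Availability (A): High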

Recommended Action

  1. PATCH

    Upgrade MLflow to >= 2.9.0 immediately — this is the only complete fix.

  2. NETWORK CONTROLS

    If patching is delayed, restrict MLflow servers from making outbound HTTP requests to untrusted domains via egress firewall rules.

  3. RUNTIME CONTROLS

    Run MLflow under a least-privilege service account with minimal filesystem write permissions; use read-only mounts where possible.

  4. DETECTION

    Monitor for unexpected file creation in non-data directories by the MLflow process (auditd or Falco rules on the mlflow user); at the egress proxy, alert on Content-Disposition headers containing '../' or absolute paths in HTTP responses returned to MLflow's outbound requests.

  5. AUDIT

    Check MLflow logs for dataset loads from external URLs; review recently loaded datasets for suspicious source URLs.

  6. VERIFY

    Confirm your deployed version with `pip show mlflow` or container image inspection.
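
The VERIFY step can also be scripted for fleets or CI jobs. The following is a minimal sketch, assuming the check runs inside the same Python environment as MLflow; the helper names (`parse_release`, `is_patched`, `installed_mlflow_status`) are illustrative and not part of MLflow. It compares only the numeric release segments, so a pre-release such as 2.9.0rc1 is treated the same as 2.9.0.

```python
from importlib import metadata

# First fixed release according to this advisory.
PATCHED = (2, 9, 0)


def parse_release(version: str) -> tuple:
    """Turn '2.8.1' or '2.9.0rc1' into a tuple of three leading integers."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_patched(version: str) -> bool:
    """True when the version is at or above the fixed release."""
    return parse_release(version) >= PATCHED


def installed_mlflow_status() -> str:
    """Report the installed mlflow version and whether it is patched."""
    try:
        version = metadata.version("mlflow")
    except metadata.PackageNotFoundError:
        return "mlflow is not installed in this environment"
    state = "patched" if is_patched(version) else "VULNERABLE"
    return f"mlflow {version}: {state}"


if __name__ == "__main__":
    print(installed_mlflow_status())
```

Running the script in each environment or container image gives a one-line status that can be collected centrally.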

CISA SSVC Assessment

Decision: Track*
Exploitation: poc
Automatable: No
Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Art. 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.2.6 - AI system security
NIST AI RMF
GOVERN 6.2 - Policies and procedures are in place for AI risk management
MANAGE 2.2 - Mechanisms for sustaining AI risk management are in place
OWASP LLM Top 10
LLM03:2025 - Supply Chain

Frequently Asked Questions

What is CVE-2024-0520?

CVE-2024-0520 is a path traversal vulnerability in MLflow's HTTP dataset source (`mlflow.data.http_dataset_source.py`). When MLflow loads a dataset from an HTTP URL, it builds the local file path from the filename in the `Content-Disposition` header or the URL path without sanitizing it. A malicious server can therefore make MLflow write a file to an arbitrary location, which can lead to remote code execution. No authentication is required, only a crafted HTTP response. The issue is fixed in version 2.9.0.

Is CVE-2024-0520 actively exploited?

Proof-of-concept exploit code for CVE-2024-0520 is publicly available (indexed in trickest/cve), which increases the risk of exploitation. The CISA SSVC assessment records the exploitation status as "poc".

How to fix CVE-2024-0520?

1. PATCH: Upgrade MLflow to >= 2.9.0 immediately; this is the only complete fix.
2. NETWORK CONTROLS: If patching is delayed, restrict MLflow servers from making outbound HTTP requests to untrusted domains via egress firewall rules.
3. RUNTIME CONTROLS: Run MLflow under a least-privilege service account with minimal filesystem write permissions; use read-only mounts where possible.
4. DETECTION: Monitor for unexpected file creation in non-data directories by the MLflow process (auditd or Falco rules on the mlflow user); at the egress proxy, alert on Content-Disposition headers containing '../' or absolute paths in HTTP responses returned to MLflow's outbound requests.
5. AUDIT: Check MLflow logs for dataset loads from external URLs; review recently loaded datasets for suspicious source URLs.
6. VERIFY: Confirm your deployed version with `pip show mlflow` or container image inspection.

What systems are affected by CVE-2024-0520?

MLflow versions before 2.9.0 (the `mlflow` pip package) are affected. The vulnerability is relevant to the following AI/ML architecture patterns: training pipelines, MLOps platforms, experiment tracking infrastructure, shared data science environments, model serving.

What is the CVSS score for CVE-2024-0520?

CVE-2024-0520 has a CVSS v3.1 base score of 8.8 (HIGH). The EPSS exploitation probability is 4.88%.

Technical Details

NVD Description

A vulnerability in mlflow/mlflow version 8.2.1 allows for remote code execution due to improper neutralization of special elements used in an OS command ('Command Injection') within the `mlflow.data.http_dataset_source.py` module. Specifically, when loading a dataset from a source URL with an HTTP scheme, the filename extracted from the `Content-Disposition` header or the URL path is used to generate the final file path without proper sanitization. This flaw enables an attacker to control the file path fully by utilizing path traversal or absolute path techniques, such as '../../tmp/poc.txt' or '/tmp/poc.txt', leading to arbitrary file write. Exploiting this vulnerability could allow a malicious user to execute commands on the vulnerable machine, potentially gaining access to data and model information. The issue is fixed in version 2.9.0.
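
The flaw described above can be illustrated with a short sketch. This is not MLflow's code or its actual patch: `unsafe_destination` and `safe_destination` are hypothetical helpers, and `/data/mlflow/tmp` is an assumed download directory. The sketch uses `posixpath` so the behavior is the same on any host OS, and it does not account for symlinks.

```python
import posixpath


def unsafe_destination(download_dir: str, filename: str) -> str:
    """Flawed pattern: the remote-supplied filename is joined as-is."""
    return posixpath.normpath(posixpath.join(download_dir, filename))


def safe_destination(download_dir: str, filename: str) -> str:
    """Keep only the final path component, then verify containment."""
    # Treat backslashes as separators so Windows-style names are stripped too.
    name = posixpath.basename(filename.replace("\\", "/"))
    if name in ("", ".", ".."):
        raise ValueError(f"unusable filename: {filename!r}")
    root = posixpath.normpath(download_dir)
    candidate = posixpath.normpath(posixpath.join(root, name))
    # Defense in depth: the result must still live under the download dir.
    if posixpath.commonpath([root, candidate]) != root:
        raise ValueError(f"path escapes download directory: {filename!r}")
    return candidate


if __name__ == "__main__":
    base = "/data/mlflow/tmp"  # assumed location, for illustration only
    for remote_name in ("../../tmp/poc.txt", "/tmp/poc.txt"):
        print(remote_name, "->", unsafe_destination(base, remote_name))
        print(remote_name, "->", safe_destination(base, remote_name))
```

With the unsafe join, a relative name climbs out of the download directory and an absolute name replaces it entirely; the safe variant maps both of the advisory's example names to `poc.txt` inside the download directory.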

Exploitation Scenario

An adversary targets a data science team by sharing a convincing-looking dataset via a public URL (e.g., in a research forum or Slack message). The URL points to an attacker-controlled HTTP server. When a data scientist loads the dataset using MLflow, the server returns a Content-Disposition header like: `Content-Disposition: attachment; filename=../../.local/lib/python3.10/site-packages/mlflow/__init__.py`. MLflow writes the attacker's payload (a Python backdoor) to that path without validation, overwriting the MLflow package itself. On the next MLflow import or training job execution, the backdoor runs with full process privileges — establishing a reverse shell, exfiltrating AWS credentials from the instance metadata service, or poisoning model artifacts stored in S3.
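
A header like the one in this scenario can be flagged before the response reaches MLflow, for example in a proxy hook or a pre-download check. The sketch below is one possible approach, assuming the raw `Content-Disposition` value is available as a string; `disposition_filename` and `is_suspicious_disposition` are hypothetical names. It relies on the standard library's `email.message.Message.get_filename()` to extract the filename parameter and flags only `..` components and absolute paths.

```python
from email.message import Message
from typing import Optional


def disposition_filename(header_value: str) -> Optional[str]:
    """Extract the filename parameter from a Content-Disposition value."""
    msg = Message()
    msg["Content-Disposition"] = header_value
    return msg.get_filename()


def is_suspicious_disposition(header_value: str) -> bool:
    """Flag filenames that are absolute or contain a '..' component."""
    filename = disposition_filename(header_value)
    if not filename:
        return False
    # Normalize Windows separators so both styles are checked the same way.
    normalized = filename.replace("\\", "/")
    is_absolute = normalized.startswith("/") or (
        len(normalized) > 1 and normalized[1] == ":"  # e.g. C:/...
    )
    has_traversal = ".." in normalized.split("/")
    return is_absolute or has_traversal


if __name__ == "__main__":
    header = (
        "attachment; filename=../../.local/lib/python3.10/"
        "site-packages/mlflow/__init__.py"
    )
    print(is_suspicious_disposition(header))
```

A check like this is a detection aid only; it does not replace upgrading to the fixed release.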

Weaknesses (CWE)

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
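
The 8.8 base score follows from this vector under the CVSS v3.1 base-score equations. The sketch below applies the metric weights and rounding rule from the CVSS v3.1 specification; the function names are illustrative, and only base metrics are handled (no temporal or environmental metrics).

```python
import math

# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on whether Scope is changed.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score from a vector string."""
    # Skip the leading "CVSS:3.1" prefix and map metric -> value.
    fields = dict(part.split(":") for part in vector.split("/")[1:])
    changed = fields["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[fields["PR"]]

    c, i, a = (WEIGHTS[m][fields[m]] for m in ("C", "I", "A"))
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    exploitability = (
        8.22
        * WEIGHTS["AV"][fields["AV"]]
        * WEIGHTS["AC"][fields["AC"]]
        * pr
        * WEIGHTS["UI"][fields["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"))  # 8.8
```

For this vector the impact sub-score is about 5.87 and the exploitability sub-score about 2.84; their sum, about 8.71, rounds up to 8.8. The same vector with UI:N instead of UI:R would score 9.8, which shows how much the required user interaction lowers the rating.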

Timeline

Published: June 6, 2024
Last Modified: October 15, 2025
First Seen: June 6, 2024

Related Vulnerabilities