CVE-2024-37056: MLflow: RCE via LightGBM model deserialization

HIGH PoC AVAILABLE
Published June 4, 2024
CISO Take

Any MLflow deployment (v1.23.0+) where users can upload or interact with externally sourced models is exposed to remote code execution. An attacker needs only to upload a malicious LightGBM scikit-learn model—no credentials required—and wait for a data scientist to load it. Upgrade to a patched MLflow release immediately and restrict model registry write access to trusted, authenticated users only.

What is the risk?

High risk for organizations running shared or collaborative ML platforms. CVSS 8.8 with low attack complexity and no privilege requirement makes exploitation straightforward: "user interaction" here means a data scientist routinely opening or loading a model, which is normal workflow behavior. The attack surface expands significantly in multi-tenant or externally accessible MLflow instances, and the lack of a CISA KEV listing does not diminish real-world risk given the ease of exploitation.

What systems are affected?

Package: MLflow
Ecosystem: pip
Vulnerable Range: 1.23.0 and newer (per the original advisory)
Patched: No patch
26.6K | OpenSSF 5.6 | 655 dependents | Pushed 4d ago | 31% patched | ~51d to patch

Do you use MLflow 1.23.0 or newer? You may be affected.
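A quick way to check an environment against the affected range is sketched below. This is a minimal illustration, not an official detection tool: the helper names `parse_version` and `is_in_affected_range` are hypothetical, the lower bound comes from the advisory text (1.23.0 or newer), and no upper bound is applied because this page lists no patched version.

```python
import re

# Lower bound taken from the advisory text. No upper bound is applied because
# this page lists no patched version (assumption: revisit once one is known).
FIRST_AFFECTED = (1, 23, 0)


def parse_version(version: str) -> tuple:
    """Return the leading numeric release parts, e.g. '2.0.0rc1' -> (2, 0, 0)."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_in_affected_range(version: str) -> bool:
    """True when the version is at or above the first affected release."""
    return parse_version(version) >= FIRST_AFFECTED


if __name__ == "__main__":
    from importlib.metadata import PackageNotFoundError, version

    try:
        installed = version("mlflow")
    except PackageNotFoundError:
        print("mlflow is not installed in this environment")
    else:
        print(installed, "in affected range:", is_in_affected_range(installed))
```

Pre-release suffixes are ignored by this sketch, so a release candidate is treated like its final release.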

How severe is it?

CVSS 3.1
8.8 / 10
EPSS
0.6%
chance of exploitation in 30 days
Higher than 45% of all CVEs
Exploitation Status
Exploit Available
Exploitation: MEDIUM
Sophistication
Trivial
Exploitation Confidence
medium
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

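Attack Vector (AV): Network
Attack Complexity (AC): Low
Privileges Required (PR): None
User Interaction (UI): Required
Scope (S): Unchanged
Confidentiality (C): High
Integrity (I): High
Availability (A): High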

What should I do?

7 steps
  1. Upgrade MLflow to the latest patched release immediately.

  2. Restrict model upload permissions to authenticated, trusted users only—disable anonymous or open model registry access.

  3. Implement model provenance controls: require signed models or enforce format allowlists.

  4. Scan model files for unsafe serialization formats (pickle, joblib) before they enter the registry using tools like ModelScan.

  5. Audit existing model registry contents for unexpected LightGBM scikit-learn models from unfamiliar accounts.

  6. Deploy network monitoring on MLflow workers to detect anomalous outbound connections as a post-exploitation indicator.

  7. If immediate patching is not possible, isolate the model registry so that untrusted external sources cannot upload models to it.
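The scanning idea in step 4 can be illustrated with the Python standard library alone. The sketch below walks a pickle's opcodes with `pickletools.genops`, which parses the byte stream without executing it, and reports any imported callable whose top-level module is not on an allowlist. The allowlist contents and the helper names `referenced_globals` and `disallowed_globals` are assumptions made for illustration. The string-tracking heuristic is deliberately simple, so this is not a substitute for a dedicated scanner such as ModelScan.

```python
import pickle
import pickletools

# Hypothetical allowlist of top-level modules a model artifact is expected to
# reference. Adjust for your own environment.
ALLOWED_MODULES = {"lightgbm", "sklearn", "numpy", "scipy", "collections"}


def referenced_globals(data: bytes) -> list:
    """List the 'module.name' imports a pickle would perform, without loading it."""
    found = []
    recent_strings = []  # string arguments seen so far, used for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("GLOBAL", "INST"):
            # For these opcodes, arg is "module name" separated by one space.
            module, _, name = arg.partition(" ")
            found.append(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Heuristic: module and name are usually the two most recent strings.
            # Strings fetched back from the memo are not tracked in this sketch.
            if len(recent_strings) >= 2:
                found.append(f"{recent_strings[-2]}.{recent_strings[-1]}")
            else:
                found.append("<unresolved>")
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return found


def disallowed_globals(data: bytes) -> list:
    """Return referenced globals whose top-level module is not allowlisted."""
    return [g for g in referenced_globals(data)
            if g.split(".")[0] not in ALLOWED_MODULES]


class _Demo:
    """Harmless stand-in: asks pickle to call a builtin when loaded."""

    def __reduce__(self):
        return (sorted, ([3, 1, 2],))


suspicious = pickle.dumps(_Demo())
plain = pickle.dumps({"n_estimators": 100})

print(disallowed_globals(suspicious))
print(disallowed_globals(plain))
```

An allowlist is used rather than a blocklist because dangerous callables can be reached through many module paths. Anything unexpected is reported for review.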

What does CISA's SSVC say?

Decision: Track
Exploitation: none
Automatable: No
Technical Impact: total

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Art. 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.2.6 - AI supply chain security
NIST AI RMF
MANAGE 2.2 - Mechanisms for response and recovery
OWASP LLM Top 10
LLM03:2025 - Supply Chain

Frequently Asked Questions

What is CVE-2024-37056?

CVE-2024-37056 is a deserialization-of-untrusted-data vulnerability (CWE-502) in MLflow 1.23.0 and newer. A maliciously uploaded LightGBM scikit-learn model can run arbitrary code on an end user's system when that user loads or otherwise interacts with it. See "What should I do?" for mitigation steps.

Is CVE-2024-37056 actively exploited?

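Active exploitation is not confirmed by the sources cited on this page: CISA's SSVC assessment lists exploitation as "none" and the CVE is not in the CISA KEV catalog. However, proof-of-concept exploit code is publicly available for CVE-2024-37056, which increases the risk of exploitation.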

What systems are affected by CVE-2024-37056?

This vulnerability affects the following AI/ML architecture patterns: model registry, MLOps platforms, training pipelines, model serving, experiment tracking.

What is the CVSS score for CVE-2024-37056?

CVE-2024-37056 has a CVSS v3.1 base score of 8.8 (HIGH). The EPSS exploitation probability is 0.62%.

What is the AI security impact?

Affected AI Architectures

model registry, MLOps platforms, training pipelines, model serving, experiment tracking

MITRE ATLAS Techniques

AML.T0010.003 Model
AML.T0011.000 Unsafe AI Artifacts
AML.T0018.002 Embed Malware
AML.T0049 Exploit Public-Facing Application
AML.T0058 Publish Poisoned Models

Compliance Controls Affected

EU AI Act: Art. 15
ISO 42001: A.6.2.6
NIST AI RMF: MANAGE 2.2
OWASP LLM Top 10: LLM03:2025

What are the technical details?

Original Advisory

Deserialization of untrusted data can occur in versions of the MLflow platform running version 1.23.0 or newer, enabling a maliciously uploaded LightGBM scikit-learn model to run arbitrary code on an end user’s system when interacted with.

Exploitation Scenario

An adversary identifies a target organization's shared or publicly accessible MLflow instance. They register an account (or leverage a compromised one) and upload a crafted LightGBM scikit-learn model containing a malicious Python pickle payload embedded in the serialized artifact. When a data scientist finds the model in the registry and loads it (for example with a call such as `mlflow.lightgbm.load_model`) or references it in an experiment run, Python's deserialization triggers the payload, granting the attacker immediate code execution on the victim's machine. From there they can exfiltrate API keys, cloud credentials, or model weights, or establish persistence within the ML infrastructure.
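The reason that merely loading a model is enough comes from how Python's `pickle` works: an object can define `__reduce__` to tell the unpickler which callable to invoke at load time. The sketch below demonstrates the mechanism with a deliberately harmless callable (`sorted`). The class name `NotReallyAModel` is hypothetical, and this is not the published proof of concept.

```python
import pickle


class NotReallyAModel:
    """Stand-in for a crafted artifact; the payload here is deliberately harmless."""

    def __reduce__(self):
        # pickle records "call sorted([3, 1, 2])" instead of the object's state.
        # A real attack would name a dangerous callable here.
        return (sorted, ([3, 1, 2],))


artifact = pickle.dumps(NotReallyAModel())

# The "victim" only loads the bytes, yet the recorded callable runs during loading.
loaded = pickle.loads(artifact)
print(type(loaded).__name__, loaded)  # list [1, 2, 3]
```

The loaded value is the result of the call, not a `NotReallyAModel` instance. The unpickler ran code chosen by whoever produced the bytes.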

Weaknesses (CWE)

CWE-502 — Deserialization of Untrusted Data: The product deserializes untrusted data without sufficiently ensuring that the resulting data will be valid.

  • [Architecture and Design, Implementation] If available, use the signing/sealing features of the programming language to assure that deserialized data has not been tainted. For example, a hash-based message authentication code (HMAC) could be used to ensure that data has not been modified.
  • [Implementation] When deserializing data, populate a new object rather than just deserializing. The result is that the data flows through safe input validation and that the functions are safe.
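The HMAC suggestion in the first mitigation can be sketched as follows, using only the standard library. This is an illustrative sketch under stated assumptions. The helper names `sign` and `load_if_authentic` and the demo key are hypothetical. It assumes a secret key is shared only between the trusted producer and the loader, and it does not describe how MLflow itself handles artifacts.

```python
import hashlib
import hmac
import pickle


def sign(data: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the serialized artifact."""
    return hmac.new(key, data, hashlib.sha256).digest()


def load_if_authentic(data: bytes, tag: bytes, key: bytes):
    """Deserialize only after the tag verifies; otherwise refuse."""
    expected = sign(data, key)
    # compare_digest performs a constant-time comparison.
    if not hmac.compare_digest(expected, tag):
        raise ValueError("artifact signature mismatch; refusing to deserialize")
    return pickle.loads(data)


# Demo key for illustration only; a real key would come from a secrets store.
key = b"demo-key-for-illustration-only"
artifact = pickle.dumps({"n_estimators": 100})
tag = sign(artifact, key)

print(load_if_authentic(artifact, tag, key))

tampered = artifact + b"\x00"
try:
    load_if_authentic(tampered, tag, key)
except ValueError as exc:
    print("rejected:", exc)
```

The check runs before `pickle.loads` is ever called, which is the point: verification shows the bytes came from a key holder, not that the content is safe.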

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
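The 8.8 base score can be reproduced from this vector with the CVSS v3.1 base-score equations. The sketch below is a minimal calculator covering base metrics only. The function names are hypothetical, and the weights and rounding rule follow the published v3.1 specification as best understood here.

```python
import math

# Metric weights from the CVSS v3.1 specification (base metrics only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on whether Scope is Unchanged (U) or Changed (C).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    parts = vector.split("/")
    if not parts[0].startswith("CVSS:3."):
        raise ValueError("expected a CVSS v3.x vector")
    m = dict(part.split(":") for part in parts[1:])
    changed = m["S"] == "C"

    iss = 1 - ((1 - WEIGHTS["C"][m["C"]])
               * (1 - WEIGHTS["I"][m["I"]])
               * (1 - WEIGHTS["A"][m["A"]]))
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
                      * PR_WEIGHTS[m["S"]][m["PR"]] * WEIGHTS["UI"][m["UI"]])

    if impact <= 0:
        return 0.0
    total = 1.08 * (impact + exploitability) if changed else impact + exploitability
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"))  # 8.8
```

For this vector the impact sub-score is about 5.87 and the exploitability sub-score about 2.84. Their sum of about 8.71 rounds up to 8.8.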

Timeline

Published
June 4, 2024
Last Modified
February 3, 2025
First Seen
June 4, 2024

Related Vulnerabilities