CVE-2024-34072: pickle deserialization enables RCE

CISO Take

Upgrade sagemaker-python-sdk to v2.218.0 immediately if your ML pipelines load numpy arrays from external or untrusted sources. Any workflow where third-party data reaches SageMaker's NumpyDeserializer via pickle format is exposed to full code execution. If patching is delayed, enforce strict controls on who can supply pickled data to your ML infrastructure.

What is the risk?

High risk for organizations running SageMaker-based ML pipelines that ingest external data. A CVSS score of 7.8 with low attack complexity means exploitation is straightforward once a malicious pickle reaches the pipeline. The local attack vector and required user interaction limit opportunistic remote exploitation, but in collaborative ML environments (shared S3 buckets, model registries, third-party datasets) an adversary can realistically deliver a malicious pickle as a data artifact. No active exploitation has been reported and the CVE is not in the CISA KEV catalog, but pickle-based RCE is a well-documented attack class with readily available tooling.

How severe is it?

CVSS 3.1

7.8 / 10

EPSS

0.4%

chance of exploitation in 30 days

Higher than 32% of all CVEs

Source: EPSS v3 — FIRST.org

Exploitation Status

No known exploitation

Sophistication

Trivial

What is the attack surface?

AV Local

AC Low

PR None

UI Required

S Unchanged

C High

I High

A High
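
As a worked check, the 7.8 base score can be reproduced from the metrics above with the CVSS v3.1 base-score equations for unchanged scope. The sketch below hard-codes the metric weights published in the FIRST.org specification; the function and variable names are illustrative and do not come from any library.

```python
import math

# CVSS v3.1 metric weights (PR weights shown are the scope-unchanged values).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a) -> float:
    """Compute the CVSS v3.1 base score for a vector with Scope: Unchanged."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# AV:L / AC:L / PR:N / UI:R / S:U / C:H / I:H / A:H
score = base_score_scope_unchanged("L", "L", "N", "R", "H", "H", "H")
print(score)  # 7.8
```

The impact sub-score is about 5.87 and the exploitability sub-score about 1.83; their sum, 7.71, rounds up to 7.8.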

What should I do?

Patch: upgrade sagemaker-python-sdk to >= 2.218.0 across all environments (notebooks, training clusters, CI/CD pipelines).
Audit: run `pip show sagemaker | grep Version` across all ML infrastructure, notebooks, and container images to identify unpatched installations.
Workaround if patching is delayed: reject all pickled numpy object arrays originating from untrusted or unverified sources at the pipeline boundary.
Apply integrity verification (checksums or cryptographic signatures) on all serialized model artifacts and datasets before ingestion.
Review IAM roles attached to SageMaker execution environments — apply least-privilege to limit blast radius if RCE occurs.
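
The audit step can also be run from Python, which helps inside notebooks and containers where a shell is awkward. This is a minimal sketch: the helper names (`parse_version`, `is_patched`, `check_installed`) are illustrative, and the parser compares only the leading numeric X.Y.Z portion, so pre-release or local version suffixes are ignored. Where the `packaging` library is available, `packaging.version.Version` gives a more robust comparison.

```python
from importlib import metadata

# First fixed release according to the advisory.
FIXED_VERSION = (2, 218, 0)


def parse_version(text: str) -> tuple:
    """Parse the leading numeric 'X.Y.Z' part of a version string."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_patched(version_text: str) -> bool:
    """Return True if the version is at or above the fixed release."""
    return parse_version(version_text) >= FIXED_VERSION


def check_installed(package: str = "sagemaker") -> str:
    """Report whether the installed distribution is patched."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package}: not installed"
    status = "patched" if is_patched(installed) else "VULNERABLE, upgrade to >= 2.218.0"
    return f"{package} {installed}: {status}"


if __name__ == "__main__":
    print(check_installed())
```

Running this in each environment (notebook kernel, training image, CI job) matters because each can carry its own copy of the SDK.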

What does CISA's SSVC say?

Decision Track

Exploitation none

Automatable No

Technical Impact total

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

How is it classified?

Code Execution, Supply Chain, Framework, Training Data

AML.T0010.001 - AI Software

AML.T0011.000 - Unsafe AI Artifacts

AML.T0018.002 - Embed Malware

AML.T0035 - AI Artifact Collection

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act

Art. 15 - Accuracy, robustness and cybersecurity

ISO 42001

A.6.2 - AI system impact assessment

A.8.2 - Data for development and enhancement of AI systems

NIST AI RMF

MANAGE 2.2 - Mechanisms are in place to inventory AI risks and prioritize treatment

OWASP LLM Top 10

LLM03:2025 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2024-34072?

CVE-2024-34072 is a deserialization vulnerability (CWE-502) in sagemaker-python-sdk, the library for training and deploying machine learning models on Amazon SageMaker. Before v2.218.0, sagemaker.base_deserializers.NumpyDeserializer can unsafely deserialize untrusted data passed as pickled object arrays, which may lead to code execution. Upgrade to v2.218.0; if patching is delayed, enforce strict controls on who can supply pickled data to your ML infrastructure.

Is CVE-2024-34072 actively exploited?

No confirmed active exploitation of CVE-2024-34072 has been reported, but organizations should still patch proactively.

How to fix CVE-2024-34072?

1. Patch: upgrade sagemaker-python-sdk to >= 2.218.0 across all environments (notebooks, training clusters, CI/CD pipelines).
2. Audit: run `pip show sagemaker | grep Version` across all ML infrastructure, notebooks, and container images to identify unpatched installations.
3. Workaround if patching is delayed: reject all pickled numpy object arrays originating from untrusted or unverified sources at the pipeline boundary.
4. Apply integrity verification (checksums or cryptographic signatures) on all serialized model artifacts and datasets before ingestion.
5. Review IAM roles attached to SageMaker execution environments and apply least privilege to limit the blast radius if RCE occurs.

What systems are affected by CVE-2024-34072?

This vulnerability affects the following AI/ML architecture patterns: ML training pipelines, SageMaker inference endpoints, data preprocessing pipelines, MLOps artifact stores.

What is the CVSS score for CVE-2024-34072?

CVE-2024-34072 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.41%.

What is the AI security impact?

Affected AI Architectures

ML training pipelines, SageMaker inference endpoints, data preprocessing pipelines, MLOps artifact stores

MITRE ATLAS Techniques

AML.T0010.001 AI Software

AML.T0011.000 Unsafe AI Artifacts

AML.T0018.002 Embed Malware

AML.T0035 AI Artifact Collection

Compliance Controls Affected

EU AI Act: Art. 15

ISO 42001: A.6.2, A.8.2

NIST AI RMF: MANAGE 2.2

OWASP LLM Top 10: LLM03:2025

What are the technical details?

Original Advisory

sagemaker-python-sdk is a library for training and deploying machine learning models on Amazon SageMaker. The sagemaker.base_deserializers.NumpyDeserializer module before v2.218.0 allows potentially unsafe deserialization when untrusted data is passed as pickled object arrays. This consequently may allow an unprivileged third party to cause remote code execution, denial of service, affecting both confidentiality and integrity. Users are advised to upgrade to version 2.218.0. Users unable to upgrade should not pass pickled numpy object arrays which originated from an untrusted source, or that could have been tampered with. Only pass pickled numpy object arrays from trusted sources.
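
One way to apply the advisory's workaround at the pipeline boundary is to inspect a payload's header before it reaches any deserializer. The sketch below assumes the payload is in NumPy's .npy format, where object arrays carry the type code `O` in the header's `descr` field and need pickle to load. It reads only the header and never unpickles anything. The function names are illustrative and are not part of the SageMaker SDK.

```python
import ast
import struct

NPY_MAGIC = b"\x93NUMPY"


def read_npy_descr(blob: bytes):
    """Return the 'descr' header entry of an .npy payload without unpickling."""
    if len(blob) < 12 or blob[:6] != NPY_MAGIC:
        raise ValueError("not an .npy payload")
    major = blob[6]
    if major == 1:
        # Format 1.0: 2-byte little-endian header length.
        (header_len,) = struct.unpack("<H", blob[8:10])
        start = 10
    elif major in (2, 3):
        # Formats 2.0 and 3.0: 4-byte little-endian header length.
        (header_len,) = struct.unpack("<I", blob[8:12])
        start = 12
    else:
        raise ValueError(f"unsupported .npy version {major}")
    header = blob[start:start + header_len].decode("latin1")
    # The header is a Python dict literal; literal_eval does not execute code.
    return ast.literal_eval(header)["descr"]


def _descr_has_object(descr) -> bool:
    if isinstance(descr, str):
        return "O" in descr
    # Structured dtype: list of (name, descr) or (name, descr, shape) tuples.
    return any(_descr_has_object(field[1]) for field in descr)


def has_object_dtype(blob: bytes) -> bool:
    """True if loading this .npy payload would require pickle."""
    return _descr_has_object(read_npy_descr(blob))


def reject_object_arrays(blob: bytes) -> None:
    """Raise ValueError for payloads that contain object-dtype data."""
    if has_object_dtype(blob):
        raise ValueError("object-dtype array rejected: loading it would require pickle")
```

When loading with NumPy directly, `numpy.load(path, allow_pickle=False)` refuses object arrays with a `ValueError`, which gives the same protection at load time.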

Exploitation Scenario

An adversary targeting an organization's ML pipeline identifies an S3 bucket used to stage external training datasets for SageMaker jobs. Using standard Python pickle exploit tooling, they craft a malicious serialized numpy object array that executes a reverse shell payload upon deserialization. The adversary uploads this file to the staging bucket, either by compromising an upstream data provider or by exploiting a misconfigured S3 ACL. When the next training job runs and loads the dataset via NumpyDeserializer, the payload executes in the SageMaker training container, where it can exfiltrate the execution role's IAM credentials and model weights and establish persistence in the MLOps infrastructure.
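
Deserialization alone is enough to run code because a pickle stream is a small program: its opcodes can import arbitrary callables and invoke them. The standard library's `pickletools.genops` lists those opcodes without executing the stream, which is useful for triage. The sketch below is illustrative only, and the opcode set and function name are choices made for this example. Legitimately pickled NumPy object arrays also contain these opcodes, so a match means "this artifact needs pickle", not "this artifact is malicious", and an opcode scan should not be treated as a security boundary.

```python
import pickle
import pickletools

# Opcodes that import objects or call them while a pickle is being loaded.
CODE_LOADING_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "INST", "OBJ",
    "NEWOBJ", "NEWOBJ_EX", "REDUCE", "BUILD",
}


def risky_opcodes(blob: bytes) -> set:
    """Return the code-loading opcodes present in a pickle stream.

    genops only parses the stream; nothing is imported or executed.
    """
    return {op.name for op, _arg, _pos in pickletools.genops(blob)} & CODE_LOADING_OPCODES


# Plain containers and scalars need no imports to reconstruct.
plain = pickle.dumps([1, 2.0, "x"], protocol=4)
print(risky_opcodes(plain))

# A reference to any importable callable shows up as a global lookup.
by_reference = pickle.dumps(print, protocol=4)
print(risky_opcodes(by_reference))
```

An artifact that is expected to hold only numeric data but contains any of these opcodes is a strong signal to stop and investigate before loading it.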

Weaknesses (CWE)

CWE-502 — Deserialization of Untrusted Data: The product deserializes untrusted data without sufficiently ensuring that the resulting data will be valid.

[Architecture and Design, Implementation] If available, use the signing/sealing features of the programming language to assure that deserialized data has not been tainted. For example, a hash-based message authentication code (HMAC) could be used to ensure that data has not been modified.
[Implementation] When deserializing data, populate a new object rather than just deserializing. The result is that the data flows through safe input validation and that the functions are safe.
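
The HMAC mitigation mentioned above can be sketched with the Python standard library. This is a minimal example under stated assumptions: the producer and consumer share a secret key distributed out of band (for instance via a secrets manager), and the tag travels alongside the artifact. The function names are hypothetical, and key management is out of scope here.

```python
import hashlib
import hmac


def sign_artifact(data: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag over the serialized artifact."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify_artifact(data: bytes, key: bytes, expected_tag: str) -> bool:
    """Check the tag in constant time before the artifact is deserialized."""
    return hmac.compare_digest(sign_artifact(data, key), expected_tag)


# Producer side: tag the bytes that will be uploaded.
key = b"example-key-for-illustration-only"
artifact = b"serialized dataset bytes"
tag = sign_artifact(artifact, key)

# Consumer side: verify first, deserialize only if the check passes.
print(verify_artifact(artifact, key, tag))
print(verify_artifact(artifact + b"tampered", key, tag))
```

Verification must happen before any deserializer sees the bytes; a tag checked after loading offers no protection against a payload that executes during the load.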

Source: MITRE CWE corpus.