CVE-2026-31224: snorkel: RCE via unsafe model deserialization

AWAITING NVD
Published May 12, 2026
CISO Take

The snorkel library through v0.10.0 uses torch.load() without the weights_only=True safety flag in its MultitaskClassifier.load() method, allowing arbitrary Python objects to be deserialized via Python's pickle protocol and resulting in remote code execution on the loading host. Any ML pipeline that ingests snorkel model checkpoints from shared storage, model registries, or external sources faces full code execution exposure with the privileges of the training process, which typically includes access to cloud credentials, internal APIs, and sensitive training data. No public exploit or KEV entry exists yet, but pickle deserialization attacks are extensively documented and require only moderate skill to weaponize, making this a realistic threat for organizations with shared model artifact workflows. Until a patched release above v0.10.0 is available, restrict model loading to cryptographically verified internal sources and run picklescan or fickling over all model files before ingestion.

Sources: NVD, MITRE ATLAS

Risk Assessment

Medium-High risk. Insecure deserialization via Python's pickle protocol is a mature, well-understood attack class with published exploit tooling, lowering the bar for weaponization once a malicious model file can be delivered. The vulnerability grants full arbitrary code execution on the ML host, a high-impact outcome in enterprise environments where training infrastructure often holds elevated credentials and access to downstream systems. The absence of CVSS scoring reflects recency rather than low severity. Organizations running automated MLOps pipelines that load snorkel models without artifact integrity checks are the highest-exposure population.

Attack Kill Chain

Artifact Staging
Adversary crafts a malicious snorkel MultitaskClassifier model file embedding a Python pickle payload designed to execute arbitrary code upon deserialization.
AML.T0018.002
Supply Chain Delivery
Malicious model file is uploaded to a shared model registry, S3 bucket, or artifact repository accessible to the target ML pipeline, replacing or supplementing a legitimate checkpoint.
AML.T0010.003
Execution Trigger
Victim's training pipeline or data scientist calls MultitaskClassifier.load() on the malicious file, triggering torch.load() deserialization without weights_only restriction and executing the pickle payload.
AML.T0011.000
System Compromise
Pickle payload executes with ML process privileges, enabling credential theft, data exfiltration, reverse shell establishment, or lateral movement into connected ML infrastructure and data stores.
AML.T0072

Severity & Risk

CVSS 3.1
N/A
EPSS
N/A
Exploitation Status
No known exploitation
Sophistication
Moderate

Recommended Action

6 steps
  1. Upgrade snorkel to a version above v0.10.0 once a patched release is published; monitor the snorkel-team/snorkel GitHub repository for security advisories.

  2. Audit all torch.load() calls across your ML codebase and add weights_only=True to every call that does not require full object deserialization.

  3. Enforce SHA-256 hash or cryptographic signature verification on all model artifact files as part of your model registry intake process before loading.

  4. Integrate picklescan or fickling into your CI/CD pipeline to scan model files for malicious pickle payloads at artifact registration time.

  5. Restrict model loading to internally controlled, access-audited storage — do not allow training jobs to load models directly from public URLs or unverified external registries.

  6. Monitor ML training hosts for anomalous subprocess spawns, unexpected outbound network connections, or credential access events that could indicate post-exploitation activity.
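The audit in step 2 can be partially automated. The sketch below uses Python's standard ast module to flag torch.load() calls that do not pass weights_only=True; the helper name find_unsafe_torch_load is illustrative, and the check is a heuristic (it only matches the literal torch.load spelling, not aliases or wrappers).

```python
import ast

def find_unsafe_torch_load(source: str) -> list[int]:
    """Return line numbers of torch.load(...) calls that do not pass
    weights_only=True. Heuristic: matches only the literal name torch.load."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        fn = node.func
        if (isinstance(fn, ast.Attribute) and fn.attr == "load"
                and isinstance(fn.value, ast.Name) and fn.value.id == "torch"):
            # The call is safe only if weights_only=True appears explicitly.
            safe = any(
                kw.arg == "weights_only"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
                for kw in node.keywords
            )
            if not safe:
                findings.append(node.lineno)
    return findings
```

Running this over a repository's .py files in CI gives a quick inventory of deserialization call sites that need review.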
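The hash verification in step 3 needs nothing beyond the standard library. A minimal sketch, assuming the expected SHA-256 digest was recorded at artifact registration time (verify_artifact is a hypothetical helper name):

```python
import hashlib

def verify_artifact(path: str, expected_sha256: str) -> None:
    """Raise ValueError unless the file's SHA-256 digest matches the
    digest recorded when the artifact entered the registry."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash in 1 MiB chunks so large checkpoints do not load into memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    if h.hexdigest() != expected_sha256.lower():
        raise ValueError(f"hash mismatch for {path}: got {h.hexdigest()}")
```

Call verify_artifact() immediately before any MultitaskClassifier.load(), and fail closed on mismatch.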
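To illustrate what the scanners in step 4 look for: a pickle stream names every module-level callable it will resolve at load time, and those references can be enumerated with the standard pickletools module without ever unpickling. This is a minimal heuristic sketch, not a substitute for picklescan or fickling; list_pickle_globals is a hypothetical helper name.

```python
import pickletools

def list_pickle_globals(data: bytes) -> list[tuple[str, str]]:
    """List every (module, name) global referenced by a pickle stream,
    without unpickling it. In a model file, anything outside an expected
    allowlist (e.g. torch internals) is suspect."""
    found, strings = [], []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in ("SHORT_BINUNICODE", "BINUNICODE", "UNICODE"):
            strings.append(arg)
        elif opcode.name == "GLOBAL":
            # Protocols <= 3: arg is "module name" in one string.
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocols >= 4: module and name were pushed as strings.
            found.append((strings[-2], strings[-1]))
    return found
```

A benign tensor-only checkpoint should reference little beyond torch and collections; names like os.system, builtins.eval, or subprocess.Popen are red flags.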
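Step 5's source restriction can be enforced with a simple allowlist gate in the pipeline code that fetches checkpoints. A sketch, assuming a single internal registry host (the hostname and helper name are placeholders for your environment):

```python
from urllib.parse import urlparse

# Assumption: replace with your organization's audited registry host(s).
TRUSTED_MODEL_HOSTS = {"models.internal.example.com"}

def is_trusted_model_url(url: str) -> bool:
    """Accept only HTTPS URLs on an explicitly allowlisted internal host."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in TRUSTED_MODEL_HOSTS
```

Training jobs should refuse to fetch any checkpoint URL for which this returns False, rather than warning and proceeding.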

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.2 - AI system lifecycle management
A.9.4 - Security of AI systems
NIST AI RMF
MANAGE 2.2 - Risk response for AI supply chain
OWASP LLM Top 10
LLM05:2025 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2026-31224?

The snorkel library through v0.10.0 uses torch.load() without the weights_only=True safety flag in its MultitaskClassifier.load() method, allowing arbitrary Python objects to be deserialized via Python's pickle protocol and resulting in remote code execution on the loading host. Any ML pipeline that ingests snorkel model checkpoints from shared storage, model registries, or external sources faces full code execution exposure with the privileges of the training process, which typically includes access to cloud credentials, internal APIs, and sensitive training data. No public exploit or KEV entry exists yet, but pickle deserialization attacks are extensively documented and require only moderate skill to weaponize, making this a realistic threat for organizations with shared model artifact workflows. Until a patched release above v0.10.0 is available, restrict model loading to cryptographically verified internal sources and run picklescan or fickling over all model files before ingestion.

Is CVE-2026-31224 actively exploited?

No confirmed active exploitation of CVE-2026-31224 has been reported, but organizations should still patch proactively.

How to fix CVE-2026-31224?

1. Upgrade snorkel to a version above v0.10.0 once a patched release is published; monitor the snorkel-team/snorkel GitHub repository for security advisories.
2. Audit all torch.load() calls across your ML codebase and add weights_only=True to every call that does not require full object deserialization.
3. Enforce SHA-256 hash or cryptographic signature verification on all model artifact files as part of your model registry intake process before loading.
4. Integrate picklescan or fickling into your CI/CD pipeline to scan model files for malicious pickle payloads at artifact registration time.
5. Restrict model loading to internally controlled, access-audited storage — do not allow training jobs to load models directly from public URLs or unverified external registries.
6. Monitor ML training hosts for anomalous subprocess spawns, unexpected outbound network connections, or credential access events that could indicate post-exploitation activity.

What systems are affected by CVE-2026-31224?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, weak supervision workflows, MLOps pipelines, model serving, data labeling pipelines.

What is the CVSS score for CVE-2026-31224?

No CVSS score has been assigned yet.

Technical Details

NVD Description

The snorkel library through v0.10.0 contains an insecure deserialization vulnerability (CWE-502) in the MultitaskClassifier.load() method of the MultitaskClassifier class. The method loads model weight files using torch.load() without enabling the security-restrictive weights_only=True parameter. This default behavior allows the deserialization of arbitrary Python objects via the pickle module. A remote attacker can exploit this by providing a maliciously crafted model file, leading to arbitrary code execution on the victim's system when the file is loaded via the vulnerable method.

Exploitation Scenario

An adversary with write access to a shared model artifact store — obtained via compromised CI/CD credentials, a misconfigured S3 bucket policy, or a malicious insider — uploads a crafted snorkel MultitaskClassifier model file containing an embedded pickle payload. The payload is disguised as a legitimate model checkpoint and placed in a path expected by an automated training pipeline. When the pipeline calls MultitaskClassifier.load() during a scheduled training run, torch.load() deserializes the file without restriction, executing the payload with the ML process's permissions. The attacker receives a reverse shell or exfiltrated cloud credentials, establishing a foothold in the ML infrastructure with access to training data and downstream production systems.
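The mechanism behind this scenario is the pickle protocol's __reduce__ hook, which lets a serialized object name an arbitrary callable to invoke at load time. The harmless stand-in below substitutes eval on arithmetic where a real payload would name os.system or similar; it is a sketch of the CWE-502 mechanism, not the actual exploit.

```python
import pickle

class MaliciousCheckpoint:
    """Stand-in for a crafted model file. __reduce__ tells the unpickler
    which callable to run at load time; a real payload would name
    os.system or similar, but here it is harmless arithmetic via eval."""
    def __reduce__(self):
        return (eval, ("2 + 2",))

blob = pickle.dumps(MaliciousCheckpoint())   # what an attacker would upload
result = pickle.loads(blob)                  # "loading the model" runs eval
```

After loading, result is the integer 4, not a MaliciousCheckpoint: deserialization executed attacker-chosen code, which is exactly what torch.load() permits internally when weights_only is not set.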

Timeline

Published
May 12, 2026
Last Modified
May 12, 2026
First Seen
May 12, 2026

Related Vulnerabilities