CVE-2026-31214: torch-checkpoint: unsafe pickle deserialization RCE
CRITICAL | CISA: TRACK*
The ml-engineering project's torch-checkpoint-shrink.py utility loads PyTorch checkpoint files using torch.load() without the weights_only=True guard, so an embedded pickle payload executes arbitrary Python code the moment a checkpoint is processed. This is a classic unsafe deserialization pattern (CWE-502) applied to the ML supply chain: any engineer who runs this script against an untrusted or attacker-supplied .pt file hands over their workstation or CI runner to the attacker. EPSS is low (0.49%) and no public exploit exists, but the attack primitive is trivially reproducible: crafting a malicious pickle payload for torch.load() is well-documented and has powered prior ML-targeted attacks. Immediately audit all uses of this script, then either apply the one-line fix of setting weights_only=True in the torch.load() call or migrate checkpoint workflows to the safetensors format, which is pickle-free by design.
What is the risk?
Risk is HIGH for any team that has this script in a training pipeline or model management workflow, even though the CVE has no KEV listing. Exploitation is trivial: crafting a malicious PyTorch checkpoint is a solved problem with public tooling (e.g., fickling). The blast radius is limited to users who actively invoke the script, but those users are typically ML engineers or automated CI/CD jobs running with elevated permissions. The CVSS v3.1 base score of 9.8 is consistent with this assessment. The attack surface expands significantly in multi-tenant environments, shared model registries, or any pipeline that ingests third-party checkpoints.
What should I do?
1. Immediate: Replace torch.load(f) with torch.load(f, weights_only=True) in torch-checkpoint-shrink.py at line 57. This restricts unpickling to tensors, primitive types, and other allowlisted types, which blocks arbitrary code execution through pickle.
2. Short-term: Audit all other scripts in your codebase and pipelines for torch.load() calls lacking weights_only=True; use grep -r 'torch.load' to enumerate them.
3. Structural: Migrate checkpoint storage to safetensors format (pip install safetensors), which is pickle-free, memory-safe, and faster to load.
4. Detection: Deploy fickling (pip install fickling) in CI to scan .pt files for suspicious pickle opcodes before loading.
5. Policy: Enforce code review gates on any script that deserializes model files from external or shared sources.
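The audit step can be made stricter than a plain grep, which cannot tell whether a given call already passes weights_only=True. The following is a minimal sketch using only Python's standard ast module; the function name find_unsafe_torch_load is hypothetical, and the check only recognizes the literal torch.load(...) spelling, so aliased imports such as "from torch import load" would be missed.

```python
import ast


def find_unsafe_torch_load(source: str, filename: str = "<string>") -> list[int]:
    """Return line numbers of torch.load(...) calls that lack weights_only=True."""
    unsafe = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Only the attribute form `torch.load(...)` is matched (assumption:
        # the codebase imports the package as `torch`).
        is_torch_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "torch"
        )
        if not is_torch_load:
            continue
        # A call counts as safe only if it passes the literal keyword
        # weights_only=True; a variable or False is reported for review.
        safe = any(
            kw.arg == "weights_only"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if not safe:
            unsafe.append(node.lineno)
    return sorted(unsafe)
```

Running this over each .py file in a repository (for example via pathlib.Path.rglob) gives a list of call sites to review by hand. Calls that pass weights_only through a variable are reported too, which is intentional: they need a human decision.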
What does CISA's SSVC say?
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
What is CVE-2026-31214?
CVE-2026-31214 is an unsafe deserialization vulnerability (CWE-502) in the ml-engineering project's torch-checkpoint-shrink.py utility. The script loads PyTorch checkpoint files using torch.load() without the weights_only=True guard, so an embedded pickle payload executes arbitrary Python code the moment a checkpoint is processed. Any engineer who runs the script against an untrusted or attacker-supplied .pt file hands over their workstation or CI runner to the attacker.
Is CVE-2026-31214 actively exploited?
No confirmed active exploitation of CVE-2026-31214 has been reported, but organizations should still patch proactively.
How to fix CVE-2026-31214?
1. Immediate: Replace torch.load(f) with torch.load(f, weights_only=True) in torch-checkpoint-shrink.py at line 57. This restricts unpickling to tensors, primitive types, and other allowlisted types, which blocks arbitrary code execution through pickle.
2. Short-term: Audit all other scripts in your codebase and pipelines for torch.load() calls lacking weights_only=True; use grep -r 'torch.load' to enumerate them.
3. Structural: Migrate checkpoint storage to safetensors format (pip install safetensors), which is pickle-free, memory-safe, and faster to load.
4. Detection: Deploy fickling (pip install fickling) in CI to scan .pt files for suspicious pickle opcodes before loading.
5. Policy: Enforce code review gates on any script that deserializes model files from external or shared sources.
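To illustrate what the detection step looks for, here is a rough sketch that lists the globals a pickle stream would import, without ever unpickling it. It uses only the standard pickletools and zipfile modules and is a simplified stand-in for a dedicated scanner such as fickling, not a replacement. The names imported_globals and scan_checkpoint are hypothetical, the STACK_GLOBAL handling is a heuristic that can be evaded by a crafted stream, and the assumption that a .pt file is a zip archive holding one or more .pkl members applies to PyTorch's zip-based format only.

```python
import pickletools
import zipfile


def imported_globals(pickle_bytes: bytes) -> set[str]:
    """List 'module.name' globals referenced by a pickle stream, without loading it."""
    found = set()
    recent_strings = []  # string arguments seen so far, used for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(pickle_bytes):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            found.add(arg.replace(" ", ".", 1))
        elif opcode.name == "STACK_GLOBAL":
            # Heuristic: module and name are usually the two most recent
            # strings pushed before this opcode.
            if len(recent_strings) >= 2:
                found.add(f"{recent_strings[-2]}.{recent_strings[-1]}")
            else:
                found.add("<unresolved>")
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return found


def scan_checkpoint(file, allowed) -> dict[str, set[str]]:
    """Map each embedded .pkl member to the globals that are not on the allowlist."""
    report = {}
    with zipfile.ZipFile(file) as archive:
        for name in archive.namelist():
            if name.endswith(".pkl"):
                unexpected = imported_globals(archive.read(name)) - set(allowed)
                if unexpected:
                    report[name] = unexpected
    return report
```

A CI job could fail the build whenever scan_checkpoint returns a non-empty report. The allowlist has to be built from checkpoints you already trust, since legitimate PyTorch pickles also reference a handful of globals for rebuilding tensors.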
What systems are affected by CVE-2026-31214?
This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, MLOps CI/CD, model registries.
What is the CVSS score for CVE-2026-31214?
CVE-2026-31214 has a CVSS v3.1 base score of 9.8 (CRITICAL). The EPSS exploitation probability is 0.49%.
What is the AI security impact?
MITRE ATLAS Techniques
- AML.T0010.003 Model
- AML.T0011.000 Unsafe AI Artifacts
- AML.T0018.002 Embed Malware
- AML.T0058 Publish Poisoned Models
What are the technical details?
Original Advisory
The torch-checkpoint-shrink.py script in the ml-engineering project in commit 0099885db36a8f06556efe1faf552518852cb1e0 (2025-20-27) contains an insecure deserialization vulnerability (CWE-502). The script uses torch.load() to process PyTorch checkpoint files (.pt) without enabling the security-restrictive weights_only=True parameter. This oversight allows the deserialization of arbitrary Python objects via the pickle module. A remote attacker can exploit this by providing a maliciously crafted checkpoint file, leading to arbitrary code execution in the context of the user running the script.
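The mechanism behind the advisory can be shown with the standard pickle module alone. In the sketch below, a class's __reduce__ method tells pickle which callable to invoke at load time; the callable is a harmless print, where an attacker would name something like a shell command runner. The second half shows an allowlisting unpickler in the style of the "Restricting Globals" pattern from the Python documentation. It is a conceptual analogy for what weights_only=True achieves, not PyTorch's actual implementation, and the names Payload, AllowlistUnpickler, and restricted_loads are hypothetical.

```python
import io
import pickle


class Payload:
    """Stand-in for a malicious object; the callable used here is harmless."""

    def __reduce__(self):
        # pickle records "call print('payload executed')" and the loader
        # performs that call while deserializing.
        return (print, ("payload executed",))


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses to import any global not explicitly allowed."""

    ALLOWED = {("collections", "OrderedDict")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Deserialize bytes, rejecting any global outside the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()
```

Calling pickle.loads(pickle.dumps(Payload())) runs the print call during loading, while restricted_loads on the same bytes raises pickle.UnpicklingError before anything is called. Plain containers and numbers still load, because they need no globals at all.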
Exploitation Scenario
An adversary targets an ML engineer at a financial institution running distributed LLM fine-tuning. The attacker publishes a poisoned PyTorch checkpoint on a public model hub, mimicking a popular base model. The victim's pipeline uses torch-checkpoint-shrink.py to reduce checkpoint size before uploading to the internal model registry. When the script calls torch.load() on the attacker's file, the embedded pickle payload executes: it establishes a reverse shell to an attacker-controlled server, exfiltrates the SSH key from ~/.ssh/, and installs a persistent backdoor in site-packages. The attacker now has credentials to the internal training cluster and access to proprietary model weights.
Weaknesses (CWE)
CWE-502 — Deserialization of Untrusted Data: The product deserializes untrusted data without sufficiently ensuring that the resulting data will be valid.
- [Architecture and Design, Implementation] If available, use the signing/sealing features of the programming language to assure that deserialized data has not been tainted. For example, a hash-based message authentication code (HMAC) could be used to ensure that data has not been modified.
- [Implementation] When deserializing data, populate a new object rather than just deserializing. The result is that the data flows through safe input validation and that the functions are safe.
Source: MITRE CWE corpus.
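The HMAC mitigation mentioned above could look roughly like the following sketch, which tags the raw checkpoint bytes and verifies the tag before any deserialization takes place. The function names sign_checkpoint and verify_checkpoint are hypothetical, and how the key is stored and distributed is left out; that part is an assumption the deployment has to supply.

```python
import hashlib
import hmac


def sign_checkpoint(data: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over the raw checkpoint bytes."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify_checkpoint(data: bytes, key: bytes, expected_tag: str) -> bool:
    """Check the tag in constant time; call this before loading the file."""
    return hmac.compare_digest(sign_checkpoint(data, key), expected_tag)
```

A valid tag only shows that the file was produced by someone holding the key and has not changed since. It says nothing about whether the original contents were safe, so it complements weights_only=True rather than replacing it.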
CVSS Vector
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
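For readers who want to see how this vector maps to a base score, here is a small sketch of the CVSS v3.1 base score formula with the metric weights from the specification. It handles base metrics only (no temporal or environmental metrics), and the function names are hypothetical.

```python
import math

AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score from a vector string."""
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    scope_changed = metrics["S"] == "C"
    pr = (PR_CHANGED if scope_changed else PR_UNCHANGED)[metrics["PR"]]

    # Impact sub-score from confidentiality, integrity, availability.
    iss = 1 - (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    if scope_changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    exploitability = 8.22 * AV[metrics["AV"]] * AC[metrics["AC"]] * pr * UI[metrics["UI"]]
    if impact <= 0:
        return 0.0
    if scope_changed:
        return roundup(min(1.08 * (impact + exploitability), 10))
    return roundup(min(impact + exploitability, 10))
```

For the vector above, the impact sub-score is about 5.87 and exploitability about 3.89, which sum to roughly 9.76 and round up to the 9.8 reported for this CVE.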
Related Vulnerabilities
- CVE-2025-59528 (10.0) Flowise: Unauthenticated RCE via MCP config injection. Same attack type: Supply Chain
- CVE-2024-2912 (10.0) BentoML: RCE via insecure deserialization. Same attack type: Supply Chain
- CVE-2023-3765 (10.0) MLflow: path traversal allows arbitrary file read. Same attack type: Supply Chain
- CVE-2025-5120 (10.0) smolagents: sandbox escape enables unauthenticated RCE. Same attack type: Supply Chain
- CVE-2026-21858 (10.0) n8n: Input Validation flaw enables exploitation. Same attack type: Code Execution