CVE-2026-31232: CosyVoice: RCE via unsafe torch.load() model deserialization

AWAITING NVD
Published May 12, 2026
CISO Take

CosyVoice, an open-source AI voice synthesis framework, loads PyTorch model files with torch.load() without the weights_only=True safeguard. A crafted .pt file can therefore deserialize arbitrary Python objects through pickle, giving an attacker code execution on the loading host. Any team using CosyVoice, whether in an internal research environment or a web-facing product, is exposed as soon as it loads a model directory from an untrusted source, and the CosyVoice web interface makes this vector reachable without deep technical knowledge. Pickle payloads are trivially reproducible with publicly available tools, no patch exists at time of disclosure, and no CVSS score has been assigned, so standard vulnerability feeds may not flag the issue for risk management. Until an upstream fix incorporating weights_only=True lands, teams should restrict CosyVoice to trusted model sources, sandbox deployments, and inspect .pt files with fickling or picklescan before loading.

Sources: NVD, ATLAS

Risk Assessment

HIGH risk despite absent CVSS metadata. CWE-502 (Insecure Deserialization) in ML model loading is a well-understood, trivially exploitable class — adversaries need only craft a malicious .pt file using standard Python tooling. Blast radius scales with deployment context: isolated local research is lower risk, but any web-exposed CosyVoice instance or pipeline accepting model paths from external input is critically exposed. The attack fires at model load time, before any inference or authentication occurs, and no compensating controls exist in the current codebase.

Attack Kill Chain

Artifact Preparation
Adversary crafts a malicious .pt PyTorch model file containing a serialized Python object with an embedded reverse shell or arbitrary payload using standard pickle exploit tooling.
AML.T0018.002
Delivery
Adversary delivers the malicious model directory to the victim via a shared storage location, phishing archive, or compromised model repository targeting CosyVoice users who load external models.
AML.T0011.000
Exploitation
Victim loads the malicious model directory via --model_dir or the web interface; torch.load() deserializes the pickle payload without restriction, executing arbitrary code on the host.
AML.T0049
Impact
Adversary achieves remote code execution with the privileges of the CosyVoice process, enabling data exfiltration, lateral movement within ML infrastructure, or persistent backdoor installation.
AML.T0072

Severity & Risk

CVSS 3.1
N/A
EPSS
N/A
Exploitation Status
No known exploitation
Sophistication
Trivial

Recommended Action

  1. Immediate: Stop loading CosyVoice model files from untrusted or external sources until patched.

  2. If web interface is exposed, restrict access to authenticated internal users only or take offline.

  3. Await an upstream patch that adds weights_only=True to all torch.load() calls in the codebase; pin to a vetted commit in the meantime.

  4. Run CosyVoice in a sandboxed environment (Docker with dropped capabilities, no network egress, read-only filesystem where possible) to contain exploitation impact.

  5. Detection: use fickling or picklescan to scan .pt model files before loading; monitor for unexpected child processes spawned by the CosyVoice process and anomalous outbound connections from model-loading jobs.
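
Steps 1 and 3 can be approximated locally while waiting for an upstream fix. The sketch below is not CosyVoice code: `TRUSTED_MODEL_ROOT`, `is_trusted_path`, and `load_checkpoint_safely` are hypothetical names, and the directory is a placeholder. It assumes a PyTorch release whose `torch.load()` accepts the `weights_only` argument.

```python
from pathlib import Path

# Hypothetical location for vetted model directories; adjust to your deployment.
TRUSTED_MODEL_ROOT = Path("/opt/models/vetted")


def is_trusted_path(candidate, root=TRUSTED_MODEL_ROOT):
    """Return True only if candidate resolves to a location inside root."""
    resolved = Path(candidate).resolve()  # collapses ".." and follows symlinks
    try:
        resolved.relative_to(Path(root).resolve())
    except ValueError:
        return False
    return True


def load_checkpoint_safely(path, root=TRUSTED_MODEL_ROOT):
    """Load a .pt file with the restricted unpickler, refusing untrusted paths."""
    if not is_trusted_path(path, root):
        raise PermissionError(f"refusing to load model outside {root}: {path}")
    import torch  # imported lazily so the path check works without PyTorch
    # weights_only=True restricts unpickling to tensors, primitive types,
    # dictionaries, and explicitly allowlisted types.
    return torch.load(path, map_location="cpu", weights_only=True)
```

A checkpoint that legitimately depends on arbitrary pickled objects will fail to load this way. Given the issue described here, that failure is a useful signal rather than something to work around.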

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.1.2 - AI system security by design
A.9.3 - Protection of AI system inputs
NIST AI RMF
MANAGE-2.2 - Risk treatment for AI system vulnerabilities
OWASP LLM Top 10
LLM05 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2026-31232?

CVE-2026-31232 is an insecure deserialization vulnerability (CWE-502) in CosyVoice, an open-source AI voice synthesis framework. CosyVoice loads PyTorch model files with torch.load() without weights_only=True, so a crafted .pt file in a loaded model directory can execute arbitrary code on the host. No patch exists at time of disclosure, and no CVSS score has been assigned.

Is CVE-2026-31232 actively exploited?

No confirmed active exploitation of CVE-2026-31232 has been reported. Because no patch is available yet, organizations should proactively apply the mitigations listed under Recommended Action.

How to fix CVE-2026-31232?

  1. Immediate: Stop loading CosyVoice model files from untrusted or external sources until patched.

  2. If web interface is exposed, restrict access to authenticated internal users only or take offline.

  3. Await an upstream patch that adds weights_only=True to all torch.load() calls in the codebase; pin to a vetted commit in the meantime.

  4. Run CosyVoice in a sandboxed environment (Docker with dropped capabilities, no network egress, read-only filesystem where possible) to contain exploitation impact.

  5. Detection: use fickling or picklescan to scan .pt model files before loading; monitor for unexpected child processes spawned by the CosyVoice process and anomalous outbound connections from model-loading jobs.
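
The pre-load scanning idea can be illustrated with the standard library alone. The sketch below is a rough heuristic, not a replacement for fickling or picklescan, and it does not reproduce their logic. It assumes the .pt file is either a zip archive with pickle entries ending in `.pkl` or a single raw pickle stream. `SUSPICIOUS_MODULES`, `imported_globals`, and `scan_checkpoint` are hypothetical names, and the denylist is illustrative only. The file is never unpickled: `pickletools.genops` only decodes opcodes.

```python
import pickletools
import zipfile

# Illustrative denylist, not exhaustive; real scanners maintain larger lists.
SUSPICIOUS_MODULES = {
    "os", "posix", "nt", "subprocess", "builtins", "__builtin__",
    "sys", "socket", "runpy",
}


def imported_globals(pickle_bytes):
    """Yield (module, name) pairs that the first pickle in the stream would import."""
    recent_strings = []  # string operands seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(pickle_bytes):
        if opcode.name == "GLOBAL":
            module, _, name = arg.partition(" ")
            yield module, name
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            # Heuristic: module and name are usually the two preceding strings.
            yield recent_strings[-2], recent_strings[-1]
        elif isinstance(arg, str):
            recent_strings.append(arg)


def scan_checkpoint(path):
    """Return sorted suspicious 'module.name' imports found in a checkpoint file."""
    streams = []
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            for entry in archive.namelist():
                if entry.endswith(".pkl"):
                    streams.append(archive.read(entry))
    else:
        with open(path, "rb") as handle:
            streams.append(handle.read())
    findings = set()
    for data in streams:
        for module, name in imported_globals(data):
            if module.split(".")[0] in SUSPICIOUS_MODULES:
                findings.add(f"{module}.{name}")
    return sorted(findings)
```

An empty result does not prove a file is safe, since denylists can be bypassed and only the first pickle in a raw stream is inspected. A non-empty result, or an exception while decoding, is a reason not to load the file.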

What systems are affected by CVE-2026-31232?

This vulnerability affects the following AI/ML architecture patterns: model serving, training pipelines, MLOps pipelines, voice synthesis deployments, web-based AI inference endpoints.

What is the CVSS score for CVE-2026-31232?

No CVSS score has been assigned yet.

Technical Details

NVD Description

The CosyVoice project through commit 6e01309e01bc93bbeb83bdd996b1182a81aaf11e (2025-30-21) contains an insecure deserialization vulnerability (CWE-502) in its model loading process. When loading model files (.pt) from a user-specified directory (via the --model_dir argument), the code uses torch.load() without the security-restrictive weights_only=True parameter. This allows the deserialization of arbitrary Python objects via the Pickle module. An attacker can exploit this by providing a maliciously crafted model directory containing .pt files with embedded pickle payloads. When a victim loads this directory using CosyVoice's web interface, the malicious payload is executed, leading to remote code execution on the victim's system.
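
The underlying mechanism can be shown without PyTorch. In the sketch below a harmless object makes the unpickler call a function of the object's choosing, and an allowlisting `Unpickler` subclass, following the `find_class` pattern in Python's pickle documentation, refuses it. `Payload`, `ALLOWED_GLOBALS`, and `AllowlistUnpickler` are hypothetical names. This is conceptually similar to what `weights_only=True` does but is not PyTorch's implementation.

```python
import io
import pickle


class Payload:
    """Object whose pickle instructs the loader to call a function on load."""

    def __reduce__(self):
        # (callable, args): the loader runs callable(*args) and returns the result.
        # len() is a benign stand-in for an attacker-chosen callable.
        return (len, ("harmless",))


# Hypothetical allowlist: only these (module, name) pairs may be resolved.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses to import anything outside ALLOWED_GLOBALS."""

    def find_class(self, module, name):
        if (module, name) not in ALLOWED_GLOBALS:
            raise pickle.UnpicklingError(f"blocked global: {module}.{name}")
        return super().find_class(module, name)


blob = pickle.dumps(Payload())

# Unrestricted load: the callable named in the pickle runs during loading,
# so the result is the return value of len(), not a Payload instance.
result = pickle.loads(blob)

# Restricted load: resolving builtins.len is rejected before anything is called.
try:
    AllowlistUnpickler(io.BytesIO(blob)).load()
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Here `result` holds the return value of the call made during loading, and `blocked` is `True` because the restricted unpickler refuses to resolve the callable.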

Exploitation Scenario

An adversary crafts a malicious PyTorch .pt file containing a serialized Python object that executes a reverse shell payload upon deserialization. They host this file in a directory they control — a shared NFS mount, a compromised model registry, or a phishing-delivered ZIP archive mimicking a legitimate CosyVoice model release. When a data scientist or MLOps engineer points CosyVoice's --model_dir to this directory, or a web-facing CosyVoice instance loads it via its interface, torch.load() invokes Python's pickle.load() internally without restriction, executing the embedded payload and granting the adversary code execution on the host. From there, the attacker pivots to lateral movement within the ML infrastructure or exfiltrates training data and credentials.

Timeline

Published
May 12, 2026
Last Modified
May 12, 2026
First Seen
May 12, 2026

Related Vulnerabilities