CVE-2026-31251: CosyVoice: RCE via unsafe torch.load() deserialization

Severity: UNKNOWN
Published May 11, 2026
CISO Take

CosyVoice's gRPC server loads PyTorch model files with torch.load() without the weights_only=True safety parameter, so a crafted model file can run arbitrary Python code through pickle at server startup. Any deployment where an attacker can influence the model directory (through a compromised model repository, a shared model store, or supply chain tampering) is exposed to RCE that requires no authentication and triggers before the service handles a single request. No active exploitation or CISA KEV listing exists yet, but the PyTorch pickle deserialization attack class is well documented, and public tooling such as fickling makes crafting a payload trivial for motivated actors. Organizations using CosyVoice in voice synthesis or multi-modal AI pipelines should pin to a post-fix commit, scan model files with picklescan before loading, and isolate gRPC server processes with least-privilege filesystem access.

Sources: NVD, ATLAS

What is the risk?

High severity despite the absence of a CVSS score. The attack requires placing a malicious model file in the target directory, but that threshold is low in AI/ML environments where teams routinely pull models from public repositories without integrity verification. Because the code runs at server initialization, no authentication layer is involved: the damage is done before the service processes a single request. The pickle deserialization attack pattern is well documented and public tooling for crafting payloads is available, so the attacker sophistication required is trivial.

How does the attack unfold?

  1. Model Staging (AML.T0002.001): Adversary creates a malicious PyTorch model file with an embedded pickle payload and places it in a model directory accessible to the victim via a public model hub, shared storage, or CI/CD artifact store.

  2. Unsafe Deserialization (AML.T0011.000): Victim starts the CosyVoice gRPC server pointing to the malicious model directory; torch.load() deserializes the file without weights_only=True, triggering the embedded pickle payload.

  3. Code Execution (AML.T0018.002): Pickle payload executes arbitrary Python code with the gRPC server process's privileges at startup, before any authentication or request handling layer is active.

  4. Lateral Movement (AML.T0025): Attacker uses the RCE foothold to pivot across ML infrastructure: harvesting environment secrets, accessing model storage volumes, and reaching adjacent training or serving services.
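
The deserialization and code-execution steps can be reproduced harmlessly with Python's standard pickle module, which torch.load() relies on when weights_only=True is not set. The sketch below is illustrative only: the Payload class and the arithmetic expression are hypothetical stand-ins for an attacker's payload, and no PyTorch or CosyVoice code is involved.

```python
import pickle


class Payload:
    """Hypothetical stand-in for an object hidden inside a model file."""

    def __reduce__(self):
        # pickle records "call eval('40 + 2')" and replays that call on load.
        # A real attacker would substitute a shell command here.
        return (eval, ("40 + 2",))


blob = pickle.dumps(Payload())  # bytes an attacker would ship
result = pickle.loads(blob)     # what an unsafe loader does at startup

# The loader never receives a Payload instance; it receives whatever
# the attacker-chosen call returned.
print(type(result).__name__, result)
```

The loaded value is the integer 42 rather than a Payload object, which shows that the callable ran as a side effect of loading, with no method call or request needed afterwards.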

How severe is it?

CVSS 3.1: N/A
EPSS: 0.2% chance of exploitation in 30 days (higher than 12% of all CVEs)
Exploitation Status: No known exploitation
Sophistication: Trivial

What should I do?

6 steps
  1. Upgrade: pin to a commit that applies weights_only=True to all torch.load() calls and verify the fix is present in the gRPC server initialization path.

  2. Scan all model files with picklescan or fickling before loading to detect embedded pickle payloads.

  3. Restrict gRPC server processes to isolated containers with read-only, cryptographically verified model volumes.

  4. Audit all model directory paths for untrusted content and enforce model file integrity via hash verification in deployment pipelines.

  5. Apply least-privilege filesystem access — the server process should not have write access to model directories or broader system paths.

  6. Monitor process spawning at server startup for unexpected child processes indicative of payload execution.
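
Steps 3 and 4 depend on knowing that the files in the model directory are exactly the ones that were reviewed. A minimal sketch of that check, using only the standard library, follows. The function names and the manifest format (relative path mapped to a SHA-256 hex digest) are assumptions for illustration, not part of CosyVoice.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints are not read into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def list_files(model_dir):
    """Return relative POSIX-style paths of every file under model_dir."""
    root = Path(model_dir)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def build_manifest(model_dir):
    """Record a digest for each file; run this on a reviewed, trusted copy."""
    root = Path(model_dir)
    return {name: sha256_file(root / name) for name in sorted(list_files(root))}


def verify_model_dir(model_dir, manifest):
    """Return a list of problems; an empty list means the directory matches."""
    root = Path(model_dir)
    problems = []
    # Files that were never reviewed are as dangerous as modified ones.
    for name in sorted(list_files(root) - set(manifest)):
        problems.append(f"unexpected file: {name}")
    for name, expected in sorted(manifest.items()):
        path = root / name
        if not path.is_file():
            problems.append(f"missing file: {name}")
        elif sha256_file(path) != expected:
            problems.append(f"hash mismatch: {name}")
    return problems
```

A deployment script could call verify_model_dir() before launching the server and refuse to start if the list is non-empty. A matching hash only shows that the files are the ones that were recorded; it says nothing about whether those files were safe when the manifest was built, so the manifest should come from a scanned, trusted copy.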

What does CISA's SSVC say?

Decision: Track
Exploitation: none
Automatable: No
Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act: Art. 15 - Accuracy, robustness and cybersecurity; Art. 9 - Risk management system
ISO 42001: 8.4 - AI system documentation and risk assessment; A.6.2.6 - AI system security
NIST AI RMF: GOVERN 1.7 - Processes for AI risk management; MANAGE 2.2 - AI system security and resilience
OWASP LLM Top 10: LLM05:2025 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2026-31251?

CVE-2026-31251 is an insecure deserialization vulnerability (CWE-502) in the CosyVoice gRPC server. At startup the server loads model files with torch.load() without weights_only=True, so a malicious model file placed in the model directory executes arbitrary Python code with the server process's privileges, before any authentication or request handling takes place.

Is CVE-2026-31251 actively exploited?

No confirmed active exploitation of CVE-2026-31251 has been reported, but organizations should still patch proactively.

How to fix CVE-2026-31251?

  1. Upgrade: pin to a commit that applies weights_only=True to all torch.load() calls and verify the fix is present in the gRPC server initialization path.

  2. Scan all model files with picklescan or fickling before loading to detect embedded pickle payloads.

  3. Restrict gRPC server processes to isolated containers with read-only, cryptographically verified model volumes.

  4. Audit all model directory paths for untrusted content and enforce model file integrity via hash verification in deployment pipelines.

  5. Apply least-privilege filesystem access: the server process should not have write access to model directories or broader system paths.

  6. Monitor process spawning at server startup for unexpected child processes indicative of payload execution.
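
Dedicated scanners such as picklescan and fickling are the right tools for step 2. To show the idea behind them, the sketch below uses the standard library's pickletools to list which globals a pickle stream would import, and flags modules on a small deny-list. The deny-list, the function names, and the last-two-strings heuristic for STACK_GLOBAL are illustrative assumptions; this is not how either tool is implemented, and it does not unpack the zip archive that newer PyTorch checkpoints use.

```python
import pickletools

# Opcodes that push a text string onto the pickle stack.
STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}

# Hypothetical deny-list: modules that weight files have no reason to import.
DENYLIST = {"builtins", "__builtin__", "os", "posix", "nt",
            "subprocess", "sys", "socket", "runpy"}


def imported_globals(data):
    """List (module, name) pairs the pickle would import, without loading it."""
    found = []
    recent_strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in STRING_OPS:
            recent_strings.append(arg)
        elif opcode.name == "GLOBAL":
            # Protocols 0-3 carry "module name" inline in the opcode argument.
            module, _, name = arg.partition(" ")
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL":
            # Protocol 4+ takes module and name from the stack. Heuristic:
            # assume they are the two most recently pushed strings.
            if len(recent_strings) >= 2:
                found.append((recent_strings[-2], recent_strings[-1]))
            else:
                found.append(("?", "?"))  # could not resolve; treat as suspect
    return found


def suspicious_globals(data):
    """Return imports whose top-level module is deny-listed or unresolved."""
    return [(module, name) for module, name in imported_globals(data)
            if module == "?" or module.split(".")[0] in DENYLIST]
```

Because pickletools.genops() only parses opcodes and never executes them, this kind of inspection is safe to run on untrusted bytes. A deny-list is easy to bypass, so an allow-list of expected globals is the stronger design when the set of legitimate imports is known.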

What systems are affected by CVE-2026-31251?

This vulnerability affects the following AI/ML architecture patterns: Speech synthesis pipelines, gRPC-based model serving, ML model deployment, Multi-modal AI systems, AI/ML CI/CD pipelines.

What is the CVSS score for CVE-2026-31251?

No CVSS score has been assigned yet.

What is the AI security impact?

Affected AI Architectures

Speech synthesis pipelines
gRPC-based model serving
ML model deployment
Multi-modal AI systems
AI/ML CI/CD pipelines

MITRE ATLAS Techniques

AML.T0002.001 Models
AML.T0010.003 Model
AML.T0011.000 Unsafe AI Artifacts
AML.T0018.002 Embed Malware

Compliance Controls Affected

EU AI Act: Art. 15, Art. 9
ISO 42001: 8.4, A.6.2.6
NIST AI RMF: GOVERN 1.7, MANAGE 2.2
OWASP LLM Top 10: LLM05:2025

What are the technical details?

Original Advisory

CosyVoice thru commit 6e01309e01bc93bbeb83bdd996b1182a81aaf11e (2025-30-21) contains an insecure deserialization vulnerability (CWE-502) in its gRPC server component. When the server starts, it loads the speech synthesis model from a user-specified directory using torch.load() without enabling the weights_only=True security parameter. This allows the deserialization of arbitrary Python objects via the pickle module. An attacker can exploit this by providing malicious model files within a directory. When a victim starts the gRPC server pointing to this directory, arbitrary code is executed on the victim's system during server initialization.
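
The weights_only=True parameter named in the advisory restricts what the unpickler is allowed to construct. The general technique can be sketched with the standard library by overriding pickle.Unpickler.find_class with an allow-list, a pattern described in the Python pickle documentation. The class name, helper name, and allow-list contents below are hypothetical, and this is not PyTorch's implementation.

```python
import io
import pickle

# Hypothetical allow-list: the only globals this loader agrees to import.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowListUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not explicitly allowed."""

    def find_class(self, module, name):
        # find_class is called for every global the pickle stream requests.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data):
    """Load pickle bytes, allowing plain containers and allow-listed globals."""
    return AllowListUnpickler(io.BytesIO(data)).load()
```

Plain containers of numbers and strings load normally because they need no globals, while a stream that asks for something like builtins.eval raises pickle.UnpicklingError before the callable is ever resolved. In a PyTorch code base the equivalent protection comes from passing weights_only=True to torch.load() rather than from a hand-written unpickler.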

Exploitation Scenario

An adversary targeting an organization's voice synthesis infrastructure creates a malicious PyTorch model file using fickling that embeds a pickle payload executing a reverse shell. The file is injected into a public model hub, a shared NFS mount, or a CI/CD artifact store used by the victim. When the CosyVoice gRPC server restarts — triggered by a routine deployment, autoscaling event, or crash recovery — torch.load() deserializes the payload at startup, granting the attacker RCE with the server process's privileges before any authentication layer is active. From this foothold, the attacker pivots to model storage, training infrastructure, environment secrets, and other services accessible from the ML backend.
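
The scenario above ends with a payload running inside the server process at startup, so in-process visibility can complement OS-level process monitoring. The sketch below uses sys.addaudithook (Python 3.8+) to record pickle imports, process launches, and outbound connections that occur while a model is being loaded. The watch_load helper and the choice of events are assumptions for illustration; audit hooks cannot be removed once installed, so the hook is gated by a flag.

```python
import contextlib
import sys

# Audit events of interest while a model file is being deserialized.
WATCHED_EVENTS = {"pickle.find_class", "subprocess.Popen",
                  "os.system", "socket.connect"}

_state = {"active": False, "events": []}


def _audit_hook(event, args):
    # Runs for every audit event in the process, so it stays cheap and
    # records only while a watch_load() block is active.
    if _state["active"] and event in WATCHED_EVENTS:
        _state["events"].append((event, args))


sys.addaudithook(_audit_hook)


@contextlib.contextmanager
def watch_load():
    """Collect watched audit events raised inside the with-block."""
    _state["events"] = []
    _state["active"] = True
    try:
        yield _state["events"]
    finally:
        _state["active"] = False
```

Wrapping the model-loading call in `with watch_load() as events:` yields a list that can be logged or compared against an expected set of imports. This sketch only observes; a hook that raises an exception would abort the audited operation instead, which is a stricter option. Native code run by a payload can evade Python-level hooks, so this does not replace container isolation or host monitoring.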

Timeline

Published: May 11, 2026
Last Modified: May 12, 2026
First Seen: May 11, 2026
