CVE-2021-43811: Sockeye: unsafe YAML load RCE via model config file
HIGH | PoC AVAILABLE
If your team downloads and runs Sockeye models from external sources, a malicious model config can execute arbitrary code on the engineer's workstation at load time, before any inference occurs. This is a supply chain attack: an adversary publishes a poisoned model and waits for someone to pull and run it. Upgrade to Sockeye 2.3.24 or later immediately and enforce model artifact sourcing policies.
Risk Assessment
High severity in practice despite the local attack vector rating. CVSS AV:L understates real-world exposure because ML practitioners routinely pull pre-trained models from public repositories (GitHub, HuggingFace, model zoos) with minimal vetting. Exploitability is trivial—crafting a malicious PyYAML object requires no AI/ML expertise. Automated MLOps pipelines that load model configs without human review are the highest-risk targets, as the payload fires silently during model initialization.
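One way to make a payload that fires during model initialization less silent is Python's runtime audit hooks (`sys.addaudithook`, Python 3.8+). The following is a minimal sketch, not a Sockeye feature: `watch_model_load` and the selection of watched events are assumptions made for illustration. It only records events raised through the interpreter's audit machinery, so native code can bypass it, and it is no substitute for upgrading.

```python
import sys
from contextlib import contextmanager

# Audit event names documented for CPython; which ones to watch is an assumption.
WATCHED_EVENTS = {
    "subprocess.Popen", "os.system", "os.exec",
    "os.posix_spawn", "os.spawn", "os.fork", "socket.connect",
}

_recorders = []  # stack of event lists, one per active watch_model_load() block


def _audit_hook(event, args):
    # Called for every audit event in the process, so keep the work minimal.
    if _recorders and event in WATCHED_EVENTS:
        _recorders[-1].append((event, repr(args)))


# Audit hooks cannot be removed once installed; this one is idle outside a watch block.
sys.addaudithook(_audit_hook)


@contextmanager
def watch_model_load():
    """Record process-spawning and outbound-connection events raised in the block."""
    events = []
    _recorders.append(events)
    try:
        yield events
    finally:
        _recorders.pop()


# Hypothetical usage; load_model stands in for whatever code loads the model:
#
#     with watch_model_load() as events:
#         load_model("path/to/model")
#     if events:
#         raise RuntimeError(f"unexpected activity during model load: {events}")
```

Recording rather than raising inside the hook keeps the sketch simple; a stricter variant could raise an exception from the hook to abort the offending call.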
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| sockeye | — | < 2.3.24 | 2.3.24 |
If you use a Sockeye version below 2.3.24, you're affected.
Recommended Action
1. Upgrade Sockeye to >=2.3.24 immediately.
2. If patching is blocked: restrict model loading to internally signed artifacts only, with no external model downloads without security review.
3. Audit all ML codebases for yaml.load() calls and replace with yaml.safe_load() universally.
4. Implement model artifact signing and integrity verification in MLOps pipelines (cosign, DVC, or similar).
5. Detection: monitor for unexpected process spawning or outbound network connections triggered during model load operations.
6. Consider sandboxing model evaluation environments (containers, restricted VMs) to limit blast radius.
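The audit step for yaml.load() calls can be roughly automated with Python's standard `ast` module. The sketch below is a starting point under stated assumptions: the function names are hypothetical, it only recognizes calls written as `yaml.<function>(...)` (so `from yaml import load` and import aliases are missed), and it treats only `SafeLoader` and `CSafeLoader` as acceptable loaders.

```python
import ast
from pathlib import Path

SAFE_LOADERS = {"SafeLoader", "CSafeLoader"}
ALWAYS_UNSAFE = {"unsafe_load", "unsafe_load_all"}
LOADER_DEPENDENT = {"load", "load_all"}


def _loader_name(node):
    """Return the trailing identifier of a Loader argument, e.g. 'SafeLoader'."""
    if isinstance(node, ast.Attribute):
        return node.attr
    if isinstance(node, ast.Name):
        return node.id
    return None


def find_unsafe_yaml_loads(source, filename="<string>"):
    """Return sorted (line, function) pairs for risky yaml.* load calls."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if not (isinstance(func, ast.Attribute)
                and isinstance(func.value, ast.Name)
                and func.value.id == "yaml"):
            continue
        if func.attr in ALWAYS_UNSAFE:
            findings.append((node.lineno, func.attr))
        elif func.attr in LOADER_DEPENDENT:
            loader = None
            for kw in node.keywords:
                if kw.arg == "Loader":
                    loader = _loader_name(kw.value)
            if loader is None and len(node.args) >= 2:
                loader = _loader_name(node.args[1])  # Loader passed positionally
            if loader not in SAFE_LOADERS:
                findings.append((node.lineno, func.attr))
    return sorted(findings)


def scan_tree(root):
    """Scan every .py file under root; returns {path: findings} for flagged files."""
    report = {}
    for path in Path(root).rglob("*.py"):
        try:
            hits = find_unsafe_yaml_loads(path.read_text(encoding="utf-8"), str(path))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that do not parse as Python source
        if hits:
            report[str(path)] = hits
    return report
```

Calling `scan_tree(".")` from a repository root returns the flagged files with line numbers; each hit still needs manual review before replacing it with yaml.safe_load().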
Frequently Asked Questions
What is CVE-2021-43811?
CVE-2021-43811 is an unsafe YAML deserialization vulnerability in Sockeye, an open-source Neural Machine Translation framework. Versions below 2.3.24 load model and data configuration files with an unsafe YAML loader, so a malicious config file can execute arbitrary code on the machine of anyone who loads the model. The issue is fixed in version 2.3.24.
Is CVE-2021-43811 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2021-43811, increasing the risk of exploitation.
How to fix CVE-2021-43811?
1. Upgrade Sockeye to >=2.3.24 immediately.
2. If patching is blocked: restrict model loading to internally signed artifacts only, with no external model downloads without security review.
3. Audit all ML codebases for yaml.load() calls and replace with yaml.safe_load() universally.
4. Implement model artifact signing and integrity verification in MLOps pipelines (cosign, DVC, or similar).
5. Detection: monitor for unexpected process spawning or outbound network connections triggered during model load operations.
6. Consider sandboxing model evaluation environments (containers, restricted VMs) to limit blast radius.
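As an illustration of the integrity verification step, a pipeline could compare every file in a model directory against a pinned SHA-256 manifest before anything is loaded. This is a minimal sketch using only the standard library; `verify_model_dir` and the manifest layout (relative POSIX path mapped to hex digest) are assumptions, and the manifest itself must come from a trusted channel. It does not replace signing tools such as cosign.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_dir(model_dir, manifest):
    """Return a list of problems; an empty list means every file matched.

    manifest maps relative POSIX paths to expected SHA-256 hex digests
    (a hypothetical format chosen for this sketch).
    """
    model_dir = Path(model_dir)
    problems = []
    for rel, expected in manifest.items():
        target = model_dir / rel
        if not target.is_file():
            problems.append(f"missing: {rel}")
        elif not hmac.compare_digest(sha256_of(target), expected.lower()):
            problems.append(f"hash mismatch: {rel}")
    # Files that are present but not pinned are also reported, since an
    # extra config file is exactly what an attacker might add.
    for path in sorted(model_dir.rglob("*")):
        rel = path.relative_to(model_dir).as_posix()
        if path.is_file() and rel not in manifest:
            problems.append(f"unlisted file: {rel}")
    return problems
```

A loader wrapper would refuse to proceed whenever the returned list is non-empty.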
What systems are affected by CVE-2021-43811?
This vulnerability affects the following AI/ML architecture patterns: NMT training pipelines, model serving, ML model distribution, MLOps CI/CD pipelines, research environments.
What is the CVSS score for CVE-2021-43811?
CVE-2021-43811 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 8.72%.
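The 7.8 base score can be reproduced from the CVSS vector listed under Technical Details. The sketch below follows the CVSS v3.1 base-score equations for the Scope: Unchanged case only, which is all this vector needs; `base_score` is a name chosen for illustration, and vectors with S:C are rejected rather than handled.

```python
import math

# Metric weights from the CVSS v3.1 specification (PR values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a CVSS 3.1 vector with unchanged scope."""
    parts = dict(item.split(":") for item in vector.split("/"))
    if parts.get("CVSS") != "3.1" or parts.get("S") != "U":
        raise ValueError("this sketch only handles CVSS:3.1 vectors with S:U")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"))  # prints 7.8
```

For this vector the exploitability term is about 1.83 and the impact term about 5.87; their sum, roughly 7.71, rounds up to 7.8. The low exploitability term comes from AV:L and UI:R, which is why the Risk Assessment argues the score understates practical exposure.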
Technical Details
NVD Description
Sockeye is an open-source sequence-to-sequence framework for Neural Machine Translation built on PyTorch. Sockeye uses YAML to store model and data configurations on disk. Versions below 2.3.24 use unsafe YAML loading, which can be made to execute arbitrary code embedded in config files. An attacker can add malicious code to the config file of a trained model and attempt to convince users to download and run it. If users run the model, the embedded code will run locally. The issue is fixed in version 2.3.24.
Exploitation Scenario
An adversary publishes a Sockeye-compatible pre-trained NMT model (e.g., English-Spanish translation) to a public repository, promoting it via social channels or SEO-optimized documentation. The model's YAML config contains a crafted PyYAML directive (!!python/object/apply:subprocess.check_output or similar) that spawns a reverse shell or exfiltrates cloud credentials (AWS_ACCESS_KEY_ID, GCP service account tokens) upon deserialization. An ML engineer downloads the model to benchmark it against their production system—even just to evaluate quality—and the payload executes with their local privileges, potentially pivoting to cloud infrastructure, training data stores, or the corporate network.
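Before a downloaded config is handed to any YAML loader, its raw text can be screened for the Python-specific tags this scenario depends on. The sketch below is a text-level heuristic using only the standard library; `suspicious_yaml_tags` is a hypothetical helper, the patterns are assumptions about how such tags are commonly written, and the check complements rather than replaces yaml.safe_load() and the 2.3.24 upgrade. The sample uses a harmless `os.getcwd` tag and is never deserialized.

```python
import re

# Matches the short form (!!python/...) and the expanded tag URI form.
PYTHON_TAG = re.compile(r"!!python/[\w/.:]*|tag:yaml\.org,2002:python/[\w/.:]*")


def suspicious_yaml_tags(text):
    """Return (line_number, match) pairs for Python-specific YAML tags.

    %TAG directives are reported too, because they can remap tag handles
    and hide a Python tag behind a different prefix.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if line.startswith("%TAG"):
            hits.append((lineno, "%TAG directive"))
        for match in PYTHON_TAG.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


sample = (
    "model: transformer\n"
    "num_layers: 6\n"
    "hook: !!python/object/apply:os.getcwd []\n"
)
for lineno, tag in suspicious_yaml_tags(sample):
    print(f"line {lineno}: {tag}")
```

A hit is a reason to quarantine the artifact for review; an empty result does not prove the file is safe, since the heuristic can miss obfuscated forms and can also flag the pattern inside ordinary string values.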
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
References
- github.com/awslabs/sockeye/pull/964 Patch 3rd Party
- github.com/awslabs/sockeye/releases/tag/2.3.24 Release 3rd Party
- github.com/awslabs/sockeye/security/advisories/GHSA-ggmr-44cv-24pm 3rd Party
- github.com/ARPSyndicate/cvemon Exploit
- github.com/NaInSec/CVE-PoC-in-GitHub Exploit
- github.com/SYRTI/POC_to_review Exploit
- github.com/WhooAmii/POC_to_review Exploit
- github.com/nomi-sec/PoC-in-GitHub Exploit
- github.com/s-index/CVE-2021-43811 Exploit
- github.com/s-index/poc-list Exploit
- github.com/trhacknon/Pocingit Exploit
- github.com/zecool/cve Exploit
Related Vulnerabilities
- CVE-2025-59528 (10.0) Flowise: Unauthenticated RCE via MCP config injection. Same attack type: Supply Chain
- CVE-2024-2912 (10.0) BentoML: RCE via insecure deserialization (CVSS 10). Same attack type: Supply Chain
- CVE-2023-3765 (10.0) MLflow: path traversal allows arbitrary file read. Same attack type: Supply Chain
- CVE-2025-5120 (10.0) smolagents: sandbox escape enables unauthenticated RCE. Same attack type: Supply Chain
- CVE-2026-21858 (10.0) n8n: Input Validation flaw enables exploitation. Same attack type: Code Execution