CVE-2021-43811: Sockeye: unsafe YAML load RCE via model config file
HIGH PoC AVAILABLE
If your team downloads and runs Sockeye models from external sources, a malicious model config can execute arbitrary code on the engineer's workstation at load time, before any inference occurs. This is a supply chain attack: an adversary publishes a poisoned model and waits for someone to pull and run it. Upgrade to Sockeye 2.3.24 or later immediately and enforce model artifact sourcing policies.
What is the risk?
High severity in practice despite the local attack vector rating. CVSS AV:L understates real-world exposure because ML practitioners routinely pull pre-trained models from public repositories (GitHub, Hugging Face, model zoos) with minimal vetting. Exploitation is trivial: crafting a malicious PyYAML object requires no AI/ML expertise. Automated MLOps pipelines that load model configs without human review are the highest-risk targets, because the payload fires silently during model initialization.
What systems are affected?
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| sockeye | PyPI | < 2.3.24 | 2.3.24 |
Do you use sockeye? If your installed version is below 2.3.24 and you load model configs you did not create yourself, you're affected.
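A minimal sketch for checking whether the Sockeye installed in the current Python environment falls below the patched release. The helper names (`parse_version`, `is_vulnerable`, `check_installed`) are hypothetical, and the version parser is deliberately simple: it compares only the first three numeric components and ignores pre-release suffixes.

```python
from importlib import metadata

FIXED_VERSION = (2, 3, 24)  # first patched release, per the advisory


def parse_version(text: str) -> tuple:
    """Turn '2.3.23' into (2, 3, 23). Simplification: suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_vulnerable(version_text: str) -> bool:
    """True when the version is below the first patched release."""
    return parse_version(version_text) < FIXED_VERSION


def check_installed(package: str = "sockeye") -> str:
    """Report the status of the installed package, if any."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} is not installed in this environment"
    status = "VULNERABLE" if is_vulnerable(installed) else "patched"
    return f"{package} {installed}: {status}"


if __name__ == "__main__":
    print(check_installed())
```

Run it inside each virtual environment or container image that loads Sockeye models, since different environments can pin different versions.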
What should I do?
1. Upgrade Sockeye to >=2.3.24 immediately.
2. If patching is blocked: restrict model loading to internally-signed artifacts only; allow no external model downloads without security review.
3. Audit all ML codebases for yaml.load() calls and replace with yaml.safe_load() universally.
4. Implement model artifact signing and integrity verification in MLOps pipelines (cosign, DVC, or similar).
5. Detection: monitor for unexpected process spawning or outbound network connections triggered during model load operations.
6. Consider sandboxing model evaluation environments (containers, restricted VMs) to limit blast radius.
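The audit step for yaml.load() calls can be partly automated. The following is a rough sketch, not a complete linter: it uses Python's standard `ast` module to flag calls written literally as `yaml.load(...)` without a safe Loader, plus `yaml.unsafe_load(...)`. The function names are hypothetical, and aliased imports (`import yaml as y`, `from yaml import load`) are not detected.

```python
import ast

SAFE_LOADERS = {"SafeLoader", "CSafeLoader"}


def _is_yaml_call(node, attr):
    """True for calls written as yaml.<attr>(...)."""
    func = node.func
    return (
        isinstance(func, ast.Attribute)
        and func.attr == attr
        and isinstance(func.value, ast.Name)
        and func.value.id == "yaml"
    )


def _loader_name(node):
    """Return the Loader argument's name, or None if no Loader was passed."""
    loader = next((kw.value for kw in node.keywords if kw.arg == "Loader"), None)
    if loader is None and len(node.args) >= 2:
        loader = node.args[1]  # Loader may also be the second positional argument
    if isinstance(loader, ast.Attribute):
        return loader.attr
    if isinstance(loader, ast.Name):
        return loader.id
    return None if loader is None else "<expression>"


def find_unsafe_yaml_loads(source, filename="<string>"):
    """Return sorted (line, reason) pairs for suspicious PyYAML load calls."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        if _is_yaml_call(node, "unsafe_load"):
            findings.append((node.lineno, "yaml.unsafe_load()"))
        elif _is_yaml_call(node, "load"):
            name = _loader_name(node)
            if name is None:
                findings.append((node.lineno, "yaml.load() without a Loader"))
            elif name not in SAFE_LOADERS:
                findings.append((node.lineno, f"yaml.load() with Loader={name}"))
    return sorted(findings)
```

To audit a repository, read each `.py` file and pass its text to `find_unsafe_yaml_loads`; every finding is a candidate for replacement with yaml.safe_load() and still needs human review.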
Frequently Asked Questions
Is CVE-2021-43811 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2021-43811, increasing the risk of exploitation.
What systems are affected by CVE-2021-43811?
This vulnerability affects the following AI/ML architecture patterns: NMT training pipelines, model serving, ML model distribution, MLOps CI/CD pipelines, research environments.
What is the CVSS score for CVE-2021-43811?
CVE-2021-43811 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 2.42%.
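As a worked check, the 7.8 base score can be reproduced from this CVE's vector (CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H) using the CVSS v3.1 base formula. This sketch covers only Scope: Unchanged vectors; the function names are illustrative and the metric weights are taken from the CVSS v3.1 specification.

```python
import math

# Metric weights from the CVSS v3.1 specification (PR values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS v3.1 vector with Scope: Unchanged."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: table[metrics[name]] for name, table in WEIGHTS.items()}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"))  # 7.8
```

The local attack vector (AV:L, weight 0.55) and required user interaction (UI:R, weight 0.62) are what pull the score below the 9.8 that the same impact would receive with AV:N and UI:N.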
What is the AI security impact?
Affected AI Architectures
MITRE ATLAS Techniques
- AML.T0010.003 Model
- AML.T0011.000 Unsafe AI Artifacts
- AML.T0018.002 Embed Malware
- AML.T0058 Publish Poisoned Models
What are the technical details?
Original Advisory
Sockeye is an open-source sequence-to-sequence framework for Neural Machine Translation built on PyTorch. Sockeye uses YAML to store model and data configurations on disk. Versions below 2.3.24 use unsafe YAML loading, which can be made to execute arbitrary code embedded in config files. An attacker can add malicious code to the config file of a trained model and attempt to convince users to download and run it. If users run the model, the embedded code will run locally. The issue is fixed in version 2.3.24.
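To illustrate the difference the advisory describes, here is a small sketch of safe loading with PyYAML. It is not Sockeye's actual patch: the config keys (`num_layers`, `hidden_size`) and the helper `load_config_safely` are made up for illustration, and the tagged value is a benign stand-in that would call `os.getcwd()` under an unsafe loader.

```python
import yaml  # PyYAML (third-party)

# Benign stand-in for a poisoned config: the tag asks the loader to call
# os.getcwd() while the file is being parsed.
POISONED_CONFIG = """
num_layers: 6
hidden_size: 512
probe: !!python/object/apply:os.getcwd []
"""

CLEAN_CONFIG = """
num_layers: 6
hidden_size: 512
"""


def load_config_safely(text: str) -> dict:
    """Parse config text with safe_load, which only builds plain YAML types."""
    try:
        return yaml.safe_load(text)
    except yaml.YAMLError as err:
        # safe_load has no constructor for python/* tags, so it raises
        # instead of calling anything.
        raise ValueError(f"refusing config with unsupported content: {err}") from err


if __name__ == "__main__":
    print(load_config_safely(CLEAN_CONFIG))
    try:
        load_config_safely(POISONED_CONFIG)
    except ValueError as err:
        print("rejected:", err)
```

With `yaml.safe_load`, the tagged value is rejected at parse time rather than executed, which is the behavior a config loader handling untrusted model files needs.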
Exploitation Scenario
An adversary publishes a Sockeye-compatible pre-trained NMT model (e.g., English-Spanish translation) to a public repository, promoting it via social channels or SEO-optimized documentation. The model's YAML config contains a crafted PyYAML directive (!!python/object/apply:subprocess.check_output or similar) that spawns a reverse shell or exfiltrates cloud credentials (AWS_ACCESS_KEY_ID, GCP service account tokens) upon deserialization. An ML engineer downloads the model to benchmark it against their production system—even just to evaluate quality—and the payload executes with their local privileges, potentially pivoting to cloud infrastructure, training data stores, or the corporate network.
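Before evaluating a downloaded model, its files can be pre-screened for the kind of directive described in this scenario. The sketch below is a text-level heuristic only, with hypothetical function names; a YAML `%TAG` directive can alias tag prefixes and evade it, so it complements upgrading and safe loading rather than replacing them.

```python
import re
from pathlib import Path

# Matches PyYAML's code-capable tags, e.g. !!python/object/apply:... or the
# verbatim form tag:yaml.org,2002:python/name:...
PYTHON_TAG = re.compile(r"python/(?:object|name|module)")


def scan_text(text: str) -> list:
    """Return (line_number, stripped_line) pairs that contain a suspicious tag."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if PYTHON_TAG.search(line)
    ]


def scan_model_dir(root: str) -> dict:
    """Scan every readable text file under root; map file path to its hits."""
    findings = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary weights and unreadable files are skipped
        hits = scan_text(text)
        if hits:
            findings[str(path)] = hits
    return findings
```

Any hit is a reason to stop and inspect the file by hand; an empty result does not prove the model is safe.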
Weaknesses (CWE)
CWE-94 — Improper Control of Generation of Code ('Code Injection'): The product constructs all or part of a code segment using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the syntax or behavior of the intended code segment.
- [Architecture and Design] Refactor your program so that you do not have to dynamically generate code.
- [Architecture and Design] Run your code in a "jail" or similar sandbox environment that enforces strict boundaries between the process and the operating system. This may effectively restrict which code can be executed by your product. Examples include the Unix chroot jail and AppArmor. In general, managed code may provide some protection. This may not be a feasible solution, and it only limits the impact to the operating system; the rest of your application may still be subject to compromise. Be careful to avoid CWE-243 and other weaknesses related to jails.
Source: MITRE CWE corpus.
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
References
- github.com/awslabs/sockeye/pull/964 Patch 3rd Party
- github.com/awslabs/sockeye/releases/tag/2.3.24 Release 3rd Party
- github.com/awslabs/sockeye/security/advisories/GHSA-ggmr-44cv-24pm 3rd Party
- github.com/ARPSyndicate/cvemon Exploit
- github.com/NaInSec/CVE-PoC-in-GitHub Exploit
- github.com/SYRTI/POC_to_review Exploit
- github.com/WhooAmii/POC_to_review Exploit
- github.com/nomi-sec/PoC-in-GitHub Exploit
- github.com/s-index/CVE-2021-43811 Exploit
- github.com/s-index/poc-list Exploit
- github.com/trhacknon/Pocingit Exploit
- github.com/zecool/cve Exploit
Related Vulnerabilities
- CVE-2025-59528 (10.0) Flowise: Unauthenticated RCE via MCP config injection. Same attack type: Supply Chain
- CVE-2024-2912 (10.0) BentoML: RCE via insecure deserialization (CVSS 10). Same attack type: Supply Chain
- CVE-2023-3765 (10.0) MLflow: path traversal allows arbitrary file read. Same attack type: Supply Chain
- CVE-2025-5120 (10.0) smolagents: sandbox escape enables unauthenticated RCE. Same attack type: Supply Chain
- CVE-2026-21858 (10.0) n8n: Input Validation flaw enables exploitation. Same attack type: Code Execution