CVE-2020-13092: scikit-learn: RCE via malicious joblib model deserialization
CRITICAL | PoC AVAILABLE

Any ML pipeline loading scikit-learn model files from untrusted sources (shared storage, S3 buckets, user uploads, third-party model registries) is exposed to full remote code execution: one malicious .pkl file is all it takes. Audit every call to joblib.load() in your stack and enforce cryptographic verification (hash or signature) of model artifacts before loading. This is not a bug scikit-learn will patch; it is a design constraint you must architect around.
Risk Assessment
CVSS 9.8 reflects worst-case exposure: no authentication, no user interaction, network-reachable if a model-serving endpoint loads user-supplied or externally fetched model files. In practice, exploitability depends on whether attacker-controlled files reach joblib.load() — which is common in MLOps pipelines that pull models from shared registries or accept model uploads. The 'by design' disclaimer does not reduce operational risk; ML teams routinely treat joblib.load() as a safe I/O operation without understanding the underlying pickle execution model.
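The pickle execution model referenced above can be demonstrated in a few lines. joblib's persistence format is pickle-based, so the same mechanics apply when joblib.load() reads a crafted file; this sketch uses the standard pickle module directly and substitutes a harmless builtin for the attacker's os.system call:

```python
import pickle

class MaliciousModel:
    """Stand-in for a 'model' object crafted by an attacker."""

    def __reduce__(self):
        # pickle stores (callable, args) and invokes the callable at load
        # time. A real payload would return (os.system, ("<shell cmd>",));
        # a harmless builtin is used here to show the mechanism.
        return (list, ("pwned",))

blob = pickle.dumps(MaliciousModel())

# No MaliciousModel instance comes back: the stored callable has already
# run by the time loads() returns, before any type or schema check.
loaded = pickle.loads(blob)
print(loaded)  # ['p', 'w', 'n', 'e', 'd']
```

This is why post-load validation cannot help: the payload fires during deserialization itself, before calling code ever sees the object.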
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| scikit-learn | pip | Through 0.23.0 per NVD; no fix in later releases | No patch |

If you load scikit-learn model files from sources you do not fully control, you are affected.
Recommended Action
1. Upgrade scikit-learn to the latest release for general hygiene, but note that no functional fix exists for this issue; mitigation is architectural.
2. Never call joblib.load() on files from untrusted or unverified sources.
3. Enforce cryptographic integrity checks (SHA-256 hash or digital signature) on all model artifacts before loading.
4. Run model-loading processes in sandboxed environments (gVisor, seccomp profiles, read-only mounts) to limit blast radius.
5. Implement model provenance tracking in your MLOps pipeline; only load models whose chain of custody is auditable.
6. For detection: monitor for unexpected os.system(), subprocess, or network calls spawned from Python processes running inference workloads.
Frequently Asked Questions
What is CVE-2020-13092?
CVE-2020-13092 is a deserialization vulnerability in scikit-learn (through 0.23.0, with no fix planned): a model file passed to joblib.load() can embed arbitrary commands via pickle's __reduce__ mechanism, and those commands execute during deserialization. Any pipeline that loads model files from untrusted sources (shared storage, S3 buckets, user uploads, third-party model registries) is therefore exposed to remote code execution.
Is CVE-2020-13092 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2020-13092, increasing the risk of exploitation.
How to fix CVE-2020-13092?
1. Upgrade scikit-learn to the latest release for general hygiene, but note that no functional fix exists for this issue; mitigation is architectural. 2. Never call joblib.load() on files from untrusted or unverified sources. 3. Enforce cryptographic integrity checks (SHA-256 hash or digital signature) on all model artifacts before loading. 4. Run model-loading processes in sandboxed environments (gVisor, seccomp profiles, read-only mounts) to limit blast radius. 5. Implement model provenance tracking in your MLOps pipeline; only load models whose chain of custody is auditable. 6. For detection: monitor for unexpected os.system(), subprocess, or network calls spawned from Python processes running inference workloads.
What systems are affected by CVE-2020-13092?
This vulnerability affects the following AI/ML architecture patterns: model serving, training pipelines, MLOps/CI-CD pipelines, model registries, data science notebooks.
What is the CVSS score for CVE-2020-13092?
CVE-2020-13092 has a CVSS v3.1 base score of 9.8 (CRITICAL). The EPSS exploitation probability is 0.88%.
Technical Details
NVD Description
scikit-learn (aka sklearn) through 0.23.0 can unserialize and execute commands from an untrusted file that is passed to the joblib.load() function, if __reduce__ makes an os.system call. NOTE: third parties dispute this issue because the joblib.load() function is documented as unsafe and it is the user's responsibility to use the function in a secure manner.
Exploitation Scenario
An adversary identifies an ML inference API that accepts a model file path or fetches models from a configurable S3 bucket. They craft a malicious scikit-learn model file using Python's pickle __reduce__ mechanism to embed an os.system() reverse shell call. The file is uploaded to a writable storage location or injected into a model registry via a compromised CI credential. When the inference service calls joblib.load() on the next model refresh cycle, the payload executes with the service account's privileges — giving the attacker a foothold inside the ML infrastructure to pivot to training data, API keys, or downstream systems.
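As a detection aid beyond runtime monitoring (an addition not in the advisory itself, but built entirely on the standard library), a suspect artifact can be disassembled with pickletools without executing anything. A GLOBAL/STACK_GLOBAL opcode naming an unexpected callable (os.system, subprocess functions, builtins.eval) followed by REDUCE is the signature of this payload style:

```python
import io
import pickle
import pickletools

class Suspicious:
    def __reduce__(self):
        return (list, ("x",))  # harmless stand-in for os.system

blob = pickle.dumps(Suspicious())

buf = io.StringIO()
pickletools.dis(blob, out=buf)  # static disassembly; nothing is executed
listing = buf.getvalue()

# REDUCE marks a callable invocation at load time; the preceding
# (STACK_)GLOBAL opcode names the callable that would run.
print("REDUCE" in listing)  # True
```

This only flags the common __reduce__-based pattern; it is a triage filter, not a substitute for the cryptographic verification described above.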
Weaknesses (CWE)
CWE-502: Deserialization of Untrusted Data
CVSS Vector
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
References
- github.com/0FuzzingQ/vuln/blob/master/sklearn%20unserialize.md (Exploit, Third Party)
- scikit-learn.org/stable/modules/model_persistence.html (Third Party)
Related Vulnerabilities
- CVE-2020-28975 (7.5) scikit-learn: DoS via crafted SVM model deserialization (same package: scikit-learn)
- CVE-2024-5206 (4.7) scikit-learn: TfidfVectorizer leaks training data tokens (same package: scikit-learn)
- CVE-2023-3765 (10.0) MLflow: path traversal allows arbitrary file read (same attack type: Supply Chain)
- CVE-2025-5120 (10.0) smolagents: sandbox escape enables unauthenticated RCE (same attack type: Supply Chain)
- CVE-2025-59528 (10.0) Flowise: Unauthenticated RCE via MCP config injection (same attack type: Supply Chain)