CVE-2020-28975: scikit-learn DoS — HIGH

CISO Take

If your ML pipelines load scikit-learn SVM models from untrusted sources (user uploads, shared registries, third-party model repos), an attacker can crash your inference service with a malicious pickle or JSON model file. Upgrade scikit-learn to 1.0 or later and enforce strict model provenance controls: load models only from signed, internal registries. No exploitation in the wild is known, but the attack is trivially reproducible from published PoC code.

What is the risk?

Real-world risk is context-dependent. The CVSS score of 7.5 assumes network reachability to model-loading code, which holds for model-as-a-service deployments and for pipelines that accept external model uploads. Organizations with air-gapped model registries and no external model ingestion have minimal exposure. The vendor's dispute (the crash requires an application to modify a private attribute) understates the risk on multi-tenant or collaborative ML platforms where users can supply model artifacts. The CVE is not listed in CISA KEV and no active exploitation is known, but the attack is trivially reproducible from published PoC code.

What systems are affected?

Package	Ecosystem	Vulnerable Range	Patched
scikit-learn	pip	—	No patch

How severe is it?

CVSS 3.1

7.5 / 10

EPSS

3.4%

chance of exploitation in 30 days

Higher than 87% of all CVEs

Source: EPSS v3 — FIRST.org

Exploitation Status

Exploit Available

Exploitation: MEDIUM

Sophistication

Trivial

Exploitation Confidence

medium

○ CISA SSVC: Public PoC

○ Public PoC indexed (trickest/cve)

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

AV Network

AC Low

PR None

UI None

S Unchanged

C None

I None

A High
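
The metrics above correspond to the vector AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H. As a sanity check on the 7.5 base score, the CVSS v3.1 base-score equations can be sketched in a few lines of Python. The names `WEIGHTS`, `roundup`, and `base_score` are illustrative and not part of any library; the numeric weights are the ones published in the CVSS v3.1 specification.

```python
import math

# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
    # Privileges Required depends on whether Scope is changed.
    "PR_UNCHANGED": {"N": 0.85, "L": 0.62, "H": 0.27},
    "PR_CHANGED": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a vector such as 'AV:N/AC:L/.../A:H'."""
    m = dict(part.split(":") for part in vector.split("/"))
    changed = m["S"] == "C"
    pr_table = WEIGHTS["PR_CHANGED"] if changed else WEIGHTS["PR_UNCHANGED"]

    c, i, a = (WEIGHTS["CIA"][m[k]] for k in ("C", "I", "A"))
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    exploitability = (
        8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
        * pr_table[m["PR"]] * WEIGHTS["UI"][m["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    return roundup(min(1.08 * total if changed else total, 10))


print(base_score("AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H"))  # 7.5
```

The score is driven entirely by the High availability impact combined with the most permissive exploitability metrics; confidentiality and integrity contribute nothing.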

What should I do?

5 steps

PATCH

Upgrade scikit-learn to ≥1.0; the upstream fix validates _n_support bounds before prediction.

MODEL PROVENANCE

Enforce cryptographic signing of all model artifacts (e.g., sigstore, custom HMAC). Reject unsigned or externally sourced models.

ISOLATION

Run model loading and prediction in sandboxed subprocesses or containers with resource limits (ulimit, cgroups) so a segfault does not cascade to the host service.

AVOID PICKLE FROM UNTRUSTED SOURCES

For models crossing trust boundaries, prefer a format that does not execute code on load, such as ONNX. joblib is pickle-based, so load joblib files only after an integrity check.

DETECT

Alert on unexpected process crashes in inference workers; repeated crashes tied to model load or prediction events may indicate active exploitation attempts.
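
As a minimal sketch of the "custom HMAC" option under MODEL PROVENANCE, the artifact bytes can be authenticated before any deserialization takes place. The function names below are hypothetical, and key storage and rotation are assumed to be handled elsewhere; sigstore-based signing is a separate approach and is not shown.

```python
import hashlib
import hmac


def sign_artifact(data: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for the serialized model bytes."""
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def verify_artifact(data: bytes, key: bytes, expected_tag: str) -> bool:
    """Check the tag in constant time to avoid leaking match position."""
    return hmac.compare_digest(sign_artifact(data, key), expected_tag)


def read_trusted_bytes(data: bytes, key: bytes, expected_tag: str) -> bytes:
    """Hand bytes to the deserializer only if the tag verifies."""
    if not verify_artifact(data, key, expected_tag):
        raise ValueError("model artifact failed integrity check; refusing to load")
    return data


if __name__ == "__main__":
    key = b"example-key-from-a-secret-store"   # assumption: fetched securely
    blob = b"serialized model bytes"            # assumption: read from the registry
    tag = sign_artifact(blob, key)              # produced at publish time
    print(verify_artifact(blob, key, tag))            # True
    print(verify_artifact(blob + b"x", key, tag))     # False
```

The important design choice is ordering: verification runs on the raw bytes, so a tampered file is rejected before pickle or any other loader ever sees it.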

What does CISA's SSVC say?

Decision Track*

Exploitation poc

Automatable No

Technical Impact partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

How is it classified?

DoS Supply Chain Framework Model Inference

AML.T0010.001 - AI Software

AML.T0011.000 - Unsafe AI Artifacts

AML.T0018 - Manipulate AI Model

AML.T0029 - Denial of AI Service

AML.T0049 - Exploit Public-Facing Application

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act

Art.15 - Accuracy, robustness and cybersecurity for high-risk AI systems

ISO 42001

8.4 - AI System Risk Management: Data and Model Integrity

9.1 - Monitoring and Measurement of AI System Performance

NIST AI RMF

GOVERN-6.1 - Policies for third-party AI risks and dependencies

MANAGE-2.4 - Residual risks are managed and documented

OWASP LLM Top 10

LLM05:2025 - Insecure Output Handling / Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2020-28975?

CVE-2020-28975 is a denial-of-service vulnerability in svm_predict_values in libsvm's svm.cpp, as used in scikit-learn 0.23.2 and other products. A crafted SVM model with a large value in the _n_support array causes a segmentation fault at prediction time, crashing the process that serves the model. The scikit-learn maintainers dispute the report on the grounds that it requires an application to modify a private attribute.

Is CVE-2020-28975 actively exploited?

No active exploitation is known, and the CVE is not listed in CISA KEV. Proof-of-concept exploit code is publicly available, however, which increases the risk of exploitation.

How to fix CVE-2020-28975?

1. PATCH: Upgrade scikit-learn to ≥1.0; the upstream fix validates _n_support bounds before prediction.
2. MODEL PROVENANCE: Enforce cryptographic signing of all model artifacts (e.g., sigstore, custom HMAC). Reject unsigned or externally sourced models.
3. ISOLATION: Run model loading and prediction in sandboxed subprocesses or containers with resource limits (ulimit, cgroups) so a segfault does not cascade to the host service.
4. AVOID PICKLE FROM UNTRUSTED SOURCES: For models crossing trust boundaries, prefer a format that does not execute code on load, such as ONNX. joblib is pickle-based, so load joblib files only after an integrity check.
5. DETECT: Alert on unexpected process crashes in inference workers; repeated crashes tied to model load or prediction events may indicate active exploitation attempts.
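
The ISOLATION and DETECT steps can be combined in one small pattern: run the risky work in a child interpreter and inspect how it exited. The sketch below assumes a POSIX host, where a negative return code means the child was killed by a signal. `run_isolated` and `describe_exit` are hypothetical helper names, and the code string stands in for a real worker that would load a verified model and call predict.

```python
import signal
import subprocess
import sys


def run_isolated(python_code: str, timeout_s: float = 30.0) -> subprocess.CompletedProcess:
    """Run code in a separate interpreter so a crash cannot take down the caller."""
    return subprocess.run(
        [sys.executable, "-c", python_code],
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )


def describe_exit(returncode: int) -> str:
    """Translate a POSIX return code into a label suitable for alerting."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


if __name__ == "__main__":
    # Placeholder for a worker that would segfault inside libsvm.
    crash = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    result = run_isolated(crash)
    print(describe_exit(result.returncode))
```

A parent service can count "killed by SIGSEGV" results per submitted model and quarantine an artifact after the first one, rather than restarting the worker and retrying.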

What systems are affected by CVE-2020-28975?

This vulnerability affects the following AI/ML architecture patterns: model serving, training pipelines, ML model registries.

What is the CVSS score for CVE-2020-28975?

CVE-2020-28975 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 3.43%.

What is the AI security impact?

Affected AI Architectures

model serving, training pipelines, ML model registries

MITRE ATLAS Techniques

AML.T0010.001 AI Software

AML.T0011.000 Unsafe AI Artifacts

AML.T0018 Manipulate AI Model

AML.T0029 Denial of AI Service

AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Art.15

ISO 42001: 8.4, 9.1

NIST AI RMF: GOVERN-6.1, MANAGE-2.4

OWASP LLM Top 10: LLM05:2025

What are the technical details?

Original Advisory

svm_predict_values in svm.cpp in Libsvm v324, as used in scikit-learn 0.23.2 and other products, allows attackers to cause a denial of service (segmentation fault) via a crafted model SVM (introduced via pickle, json, or any other model permanence standard) with a large value in the _n_support array. NOTE: the scikit-learn vendor's position is that the behavior can only occur if the library's API is violated by an application that changes a private attribute.

Exploitation Scenario

An adversary targeting an ML platform that accepts user-submitted scikit-learn models crafts a malicious SVM model by loading a legitimate model via pickle, then programmatically setting `model._n_support` to an array with an extremely large integer value, and re-serializing. They submit this model to the platform's model evaluation endpoint. When the backend calls `model.predict()`, libsvm's svm_predict_values dereferences memory beyond allocated bounds in the support vector arrays, producing a segfault that kills the inference worker process. On a shared inference platform, this denies service to all tenants. Repeating submissions prevents recovery and constitutes a sustained DoS against the ML serving infrastructure.
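
On the defensive side, a platform that must accept SVM classifiers can run a cheap consistency check before calling predict. The sketch below is an assumption about what a reasonable pre-check looks like, not the upstream scikit-learn fix: it rejects a model whose per-class support-vector counts are negative or do not sum to the number of stored support vectors. The function name is hypothetical, and the check is intended for classifiers; regressors and one-class models may store counts differently.

```python
from typing import Sequence


def n_support_is_consistent(n_support: Sequence[int], n_support_vectors: int) -> bool:
    """Return True if per-class counts are non-negative and sum to the stored total.

    n_support:          per-class support-vector counts (e.g. a model's _n_support)
    n_support_vectors:  number of rows in the stored support-vector matrix
    """
    counts = [int(c) for c in n_support]
    if any(c < 0 for c in counts):
        return False
    return sum(counts) == n_support_vectors


if __name__ == "__main__":
    print(n_support_is_consistent([2, 3], 5))            # True
    print(n_support_is_consistent([2, 2**31 - 1], 5))    # False: inflated count
```

For a fitted `sklearn.svm.SVC` instance named `model`, the call would be `n_support_is_consistent(model._n_support, model.support_vectors_.shape[0])`. Because `_n_support` is a private attribute, its layout may differ between scikit-learn versions, so this check complements rather than replaces upgrading and provenance controls.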