CVE-2024-2912: BentoML: RCE via insecure deserialization (CVSS 10)

CRITICAL | PoC AVAILABLE | CISA: ATTEND
Published April 16, 2024
CISO Take

Any BentoML model-serving endpoint exposed to the network is fully compromisable with a single unauthenticated POST request — no credentials, no prior access needed. If you run BentoML in production, treat this as a fire drill: patch to the fixed commit immediately or isolate the service behind a network perimeter. CVSS 10.0 with no mitigating factors means this is as bad as it gets for ML inference infrastructure.

Risk Assessment

Maximum severity. CVSS 10.0 with AV:N/AC:L/PR:N/UI:N/S:C means any network-reachable BentoML instance is exploitable by anyone with HTTP access — no authentication bypass needed, no social engineering, no insider access. MLOps and model-serving infrastructure is typically under-patched compared to traditional web services and may run with elevated privileges or have access to sensitive model artifacts, training data, and internal APIs. The blast radius extends well beyond the compromised endpoint.

Severity & Risk

CVSS 3.1: 10.0 / 10
EPSS: 7.5% chance of exploitation in 30 days (higher than 92% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Trivial
Exploitation Confidence: medium
CISA SSVC: Public PoC
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

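Attack Vector (AV): Network
Attack Complexity (AC): Low
Privileges Required (PR): None
User Interaction (UI): None
Scope (S): Changed
Confidentiality (C): High
Integrity (I): High
Availability (A): High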

Recommended Action

  1. PATCH

    Apply the fix at commit fd70379733c57c6368cc022ac1f841b7b426db7b or upgrade to any BentoML release that includes it.

  2. ISOLATE

    If patching is not immediately possible, restrict network access to BentoML endpoints to trusted internal IPs only — do not expose unauthenticated BentoML services to the public internet or to untrusted network segments.

  3. AUDIT EXPOSURE

    Inventory all BentoML deployments across environments (dev, staging, prod); treat any instance reachable from external networks as actively compromised until patched.

  4. REVIEW CREDENTIALS

    Rotate any secrets (API keys, cloud credentials, model registry tokens) accessible from the environment of any BentoML instance, especially if internet-facing.

  5. DETECT

    Look for anomalous process spawning from the BentoML process (e.g., unexpected shell processes, outbound connections on non-standard ports, file writes outside model directories) as indicators of exploitation.

  6. MONITOR

    Enable auditd or eBPF-based syscall monitoring on inference servers to detect deserialization abuse patterns.
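
The DETECT step above can be sketched as a walk over a process snapshot that flags shell-like processes descended from the serving process. This is a minimal illustration, not a detection product: the `Proc` record, the `SHELL_NAMES` watchlist, and the assumption that the serving process is named `bentoml` are all hypothetical, and in practice the snapshot would come from your own telemetry (auditd, eBPF, or an EDR agent).

```python
from dataclasses import dataclass

# Assumed watchlist of process names that a model server rarely needs to spawn.
SHELL_NAMES = {"sh", "bash", "dash", "zsh", "nc", "curl", "wget"}


@dataclass(frozen=True)
class Proc:
    """Hypothetical process record: pid, parent pid, and executable name."""
    pid: int
    ppid: int
    name: str


def suspicious_descendants(procs, root_name="bentoml"):
    """Return watchlisted processes that descend from any process named root_name."""
    children = {}
    for p in procs:
        children.setdefault(p.ppid, []).append(p)

    flagged = []
    seen = set()
    stack = [p for p in procs if p.name == root_name]
    while stack:
        current = stack.pop()
        if current.pid in seen:
            continue  # guard against malformed snapshots with cycles
        seen.add(current.pid)
        for child in children.get(current.pid, []):
            if child.name in SHELL_NAMES:
                flagged.append(child)
            stack.append(child)
    return sorted(flagged, key=lambda p: p.pid)


# Example snapshot: a shell spawned two levels below the serving process.
snapshot = [
    Proc(1, 0, "systemd"),
    Proc(100, 1, "bentoml"),
    Proc(101, 100, "python"),
    Proc(102, 101, "sh"),
    Proc(200, 1, "bash"),  # unrelated login shell, not a descendant
]
hits = suspicious_descendants(snapshot)
```

A hit here is only a lead for investigation; legitimate services sometimes shell out, so the watchlist would need tuning per deployment.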

CISA SSVC Assessment

Decision: Attend
Exploitation: poc
Automatable: Yes
Technical Impact: total

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act: Article 15 - Accuracy, robustness and cybersecurity
ISO 42001: A.6.2.6 - AI system security
NIST AI RMF: MANAGE 2.2 - Mechanisms are in place to respond to and recover from known or anticipated AI system risks
OWASP LLM Top 10: LLM03:2025 - Supply Chain

Frequently Asked Questions

What is CVE-2024-2912?

CVE-2024-2912 is an insecure deserialization vulnerability in the BentoML framework. A specially crafted POST request to any valid BentoML endpoint causes the server to deserialize an attacker-supplied object and execute arbitrary OS commands, with no credentials or prior access required. It carries a CVSS v3.1 base score of 10.0.

Is CVE-2024-2912 actively exploited?

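Proof-of-concept exploit code for CVE-2024-2912 is publicly available, which increases the risk of exploitation. CISA's SSVC assessment records the exploitation status as "poc" (public proof of concept) rather than active exploitation.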

How to fix CVE-2024-2912?

  1. PATCH: Apply the fix at commit fd70379733c57c6368cc022ac1f841b7b426db7b or upgrade to any BentoML release that includes it.
  2. ISOLATE: If patching is not immediately possible, restrict network access to BentoML endpoints to trusted internal IPs only. Do not expose unauthenticated BentoML services to the public internet or to untrusted network segments.
  3. AUDIT EXPOSURE: Inventory all BentoML deployments across environments (dev, staging, prod); treat any instance reachable from external networks as actively compromised until patched.
  4. REVIEW CREDENTIALS: Rotate any secrets (API keys, cloud credentials, model registry tokens) accessible from the environment of any BentoML instance, especially if internet-facing.
  5. DETECT: Look for anomalous process spawning from the BentoML process (e.g., unexpected shell processes, outbound connections on non-standard ports, file writes outside model directories) as indicators of exploitation.
  6. MONITOR: Enable auditd or eBPF-based syscall monitoring on inference servers to detect deserialization abuse patterns.
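
As a rough illustration of the ISOLATE step, the check below admits a client only if its address falls inside an allowlist of internal networks. The CIDR ranges and the `is_trusted` helper are hypothetical placeholders; in most deployments this rule belongs in a firewall, security group, or reverse proxy rather than in application code.

```python
import ipaddress

# Hypothetical trusted ranges; substitute the internal networks you actually use.
TRUSTED_NETWORKS = [
    ipaddress.ip_network(cidr) for cidr in ("10.0.0.0/8", "192.168.10.0/24")
]


def is_trusted(client_ip: str) -> bool:
    """Return True only when client_ip parses and lies inside a trusted network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparseable input is treated as untrusted
    return any(addr in net for net in TRUSTED_NETWORKS)
```

Failing closed on malformed input matters here: a request whose source address cannot be parsed should never reach the deserializing endpoint.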

What systems are affected by CVE-2024-2912?

This vulnerability affects the following AI/ML architecture patterns: model serving, MLOps pipelines, inference endpoints, RAG pipelines, agent frameworks.

What is the CVSS score for CVE-2024-2912?

CVE-2024-2912 has a CVSS v3.1 base score of 10.0 (CRITICAL). The EPSS exploitation probability is 7.49%.

Technical Details

NVD Description

An insecure deserialization vulnerability exists in the BentoML framework, allowing remote code execution (RCE) by sending a specially crafted POST request. By exploiting this vulnerability, attackers can execute arbitrary commands on the server hosting the BentoML application. The vulnerability is triggered when a serialized object, crafted to execute OS commands upon deserialization, is sent to any valid BentoML endpoint. This issue poses a significant security risk, enabling attackers to compromise the server and potentially gain unauthorized access or control.

Exploitation Scenario

An attacker scans for BentoML inference endpoints (default port 3000) or targets a known deployment. They craft a Python pickle payload that executes an OS command — e.g., a reverse shell or credential harvesting script — and serialize it. They POST this payload to any valid BentoML API endpoint (e.g., /predict or /classify). BentoML deserializes the payload server-side without validation, triggering arbitrary code execution. The attacker receives a reverse shell with the privileges of the BentoML process (often running as root or a service account with broad cloud IAM permissions). From there they exfiltrate model weights, pivot to internal infrastructure, or install persistence. The entire attack requires no authentication and can be scripted in under 20 lines of Python.
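
The mechanism behind this scenario can be shown harmlessly with Python's own `pickle` module. In the sketch below, a stand-in object asks the unpickler to call `operator.add` instead of an OS command, and a restricted unpickler (the "restricting globals" pattern from the Python documentation) rejects the same bytes. The class and function names are illustrative, and this is not BentoML's actual fix.

```python
import io
import operator
import pickle


class Harmless:
    """Stand-in for an attacker-controlled object; on load it calls operator.add."""

    def __reduce__(self):
        # pickle records "call operator.add(40, 2)" and replays that call on load.
        return (operator.add, (40, 2))


class NoGlobalsUnpickler(pickle.Unpickler):
    """Refuses every global lookup, so no callable can be resolved during load."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


def safe_loads(data: bytes):
    """Deserialize plain data (numbers, strings, lists, dicts) and nothing else."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


payload = pickle.dumps(Harmless())

# Plain pickle.loads executes the recorded call rather than rebuilding the object.
result = pickle.loads(payload)

# The restricted unpickler rejects the same bytes before any call happens.
try:
    safe_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Replacing `operator.add` with any other importable callable is all an attacker needs, which is why untrusted input should not reach `pickle.loads` at all; a data-only format such as JSON avoids the problem entirely.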

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H
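
To see how this vector produces a 10.0, the base-score arithmetic from the CVSS v3.1 specification can be sketched as follows. The weights and formulas are taken from the specification; the function names are illustrative, and the sketch handles base metrics only (no temporal or environmental metrics).

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a 'CVSS:3.1/...' vector string."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    changed = metrics["S"] == "C"

    # Impact sub-score: combine the C, I, and A weights.
    iss = 1 - (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    # Exploitability: PR uses a different weight table when scope is changed.
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]
    exploitability = 8.22 * AV[metrics["AV"]] * AC[metrics["AC"]] * pr * UI[metrics["UI"]]

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


score = base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H")
```

For this vector the scaled sum of impact and exploitability exceeds 10 and is capped. The same metrics with Scope set to Unchanged score lower, so the changed scope is what pushes this CVE to the maximum.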

Timeline

Published
April 16, 2024
Last Modified
November 21, 2024
First Seen
April 16, 2024

Related Vulnerabilities