CVE-2025-32375: BentoML: RCE via insecure deserialization in runner
GHSA-7v4r-c989-xh26 CRITICAL PoC AVAILABLE CISA: ATTEND
Any BentoML runner server exposed to the network can be fully compromised by an unauthenticated attacker with a single crafted POST request; no credentials or user interaction are required. Patch to 1.4.8 immediately, or take runner endpoints offline until you can. A public PoC is available and the EPSS is 0.67, so opportunistic exploitation should be expected.
What is the risk?
Extreme. CVSS 9.8 with AV:N/AC:L/PR:N/UI:N means remote code execution over the network with low attack complexity, no privileges, and no user interaction: the attacker only needs to send an HTTP request. An EPSS of 0.67 indicates a high predicted likelihood of exploitation. BentoML is a production-grade model serving framework used in real-time inference pipelines, so compromise gives an attacker a foothold inside the ML inference layer, with access to models, API keys, training artifacts, and downstream data pipelines.
What should I do?
5 steps:
1. PATCH: Upgrade BentoML to 1.4.8 immediately. This is the only complete fix.
2. ISOLATE: If patching is not immediately possible, restrict runner server endpoints to localhost or trusted internal subnets only via firewall/network policy. Do not expose runner ports externally under any circumstances.
3. DETECT: Review HTTP access logs for POST requests to runner endpoints with unusual or binary Content-Type headers, particularly pickle/msgpack types. Alert on unexpected outbound connections from runner processes (reverse shell indicators).
4. AUDIT: Identify all BentoML deployments in your environment: check container images, Kubernetes manifests, and cloud deployments. Treat all pre-1.4.8 instances as potentially compromised if they were network-accessible.
5. ROTATE: If compromise is suspected, rotate all credentials accessible from runner hosts (API keys, DB passwords, cloud IAM tokens).
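For the AUDIT step, the following is a minimal sketch of a version check that could be run inside each environment or container image. It assumes the package is installed under the distribution name `bentoml` and compares only the numeric release segment against 1.4.8; the helper names are illustrative, and pre-release or local version suffixes are ignored rather than handled precisely.

```python
import re
from importlib import metadata

# First patched release according to the advisory.
FIXED_VERSION = (1, 4, 8)


def parse_version(version: str) -> tuple:
    """Return the leading numeric release segment, e.g. '1.4.7.post1' -> (1, 4, 7)."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))
    # Pad short versions such as '1.4' to three components for comparison.
    return parts + (0,) * (3 - len(parts))


def is_vulnerable(version: str) -> bool:
    """True if the version is older than the first patched release."""
    return parse_version(version) < FIXED_VERSION


def installed_bentoml_status() -> str:
    """Describe the BentoML install in the current Python environment."""
    try:
        version = metadata.version("bentoml")
    except metadata.PackageNotFoundError:
        return "bentoml is not installed in this environment"
    state = "VULNERABLE, upgrade to 1.4.8 or later" if is_vulnerable(version) else "patched"
    return f"bentoml {version}: {state}"


if __name__ == "__main__":
    print(installed_bentoml_status())
```

A check like this only covers the interpreter it runs in, so it would need to be executed per image or per virtual environment.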
What does CISA's SSVC say?
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
What is CVE-2025-32375?
CVE-2025-32375 is a critical insecure deserialization vulnerability (CWE-502) in the runner server of BentoML versions prior to 1.4.8. An unauthenticated attacker who can reach a runner endpoint can execute arbitrary code on the server with a crafted POST request; no credentials or user interaction are required. The vulnerability is fixed in 1.4.8.
Is CVE-2025-32375 actively exploited?
A weaponized Metasploit module (exploit/linux/http/bentoml_runner_server_rce_cve_2025_32375) exists for CVE-2025-32375, meaning the exploit is point-and-click and the risk of opportunistic exploitation is high.
How to fix CVE-2025-32375?
1. PATCH: Upgrade BentoML to 1.4.8 immediately. This is the only complete fix.
2. ISOLATE: If patching is not immediately possible, restrict runner server endpoints to localhost or trusted internal subnets only via firewall/network policy. Do not expose runner ports externally under any circumstances.
3. DETECT: Review HTTP access logs for POST requests to runner endpoints with unusual or binary Content-Type headers, particularly pickle/msgpack types. Alert on unexpected outbound connections from runner processes (reverse shell indicators).
4. AUDIT: Identify all BentoML deployments in your environment: check container images, Kubernetes manifests, and cloud deployments. Treat all pre-1.4.8 instances as potentially compromised if they were network-accessible.
5. ROTATE: If compromise is suspected, rotate all credentials accessible from runner hosts (API keys, DB passwords, cloud IAM tokens).
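To spot-check the ISOLATE step, a simple TCP reachability probe run from a host outside the trusted subnet can confirm that a runner port no longer accepts connections. This is a rough sketch using only the standard library; the host and port values are placeholders, not values taken from the advisory, and a successful connection says nothing about which service is listening.

```python
import socket


def is_port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical runner address; replace with the endpoints found in your audit.
    host, port = "runner.example.internal", 8080
    if is_port_reachable(host, port):
        print(f"{host}:{port} is reachable from this host; isolation is NOT in effect")
    else:
        print(f"{host}:{port} is not reachable from this host")
```

The probe is only meaningful when run from the network position an attacker would occupy; running it on the runner host itself will report the port as reachable even when the firewall rules are correct.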
What systems are affected by CVE-2025-32375?
This vulnerability affects the following AI/ML architecture patterns: model serving, inference endpoints, MLOps pipelines, microservice inference architectures, containerized model deployments.
What is the CVSS score for CVE-2025-32375?
CVE-2025-32375 has a CVSS v3.1 base score of 9.8 (CRITICAL). The EPSS exploitation probability is 43.81%.
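The 9.8 base score can be reproduced from the vector listed under CVSS Vector (CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H). The sketch below applies the CVSS v3.1 base score equations for the scope-unchanged case only; the function names are illustrative, and vectors with S:C are deliberately rejected rather than scored.

```python
import math

# Metric weights from the CVSS v3.1 specification (scope unchanged).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a scope-unchanged CVSS v3.1 vector."""
    metrics = dict(part.split(":", 1) for part in vector.split("/"))
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles scope-unchanged vectors")
    # Impact sub-score: ISS = 1 - (1 - C)(1 - I)(1 - A), Impact = 6.42 * ISS.
    iss = 1 - (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    impact = 6.42 * iss
    # Exploitability = 8.22 * AV * AC * PR * UI.
    exploitability = (
        8.22 * AV[metrics["AV"]] * AC[metrics["AC"]]
        * PR_UNCHANGED[metrics["PR"]] * UI[metrics["UI"]]
    )
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))
```

For this vector the impact sub-score is about 5.87 and the exploitability sub-score about 3.89; their sum of roughly 9.76 rounds up to 9.8.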
What is the AI security impact?
MITRE ATLAS Techniques
- AML.T0010.001 AI Software
- AML.T0025 Exfiltration via Cyber Means
- AML.T0049 Exploit Public-Facing Application
- AML.T0050 Command and Scripting Interpreter
- AML.T0072 Reverse Shell
What are the technical details?
Original Advisory
BentoML is a Python library for building online serving systems optimized for AI apps and model inference. Prior to 1.4.8, there was an insecure deserialization in BentoML's runner server. By setting specific headers and parameters in the POST request, it is possible to execute any unauthorized arbitrary code on the server, which will grant the attackers to have the initial access and information disclosure on the server. This vulnerability is fixed in 1.4.8.
Exploitation Scenario
An attacker scans for BentoML runner server endpoints (default port 8080/8081) using Shodan or internal network reconnaissance. They craft a POST request to a runner inference endpoint with a Content-Type header indicating pickle serialization and a malicious payload that executes arbitrary shell commands upon deserialization. The runner server deserializes the payload without validation, executing the attacker's code as the runner process user. From here, the attacker drops a reverse shell, exfiltrates model weights and environment variables (cloud credentials, API keys), and establishes persistence. In a Kubernetes environment, the service account token is harvested for cluster-wide lateral movement.
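The request shape in this scenario (a POST carrying a pickle-style Content-Type) can be hunted for in access logs. The sketch below assumes logs are available as JSON lines with `method` and `content_type` fields; that log schema, the field names, and the sample content type value are assumptions for illustration, not a format BentoML is known to emit, so the parser would need adapting to your proxy or ingress logs.

```python
import json
from typing import Iterable, Iterator

# Substrings treated as suspicious in a Content-Type header (per the DETECT guidance).
SUSPICIOUS_MARKERS = ("pickle", "msgpack")


def flag_suspicious(lines: Iterable[str]) -> Iterator[dict]:
    """Yield log records for POST requests whose content type looks like a serialized payload."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict):
            continue
        method = str(record.get("method", "")).upper()
        content_type = str(record.get("content_type", "")).lower()
        if method == "POST" and any(m in content_type for m in SUSPICIOUS_MARKERS):
            yield record


if __name__ == "__main__":
    # Illustrative sample records; the content type value is an example only.
    sample = [
        '{"method": "POST", "path": "/", "content_type": "application/vnd.bentoml.pickled", "remote_addr": "203.0.113.7"}',
        '{"method": "POST", "path": "/predict", "content_type": "application/json", "remote_addr": "10.0.0.5"}',
    ]
    for hit in flag_suspicious(sample):
        print("suspicious request:", hit)
```

Matching on Content-Type alone will miss payloads sent with a misleading header, so it is best paired with the outbound-connection alerting mentioned in the DETECT step.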
Weaknesses (CWE)
CWE-502 — Deserialization of Untrusted Data: The product deserializes untrusted data without sufficiently ensuring that the resulting data will be valid.
- [Architecture and Design, Implementation] If available, use the signing/sealing features of the programming language to assure that deserialized data has not been tainted. For example, a hash-based message authentication code (HMAC) could be used to ensure that data has not been modified.
- [Implementation] When deserializing data, populate a new object rather than just deserializing. The result is that the data flows through safe input validation and that the functions are safe.
Source: MITRE CWE corpus.
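The two CWE-502 mitigations above can be combined in a small sketch: verify an HMAC before parsing anything, use a data-only format (JSON) instead of pickle, and populate a fresh, validated object from the parsed fields. This is a generic illustration, not the fix BentoML shipped in 1.4.8; the `InferenceRequest` type, the message layout, and the key handling are all hypothetical.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InferenceRequest:
    """Hypothetical request type populated from validated fields only."""
    model: str
    inputs: List[float]


def seal(request: InferenceRequest, key: bytes) -> bytes:
    """Serialize to JSON and prepend a hex HMAC-SHA256 tag: b'<tag>.<body>'."""
    body = json.dumps({"model": request.model, "inputs": request.inputs}).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).hexdigest().encode("ascii")
    return tag + b"." + body


def unseal(message: bytes, key: bytes) -> InferenceRequest:
    """Verify the tag, then parse and validate; raise ValueError on any failure."""
    tag, sep, body = message.partition(b".")
    if not sep:
        raise ValueError("malformed message")
    expected = hmac.new(key, body, hashlib.sha256).hexdigest().encode("ascii")
    # Constant-time comparison; the body is not parsed unless the tag matches.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch")
    data = json.loads(body)
    if not isinstance(data, dict) or not isinstance(data.get("model"), str):
        raise ValueError("invalid model field")
    inputs = data.get("inputs")
    if not isinstance(inputs, list) or not all(
        isinstance(x, (int, float)) and not isinstance(x, bool) for x in inputs
    ):
        raise ValueError("invalid inputs field")
    # Populate a new object rather than trusting a deserialized one.
    return InferenceRequest(model=data["model"], inputs=[float(x) for x in inputs])


if __name__ == "__main__":
    key = b"replace-with-a-secret-from-your-secret-store"
    sealed = seal(InferenceRequest("demo", [1.0, 2.5]), key)
    print(unseal(sealed, key))
```

The design choice worth noting is the ordering: authentication happens before any parsing, and the parser itself cannot execute code, so a forged or tampered message is rejected without ever reaching a deserializer with side effects.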
CVSS Vector
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H
References
- github.com/advisories/GHSA-7v4r-c989-xh26
- github.com/pypa/advisory-database/tree/main/vulns/bentoml/PYSEC-2025-32.yaml
- nvd.nist.gov/vuln/detail/CVE-2025-32375
- github.com/bentoml/BentoML/security/advisories/GHSA-7v4r-c989-xh26 (Exploit, Vendor)
- github.com/ARPSyndicate/cve-scores (Exploit)
- github.com/nomi-sec/PoC-in-GitHub (Exploit)
- github.com/plzheheplztrying/cve_monitor (Exploit)
- github.com/theGEBIRGE/CVE-2025-32375 (Exploit)
Related Vulnerabilities
- CVE-2025-54381 (9.9) BentoML: unauthenticated SSRF via file upload URLs. Same package: bentoml
- CVE-2025-27520 (9.8) BentoML: unauthenticated RCE via insecure deserialization. Same package: bentoml
- CVE-2024-9070 (9.8) BentoML: unauthenticated RCE via runner deserialization. Same package: bentoml
- CVE-2026-35044 (8.8) BentoML: malicious bento archive RCE via Jinja2 SSTI. Same package: bentoml
- CVE-2026-44346 (8.8) BentoML: Dockerfile injection enables build-time RCE. Same package: bentoml