BentoML 1.3.9's Gradio integration exposes an unauthenticated DoS vector on the /login endpoint — no credentials, no user interaction required to take down inference services. No official patch exists yet; immediately restrict public network access to BentoML instances and apply WAF rate-limiting on multipart requests. Any ML model serving infrastructure exposed to untrusted networks is at risk of complete availability loss.
Risk Assessment
HIGH. CVSS 7.5 with AV:N/AC:L/PR:N/UI:N means this is trivially exploitable by any unauthenticated attacker over the internet. The lack of a patch elevates operational risk significantly. BentoML is widely used in production ML serving pipelines, and the Gradio integration is increasingly common for rapid model deployment. Exposure surface is broad: any internet-facing BentoML 1.3.9 deployment with Gradio enabled is fully vulnerable until network-level controls are applied.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| bentoml | pip | <= 1.3.9 | No patch |
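If you run bentoml in the vulnerable range above with the Gradio integration reachable from untrusted networks, treat the deployment as affected.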
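To check whether an installed copy falls inside the range listed in the table, a small version comparison is enough. The sketch below is an illustration, not an official detection tool: the helper names (`parse_release`, `is_in_vulnerable_range`, `installed_bentoml_version`) are hypothetical, and the parser deliberately ignores pre-release and post-release suffixes, which is a simplification.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Upper bound of the vulnerable range taken from the table above.
LAST_VULNERABLE = (1, 3, 9)


def parse_release(version: str) -> Tuple[int, ...]:
    """Return the leading numeric release as a 3-tuple, e.g. '1.3.9.post1' -> (1, 3, 9)."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if not match:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    while len(parts) < 3:  # pad '1.3' to (1, 3, 0)
        parts.append(0)
    return tuple(parts[:3])


def is_in_vulnerable_range(version: str) -> bool:
    """True when the release number is at or below the last vulnerable release."""
    return parse_release(version) <= LAST_VULNERABLE


def installed_bentoml_version() -> Optional[str]:
    """Version of the locally installed bentoml distribution, or None if absent."""
    try:
        return metadata.version("bentoml")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_bentoml_version()
    if found is None:
        print("bentoml is not installed in this environment")
    elif is_in_vulnerable_range(found):
        print(f"bentoml {found} is within the vulnerable range")
    else:
        print(f"bentoml {found} is above the vulnerable range")
```

Running the script inside the serving environment (the same virtualenv or container image) matters, since the check only sees packages installed there.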
Recommended Action
1. IMMEDIATE: Block or rate-limit HTTP requests to /login and all multipart endpoints at the WAF/reverse proxy layer (Nginx, Cloudflare, AWS WAF). Restrict access to trusted IP ranges.
2. NETWORK: If the Gradio UI is not required externally, bind BentoML to localhost or an internal VPC only; do not expose it publicly.
3. MONITORING: Alert on abnormal CPU/memory spikes on BentoML processes; set up health check endpoints with auto-restart on failure.
4. PATCH: Monitor https://github.com/bentoml/BentoML/releases for a fix; currently no patch is available. Pin to a patched version immediately once released.
5. WORKAROUND: Disable the Gradio integration entirely if it is not actively used by setting `gradio_enabled=false` in the BentoML configuration.
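The rate-limiting step is normally configured in the WAF or reverse proxy itself, but the underlying logic can be sketched in a few lines. The following is a minimal per-client sliding-window limiter, offered as an illustration only: the class name `SlidingWindowLimiter` and its parameters are hypothetical, the state is in-memory and single-process, and the thresholds are placeholders rather than recommended values.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within a rolling time window."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str) -> bool:
        """Record the request and return True, or return False if the client is over budget."""
        now = self.clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_requests=5, window_seconds=60.0)
    for attempt in range(7):
        print(attempt, limiter.allow("203.0.113.7"))
```

A limiter like this bounds how many requests a single source can send, but it does not reduce the cost of each malformed request, so it complements rather than replaces the network restrictions in steps 1 and 2.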
Frequently Asked Questions
What is GHSA-hh3j-9m59-p8vc?
GHSA-hh3j-9m59-p8vc is an unauthenticated denial-of-service vulnerability in the /login endpoint of the Gradio app integrated into BentoML 1.3.9. A request with a malformed multipart boundary drives excessive resource consumption and can make the inference service unavailable. No official patch exists yet.
Is GHSA-hh3j-9m59-p8vc actively exploited?
No confirmed active exploitation of GHSA-hh3j-9m59-p8vc has been reported. Because no patch is available yet, organizations should still apply the network-level mitigations proactively.
How to fix GHSA-hh3j-9m59-p8vc?
1. IMMEDIATE: Block or rate-limit HTTP requests to /login and all multipart endpoints at the WAF/reverse proxy layer (Nginx, Cloudflare, AWS WAF). Restrict access to trusted IP ranges.
2. NETWORK: If the Gradio UI is not required externally, bind BentoML to localhost or an internal VPC only; do not expose it publicly.
3. MONITORING: Alert on abnormal CPU/memory spikes on BentoML processes; set up health check endpoints with auto-restart on failure.
4. PATCH: Monitor https://github.com/bentoml/BentoML/releases for a fix; currently no patch is available. Pin to a patched version immediately once released.
5. WORKAROUND: Disable the Gradio integration entirely if it is not actively used by setting `gradio_enabled=false` in the BentoML configuration.
What systems are affected by GHSA-hh3j-9m59-p8vc?
This vulnerability affects the following AI/ML architecture patterns: model serving, inference APIs, MLOps platforms.
What is the CVSS score for GHSA-hh3j-9m59-p8vc?
GHSA-hh3j-9m59-p8vc has a CVSS v3.0 base score of 7.5 (HIGH).
Technical Details
NVD Description
In bentoml/bentoml version 1.3.9, the `/login` endpoint of the newly integrated Gradio app is vulnerable to a Denial of Service (DoS) attack. This vulnerability can be exploited by appending characters, such as dashes (-), to the end of a multipart boundary in an HTTP request. The server continuously processes each character, leading to excessive resource consumption and rendering the service unavailable. The issue is unauthenticated and does not require any user interaction.
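One hygiene check that follows from this description is to validate the `boundary` parameter before a request reaches the multipart parser. RFC 2046 limits a boundary to 1 to 70 characters from a restricted set, with no trailing space. The sketch below applies that rule to a Content-Type header value; the function names are hypothetical, and it is an assumption that the oversized boundary is visible in the header. If the extra characters appear only on the delimiter lines inside the request body, this check alone would not catch them.

```python
import re
from email.message import Message
from typing import Optional

# Characters RFC 2046 permits in a boundary ("bchars"); a space is allowed
# anywhere except as the final character.
_BCHARS = r"0-9A-Za-z'()+_,\-./:=?"
_BOUNDARY_RE = re.compile(rf"[{_BCHARS} ]{{0,69}}[{_BCHARS}]")


def extract_boundary(content_type: str) -> Optional[str]:
    """Return the boundary parameter of a Content-Type header value, if present."""
    msg = Message()
    msg["Content-Type"] = content_type
    boundary = msg.get_param("boundary")
    return boundary if isinstance(boundary, str) else None


def is_valid_boundary(boundary: str) -> bool:
    """True when the boundary is 1-70 allowed characters and does not end in a space."""
    return _BOUNDARY_RE.fullmatch(boundary) is not None


def should_reject(content_type: str) -> bool:
    """Reject multipart requests whose boundary is missing or outside the RFC 2046 limits."""
    if not content_type.lower().startswith("multipart/"):
        return False
    boundary = extract_boundary(content_type)
    return boundary is None or not is_valid_boundary(boundary)


if __name__ == "__main__":
    normal = "multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW"
    padded = "multipart/form-data; boundary=----WebKitFormBoundary" + "-" * 500
    print(should_reject(normal))
    print(should_reject(padded))
```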
Exploitation Scenario
An attacker identifies a public-facing BentoML 1.3.9 endpoint (e.g., via Shodan, GitHub repository leaks, or DNS enumeration). They send a crafted HTTP POST to /login with a Content-Type header containing a multipart boundary followed by hundreds of appended dash characters (e.g., `Content-Type: multipart/form-data; boundary=----WebKitFormBoundary----------...`). The server enters a processing loop for each appended character, consuming increasing CPU and memory. With repeated requests — trivially automated with curl or a basic Python script — the server exhausts resources and crashes or becomes unresponsive. No credentials, tokens, or prior knowledge of the system are required. The attack takes seconds to execute and requires no AI/ML domain expertise.
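Where a WAF rule is not immediately available, a request like the one in this scenario could in principle be turned away at the application edge. The sketch below is a plain ASGI middleware, written against the ASGI interface only, that answers 400 when a protected path receives a Content-Type boundary longer than the 70 characters RFC 2046 allows. `BoundaryGuard` and `probe` are hypothetical names, how such a wrapper would be attached to a BentoML service is not shown and depends on the deployment, and the guard inspects only the header as described in the scenario, not the request body.

```python
import asyncio
import re

MAX_BOUNDARY_LEN = 70  # upper bound on boundary length in RFC 2046
_BOUNDARY_PARAM = re.compile(rb'boundary="?([^";]*)"?', re.IGNORECASE)


class BoundaryGuard:
    """ASGI middleware: reject over-long multipart boundaries on selected paths."""

    def __init__(self, app, protected_paths=("/login",)):
        self.app = app
        self.protected_paths = set(protected_paths)

    async def __call__(self, scope, receive, send):
        if scope["type"] == "http" and scope.get("path") in self.protected_paths:
            # ASGI delivers headers as (name, value) byte pairs with lowercase names.
            headers = dict(scope.get("headers") or [])
            match = _BOUNDARY_PARAM.search(headers.get(b"content-type", b""))
            if match and len(match.group(1)) > MAX_BOUNDARY_LEN:
                await send({
                    "type": "http.response.start",
                    "status": 400,
                    "headers": [(b"content-type", b"text/plain")],
                })
                await send({"type": "http.response.body", "body": b"invalid multipart boundary"})
                return  # never hand the request to the wrapped application
        await self.app(scope, receive, send)


async def ok_app(scope, receive, send):
    """Stand-in for the real application; always answers 200."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


def probe(app, content_type: bytes, path: str = "/login") -> int:
    """Send one synthetic POST through `app` and return the response status."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    scope = {
        "type": "http",
        "method": "POST",
        "path": path,
        "headers": [(b"content-type", content_type)],
    }
    asyncio.run(app(scope, receive, send))
    return sent[0]["status"]


if __name__ == "__main__":
    guarded = BoundaryGuard(ok_app)
    padded = b"multipart/form-data; boundary=----WebKitFormBoundary" + b"-" * 500
    print(probe(guarded, padded))
    print(probe(guarded, b"multipart/form-data; boundary=----WebKitFormBoundaryAbC123"))
```

Rejecting before the body is read keeps the cost of a hostile request close to the cost of parsing its headers, which is the property the WAF rule in the recommended actions is also aiming for.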
CVSS Vector
CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
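The 7.5 base score can be reproduced from this vector with the published CVSS v3 base-score equations. The sketch below implements only the base metrics and uses the integer-based round-up helper from the v3.1 specification to avoid floating-point surprises; the function names are illustrative. For this vector the exploitability sub-score is 8.22 × 0.85 × 0.77 × 0.85 × 0.85 ≈ 3.89 and the impact sub-score is 6.42 × 0.56 ≈ 3.60, which sum to about 7.48 and round up to 7.5.

```python
import math

# Metric weights from the CVSS v3 specification (base metrics only).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place using the integer method from CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def parse_vector(vector: str) -> dict:
    """Split 'CVSS:3.x/AV:N/...' into a {metric: value} dictionary."""
    prefix, *metrics = vector.strip().split("/")
    if not prefix.startswith("CVSS:3."):
        raise ValueError(f"not a CVSS v3 vector: {vector!r}")
    return dict(metric.split(":", 1) for metric in metrics)


def base_score(vector: str) -> float:
    """Compute the CVSS v3 base score for a vector string."""
    m = parse_vector(vector)
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H"))  # 7.5
```

The whole score comes from the availability impact (A:H) combined with the lowest-friction exploitability settings, which matches the advisory's emphasis on unauthenticated, network-reachable attacks.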
Related Vulnerabilities
| CVE | CVSS | Summary | Relation |
|---|---|---|---|
| CVE-2025-54381 | 9.9 | BentoML: unauthenticated SSRF via file upload URLs | Same package: bentoml |
| CVE-2025-27520 | 9.8 | BentoML: unauthenticated RCE via insecure deserialization | Same package: bentoml |
| CVE-2024-9070 | 9.8 | BentoML: unauthenticated RCE via runner deserialization | Same package: bentoml |
| CVE-2025-32375 | 9.8 | BentoML: RCE via insecure deserialization in runner | Same package: bentoml |
| CVE-2026-35044 | 8.8 | BentoML: malicious bento archive RCE via Jinja2 SSTI | Same package: bentoml |