GHSA-5ccf-884p-4jjq: Open WebUI DoS — HIGH

CISO Take

Any open-webui instance at version 0.3.21 or earlier that is reachable over the network can be taken offline with a single unauthenticated HTTP request to any of three core endpoints, including RAG document ingestion and audio transcription. Triggering the resource exhaustion requires no authentication and is trivial to script and automate. Immediately restrict network access to trusted IPs and apply rate-limiting on multipart upload endpoints; no official patch is currently listed.

What is the risk?

HIGH. Zero authentication barrier combined with network accessibility and high impact on core AI functionality makes this a priority for any organization running open-webui. The attack requires no AI/ML knowledge — just a crafted HTTP POST with a padded multipart boundary. Risk escalates significantly for internet-facing deployments or instances accessible from untrusted internal segments. Absence of a listed patch version extends the exposure window, leaving network controls as the only current mitigation.

What systems are affected?

Package	Ecosystem	Vulnerable Range	Patched
Open WebUI	npm	<= 0.3.21	No patch
Open WebUI	pip	<= 0.3.21	No patch

How severe is it?

CVSS 3.1

7.5 / 10

EPSS

N/A

Exploitation Status

No known exploitation

Sophistication

Trivial

What is the attack surface?

AV Network

AC Low

PR None

UI None

S Unchanged

C None

I None

A High
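
The 7.5 base score can be reproduced from the metrics listed above. The following is a minimal sketch of the CVSS v3.1 base-score arithmetic for the Scope: Unchanged case only; the function and table names are illustrative and not part of any official tooling, and the weights are the ones published in the CVSS v3.1 specification.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged only).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value):
    """Round up to one decimal place, as defined in the CVSS v3.1 spec."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a):
    """Compute the base score for a vector whose Scope is Unchanged."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# AV:N / AC:L / PR:N / UI:N / S:U / C:N / I:N / A:H
print(base_score_scope_unchanged("N", "L", "N", "N", "N", "N", "H"))
```

With only availability affected, the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is ≈ 3.89, which rounds up to 7.5.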

What should I do?


IMMEDIATE

Restrict access to /ollama/models/upload, /audio/api/v1/transcriptions, and /rag/api/v1/doc via firewall or reverse proxy ACLs to trusted IPs only.
Deploy WAF or rate-limiting rules targeting multipart/form-data POST requests to these endpoints.
Place open-webui behind an authenticating reverse proxy (e.g., Nginx + OAuth2-proxy or basic auth) as an interim control if internet-facing.
Set OS-level resource limits (CPU/memory cgroups, ulimits) on the open-webui process to contain blast radius.
Monitor GitHub releases for open-webui > 0.3.21 and prioritize patching immediately on release.

DETECTION

Alert on sustained CPU/memory spikes from the open-webui process correlated with high-rate multipart POST requests to affected endpoints.
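
The detection rule above can be approximated offline against access logs. The sketch below is illustrative only: the record layout (timestamp, client IP, method, path, content type), the function name, and the window and threshold values are all assumptions to be adapted to the actual log format and traffic baseline. It covers the request-rate half of the rule; correlating with CPU and memory metrics is left to the monitoring stack.

```python
from collections import defaultdict, deque

# Endpoints named in the advisory.
AFFECTED_PATHS = (
    "/ollama/models/upload",
    "/audio/api/v1/transcriptions",
    "/rag/api/v1/doc",
)


def flag_noisy_clients(records, window_seconds=10.0, threshold=20):
    """Return client IPs that sent `threshold` or more multipart POSTs to an
    affected endpoint within any `window_seconds` span.

    `records` is an iterable of (timestamp_seconds, client_ip, method, path,
    content_type) tuples sorted by timestamp (hypothetical layout).
    """
    recent = defaultdict(deque)  # client_ip -> timestamps inside the window
    flagged = set()
    for ts, ip, method, path, content_type in records:
        if method != "POST" or not path.startswith(AFFECTED_PATHS):
            continue
        if not content_type.lower().startswith("multipart/form-data"):
            continue
        hits = recent[ip]
        hits.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while hits and ts - hits[0] > window_seconds:
            hits.popleft()
        if len(hits) >= threshold:
            flagged.add(ip)
    return flagged
```

The thresholds here are placeholders; legitimate bulk document uploads may need a higher limit or an allowlist.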

How is it classified?

DoS Inference RAG Framework

AML.T0029 - Denial of AI Service

AML.T0034 - Cost Harvesting

AML.T0049 - Exploit Public-Facing Application

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act

Art.15 - Accuracy, robustness and cybersecurity

ISO 42001

A.9.2 - AI system availability and resilience

NIST AI RMF

MANAGE-2.2 - AI risk treatment and residual risk management

OWASP LLM Top 10

LLM04 - Model Denial of Service

Frequently Asked Questions

What is GHSA-5ccf-884p-4jjq?

Any open-webui instance at version 0.3.21 or earlier that is reachable over the network can be taken offline with a single unauthenticated HTTP request to any of three core endpoints, including RAG document ingestion and audio transcription. Triggering the resource exhaustion requires no authentication and is trivial to script and automate. Immediately restrict network access to trusted IPs and apply rate-limiting on multipart upload endpoints; no official patch is currently listed.

Is GHSA-5ccf-884p-4jjq actively exploited?

No confirmed active exploitation of GHSA-5ccf-884p-4jjq has been reported, but organizations should still apply the mitigations proactively, since no patched version is currently listed.

How to fix GHSA-5ccf-884p-4jjq?

1. IMMEDIATE: Restrict access to /ollama/models/upload, /audio/api/v1/transcriptions, and /rag/api/v1/doc via firewall or reverse proxy ACLs to trusted IPs only.
2. Deploy WAF or rate-limiting rules targeting multipart/form-data POST requests to these endpoints.
3. Place open-webui behind an authenticating reverse proxy (e.g., Nginx + OAuth2-proxy or basic auth) as an interim control if internet-facing.
4. Set OS-level resource limits (CPU/memory cgroups, ulimits) on the open-webui process to contain blast radius.
5. Monitor GitHub releases for open-webui > 0.3.21 and prioritize patching immediately on release.
6. DETECTION: Alert on sustained CPU/memory spikes from the open-webui process correlated with high-rate multipart POST requests to affected endpoints.

What systems are affected by GHSA-5ccf-884p-4jjq?

This vulnerability affects the following AI/ML architecture patterns: LLM inference servers, RAG pipelines, AI model serving, web-based AI interfaces, audio transcription pipelines.

What is the CVSS score for GHSA-5ccf-884p-4jjq?

GHSA-5ccf-884p-4jjq has a CVSS v3.1 base score of 7.5 (HIGH).

What is the AI security impact?

Affected AI Architectures

LLM inference servers, RAG pipelines, AI model serving, web-based AI interfaces, audio transcription pipelines

MITRE ATLAS Techniques

AML.T0029 Denial of AI Service

AML.T0034 Cost Harvesting

AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Art.15

ISO 42001: A.9.2

NIST AI RMF: MANAGE-2.2

OWASP LLM Top 10: LLM04

What are the technical details?

Original Advisory

A Denial of Service (DoS) vulnerability exists in open-webui/open-webui version 0.3.21. This vulnerability affects multiple endpoints, including `/ollama/models/upload`, `/audio/api/v1/transcriptions`, and `/rag/api/v1/doc`. The application processes multipart boundaries without authentication, leading to resource exhaustion. By appending additional characters to the multipart boundary, an attacker can cause the server to parse each byte of the boundary, ultimately leading to service unavailability. This vulnerability can be exploited remotely, resulting in high CPU and memory usage, and rendering the service inaccessible to legitimate users.
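
Because the advisory attributes the exhaustion to an over-long multipart boundary, one plausible interim guard is to reject such requests before the body is parsed. RFC 2046 limits a boundary to between 1 and 70 characters, so anything longer is already non-conforming. The sketch below is an assumption about how such a check might look in front of the application (for example in middleware or a proxy hook); it is not the project's fix, and the function name is hypothetical. It uses only the standard-library header parser.

```python
from email.message import Message

# RFC 2046, section 5.1.1: a boundary is 1 to 70 characters long.
MAX_BOUNDARY_LEN = 70


def is_boundary_acceptable(content_type):
    """Return False for multipart requests whose boundary is missing,
    oddly encoded, or longer than RFC 2046 allows."""
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_maintype() != "multipart":
        return True  # Not a multipart request; nothing to check here.
    boundary = msg.get_param("boundary")
    # get_param returns None when the parameter is absent and a tuple when
    # the value is RFC 2231 encoded; both cases are rejected.
    if not isinstance(boundary, str):
        return False
    return 1 <= len(boundary) <= MAX_BOUNDARY_LEN


print(is_boundary_acceptable("multipart/form-data; boundary=----abc123"))
print(is_boundary_acceptable("multipart/form-data; boundary=" + "a" * 5000))
```

A request that fails the check could be answered with a 400 response without invoking the multipart parser at all. Whether this fully closes the issue depends on the parser's behaviour, which the advisory does not detail.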

Exploitation Scenario

An external attacker discovers an open-webui instance via Shodan or internal network scan. They craft a multipart HTTP POST to /rag/api/v1/doc with a boundary string padded with thousands of additional characters — a single-line curl command. The server's multipart parser processes each byte of the extended boundary, consuming disproportionate CPU cycles per request. The attacker runs a simple script firing concurrent requests, exhausting server resources within seconds and rendering the entire AI assistant — chat, document ingestion, model management — unavailable. No credentials, no prior knowledge of the target's AI stack, no cleanup required.

Weaknesses (CWE)

CWE-400 Uncontrolled Resource Consumption Primary

CWE-400 — Uncontrolled Resource Consumption: The product does not properly control the allocation and maintenance of a limited resource.

[Architecture and Design] Design throttling mechanisms into the system architecture. The best protection is to limit the amount of resources that an unauthorized user can cause to be expended. A strong authentication and access control model will help prevent such attacks from occurring in the first place. The login application should be protected against DoS attacks as much as possible. Limiting the database access, perhaps by caching result sets, can help minimize the resources expended. To further limit the potential for a DoS attack, consider tracking the rate of requests received from users and blocking requests that exceed a defined rate threshold.
[Architecture and Design] Mitigation of resource exhaustion attacks requires that the target system either: recognizes the attack and denies that user further access for a given amount of time, or uniformly throttles all requests in order to make it more difficult to consume resources more quickly than they can again be freed. The first of these solutions is an issue in itself though, since it may allow attackers to prevent the use of the system by a particular valid user. If the attacker impersonates the valid user, they may be able to prevent the user from accessing the server in question. The second solution is simply difficult to effectively institute -- and even when properly done, it does not provide a full solution. It simply makes the attack require more resources on the part of the attacker.
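
The request-rate tracking described in the first mitigation can be sketched as a per-client sliding-window limiter. This is a minimal in-memory illustration, not a production control: the class name, the limits, and the choice of client identifier (IP address, API key, and so on) are assumptions, and a real deployment would more likely rely on the reverse proxy or WAF for throttling.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` per client in any `window_seconds` span."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock  # Injectable so the limiter can be tested.
        self._hits = defaultdict(deque)  # client_id -> accepted timestamps

    def allow(self, client_id):
        now = self._clock()
        hits = self._hits[client_id]
        # Forget requests that are older than the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False  # Over the threshold: caller should reject (e.g. 429).
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=5, window_seconds=60.0)
print([limiter.allow("203.0.113.7") for _ in range(6)])
```

As the CWE text notes, per-client blocking can be abused to lock out a legitimate user if the client identifier can be spoofed, so the identifier should be one the attacker cannot forge.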

Source: MITRE CWE corpus.