CVE-2025-0315: Ollama: GGUF model upload causes memory exhaustion DoS

Severity: HIGH | PoC available | CISA SSVC: Track*
Published March 20, 2025
CISO Take

Any attacker with network access to an Ollama instance can crash it by uploading a crafted GGUF model file; no credentials are required by default. If your organization runs Ollama for internal LLM inference, upgrade to a release later than 0.3.14 as soon as one is available, and until then place the API behind an authenticated reverse proxy. Ollama instances exposed on internal dev networks are at high risk given the zero-auth, low-complexity exploit path.

What is the risk?

High. The combination of network-accessible vector, zero authentication required, and low attack complexity makes this trivially exploitable by anyone who can reach the Ollama API port (default 11434). Ollama is routinely deployed without network restrictions in AI dev environments and on developer workstations, expanding the attack surface significantly. Availability impact is complete for the instance; confidentiality and integrity are unaffected. No active exploitation confirmed, but CVSS 7.5 and trivial reproducibility make this a priority patch.

What systems are affected?

Package: Ollama
Ecosystem: pip
Vulnerable Range: <=0.3.14
Patched: No patch

Do you run Ollama 0.3.14 or earlier? You're affected.

How severe is it?

CVSS 3.1: 7.5 / 10
EPSS: 0.7% chance of exploitation in 30 days (higher than 47% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Trivial
Exploitation Confidence: Medium
CISA SSVC: Public PoC
Public PoC indexed (trickest/cve)

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

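Attack Vector (AV): Network
Attack Complexity (AC): Low
Privileges Required (PR): None
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High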

What should I do?

6 steps
  1. Patch: Upgrade Ollama beyond version 0.3.14 as soon as a patched release is available; monitor the official GitHub releases page.

  2. Network isolation: Restrict Ollama API (port 11434) to localhost or trusted internal subnets via firewall rules immediately.

  3. Auth proxy: Place a reverse proxy with authentication (nginx + OAuth2 proxy, Caddy with auth middleware) in front of any network-accessible Ollama instance.

  4. Restrict upload access: Audit and limit who can call /api/create and model push endpoints to administrators only.

  5. Detection: Alert on rapid memory growth in Ollama processes, OOM killer events targeting Ollama, or unexpected model creation API calls from non-admin principals.

  6. Workaround if patching is delayed: Disable the model creation endpoint at the network layer if untrusted users have API access.
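
The detection step above (alerting on rapid memory growth) can be sketched as a small sliding-window check. This is a minimal illustration, not a production monitor: the function name `rapid_growth_alerts`, the 10-second window, and the 2 GiB threshold are all assumptions chosen for the example, and how the memory samples are collected (a metrics exporter, `/proc`, a container runtime) is left open.

```python
from typing import Iterable, List, Tuple

# One sample: (unix timestamp in seconds, resident memory of the Ollama process in bytes)
Sample = Tuple[float, int]


def rapid_growth_alerts(
    samples: Iterable[Sample],
    window_s: float = 10.0,              # assumed look-back window
    max_growth_bytes: int = 2 * 1024**3,  # assumed threshold: 2 GiB within the window
) -> List[float]:
    """Return the timestamps at which memory grew by more than
    max_growth_bytes relative to the lowest sample in the last window_s seconds."""
    alerts: List[float] = []
    window: List[Sample] = []
    for ts, rss in samples:
        window.append((ts, rss))
        # Discard samples that have fallen out of the look-back window.
        while window and ts - window[0][0] > window_s:
            window.pop(0)
        lowest = min(r for _, r in window)
        if rss - lowest > max_growth_bytes:
            alerts.append(ts)
    return alerts


if __name__ == "__main__":
    GiB = 1024**3
    history = [(0.0, 1 * GiB), (5.0, int(1.2 * GiB)), (8.0, 4 * GiB), (30.0, 4 * GiB)]
    print(rapid_growth_alerts(history))
```

In practice the thresholds would need tuning, since legitimately loading a large model also produces a sharp rise in memory; correlating alerts with calls to /api/create may reduce false positives.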

What does CISA's SSVC say?

Decision: Track*
Exploitation: PoC
Automatable: Yes
Technical Impact: Partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

How is it classified?

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act: Art. 15 - Accuracy, Robustness and Cybersecurity
ISO 42001: A.6.2 - AI System Resource Management and Availability
NIST AI RMF: MANAGE 2.2 - Mechanisms to sustain deployed AI with ongoing maintenance
OWASP LLM Top 10: LLM04 - Model Denial of Service

Frequently Asked Questions

What is CVE-2025-0315?

CVE-2025-0315 is a denial-of-service vulnerability in Ollama versions 0.3.14 and earlier. An attacker with network access to an Ollama instance can upload a crafted GGUF model file that makes the server allocate unbounded memory and crash; no credentials are required by default.

Is CVE-2025-0315 actively exploited?

No active exploitation has been confirmed. However, proof-of-concept exploit code is publicly available for CVE-2025-0315, which increases the risk of exploitation.

How to fix CVE-2025-0315?

  1. Patch: Upgrade Ollama beyond version 0.3.14 as soon as a patched release is available; monitor the official GitHub releases page.

  2. Network isolation: Restrict Ollama API (port 11434) to localhost or trusted internal subnets via firewall rules immediately.

  3. Auth proxy: Place a reverse proxy with authentication (nginx + OAuth2 proxy, Caddy with auth middleware) in front of any network-accessible Ollama instance.

  4. Restrict upload access: Audit and limit who can call /api/create and model push endpoints to administrators only.

  5. Detection: Alert on rapid memory growth in Ollama processes, OOM killer events targeting Ollama, or unexpected model creation API calls from non-admin principals.

  6. Workaround if patching is delayed: Disable the model creation endpoint at the network layer if untrusted users have API access.

What systems are affected by CVE-2025-0315?

This vulnerability affects the following AI/ML architecture patterns: LLM inference servers, model serving, local AI deployments, agent frameworks, RAG pipelines.

What is the CVSS score for CVE-2025-0315?

CVE-2025-0315 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.67%.

What is the AI security impact?

Affected AI Architectures

LLM inference servers, model serving, local AI deployments, agent frameworks, RAG pipelines

MITRE ATLAS Techniques

AML.T0011.000 Unsafe AI Artifacts
AML.T0029 Denial of AI Service
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Art. 15
ISO 42001: A.6.2
NIST AI RMF: MANAGE 2.2
OWASP LLM Top 10: LLM04

What are the technical details?

Original Advisory

A vulnerability in ollama/ollama <=0.3.14 allows a malicious user to create a customized GGUF model file, upload it to the Ollama server, and create it. This can cause the server to allocate unlimited memory, leading to a Denial of Service (DoS) attack.

Exploitation Scenario

An attacker enumerates internal network services and discovers an Ollama instance on port 11434—common in AI dev environments where developers run local LLMs. Using the Ollama REST API (no authentication required by default), the attacker crafts a malicious GGUF file with a manipulated header that declares tensor metadata requiring terabytes of memory allocation. They POST this file to the /api/create endpoint. The Ollama server parses the GGUF header and attempts to allocate the declared memory without bounds validation, triggering OOM and crashing the process. All dependent services—RAG pipelines, agent frameworks, internal chatbots relying on this inference endpoint—go offline instantly. The attacker can automate resubmission to prevent service recovery, creating a sustained DoS condition.
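
The missing bounds validation described in this scenario can be illustrated with a pre-flight check on the GGUF header, for example in an upload gateway placed in front of Ollama. This is a hedged sketch, not Ollama's actual fix: it assumes the GGUF version 2/3 header layout (4-byte magic `GGUF`, little-endian uint32 version, uint64 tensor count, uint64 metadata key/value count), and the limits `MAX_TENSORS` and `MAX_METADATA_KV` as well as the function name `check_gguf_header` are hypothetical values picked for illustration.

```python
import struct

GGUF_MAGIC = b"GGUF"
# Little-endian: magic (4 bytes), version (uint32), tensor_count (uint64), metadata_kv_count (uint64)
HEADER = struct.Struct("<4sIQQ")

# Hypothetical sanity limits; an administrator would tune these for the models in use.
MAX_TENSORS = 100_000
MAX_METADATA_KV = 100_000


def check_gguf_header(data: bytes) -> None:
    """Raise ValueError if the first bytes of a file do not look like a
    plausible GGUF v2/v3 header. Only the header counts are checked here."""
    if len(data) < HEADER.size:
        raise ValueError("file too short to contain a GGUF header")
    magic, version, tensor_count, kv_count = HEADER.unpack_from(data)
    if magic != GGUF_MAGIC:
        raise ValueError("not a GGUF file (bad magic)")
    if version not in (2, 3):
        raise ValueError(f"unsupported GGUF version: {version}")
    if tensor_count > MAX_TENSORS:
        raise ValueError(f"implausible tensor count: {tensor_count}")
    if kv_count > MAX_METADATA_KV:
        raise ValueError(f"implausible metadata count: {kv_count}")


if __name__ == "__main__":
    crafted = HEADER.pack(GGUF_MAGIC, 3, 2**62, 10)  # header declaring an absurd tensor count
    try:
        check_gguf_header(crafted)
    except ValueError as exc:
        print("rejected:", exc)
```

A complete validator would also need to bound the per-tensor dimensions and the string and array lengths that follow the header, because any of those declared sizes can drive an oversized allocation.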

Weaknesses (CWE)

CWE-770 — Allocation of Resources Without Limits or Throttling: The product allocates a reusable resource or group of resources on behalf of an actor without imposing any intended restrictions on the size or number of resources that can be allocated.

  • [Requirements] Clearly specify the minimum and maximum expectations for capabilities, and dictate which behaviors are acceptable when resource allocation reaches limits.
  • [Architecture and Design] Limit the amount of resources that are accessible to unprivileged users. Set per-user limits for resources. Allow the system administrator to define these limits. Be careful to avoid CWE-410.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
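
To show how this vector yields the 7.5 base score, the CVSS v3.1 base-score equations can be worked through in a few lines. The sketch below uses the metric weights and round-up rule from the CVSS v3.1 specification but deliberately handles only Scope: Unchanged vectors, which is all this CVE needs; the function names are illustrative.

```python
import math

# CVSS v3.1 metric weights (Privileges Required values are those for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the v3.1 specification's
    integer-based procedure to avoid floating-point surprises."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score of a CVSS:3.1 vector with Scope: Unchanged."""
    parts = dict(item.split(":") for item in vector.split("/"))
    if parts.get("CVSS") != "3.1" or parts.get("S") != "U":
        raise ValueError("this sketch only handles CVSS:3.1 vectors with S:U")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H"))
```

For this vector the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is 8.22 × 0.85 × 0.77 × 0.85 × 0.85 ≈ 3.89; their sum, about 7.48, rounds up to 7.5.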

Timeline

Published: March 20, 2025
Last Modified: April 2, 2025
First Seen: March 20, 2025

Related Vulnerabilities