CVE-2025-7707: llama-index: world-writable NLTK dir allows local tampering

GHSA-rg9h-vx28-xxp5 HIGH
Published October 13, 2025

CISO Take

Upgrade llama-index to 0.13.0 immediately on any multi-user or shared compute environment. The attack requires local access, but shared ML servers, Jupyter notebook environments, and containers with sidecar processes are realistic attack surfaces. Risk is highest in organizations running llama-index on shared inference or fine-tuning infrastructure where multiple OS users coexist.

Risk Assessment

Real-world risk is low in standard cloud-native single-tenant deployments but elevated on shared ML compute clusters, data science platforms, and CI/CD pipelines where multiple users share a host. The EPSS score of 0.00024 (about 0.02%) indicates a very low predicted probability of exploitation in the next 30 days, and no exploitation has been reported to date. The local-only attack vector limits exposure for most SaaS deployments, but the trivial exploitation bar and the potential privilege escalation path via symlink abuse on misconfigured systems justify prompt patching.

Affected Systems

Package       Ecosystem   Vulnerable Range   Patched
llama-index   pip         < 0.13.0           0.13.0
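
To check a host against the vulnerable range above, a small version comparison is usually enough. The sketch below assumes the installed distribution is named `llama-index`, as in the table; the helper names (`parse_release`, `is_vulnerable`, `installed_llama_index_version`) are illustrative and not part of any library. It compares only the leading numeric components, so a pre-release such as `0.13.0rc1` is treated like `0.13.0`.

```python
import re
from importlib import metadata

# First fixed release according to the advisory.
PATCHED = (0, 13, 0)


def parse_release(version: str) -> tuple:
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = tuple(int(p) for p in match.group().split("."))
    # Pad short versions such as "0.12" to three components.
    return parts + (0,) * (3 - len(parts))


def is_vulnerable(version: str) -> bool:
    """True if the version falls in the advisory's '< 0.13.0' range."""
    return parse_release(version) < PATCHED


def installed_llama_index_version():
    """Return the installed llama-index version string, or None if absent."""
    try:
        return metadata.version("llama-index")
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_llama_index_version()` inside each virtual environment or container image and passing the result to `is_vulnerable` gives a quick per-environment answer; a `None` result means the package is not installed there.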

If you run llama-index below 0.13.0, you are affected.

Severity & Risk

CVSS 3.1
7.1 / 10
EPSS
0.02%
chance of exploitation in 30 days
Higher than 7% of all CVEs
Exploitation Status
No known exploitation
Sophistication
Trivial

Attack Surface

AV Local
AC Low
PR Low
UI None
S Unchanged
C None
I High
A High

Recommended Action

  1. Upgrade llama-index to >=0.13.0 (patch relocates NLTK cache to a user-specific directory, eliminating shared writability).

  2. Immediate workaround if patching is delayed: set the NLTK_DATA environment variable to a protected, non-world-writable path before starting the service.

  3. Audit all shared ML servers and container images for llama-index versions <0.13.0.

  4. Detection: monitor NLTK data directory for unexpected write events from non-service accounts using auditd or filesystem integrity monitoring (e.g., AIDE, Wazuh).

  5. Review container security posture and ensure llama-index processes do not run as root.
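
The workaround in step 2 can be sketched in Python as follows. This is a minimal illustration, not the upstream patch: `use_private_nltk_dir` is a hypothetical helper and the directory passed to it is your choice. It creates the directory, restricts it to the owning user (mode 0700), and points `NLTK_DATA` at it. It does not check the permissions of parent directories, and any NLTK resources the service needs would still have to be placed in the new location.

```python
import os
import stat
from pathlib import Path


def use_private_nltk_dir(path):
    """Create an owner-only directory and point NLTK_DATA at it.

    `path` is an arbitrary location chosen by the operator (an assumption
    of this sketch, not a path defined by llama-index or NLTK).
    """
    nltk_dir = Path(path).expanduser()
    nltk_dir.mkdir(parents=True, exist_ok=True)

    # chmod explicitly: the mode passed to mkdir is subject to the umask.
    os.chmod(nltk_dir, 0o700)

    mode = stat.S_IMODE(nltk_dir.stat().st_mode)
    if mode & (stat.S_IWGRP | stat.S_IWOTH):
        raise PermissionError(f"{nltk_dir} is still group- or world-writable")

    # Must be set before the service imports llama_index / nltk.
    os.environ["NLTK_DATA"] = str(nltk_dir)
    return nltk_dir
```

The call has to happen before the service process imports `llama_index`, since the environment variable is only useful if it is already set when the data directory is resolved. Setting `NLTK_DATA` in the service's unit file or container environment achieves the same thing without code.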

CISA SSVC Assessment

Decision Track
Exploitation none
Automatable No
Technical Impact total

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Art.15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.2 - AI system data management
NIST AI RMF
MANAGE 2.2 - Mechanisms to sustain oversight of deployed AI
OWASP LLM Top 10
LLM05 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2025-7707?

CVE-2025-7707 (GHSA-rg9h-vx28-xxp5) is a local data-tampering vulnerability in llama-index. By default the library sets the NLTK data directory to a subdirectory of its own codebase, which is world-writable in multi-user environments, so any local user can overwrite, delete, or corrupt NLTK data files. The result can be denial of service, data tampering, or privilege escalation. The issue is fixed in llama-index 0.13.0.

Is CVE-2025-7707 actively exploited?

No confirmed active exploitation of CVE-2025-7707 has been reported, but organizations should still patch proactively.

How to fix CVE-2025-7707?

  1. Upgrade llama-index to >=0.13.0 (patch relocates NLTK cache to a user-specific directory, eliminating shared writability).

  2. Immediate workaround if patching is delayed: set the NLTK_DATA environment variable to a protected, non-world-writable path before starting the service.

  3. Audit all shared ML servers and container images for llama-index versions <0.13.0.

  4. Detection: monitor NLTK data directory for unexpected write events from non-service accounts using auditd or filesystem integrity monitoring (e.g., AIDE, Wazuh).

  5. Review container security posture and ensure llama-index processes do not run as root.
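
For hosts where auditd or a full integrity monitor is not yet in place, the detection idea in step 4 can be approximated with a hash manifest. The sketch below is a simplified stand-in for tools such as AIDE or Wazuh and does not use their APIs; `build_manifest` and `diff_manifests` are hypothetical names. It records a SHA-256 digest per file under a directory and reports what was added, removed, or changed between two snapshots.

```python
import hashlib
import os


def build_manifest(root):
    """Map each file's path (relative to root) to a SHA-256 hex digest.

    Symlinks are recorded by target rather than followed, so a file that is
    swapped for a symlink shows up as a change.
    """
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            if os.path.islink(path):
                manifest[rel] = "symlink:" + os.readlink(path)
                continue
            digest = hashlib.sha256()
            with open(path, "rb") as handle:
                for chunk in iter(lambda: handle.read(65536), b""):
                    digest.update(chunk)
            manifest[rel] = digest.hexdigest()
    return manifest


def diff_manifests(baseline, current):
    """Return the added, removed, and changed paths between two manifests."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "changed": sorted(
            key for key in baseline.keys() & current.keys()
            if baseline[key] != current[key]
        ),
    }
```

A baseline taken right after a trusted install and stored outside the monitored directory can then be compared against periodic snapshots; any non-empty entry in the diff is worth investigating. Unlike auditd, this approach cannot say which account made the change.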

What systems are affected by CVE-2025-7707?

llama-index versions before 0.13.0 (pip) are affected. The vulnerability is relevant to the following AI/ML architecture patterns: RAG pipelines, document ingestion pipelines, NLP preprocessing pipelines, agent frameworks, and shared ML inference servers.

What is the CVSS score for CVE-2025-7707?

CVE-2025-7707 has a CVSS v3.1 base score of 7.1 (HIGH). The EPSS exploitation probability is 0.02%.

Technical Details

NVD Description

The llama_index library version 0.12.33 sets the NLTK data directory to a subdirectory of the codebase by default, which is world-writable in multi-user environments. This configuration allows local users to overwrite, delete, or corrupt NLTK data files, leading to potential denial of service, data tampering, or privilege escalation. The vulnerability arises from the use of a shared cache directory instead of a user-specific one, making it susceptible to local data tampering and denial of service.

Exploitation Scenario

On a shared ML inference server, a low-privileged user (e.g., a data analyst with shell access) locates the world-writable NLTK data directory within the llama-index installation path. They overwrite the punkt tokenizer or stopwords corpus with malformed content. When the llama-index service next processes incoming documents for a RAG pipeline, NLTK operations fail or silently produce corrupted text chunks—either crashing the ingestion pipeline (DoS) or degrading retrieval quality in ways that may not surface in standard monitoring. On systems where the LLM application runs as a privileged service account, the attacker may replace NLTK data files with symlinks pointing to sensitive system files, leveraging the privileged process to overwrite or read them.
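
The two preconditions in this scenario, world-writable entries and planted symlinks, can both be checked from the filesystem. The following sketch walks a directory tree without following links and reports each. `find_tampering_risks` is a hypothetical helper, and the directory to scan is left to the caller because the exact cache location depends on the installation.

```python
import os
import stat


def find_tampering_risks(root):
    """Return (path, reason) pairs for symlinks and world-writable entries.

    Uses lstat so symlinks are inspected themselves rather than followed.
    """
    findings = []

    def check(path):
        info = os.lstat(path)
        if stat.S_ISLNK(info.st_mode):
            # Symlink permission bits are not meaningful; report the link.
            findings.append((path, "symlink"))
        elif info.st_mode & stat.S_IWOTH:
            findings.append((path, "world-writable"))

    check(root)
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in dirnames + filenames:
            check(os.path.join(dirpath, name))
    return findings
```

An empty result for the NLTK data directory suggests that other local users cannot modify it directly, although writable parent directories would still need to be reviewed separately.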

CVSS Vector

CVSS:3.0/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:H/A:H
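
As a cross-check, the 7.1 base score can be reproduced from this vector with the published CVSS v3.x base-score equations. The sketch below is an illustrative calculator, not an official implementation: the function names are made up, it handles base metrics only, and it applies the CVSS v3.1 rounding rule to both 3.0 and 3.1 vectors (the metric weights are the same in both versions).

```python
import math

CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": CIA,
    "I": CIA,
    "A": CIA,
}
# Privileges Required weights depend on whether Scope is changed.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """Round up to one decimal place (integer-based rule from CVSS v3.1)."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the CVSS v3.x base score for a base-metric vector string."""
    prefix, *pairs = vector.split("/")
    if prefix not in ("CVSS:3.0", "CVSS:3.1"):
        raise ValueError(f"unsupported vector prefix: {prefix!r}")
    m = dict(pair.split(":") for pair in pairs)
    changed = m["S"] == "C"

    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - (
        (1 - WEIGHTS["C"][m["C"]])
        * (1 - WEIGHTS["I"][m["I"]])
        * (1 - WEIGHTS["A"][m["A"]])
    )
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (
        8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
        * pr * WEIGHTS["UI"][m["UI"]]
    )

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))
```

For this advisory's vector the impact sub-score is about 5.18 and the exploitability sub-score about 1.83, which sum to roughly 7.01 and round up to 7.1, so `base_score("CVSS:3.0/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:H/A:H")` returns `7.1`.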

Timeline

Published
October 13, 2025
Last Modified
October 13, 2025
First Seen
March 24, 2026

Related Vulnerabilities