CVE-2025-7707 — HIGH (CVSS 7.1) AI Security Vulnerability

CISO Take

Upgrade llama-index to 0.13.0 or later immediately on any multi-user or shared compute environment. The attack requires local access, but shared ML servers, Jupyter notebook environments, and containers with sidecar processes are realistic attack surfaces. Risk is highest in organizations running llama-index on shared inference or fine-tuning infrastructure where multiple OS users coexist.

Risk Assessment

Real-world risk is low in standard cloud-native, single-tenant deployments but elevated in shared ML compute clusters, data science platforms, and CI/CD pipelines where multiple users share a host. The EPSS score of 0.00024 (about 0.02%) reflects minimal exploitation activity to date. The local-only attack vector limits exposure for most SaaS deployments, but the trivial exploitation bar and the potential privilege-escalation path via symlink abuse on misconfigured systems justify prompt patching.

Affected Systems

Package	Ecosystem	Vulnerable Range	Patched
llama-index	pip	< 0.13.0	`0.13.0`

If you run llama-index below 0.13.0, you are affected.
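One way to check a given environment against the vulnerable range in the table is sketched below. It is a minimal illustration, not an official scanner: the helper names (`parse_version`, `is_vulnerable`, `installed_version`) are hypothetical, the distribution name `llama-index` is taken from the table above, and the parser only looks at the leading numeric release segment, so pre-release tags such as `rc1` are ignored.

```python
from importlib import metadata

# First patched release, per the Affected Systems table.
PATCHED = (0, 13, 0)


def parse_version(text):
    """Parse the leading numeric release segment, e.g. '0.12.33' -> (0, 12, 33)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_vulnerable(version_text, patched=PATCHED):
    """Return True when the version falls in the vulnerable range (< 0.13.0)."""
    version = parse_version(version_text)
    # Pad short versions such as '0.9' so tuple comparison behaves as expected.
    version = version + (0,) * (len(patched) - len(version))
    return version < patched


def installed_version(dist_name="llama-index"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("llama-index is not installed in this environment")
elif is_vulnerable(found):
    print(f"llama-index {found} is in the vulnerable range; upgrade to >=0.13.0")
else:
    print(f"llama-index {found} is at or above the patched release")
```

Running this inside each virtual environment or container image gives a quick per-environment answer; a dependency scanner is the better fit for fleet-wide audits.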

Severity & Risk

CVSS 3.1: 7.1 / 10

EPSS: 0.02% chance of exploitation in 30 days (higher than 7% of all CVEs)

Source: EPSS v3, FIRST.org

Exploitation Status

No known exploitation

Sophistication

Trivial

Attack Surface

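Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): High
Availability (A): High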
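To show how these metric values combine into the 7.1 base score, the sketch below applies the CVSS v3.1 base-score formula for the Scope: Unchanged case. The weights and rounding rule come from the CVSS v3.1 specification; the function and dictionary names are illustrative, and the Scope: Changed branch is deliberately left out because this CVE does not need it.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged only).
ATTACK_VECTOR = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
ATTACK_COMPLEXITY = {"L": 0.77, "H": 0.44}
PRIVILEGES_REQUIRED = {"N": 0.85, "L": 0.62, "H": 0.27}
USER_INTERACTION = {"N": 0.85, "R": 0.62}
CIA_IMPACT = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value):
    """Round up to one decimal place using the v3.1 integer-based rule."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_unchanged_scope(av, ac, pr, ui, c, i, a):
    """Compute the CVSS v3.1 base score when Scope is Unchanged."""
    iss = 1 - (1 - CIA_IMPACT[c]) * (1 - CIA_IMPACT[i]) * (1 - CIA_IMPACT[a])
    impact = 6.42 * iss
    exploitability = (
        8.22
        * ATTACK_VECTOR[av]
        * ATTACK_COMPLEXITY[ac]
        * PRIVILEGES_REQUIRED[pr]
        * USER_INTERACTION[ui]
    )
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# AV:L / AC:L / PR:L / UI:N / S:U / C:N / I:H / A:H
score = base_score_unchanged_scope("L", "L", "L", "N", "N", "H", "H")
print(score)  # 7.1
```

The impact sub-score is about 5.18 and the exploitability sub-score about 1.83; their sum of roughly 7.01 rounds up to 7.1.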

Recommended Action

1. Upgrade llama-index to >=0.13.0 (the patch relocates the NLTK cache to a user-specific directory, eliminating shared writability).
2. Immediate workaround if patching is delayed: set the NLTK_DATA environment variable to a protected, non-world-writable path before starting the service.
3. Audit all shared ML servers and container images for llama-index versions <0.13.0.
4. Detection: monitor the NLTK data directory for unexpected write events from non-service accounts using auditd or filesystem integrity monitoring (e.g., AIDE, Wazuh).
5. Review container security posture: ensure llama-index processes do not run as root.
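The NLTK_DATA workaround listed above could look roughly like the following on a POSIX host. This is a sketch under assumptions: the helper name `prepare_nltk_data_dir` is hypothetical, the directory is restricted to its owner (mode 0700), and the caller is expected to pick a path whose parent directories are themselves not writable by other users.

```python
import os
import stat


def prepare_nltk_data_dir(path):
    """Create an owner-only directory and point NLTK_DATA at it."""
    # The mode passed to makedirs is filtered by the umask and does not
    # change a directory that already exists, so chmod explicitly afterwards.
    os.makedirs(path, mode=0o700, exist_ok=True)
    os.chmod(path, 0o700)

    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & (stat.S_IWGRP | stat.S_IWOTH):
        raise PermissionError(f"{path} is still group- or world-writable")

    # Must be set before the service (or any import that reads it) starts.
    os.environ["NLTK_DATA"] = path
    return path
```

In a service entry point, this would be called before the application imports llama-index, for example with a path under the service account's home directory. Setting the variable in the unit file or container environment achieves the same effect without code changes.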

CISA SSVC Assessment

Decision: Track
Exploitation: none
Automatable: No
Technical Impact: total

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Classification

Tags: DoS, Supply Chain, Model Poisoning, Framework, RAG
MITRE ATLAS techniques: AML.T0010.001 - AI Software; AML.T0020 - Poison Training Data; AML.T0029 - Denial of AI Service; AML.T0059 - Erode Dataset Integrity

Compliance Impact

This CVE is relevant to:

EU AI Act: Art.15 - Accuracy, robustness and cybersecurity
ISO 42001: A.6.2 - AI system data management
NIST AI RMF: MANAGE 2.2 - Mechanisms to sustain oversight of deployed AI
OWASP LLM Top 10: LLM05 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2025-7707?

CVE-2025-7707 is a local vulnerability in the llama-index library (reported against llama_index 0.12.33 and fixed in 0.13.0). By default the library sets the NLTK data directory to a subdirectory of the codebase, which is world-writable in multi-user environments. Local users can overwrite, delete, or corrupt NLTK data files, leading to denial of service, data tampering, or potentially privilege escalation.

Is CVE-2025-7707 actively exploited?

No confirmed active exploitation of CVE-2025-7707 has been reported, but organizations should still patch proactively.

How to fix CVE-2025-7707?

1. Upgrade llama-index to >=0.13.0 (the patch relocates the NLTK cache to a user-specific directory, eliminating shared writability).
2. Immediate workaround if patching is delayed: set the NLTK_DATA environment variable to a protected, non-world-writable path before starting the service.
3. Audit all shared ML servers and container images for llama-index versions <0.13.0.
4. Detection: monitor the NLTK data directory for unexpected write events from non-service accounts using auditd or filesystem integrity monitoring (e.g., AIDE, Wazuh).
5. Review container security posture: ensure llama-index processes do not run as root.

What systems are affected by CVE-2025-7707?

This vulnerability affects the following AI/ML architecture patterns: RAG pipelines, document ingestion pipelines, NLP preprocessing pipelines, agent frameworks, shared ML inference servers.

What is the CVSS score for CVE-2025-7707?

CVE-2025-7707 has a CVSS v3.1 base score of 7.1 (HIGH). The EPSS exploitation probability is 0.02%.

Technical Details

NVD Description

The llama_index library version 0.12.33 sets the NLTK data directory to a subdirectory of the codebase by default, which is world-writable in multi-user environments. This configuration allows local users to overwrite, delete, or corrupt NLTK data files, leading to potential denial of service, data tampering, or privilege escalation. The vulnerability arises from the use of a shared cache directory instead of a user-specific one, making it susceptible to local data tampering and denial of service.

Exploitation Scenario

On a shared ML inference server, a low-privileged user (e.g., a data analyst with shell access) locates the world-writable NLTK data directory within the llama-index installation path. They overwrite the punkt tokenizer or stopwords corpus with malformed content. When the llama-index service next processes incoming documents for a RAG pipeline, NLTK operations fail or silently produce corrupted text chunks—either crashing the ingestion pipeline (DoS) or degrading retrieval quality in ways that may not surface in standard monitoring. On systems where the LLM application runs as a privileged service account, the attacker may replace NLTK data files with symlinks pointing to sensitive system files, leveraging the privileged process to overwrite or read them.
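Both conditions this scenario depends on, world-writable entries and planted symlinks, can be spotted with a one-off scan of the NLTK data directory. The sketch below is illustrative only and assumes a POSIX filesystem; `find_risky_entries` is a hypothetical helper, and the directory to scan has to be supplied by the operator because its location depends on where llama-index is installed.

```python
import os
import stat


def find_risky_entries(root):
    """Return (path, reason) pairs for symlinks and world-writable entries."""
    paths = [root]
    # os.walk does not follow symlinked directories by default, so a planted
    # link is reported here rather than traversed.
    for dirpath, dirnames, filenames in os.walk(root):
        paths.extend(os.path.join(dirpath, name) for name in dirnames + filenames)

    findings = []
    for path in paths:
        info = os.lstat(path)  # lstat inspects the link itself, not its target
        if stat.S_ISLNK(info.st_mode):
            findings.append((path, "symlink"))
        elif info.st_mode & stat.S_IWOTH:
            findings.append((path, "world-writable"))
    return findings
```

An empty result means no entry under the directory is a symlink or writable by other users at scan time. It does not prove the files were never tampered with, so it complements rather than replaces the write-event monitoring described in the recommended actions.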