Upgrade llama-index to 0.13.0 immediately on any multi-user or shared compute environment. The attack requires local access, but shared ML servers, Jupyter notebook environments, and containers with sidecar processes are realistic attack surfaces. Risk is highest in organizations running llama-index on shared inference or fine-tuning infrastructure where multiple OS users coexist.
Risk Assessment
Low real-world risk in standard cloud-native single-tenant deployments, but elevated in shared ML compute clusters, data science platforms, or CI/CD pipelines where multiple users share a host. EPSS of 0.00024 reflects minimal exploitation activity to date. Local-only attack vector caps severity for most SaaS deployments, but the trivial exploitation bar and potential privilege escalation path via symlink abuse on misconfigured systems justify prompt patching.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| llama-index | pip | < 0.13.0 | 0.13.0 |
Do you run any llama-index version below 0.13.0? You're affected.
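A quick way to check an environment is to compare the installed version against the patched release. A minimal sketch, assuming the package is installed under the distribution name `llama-index` and carries a plain numeric version:

```python
from importlib.metadata import PackageNotFoundError, version

def predates_fix(installed: str, fixed: str = "0.13.0") -> bool:
    """True if `installed` is older than the patched release.

    Compares the numeric dotted components only; pre-release suffixes
    (e.g. "0.13.0rc1") would need extra handling.
    """
    as_tuple = lambda v: tuple(int(part) for part in v.split(".")[:3])
    return as_tuple(installed) < as_tuple(fixed)

try:
    print("vulnerable:", predates_fix(version("llama-index")))
except PackageNotFoundError:
    print("llama-index is not installed")
```

For fleet-wide audits, the same comparison can be run against the output of `pip freeze` collected from each host or image.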
Recommended Action
1. Upgrade llama-index to >=0.13.0 (the patch relocates the NLTK cache to a user-specific directory, eliminating shared writability).
2. If patching is delayed, apply an immediate workaround: set the NLTK_DATA environment variable to a protected, non-world-writable path before starting the service.
3. Audit all shared ML servers and container images for llama-index versions <0.13.0.
4. Detection: monitor the NLTK data directory for unexpected write events from non-service accounts using auditd or filesystem integrity monitoring (e.g., AIDE, Wazuh).
5. Review container security posture and ensure llama-index processes do not run as root.
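The workaround in step 2 can also be applied in process startup code, before llama-index (and therefore nltk) is first imported, since nltk reads NLTK_DATA at import time. A minimal sketch assuming a POSIX filesystem; the path `/opt/nltk_data` is only an example:

```python
import os
import stat
from pathlib import Path

def harden_nltk_data(path: str = "/opt/nltk_data") -> Path:
    """Point NLTK at a protected, non-world-writable directory.

    Must run before llama-index (and hence nltk) is imported,
    because nltk reads NLTK_DATA at import time.
    """
    data_dir = Path(path)
    data_dir.mkdir(parents=True, exist_ok=True)
    # rwxr-xr-x: owner writes, everyone else read-only; world-write bit clear.
    data_dir.chmod(
        stat.S_IRWXU
        | stat.S_IRGRP | stat.S_IXGRP
        | stat.S_IROTH | stat.S_IXOTH
    )
    os.environ["NLTK_DATA"] = str(data_dir)
    return data_dir
```

In containerized deployments the same effect can be achieved by setting `NLTK_DATA` in the image or orchestrator configuration rather than in code.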
CISA SSVC Assessment
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
What is CVE-2025-7707?
CVE-2025-7707 is a vulnerability in the llama-index Python package (versions below 0.13.0) in which the NLTK data directory defaults to a world-writable subdirectory of the codebase. On multi-user hosts, any local user can overwrite, delete, or corrupt NLTK data files, enabling denial of service, data tampering, or, on misconfigured systems running the service with elevated privileges, privilege escalation via symlink abuse.
Is CVE-2025-7707 actively exploited?
No confirmed active exploitation of CVE-2025-7707 has been reported, but organizations should still patch proactively.
How to fix CVE-2025-7707?
1. Upgrade llama-index to >=0.13.0 (the patch relocates the NLTK cache to a user-specific directory, eliminating shared writability).
2. If patching is delayed, set the NLTK_DATA environment variable to a protected, non-world-writable path before starting the service.
3. Audit all shared ML servers and container images for llama-index versions <0.13.0.
4. Monitor the NLTK data directory for unexpected write events from non-service accounts using auditd or filesystem integrity monitoring (e.g., AIDE, Wazuh).
5. Review container security posture and ensure llama-index processes do not run as root.
What systems are affected by CVE-2025-7707?
This vulnerability affects the following AI/ML architecture patterns: RAG pipelines, document ingestion pipelines, NLP preprocessing pipelines, agent frameworks, shared ML inference servers.
What is the CVSS score for CVE-2025-7707?
CVE-2025-7707 has a CVSS v3.0 base score of 7.1 (HIGH). The EPSS exploitation probability is 0.02%.
Technical Details
NVD Description
The llama_index library version 0.12.33 sets the NLTK data directory to a subdirectory of the codebase by default, which is world-writable in multi-user environments. This configuration allows local users to overwrite, delete, or corrupt NLTK data files, leading to potential denial of service, data tampering, or privilege escalation. The vulnerability arises from the use of a shared cache directory instead of a user-specific one, making it susceptible to local data tampering and denial of service.
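The misconfiguration this description refers to can be confirmed directly by scanning for world-writable directories. A hedged sketch using POSIX mode bits (pass in the NLTK cache path for your install, since its location inside the codebase varies by version):

```python
import stat
from pathlib import Path

def world_writable_dirs(root: str) -> list[Path]:
    """List directories under `root` that any local user can write to
    (world-write bit set and no sticky bit)."""
    hits = []
    for p in [Path(root), *Path(root).rglob("*")]:
        if not p.is_dir():
            continue
        mode = p.stat().st_mode
        if mode & stat.S_IWOTH and not mode & stat.S_ISVTX:
            hits.append(p)
    return hits
```

An empty result for the NLTK data location indicates the vulnerable shared-cache configuration is not present on that host.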
Exploitation Scenario
On a shared ML inference server, a low-privileged user (e.g., a data analyst with shell access) locates the world-writable NLTK data directory within the llama-index installation path. They overwrite the punkt tokenizer or stopwords corpus with malformed content. When the llama-index service next processes incoming documents for a RAG pipeline, NLTK operations fail or silently produce corrupted text chunks—either crashing the ingestion pipeline (DoS) or degrading retrieval quality in ways that may not surface in standard monitoring. On systems where the LLM application runs as a privileged service account, the attacker may replace NLTK data files with symlinks pointing to sensitive system files, leveraging the privileged process to overwrite or read them.
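The symlink variant of this scenario can be screened for defensively: NLTK corpora are distributed as regular files and zip archives, so a symlink inside the data directory, especially one resolving outside it, deserves investigation. A minimal sketch:

```python
from pathlib import Path

def find_symlinks(data_dir: str) -> list[tuple[Path, Path]]:
    """Return (link, resolved target) pairs for any symlinks found
    inside the NLTK data directory."""
    out = []
    for p in Path(data_dir).rglob("*"):
        if p.is_symlink():
            out.append((p, p.resolve()))
    return out
```

A check like this can run as a periodic integrity job alongside the auditd or AIDE monitoring recommended above.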
Weaknesses (CWE)
CVSS Vector
CVSS:3.0/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:H/A:H
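The 7.1 base score can be reproduced from this vector with the CVSS v3 base-score formula. The sketch below hard-codes the metric weights from the published specification and handles only Scope: Unchanged vectors:

```python
import math

def cvss3_base(av, ac, pr, ui, c, i, a):
    """CVSS v3 base score for Scope: Unchanged vectors."""
    AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
    AC = {"L": 0.77, "H": 0.44}
    PR = {"N": 0.85, "L": 0.62, "H": 0.27}  # Scope: Unchanged weights
    UI = {"N": 0.85, "R": 0.62}
    CIA = {"N": 0.0, "L": 0.22, "H": 0.56}

    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    # Round up to one decimal place, per the spec's roundup() function.
    return math.ceil(min(impact + exploitability, 10) * 10) / 10

# AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:H/A:H
print(cvss3_base("L", "L", "L", "N", "N", "H", "H"))  # prints 7.1
```

Note how the local attack vector (AV:L, weight 0.55) is what holds the score below the critical range despite High impacts to integrity and availability.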
Related Vulnerabilities
All of the following are in the same package (llama-index):

| CVE | CVSS | Description |
|---|---|---|
| CVE-2024-12909 | 10.0 | llama-index finchat: SQL injection enables RCE |
| CVE-2025-1793 | 9.8 | llama_index: SQL injection in vector store integrations |
| CVE-2024-11958 | 9.8 | llama-index DuckDB retriever: SQLi enables RCE |
| CVE-2025-1753 | 7.8 | llama-index-cli: OS command injection enables RCE |
| CVE-2025-3225 | 7.5 | llama-index Papers Loader: XML expansion DoS |