Any RAG or document ingestion pipeline using llama-index-readers-papers to process sitemaps is vulnerable to a billion-laughs DoS that can crash the service via memory exhaustion. The fix is available: upgrade to llama-index-readers-papers >= 0.3.2 (llama-index >= 0.12.29) now. No exploitation observed in the wild yet, but the attack is trivial to craft.
Risk Assessment
CVSS 7.5 (High), but real-world risk is moderate. The EPSS score of 0.00144 indicates a low predicted likelihood of exploitation. The attack vector is network-accessible and requires no authentication or user interaction, making it a zero-friction DoS if the parser is exposed to untrusted input. The impact is limited to availability: no data exposure or code execution. Risk rises significantly for teams running automated document ingestion pipelines that accept external URLs or user-submitted sitemaps.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| llama-index-readers-papers | pip | < 0.3.2 | 0.3.2 |
Do you use llama-index-readers-papers at a version below 0.3.2? You're affected.
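To check a single environment against the vulnerable range in the table above, a minimal sketch follows. It assumes the distribution is installed under the name shown in the table and uses a simplified version parser that reads only the first three numeric components (pre-release tags are ignored); the helper names are illustrative, not part of any llama-index API.

```python
from importlib import metadata

PACKAGE = "llama-index-readers-papers"  # name taken from the table above
PATCHED = (0, 3, 2)                     # first fixed release per the advisory


def parse_version(text):
    """Return the first three numeric components of a version string.

    Simplified on purpose: suffixes such as 'rc1' or '.post1' are ignored.
    """
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_vulnerable(version_text):
    """True when the version falls in the vulnerable range (< 0.3.2)."""
    return parse_version(version_text) < PATCHED


def installed_status():
    """Return (version, vulnerable) for this environment, or None if absent."""
    try:
        version = metadata.version(PACKAGE)
    except metadata.PackageNotFoundError:
        return None
    return version, is_vulnerable(version)


if __name__ == "__main__":
    status = installed_status()
    if status is None:
        print(f"{PACKAGE} is not installed in this environment")
    else:
        version, vulnerable = status
        print(f"{PACKAGE} {version}: {'VULNERABLE' if vulnerable else 'patched'}")
```

Running this in each service's environment gives a quick first pass; a lockfile or SBOM scan is still the more reliable way to audit transitive dependencies.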
Recommended Action
1. Patch immediately: upgrade llama-index-readers-papers to >= 0.3.2 or llama-index to >= 0.12.29.
2. If patching is delayed, disable or sandbox the Papers Loader until patched.
3. Compensating control: apply XML entity limits at the parser level (Python: use defusedxml or set entity expansion limits).
4. Validate and allowlist sitemap URLs before processing; reject untrusted or user-supplied URLs.
5. Monitor document ingestion workers for abnormal memory spikes as a detection signal.
6. Audit your dependency tree for llama-index-readers-papers usage across all services.
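The parser-level compensating control can be illustrated with the standard library alone. The sketch below is an assumption-laden example, not the patched loader's code: it refuses any sitemap that contains a DOCTYPE declaration, since entities can only be declared inside one, and otherwise collects the `<loc>` values. The function and exception names are hypothetical. defusedxml, which the list above recommends, packages comparable checks behind a drop-in API and is the better choice in production.

```python
import xml.parsers.expat


class UnsafeSitemapError(ValueError):
    """Raised when a sitemap carries a DOCTYPE and so could declare entities."""


def extract_sitemap_locs(xml_bytes):
    """Return the text of every <loc> element, rejecting DOCTYPE declarations."""
    parser = xml.parsers.expat.ParserCreate()
    locs = []
    buffer = []
    inside_loc = False

    def reject_doctype(name, system_id, public_id, has_internal_subset):
        # Fires at "<!DOCTYPE", before any entity declaration is processed.
        raise UnsafeSitemapError("DOCTYPE declarations are not allowed in sitemaps")

    def start_element(name, attrs):
        nonlocal inside_loc
        if name == "loc":
            inside_loc = True
            buffer.clear()

    def end_element(name):
        nonlocal inside_loc
        if name == "loc" and inside_loc:
            locs.append("".join(buffer).strip())
            inside_loc = False

    def char_data(data):
        # Character data may arrive in several chunks, so accumulate it.
        if inside_loc:
            buffer.append(data)

    parser.StartDoctypeDeclHandler = reject_doctype
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = char_data
    parser.Parse(xml_bytes, True)
    return locs
```

A legitimate sitemap has no need for a DOCTYPE, so rejecting it outright is a simpler policy than trying to count or cap entity expansions.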
CISA SSVC Assessment
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
What is CVE-2025-3225?
CVE-2025-3225 is an XML entity expansion ("billion laughs") vulnerability in the sitemap parser of the llama-index Papers Loaders package (llama-index-readers-papers) before version 0.3.2. A malicious sitemap XML can exhaust memory and crash the service, so any RAG or document ingestion pipeline that uses the package to process sitemaps is exposed to a DoS. The fix is to upgrade to llama-index-readers-papers >= 0.3.2 (llama-index >= 0.12.29).
Is CVE-2025-3225 actively exploited?
No confirmed active exploitation of CVE-2025-3225 has been reported, but organizations should still patch proactively.
How to fix CVE-2025-3225?
1. Patch immediately: upgrade llama-index-readers-papers to >= 0.3.2 or llama-index to >= 0.12.29.
2. If patching is delayed, disable or sandbox the Papers Loader until patched.
3. Compensating control: apply XML entity limits at the parser level (Python: use defusedxml or set entity expansion limits).
4. Validate and allowlist sitemap URLs before processing; reject untrusted or user-supplied URLs.
5. Monitor document ingestion workers for abnormal memory spikes as a detection signal.
6. Audit your dependency tree for llama-index-readers-papers usage across all services.
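The URL allowlisting step can be sketched as follows. The host names in `ALLOWED_HOSTS` and the function name are hypothetical placeholders; the sketch assumes an exact-match host policy over HTTPS only, which is stricter than some deployments will want.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: replace with the hosts your pipeline actually trusts.
ALLOWED_HOSTS = frozenset({"example.org", "sitemaps.example.org"})


def is_allowed_sitemap_url(url, allowed_hosts=ALLOWED_HOSTS):
    """Accept only https URLs whose host exactly matches an allowlisted name."""
    try:
        parts = urlsplit(url)
        host = (parts.hostname or "").lower()
    except ValueError:
        # Malformed URLs (for example, a broken IPv6 literal) are rejected.
        return False
    if parts.scheme != "https":
        return False
    return host in allowed_hosts
```

Matching on the parsed hostname rather than on a substring of the URL avoids look-alike inputs such as `https://example.org.attacker.test/` or `https://example.org@attacker.test/`. Allowlisting limits who can supply a sitemap; it does not make an unpatched parser safe against a compromised trusted host.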
What systems are affected by CVE-2025-3225?
This vulnerability affects the following AI/ML architecture patterns: RAG pipelines, document ingestion pipelines, LLM agent frameworks, automated data loaders.
What is the CVSS score for CVE-2025-3225?
CVE-2025-3225 has a CVSS v3.0 base score of 7.5 (HIGH), per the CVSS:3.0 vector listed under CVSS Vector. The EPSS exploitation probability is 0.34%.
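The 7.5 figure can be reproduced from the vector listed under CVSS Vector (AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H). The sketch below applies the CVSS v3.x base-score equations for the scope-unchanged case only, with just the metric weights this vector needs; it is a worked check, not a general calculator.

```python
import math

# CVSS v3.x metric weights, limited to the values used by this vector.
WEIGHTS = {
    "AV": {"N": 0.85},
    "AC": {"L": 0.77},
    "PR": {"N": 0.85},  # weight for PR:N; other PR values depend on scope
    "UI": {"N": 0.85},
    "C": {"N": 0.0, "L": 0.22, "H": 0.56},
    "I": {"N": 0.0, "L": 0.22, "H": 0.56},
    "A": {"N": 0.0, "L": 0.22, "H": 0.56},
}


def base_score_scope_unchanged(vector):
    """Compute the base score for an S:U vector using the weights above."""
    fields = dict(item.split(":") for item in vector.split("/")[1:])
    if fields["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {key: WEIGHTS[key][fields[key]] for key in WEIGHTS}

    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    # Round up to one decimal place, as the specification requires.
    return math.ceil(min(impact + exploitability, 10) * 10) / 10


print(base_score_scope_unchanged("CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H"))
```

For this vector the impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is about 3.89; their sum of roughly 7.48 rounds up to 7.5. All of the score comes from the availability impact combined with the lowest-friction exploitability values.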
Technical Details
NVD Description
An XML Entity Expansion vulnerability, also known as a 'billion laughs' attack, exists in the sitemap parser of the run-llama/llama_index repository, specifically affecting the Papers Loaders package before version 0.3.2 (in llama-index v0.10.0 and above through v0.12.29). This vulnerability allows an attacker to supply a malicious Sitemap XML, leading to a Denial of Service (DoS) by exhausting system memory and potentially causing a system crash. The issue is resolved in version 0.3.2 (in llama-index 0.12.29).
Exploitation Scenario
An adversary targeting an organization's RAG pipeline identifies that it uses llama-index to ingest papers from external sitemaps. They craft a malicious XML sitemap containing deeply nested entity references (classic billion-laughs structure) and either submit it via a public-facing document upload endpoint, inject the URL into an automated pipeline that crawls academic sources, or host it on a compromised domain the pipeline is configured to ingest. When the Papers Loader parses the sitemap, recursive entity expansion consumes all available memory, crashing the ingestion worker and halting RAG knowledge base updates.
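To see why a small file can exhaust memory, the arithmetic behind nested entity expansion is sketched below. The numbers are assumptions taken from the classic billion-laughs layout (a 3-byte base entity and 9 further levels, each referencing the previous level 10 times), not from an observed payload for this CVE.

```python
def expanded_size(levels, fanout, base_len):
    """Bytes produced when the top-level entity is fully expanded.

    Each level references the previous one `fanout` times, so the number
    of base-entity copies grows as fanout ** levels.
    """
    return base_len * fanout ** levels


# Assumed classic layout: base entity "lol" (3 bytes), 9 levels, fan-out 10.
size = expanded_size(levels=9, fanout=10, base_len=3)
print(f"{size:,} bytes, about {size / 1e9:.0f} GB")
```

The payload itself stays under a kilobyte because each level is declared once, while the expanded result grows exponentially with the number of levels. This is why size limits on the uploaded file alone do not mitigate the attack.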
Weaknesses (CWE)
CVSS Vector
CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
References
Timeline
Related Vulnerabilities
| CVE | CVSS | Summary | Relation |
|---|---|---|---|
| CVE-2024-12909 | 10.0 | llama-index finchat: SQL injection enables RCE | Same package: llama-index |
| CVE-2024-11958 | 9.8 | llama-index DuckDB retriever: SQLi enables RCE | Same package: llama-index |
| CVE-2025-1793 | 9.8 | llama_index: SQL injection in vector store integrations | Same package: llama-index |
| CVE-2025-1753 | 7.8 | llama-index-cli: OS command injection enables RCE | Same package: llama-index |
| CVE-2025-3046 | 7.5 | LlamaIndex Obsidian: symlink traversal exposes host files | Same package: llama-index |