Any RAG or document ingestion pipeline using llama-index-readers-papers to process sitemaps is vulnerable to a billion-laughs DoS that can crash the service via memory exhaustion. The fix is available: upgrade to llama-index-readers-papers >= 0.3.2 (llama-index >= 0.12.29) now. No exploitation observed in the wild yet, but the attack is trivial to craft.
What is the risk?
The CVSS score is 7.5 (High), but real-world risk is moderate. The EPSS score of 0.00144 indicates a low predicted probability of exploitation. The attack vector is network-accessible and requires neither authentication nor user interaction, making it a zero-friction DoS if the parser is exposed to untrusted input. The impact is limited to availability: no data exposure or code execution. Risk rises significantly for teams running automated document ingestion pipelines that accept external URLs or user-submitted sitemaps.
What systems are affected?
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| llama-index-readers-papers | pip | < 0.3.2 | 0.3.2 |
Do you use the LlamaIndex Papers Loader (llama-index-readers-papers) at a version below 0.3.2? You're affected.
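A quick way to check a single environment is to compare the installed version against the patched release. The following is a minimal sketch using only the standard library; the function names are hypothetical, and the version parsing is deliberately simplistic (it looks only at the leading numeric components, so a pre-release such as `0.3.2rc1` would be treated as `0.3.2`).

```python
import re
from importlib import metadata
from typing import Optional, Tuple

PACKAGE = "llama-index-readers-papers"
PATCHED = (0, 3, 2)  # first fixed release according to the advisory


def parse_release(version: str) -> Tuple[int, ...]:
    """Return the leading numeric components of a version string."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_vulnerable(version: str) -> bool:
    """True if the given version is older than the patched release."""
    release = parse_release(version)
    # Pad short versions such as "0.3" to three components before comparing.
    padded = release + (0,) * (len(PATCHED) - len(release))
    return padded < PATCHED


def installed_version() -> Optional[str]:
    """Return the installed package version, or None if it is absent."""
    try:
        return metadata.version(PACKAGE)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print(f"{PACKAGE} is not installed in this environment")
    elif is_vulnerable(found):
        print(f"{PACKAGE} {found} is vulnerable; upgrade to >= 0.3.2")
    else:
        print(f"{PACKAGE} {found} is patched")
```

This only inspects the current interpreter's environment; services with separate virtual environments or container images need to be checked individually.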
What should I do?
1. Patch immediately: upgrade llama-index-readers-papers to >= 0.3.2 or llama-index to >= 0.12.29.
2. If patching is delayed, disable or sandbox the Papers Loader until patched.
3. Compensating control: apply XML entity limits at the parser level (Python: use defusedxml or set entity expansion limits).
4. Validate and allowlist sitemap URLs before processing: reject untrusted or user-supplied URLs.
5. Monitor document ingestion workers for abnormal memory spikes as a detection signal.
6. Audit your dependency tree for llama-index-readers-papers usage across all services.
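The parser-level compensating control can be approximated with the standard library alone. The sketch below is not the upstream patch; it is an illustration, under the assumption that legitimate sitemaps never need a DTD, of rejecting any document that carries a `DOCTYPE` before entity declarations are processed. The names `extract_sitemap_urls` and `ForbiddenDTDError` are hypothetical. The third-party defusedxml package offers comparable protection through its `forbid_dtd` option.

```python
import xml.parsers.expat
from typing import List


class ForbiddenDTDError(ValueError):
    """Raised when a sitemap carries a DOCTYPE declaration."""


def extract_sitemap_urls(xml_bytes: bytes) -> List[str]:
    """Collect <loc> values from a sitemap, refusing any document with a DTD."""
    urls: List[str] = []
    buffer: List[str] = []
    in_loc = False

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Entities can only be declared inside a DTD, so rejecting the
        # DOCTYPE up front rules out entity expansion entirely.
        raise ForbiddenDTDError(f"DOCTYPE {name!r} is not allowed in sitemaps")

    def on_start(name, attrs):
        nonlocal in_loc
        # Strip an optional namespace prefix such as "sm:loc".
        if name.rsplit(":", 1)[-1] == "loc":
            in_loc = True
            buffer.clear()

    def on_end(name):
        nonlocal in_loc
        if name.rsplit(":", 1)[-1] == "loc":
            urls.append("".join(buffer).strip())
            in_loc = False

    def on_text(data):
        # expat may deliver one text node in several chunks.
        if in_loc:
            buffer.append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.Parse(xml_bytes, True)
    return urls
```

Callers would treat `ForbiddenDTDError` (and `xml.parsers.expat.ExpatError` for malformed input) as a reason to skip the sitemap and log the source URL.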
What does CISA's SSVC say?
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
Is CVE-2025-3225 actively exploited?
No confirmed active exploitation of CVE-2025-3225 has been reported, but organizations should still patch proactively.
What systems are affected by CVE-2025-3225?
This vulnerability affects the following AI/ML architecture patterns: RAG pipelines, document ingestion pipelines, LLM agent frameworks, automated data loaders.
What is the CVSS score for CVE-2025-3225?
CVE-2025-3225 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.41%.
What is the AI security impact?
Affected AI Architectures
MITRE ATLAS Techniques
- AML.T0010.001 AI Software
- AML.T0029 Denial of AI Service
- AML.T0049 Exploit Public-Facing Application
What are the technical details?
Original Advisory
An XML Entity Expansion vulnerability, also known as a 'billion laughs' attack, exists in the sitemap parser of the run-llama/llama_index repository, specifically affecting the Papers Loaders package before version 0.3.2 (in llama-index v0.10.0 and above through v0.12.29). This vulnerability allows an attacker to supply a malicious Sitemap XML, leading to a Denial of Service (DoS) by exhausting system memory and potentially causing a system crash. The issue is resolved in version 0.3.2 (in llama-index 0.12.29).
Exploitation Scenario
An adversary targeting an organization's RAG pipeline identifies that it uses llama-index to ingest papers from external sitemaps. They craft a malicious XML sitemap containing deeply nested entity references (classic billion-laughs structure) and either submit it via a public-facing document upload endpoint, inject the URL into an automated pipeline that crawls academic sources, or host it on a compromised domain the pipeline is configured to ingest. When the Papers Loader parses the sitemap, recursive entity expansion consumes all available memory, crashing the ingestion worker and halting RAG knowledge base updates.
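Each delivery path in this scenario depends on the pipeline fetching a sitemap from a location the attacker controls. A minimal allowlist check, run before any fetch, could look like the sketch below. The host set and the function name are hypothetical placeholders; a real deployment would list the sources it actually trusts.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: replace with the hosts your pipeline really ingests.
ALLOWED_SITEMAP_HOSTS = frozenset({"arxiv.org", "export.arxiv.org"})


def is_allowed_sitemap_url(url: str) -> bool:
    """Accept only HTTPS URLs whose exact hostname is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit rejects some malformed URLs, e.g. broken IPv6 literals.
        return False
    if parts.scheme != "https":
        return False
    # .hostname drops userinfo and port, so "arxiv.org@evil.example"
    # resolves to "evil.example" and is rejected.
    host = (parts.hostname or "").lower()
    return host in ALLOWED_SITEMAP_HOSTS
```

Exact hostname matching is used rather than suffix matching so that lookalike domains such as `arxiv.org.evil.example` do not pass. An allowlist does not help if a trusted host is itself compromised, so it complements parser hardening rather than replacing it.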
Weaknesses (CWE)
CWE-776 (Primary): Improper Restriction of Recursive Entity References in DTDs ('XML Entity Expansion'). The product uses XML documents and allows their structure to be defined with a Document Type Definition (DTD), but it does not properly control the number of recursive definitions of entities.
- [Operation] If possible, prohibit the use of DTDs or use an XML parser that limits the expansion of recursive DTD entities.
- [Implementation] Before parsing XML files with associated DTDs, scan for recursive entity declarations and do not continue parsing potentially explosive content.
Source: MITRE CWE corpus.
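The second mitigation, scanning entity declarations before parsing, can be sketched by reading only the DTD and computing how large each entity would become if expanded, without performing the expansion. This is an illustrative sketch, not a vetted filter: the function names and the size threshold are assumptions, it covers only internal general and parameter entity declarations, and very deep entity chains could hit Python's recursion limit.

```python
import re
import xml.parsers.expat
from typing import Dict, Set

_ENTITY_REF = re.compile(r"&([A-Za-z_:][\w.:\-]*);")


class _ScanComplete(Exception):
    """Internal signal: the DTD has been read, stop before the body."""


def estimate_entity_expansion(xml_bytes: bytes) -> Dict[str, int]:
    """Map each declared internal entity to its fully expanded length."""
    declared: Dict[str, str] = {}

    def on_entity(name, is_parameter, value, base, system_id, public_id, notation):
        if value is not None:  # value is None for external entities
            declared[name] = value

    def on_first_element(name, attrs):
        # The DTD always precedes the root element, so stop here and
        # never let the parser expand references in the document body.
        raise _ScanComplete

    parser = xml.parsers.expat.ParserCreate()
    parser.EntityDeclHandler = on_entity
    parser.StartElementHandler = on_first_element
    try:
        parser.Parse(xml_bytes, True)
    except _ScanComplete:
        pass

    sizes: Dict[str, int] = {}

    def expanded_length(name: str, active: Set[str]) -> int:
        if name in sizes:
            return sizes[name]
        if name in active:
            raise ValueError(f"entity {name!r} refers to itself")
        if name not in declared:
            return 1  # predefined entities such as &amp; expand to one character
        active.add(name)
        value = declared[name]
        total = len(_ENTITY_REF.sub("", value))
        for ref in _ENTITY_REF.findall(value):
            total += expanded_length(ref, active)
        active.discard(name)
        sizes[name] = total
        return total

    for entity_name in declared:
        expanded_length(entity_name, set())
    return sizes


def looks_explosive(xml_bytes: bytes, max_chars: int = 1_000_000) -> bool:
    """True if any single entity would expand beyond max_chars characters."""
    sizes = estimate_entity_expansion(xml_bytes)
    return any(size > max_chars for size in sizes.values())
```

Because the sizes are computed arithmetically from the declarations, a classic payload with ten nesting levels of ten references each is flagged at negligible cost instead of being expanded into gigabytes of text.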
CVSS Vector
CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
Related Vulnerabilities
| CVE | CVSS | Summary | Relation |
|---|---|---|---|
| CVE-2024-12909 | 10.0 | llama-index finchat: SQL injection enables RCE | Same package: llama-index |
| CVE-2024-11958 | 9.8 | llama-index DuckDB retriever: SQLi enables RCE | Same package: llama-index |
| CVE-2025-1793 | 9.8 | llama_index: SQL injection in vector store integrations | Same package: llama-index |
| CVE-2025-1753 | 7.8 | llama-index-cli: OS command injection enables RCE | Same package: llama-index |
| CVE-2025-3046 | 7.5 | LlamaIndex Obsidian: symlink traversal exposes host files | Same package: llama-index |