CVE-2025-3225: LlamaIndex llama-index Papers Loader

CISO Take

Any RAG or document ingestion pipeline using llama-index-readers-papers to process sitemaps is vulnerable to a billion-laughs DoS that can crash the service via memory exhaustion. The fix is available: upgrade to llama-index-readers-papers >= 0.3.2 (llama-index >= 0.12.29) now. No exploitation observed in the wild yet, but the attack is trivial to craft.

What is the risk?

CVSS 7.5 (High), but real-world risk is moderate. An EPSS score of 0.00144 indicates a low predicted likelihood of exploitation. The attack is reachable over the network and requires no authentication or user interaction, making it a zero-friction DoS if the parser is exposed to untrusted input. The impact is limited to availability: no data exposure or code execution. Risk rises significantly for teams running automated document ingestion pipelines that accept external URLs or user-submitted sitemaps.

What systems are affected?

Package	Ecosystem	Vulnerable Range	Patched
llama-index-readers-papers	pip	< 0.3.2	0.3.2

If you use the Papers Loader (llama-index-readers-papers) at a version below 0.3.2, you are affected.

How severe is it?

CVSS 3.1: 7.5 / 10

EPSS: 0.4% chance of exploitation in 30 days (higher than 33% of all CVEs)

Source: EPSS v3, FIRST.org

Exploitation Status: Exploit Available

Exploitation: MEDIUM

Sophistication: Trivial

Exploitation Confidence: medium

CISA SSVC: Public PoC

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

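Attack Vector (AV): Network
Attack Complexity (AC): Low
Privileges Required (PR): None
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High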
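
The 7.5 base score can be reproduced from these metrics using the base-score equations of the CVSS v3.1 specification. The following is a minimal sketch that hard-codes the specification's weights for this vector only (scope unchanged); the function and constant names are illustrative, not part of any library.

```python
import math

# Metric weights from the CVSS v3.1 specification for the vector
# AV:N / AC:L / PR:N / UI:N / S:U. Names are illustrative.
AV_NETWORK = 0.85
AC_LOW = 0.77
PR_NONE = 0.85
UI_NONE = 0.85
CIA_WEIGHTS = {"none": 0.0, "low": 0.22, "high": 0.56}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(conf: str, integ: str, avail: str) -> float:
    """Base score for the fixed exploitability metrics above, scope unchanged."""
    iss = 1 - (
        (1 - CIA_WEIGHTS[conf])
        * (1 - CIA_WEIGHTS[integ])
        * (1 - CIA_WEIGHTS[avail])
    )
    impact = 6.42 * iss
    exploitability = 8.22 * AV_NETWORK * AC_LOW * PR_NONE * UI_NONE
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# C:None, I:None, A:High
print(base_score_scope_unchanged("none", "none", "high"))  # 7.5
```

The impact term (about 3.60) comes entirely from availability, and the exploitability term (about 3.89) is at its maximum for an unchanged-scope vector, which is why a pure DoS still lands in the High band.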

What should I do?

Patch immediately: upgrade llama-index-readers-papers to >= 0.3.2 or llama-index to >= 0.12.29.
If patching is delayed, disable or sandbox the Papers Loader until patched.
Compensating control: apply XML entity limits at the parser level (Python: use defusedxml or set entity expansion limits).
Validate and allowlist sitemap URLs before processing — reject untrusted or user-supplied URLs.
Monitor document ingestion workers for abnormal memory spikes as a detection signal.
Audit your dependency tree for llama-index-readers-papers usage across all services.
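
The URL allowlisting step above can be sketched with the standard library alone. This is an illustration under stated assumptions: the host `papers.example.org` and the helper name `is_allowed_sitemap_url` are hypothetical, and a real pipeline would load its trusted hosts from configuration.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the hosts your pipeline actually trusts.
ALLOWED_SITEMAP_HOSTS = frozenset({"papers.example.org"})


def is_allowed_sitemap_url(url: str, allowed_hosts=ALLOWED_SITEMAP_HOSTS) -> bool:
    """Return True only for https URLs whose exact host is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs (for example a broken IPv6 literal) are rejected.
        return False
    if parts.scheme != "https":
        return False
    # .hostname strips userinfo and port and lowercases the host, so
    # "trusted@evil.test" style tricks resolve to the real host.
    host = parts.hostname or ""
    return host in allowed_hosts
```

Exact host matching is deliberate: suffix or substring checks would accept look-alike hosts such as `papers.example.org.evil.test`. An allowlist reduces exposure but does not replace patching, since a trusted host can itself be compromised.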

What does CISA's SSVC say?

Decision: Track*

Exploitation: poc

Automatable: Yes

Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

How is it classified?

DoS, Supply Chain, Framework, RAG, AML.T0010.001 - AI Software, AML.T0029 - Denial of AI Service, AML.T0049 - Exploit Public-Facing Application

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act

Art. 9 - Risk management system

ISO 42001

6.1.2 - AI risk assessment

NIST AI RMF

MANAGE 2.2 - Mechanisms are in place and applied to sustain the value of deployed AI systems

OWASP LLM Top 10

LLM05:2025 - Improper Output Handling

Frequently Asked Questions

What is CVE-2025-3225?

CVE-2025-3225 is an XML Entity Expansion ("billion laughs") vulnerability in the sitemap parser of the LlamaIndex Papers Loaders package (llama-index-readers-papers) before version 0.3.2. A malicious sitemap XML can exhaust system memory and crash the ingestion service. The fix is to upgrade to llama-index-readers-papers >= 0.3.2 (llama-index >= 0.12.29). No exploitation has been observed in the wild yet, but the attack is trivial to craft.

Is CVE-2025-3225 actively exploited?

No confirmed active exploitation of CVE-2025-3225 has been reported, but organizations should still patch proactively.

How to fix CVE-2025-3225?

1. Patch immediately: upgrade llama-index-readers-papers to >= 0.3.2 or llama-index to >= 0.12.29.
2. If patching is delayed, disable or sandbox the Papers Loader until patched.
3. Compensating control: apply XML entity limits at the parser level (Python: use defusedxml or set entity expansion limits).
4. Validate and allowlist sitemap URLs before processing; reject untrusted or user-supplied URLs.
5. Monitor document ingestion workers for abnormal memory spikes as a detection signal.
6. Audit your dependency tree for llama-index-readers-papers usage across all services.
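
For the patch and audit steps, a quick check of the installed version in a given environment might look like the sketch below. It uses only `importlib.metadata` from the standard library; the helper names are illustrative, and the version parser is deliberately simplified (it reads leading numeric components only, so a pre-release such as `0.3.2rc1` would be treated as `0.3.2`). For stricter comparison, the third-party `packaging` library is the usual choice.

```python
from importlib.metadata import PackageNotFoundError, version

PACKAGE = "llama-index-readers-papers"
PATCHED = (0, 3, 2)  # first fixed release according to the advisory


def parse_release(version_string: str) -> tuple:
    """Parse leading numeric components, e.g. '0.3.1' -> (0, 3, 1)."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_patched(version_string: str) -> bool:
    """True if the version is at or above the first fixed release."""
    return parse_release(version_string) >= PATCHED


def installed_status(package: str = PACKAGE) -> str:
    """Report whether the package is absent, patched, or vulnerable."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return "not installed"
    return "patched" if is_patched(installed) else "VULNERABLE"


if __name__ == "__main__":
    print(f"{PACKAGE}: {installed_status()}")
```

Running this in each service's environment (or container image) gives a per-deployment answer; it does not inspect lock files or transitive pins, which still need a separate dependency scan.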

What systems are affected by CVE-2025-3225?

This vulnerability affects the following AI/ML architecture patterns: RAG pipelines, document ingestion pipelines, LLM agent frameworks, automated data loaders.

What is the CVSS score for CVE-2025-3225?

CVE-2025-3225 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.41%.

What is the AI security impact?

Affected AI Architectures

RAG pipelines, document ingestion pipelines, LLM agent frameworks, automated data loaders

MITRE ATLAS Techniques

AML.T0010.001 AI Software

AML.T0029 Denial of AI Service

AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Art. 9

ISO 42001: 6.1.2

NIST AI RMF: MANAGE 2.2

OWASP LLM Top 10: LLM05:2025

What are the technical details?

Original Advisory

An XML Entity Expansion vulnerability, also known as a 'billion laughs' attack, exists in the sitemap parser of the run-llama/llama_index repository, specifically affecting the Papers Loaders package before version 0.3.2 (in llama-index v0.10.0 and above through v0.12.29). This vulnerability allows an attacker to supply a malicious Sitemap XML, leading to a Denial of Service (DoS) by exhausting system memory and potentially causing a system crash. The issue is resolved in version 0.3.2 (in llama-index 0.12.29).

Exploitation Scenario

An adversary targeting an organization's RAG pipeline identifies that it uses llama-index to ingest papers from external sitemaps. They craft a malicious XML sitemap containing deeply nested entity references (classic billion-laughs structure) and either submit it via a public-facing document upload endpoint, inject the URL into an automated pipeline that crawls academic sources, or host it on a compromised domain the pipeline is configured to ingest. When the Papers Loader parses the sitemap, recursive entity expansion consumes all available memory, crashing the ingestion worker and halting RAG knowledge base updates.
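
To see why a very small file is enough, it helps to work through the arithmetic of nested entity expansion. The sketch below only computes sizes and does not build or parse any XML. The numbers (a 3-byte base string, 10 references per level, 9 levels of nesting) are the commonly cited illustration of the billion-laughs pattern, not figures from this advisory.

```python
def expanded_size(base_len: int, fanout: int, levels: int) -> int:
    """Bytes produced when a base string is multiplied by `fanout`
    at each of `levels` nested entity definitions."""
    return base_len * fanout ** levels


size = expanded_size(base_len=3, fanout=10, levels=9)
print(f"{size:,} bytes (~{size / 1e9:.0f} GB)")  # 3,000,000,000 bytes (~3 GB)
```

Growth is exponential in the nesting depth, so a document of well under a kilobyte can demand gigabytes of memory from a parser that expands entities without limits.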

Weaknesses (CWE)

CWE-776 Improper Restriction of Recursive Entity References in DTDs ('XML Entity Expansion') Primary

CWE-776 — Improper Restriction of Recursive Entity References in DTDs ('XML Entity Expansion'): The product uses XML documents and allows their structure to be defined with a Document Type Definition (DTD), but it does not properly control the number of recursive definitions of entities.

[Operation] If possible, prohibit the use of DTDs or use an XML parser that limits the expansion of recursive DTD entities.
[Implementation] Before parsing XML files with associated DTDs, scan for recursive entity declarations and do not continue parsing potentially explosive content.
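
One way to apply both mitigations in Python, without third-party dependencies, is to run a strict pre-check that refuses any document carrying a DOCTYPE or entity declaration, and only then hand the bytes to the normal parser. This is a minimal sketch, not the patch shipped in llama-index-readers-papers; `DTDForbidden` and `parse_without_dtd` are hypothetical names, and the defusedxml library mentioned earlier packages the same idea more completely.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when a document contains a DOCTYPE or entity declaration."""


def _reject_doctype(name, sysid, pubid, has_internal_subset):
    raise DTDForbidden(f"DOCTYPE declaration not allowed: {name!r}")


def _reject_entity(name, is_parameter_entity, value, base, sysid, pubid,
                   notation_name):
    raise DTDForbidden(f"entity declaration not allowed: {name!r}")


def parse_without_dtd(xml_bytes: bytes) -> ET.Element:
    """Parse XML only if it declares no DTD and no entities.

    The document is scanned once with a bare expat parser whose handlers
    abort at the first DOCTYPE or entity declaration, before any entity
    reference can be expanded. Clean input is then parsed normally.
    """
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.EntityDeclHandler = _reject_entity
    checker.Parse(xml_bytes, True)  # raises DTDForbidden or expat.ExpatError
    return ET.fromstring(xml_bytes)
```

Sitemap files have no legitimate need for a DTD, so rejecting them outright is simpler and safer than trying to detect which entity definitions are recursive. The cost is parsing each document twice, which is usually acceptable for sitemap-sized input.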

Source: MITRE CWE corpus.