CVE-2025-6985: LangChain XXE enables arbitrary file read

CISO Take

Upgrade langchain-text-splitters to 0.3.9 immediately. Any deployment that uses HTMLSectionSplitter with user-supplied or external XSLT is exposed, and no authentication is required. This is a direct, unauthenticated path to reading SSH keys, API credentials, and .env files that the LangChain process can reach. If you cannot patch now, remove custom XSLT input at the application layer and restrict the process's filesystem access.

What is the risk?

High risk for organizations running LangChain-based pipelines that accept external or user-controlled XSLT input. A CVSS score of 7.5 with a network attack vector, no required privileges, and no user interaction makes this trivially exploitable against exposed endpoints. An EPSS score below 1% suggests no widespread active exploitation yet, but the attack surface is broad given LangChain's adoption in production AI systems. The primary risk is exfiltration of credentials and secrets rather than code execution, with cloud metadata endpoints (AWS IMDS, GCP) representing a critical secondary exposure.

What systems are affected?

Package	Ecosystem	Vulnerable Range	Patched
langchain-text-splitters	pip	< 0.3.9	`0.3.9`

Do you use HTMLSectionSplitter from langchain-text-splitters below 0.3.9 with custom XSLT? You're affected.

How severe is it?

CVSS 3.1: 7.5 / 10

EPSS: 0.6% chance of exploitation in 30 days (higher than 45% of all CVEs)

Source: EPSS v3 — FIRST.org

Exploitation Status

Exploit Available

Exploitation: MEDIUM

Sophistication: Trivial

Exploitation Confidence: medium

○ CISA SSVC: Public PoC

○ Public PoC indexed (trickest/cve)

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

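Metric	Value
Attack Vector (AV)	Network
Attack Complexity (AC)	Low
Privileges Required (PR)	None
User Interaction (UI)	None
Scope (S)	Unchanged
Confidentiality (C)	High
Integrity (I)	None
Availability (A)	None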
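
As a worked check, the 7.5 base score can be reproduced from these metric values. The sketch below uses the numeric weights and the Roundup function from the CVSS v3.1 specification and only handles the scope-unchanged case that applies here; the function names are illustrative.

```python
import math

# Numeric weights from the CVSS v3.1 specification for this CVE's metrics.
AV_NETWORK = 0.85
AC_LOW = 0.77
PR_NONE = 0.85
UI_NONE = 0.85
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(c: str, i: str, a: str) -> float:
    """Base score for AV:N/AC:L/PR:N/UI:N/S:U with the given C/I/A values."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss  # scope-unchanged impact formula
    exploitability = 8.22 * AV_NETWORK * AC_LOW * PR_NONE * UI_NONE
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("H", "N", "N"))  # C:H / I:N / A:N -> 7.5
```

With confidentiality as the only impacted property, the impact sub-score is about 3.6 and the exploitability sub-score about 3.9, which rounds up to 7.5.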

What should I do?

1. PATCH: Upgrade langchain-text-splitters to >= 0.3.9 immediately.
2. WORKAROUND: Until you can patch, disable or restrict custom XSLT input paths in your application.
3. LEAST PRIVILEGE: Run LangChain processes with restricted filesystem access: no access to ~/.ssh/, .env files, or other credential stores.
4. NETWORK CONTROL: Block outbound HTTP/HTTPS from LangChain processes to the cloud metadata address 169.254.169.254 (for example, http://169.254.169.254/latest/).
5. DETECTION: Monitor for unusual file access from LangChain processes, especially to /etc/passwd, ~/.ssh/, and home directories, and for requests to metadata endpoints.
6. AUDIT: Inventory all services using HTMLSectionSplitter and check whether XSLT input originates from user-controlled or external sources.
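
For the patch step, a quick way to confirm what is installed in a given environment is to query package metadata. The following is a minimal sketch using only the standard library; `parse_release`, `is_vulnerable`, and `check_installed` are illustrative helper names, and the simple parser ignores pre-release suffixes rather than implementing full version semantics.

```python
from importlib import metadata

PATCHED = (0, 3, 9)  # first fixed release according to this advisory


def parse_release(version: str) -> tuple:
    """Parse the leading numeric 'X.Y.Z' part of a version string."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_vulnerable(version: str) -> bool:
    """True when the release tuple sorts before the patched release."""
    return parse_release(version) < PATCHED


def check_installed(package: str = "langchain-text-splitters") -> str:
    """Report the installed version of the package and its patch status."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package}: not installed"
    status = "VULNERABLE, upgrade to >= 0.3.9" if is_vulnerable(version) else "patched"
    return f"{package} {version}: {status}"


if __name__ == "__main__":
    print(check_installed())
```

Run it with the same interpreter and virtual environment as the service being checked, since each environment can pin a different version.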

What does CISA's SSVC say?

Decision: Track*

Exploitation: poc

Automatable: Yes

Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

How is it classified?

Data Extraction, Supply Chain, Data Leakage, Framework, RAG, Agent

AML.T0010.001 - AI Software
AML.T0025 - Exfiltration via Cyber Means
AML.T0037 - Data from Local System
AML.T0049 - Exploit Public-Facing Application
AML.T0055 - Unsecured Credentials

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act

Article 15 - Accuracy, robustness and cybersecurity

ISO 42001

Clause 8.3 - AI Risk Treatment

NIST AI RMF

MANAGE 2.2 - AI risk controls are monitored and adjusted

OWASP LLM Top 10

LLM02:2025 - Sensitive Information Disclosure
LLM03:2025 - Supply Chain

Frequently Asked Questions

What is CVE-2025-6985?

CVE-2025-6985 is an XML External Entity (XXE) vulnerability in the HTMLSectionSplitter class of langchain-text-splitters. The class accepts arbitrary XSLT stylesheets and parses them with lxml without hardening, which gives a remote, unauthenticated attacker read-only access to any file the LangChain process can reach, including SSH keys, API credentials, and .env files. The fix is to upgrade langchain-text-splitters to 0.3.9. If you cannot patch now, remove custom XSLT input at the application layer and restrict the process's filesystem access.

Is CVE-2025-6985 actively exploited?

Proof-of-concept exploit code is publicly available for CVE-2025-6985, which increases the risk of exploitation. The low EPSS score suggests no widespread active exploitation yet.

How to fix CVE-2025-6985?

1. PATCH: Upgrade langchain-text-splitters to >= 0.3.9 immediately.
2. WORKAROUND: Until you can patch, disable or restrict custom XSLT input paths in your application.
3. LEAST PRIVILEGE: Run LangChain processes with restricted filesystem access: no access to ~/.ssh/, .env files, or other credential stores.
4. NETWORK CONTROL: Block outbound HTTP/HTTPS from LangChain processes to the cloud metadata address 169.254.169.254 (for example, http://169.254.169.254/latest/).
5. DETECTION: Monitor for unusual file access from LangChain processes, especially to /etc/passwd, ~/.ssh/, and home directories, and for requests to metadata endpoints.
6. AUDIT: Inventory all services using HTMLSectionSplitter and check whether XSLT input originates from user-controlled or external sources.
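
For the audit step, one possible starting point is a static scan for HTMLSectionSplitter call sites. The sketch below uses the standard `ast` module and assumes that a custom stylesheet is passed through a keyword argument named `xslt_path`; that name, like the helper names `find_custom_xslt_calls` and `audit_tree`, is an assumption to verify against the version you run. Positional arguments and indirect construction are not detected.

```python
import ast
from pathlib import Path


def find_custom_xslt_calls(source: str, filename: str = "<string>") -> list:
    """Return (line, has_xslt_arg) for each HTMLSectionSplitter(...) call."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both `HTMLSectionSplitter(...)` and `module.HTMLSectionSplitter(...)`.
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != "HTMLSectionSplitter":
            continue
        # Assumption: a custom stylesheet is passed via an `xslt_path` keyword.
        has_xslt = any(kw.arg == "xslt_path" for kw in node.keywords)
        findings.append((node.lineno, has_xslt))
    return sorted(findings)


def audit_tree(root: str) -> None:
    """Print every HTMLSectionSplitter call found in .py files under root."""
    for path in Path(root).rglob("*.py"):
        try:
            hits = find_custom_xslt_calls(path.read_text(encoding="utf-8"), str(path))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be read or parsed
        for line, has_xslt in hits:
            flag = "custom XSLT" if has_xslt else "default XSLT"
            print(f"{path}:{line}: HTMLSectionSplitter ({flag})")
```

Calling `audit_tree("src")` lists candidate call sites; each "custom XSLT" hit still needs a manual check of where the stylesheet path comes from.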

What systems are affected by CVE-2025-6985?

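This vulnerability affects langchain-text-splitters versions below 0.3.9 when HTMLSectionSplitter is used with custom XSLT. The affected AI/ML architecture patterns are RAG pipelines, document ingestion pipelines, LLM agent frameworks, and data preprocessing pipelines.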

What is the CVSS score for CVE-2025-6985?

CVE-2025-6985 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.61%.

What is the AI security impact?

Affected AI Architectures

RAG pipelines, document ingestion pipelines, LLM agent frameworks, data preprocessing pipelines

MITRE ATLAS Techniques

AML.T0010.001 AI Software

AML.T0025 Exfiltration via Cyber Means

AML.T0037 Data from Local System

AML.T0049 Exploit Public-Facing Application

AML.T0055 Unsecured Credentials

Compliance Controls Affected

EU AI Act: Article 15

ISO 42001: Clause 8.3

NIST AI RMF: MANAGE 2.2

OWASP LLM Top 10: LLM02:2025, LLM03:2025

What are the technical details?

Original Advisory

The HTMLSectionSplitter class in langchain-text-splitters version 0.3.8 is vulnerable to XML External Entity (XXE) attacks due to unsafe XSLT parsing. This vulnerability arises because the class allows the use of arbitrary XSLT stylesheets, which are parsed using lxml.etree.parse() and lxml.etree.XSLT() without any hardening measures. In lxml versions up to 4.9.x, external entities are resolved by default, allowing attackers to read arbitrary local files or perform outbound HTTP(S) fetches. In lxml versions 5.0 and above, while entity expansion is disabled, the XSLT document() function can still read any URI unless XSLTAccessControl is applied. This vulnerability allows remote attackers to gain read-only access to any file the LangChain process can reach, including sensitive files such as SSH keys, environment files, source code, or cloud metadata. No authentication, special privileges, or user interaction are required, and the issue is exploitable in default deployments that enable custom XSLT.
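
The two hardening measures the advisory names, disabling entity resolution in the parser and applying XSLTAccessControl, can be sketched as follows for code that calls lxml directly. This is an illustration under stated assumptions, not necessarily how the 0.3.9 fix is implemented; `transform_with_hardened_xslt` is a hypothetical helper name, and lxml must be installed.

```python
def transform_with_hardened_xslt(xslt_path: str, html_text: str) -> str:
    """Apply a stylesheet with entity resolution and stylesheet I/O denied."""
    from lxml import etree  # third-party dependency, imported on first use

    parser = etree.XMLParser(
        resolve_entities=False,  # do not expand external entities
        no_network=True,         # no network fetches while parsing
        load_dtd=False,          # do not load external DTDs
    )
    stylesheet = etree.parse(xslt_path, parser)

    # DENY_ALL blocks file and network reads and writes from inside the
    # stylesheet, which covers the XSLT document() function.
    transform = etree.XSLT(
        stylesheet, access_control=etree.XSLTAccessControl.DENY_ALL
    )
    try:
        return str(transform(etree.HTML(html_text)))
    except etree.XSLTError:
        # lxml may raise when the stylesheet attempts a denied access.
        return ""
```

The parser options address the entity-expansion path described for lxml up to 4.9.x, while the access control object addresses the document() path that remains in lxml 5.0 and above, so both are applied together.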

Exploitation Scenario

An adversary submits a crafted HTML document containing a malicious embedded XSLT stylesheet to a RAG document ingestion endpoint. The XSLT instructs lxml to read /app/.env (containing LLM API keys and database credentials) or the AWS IMDS endpoint at http://169.254.169.254/latest/meta-data/iam/security-credentials/ to retrieve temporary IAM credentials. File contents surface in parsed output or error responses, enabling full credential exfiltration with no authentication, privileges, or user interaction. In batch pipelines, the attack can be embedded in a document uploaded to a monitored S3 bucket or shared drive, triggering passively on the next ingestion run.
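
As a defense-in-depth measure for the scenario above, stylesheets from untrusted sources can be screened before they reach the splitter. The sketch below is a heuristic pre-filter only: the pattern list and the `flag_stylesheet` name are illustrative assumptions, pattern matching can be evaded, and it does not replace patching or parser hardening.

```python
import re

# Heuristic patterns for constructs that let a stylesheet pull in outside data.
SUSPICIOUS_PATTERNS = {
    "doctype": re.compile(r"<!DOCTYPE", re.IGNORECASE),
    "entity": re.compile(r"<!ENTITY", re.IGNORECASE),
    "document()": re.compile(r"\bdocument\s*\("),
    "xsl:import/include": re.compile(r"<xsl:(import|include)\b"),
    "metadata address": re.compile(r"169\.254\.169\.254"),
}


def flag_stylesheet(xslt_text: str) -> list:
    """Return the names of suspicious constructs found in a stylesheet."""
    return [
        name
        for name, pattern in SUSPICIOUS_PATTERNS.items()
        if pattern.search(xslt_text)
    ]
```

A non-empty result is a reason to reject or quarantine the stylesheet and log the submission for review.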

Weaknesses (CWE)

CWE-611 Improper Restriction of XML External Entity Reference (Primary)

CWE-611 — Improper Restriction of XML External Entity Reference: The product processes an XML document that can contain XML entities with URIs that resolve to documents outside of the intended sphere of control, causing the product to embed incorrect documents into its output.