CVE-2025-6985: langchain-text-splitters: XXE enables arbitrary file read

GHSA-m42m-m8cr-8m58 HIGH PoC AVAILABLE CISA: TRACK*
Published October 6, 2025
CISO Take

Upgrade langchain-text-splitters to 0.3.9 immediately. Any deployment that uses HTMLSectionSplitter with user-supplied or external XSLT is exposed, and exploitation requires no authentication. An attacker can read any file the LangChain process can reach, including SSH keys, API credentials, and .env files. If you cannot patch now, remove custom XSLT input at the application layer and restrict the process's filesystem access.

Risk Assessment

High risk for organizations running LangChain-based pipelines that accept external or user-controlled XSLT input. A CVSS score of 7.5 with a network vector, no privileges, and no user interaction makes this trivially exploitable against exposed endpoints. An EPSS score of 0.00235 (about 0.2%) suggests no widespread active exploitation yet, but the attack surface is broad given LangChain's adoption in production AI systems. The primary risk is exfiltration of credentials and secrets rather than code execution, with cloud metadata endpoints (AWS IMDS, GCP) representing a critical secondary exposure.

Affected Systems

Package                   Ecosystem  Vulnerable Range  Patched
langchain-text-splitters  pip        < 0.3.9           0.3.9
135.7K · OpenSSF 6.5 · 2.6K dependents · Pushed 7d ago · 17% patched · ~256d to patch

You are affected if you run langchain-text-splitters below 0.3.9 and use HTMLSectionSplitter with custom XSLT.

Severity & Risk

CVSS 3.1
7.5 / 10
EPSS
0.2%
chance of exploitation in 30 days
Higher than 42% of all CVEs
Exploitation Status
Exploit Available
Exploitation: MEDIUM
Sophistication
Trivial
Exploitation Confidence
medium
CISA SSVC: Public PoC
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

AV Network
AC Low
PR None
UI None
S Unchanged
C High
I None
A None

Recommended Action

6 steps
  1. PATCH

    Upgrade langchain-text-splitters to >= 0.3.9 immediately.

  2. WORKAROUND

    Disable or restrict custom XSLT input paths in your application before patching.

  3. LEAST PRIVILEGE

    Run LangChain processes with restricted filesystem access — no access to ~/.ssh/, .env files, or instance metadata paths.

  4. NETWORK CONTROL

    Block outbound HTTP/HTTPS from LangChain processes to the cloud instance metadata address 169.254.169.254 (for example, http://169.254.169.254/latest/ on AWS).

  5. DETECTION

    Monitor for unusual file access from LangChain processes, especially to /etc/passwd, ~/.ssh/, home directories, and metadata endpoints.

  6. AUDIT

    Inventory all services using HTMLSectionSplitter — check if XSLT input originates from user-controlled or external sources.
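
The PATCH and AUDIT steps above can be partly automated with a version check on each host or container image. The following is a minimal sketch, not an official tool: `parse_release`, `is_vulnerable`, and `installed_status` are hypothetical helper names, and the parser only looks at the leading numeric part of the version string, so pre-release suffixes such as `rc1` are ignored.

```python
import re
from importlib import metadata

# First fixed release according to the advisory above.
PATCHED = (0, 3, 9)


def parse_release(version: str) -> tuple:
    """Extract the leading numeric release segment, e.g. '0.3.8' -> (0, 3, 8)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_vulnerable(version: str) -> bool:
    """Return True if the version falls in the advisory's '< 0.3.9' range."""
    release = parse_release(version)
    # Pad both tuples to the same length so '0.3' compares as (0, 3, 0).
    width = max(len(release), len(PATCHED))
    padded = release + (0,) * (width - len(release))
    patched = PATCHED + (0,) * (width - len(PATCHED))
    return padded < patched


def installed_status(package: str = "langchain-text-splitters") -> str:
    """Report whether the locally installed package is in the vulnerable range."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package}: not installed"
    state = "VULNERABLE, upgrade to >= 0.3.9" if is_vulnerable(version) else "patched"
    return f"{package} {version}: {state}"


if __name__ == "__main__":
    print(installed_status())
```

A version check only answers whether the package is patched. Whether XSLT input can come from an untrusted source still has to be reviewed per service.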

CISA SSVC Assessment

Decision Track*
Exploitation poc
Automatable Yes
Technical Impact partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
Clause 8.3 - AI Risk Treatment
NIST AI RMF
MANAGE 2.2 - AI risk controls are monitored and adjusted
OWASP LLM Top 10
LLM02:2025 - Sensitive Information Disclosure LLM03:2025 - Supply Chain

Frequently Asked Questions

What is CVE-2025-6985?

CVE-2025-6985 is an XML External Entity (XXE) vulnerability in the HTMLSectionSplitter class of langchain-text-splitters. The class accepts arbitrary XSLT stylesheets and parses them with lxml without hardening, so a remote attacker who controls the stylesheet can read any file the LangChain process can reach, including SSH keys, API credentials, and .env files. No authentication is required. The issue is fixed in version 0.3.9.

Is CVE-2025-6985 actively exploited?

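No active exploitation is indicated: CISA SSVC rates exploitation as "poc" and the EPSS score is about 0.2%. However, proof-of-concept exploit code is publicly available for CVE-2025-6985, which increases the risk of exploitation.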

How to fix CVE-2025-6985?

1. PATCH: Upgrade langchain-text-splitters to >= 0.3.9 immediately.
2. WORKAROUND: Disable or restrict custom XSLT input paths in your application before patching.
3. LEAST PRIVILEGE: Run LangChain processes with restricted filesystem access, with no access to ~/.ssh/, .env files, or instance metadata paths.
4. NETWORK CONTROL: Block outbound HTTP/HTTPS from LangChain processes to the cloud instance metadata address 169.254.169.254.
5. DETECTION: Monitor for unusual file access from LangChain processes, especially to /etc/passwd, ~/.ssh/, home directories, and metadata endpoints.
6. AUDIT: Inventory all services using HTMLSectionSplitter and check whether XSLT input originates from user-controlled or external sources.

What systems are affected by CVE-2025-6985?

This vulnerability affects the following AI/ML architecture patterns: RAG pipelines, document ingestion pipelines, LLM agent frameworks, data preprocessing pipelines.

What is the CVSS score for CVE-2025-6985?

CVE-2025-6985 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.20%.

Technical Details

NVD Description

The HTMLSectionSplitter class in langchain-text-splitters version 0.3.8 is vulnerable to XML External Entity (XXE) attacks due to unsafe XSLT parsing. This vulnerability arises because the class allows the use of arbitrary XSLT stylesheets, which are parsed using lxml.etree.parse() and lxml.etree.XSLT() without any hardening measures. In lxml versions up to 4.9.x, external entities are resolved by default, allowing attackers to read arbitrary local files or perform outbound HTTP(S) fetches. In lxml versions 5.0 and above, while entity expansion is disabled, the XSLT document() function can still read any URI unless XSLTAccessControl is applied. This vulnerability allows remote attackers to gain read-only access to any file the LangChain process can reach, including sensitive files such as SSH keys, environment files, source code, or cloud metadata. No authentication, special privileges, or user interaction are required, and the issue is exploitable in default deployments that enable custom XSLT.
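
The description above names the two missing hardening measures: entity resolution in the parser and XSLTAccessControl on the transform. The following is a minimal sketch of how an application that must load its own XSLT could apply both, assuming lxml is installed. `build_hardened_transform` is a hypothetical helper, not part of LangChain, and this is not necessarily how the 0.3.9 patch is implemented.

```python
from lxml import etree


def build_hardened_transform(xslt_bytes: bytes) -> etree.XSLT:
    """Compile a stylesheet with entity resolution and external access disabled."""
    # Parser hardening: do not resolve entities, load DTDs, or touch the network.
    parser = etree.XMLParser(
        resolve_entities=False,
        no_network=True,
        load_dtd=False,
    )
    stylesheet = etree.fromstring(xslt_bytes, parser)

    # Transform hardening: DENY_ALL blocks file and network reads and writes,
    # which covers the XSLT document() function mentioned in the description.
    return etree.XSLT(stylesheet, access_control=etree.XSLTAccessControl.DENY_ALL)


if __name__ == "__main__":
    # A harmless stylesheet that copies the text of the root element.
    demo_xslt = b"""<xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text"/>
      <xsl:template match="/"><xsl:value-of select="/r"/></xsl:template>
    </xsl:stylesheet>"""
    transform = build_hardened_transform(demo_xslt)
    print(str(transform(etree.XML(b"<r>ok</r>"))))
```

Hardening the parser alone is not enough on lxml 5.0 and above, because document() is evaluated by the XSLT engine rather than the XML parser. That is why the access control object is passed to the transform as well.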

Exploitation Scenario

An adversary submits a crafted HTML document containing a malicious embedded XSLT stylesheet to a RAG document ingestion endpoint. The XSLT instructs lxml to read /app/.env (containing LLM API keys and database credentials) or the AWS IMDS endpoint at http://169.254.169.254/latest/meta-data/iam/security-credentials/ to retrieve temporary IAM credentials. File contents surface in parsed output or error responses, enabling full credential exfiltration with no authentication, privileges, or user interaction. In batch pipelines, the attack can be embedded in a document uploaded to a monitored S3 bucket or shared drive, triggering passively on the next ingestion run.
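
Stylesheets like the one in this scenario tend to contain a small set of constructs: a DOCTYPE with entity declarations, a call to document(), or an import of another stylesheet. As a stopgap for the audit and workaround steps, an ingestion service could pre-screen stylesheets before they reach lxml. The sketch below is a heuristic under stated assumptions, not a substitute for patching: the pattern list and the `screen_stylesheet` name are illustrative, and a determined attacker may find encodings that a regex check misses.

```python
import re

# Constructs that let a stylesheet pull in external content.
# Illustrative list only; it is not exhaustive.
SUSPICIOUS_PATTERNS = {
    "doctype": re.compile(r"<!DOCTYPE", re.IGNORECASE),
    "entity": re.compile(r"<!ENTITY", re.IGNORECASE),
    "document()": re.compile(r"\bdocument\s*\("),
    "import/include": re.compile(r"<(?:[\w.-]+:)?(?:import|include)\b"),
}


def screen_stylesheet(text: str) -> list:
    """Return the names of suspicious constructs found in a stylesheet."""
    return [name for name, pattern in SUSPICIOUS_PATTERNS.items() if pattern.search(text)]


if __name__ == "__main__":
    sample = (
        '<!DOCTYPE x [<!ENTITY e SYSTEM "file:///etc/passwd">]>'
        '<xsl:stylesheet version="1.0" '
        'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
        "<xsl:template match=\"/\">"
        "<xsl:value-of select=\"document('/app/.env')\"/>"
        "</xsl:template></xsl:stylesheet>"
    )
    findings = screen_stylesheet(sample)
    if findings:
        print("rejecting stylesheet, found:", ", ".join(findings))
```

Rejecting on any finding is deliberately strict. Legitimate stylesheets that use xsl:import would need an allowlist rather than a looser pattern.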

CVSS Vector

CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
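
The 7.5 base score can be reproduced from this vector with the published CVSS 3.x base equations. The sketch below handles only Scope Unchanged vectors, which is all this advisory needs. `base_score` and `roundup` are illustrative names, and the rounding helper follows the CVSS 3.1 definition, which gives the same result as the 3.0 ceiling rule for this vector.

```python
import math

# Metric weights from the CVSS 3.x specification (PR values are for Scope Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, avoiding floating-point artifacts."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a Scope Unchanged CVSS 3.x vector string."""
    metrics = dict(item.split(":") for item in vector.split("/"))
    if metrics.get("S") != "U":
        raise ValueError("this sketch only handles Scope Unchanged (S:U)")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}

    # Impact sub-score: ISS = 1 - (1 - C)(1 - I)(1 - A), Impact = 6.42 * ISS.
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    # Exploitability = 8.22 * AV * AC * PR * UI.
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]

    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N"))
```

For this vector the impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is about 3.89, so the sum of roughly 7.48 rounds up to 7.5.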

Timeline

Published
October 6, 2025
Last Modified
October 8, 2025
First Seen
October 6, 2025

Related Vulnerabilities