CVE-2025-6985: langchain-text-splitters: XXE enables arbitrary file read

GHSA-m42m-m8cr-8m58 | HIGH | PoC AVAILABLE | CISA: TRACK*
Published October 6, 2025
CISO Take

Upgrade langchain-text-splitters to 0.3.9 immediately. Any deployment that uses HTMLSectionSplitter with user-supplied or external XSLT is exposed, and no authentication is required. This is a direct, unauthenticated path to reading SSH keys, API credentials, and .env files that your LangChain process can access. If you cannot patch now, remove custom XSLT input at the application layer and restrict the process's filesystem access.

What is the risk?

High risk for organizations running LangChain-based pipelines that accept external or user-controlled XSLT input. A CVSS score of 7.5 with a network attack vector, no privileges required, and no user interaction makes this trivially exploitable against exposed endpoints. The low EPSS score (about 0.6%) suggests no widespread active exploitation yet, but the attack surface is broad given LangChain's adoption in production AI systems. The primary risk is credential and secrets exfiltration rather than code execution, with cloud metadata endpoints (AWS IMDS, GCP metadata server) representing a critical secondary exposure.

What systems are affected?

Package                    Ecosystem   Vulnerable Range   Patched
langchain-text-splitters   pip         < 0.3.9            0.3.9

Do you use LangChain? You are affected if you run langchain-text-splitters below 0.3.9 and use HTMLSectionSplitter with custom XSLT.

How severe is it?

CVSS 3.1: 7.5 / 10
EPSS: 0.6% chance of exploitation in 30 days (higher than 45% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Trivial
Exploitation Confidence: medium
CISA SSVC: Public PoC
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

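Attack Vector (AV): Network
Attack Complexity (AC): Low
Privileges Required (PR): None
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): High
Integrity (I): None
Availability (A): None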

What should I do?

6 steps
  1. PATCH

    Upgrade langchain-text-splitters to >= 0.3.9 immediately.

  2. WORKAROUND

    Disable or restrict custom XSLT input paths in your application before patching.

  3. LEAST PRIVILEGE

    Run LangChain processes with restricted filesystem access — no access to ~/.ssh/, .env files, or instance metadata paths.

  4. NETWORK CONTROL

    Block outbound HTTP/HTTPS from LangChain processes to the cloud metadata address (169.254.169.254).

  5. DETECTION

    Monitor for unusual file access from LangChain processes, especially to /etc/passwd, ~/.ssh/, home directories, and metadata endpoints.

  6. AUDIT

    Inventory all services using HTMLSectionSplitter — check if XSLT input originates from user-controlled or external sources.
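The PATCH and AUDIT steps above can be partly automated. The following is a minimal sketch, assuming a Python environment where the package is installed under the distribution name langchain-text-splitters. The helper names (parse_release, is_vulnerable, check_installed) are hypothetical and not part of LangChain. The version parsing is deliberately simple and ignores pre-release suffixes, so treat the result as a first-pass signal only.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# First patched release according to this advisory.
PATCHED = (0, 3, 9)


def parse_release(v: str) -> tuple:
    """Return the leading numeric components, e.g. '0.3.8' -> (0, 3, 8).

    Simplification: suffixes such as 'rc1' are ignored, so '0.3.9rc1'
    is treated the same as '0.3.9'.
    """
    parts = []
    for piece in v.split("."):
        digits = re.match(r"\d+", piece)
        if not digits:
            break
        parts.append(int(digits.group()))
    return tuple(parts)


def is_vulnerable(installed: str) -> bool:
    """True when the installed version is below the patched release."""
    return parse_release(installed) < PATCHED


def check_installed(package: str = "langchain-text-splitters") -> str:
    """Report the status of the package in the current environment."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f"{package} is not installed"
    status = "VULNERABLE, upgrade to >= 0.3.9" if is_vulnerable(installed) else "patched"
    return f"{package} {installed}: {status}"


if __name__ == "__main__":
    print(check_installed())
```

Running the script inside each service's virtual environment or container image gives a quick inventory signal. It does not tell you whether HTMLSectionSplitter is actually reachable with attacker-controlled XSLT.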

What does CISA's SSVC say?

Decision: Track*
Exploitation: poc
Automatable: Yes
Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
Clause 8.3 - AI Risk Treatment
NIST AI RMF
MANAGE 2.2 - AI risk controls are monitored and adjusted
OWASP LLM Top 10
LLM02:2025 - Sensitive Information Disclosure
LLM03:2025 - Supply Chain

Frequently Asked Questions

What is CVE-2025-6985?

CVE-2025-6985 is an XML External Entity (XXE) vulnerability (CWE-611) in the HTMLSectionSplitter class of langchain-text-splitters. The class parses arbitrary XSLT stylesheets with lxml without any hardening, which lets a remote, unauthenticated attacker read any file the LangChain process can reach, such as SSH keys, API credentials, and .env files. The issue is fixed in version 0.3.9.

Is CVE-2025-6985 actively exploited?

CISA's SSVC assessment lists exploitation at the proof-of-concept stage rather than active. Proof-of-concept exploit code is publicly available for CVE-2025-6985, which increases the risk of exploitation.

How to fix CVE-2025-6985?

1. PATCH: Upgrade langchain-text-splitters to >= 0.3.9 immediately.
2. WORKAROUND: Disable or restrict custom XSLT input paths in your application before patching.
3. LEAST PRIVILEGE: Run LangChain processes with restricted filesystem access, with no access to ~/.ssh/, .env files, or instance metadata paths.
4. NETWORK CONTROL: Block outbound HTTP/HTTPS from LangChain processes to the cloud metadata address (169.254.169.254).
5. DETECTION: Monitor for unusual file access from LangChain processes, especially to /etc/passwd, ~/.ssh/, home directories, and metadata endpoints.
6. AUDIT: Inventory all services using HTMLSectionSplitter and check whether XSLT input originates from user-controlled or external sources.

What systems are affected by CVE-2025-6985?

This vulnerability affects langchain-text-splitters versions below 0.3.9 (pip). It is most relevant to the following AI/ML architecture patterns: RAG pipelines, document ingestion pipelines, LLM agent frameworks, and data preprocessing pipelines.

What is the CVSS score for CVE-2025-6985?

CVE-2025-6985 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.61%.

What is the AI security impact?

Affected AI Architectures

RAG pipelines
document ingestion pipelines
LLM agent frameworks
data preprocessing pipelines

MITRE ATLAS Techniques

AML.T0010.001 AI Software
AML.T0025 Exfiltration via Cyber Means
AML.T0037 Data from Local System
AML.T0049 Exploit Public-Facing Application
AML.T0055 Unsecured Credentials

Compliance Controls Affected

EU AI Act: Article 15
ISO 42001: Clause 8.3
NIST AI RMF: MANAGE 2.2
OWASP LLM Top 10: LLM02:2025, LLM03:2025

What are the technical details?

Original Advisory

The HTMLSectionSplitter class in langchain-text-splitters version 0.3.8 is vulnerable to XML External Entity (XXE) attacks due to unsafe XSLT parsing. This vulnerability arises because the class allows the use of arbitrary XSLT stylesheets, which are parsed using lxml.etree.parse() and lxml.etree.XSLT() without any hardening measures. In lxml versions up to 4.9.x, external entities are resolved by default, allowing attackers to read arbitrary local files or perform outbound HTTP(S) fetches. In lxml versions 5.0 and above, while entity expansion is disabled, the XSLT document() function can still read any URI unless XSLTAccessControl is applied. This vulnerability allows remote attackers to gain read-only access to any file the LangChain process can reach, including sensitive files such as SSH keys, environment files, source code, or cloud metadata. No authentication, special privileges, or user interaction are required, and the issue is exploitable in default deployments that enable custom XSLT.
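The hardening measures the advisory says are missing can be sketched with lxml's own options. This is an illustrative example, not the fix shipped in 0.3.9. The names SAFE_PARSER and load_hardened_xslt are hypothetical, and the sketch assumes the stylesheet arrives as bytes. It combines a parser that does not resolve entities or touch the network with XSLTAccessControl.DENY_ALL, so that document() cannot read files or URLs during the transform.

```python
from lxml import etree

# Parser that does not load DTDs, does not expand entities,
# and does not fetch anything over the network.
SAFE_PARSER = etree.XMLParser(
    resolve_entities=False,
    no_network=True,
    load_dtd=False,
)


def load_hardened_xslt(xslt_bytes: bytes) -> etree.XSLT:
    """Compile a stylesheet with file and network access denied.

    XSLTAccessControl.DENY_ALL blocks read/write of files and network
    resources from inside the transform, including XSLT document().
    """
    xslt_doc = etree.fromstring(xslt_bytes, parser=SAFE_PARSER)
    return etree.XSLT(xslt_doc, access_control=etree.XSLTAccessControl.DENY_ALL)


if __name__ == "__main__":
    stylesheet = b"""<xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:template match="/">
        <out><xsl:value-of select="/root/title"/></out>
      </xsl:template>
    </xsl:stylesheet>"""
    transform = load_hardened_xslt(stylesheet)
    print(str(transform(etree.fromstring(b"<root><title>hello</title></root>"))))
```

This sketch does not examine xsl:import or xsl:include targets, which are handled when the stylesheet is compiled. Where possible, accepting only a fixed, vetted stylesheet remains the simpler control.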

Exploitation Scenario

An adversary submits a crafted HTML document containing a malicious embedded XSLT stylesheet to a RAG document ingestion endpoint. The XSLT instructs lxml to read /app/.env (containing LLM API keys and database credentials) or the AWS IMDS endpoint at http://169.254.169.254/latest/meta-data/iam/security-credentials/ to retrieve temporary IAM credentials. File contents surface in parsed output or error responses, enabling full credential exfiltration with no authentication, privileges, or user interaction. In batch pipelines, the attack can be embedded in a document uploaded to a monitored S3 bucket or shared drive, triggering passively on the next ingestion run.
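The constructs this scenario depends on (entity declarations and document() reads) can be screened for before a stylesheet is ever parsed. The following is a rough triage sketch using only the standard library. The pattern table and the find_risky_constructs helper are hypothetical, the checks are plain text heuristics that assume the conventional xsl: prefix, and a determined attacker may evade them, so this complements rather than replaces patching.

```python
import re

# Heuristic patterns for stylesheet features that can reach outside the
# document being transformed. Text matching only; no XML parsing is done.
RISKY_PATTERNS = {
    "DOCTYPE declaration": re.compile(r"<!DOCTYPE", re.IGNORECASE),
    "entity declaration": re.compile(r"<!ENTITY", re.IGNORECASE),
    "document() call": re.compile(r"\bdocument\s*\("),
    "xsl:import or xsl:include": re.compile(r"<\s*xsl:(import|include)\b"),
}


def find_risky_constructs(xslt_text: str) -> list:
    """Return the names of all risky constructs found in the stylesheet text."""
    return [name for name, pattern in RISKY_PATTERNS.items()
            if pattern.search(xslt_text)]


if __name__ == "__main__":
    sample = (
        '<xsl:stylesheet version="1.0" '
        'xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
        '<xsl:template match="/">'
        "<xsl:copy-of select=\"document('/app/.env')\"/>"
        "</xsl:template></xsl:stylesheet>"
    )
    findings = find_risky_constructs(sample)
    print("reject" if findings else "accept", findings)
```

A non-empty result is a reason to reject or quarantine the stylesheet and log the event for the detection step.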

Weaknesses (CWE)

CWE-611 — Improper Restriction of XML External Entity Reference: The product processes an XML document that can contain XML entities with URIs that resolve to documents outside of the intended sphere of control, causing the product to embed incorrect documents into its output.

  • [Implementation, System Configuration] Many XML parsers and validators can be configured to disable external entity expansion.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
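As a worked check, the 7.5 base score can be reproduced from this vector. The sketch below uses the metric weights and base-score equations from the CVSS v3.1 specification, including its rounding rule, which gives the same result as v3.0 for this vector. The function names are illustrative, and temporal and environmental metrics are not handled.

```python
import math

# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required weights depend on Scope (U = unchanged, C = changed).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place as defined in CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS 3.x vector string."""
    # Skip the "CVSS:3.x" prefix and map each metric to its value.
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    scope = metrics["S"]

    c, i, a = (WEIGHTS[m][metrics[m]] for m in ("C", "I", "A"))
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15

    exploitability = (
        8.22
        * WEIGHTS["AV"][metrics["AV"]]
        * WEIGHTS["AC"][metrics["AC"]]
        * PR_WEIGHTS[scope][metrics["PR"]]
        * WEIGHTS["UI"][metrics["UI"]]
    )

    if impact <= 0:
        return 0.0
    if scope == "U":
        return roundup(min(impact + exploitability, 10))
    return roundup(min(1.08 * (impact + exploitability), 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N"))
```

For this vector the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is about 3.89. Their sum of roughly 7.48 rounds up to 7.5.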

Timeline

Published
October 6, 2025
Last Modified
October 8, 2025
First Seen
October 6, 2025

Related Vulnerabilities