CVE-2025-6985: langchain-text-splitters: XXE enables arbitrary file read
GHSA-m42m-m8cr-8m58 HIGH PoC AVAILABLE CISA: TRACK*
Upgrade langchain-text-splitters to 0.3.9 immediately. Any deployment that passes user-supplied or external XSLT to HTMLSectionSplitter is exposed, and exploitation requires no authentication. The flaw gives an attacker a direct path to reading SSH keys, API credentials, and .env files that your LangChain process can access. If you cannot patch now, remove custom XSLT input at the application layer and restrict the process's filesystem access.
What is the risk?
High risk for organizations running LangChain-based pipelines that accept external or user-controlled XSLT input. CVSS 7.5 with network vector, no privileges, and no user interaction makes this trivially exploitable against exposed endpoints. EPSS of 0.00235 suggests no active widespread exploitation yet, but the attack surface is broad given LangChain's adoption in production AI systems. The primary risk is credential and secrets exfiltration rather than code execution, with cloud metadata endpoints (AWS IMDS, GCP) representing a critical secondary exposure.
What systems are affected?
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| langchain-text-splitters | pip | < 0.3.9 | 0.3.9 |
Do you use HTMLSectionSplitter from langchain-text-splitters below 0.3.9? You're affected.
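One way to check an environment against the range in the table is to read the installed version and compare it to 0.3.9. The following is a minimal sketch using only the Python standard library. The helper names (`parse_version`, `is_vulnerable`, `installed_status`) are illustrative, not part of any LangChain API, and the simple numeric parser treats a pre-release such as 0.3.9rc1 as 0.3.9.

```python
import re
from importlib import metadata

# First patched release of langchain-text-splitters, per the table above.
FIXED_VERSION = (0, 3, 9)


def parse_version(text):
    """Turn '0.3.8' into (0, 3, 8), keeping only the leading digits of each part."""
    numbers = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def is_vulnerable(version_text):
    """True when the version falls in the vulnerable range (< 0.3.9)."""
    return parse_version(version_text) < FIXED_VERSION


def installed_status(package="langchain-text-splitters"):
    """Describe the installed package, or report that it is absent."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return "not installed"
    verdict = "VULNERABLE" if is_vulnerable(version) else "patched"
    return f"{version}: {verdict}"


if __name__ == "__main__":
    print(installed_status())
```

Run it inside each virtual environment or container image that hosts a LangChain service, since the installed version can differ between them.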
What should I do?
6 steps
1. PATCH: Upgrade langchain-text-splitters to >= 0.3.9 immediately.
2. WORKAROUND: Until you have patched, disable or restrict custom XSLT input paths in your application.
3. LEAST PRIVILEGE: Run LangChain processes with restricted filesystem access: no access to ~/.ssh/, .env files, or instance metadata paths.
4. NETWORK CONTROL: Block outbound HTTP/HTTPS from LangChain processes to the cloud metadata address 169.254.169.254.
5. DETECTION: Monitor for unusual file access from LangChain processes, especially to /etc/passwd, ~/.ssh/, home directories, and metadata endpoints.
6. AUDIT: Inventory all services using HTMLSectionSplitter and check whether their XSLT input originates from user-controlled or external sources.
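For the AUDIT step, a short script can list where HTMLSectionSplitter is constructed in a codebase and whether a custom stylesheet is passed. The sketch below uses only the standard library `ast` module. It assumes the stylesheet is supplied through an `xslt_path` keyword argument or as the second positional argument; verify that against the version you have installed. The function names are hypothetical, and the script cannot tell where the path value originates, so each hit still needs manual review.

```python
import ast
from pathlib import Path


def find_splitter_calls(source, filename="<string>"):
    """Return sorted (line, has_custom_xslt) pairs for HTMLSectionSplitter(...) calls."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both HTMLSectionSplitter(...) and module.HTMLSectionSplitter(...).
        name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
        if name != "HTMLSectionSplitter":
            continue
        # Assumption: xslt_path is a keyword or the second positional argument.
        has_xslt = any(kw.arg == "xslt_path" for kw in node.keywords) or len(node.args) >= 2
        findings.append((node.lineno, has_xslt))
    return sorted(findings)


def audit_tree(root):
    """Print every HTMLSectionSplitter call found under a source directory."""
    for path in Path(root).rglob("*.py"):
        try:
            calls = find_splitter_calls(path.read_text(encoding="utf-8"), str(path))
        except (SyntaxError, ValueError, UnicodeDecodeError, OSError):
            continue  # skip files that cannot be read or parsed
        for line, has_xslt in calls:
            flag = "custom XSLT" if has_xslt else "default XSLT"
            print(f"{path}:{line}: HTMLSectionSplitter ({flag})")


if __name__ == "__main__":
    audit_tree(".")
```

Calls flagged "custom XSLT" are the ones to trace back to their input source first.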
What does CISA's SSVC say?
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
What is CVE-2025-6985?
CVE-2025-6985 is an XML External Entity (XXE) vulnerability in the HTMLSectionSplitter class of langchain-text-splitters. The class parses arbitrary XSLT stylesheets with lxml without hardening, so an attacker who controls the stylesheet can read any file the LangChain process can reach, with no authentication required. Upgrade to 0.3.9; if you cannot patch now, remove custom XSLT input at the application layer and restrict the process's filesystem access.
Is CVE-2025-6985 actively exploited?
There is no indication of widespread active exploitation yet. However, proof-of-concept exploit code is publicly available for CVE-2025-6985, which increases the risk of exploitation.
How to fix CVE-2025-6985?
1. PATCH: Upgrade langchain-text-splitters to >= 0.3.9 immediately.
2. WORKAROUND: Until you have patched, disable or restrict custom XSLT input paths in your application.
3. LEAST PRIVILEGE: Run LangChain processes with restricted filesystem access: no access to ~/.ssh/, .env files, or instance metadata paths.
4. NETWORK CONTROL: Block outbound HTTP/HTTPS from LangChain processes to the cloud metadata address 169.254.169.254.
5. DETECTION: Monitor for unusual file access from LangChain processes, especially to /etc/passwd, ~/.ssh/, home directories, and metadata endpoints.
6. AUDIT: Inventory all services using HTMLSectionSplitter and check whether their XSLT input originates from user-controlled or external sources.
What systems are affected by CVE-2025-6985?
This vulnerability affects the following AI/ML architecture patterns: RAG pipelines, document ingestion pipelines, LLM agent frameworks, data preprocessing pipelines.
What is the CVSS score for CVE-2025-6985?
CVE-2025-6985 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.61%.
What is the AI security impact?
Affected AI Architectures
MITRE ATLAS Techniques
- AML.T0010.001 AI Software
- AML.T0025 Exfiltration via Cyber Means
- AML.T0037 Data from Local System
- AML.T0049 Exploit Public-Facing Application
- AML.T0055 Unsecured Credentials
Compliance Controls Affected
What are the technical details?
Original Advisory
The HTMLSectionSplitter class in langchain-text-splitters version 0.3.8 is vulnerable to XML External Entity (XXE) attacks due to unsafe XSLT parsing. This vulnerability arises because the class allows the use of arbitrary XSLT stylesheets, which are parsed using lxml.etree.parse() and lxml.etree.XSLT() without any hardening measures. In lxml versions up to 4.9.x, external entities are resolved by default, allowing attackers to read arbitrary local files or perform outbound HTTP(S) fetches. In lxml versions 5.0 and above, while entity expansion is disabled, the XSLT document() function can still read any URI unless XSLTAccessControl is applied. This vulnerability allows remote attackers to gain read-only access to any file the LangChain process can reach, including sensitive files such as SSH keys, environment files, source code, or cloud metadata. No authentication, special privileges, or user interaction are required, and the issue is exploitable in default deployments that enable custom XSLT.
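The advisory names two constructs that make a stylesheet dangerous: external entity declarations and the XSLT document() function. As an extra layer on top of the upgrade, an application could pre-screen stylesheets for these constructs before they reach the splitter. The sketch below is a heuristic, not the fix shipped in 0.3.9. The pattern table and `screen_stylesheet` are hypothetical names, and text matching like this can be evaded (for example through unusual encodings), so it should not replace patching or parser hardening.

```python
import re

# Constructs the advisory identifies as the read primitives:
# external entities (declared inside a DOCTYPE) and the XSLT document() function.
SUSPICIOUS_PATTERNS = {
    "DOCTYPE declaration": re.compile(r"<!DOCTYPE", re.IGNORECASE),
    "entity declaration": re.compile(r"<!ENTITY", re.IGNORECASE),
    "document() call": re.compile(r"\bdocument\s*\("),
}


def screen_stylesheet(text):
    """Return the labels of suspicious constructs found in the stylesheet text."""
    return [label for label, pattern in SUSPICIOUS_PATTERNS.items() if pattern.search(text)]


if __name__ == "__main__":
    sample = '<xsl:copy-of select="document(\'/app/.env\')"/>'
    hits = screen_stylesheet(sample)
    print("rejected:" if hits else "accepted", ", ".join(hits))
```

A stylesheet that returns any label would be rejected or routed to manual review. An empty list means only that none of these known constructs were seen.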
Exploitation Scenario
An adversary submits a crafted HTML document containing a malicious embedded XSLT stylesheet to a RAG document ingestion endpoint. The XSLT instructs lxml to read /app/.env (containing LLM API keys and database credentials) or the AWS IMDS endpoint at http://169.254.169.254/latest/meta-data/iam/security-credentials/ to retrieve temporary IAM credentials. File contents surface in parsed output or error responses, enabling full credential exfiltration with no authentication, privileges, or user interaction. In batch pipelines, the attack can be embedded in a document uploaded to a monitored S3 bucket or shared drive, triggering passively on the next ingestion run.
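The two targets in this scenario, a local file path and the link-local metadata address, can be recognized mechanically when reviewing URIs found in stylesheets, request logs, or egress logs. The sketch below is one possible classifier built on the standard library. `is_sensitive_uri` is a hypothetical helper. It treats any scheme-less or `file:` URI as a local read and any literal link-local IP (169.254.0.0/16 for IPv4) as a metadata-style target. It does not resolve DNS names, so a hostname that points at a metadata address is not caught here.

```python
import ipaddress
from urllib.parse import urlsplit


def is_sensitive_uri(uri):
    """Flag URIs that read the local filesystem or target a link-local address."""
    parts = urlsplit(uri)
    if parts.scheme in ("", "file"):
        return True  # bare path or file: URI, i.e. a local file read
    host = parts.hostname or ""
    try:
        # 169.254.169.254 (cloud instance metadata) falls in the link-local range.
        return ipaddress.ip_address(host).is_link_local
    except ValueError:
        return False  # host is a DNS name or empty; resolution is out of scope


if __name__ == "__main__":
    for candidate in (
        "/app/.env",
        "http://169.254.169.254/latest/meta-data/iam/security-credentials/",
        "https://example.com/style.xsl",
    ):
        print(candidate, "->", is_sensitive_uri(candidate))
```

This kind of check fits detection and log review; blocking at the network layer remains the stronger control.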
Weaknesses (CWE)
CWE-611 Improper Restriction of XML External Entity Reference (Primary)
The product processes an XML document that can contain XML entities with URIs that resolve to documents outside of the intended sphere of control, causing the product to embed incorrect documents into its output.
- [Implementation, System Configuration] Many XML parsers and validators can be configured to disable external entity expansion.
Source: MITRE CWE corpus.
CVSS Vector
CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
References
- github.com/advisories/GHSA-m42m-m8cr-8m58
- github.com/langchain-ai/langchain/commit/43eef435505a1c907227b724c0c760ad5fc01790
- github.com/langchain-ai/langchain/pull/31819
- nvd.nist.gov/vuln/detail/CVE-2025-6985
- huntr.com/bounties/cf78abbb-df3b-43de-b6ee-132b73ff8331
- github.com/ARPSyndicate/cve-scores Exploit
- github.com/fkie-cad/nvd-json-data-feeds Exploit
Related Vulnerabilities
- CVE-2025-2828 (10.0) LangChain RequestsToolkit: SSRF exposes cloud metadata. Same package: langchain
- CVE-2023-34540 (9.8) LangChain: RCE via JiraAPIWrapper crafted input. Same package: langchain
- CVE-2023-29374 (9.8) LangChain: RCE via prompt injection in LLMMathChain. Same package: langchain
- CVE-2023-34541 (9.8) LangChain: RCE via unsafe load_prompt deserialization. Same package: langchain
- CVE-2023-36258 (9.8) LangChain: unauthenticated RCE via code injection. Same package: langchain