CVE-2024-2965: langchain-community: DoS via recursive sitemap loop

GHSA-3hjh-jh2h-vrg6 MEDIUM CISA: TRACK*
Published June 6, 2024
CISO Take

If your LangChain-based application uses SitemapLoader with user-controlled URLs, an attacker can crash the Python process by submitting a self-referencing sitemap — taking the service offline. Upgrade langchain-community to 0.2.5 or later. Low urgency for environments where sitemap URLs are hardcoded or admin-controlled; high urgency for multi-tenant or user-driven data ingestion pipelines.

What is the risk?

Low-to-medium real-world risk. CVSS 4.2 with AV:P reflects conservative NVD scoring, but any deployment accepting user-supplied sitemap URLs is effectively remotely exploitable. EPSS of 0.00038 and absence from CISA KEV confirm no observed in-the-wild exploitation. Primary impact is availability: a single malicious input can crash the entire AI service process. Risk multiplies in multi-tenant architectures where one user can affect all others.

What systems are affected?

Package              Ecosystem  Vulnerable Range  Patched
LangChain            pip        >= 0, < 0.2.5     0.2.5
LangChain Community  pip        < 0.2.5           0.2.5

LangChain: 139.8K, OpenSSF 5.9, 2.7K dependents, pushed 3d ago, 24% patched, ~156d to patch
LangChain Community: 139.8K, OpenSSF 5.9, 1.2K dependents, pushed 3d ago, 57% patched, ~48d to patch

How severe is it?

CVSS 3.1: 4.2 / 10
EPSS: 0.3% chance of exploitation in 30 days (higher than 22% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Trivial
Exploitation Confidence: medium
CISA SSVC: Public PoC
Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

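Attack Vector (AV): Physical
Attack Complexity (AC): High
Privileges Required (PR): None
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High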

What should I do?

  1. Patch: Upgrade langchain and langchain-community to >= 0.2.5. This is the only complete fix.
  2. Audit: Grep the codebase for SitemapLoader usage and determine whether URLs are user-controlled or hardcoded.
  3. Workaround (if patching is delayed): Wrap parse_sitemap calls in try/except RecursionError with circuit-breaker logic, or enforce a URL allowlist for permitted sitemaps.
  4. Detect: Alert on RecursionError exceptions in AI service logs. Unexpected occurrences may indicate active exploitation attempts.
  5. Dependency scanning: Add langchain-community to SCA tooling with a fail-build rule for versions < 0.2.5.
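The allowlist and try/except workaround can be sketched roughly as follows. This is a minimal illustration and no substitute for upgrading: `guarded_load`, `is_allowed_sitemap`, `SitemapRejected`, and the `docs.example.com` allowlist entry are all hypothetical names chosen for this example, and the loader is passed in as a plain callable so the sketch does not depend on any particular LangChain version.

```python
import logging
from typing import Callable, Iterable, TypeVar
from urllib.parse import urlparse

T = TypeVar("T")
logger = logging.getLogger("sitemap_guard")

# Hypothetical allowlist: replace with the hosts your deployment trusts.
ALLOWED_SITEMAP_HOSTS = frozenset({"docs.example.com"})


class SitemapRejected(Exception):
    """Raised when a sitemap URL is not allowlisted or cannot be loaded safely."""


def is_allowed_sitemap(url: str, allowed_hosts: Iterable[str] = ALLOWED_SITEMAP_HOSTS) -> bool:
    """Accept only https URLs whose host is on the allowlist."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in set(allowed_hosts)


def guarded_load(url: str, load: Callable[[str], T]) -> T:
    """Run `load(url)` only for allowlisted URLs and contain RecursionError."""
    if not is_allowed_sitemap(url):
        raise SitemapRejected(f"sitemap host not allowlisted: {url}")
    try:
        return load(url)
    except RecursionError as exc:
        # Log the event so it can feed the detection rule described above.
        logger.warning("RecursionError while loading sitemap %s", url)
        raise SitemapRejected(f"sitemap recursion limit hit: {url}") from exc
```

In an application, the callable might be something like `lambda u: SitemapLoader(web_path=u).load()`. A counter of rejected URLs per tenant could then drive the circuit-breaker logic, which is left out of this sketch.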

What does CISA's SSVC say?

Decision: Track*
Exploitation: poc
Automatable: No
Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

How is it classified?

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act: Article 9 - Risk management system
ISO 42001: Clause 8 - Operation
NIST AI RMF: MANAGE-2.2 - Mechanisms to address identified AI risks
OWASP LLM Top 10: LLM04 - Model Denial of Service; LLM05 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2024-2965?

CVE-2024-2965 is a denial-of-service vulnerability in the SitemapLoader document loader of the langchain-community package, affecting versions below 0.2.5. If your LangChain-based application uses SitemapLoader with user-controlled URLs, an attacker can crash the Python process by submitting a self-referencing sitemap, taking the service offline. Upgrade langchain-community to 0.2.5 or later. Urgency is low where sitemap URLs are hardcoded or admin-controlled, and high for multi-tenant or user-driven data ingestion pipelines.

Is CVE-2024-2965 actively exploited?

No confirmed active exploitation of CVE-2024-2965 has been reported, but organizations should still patch proactively.

How to fix CVE-2024-2965?

  1. Patch: Upgrade langchain and langchain-community to >= 0.2.5. This is the only complete fix.
  2. Audit: Grep the codebase for SitemapLoader usage and determine whether URLs are user-controlled or hardcoded.
  3. Workaround (if patching is delayed): Wrap parse_sitemap calls in try/except RecursionError with circuit-breaker logic, or enforce a URL allowlist for permitted sitemaps.
  4. Detect: Alert on RecursionError exceptions in AI service logs. Unexpected occurrences may indicate active exploitation attempts.
  5. Dependency scanning: Add langchain-community to SCA tooling with a fail-build rule for versions < 0.2.5.
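Where no SCA tool is in place yet, a fail-build rule for versions below 0.2.5 could be approximated with a small check like the one below. It is a sketch under stated assumptions: `release_tuple`, `is_vulnerable`, and `installed_is_vulnerable` are hypothetical helper names, and the version comparison only looks at the leading numeric release segment, so pre-release suffixes such as `rc1` are ignored.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

PATCHED = (0, 2, 5)  # first fixed release according to the advisory


def release_tuple(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric release segment, e.g. '0.2.4.post1' -> (0, 2, 4)."""
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_vulnerable(version: str, patched: Tuple[int, ...] = PATCHED) -> bool:
    """True when `version` sorts strictly below the patched release."""
    release = release_tuple(version)
    width = max(len(release), len(patched))
    # Pad with zeros so that '0.3' compares as (0, 3, 0).
    release += (0,) * (width - len(release))
    patched = tuple(patched) + (0,) * (width - len(patched))
    return release < patched


def installed_is_vulnerable(package: str = "langchain-community") -> Optional[bool]:
    """Check the installed distribution; returns None when it is not installed."""
    try:
        return is_vulnerable(metadata.version(package))
    except metadata.PackageNotFoundError:
        return None
```

A CI step could call `installed_is_vulnerable()` and exit non-zero when it returns `True`. For stricter version semantics, the `packaging` library's `Version` class is a more robust comparison than this regex-based approach.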

What systems are affected by CVE-2024-2965?

This vulnerability affects the following AI/ML architecture patterns: RAG pipelines, agent frameworks, document processing pipelines.

What is the CVSS score for CVE-2024-2965?

CVE-2024-2965 has a CVSS v3.1 base score of 4.2 (MEDIUM). The EPSS exploitation probability is 0.30%.

What is the AI security impact?

Affected AI Architectures

RAG pipelines, agent frameworks, document processing pipelines

MITRE ATLAS Techniques

AML.T0010.001 AI Software
AML.T0029 Denial of AI Service
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 9
ISO 42001: Clause 8
NIST AI RMF: MANAGE-2.2
OWASP LLM Top 10: LLM04, LLM05

What are the technical details?

Original Advisory

Denial of service in `SitemapLoader` Document Loader in the `langchain-community` package, affecting versions below 0.2.5. The `parse_sitemap` method, responsible for parsing sitemaps and extracting URLs, lacks a mechanism to prevent infinite recursion when a sitemap URL refers to the current sitemap itself. This oversight allows for the possibility of an infinite loop, leading to a crash by exceeding the maximum recursion depth in Python. This vulnerability can be exploited to occupy server socket/port resources and crash the Python process, impacting the availability of services relying on this functionality.
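The failure mode described in the advisory can be reproduced with a toy model. The code below is not the library's implementation: `naive_parse`, `fetch`, and the in-memory `FAKE_WEB` dictionary are hypothetical stand-ins, the XML omits the sitemap namespace for brevity, and no network access takes place. It only illustrates how following nested sitemap references without cycle detection exhausts Python's recursion limit.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

SELF_URL = "https://attacker.example/sitemap.xml"

# In-memory stand-in for the network: one sitemap index that points at itself.
FAKE_WEB: Dict[str, str] = {
    SELF_URL: f"<sitemapindex><sitemap><loc>{SELF_URL}</loc></sitemap></sitemapindex>",
}


def fetch(url: str) -> str:
    """Return the fake sitemap body for `url`."""
    return FAKE_WEB[url]


def naive_parse(url: str) -> List[str]:
    """Collect page URLs, recursing into nested sitemaps with no safeguards."""
    root = ET.fromstring(fetch(url))
    pages = [loc.text for loc in root.findall("./url/loc")]
    for loc in root.findall("./sitemap/loc"):
        # No visited set and no depth limit: a self-reference never terminates.
        pages.extend(naive_parse(loc.text))
    return pages


def demo() -> str:
    """Run the naive parser on the self-referencing sitemap and report the outcome."""
    try:
        naive_parse(SELF_URL)
    except RecursionError:
        return "RecursionError"
    return "completed"
```

Here `demo()` catches the exception so the outcome can be inspected. In a service that does not catch it, the same error propagates out of the loader call.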

Exploitation Scenario

An adversary targeting a RAG-powered product that allows users to specify external documentation sources submits a URL pointing to a crafted sitemap.xml that references itself (e.g., <sitemap><loc>https://attacker.com/sitemap.xml</loc></sitemap>). LangChain's SitemapLoader fetches it and calls parse_sitemap recursively with no depth limit or cycle detection. After roughly 1000 nested calls (CPython's default recursion limit), Python raises RecursionError and the process crashes. In a shared SaaS environment, this takes down the service for all tenants simultaneously. Attacker effort is minimal: hosting one self-referencing sitemap and submitting its URL is sufficient.
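The two safeguards missing in this scenario, cycle detection and a depth limit, can be sketched as an iterative traversal. This is an illustration only and not the upstream patch: `safe_parse`, its `max_depth` default of 5, and the `fetch` callable are assumptions made for the example, and the XML again omits the sitemap namespace.

```python
import xml.etree.ElementTree as ET
from collections import deque
from typing import Callable, List, Set


def safe_parse(root_url: str, fetch: Callable[[str], str], max_depth: int = 5) -> List[str]:
    """Collect page URLs from nested sitemaps, visiting each sitemap at most once."""
    seen: Set[str] = {root_url}
    queue = deque([(root_url, 0)])
    pages: List[str] = []
    while queue:
        url, depth = queue.popleft()
        root = ET.fromstring(fetch(url))
        pages.extend(loc.text.strip() for loc in root.findall("./url/loc") if loc.text)
        if depth >= max_depth:
            continue  # depth cap: do not follow nested sitemaps any further
        for loc in root.findall("./sitemap/loc"):
            child = (loc.text or "").strip()
            if child and child not in seen:  # cycle detection
                seen.add(child)
                queue.append((child, depth + 1))
    return pages


# Usage with an in-memory fake web: the index references itself and one child.
INDEX = "https://attacker.example/sitemap.xml"
CHILD = "https://attacker.example/child.xml"
FAKE_WEB = {
    INDEX: f"<sitemapindex><sitemap><loc>{INDEX}</loc></sitemap>"
           f"<sitemap><loc>{CHILD}</loc></sitemap></sitemapindex>",
    CHILD: "<urlset><url><loc>https://attacker.example/a</loc></url>"
           "<url><loc>https://attacker.example/b</loc></url></urlset>",
}
result = safe_parse(INDEX, FAKE_WEB.__getitem__)
```

Using a queue instead of recursion means the traversal cost is bounded by the number of distinct sitemaps rather than by the interpreter's call stack, so a self-reference is simply skipped.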

Weaknesses (CWE)

CWE-400 — Uncontrolled Resource Consumption: The product does not properly control the allocation and maintenance of a limited resource.

  • [Architecture and Design] Design throttling mechanisms into the system architecture. The best protection is to limit the amount of resources that an unauthorized user can cause to be expended. A strong authentication and access control model will help prevent such attacks from occurring in the first place. The login application should be protected against DoS attacks as much as possible. Limiting the database access, perhaps by caching result sets, can help minimize the resources expended. To further limit the potential for a DoS attack, consider tracking the rate of requests received from users and blocking requests that exceed a defined rate threshold.
  • [Architecture and Design] Mitigation of resource exhaustion attacks requires that the target system either: recognizes the attack and denies that user further access for a given amount of time, or uniformly throttles all requests in order to make it more difficult to consume resources more quickly than they can again be freed. The first of these solutions is an issue in itself though, since it may allow attackers to prevent the use of the system by a particular valid user. If the attacker impersonates the valid user, they may be able to prevent the user from accessing the server in question. The second solution is simply difficult to effectively institute -- and even when properly done, it does not provide a full solution. It simply makes the attack require more resources on the part of the attacker.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.0/AV:P/AC:H/PR:N/UI:N/S:U/C:N/I:N/A:H
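As a worked check, the 4.2 base score can be derived from this vector using the published CVSS v3.0 base-score equations for Scope: Unchanged. The sketch below uses hypothetical helper names (`parse_vector`, `roundup`, `base_score`) and deliberately handles only the unchanged-scope case that applies here.

```python
import math
from typing import Dict

# Metric weights from the CVSS v3.0 specification (PR values for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def parse_vector(vector: str) -> Dict[str, str]:
    """Split 'CVSS:3.0/AV:P/...' into {'AV': 'P', ...}, dropping the version prefix."""
    return dict(part.split(":", 1) for part in vector.split("/")[1:])


def roundup(value: float) -> float:
    """Round up to one decimal place, as the v3.0 specification describes."""
    return math.ceil(value * 10) / 10


def base_score(vector: str) -> float:
    """Compute the base score for a Scope: Unchanged vector."""
    m = parse_vector(vector)
    if m["S"] != "U":
        raise ValueError("this sketch only covers Scope: Unchanged")
    w = {name: WEIGHTS[name][m[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])      # 0.56 for this CVE
    impact = 6.42 * iss                                       # 3.5952
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]  # about 0.5226
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score("CVSS:3.0/AV:P/AC:H/PR:N/UI:N/S:U/C:N/I:N/A:H")
```

For this vector the impact term is 3.5952 and the exploitability term is about 0.5226, which sum to roughly 4.118 and round up to 4.2. Under the same equations, changing only AV:P to AV:N would give 5.9, which puts a number on how much the physical attack vector lowers the published score.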

Timeline

Published: June 6, 2024
Last Modified: November 4, 2024
First Seen: March 24, 2026

Related Vulnerabilities