CVE-2024-12704: llama-index: DoS via infinite loop in LangChain LLM
GHSA-j3wr-m6xh-64hg · HIGH · PoC AVAILABLE · CISA: TRACK

Any production service using llama-index with LangChain LLM streaming is vulnerable to a process hang with zero authentication required: an attacker just sends a malformed input. Upgrade llama-index-core to 0.12.6 immediately; if you cannot patch now, disable or gate the streaming endpoint. EPSS is low (0.27%), but the exploit is trivial and the blast radius covers all RAG and agent pipelines using this integration.
Risk Assessment
HIGH severity (CVSS 7.5) with a trivial exploitation path: network-accessible, no privileges, no user interaction required. The impact is pure availability loss — no data exposure or privilege escalation. An EPSS of 0.00271 (0.27%) suggests no observed mass exploitation yet, but the attack primitive (sending a wrong-type input to a streaming endpoint) requires zero AI/ML expertise. Risk is elevated for any team running llama-index in production with LangChain LLM wrappers and public-facing APIs.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| llama-index-core | pip | < 0.12.6 | 0.12.6 |
| llamaindex | pip | — | No patch |
Recommended Action
Six steps:

1. PATCH: Upgrade llama-index-core to >= 0.12.6 (patch commit d1ecfb77). This is the only complete fix.
2. WORKAROUND (if immediate patch is not possible): Replace calls to stream_complete on LangChainLLM instances with synchronous complete; remove streaming endpoints from public exposure.
3. INPUT VALIDATION: Add type-checking middleware to reject malformed inputs before they reach LLM wrappers.
4. CIRCUIT BREAKER: Implement per-request timeouts (e.g., 30s) and process-level watchdogs (e.g., supervisord, Kubernetes liveness probes) to auto-restart hung workers.
5. DETECTION: Monitor for LLM inference worker threads that do not terminate within expected latency windows; alert on CPU spikes correlated with incomplete LLM responses.
6. AUDIT: Inventory all internal services importing llama-index and check the installed version with `pip show llama-index-core | grep Version`.
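The type-checking middleware in step 3 can be as simple as rejecting non-string prompts before they ever reach the LLM wrapper. A minimal sketch; the `validate_prompt` helper and its limits are illustrative, not part of llama-index:

```python
def validate_prompt(payload: object, max_len: int = 8192) -> str:
    """Reject malformed inputs before they reach an LLM wrapper.

    Raises ValueError for anything that is not a non-empty string of
    reasonable length; returns the prompt unchanged otherwise.
    """
    if not isinstance(payload, str):
        raise ValueError(f"prompt must be str, got {type(payload).__name__}")
    if not payload.strip():
        raise ValueError("prompt must not be empty")
    if len(payload) > max_len:
        raise ValueError(f"prompt exceeds {max_len} characters")
    return payload
```

Wire this in front of any route that forwards user input to stream_complete; a wrong-type payload then fails fast with a 4xx instead of hanging a worker.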
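The per-request timeout in step 4 can be enforced at the application layer even before patching. A standard-library-only sketch; `call_llm` stands in for whatever inference function your service wraps (requires Python 3.9+ for `cancel_futures`):

```python
import concurrent.futures


def complete_with_timeout(call_llm, prompt, timeout_s=30.0):
    """Run one LLM call in a worker thread and abandon it past the deadline,
    so a single hung inference cannot pin the request handler forever.

    The abandoned thread may linger until the process is recycled, which
    is why the process-level watchdog in step 4 is still required.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(call_llm, prompt)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        raise TimeoutError(f"LLM call exceeded {timeout_s}s") from None
    finally:
        # Never block waiting on a possibly-hung worker thread.
        pool.shutdown(wait=False, cancel_futures=True)
```

Pair this with a Kubernetes liveness probe or supervisord check so that workers which accumulate abandoned threads are eventually restarted.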
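For the audit in step 6, the version check can also be done programmatically across environments. A sketch using the standard library; the version parser is naive (plain X.Y.Z tags only) and the fixed version comes from this advisory:

```python
from importlib import metadata


def version_tuple(v: str) -> tuple:
    """Naive X.Y.Z parser; sufficient for plain llama-index-core release tags."""
    return tuple(int(part) for part in v.split(".")[:3])


def is_patched(pkg: str = "llama-index-core", fixed: str = "0.12.6") -> bool:
    """True if the installed package is at or above the patched version.

    Raises importlib.metadata.PackageNotFoundError when pkg is absent,
    which for audit purposes is also useful signal.
    """
    return version_tuple(metadata.version(pkg)) >= version_tuple(fixed)
```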
CISA SSVC Assessment
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
What is CVE-2024-12704?
CVE-2024-12704 is a denial-of-service vulnerability in the LangChainLLM class of llama-index (llama-index-core < 0.12.6). A malformed (wrong-type) input can crash the worker thread behind stream_complete before _llm.predict runs, leaving get_response_gen in an infinite loop and hanging the process; no authentication is required. Upgrade llama-index-core to 0.12.6, or disable/gate the streaming endpoint until you can patch.
Is CVE-2024-12704 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2024-12704, increasing the risk of exploitation.
How to fix CVE-2024-12704?
1. PATCH: Upgrade llama-index-core to >= 0.12.6 (patch commit d1ecfb77). This is the only complete fix.
2. WORKAROUND (if immediate patch is not possible): Replace calls to stream_complete on LangChainLLM instances with synchronous complete; remove streaming endpoints from public exposure.
3. INPUT VALIDATION: Add type-checking middleware to reject malformed inputs before they reach LLM wrappers.
4. CIRCUIT BREAKER: Implement per-request timeouts (e.g., 30s) and process-level watchdogs (e.g., supervisord, Kubernetes liveness probes) to auto-restart hung workers.
5. DETECTION: Monitor for LLM inference worker threads that do not terminate within expected latency windows; alert on CPU spikes correlated with incomplete LLM responses.
6. AUDIT: Inventory all internal services importing llama-index and check the installed version with `pip show llama-index-core | grep Version`.
What systems are affected by CVE-2024-12704?
This vulnerability affects the following AI/ML architecture patterns: RAG pipelines, agent frameworks, LLM serving (streaming), document processing pipelines, chatbot backends.
What is the CVSS score for CVE-2024-12704?
CVE-2024-12704 has a CVSS base score of 7.5 (HIGH). The EPSS exploitation probability is 0.27%.
Technical Details
NVD Description
A vulnerability in the LangChainLLM class of the run-llama/llama_index repository, version v0.12.5, allows for a Denial of Service (DoS) attack. The stream_complete method executes the llm using a thread and retrieves the result via the get_response_gen method of the StreamingGeneratorCallbackHandler class. If the thread terminates abnormally before the _llm.predict is executed, there is no exception handling for this case, leading to an infinite loop in the get_response_gen function. This can be triggered by providing an input of an incorrect type, causing the thread to terminate and the process to continue running indefinitely.
Exploitation Scenario
An adversary identifies a public-facing endpoint (chatbot, document Q&A, or RAG API) built on llama-index. They send an HTTP request with a malformed payload — for example, passing an integer or list where the LangChainLLM wrapper expects a string prompt. The LangChainLLM.stream_complete method launches a background thread that crashes before _llm.predict executes. The main thread, waiting in get_response_gen, enters an infinite loop with no exit condition. The worker process hangs indefinitely. The attacker repeats the request to exhaust all available workers, bringing the service down. No authentication, no AI/ML knowledge, and no special tooling required — a single malformed HTTP request is sufficient.
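The failure mode above can be reproduced with a minimal stand-in for the queue-draining pattern. The names (`safe_response_gen`, `dying_worker`, `SENTINEL`) are illustrative, not the actual llama-index internals: the vulnerable consumer loops on `q.get()` unconditionally, while the hardened version below exits once the producer thread has died without delivering a result.

```python
import queue
import threading

SENTINEL = object()  # marks a normal end-of-stream


def safe_response_gen(q, producer):
    """Drain streamed tokens from the producer thread's queue.

    The vulnerable pattern is an unconditional `q.get()` loop: if the
    producer crashes before queueing anything (e.g. on a wrong-type
    prompt), the consumer blocks forever. Polling with a timeout and
    checking thread liveness gives the loop an exit condition.
    """
    while True:
        try:
            token = q.get(timeout=0.1)
        except queue.Empty:
            if not producer.is_alive():
                raise RuntimeError("LLM thread died before producing output")
            continue
        if token is SENTINEL:
            return
        yield token


def dying_worker(q):
    # Stands in for the crashed thread: it exits before queueing a token
    # or the end sentinel, exactly the state that triggers the hang.
    return


if __name__ == "__main__":
    q = queue.Queue()
    t = threading.Thread(target=dying_worker, args=(q,))
    t.start()
    t.join()
    try:
        list(safe_response_gen(q, t))
    except RuntimeError as exc:
        print(f"stream aborted cleanly: {exc}")
```

The upstream fix in 0.12.6 addresses the same gap inside llama-index itself; this sketch only shows why an exit condition tied to the producer's liveness breaks the infinite loop.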
CVSS Vector
CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
Related Vulnerabilities
- CVE-2024-23751 (9.8): LlamaIndex: SQL injection in Text-to-SQL feature (same package: llamaindex)
- CVE-2024-14021 (7.8): llamaindex: Deserialization enables RCE (same package: llamaindex)
- CVE-2024-58339 (7.5): llamaindex: Resource Exhaustion enables DoS (same package: llamaindex)
- CVE-2024-12911 (7.1): llama-index: SQLi+DoS via prompt injection in query engine (same package: llamaindex)
- CVE-2024-4181: llama_index: RCE via eval() in RunGptLLM connector (same package: llamaindex)