CVE-2024-5998: LangChain: RCE via FAISS pickle deserialization
HIGH | PoC AVAILABLE | CISA: TRACK*
Any LangChain deployment loading FAISS vector store indexes from external or shared sources is vulnerable to full host compromise. An attacker who controls a FAISS index file can execute arbitrary OS commands the moment the file is deserialized, a routine operation in RAG pipelines. Patch immediately and treat all FAISS index files as untrusted input requiring integrity verification.
Risk Assessment
High operational risk for AI/ML teams despite the Local attack vector classification. In practice, loading a vector store from S3, a shared drive, or a model registry is standard pipeline behavior — the required 'user interaction' is indistinguishable from normal operations. Pickle-based RCE is a well-understood exploit class requiring minimal attacker skill. Full C/I/A impact means complete host compromise, including exfiltration of API keys, model artifacts, and training data. Exposure is broad given LangChain's dominant market position in enterprise RAG deployments.
Affected Systems
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| langchain | pip | — | No patch |
Do you use langchain? You're affected.
Recommended Action
1. PATCH: Update langchain to a version containing commit 604dfe2d99246b0c09f047c604f0c63eafba31e7 or later; check the installed version with pip show langchain.
2. AUDIT: Grep the codebase for FAISS.deserialize_from_bytes and FAISS.load_local; document every call site and its data source.
3. INPUT CONTROL: Only load FAISS indexes from internally generated, cryptographically signed sources; verify a SHA-256 hash before deserialization.
4. SANDBOX: Run FAISS deserialization in restricted containers with seccomp profiles that block process-spawning syscalls such as execve, which os.system ultimately relies on.
5. DETECT: Alert on unexpected subprocess or shell spawning from Python processes handling vector stores; monitor for outbound connections post-deserialization.
6. WORKAROUND (pre-patch): Replace FAISS.deserialize_from_bytes with safer alternatives such as FAISS.load_local with allow_dangerous_deserialization=False where available, or switch to a non-pickle vector store format.
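The INPUT CONTROL step can be sketched as a digest check that runs before any bytes reach the deserializer. This is a minimal illustration, not LangChain code: the helper names `sha256_hex` and `verify_index_bytes` are hypothetical, and the expected digest is assumed to arrive through a trusted channel (for example, a signed manifest) rather than from the same location as the index file.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the serialized index."""
    return hashlib.sha256(data).hexdigest()


def verify_index_bytes(data: bytes, expected_sha256: str) -> bytes:
    """Return `data` unchanged if its digest matches, otherwise raise.

    `expected_sha256` is assumed to come from a trusted source, never from
    the bucket or directory that holds the index itself.
    """
    actual = sha256_hex(data)
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError("FAISS index digest mismatch; refusing to deserialize")
    return data


# Hypothetical usage: only bytes that pass verification reach the loader.
trusted_blob = b"example serialized index bytes"
pinned_digest = sha256_hex(trusted_blob)  # recorded when the index was built
checked = verify_index_bytes(trusted_blob, pinned_digest)
# A call such as FAISS.deserialize_from_bytes would receive `checked` here.
```

A matching hash only shows that the file is the one that was pinned. It says nothing about whether the pinned file was safe, so the digest should be recorded by the job that generated the index.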
CISA SSVC Assessment
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
What is CVE-2024-5998?
CVE-2024-5998 is a pickle deserialization vulnerability in the FAISS.deserialize_from_bytes function of LangChain. Any deployment that loads FAISS vector store indexes from external or shared sources is exposed to full host compromise: an attacker who controls an index file can execute arbitrary OS commands the moment it is deserialized, a routine operation in RAG pipelines.
Is CVE-2024-5998 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2024-5998, increasing the risk of exploitation.
How to fix CVE-2024-5998?
1. PATCH: Update langchain to a version containing commit 604dfe2d99246b0c09f047c604f0c63eafba31e7 or later; check the installed version with pip show langchain.
2. AUDIT: Grep the codebase for FAISS.deserialize_from_bytes and FAISS.load_local; document every call site and its data source.
3. INPUT CONTROL: Only load FAISS indexes from internally generated, cryptographically signed sources; verify a SHA-256 hash before deserialization.
4. SANDBOX: Run FAISS deserialization in restricted containers with seccomp profiles that block process-spawning syscalls such as execve, which os.system ultimately relies on.
5. DETECT: Alert on unexpected subprocess or shell spawning from Python processes handling vector stores; monitor for outbound connections post-deserialization.
6. WORKAROUND (pre-patch): Replace FAISS.deserialize_from_bytes with safer alternatives such as FAISS.load_local with allow_dangerous_deserialization=False where available, or switch to a non-pickle vector store format.
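For the DETECT step, one in-process option is a CPython audit hook, which sees events such as `os.system`, `subprocess.Popen`, and `pickle.find_class` as they happen. The sketch below is an assumption about how such monitoring might be wired up, not part of the advisory; the names `WATCHED_EVENTS`, `observed`, and `vector_store_audit_hook` are hypothetical, and a real deployment would forward events to its logging or alerting system instead of a list.

```python
import subprocess
import sys

# Audit event names that CPython raises for process spawning and for
# global lookups performed while unpickling.
WATCHED_EVENTS = {"os.system", "subprocess.Popen", "pickle.find_class"}

observed = []  # (event_name, args) tuples recorded so far


def vector_store_audit_hook(event, args):
    """Record watched events; a stricter hook could raise to block them."""
    if event in WATCHED_EVENTS:
        observed.append((event, args))


# Audit hooks cannot be removed once added, so install one at process start.
sys.addaudithook(vector_store_audit_hook)

# Harmless demonstration: starting a child interpreter is recorded.
subprocess.run([sys.executable, "-c", "pass"], check=True)
print([name for name, _ in observed])
```

Audit hooks complement host-level monitoring but do not replace it, since native code running inside the process can bypass them.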
What systems are affected by CVE-2024-5998?
This vulnerability affects the following AI/ML architecture patterns: RAG pipelines, vector databases, agent frameworks, document QA systems, knowledge base systems.
What is the CVSS score for CVE-2024-5998?
CVE-2024-5998 has a CVSS v3.1 base score of 7.8 (HIGH). The EPSS exploitation probability is 0.09%.
Technical Details
NVD Description
A vulnerability in the FAISS.deserialize_from_bytes function of langchain-ai/langchain allows for pickle deserialization of untrusted data. This can lead to the execution of arbitrary commands via the os.system function. The issue affects the latest version of the product.
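The mechanism behind this description can be shown with a deliberately benign payload. In the sketch below, `BenignPayload` stands in for a malicious object and calls `eval("6 * 7")` where an attacker would name `os.system`. `AllowlistUnpickler` follows the "restricting globals" pattern from the Python pickle documentation. Both class names and the allowlist contents are illustrative assumptions; a real FAISS docstore would need its own reviewed allowlist.

```python
import io
import pickle


class BenignPayload:
    """Stand-in for a malicious object; a real attack would name os.system."""

    def __reduce__(self):
        # pickle calls eval("6 * 7") while loading, not while dumping.
        return (eval, ("6 * 7",))


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global not explicitly allowlisted."""

    ALLOWED = {("builtins", "dict"), ("builtins", "list")}  # illustrative only

    def find_class(self, module, name):
        if (module, name) not in self.ALLOWED:
            raise pickle.UnpicklingError(f"blocked global: {module}.{name}")
        return super().find_class(module, name)


blob = pickle.dumps(BenignPayload())

# Plain pickle.loads runs the callable chosen by whoever wrote the bytes.
result = pickle.loads(blob)

# The allowlisting unpickler rejects the same bytes before anything runs.
try:
    AllowlistUnpickler(io.BytesIO(blob)).load()
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Here `result` is the value returned by the embedded call rather than a `BenignPayload` instance, which is the behavior an attacker abuses. The restricted unpickler stops at the global lookup.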
Exploitation Scenario
An attacker identifies a victim organization's RAG pipeline that loads FAISS indexes from an S3 bucket shared with external collaborators or from user-supplied files. Using publicly documented pickle exploit techniques, the attacker crafts a FAISS index file embedding a malicious pickle payload: an object whose __reduce__ method returns os.system together with a reverse shell command. The file is uploaded to the S3 bucket or supplied through an API endpoint that accepts vector store uploads. When the LangChain application calls FAISS.deserialize_from_bytes() or FAISS.load_local() during normal startup or query handling, the payload executes with the application's privileges. From there the attacker can establish persistent access, exfiltrate OpenAI/Anthropic API keys from environment variables, and pivot to internal model infrastructure.
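A file like the one in this scenario can be triaged without executing it, because `pickletools.genops` walks the opcode stream without running any of it. The sketch below flags opcodes that import a global or invoke a callable. The opcode set and the helper name `suspicious_opcodes` are assumptions chosen for illustration.

```python
import pickle
import pickletools

# Opcodes that import a global or invoke a callable during unpickling.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "INST", "OBJ",
    "REDUCE", "NEWOBJ", "NEWOBJ_EX", "BUILD",
}


def suspicious_opcodes(blob: bytes) -> list:
    """List (opcode name, byte offset) pairs without executing the pickle."""
    return [
        (op.name, pos)
        for op, _arg, pos in pickletools.genops(blob)
        if op.name in SUSPICIOUS_OPCODES
    ]


plain = pickle.dumps({"ids": [1, 2, 3]})  # only containers and integers
risky = pickle.dumps(len)                 # references the global builtins.len

print(suspicious_opcodes(plain))
print(suspicious_opcodes(risky))
```

Legitimate pickles of class instances also contain these opcodes, so a hit means the file deserves a closer look (for example, checking which globals it names), not that it is malicious.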
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
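The 7.8 base score can be reproduced from this vector with the CVSS v3.1 base equations. The sketch below covers only Scope: Unchanged vectors, which is all this CVE needs. The metric weights and the Roundup rule follow the CVSS v3.1 specification, while the function names are illustrative.

```python
import math

# CVSS v3.1 metric weights (Privileges Required values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a Scope: Unchanged CVSS v3.1 vector."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only covers Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score("CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H")
print(score)  # 7.8
```

For this vector the impact subscore is about 5.87 and the exploitability subscore about 1.83. Their sum of roughly 7.71 rounds up to 7.8.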
Related Vulnerabilities
| CVE | CVSS | Summary | Relation |
|---|---|---|---|
| CVE-2025-2828 | 10.0 | LangChain RequestsToolkit: SSRF exposes cloud metadata | Same package: langchain |
| CVE-2023-34541 | 9.8 | LangChain: RCE via unsafe load_prompt deserialization | Same package: langchain |
| CVE-2023-29374 | 9.8 | LangChain: RCE via prompt injection in LLMMathChain | Same package: langchain |
| CVE-2023-34540 | 9.8 | LangChain: RCE via JiraAPIWrapper crafted input | Same package: langchain |
| CVE-2023-36258 | 9.8 | LangChain: unauthenticated RCE via code injection | Same package: langchain |