CVE-2024-4181: llama_index: RCE via eval() in RunGptLLM connector

Severity: Unknown | PoC available
Published May 16, 2024
CISO Take

If your team uses llama_index's RunGptLLM class (JinaAI RunGpt integration), upgrade to v0.10.13 or later immediately. A malicious or compromised LLM hosting provider can execute arbitrary code on client machines via unsanitized eval() calls. Patching is necessary but insufficient — also audit which LLM providers you trust with execution-level access to your infrastructure.

What is the risk?

Effectively Critical despite missing CVSS scores. The attack requires an adversary to control or compromise an LLM hosting provider, raising the bar slightly — but that scenario is realistic given supply chain attacks, MITM, or provider breaches. Impact is full RCE on any machine running the vulnerable library. AI/ML pipelines are particularly exposed as they often run with elevated privileges and broad network access to sensitive internal systems and secrets stores.

What systems are affected?

Package | Ecosystem | Vulnerable Range | Patched
LlamaIndex (llama_index) | pip | versions before 0.10.13 (0.9.47 is named in the advisory) | 0.10.13

Do you use LlamaIndex's RunGptLLM class on a version earlier than 0.10.13? Then you're affected.
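A quick way to check an environment is to compare the installed version against the fixed release. The following is a minimal sketch, not an official tool: the distribution name "llama-index" is an assumption (in newer releases the connector may ship in a separately named distribution, so adjust the name to match your environment), and the helper names are hypothetical.

```python
import re
from importlib import metadata
from typing import Optional

FIXED_VERSION = (0, 10, 13)  # fix version stated in the advisory


def parse_release(version: str) -> tuple:
    """Parse the leading numeric release segment, e.g. '0.9.47.post1' -> (0, 9, 47).

    Pre-release suffixes such as 'rc1' are ignored, so treat pre-releases of the
    fixed version with extra care.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_older_than_fix(version: str) -> bool:
    """Return True if the version predates the fixed release."""
    return parse_release(version) < FIXED_VERSION


def installed_version(dist_name: str = "llama-index") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    The default distribution name is an assumption, not taken from the advisory.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("distribution not installed under the assumed name")
    elif is_older_than_fix(found):
        print(f"{found} predates the fixed release; upgrade")
    else:
        print(f"{found} is at or above the fixed release")
```

A version at or above the fix does not prove safety on its own; it only rules out the specific release range named in the advisory.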

How severe is it?

CVSS 3.1: N/A
EPSS: 2.1% chance of exploitation in 30 days (higher than 79% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Moderate
Exploitation Confidence: medium (public PoC indexed in trickest/cve)

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What should I do?

6 steps
  1. Upgrade llama_index to v0.10.13 or later immediately — this is the only complete fix.

  2. Audit all LLM connector classes in your llama_index deployments for eval() or exec() patterns in response handling.

  3. Enumerate which LLM hosting providers your AI pipelines connect to; apply zero-trust principles and validate provider authenticity.

  4. Network-segment AI inference hosts to limit blast radius from RCE — restrict outbound connections from pipeline processes.

  5. Monitor AI pipeline processes for unexpected subprocess spawning, outbound connections, or credential access patterns.

  6. Rotate any secrets (API keys, DB credentials) accessible from environments running the vulnerable version.
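Step 2 (auditing connector code for eval() or exec()) can be partly automated. The sketch below walks Python source with the standard ast module and reports direct calls to those two builtins. It is an illustrative helper with hypothetical function names, not part of llama_index, and it will miss indirect uses such as aliased names or getattr lookups.

```python
import ast
from pathlib import Path

DANGEROUS_CALLS = {"eval", "exec"}


def find_dynamic_exec(source: str, filename: str = "<string>") -> list:
    """Return sorted (line_number, name) pairs for direct eval()/exec() calls."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in DANGEROUS_CALLS
        ):
            hits.append((node.lineno, node.func.id))
    return sorted(hits)


def scan_tree(root: str) -> dict:
    """Scan every .py file under root; map file path -> list of hits."""
    results = {}
    for path in Path(root).rglob("*.py"):
        try:
            hits = find_dynamic_exec(path.read_text(encoding="utf-8"), str(path))
        except (SyntaxError, ValueError, OSError):
            continue  # skip unreadable or unparsable files
        if hits:
            results[str(path)] = hits
    return results


if __name__ == "__main__":
    import sys

    # Usage: pass the directory that holds the installed connector code.
    for file_path, found in scan_tree(sys.argv[1] if len(sys.argv) > 1 else ".").items():
        for line_number, name in found:
            print(f"{file_path}:{line_number}: call to {name}()")
```

Because the scan works on the syntax tree, mentions of eval inside comments or string literals are not reported. Each hit still needs manual review to decide whether the argument can be influenced by a remote provider.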

What does CISA's SSVC say?

Decision: Track
Exploitation: none
Automatable: No
Technical Impact: total

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.2 - AI supply chain management
NIST AI RMF
GOVERN-6.1 - Policies and procedures for AI supply chain risk
OWASP LLM Top 10
LLM02 - Insecure Output Handling
LLM05 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2024-4181?

CVE-2024-4181 is a command injection vulnerability in the RunGptLLM class of the llama_index library, the connector for JinaAI's RunGpt framework. The class passes data returned by the LLM hosting provider to Python's eval() without sanitizing it, so a malicious or compromised provider can execute arbitrary code on client machines. The issue was reported in version 0.9.47 and fixed in version 0.10.13.

Is CVE-2024-4181 actively exploited?

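This page records no confirmed in-the-wild exploitation: CISA's SSVC assessment lists exploitation as "none". However, proof-of-concept exploit code is publicly available for CVE-2024-4181, which increases the risk of exploitation.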

How to fix CVE-2024-4181?

  1. Upgrade llama_index to v0.10.13 or later immediately; this is the only complete fix.
  2. Audit all LLM connector classes in your llama_index deployments for eval() or exec() patterns in response handling.
  3. Enumerate which LLM hosting providers your AI pipelines connect to; apply zero-trust principles and validate provider authenticity.
  4. Network-segment AI inference hosts to limit blast radius from RCE, and restrict outbound connections from pipeline processes.
  5. Monitor AI pipeline processes for unexpected subprocess spawning, outbound connections, or credential access patterns.
  6. Rotate any secrets (API keys, DB credentials) accessible from environments running the vulnerable version.
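For step 5, one in-process option is Python's audit hook mechanism (sys.addaudithook, available since Python 3.8), which can observe subprocess launches and outbound socket connections made by the interpreter. The sketch below only records events in a list; the set of watched events and the helper names are illustrative choices, and a real deployment would more likely forward events to existing logging or EDR tooling.

```python
import sys

# Standard CPython audit event names for process spawning and outbound connections.
WATCHED_EVENTS = {"subprocess.Popen", "os.system", "socket.connect"}

# Collected (event_name, args) pairs; a real monitor would log or alert instead.
observed = []


def _audit_hook(event: str, args: tuple) -> None:
    """Record watched events; all other audit events are ignored."""
    if event in WATCHED_EVENTS:
        observed.append((event, args))


def install_monitor() -> None:
    """Register the hook. Audit hooks cannot be removed once added."""
    sys.addaudithook(_audit_hook)


if __name__ == "__main__":
    import subprocess

    install_monitor()
    # A benign child process, launched only to show that the hook sees it.
    subprocess.run([sys.executable, "-c", "pass"], check=True)
    for event, args in observed:
        print(event, args[0])
```

An audit hook runs inside the monitored process, so code that already has arbitrary execution could tamper with its output. Treat it as a detection aid alongside host-level monitoring, not a replacement for it.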

What systems are affected by CVE-2024-4181?

This vulnerability affects the following AI/ML architecture patterns: agent frameworks, LLM integrations, RAG pipelines, AI application backends, inference pipelines.

What is the CVSS score for CVE-2024-4181?

No CVSS score has been assigned yet.

What is the AI security impact?

Affected AI Architectures

agent frameworks, LLM integrations, RAG pipelines, AI application backends, inference pipelines

MITRE ATLAS Techniques

AML.T0010.001 AI Software
AML.T0011.000 Unsafe AI Artifacts
AML.T0050 Command and Scripting Interpreter

Compliance Controls Affected

EU AI Act: Article 15
ISO 42001: A.6.2
NIST AI RMF: GOVERN-6.1
OWASP LLM Top 10: LLM02, LLM05

What are the technical details?

Original Advisory

A command injection vulnerability exists in the RunGptLLM class of the llama_index library, version 0.9.47, used by the RunGpt framework from JinaAI to connect to Language Learning Models (LLMs). The vulnerability arises from the improper use of the eval function, allowing a malicious or compromised LLM hosting provider to execute arbitrary commands on the client's machine. This issue was fixed in version 0.10.13. The exploitation of this vulnerability could lead to a hosting provider gaining full control over client machines.

Exploitation Scenario

An adversary operates or compromises an LLM hosting provider compatible with the JinaAI RunGpt interface. When a victim's application queries the LLM via llama_index's RunGptLLM connector, the provider returns a crafted response containing malicious Python code. The unpatched library passes this response directly to eval(), executing the payload with the privileges of the AI application process. The attacker achieves RCE sufficient to exfiltrate API keys, database credentials, and model artifacts, or to establish a reverse shell for persistent access — all triggered silently during a routine LLM inference call.
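The difference between evaluating a provider response and parsing it as data can be shown in a few lines. This is a simplified illustration of the weakness class, not the library's actual code: the function names, the "text" field, and the payload are all hypothetical, and the payload is deliberately harmless.

```python
import json


def parse_chunk_unsafe(data: str, scope: dict):
    """Anti-pattern: treats the provider response as Python code."""
    return eval(data, scope)  # anything the provider sends gets executed


def parse_chunk_safe(data: str) -> dict:
    """Treats the provider response strictly as JSON data."""
    parsed = json.loads(data)  # raises json.JSONDecodeError (a ValueError) on non-JSON
    if not isinstance(parsed, dict):
        raise ValueError("expected a JSON object")
    return parsed


if __name__ == "__main__":
    # Harmless stand-in for a malicious response: it mutates caller state.
    side_effects = []
    payload = "side_effects.append('attacker code ran')"

    parse_chunk_unsafe(payload, {"side_effects": side_effects})
    print("after eval:", side_effects)

    try:
        parse_chunk_safe(payload)
    except ValueError as error:
        print("json.loads rejected the payload:", error)

    print("valid response:", parse_chunk_safe('{"text": "hello"}'))
```

A real payload would call into os or subprocess instead of appending to a list. The safe variant never reaches that point because the input is never interpreted as code.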

Weaknesses (CWE)

CWE-94 — Improper Control of Generation of Code ('Code Injection'): The product constructs all or part of a code segment using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the syntax or behavior of the intended code segment.

  • [Architecture and Design] Refactor your program so that you do not have to dynamically generate code.
  • [Architecture and Design] Run your code in a "jail" or similar sandbox environment that enforces strict boundaries between the process and the operating system. This may effectively restrict which code can be executed by your product. Examples include the Unix chroot jail and AppArmor. In general, managed code may provide some protection. This may not be a feasible solution, and it only limits the impact to the operating system; the rest of your application may still be subject to compromise. Be careful to avoid CWE-243 and other weaknesses related to jails.

Source: MITRE CWE corpus.

Timeline

Published: May 16, 2024
Last Modified: October 21, 2025
First Seen: May 16, 2024

Related Vulnerabilities