vLLM's --revision and --code-revision deployment pins fail to propagate to secondary artifacts, including dynamic code modules, GGUF files, image processors, and same-repository side weights. A deployment you believe is locked to an audited revision can therefore silently load behavior-affecting components from a mutable, unreviewed source. With 130 downstream dependents and 46 prior CVEs in this package, vLLM is a high-value supply chain target. The CVSS integrity impact is rated High (I:H), which directly undermines any compliance program that cites model revision pins as provenance evidence for ISO 42001 or EU AI Act audits. No public exploit or active exploitation is known today, but the flaw is exploitable by any adversary with write access to a secondary artifact's default branch. Upgrade to vLLM 0.22.0, which patches all affected code paths via PR #42616; until you have patched, load locally cached, hash-verified model artifacts rather than relying on live Hugging Face Hub resolution.
What is the risk?
Medium overall risk (CVSS 6.5) with disproportionate compliance impact. High attack complexity (AC:H) limits opportunistic exploitation: the adversary needs write access to a Hugging Face repository's default branch or the ability to compromise a secondary artifact source. However, the integrity impact is High (I:H) and the damage is stealthy, because a pinned deployment silently diverges from its audited state without any observable configuration change. For organizations using vLLM revision pins as compliance evidence under ISO 42001 or the EU AI Act, this is a control failure, not merely a technical flaw. With 130 dependents in the blast radius, compromise of any shared secondary model artifact could cascade across downstream deployments without detection.
What systems are affected?
| Package | Ecosystem | Vulnerable Range | Patched |
|---|---|---|---|
| vLLM | pip | < 0.22.0 | 0.22.0 |
If you run a vLLM version below 0.22.0 and rely on --revision or --code-revision pins, you are affected.
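As a rough way to apply the version range in the table above, the following sketch compares an installed vLLM version string against the patched release. The helper names (`parse_release`, `is_vulnerable`, `installed_vllm_version`) are illustrative and not part of vLLM; the parser only looks at the leading numeric components, so pre-release or local suffixes are ignored and should be reviewed by hand.

```python
from importlib import metadata
from typing import Optional, Tuple

# First release that contains the fix, per the table above.
PATCHED = (0, 22, 0)


def parse_release(version: str) -> Tuple[int, ...]:
    """Return the leading numeric components, e.g. '0.21.3.post1' -> (0, 21, 3)."""
    parts = []
    for piece in version.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_vulnerable(version: str) -> bool:
    """True if the version falls in the vulnerable range (< 0.22.0)."""
    release = parse_release(version)
    # Pad so that '0.22' compares equal to '0.22.0'.
    release = release + (0,) * (len(PATCHED) - len(release))
    return release < PATCHED


def installed_vllm_version() -> Optional[str]:
    """Return the installed vLLM version, or None if vLLM is not installed."""
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_vllm_version()
    if found is None:
        print("vLLM is not installed in this environment")
    else:
        status = "VULNERABLE" if is_vulnerable(found) else "patched"
        print(f"vLLM {found}: {status}")
```

Running the same check inside each serving image, rather than only on a build host, gives a more reliable picture of what is actually deployed.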
What should I do?
1. Patch: Upgrade vLLM to 0.22.0 or later immediately. PR #42616 fixes all seven identified code paths across registry.py, gguf_loader.py, roberta.py, kimi_k25.py, kimi_audio.py, and default_loader.py.
2. Audit: Identify all production vLLM deployments using the --revision or --code-revision flags; treat those serving Kimi-Audio, Kimi-K2.5, BGE-M3, or GGUF-format models as highest priority.
3. Workaround (pre-patch): Snapshot and locally cache all model artifacts, including secondary weights, and record their SHA-256 hashes before deployment; point vLLM at the local directory instead of live hub resolution to eliminate unpinned fetches.
4. Compliance: Review existing audit evidence that cites vLLM revision pins. Those records may not accurately represent the full artifact set served in production; flag them for re-attestation after patching.
5. Detection: If your environment supports Hugging Face Hub audit logging, monitor for secondary artifact fetches that reference revisions different from your configured pin.
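The pre-patch workaround can be sketched roughly as follows: download one pinned snapshot, record a SHA-256 manifest of every file, and re-verify that manifest before each service start. `snapshot_download` with `repo_id`, `revision`, and `local_dir` is the `huggingface_hub` call; the function names `build_manifest`, `verify_manifest`, and `snapshot_pinned` are illustrative, and skipping the `.cache` metadata folder is an assumption about how the download directory is laid out.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        # Assumption: download bookkeeping under .cache is not a model artifact.
        if path.is_file() and ".cache" not in rel.parts:
            manifest[rel.as_posix()] = sha256_file(path)
    return manifest


def verify_manifest(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return a list of differences between the directory and the manifest."""
    current = build_manifest(root)
    problems = []
    for name in sorted(set(manifest) | set(current)):
        if name not in current:
            problems.append(f"missing: {name}")
        elif name not in manifest:
            problems.append(f"unexpected: {name}")
        elif current[name] != manifest[name]:
            problems.append(f"changed: {name}")
    return problems


def snapshot_pinned(repo_id: str, revision: str, local_dir: str) -> Path:
    """Download every file of one revision, subfolders included, into local_dir."""
    from huggingface_hub import snapshot_download  # imported lazily; needs network

    return Path(
        snapshot_download(repo_id=repo_id, revision=revision, local_dir=local_dir)
    )
```

Once the manifest verifies cleanly, the local directory path can be passed to vLLM as the model argument. Setting `HF_HUB_OFFLINE=1` in the serving environment is one additional guard against accidental hub fetches, though whether every loader path honors it on an unpatched version is worth confirming in a test environment.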
Frequently Asked Questions
What is CVE-2026-47155?
CVE-2026-47155 is a supply-chain integrity flaw in vLLM versions before 0.22.0. The --revision and --code-revision deployment pins are not propagated to secondary artifacts (dynamic code modules, GGUF files, image processors, and same-repository side weights), so a deployment that appears locked to an audited revision can silently load behavior-affecting components from a mutable, unreviewed revision. It is fixed in vLLM 0.22.0 via PR #42616.
Is CVE-2026-47155 actively exploited?
No confirmed active exploitation of CVE-2026-47155 has been reported, but organizations should still patch proactively.
How to fix CVE-2026-47155?
Upgrade vLLM to 0.22.0 or later; PR #42616 fixes the affected code paths. Until then, serve from locally cached, hash-verified artifacts instead of live Hugging Face Hub resolution, audit every deployment that uses --revision or --code-revision, and flag compliance evidence that cites those pins for re-attestation. The full five-step list appears under "What should I do?" above.
What systems are affected by CVE-2026-47155?
This vulnerability affects the following AI/ML architecture patterns: LLM inference serving, multimodal inference, embeddings and retrieval pipelines, model serving.
What is the CVSS score for CVE-2026-47155?
CVE-2026-47155 has a CVSS v3.1 base score of 6.5 (MEDIUM).
What is the AI security impact?
MITRE ATLAS Techniques
- AML.T0010 AI Supply Chain Compromise
- AML.T0010.001 AI Software
- AML.T0010.003 Model
- AML.T0109 AI Supply Chain Rug Pull
What are the technical details?
Original Advisory
### Summary

vLLM's revision pinning controls do not consistently apply to all artifacts loaded for a model. A deployment that supplies `--revision` or `--code-revision` can still load dynamic code, GGUF files, image processors, retrieval side weights, or same-repository subfolder weights/config from an unpinned/default revision.

This is a supply-chain integrity issue for pinned vLLM deployments. Operators can believe they are serving a reviewed model revision while vLLM resolves behavior-affecting nested or sibling artifacts outside that reviewed revision.

### Details

The expected invariant is:

> When a vLLM operator supplies a model or code revision pin, every code, config, processor, weight file, side weight, and same-repository subfolder artifact loaded as part of that model should resolve under that pin unless vLLM exposes and enforces a separate explicit pin for that artifact.

Current `main` was verified affected at commit `3795d7acf431980e62e738493f437ae2a51549da`.

Affected source boundaries:

- `vllm/model_executor/models/registry.py:1045-1051` and `:1058-1064`
  - `_try_resolve_transformers()` passes `revision=model_config.revision` and `trust_remote_code=model_config.trust_remote_code`, but omits `code_revision=model_config.code_revision` for external `auto_map` dynamic module imports.
- `vllm/model_executor/model_loader/gguf_loader.py:58-60`
  - The direct-file GGUF form `repo/file.gguf` calls `hf_hub_download(repo_id=repo_id, filename=filename)` without passing `revision`.
- `vllm/model_executor/models/roberta.py:203-209`
  - BGE-M3 secondary sparse and ColBERT side weights are declared with `revision=None`.
- `vllm/model_executor/models/kimi_k25.py:111-114`
  - Kimi-K2.5 calls `cached_get_image_processor()` without passing `model_config.revision`.
- `vllm/model_executor/models/kimi_audio.py:92-95`
  - Kimi-Audio loads Whisper config from the `whisper-large-v3` subfolder without a `revision` argument.
- `vllm/model_executor/models/kimi_audio.py:425-430`
  - Kimi-Audio declares same-repository `whisper-large-v3` secondary weights with `revision=None`.
- `vllm/model_executor/model_loader/default_loader.py:287-301`
  - The default loader preserves `model_config.revision` for the primary source, then consumes model-supplied secondary sources as declared.

The strongest example is Kimi-Audio: the primary `moonshotai/Kimi-Audio-7B-Instruct` weights preserve the configured model revision, but the same-repository `whisper-large-v3` audio tower config/weights do not. A pinned Kimi-Audio deployment can therefore load the Whisper subfolder outside the audited revision.

This report does not claim a `trust_remote_code=False` bypass, unauthenticated RCE, or real artifact compromise. The issue is improper propagation of explicit artifact pins across supported loader paths.

### Impact

Affected users are operators who pin vLLM model deployments to a reviewed Hugging Face revision for safety review, provenance, rollback, or reproducibility.

The impact is that the pin does not reliably describe the full set of artifacts vLLM serves. Even when the operator selects an audited revision, vLLM can resolve behavior-affecting secondary artifacts from the repository default branch or another mutable ref. Depending on the model path, the unpinned artifact can be dynamic model code, a GGUF file, an image processor, retrieval side weights, or the same-repository Kimi-Audio Whisper subfolder weights/config.

This breaks the operational guarantee of a pinned deployment: "serve the exact artifact set I reviewed." A later change to an unpinned secondary artifact can alter model behavior without changing the operator's configured revision, making review, rollback, incident response, and audit records unreliable.

### Occurrences

- `vllm/model_executor/models/kimi_k25.py` L111-L114: Kimi-K2.5 loads its image processor with `cached_get_image_processor()` but does not pass `self.ctx.model_config.revision`. The processor can therefore resolve from the default repository revision even when the model deployment is pinned.
- `vllm/model_executor/models/kimi_audio.py` L425-L430: Kimi-Audio declares same-repository `whisper-large-v3` secondary weights with `revision=None`. A pinned Kimi-Audio deployment can therefore load the Whisper audio tower weights from an unpinned/default revision.
- `vllm/model_executor/models/kimi_audio.py` L92-L95: Kimi-Audio loads Whisper config from the same repository's `whisper-large-v3` subfolder without passing the top-level model revision. The config for this behavior-affecting subcomponent can be resolved outside the audited model revision.
- `vllm/model_executor/models/registry.py` L1058-L1064: The later dynamic model-class resolution repeats the same pin-decay pattern: it forwards `revision` and `trust_remote_code`, but omits `code_revision`. This means an operator-provided code pin is not enforced at the dynamic module loader boundary.
- `vllm/model_executor/model_loader/gguf_loader.py` L58-L60: The direct GGUF form `repo/file.gguf` calls `hf_hub_download(repo_id=repo_id, filename=filename)` without passing `model_config.revision`. A deployment that pins the model revision can therefore resolve this GGUF file from the repository default revision.
- `vllm/model_executor/models/registry.py` L1045-L1051: `try_get_class_from_dynamic_module()` is called for external `auto_map` config/model classes with `revision=model_config.revision`, but without forwarding `model_config.code_revision`. When `--code-revision` is set, this dynamic module resolution can still fall back to the default code revision instead of the audited code revision.
- `vllm/model_executor/models/roberta.py` L203-L209: `BgeM3EmbeddingModel` creates same-repository secondary sparse/ColBERT weight sources with `revision=None`. The primary model revision is not propagated to these side weights, so they can be downloaded outside the operator-selected model revision.

### Fixes

This was fixed in: https://github.com/vllm-project/vllm/pull/42616

___

Originally filed via huntr: https://huntr.com/bounties/3f1e24c0-87d2-4f6c-a705-820f380879ac. The vLLM maintainer (Russell Bryant) redirected the report to the private GHSA channel. Offline proof bundle (`vllm_artifact_pin_decay_bundle_verify.py` + `bundle-verification-20260430T143506Z.json`) is available upon request.
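The "pin decay" pattern the advisory describes can be illustrated with a small self-contained model. This is not vLLM's code and not the change made in the fix PR: `ModelConfig`, `WeightSource`, `effective_ref`, and both `secondary_sources_*` functions are hypothetical names, and the assumption that a missing revision resolves to a default branch called `main` is made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption for this sketch: an unset revision falls back to the default branch.
DEFAULT_BRANCH = "main"


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical stand-in for an operator's pinned deployment settings."""
    model: str
    revision: Optional[str] = None


@dataclass(frozen=True)
class WeightSource:
    """Hypothetical description of one secondary artifact to download."""
    repo: str
    subfolder: str
    revision: Optional[str]


def effective_ref(source: WeightSource) -> str:
    """The ref a hub client would resolve for this source."""
    return source.revision or DEFAULT_BRANCH


def secondary_sources_unpinned(cfg: ModelConfig, subfolders: List[str]) -> List[WeightSource]:
    """The flawed pattern: side weights are declared with revision=None."""
    return [WeightSource(cfg.model, sub, None) for sub in subfolders]


def secondary_sources_pinned(cfg: ModelConfig, subfolders: List[str]) -> List[WeightSource]:
    """The intended invariant: side weights inherit the operator's pin."""
    return [WeightSource(cfg.model, sub, cfg.revision) for sub in subfolders]


if __name__ == "__main__":
    cfg = ModelConfig("moonshotai/Kimi-Audio-7B-Instruct", revision="audited-commit")
    for build in (secondary_sources_unpinned, secondary_sources_pinned):
        for src in build(cfg, ["whisper-large-v3"]):
            print(f"{build.__name__}: {src.subfolder} -> {effective_ref(src)}")
```

In the unpinned variant the operator's configured revision never reaches the side-weight declaration, which is why nothing in the deployment configuration changes when the default branch does.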
Exploitation Scenario
An adversary with write access to a Hugging Face repository (via a stolen token, a compromised maintainer account, or a supply chain rug pull setup) modifies the whisper-large-v3 subfolder weights on the default branch of moonshotai/Kimi-Audio-7B-Instruct. An enterprise SOC running a pinned vLLM Kimi-Audio deployment for audio threat analysis restarts its inference service after a routine maintenance window. The primary model weights load correctly from the operator's configured revision hash, but vLLM's audio tower loader (kimi_audio.py L425-L430) fetches the Whisper side weights with revision=None, silently pulling the adversary's modified weights from the default branch. The operator's audit log and compliance evidence reference only the primary revision pin, which masks the secondary artifact substitution entirely. The modified audio tower now produces systematically biased transcriptions or suppresses specific audio signal patterns, with no observable change to the deployment configuration that the operator monitors for drift.
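One inexpensive signal for the substitution described in this scenario is the local Hugging Face cache itself, which stores each resolved commit under `models--<org>--<name>/snapshots/<commit>`. The sketch below lists snapshot directories for a repository that do not match the pinned commit. The function names are illustrative, the check assumes the pin is a full commit hash rather than a branch or tag, and extra snapshots can also be left over from earlier legitimate runs, so a hit is a prompt for investigation rather than proof of tampering.

```python
from pathlib import Path
from typing import List


def repo_cache_dir(cache_root: Path, repo_id: str) -> Path:
    """Directory the hub cache uses for a model repo, e.g. models--org--name."""
    return Path(cache_root) / ("models--" + repo_id.replace("/", "--"))


def unexpected_snapshots(cache_root: Path, repo_id: str, pinned_commit: str) -> List[str]:
    """Return cached snapshot commits for repo_id other than the pinned commit."""
    snapshots = repo_cache_dir(cache_root, repo_id) / "snapshots"
    if not snapshots.is_dir():
        return []
    return sorted(
        entry.name
        for entry in snapshots.iterdir()
        if entry.is_dir() and entry.name != pinned_commit
    )


if __name__ == "__main__":
    # Default cache location when HF_HOME / HF_HUB_CACHE are not overridden.
    cache = Path.home() / ".cache" / "huggingface" / "hub"
    extra = unexpected_snapshots(
        cache, "moonshotai/Kimi-Audio-7B-Instruct", "<your-pinned-commit-hash>"
    )
    for commit in extra:
        print(f"snapshot outside the pin: {commit}")
```

Starting from an empty cache on each deployment makes the result easier to interpret, since any snapshot other than the pinned commit was then necessarily fetched by the current run.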
CVSS Vector
CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:L/I:H/A:N
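To see how this vector yields the 6.5 base score, the following sketch applies the CVSS v3.1 base-score equations and metric weights from the FIRST specification. It handles base metrics only (no temporal or environmental metrics), and the function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score for a vector string."""
    prefix, *metrics = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("only CVSS:3.1 vectors are supported")
    m = dict(item.split(":") for item in metrics)

    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]

    # Impact sub-score.
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    # Exploitability sub-score.
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]

    if impact <= 0:
        return 0.0
    if changed:
        return roundup(min(1.08 * (impact + exploitability), 10))
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:L/I:H/A:N"))  # 6.5
```

For this vector the impact sub-score is about 4.22 and the exploitability sub-score about 2.22, so the sum of roughly 6.44 rounds up to 6.5. The High attack complexity weight (0.44 instead of 0.77) is what keeps the score in the Medium band despite the High integrity impact.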
Related Vulnerabilities
| CVE | CVSS | Description | Relation |
|---|---|---|---|
| CVE-2024-9053 | 9.8 | vllm: RCE via unsafe pickle deserialization in RPC server | Same package: vllm |
| CVE-2024-11041 | 9.8 | vllm: RCE via unsafe pickle deserialization in MessageQueue | Same package: vllm |
| CVE-2026-25960 | 9.8 | vllm: SSRF allows internal network access | Same package: vllm |
| CVE-2025-47277 | 9.8 | vLLM: RCE via exposed TCPStore in distributed inference | Same package: vllm |
| CVE-2025-32444 | 9.8 | vLLM: RCE via pickle deserialization on ZeroMQ | Same package: vllm |