CVE-2026-47155: vLLM: revision pin bypass loads unreviewed artifacts

GHSA-3ww4-5jv9-j5gm MEDIUM
Published June 10, 2026
CISO Take

vLLM's --revision and --code-revision deployment pins fail to propagate to secondary artifacts, including dynamic code modules, GGUF files, image processors, and same-repository side weights. A deployment you believe is locked to an audited revision can therefore silently load behavior-affecting components from a mutable, unreviewed source. With 130 downstream dependents and 46 prior CVEs in this package, vLLM is a high-value supply chain target; the CVSS integrity impact is rated High (I:H), which directly undermines any compliance program that cites model revision pins as provenance evidence for ISO 42001 or EU AI Act audits. No public exploit or active exploitation is known today, but the flaw is exploitable by any adversary with write access to the default branch of a repository that hosts a secondary artifact. Upgrade to vLLM 0.22.0, which patches all affected code paths via PR #42616; until then, serve locally cached, hash-verified model artifacts rather than resolving them live from the Hugging Face Hub.

Sources: NVD, GitHub Advisory, ATLAS

What is the risk?

Medium overall risk (CVSS 6.5) with disproportionate compliance impact. High attack complexity (AC:H) limits opportunistic exploitation: an adversary needs write access to a Hugging Face repository's default branch or the ability to compromise a secondary artifact source. However, the integrity impact is High (I:H) and the damage is stealthy, because a pinned deployment silently diverges from its audited state without any observable configuration change. For organizations that use vLLM revision pins as compliance evidence under ISO 42001 or the EU AI Act, this is a control failure, not merely a technical flaw. With 130 dependents, compromise of a shared secondary model artifact could cascade across downstream deployments without detection.

How does the attack unfold?

  1. Supply Chain Positioning (AML.T0010): Adversary gains write access to a Hugging Face model repository's default branch via compromised credentials or account takeover, targeting a model whose secondary artifacts vLLM loads without revision enforcement.

  2. Artifact Substitution (AML.T0010.003): Adversary modifies secondary artifacts on the default branch (such as Whisper subfolder weights for Kimi-Audio, ColBERT side weights for BGE-M3, or GGUF files) while leaving the primary pinned revision intact to evade detection.

  3. Unpinned Load Triggered (AML.T0010.001): On the next vLLM service restart or new replica spin-up, the pinned deployment fetches primary artifacts from the reviewed revision but silently pulls secondary artifacts from the adversary-modified default branch, because vLLM's loader code paths do not propagate the revision.

  4. Silent Behavioral Compromise (AML.T0109): Modified secondary artifacts alter inference behavior without changing the operator's configured revision pin, leaving audit logs, compliance records, and rollback procedures referencing an artifact set that no longer matches production reality.

What systems are affected?

Package: vLLM
Ecosystem: pip
Vulnerable Range: < 0.22.0
Patched: 0.22.0

Do you run a vLLM version earlier than 0.22.0 and rely on --revision or --code-revision pins? You're affected.
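To check a host against the vulnerable range, a small version test can help. The following is a minimal sketch rather than an official tool: `parse_release`, `is_vulnerable`, and `installed_vllm_status` are hypothetical helper names, and treating pre-releases of 0.22.0 as vulnerable is a conservative assumption of this sketch, not something the advisory states.

```python
from __future__ import annotations

import re
from importlib import metadata

PATCHED = (0, 22, 0)  # first fixed release according to the advisory


def parse_release(version: str) -> tuple[tuple[int, ...], bool]:
    """Return (numeric release tuple, looks_like_prerelease) for a version string."""
    match = re.match(r"\s*v?(\d+(?:\.\d+)*)(.*)", version)
    if not match:
        raise ValueError(f"unrecognized version: {version!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    release = release + (0,) * (3 - len(release))  # pad "1.0" to (1, 0, 0)
    suffix = match.group(2).lower()
    is_pre = bool(re.match(r"[-_.]?(a|b|rc|alpha|beta|dev|pre)", suffix))
    return release, is_pre


def is_vulnerable(version: str) -> bool:
    """True when the version falls in the advisory's '< 0.22.0' range."""
    release, is_pre = parse_release(version)
    if release == PATCHED and is_pre:
        # Assumption: a 0.22.0 pre-release may predate the fix, so flag it.
        return True
    return release < PATCHED


def installed_vllm_status() -> str:
    """Describe the vLLM distribution installed in the current environment."""
    try:
        version = metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return "vllm is not installed in this environment"
    state = "VULNERABLE, upgrade to 0.22.0 or later" if is_vulnerable(version) else "patched"
    return f"vllm {version}: {state}"


if __name__ == "__main__":
    print(installed_vllm_status())
```

Running the script inside each serving image or virtual environment reports whether that environment still needs the upgrade.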

How severe is it?

CVSS 3.1: 6.5 / 10
EPSS: N/A
Exploitation Status: No known exploitation
Sophistication: Moderate

What is the attack surface?

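AV (Attack Vector): Network
AC (Attack Complexity): High
PR (Privileges Required): None
UI (User Interaction): None
S (Scope): Unchanged
C (Confidentiality): Low
I (Integrity): High
A (Availability): None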

What should I do?

  1. Patch: Upgrade vLLM to ≥0.22.0 immediately (PR #42616 fixes all seven identified code paths across registry.py, gguf_loader.py, roberta.py, kimi_k25.py, kimi_audio.py, and default_loader.py).

  2. Audit: Identify all production vLLM deployments using --revision or --code-revision flags; flag those serving Kimi-Audio, Kimi-K2.5, BGE-M3, or GGUF-format models as highest priority.

  3. Workaround (pre-patch): Snapshot and locally cache all model artifacts, including secondary weights, and record their SHA-256 hashes before deployment; point vLLM at that local directory instead of a Hub repository ID so that artifacts are not resolved live from the Hub.

  4. Compliance: Review existing audit evidence that cites vLLM revision pins — those records may not accurately represent the full artifact set served in production; flag for re-attestation after patching.

  5. Detection: If your environment supports HuggingFace Hub audit logging, monitor for secondary artifact fetches that reference revisions different from your configured pin.
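The workaround in step 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not a vetted procedure: `build_manifest`, `verify_manifest`, and `fetch_pinned_snapshot` are hypothetical helper names; the download helper relies on `snapshot_download` from the third-party `huggingface_hub` package; and skipping a `.cache` directory assumes that is where the Hub client keeps its own download metadata inside a local directory.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(model_dir: Path, ignore_dirs: tuple[str, ...] = (".cache",)) -> dict[str, str]:
    """Map every file under model_dir (subfolders included) to its SHA-256."""
    manifest = {}
    for path in sorted(model_dir.rglob("*")):
        rel = path.relative_to(model_dir)
        if not path.is_file() or rel.parts[0] in ignore_dirs:
            continue
        manifest[rel.as_posix()] = sha256_file(path)
    return manifest


def verify_manifest(model_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return human-readable differences between the directory and a stored manifest."""
    current = build_manifest(model_dir)
    problems = []
    for name in sorted(set(manifest) | set(current)):
        if name not in current:
            problems.append(f"missing: {name}")
        elif name not in manifest:
            problems.append(f"unexpected: {name}")
        elif current[name] != manifest[name]:
            problems.append(f"changed: {name}")
    return problems


def fetch_pinned_snapshot(repo_id: str, commit: str, target: Path) -> Path:
    """Download one repository revision into a local directory (needs network access)."""
    from huggingface_hub import snapshot_download  # third-party dependency

    return Path(snapshot_download(repo_id=repo_id, revision=commit, local_dir=str(target)))
```

A snapshot taken at a single commit includes same-repository subfolders such as `whisper-large-v3`, so the manifest covers them too. Artifacts that live in a different repository would need their own snapshot and manifest. Once `verify_manifest` returns an empty list, the local directory path can be given to vLLM as the model in place of the Hub repository ID.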

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act: Article 9 - Risk management system
ISO 42001: 8.4 - AI system lifecycle processes
NIST AI RMF: GOVERN 1.7 - Processes for tracking and managing AI risks are in place
OWASP LLM Top 10: LLM05 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2026-47155?

CVE-2026-47155 is a supply-chain integrity flaw in vLLM versions before 0.22.0. The --revision and --code-revision pins are not propagated to secondary artifacts (dynamic code modules, GGUF files, image processors, and same-repository side weights), so a deployment pinned to an audited revision can still load those components from a mutable, unreviewed revision. The issue is fixed in vLLM 0.22.0 via PR #42616.

Is CVE-2026-47155 actively exploited?

No confirmed active exploitation of CVE-2026-47155 has been reported, but organizations should still patch proactively.

How to fix CVE-2026-47155?

  1. Patch: Upgrade vLLM to ≥0.22.0 immediately (PR #42616 fixes all seven identified code paths across registry.py, gguf_loader.py, roberta.py, kimi_k25.py, kimi_audio.py, and default_loader.py).

  2. Audit: Identify all production vLLM deployments using --revision or --code-revision flags; flag those serving Kimi-Audio, Kimi-K2.5, BGE-M3, or GGUF-format models as highest priority.

  3. Workaround (pre-patch): Snapshot and locally cache all model artifacts, including secondary weights, and record their SHA-256 hashes before deployment; point vLLM at that local directory instead of a Hub repository ID so that artifacts are not resolved live from the Hub.

  4. Compliance: Review existing audit evidence that cites vLLM revision pins; those records may not accurately represent the full artifact set served in production. Flag them for re-attestation after patching.

  5. Detection: If your environment supports Hugging Face Hub audit logging, monitor for secondary artifact fetches that reference revisions different from your configured pin.
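The audit step lends itself to a quick triage of launch commands. The sketch below is illustrative only: `parse_launch` and `triage` are hypothetical helpers, the name fragments used to spot the model families named in the advisory are a heuristic of this sketch, and only the `vllm serve <model>` and `--model <model>` launch forms are handled.

```python
from __future__ import annotations

import shlex

PIN_FLAGS = ("--revision", "--code-revision")
# Heuristic fragments for the model families the advisory calls out (assumption).
HIGH_PRIORITY_MARKERS = ("kimi-audio", "kimi-k2.5", "bge-m3", ".gguf")


def parse_launch(command: str) -> dict:
    """Extract the model argument and any revision pins from a launch command."""
    tokens = shlex.split(command)
    found = {"model": None, "pins": {}}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        flag, sep, value = token.partition("=")
        if flag in PIN_FLAGS or flag == "--model":
            if not sep:  # value is the next token: "--revision abc"
                value = tokens[i + 1] if i + 1 < len(tokens) else ""
                i += 1
            if flag == "--model":
                found["model"] = value
            else:
                found["pins"][flag] = value
        elif token == "serve" and i + 1 < len(tokens) and not tokens[i + 1].startswith("-"):
            found["model"] = tokens[i + 1]  # positional model after "vllm serve"
            i += 1
        i += 1
    return found


def triage(command: str) -> str:
    """Classify a launch command as 'no-pin', 'review', or 'high'."""
    info = parse_launch(command)
    if not info["pins"]:
        return "no-pin"
    model = (info["model"] or "").lower()
    if any(marker in model for marker in HIGH_PRIORITY_MARKERS):
        return "high"
    return "review"


if __name__ == "__main__":
    print(triage("vllm serve moonshotai/Kimi-Audio-7B-Instruct --revision abc123"))
```

Commands classified as `high` correspond to the models the advisory names explicitly. `review` marks any other pinned deployment, which can still be affected through the dynamic-module and default-loader paths.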

What systems are affected by CVE-2026-47155?

This vulnerability affects the following AI/ML architecture patterns: LLM inference serving, multimodal inference, embeddings and retrieval pipelines, model serving.

What is the CVSS score for CVE-2026-47155?

CVE-2026-47155 has a CVSS v3.1 base score of 6.5 (MEDIUM).

What is the AI security impact?

Affected AI Architectures

LLM inference serving, multimodal inference, embeddings and retrieval pipelines, model serving

MITRE ATLAS Techniques

AML.T0010 AI Supply Chain Compromise
AML.T0010.001 AI Software
AML.T0010.003 Model
AML.T0109 AI Supply Chain Rug Pull

Compliance Controls Affected

EU AI Act: Article 9
ISO 42001: 8.4
NIST AI RMF: GOVERN 1.7
OWASP LLM Top 10: LLM05

What are the technical details?

Original Advisory

### Summary

vLLM's revision pinning controls do not consistently apply to all artifacts loaded for a model. A deployment that supplies `--revision` or `--code-revision` can still load dynamic code, GGUF files, image processors, retrieval side weights, or same-repository subfolder weights/config from an unpinned/default revision.

This is a supply-chain integrity issue for pinned vLLM deployments. Operators can believe they are serving a reviewed model revision while vLLM resolves behavior-affecting nested or sibling artifacts outside that reviewed revision.

### Details

The expected invariant is:

> When a vLLM operator supplies a model or code revision pin, every code, config, processor, weight file, side weight, and same-repository subfolder artifact loaded as part of that model should resolve under that pin unless vLLM exposes and enforces a separate explicit pin for that artifact.

Current `main` was verified affected at commit `3795d7acf431980e62e738493f437ae2a51549da`.

Affected source boundaries:

- `vllm/model_executor/models/registry.py:1045-1051` and `:1058-1064`
  - `_try_resolve_transformers()` passes `revision=model_config.revision` and `trust_remote_code=model_config.trust_remote_code`, but omits `code_revision=model_config.code_revision` for external `auto_map` dynamic module imports.
- `vllm/model_executor/model_loader/gguf_loader.py:58-60`
  - The direct-file GGUF form `repo/file.gguf` calls `hf_hub_download(repo_id=repo_id, filename=filename)` without passing `revision`.
- `vllm/model_executor/models/roberta.py:203-209`
  - BGE-M3 secondary sparse and ColBERT side weights are declared with `revision=None`.
- `vllm/model_executor/models/kimi_k25.py:111-114`
  - Kimi-K2.5 calls `cached_get_image_processor()` without passing `model_config.revision`.
- `vllm/model_executor/models/kimi_audio.py:92-95`
  - Kimi-Audio loads Whisper config from the `whisper-large-v3` subfolder without a `revision` argument.
- `vllm/model_executor/models/kimi_audio.py:425-430`
  - Kimi-Audio declares same-repository `whisper-large-v3` secondary weights with `revision=None`.
- `vllm/model_executor/model_loader/default_loader.py:287-301`
  - The default loader preserves `model_config.revision` for the primary source, then consumes model-supplied secondary sources as declared.

The strongest example is Kimi-Audio: the primary `moonshotai/Kimi-Audio-7B-Instruct` weights preserve the configured model revision, but the same-repository `whisper-large-v3` audio tower config/weights do not. A pinned Kimi-Audio deployment can therefore load the Whisper subfolder outside the audited revision.

This report does not claim a `trust_remote_code=False` bypass, unauthenticated RCE, or real artifact compromise. The issue is improper propagation of explicit artifact pins across supported loader paths.

### Impact

Affected users are operators who pin vLLM model deployments to a reviewed Hugging Face revision for safety review, provenance, rollback, or reproducibility.

The impact is that the pin does not reliably describe the full set of artifacts vLLM serves. Even when the operator selects an audited revision, vLLM can resolve behavior-affecting secondary artifacts from the repository default branch or another mutable ref. Depending on the model path, the unpinned artifact can be dynamic model code, a GGUF file, an image processor, retrieval side weights, or the same-repository Kimi-Audio Whisper subfolder weights/config.

This breaks the operational guarantee of a pinned deployment: "serve the exact artifact set I reviewed." A later change to an unpinned secondary artifact can alter model behavior without changing the operator's configured revision, making review, rollback, incident response, and audit records unreliable.

### Occurrences

- `vllm/model_executor/models/kimi_k25.py` L111-L114: Kimi-K2.5 loads its image processor with `cached_get_image_processor()` but does not pass `self.ctx.model_config.revision`. The processor can therefore resolve from the default repository revision even when the model deployment is pinned.
- `vllm/model_executor/models/kimi_audio.py` L425-L430: Kimi-Audio declares same-repository `whisper-large-v3` secondary weights with `revision=None`. A pinned Kimi-Audio deployment can therefore load the Whisper audio tower weights from an unpinned/default revision.
- `vllm/model_executor/models/kimi_audio.py` L92-L95: Kimi-Audio loads Whisper config from the same repository's `whisper-large-v3` subfolder without passing the top-level model revision. The config for this behavior-affecting subcomponent can be resolved outside the audited model revision.
- `vllm/model_executor/models/registry.py` L1058-L1064: The later dynamic model-class resolution repeats the same pin-decay pattern: it forwards `revision` and `trust_remote_code`, but omits `code_revision`. This means an operator-provided code pin is not enforced at the dynamic module loader boundary.
- `vllm/model_executor/model_loader/gguf_loader.py` L58-L60: The direct GGUF form `repo/file.gguf` calls `hf_hub_download(repo_id=repo_id, filename=filename)` without passing `model_config.revision`. A deployment that pins the model revision can therefore resolve this GGUF file from the repository default revision.
- `vllm/model_executor/models/registry.py` L1045-L1051: `try_get_class_from_dynamic_module()` is called for external `auto_map` config/model classes with `revision=model_config.revision`, but without forwarding `model_config.code_revision`. When `--code-revision` is set, this dynamic module resolution can still fall back to the default code revision instead of the audited code revision.
- `vllm/model_executor/models/roberta.py` L203-L209: `BgeM3EmbeddingModel` creates same-repository secondary sparse/ColBERT weight sources with `revision=None`. The primary model revision is not propagated to these side weights, so they can be downloaded outside the operator-selected model revision.

### Fixes

This was fixed in: https://github.com/vllm-project/vllm/pull/42616

___

Originally filed via huntr: https://huntr.com/bounties/3f1e24c0-87d2-4f6c-a705-820f380879ac. The vLLM maintainer (Russell Bryant) redirected the report to the private GHSA channel. Offline proof bundle (`vllm_artifact_pin_decay_bundle_verify.py` + `bundle-verification-20260430T143506Z.json`) is available upon request.
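The invariant described in the advisory (every secondary artifact resolves under the operator's pin) can be illustrated with a small wrapper around `hf_hub_download`. This is a sketch of the general pattern, not the code from the vLLM fix: `download_pinned` is a hypothetical helper, the injectable `downloader` parameter exists only to make the sketch testable offline, and requiring a full 40-character commit hash is a stricter policy assumed here (vLLM itself also accepts branch and tag names as revisions).

```python
from __future__ import annotations

import re
from typing import Callable

COMMIT_HASH = re.compile(r"[0-9a-f]{40}")


def download_pinned(
    repo_id: str,
    filename: str,
    revision: str | None,
    *,
    subfolder: str | None = None,
    downloader: Callable[..., str] | None = None,
) -> str:
    """Fetch one file, refusing to fall back to the repository's default branch."""
    if revision is None or not COMMIT_HASH.fullmatch(revision):
        # Assumed policy: only immutable commit hashes count as a pin.
        raise ValueError(f"refusing unpinned download of {repo_id}/{filename}")
    if downloader is None:
        from huggingface_hub import hf_hub_download  # third-party dependency

        downloader = hf_hub_download
    # The pin is forwarded explicitly for every artifact, including subfolder files.
    return downloader(
        repo_id=repo_id,
        filename=filename,
        subfolder=subfolder,
        revision=revision,
    )
```

Under this pattern a call site that forgets the pin fails loudly instead of silently resolving against a mutable ref, which is the failure mode the occurrences above share.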

Exploitation Scenario

An adversary with write access to a HuggingFace repository (via stolen token, compromised maintainer account, or a Supply Chain Rug Pull setup) modifies the whisper-large-v3 subfolder weights on the default branch of moonshotai/Kimi-Audio-7B-Instruct. An enterprise SOC running a pinned vLLM Kimi-Audio deployment for audio threat analysis restarts their inference service after a routine maintenance window. The primary model weights load correctly from the operator's configured revision hash, but vLLM's audio tower loader (kimi_audio.py L425-L430) fetches the Whisper side weights with revision=None, pulling the adversary's modified weights silently from the default branch. The operator's audit log and compliance evidence reference only the primary revision pin, masking the secondary artifact substitution entirely. The adversary's modified audio tower now produces systematically biased transcriptions or suppresses specific audio signal patterns — with zero observable change to the deployment configuration the operator monitors for drift.
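One way to look for the drift described in this scenario is to inspect the local Hugging Face cache for files stored under a commit other than the pinned one. The sketch below assumes the Hub client's documented cache layout (`models--<org>--<name>/snapshots/<commit>/...`); `files_outside_pin` is a hypothetical helper, and a hit is only a signal, since older snapshots can also remain in the cache from earlier, legitimate pins.

```python
from __future__ import annotations

from pathlib import Path


def repo_cache_dir(cache_root: Path, repo_id: str) -> Path:
    """Locate a model repository inside a Hugging Face hub cache directory."""
    return cache_root / ("models--" + repo_id.replace("/", "--"))


def files_outside_pin(cache_root: Path, repo_id: str, pinned_commit: str) -> dict[str, list[str]]:
    """Map each non-pinned snapshot commit to the files cached under it."""
    snapshots = repo_cache_dir(cache_root, repo_id) / "snapshots"
    drift: dict[str, list[str]] = {}
    if not snapshots.is_dir():
        return drift
    for snapshot in sorted(snapshots.iterdir()):
        if not snapshot.is_dir() or snapshot.name == pinned_commit:
            continue
        files = sorted(
            path.relative_to(snapshot).as_posix()
            for path in snapshot.rglob("*")
            if not path.is_dir()
        )
        if files:
            drift[snapshot.name] = files
    return drift


if __name__ == "__main__":
    # Assumed default cache location; adjust if HF_HOME or HF_HUB_CACHE is set.
    root = Path.home() / ".cache" / "huggingface" / "hub"
    pin = "0" * 40  # placeholder: replace with the audited commit hash
    print(files_outside_pin(root, "moonshotai/Kimi-Audio-7B-Instruct", pin))
```

For the Kimi-Audio case, entries such as `whisper-large-v3/config.json` appearing under a commit that is not the audited one would be worth investigating.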

CVSS Vector

CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:L/I:H/A:N
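The 6.5 base score can be reproduced from this vector with the CVSS v3.1 base-score equations. The following is a compact sketch of those equations; `base_score` and `roundup` are local helper names, and temporal and environmental metrics are not handled.

```python
import math

# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on whether Scope is Unchanged (U) or Changed (C).
PR_WEIGHTS = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, using the spec's integer-based method."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.x base score for a vector string."""
    parts = vector.split("/")
    if not parts[0].startswith("CVSS:3."):
        raise ValueError("expected a CVSS v3.x vector")
    m = dict(part.split(":") for part in parts[1:])
    scope = m["S"]
    iss = 1 - (
        (1 - WEIGHTS["C"][m["C"]])
        * (1 - WEIGHTS["I"][m["I"]])
        * (1 - WEIGHTS["A"][m["A"]])
    )
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = (
        8.22
        * WEIGHTS["AV"][m["AV"]]
        * WEIGHTS["AC"][m["AC"]]
        * PR_WEIGHTS[scope][m["PR"]]
        * WEIGHTS["UI"][m["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:L/I:H/A:N"))  # prints 6.5
```

For this vector the impact sub-score is about 4.22 and the exploitability sub-score about 2.22; their sum of roughly 6.44 rounds up to 6.5. High attack complexity (weight 0.44 rather than 0.77) is the main factor keeping the score in the Medium band.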

Timeline

Published: June 10, 2026
Last Modified: June 10, 2026
First Seen: June 10, 2026
