CVE-2025-46722 — HIGH (CVSS 7.3) AI Security Vulnerability

CISO Take

If your organization runs vLLM for multimodal LLM inference (vision models), upgrade to 0.9.0 immediately. Crafted images can collide in the multimodal cache, causing inference results intended for one user to be silently returned to another — a cross-tenant data leakage risk requiring no authentication. A patch is available and there are no known workarounds short of disabling multimodal caching.

Risk Assessment

High risk for organizations running vLLM in multi-tenant or publicly exposed multimodal inference environments. CVSS 7.3 with a network-accessible, no-auth, no-user-interaction vector means any API client submitting images can potentially trigger collisions. EPSS is currently very low (about 0.2%) and no active exploitation was known at time of publication, but the attack primitive is conceptually straightforward once the bug is understood. Patch availability reduces urgency slightly, but the ease of exploitation and breadth of vLLM deployments warrant prompt action.

Affected Systems

Package	Ecosystem	Vulnerable Range	Patched
vllm	pip	>= 0.7.0, < 0.9.0	`0.9.0`

Severity & Risk

CVSS 3.1: 7.3 / 10

EPSS: 0.2% chance of exploitation in 30 days (higher than 46% of all CVEs). Source: EPSS v3, FIRST.org

Exploitation Status: No known exploitation

Sophistication: Moderate

Attack Surface

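Metric	Value
Attack Vector (AV)	Network
Attack Complexity (AC)	Low
Privileges Required (PR)	None
User Interaction (UI)	None
Scope (S)	Unchanged
Confidentiality (C)	Low
Integrity (I)	Low
Availability (A)	Low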
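The 7.3 base score can be reproduced from the metric values listed above. The following is a minimal sketch of the CVSS v3.1 base-score formula for the Scope: Unchanged case only; the numeric weights come from the CVSS v3.1 specification, and the function names are illustrative rather than part of any library.

```python
import math

# CVSS v3.1 weights for the metric values in this advisory (Scope: Unchanged).
AV_NETWORK = 0.85   # Attack Vector: Network
AC_LOW = 0.77       # Attack Complexity: Low
PR_NONE = 0.85      # Privileges Required: None
UI_NONE = 0.85      # User Interaction: None
IMPACT_LOW = 0.22   # Confidentiality / Integrity / Availability: Low


def roundup(value: float) -> float:
    """Round up to one decimal place, following the CVSS v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a) -> float:
    """CVSS v3.1 base score when Scope is Unchanged."""
    iss = 1 - (1 - c) * (1 - i) * (1 - a)      # impact sub-score
    impact = 6.42 * iss
    exploitability = 8.22 * av * ac * pr * ui
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score_scope_unchanged(
    AV_NETWORK, AC_LOW, PR_NONE, UI_NONE, IMPACT_LOW, IMPACT_LOW, IMPACT_LOW
)
print(score)  # 7.3
```

Most of the score comes from the exploitability term (network reachable, low complexity, no privileges, no interaction); the three Low impact ratings are what keep it below the Critical band.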

Recommended Action


1. Upgrade vLLM to >= 0.9.0 immediately; the patch is in commit 99404f53.
2. Verify the current version with pip show vllm; any 0.7.0–0.8.x release is vulnerable.
3. If an immediate upgrade is blocked, disable multimodal prefix caching as a temporary workaround. The source names a --disable-prefix-caching flag; confirm the exact option name for your version with vllm serve --help.
4. In multi-tenant deployments, implement per-tenant cache namespace isolation at the application layer until patched.
5. Review inference logs for anomalous cache hit rates on multimodal endpoints as a post-hoc indicator of exploitation.
6. If running vLLM behind an API gateway, consider adding request fingerprinting that includes image dimensions as a compensating control.
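The version check in the steps above can be automated across a fleet. This is a minimal sketch that assumes vLLM was installed as the `vllm` distribution and that comparing the leading numeric release segment is sufficient; the helper names are hypothetical. It does not distinguish pre-releases, so a build such as `0.9.0rc1` is treated as `0.9.0` and should be checked by hand.

```python
import re
from importlib import metadata

VULNERABLE_MIN = (0, 7, 0)  # first affected release, per the advisory
PATCHED = (0, 9, 0)         # first fixed release, per the advisory


def parse_release(version: str) -> tuple:
    """Extract the leading numeric release, e.g. '0.8.5.post1' -> (0, 8, 5)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_vulnerable(version: str) -> bool:
    """True if the version falls in the affected range >= 0.7.0, < 0.9.0."""
    return VULNERABLE_MIN <= parse_release(version) < PATCHED


def installed_vllm_is_vulnerable():
    """Check the installed vllm package; returns None if vllm is not installed."""
    try:
        return is_vulnerable(metadata.version("vllm"))
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_vllm_is_vulnerable()` inside each serving environment returns `True`, `False`, or `None`, which is easy to aggregate in an inventory script.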

CISA SSVC Assessment

Decision: Track

Exploitation: none

Automatable: No

Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Classification

Data Leakage
Privacy Violation
Inference Framework
AML.T0031 - Erode AI Model Integrity
AML.T0040 - AI Model Inference API Access
AML.T0049 - Exploit Public-Facing Application
AML.T0057 - LLM Data Leakage

Compliance Impact

This CVE is relevant to:

EU AI Act: Article 15 - Accuracy, Robustness and Cybersecurity

ISO 42001: 8.4 - AI System Operation: Data Integrity

NIST AI RMF: MANAGE-2.2 - Risk Treatment and Residual Risk

OWASP LLM Top 10: LLM02:2025 - Sensitive Information Disclosure

Frequently Asked Questions

What is CVE-2025-46722?

CVE-2025-46722 is a hash-collision flaw in vLLM's multimodal image hashing (the MultiModalHasher class in vllm/multimodal/hasher.py), affecting versions from 0.7.0 up to but not including 0.9.0. Images are hashed from their raw pixel bytes alone, so two images with different dimensions but the same byte sequence share a cache key. In a shared deployment this can cause incorrect cache hits, with results intended for one user silently returned to another, and it requires no authentication. The issue is fixed in 0.9.0.

Is CVE-2025-46722 actively exploited?

No confirmed active exploitation of CVE-2025-46722 has been reported, but organizations should still patch proactively.

How to fix CVE-2025-46722?

1. Upgrade vLLM to >= 0.9.0 immediately; the patch is in commit 99404f53.
2. Verify the current version with `pip show vllm`; any 0.7.0–0.8.x release is vulnerable.
3. If an immediate upgrade is blocked, disable multimodal prefix caching as a temporary workaround. The source names a --disable-prefix-caching flag; confirm the exact option name for your version with vllm serve --help.
4. In multi-tenant deployments, implement per-tenant cache namespace isolation at the application layer until patched.
5. Review inference logs for anomalous cache hit rates on multimodal endpoints as a post-hoc indicator of exploitation.
6. If running vLLM behind an API gateway, consider adding request fingerprinting that includes image dimensions as a compensating control.
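Steps 4 and 6 both come down to deriving a key that covers more than the pixel bytes. The sketch below shows one way an application-layer cache or gateway might do that; `tenant_scoped_key` and its parameters are hypothetical, it is not part of vLLM, and it does not change how vLLM's internal cache behaves.

```python
import hashlib


def tenant_scoped_key(tenant_id: str, width: int, height: int,
                      mode: str, pixels: bytes) -> str:
    """Derive a cache key that binds tenant, image shape, and pixel data.

    Hypothetical helper for an application-layer cache or gateway.
    """
    digest = hashlib.sha256()
    fields = (
        tenant_id.encode("utf-8"),
        f"{width}x{height}".encode("ascii"),
        mode.encode("ascii"),
        pixels,
    )
    for field in fields:
        # Length-prefix every field so bytes cannot slide between fields
        # and produce the same concatenated input.
        digest.update(len(field).to_bytes(8, "big"))
        digest.update(field)
    return digest.hexdigest()


pixels = bytes(3000)  # 3000 zero bytes, standing in for grayscale pixel data
key_a = tenant_scoped_key("tenant-a", 30, 100, "L", pixels)
key_b = tenant_scoped_key("tenant-b", 30, 100, "L", pixels)
key_a_swapped = tenant_scoped_key("tenant-a", 100, 30, "L", pixels)
print(key_a != key_b, key_a != key_a_swapped)  # True True
```

Including the tenant identifier means that even an undiscovered collision in the image fields cannot cross a tenant boundary, at the cost of losing cache sharing between tenants.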

What systems are affected by CVE-2025-46722?

The vulnerability affects vLLM versions >= 0.7.0 and < 0.9.0 (pip package vllm). The exposed AI/ML architecture patterns are: multimodal inference serving, vision-language model deployments, multi-tenant LLM APIs, and LLM serving infrastructure.

What is the CVSS score for CVE-2025-46722?

CVE-2025-46722 has a CVSS v3.1 base score of 7.3 (HIGH). The EPSS exploitation probability is 0.23%.

Technical Details

NVD Description

vLLM is an inference and serving engine for large language models (LLMs). In versions starting from 0.7.0 to before 0.9.0, in the file vllm/multimodal/hasher.py, the MultiModalHasher class has a security and data integrity issue in its image hashing method. Currently, it serializes PIL.Image.Image objects using only obj.tobytes(), which returns only the raw pixel data, without including metadata such as the image’s shape (width, height, mode). As a result, two images of different sizes (e.g., 30x100 and 100x30) with the same pixel byte sequence could generate the same hash value. This may lead to hash collisions, incorrect cache hits, and even data leakage or security risks. This issue has been patched in version 0.9.0.
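The collision described above can be illustrated without vLLM or PIL. This is a minimal sketch that models an image as a (width, height, mode, raw bytes) tuple and uses SHA-256 as a stand-in hash; the function names are hypothetical and neither function is vLLM's actual implementation or its actual fix.

```python
import hashlib


def bytes_only_hash(image) -> str:
    """Models the flawed approach: hash the raw pixel bytes and nothing else."""
    _width, _height, _mode, pixels = image
    return hashlib.sha256(pixels).hexdigest()


def shape_aware_hash(image) -> str:
    """Models the general remedy: include mode and dimensions in the hashed data."""
    width, height, mode, pixels = image
    header = f"{mode}:{width}x{height}:".encode("ascii")
    return hashlib.sha256(header + pixels).hexdigest()


# 3000 single-byte grayscale pixels, interpreted under two different shapes.
pixels = bytes(i % 256 for i in range(30 * 100))
portrait = (30, 100, "L", pixels)
landscape = (100, 30, "L", pixels)

print(bytes_only_hash(portrait) == bytes_only_hash(landscape))    # True
print(shape_aware_hash(portrait) == shape_aware_hash(landscape))  # False
```

The two tuples describe visually different images, yet the bytes-only hash cannot tell them apart because the shape never enters the hashed data.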

Exploitation Scenario

An adversary with access to a multi-tenant vLLM multimodal API submits a 100x30 image with specific pixel data. If a prior user's 30x100 image produced identical raw bytes, the attacker receives that cached inference result, potentially revealing the prior user's prompt context, processed image interpretation, or model output. A collision requires the same byte sequence under different dimensions, which means a reinterpreted pixel buffer rather than a rotated picture, so the attacker must know or guess the content of the victim's image. In a production API serving multiple tenants, the attacker systematically probes with dimension-swapped versions of common images (the pixel bytes of known content declared as portrait instead of landscape, or the reverse), fishing for hash collisions that surface other tenants' inference results. No authentication bypass, elevated privileges, or specialized AI knowledge is required, only knowledge of the bug and basic image manipulation tooling.