CVE-2026-10813: LMCache: weak hash enables KV cache integrity bypass

LOW
Published June 4, 2026
CISO Take

LMCache up to v0.4.6 uses a cryptographically weak hashing algorithm (CWE-327/CWE-328) in the hex_hash_to_int16 function of its KV Cache Handler, which routes and deduplicates cached key-value pairs for vLLM inference. An attacker with local access and low privileges could craft inputs that produce hash collisions, causing the cache to serve incorrect inference results or disrupting cache availability. In a multi-tenant inference deployment, this could result in one user's query receiving another user's cached response fragment. With a CVSS v3.1 base score of 3.6 (Low), high attack complexity, and a local-only attack vector, immediate risk is limited, but shared vLLM inference infrastructure that implicitly trusts KV cache isolation warrants review. Teams using LMCache with vLLM should track PR #2932 for the pending upstream fix and confirm that local access controls on inference hosts are hardened.

Sources: NVD, ATLAS

What is the risk?

Overall risk is low: CVSS 3.6, local attack vector only, high attack complexity, and no scored confidentiality impact (C:N). However, in production multi-tenant LLM inference environments such as shared enterprise API gateways and inference clusters, hash collisions carry outsized integrity risk because cached KV state implicitly encodes prior conversation context. A proof-of-concept exploit has been published (CVSS E:P), and a patch PR exists but has not yet been merged, so deployments remain exposed until it lands. The attacker population is limited to those with existing local access and sufficient knowledge of LMCache internals.
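To put the collision risk in numbers, the sketch below computes the birthday probability for a 16-bit digest space. The 16-bit width is an assumption inferred from the function name hex_hash_to_int16; the advisory does not state the output size, and the helper names here are illustrative only.

```python
HASH_BITS = 16            # assumption: inferred from the name hex_hash_to_int16
SPACE = 2 ** HASH_BITS    # 65,536 possible digest values under that assumption


def accidental_collision_probability(n_entries: int, space: int = SPACE) -> float:
    """Exact birthday probability that at least two of n uniformly
    distributed digests share the same value."""
    if n_entries > space:
        return 1.0  # pigeonhole: more entries than slots guarantees a collision
    p_all_unique = 1.0
    for i in range(n_entries):
        p_all_unique *= (space - i) / space
    return 1.0 - p_all_unique


def expected_targeted_attempts(space: int = SPACE) -> int:
    """Mean number of random trials needed to match one specific digest."""
    return space


if __name__ == "__main__":
    for n in (100, 302, 1000):
        print(n, round(accidental_collision_probability(n), 3))
    print("targeted attempts (mean):", expected_targeted_attempts())
```

Under this assumption, a few hundred cached entries are enough for an accidental collision to be more likely than not, and a targeted match costs on the order of tens of thousands of attempts. That is small for a hash that gates cache lookups, even though the surrounding attack conditions remain difficult.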

How does the attack unfold?

1. Local Access (AML.T0012)
   Attacker obtains low-privilege local access to the inference host running LMCache, such as a shared account on an enterprise vLLM inference cluster.
2. Hash Collision Crafting (AML.T0043)
   Attacker engineers input sequences that produce hex_hash_to_int16 collisions with target users' existing KV cache entries, exploiting the weak hash space.
3. Cache Integrity Compromise (AML.T0031)
   Colliding inputs cause LMCache to retrieve and return incorrect or cross-user KV cache entries, corrupting inference context for the targeted request.
4. Impact: Context Leakage or Availability Loss (AML.T0029)
   Attacker receives fragments of another user's cached inference context, or sustained collision flooding degrades cache availability and inference throughput for legitimate users.
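The collision-crafting step can be illustrated with a generic truncated digest. This is a minimal sketch, not LMCache's code: weak_digest16 is a hypothetical stand-in that keeps the low 16 bits of a SHA-256 hex digest, and the real hex_hash_to_int16 may behave differently.

```python
import hashlib
from itertools import count


def weak_digest16(data: bytes) -> int:
    """Hypothetical stand-in: keep only the low 16 bits of a SHA-256 hex digest."""
    return int(hashlib.sha256(data).hexdigest(), 16) & 0xFFFF


def find_colliding_input(target: bytes, prefix: bytes = b"probe-") -> tuple[bytes, int]:
    """Brute-force an input, different from target, with the same 16-bit digest.

    Returns the colliding input and the number of candidates tried.
    """
    wanted = weak_digest16(target)
    for attempts in count(1):
        candidate = prefix + str(attempts).encode()
        if candidate != target and weak_digest16(candidate) == wanted:
            return candidate, attempts


if __name__ == "__main__":
    victim = b"victim prompt tokens"
    collision, tries = find_colliding_input(victim)
    print(f"collision after {tries} tries: {collision!r}")
```

The underlying hash being strong does not help once the output is truncated: the search cost is set by the 16 retained bits, not by SHA-256.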

What systems are affected?

Package: LMCache
Ecosystem: not listed
Vulnerable Range: up to 0.4.6 (per the advisory text)
Patched: No patch

How severe is it?

CVSS 3.1
3.6 / 10
EPSS
N/A
Exploitation Status
No known exploitation
Sophistication
Moderate

What is the attack surface?

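Attack Vector (AV): Local
Attack Complexity (AC): High
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): Low
Availability (A): Low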

What should I do?

4 steps
  1. Track and apply PR #2932 (https://github.com/LMCache/LMCache/pull/2932) once merged upstream.
  2. As an interim workaround, disable KV cache sharing between users in multi-tenant deployments or enforce per-user cache namespacing if the configuration supports it.
  3. Audit lmcache/integration/vllm/utils.py for all calls to hex_hash_to_int16 and evaluate whether a local substitution with a collision-resistant hash (e.g., SHA-256 truncated or SipHash) can be applied as a hotfix pending the upstream merge.
  4. Restrict local user privileges on inference hosts to shrink the attacker population eligible to exploit the local attack vector.
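The namespacing and hash-substitution mitigations can be combined in one key-derivation helper. The sketch below is an assumption-laden illustration, not the upstream fix: namespaced_chunk_key, its parameters, and the idea that a cache key is derived from a tenant ID plus token IDs are all hypothetical. It swaps in HMAC-SHA-256 from the Python standard library as the keyed, collision-resistant hash.

```python
import hashlib
import hmac
import struct
from typing import Sequence


def namespaced_chunk_key(tenant_id: str, token_ids: Sequence[int], secret: bytes) -> str:
    """Derive a 256-bit keyed digest bound to one tenant (hypothetical helper).

    secret is a server-side key that local users cannot read; without it an
    attacker cannot compute digests offline to search for collisions.
    """
    mac = hmac.new(secret, digestmod=hashlib.sha256)
    tenant = tenant_id.encode("utf-8")
    # Length-prefix the tenant ID so tenant bytes cannot bleed into token bytes.
    mac.update(struct.pack(">I", len(tenant)))
    mac.update(tenant)
    for token in token_ids:
        # Fixed-width encoding keeps [1, 23] distinct from [12, 3].
        mac.update(struct.pack(">q", token))
    return mac.hexdigest()


if __name__ == "__main__":
    key = b"replace-with-a-random-32-byte-secret"
    print(namespaced_chunk_key("tenant-a", [101, 2023, 2003], key))
    print(namespaced_chunk_key("tenant-b", [101, 2023, 2003], key))
```

Keeping the full 256-bit digest avoids reintroducing a small search space; if a deployment must truncate, the retained width determines the collision cost and should be chosen with that in mind.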

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.9.4 - Cryptographic controls
NIST AI RMF
MS-2.5 - AI system integrity monitoring
OWASP LLM Top 10
LLM09:2025 - Misinformation

Frequently Asked Questions

Is CVE-2026-10813 actively exploited?

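No confirmed active exploitation of CVE-2026-10813 has been reported. However, a proof-of-concept exploit is public and no patched release exists yet, so organizations should apply the interim mitigations now and track PR #2932 for the upstream fix.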

What systems are affected by CVE-2026-10813?

This vulnerability affects the following AI/ML architecture patterns: LLM inference serving, Multi-tenant model serving, vLLM-based inference clusters, KV cache-accelerated inference pipelines.

What is the CVSS score for CVE-2026-10813?

CVE-2026-10813 has a CVSS v3.1 base score of 3.6 (LOW).

What is the AI security impact?

Affected AI Architectures

LLM inference serving, Multi-tenant model serving, vLLM-based inference clusters, KV cache-accelerated inference pipelines

MITRE ATLAS Techniques

AML.T0024 Exfiltration via AI Inference API
AML.T0031 Erode AI Model Integrity
AML.T0037 Data from Local System

Compliance Controls Affected

EU AI Act: Article 15
ISO 42001: A.9.4
NIST AI RMF: MS-2.5
OWASP LLM Top 10: LLM09:2025

What are the technical details?

Original Advisory

A flaw has been found in LMCache up to 0.4.6. This affects the function hex_hash_to_int16 of the file lmcache/integration/vllm/utils.py of the component KV Cache Handler. Executing a manipulation can lead to use of weak hash. The attack needs to be launched locally. The attack requires a high level of complexity. It is indicated that the exploitability is difficult. The exploit has been published and may be used. The pull request to fix this issue awaits acceptance.
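To find where the function named in the advisory is defined and called in a local checkout, a small AST walk is usually more reliable than a text search. This is a sketch: find_references and scan_tree are illustrative helper names, and the checkout path in the usage line is an assumption.

```python
import ast
from pathlib import Path

TARGET = "hex_hash_to_int16"  # function named in the advisory


def find_references(source: str, filename: str = "<memory>") -> list[tuple[str, int, str]]:
    """Return (filename, line, kind) for each definition or call of TARGET."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == TARGET:
            hits.append((filename, node.lineno, "definition"))
        elif isinstance(node, ast.Call):
            func = node.func
            # Covers both bare calls f(x) and attribute calls module.f(x).
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if name == TARGET:
                hits.append((filename, node.lineno, "call"))
    return sorted(hits, key=lambda hit: hit[1])


def scan_tree(root: str) -> list[tuple[str, int, str]]:
    """Scan every .py file under root, skipping files that fail to parse."""
    results = []
    for path in Path(root).rglob("*.py"):
        try:
            results.extend(find_references(path.read_text(encoding="utf-8"), str(path)))
        except (SyntaxError, UnicodeDecodeError):
            continue
    return results


if __name__ == "__main__":
    # Assumed location of a local LMCache checkout; adjust as needed.
    for filename, line, kind in scan_tree("./LMCache"):
        print(f"{filename}:{line}: {kind}")
```

The walk does not catch aliased imports (for example, importing the function under another name), so the output is best treated as a starting list for manual review.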

Exploitation Scenario

An attacker with a low-privilege local account on a shared vLLM inference server running LMCache crafts a series of inference requests whose input tokens are engineered to collide with the weak hex_hash_to_int16 digest of a target user's prior request. When a collision occurs, LMCache retrieves the target's cached KV state, which encodes the internal representation of their conversation context, and uses it to answer the attacker's query. The result is a corrupted model response that may leak fragments of the target's inference context. Alternatively, the attacker floods the cache with colliding entries to evict legitimate cache lines, degrading inference throughput and cache availability for other users.

Weaknesses (CWE)

CWE-327 — Use of a Broken or Risky Cryptographic Algorithm: The product uses a broken or risky cryptographic algorithm or protocol.

  • [Architecture and Design] When there is a need to store or transmit sensitive data, use strong, up-to-date cryptographic algorithms to encrypt that data. Select a well-vetted algorithm that is currently considered to be strong by experts in the field, and use well-tested implementations. As with all cryptographic mechanisms, the source code should be available for analysis. For example, US government systems require FIPS 140-2 certification [REF-1192]. Do not develop custom or private cryptographic algorithms. They will likely be exposed to attacks that are well-understood by cryptographers. Reverse engineering techniques are mature. If the algorithm can be compromised if attackers find out how it works, then it is especially weak. Periodically ensure that the cryptography has not become obsolete. Some older algorithms, once thought to require a billion years of computing time, can now be broken in days or hours. This includes MD4, MD5, SHA1, DES, and other algorithms that were once regarded as strong. [REF-267]
  • [Architecture and Design] Ensure that the design allows one cryptographic algorithm to be replaced with another in the next generation or version. Where possible, use wrappers to make the interfaces uniform. This will make it easier to upgrade to stronger algorithms. With hardware, design the product at the Intellectual Property (IP) level so that one cryptographic algorithm can be replaced with another in the next generation of the hardware product.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:L/AC:H/PR:L/UI:N/S:U/C:N/I:L/A:L/E:P/RL:X/RC:R
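The 3.6 base score can be reproduced from the vector with the CVSS v3.1 base equations. The sketch below uses the metric weights and round-up rule from the CVSS v3.1 specification; the function names are illustrative, and the temporal metrics in the vector (E, RL, RC) are parsed but ignored because they do not affect the base score.

```python
import math

WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score from a vector string."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    scope_changed = metrics["S"] == "C"
    pr = (PR_CHANGED if scope_changed else PR_UNCHANGED)[metrics["PR"]]

    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[metrics["C"]]) * (1 - cia[metrics["I"]]) * (1 - cia[metrics["A"]])
    if scope_changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    exploitability = (
        8.22 * WEIGHTS["AV"][metrics["AV"]] * WEIGHTS["AC"][metrics["AC"]]
        * pr * WEIGHTS["UI"][metrics["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope_changed:
        total *= 1.08
    return roundup(min(total, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:L/AC:H/PR:L/UI:N/S:U/C:N/I:L/A:L/E:P/RL:X/RC:R"))  # 3.6
```

For this vector the exploitability sub-score is about 1.05 and the impact sub-score about 2.51, giving 3.56, which rounds up to 3.6.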

Timeline

Published
June 4, 2026
Last Modified
June 4, 2026
First Seen
June 12, 2026
