vLLM instances using built-in API key validation (--api-key flag) are vulnerable to timing-based authentication bypass via statistical analysis of response times. Upgrade to vLLM 0.11.0 immediately. If upgrade is blocked, offload authentication to a reverse proxy (nginx, Kong, cloud API gateway) and disable vLLM's native key validation entirely.
Risk Assessment
Moderate-to-high risk for production deployments. Exploitation is fully automatable and requires no prior credentials — only network access and patience for hundreds to thousands of HTTP requests. The EPSS score is low today (0.37%) but timing attack tooling is widely available and the technique is well-documented. Internet-exposed vLLM endpoints or multi-tenant inference clusters face the highest exposure. Internal deployments with network segmentation are lower risk but not immune.
Recommended Action
5 steps:

1. PATCH: Upgrade to vLLM >= 0.11.0, which uses constant-time comparison (hmac.compare_digest).
2. WORKAROUND: If immediate upgrade is blocked, terminate API authentication at the reverse proxy layer (nginx auth_request, Kong key-auth, AWS API Gateway authorizer) and remove the --api-key flag from vLLM.
3. ROTATE: After patching, rotate all API keys that were in use on affected instances.
4. DETECT: Review access logs for anomalous authentication patterns — thousands of 401 responses with incremental key variations from a single IP.
5. RATE-LIMIT: Implement aggressive rate limiting on the authentication endpoint to significantly increase the time required for timing analysis, even pre-patch.
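The fix in step 1 comes down to constant-time equality. A minimal sketch of the difference (check_api_key is a hypothetical helper for illustration, not vLLM's actual function name):

```python
import hmac

def check_api_key(provided: str, expected: str) -> bool:
    # hmac.compare_digest examines every byte regardless of where the
    # first mismatch occurs, so comparison time does not reveal how
    # many leading characters of the key were guessed correctly.
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))

# A plain `provided == expected` exits at the first wrong character,
# leaking the matched-prefix length through response timing.
print(check_api_key("sk-correct-key", "sk-correct-key"))  # True
print(check_api_key("sk-wrong-guess", "sk-correct-key"))  # False
```

The same pattern applies to any secret comparison on a request path: session tokens, HMAC signatures, and webhook secrets should all go through hmac.compare_digest rather than `==`.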
CISA SSVC Assessment
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
What is CVE-2025-59425?
vLLM instances using built-in API key validation (--api-key flag) are vulnerable to timing-based authentication bypass via statistical analysis of response times. Upgrade to vLLM 0.11.0 immediately. If upgrade is blocked, offload authentication to a reverse proxy (nginx, Kong, cloud API gateway) and disable vLLM's native key validation entirely.
Is CVE-2025-59425 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2025-59425, increasing the risk of exploitation.
How to fix CVE-2025-59425?
1. PATCH: Upgrade to vLLM >= 0.11.0 which uses constant-time comparison (hmac.compare_digest). 2. WORKAROUND: If immediate upgrade is blocked, terminate API authentication at the reverse proxy layer (nginx auth_request, Kong key-auth, AWS API Gateway authorizer) and remove the --api-key flag from vLLM. 3. ROTATE: After patching, rotate all API keys that were in use on affected instances. 4. DETECT: Review access logs for anomalous authentication patterns — thousands of 401 responses with incremental key variations from a single IP. 5. RATE-LIMIT: Implement aggressive rate limiting on the authentication endpoint to significantly increase the time required for timing analysis, even pre-patch.
What systems are affected by CVE-2025-59425?
This vulnerability affects the following AI/ML architecture patterns: model serving, LLM inference APIs, AI API gateways, multi-tenant inference clusters.
What is the CVSS score for CVE-2025-59425?
CVE-2025-59425 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.37%.
Technical Details
NVD Description
vLLM is an inference and serving engine for large language models (LLMs). Before version 0.11.0rc2, the API key support in vLLM performs validation using a method that was vulnerable to a timing attack. API key validation uses a string comparison that takes longer the more characters the provided API key gets correct. Data analysis across many attempts could allow an attacker to determine when it finds the next correct character in the key sequence. Deployments relying on vLLM's built-in API key validation are vulnerable to authentication bypass using this technique. Version 0.11.0rc2 fixes the issue.
Exploitation Scenario
An adversary discovers a vLLM instance via passive internet scanning (Shodan, Censys — vLLM's OpenAI-compatible /health endpoint is a clear fingerprint). They write a Python script that iterates over each character position of the API key, sending batches of requests with candidate characters and measuring the median response time of each batch. Because Python's == operator compares strings character by character and exits early on the first mismatch, correct characters produce measurably longer comparison times. After O(N * charset_size * samples) requests — fully automatable in hours on a fast connection — the full API key is reconstructed. The attacker then authenticates with the recovered key to extract sensitive system prompts, run cost-harvesting inference loops, or use the access as an initial foothold for lateral movement.
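The recovery loop described above can be sketched as follows. SECRET, CHARSET, and vulnerable_check are stand-ins for the remote key and the network round trip: the artificial per-character sleep exaggerates the timing signal so the demo is deterministic, whereas a real attack measures HTTP response times and needs far more samples to beat network jitter.

```python
import statistics
import time

SECRET = "k3y"  # hypothetical server-side key (assumption for this demo)
CHARSET = "abcdefghijklmnopqrstuvwxyz0123456789"

def vulnerable_check(candidate: str) -> bool:
    # Simulates an early-exit, character-by-character comparison.
    # Each matched character costs one extra pass through the loop.
    for a, b in zip(candidate.ljust(len(SECRET), "\0"), SECRET):
        if a != b:
            return False
        time.sleep(0.001)  # exaggerated per-character cost
    return candidate == SECRET

def median_time(candidate: str, samples: int = 5) -> float:
    # Median over a batch of attempts filters out scheduling noise.
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        vulnerable_check(candidate)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)

recovered = ""
for _ in range(len(SECRET)):
    # The candidate whose batch takes longest matched one more character.
    recovered += max(CHARSET, key=lambda c: median_time(recovered + c))

print(recovered)
```

This is why step 5 (rate limiting) helps even pre-patch: the attack needs charset_size * samples requests per key character, so throttling the endpoint stretches recovery from hours to an impractical timescale.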
CVSS Vector
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N

References
- github.com/advisories/GHSA-wr9h-g72x-mwhm
- nvd.nist.gov/vuln/detail/CVE-2025-59425
- github.com/vllm-project/vllm/blob/4b946d693e0af15740e9ca9c0e059d5f333b1083/vllm/entrypoints/openai/api_server.py (Product)
- github.com/vllm-project/vllm/commit/ee10d7e6ff5875386c7f136ce8b5f525c8fcef48 (Patch)
- github.com/vllm-project/vllm/releases/tag/v0.11.0 (Release)
- github.com/vllm-project/vllm/security/advisories/GHSA-wr9h-g72x-mwhm (Exploit, Vendor)
- github.com/fkie-cad/nvd-json-data-feeds (Exploit)
Related Vulnerabilities
Same package (vllm):
- CVE-2024-9053 (9.8): RCE via unsafe pickle deserialization in RPC server
- CVE-2024-11041 (9.8): RCE via unsafe pickle deserialization in MessageQueue
- CVE-2026-25960 (9.8): SSRF allows internal network access
- CVE-2025-47277 (9.8): RCE via exposed TCPStore in distributed inference
- CVE-2025-32444 (9.8): RCE via pickle deserialization on ZeroMQ