vLLM instances using built-in API key validation (--api-key flag) are vulnerable to timing-based authentication bypass via statistical analysis of response times. Upgrade to vLLM 0.11.0 immediately. If upgrade is blocked, offload authentication to a reverse proxy (nginx, Kong, cloud API gateway) and disable vLLM's native key validation entirely.
What is the risk?
Moderate-to-high risk for production deployments. Exploitation is fully automatable and requires no prior credentials — only network access and patience for hundreds to thousands of HTTP requests. The EPSS score is low today (0.37%) but timing attack tooling is widely available and the technique is well-documented. Internet-exposed vLLM endpoints or multi-tenant inference clusters face the highest exposure. Internal deployments with network segmentation are lower risk but not immune.
What should I do?
1. PATCH: Upgrade to vLLM >= 0.11.0, which uses constant-time comparison (hmac.compare_digest).
2. WORKAROUND: If immediate upgrade is blocked, terminate API authentication at the reverse proxy layer (nginx auth_request, Kong key-auth, AWS API Gateway authorizer) and remove the --api-key flag from vLLM.
3. ROTATE: After patching, rotate all API keys that were in use on affected instances.
4. DETECT: Review access logs for anomalous authentication patterns, such as thousands of 401 responses with incremental key variations from a single IP.
5. RATE-LIMIT: Implement aggressive rate limiting on the authentication endpoint to significantly increase the time required for timing analysis, even pre-patch.
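The DETECT step can be sketched as a small log scan. This is a minimal illustration, not a vetted detection rule: it assumes access logs from a reverse proxy in the common/combined log format (client IP first, status code after the quoted request line), and the function name `suspicious_clients` and the threshold of 1000 are hypothetical choices to tune for your environment.

```python
import re
from collections import Counter

# Assumed log shape (common/combined log format):
#   203.0.113.7 - - [01/Oct/2025:12:00:00 +0000] "GET /v1/models HTTP/1.1" 401 25
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (?P<status>\d{3})(?: |$)'
)


def suspicious_clients(lines, threshold=1000):
    """Return {ip: count} for clients with at least `threshold` 401 responses."""
    failures = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Lines that do not match the assumed format are skipped silently.
        if match and match.group("status") == "401":
            failures[match.group("ip")] += 1
    return {ip: count for ip, count in failures.items() if count >= threshold}
```

A typical use would be `suspicious_clients(open("access.log"), threshold=1000)`, where the file path is a placeholder. Counting per source IP will miss an attacker who spreads requests across many addresses, so treat an empty result as inconclusive rather than as proof that no probing occurred.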
What does CISA's SSVC say?
Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.
Frequently Asked Questions
Is CVE-2025-59425 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2025-59425, increasing the risk of exploitation.
What systems are affected by CVE-2025-59425?
This vulnerability affects the following AI/ML architecture patterns: model serving, LLM inference APIs, AI API gateways, multi-tenant inference clusters.
What is the CVSS score for CVE-2025-59425?
CVE-2025-59425 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.54%.
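The 7.5 figure can be reproduced from the vector published for this CVE (CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N). The sketch below applies the CVSS v3.1 base-score equations for the Scope: Unchanged case only; the names `WEIGHTS`, `roundup`, and `base_score` are illustrative and not part of any official library.

```python
import math

# CVSS v3.1 metric weights (PR values shown are those for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS v3.1 vector with Scope: Unchanged."""
    parts = vector.split("/")
    if parts[0] != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    metrics = dict(part.split(":") for part in parts[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N"))  # 7.5
```

For this vector the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is 8.22 × 0.85 × 0.77 × 0.85 × 0.85 ≈ 3.89; their sum, about 7.48, rounds up to 7.5.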
What is the AI security impact?
MITRE ATLAS Techniques
- AML.T0006 Active Scanning
- AML.T0034 Cost Harvesting
- AML.T0040 AI Model Inference API Access
- AML.T0049 Exploit Public-Facing Application
- AML.T0106 Exploitation for Credential Access
What are the technical details?
Original Advisory
vLLM is an inference and serving engine for large language models (LLMs). Before version 0.11.0rc2, the API key support in vLLM performs validation using a method that was vulnerable to a timing attack. API key validation uses a string comparison that takes longer the more characters the provided API key gets correct. Data analysis across many attempts could allow an attacker to determine when it finds the next correct character in the key sequence. Deployments relying on vLLM's built-in API key validation are vulnerable to authentication bypass using this technique. Version 0.11.0rc2 fixes the issue.
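The difference between an early-exit comparison and a constant-time one can be sketched as follows. This is a generic illustration of the technique, not a copy of the vLLM patch: the function name `is_valid_api_key` is hypothetical, and hashing both values with SHA-256 before comparing is an assumption made here so that the compared inputs always have the same length.

```python
import hashlib
import hmac


def _digest(value: str) -> bytes:
    """Hash to a fixed-length value so the comparison never sees the raw key length."""
    return hashlib.sha256(value.encode("utf-8")).digest()


def is_valid_api_key(provided: str, expected: str) -> bool:
    """Compare two API keys without an early exit on the first differing character.

    A plain `provided == expected` may return sooner the earlier the strings
    differ; hmac.compare_digest is designed to avoid that content-dependent
    short-circuit.
    """
    return hmac.compare_digest(_digest(provided), _digest(expected))
```

Encoding and hashing first also sidesteps the restriction that `hmac.compare_digest` accepts `str` arguments only when they are ASCII; with `bytes` inputs there is no such limit.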
Exploitation Scenario
An adversary discovers a vLLM instance via passive internet scanning (Shodan, Censys); vLLM's OpenAI-compatible /health endpoint is a clear fingerprint. They write a Python script that iterates over each character position of the API key, sending batches of requests with candidate characters and measuring the median response time of each batch. Because Python's == operator compares strings character by character and exits early on mismatch, a candidate with more correct leading characters takes measurably longer to reject. After O(N * charset_size * samples) requests, where N is the key length and samples is the number of requests per candidate, the full API key is reconstructed; this is fully automatable in hours on a fast connection. The attacker then authenticates with the recovered key and can extract sensitive system prompts from the model, run cost-harvesting inference loops, or use the access as an initial foothold for lateral movement.
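The leak and the request budget in this scenario can be illustrated with a simplified model. The sketch below counts comparison steps as a deterministic stand-in for elapsed time (real measurements are noisy, which is why the attack needs many samples), and it never touches the network. The secret, the key length of 32, the 62-character alphabet, and the 100 samples per candidate are all hypothetical values chosen for the example.

```python
import string


def early_exit_equal(candidate: str, secret: str) -> tuple[bool, int]:
    """Compare character by character, stopping at the first mismatch.

    Returns (is_equal, steps). `steps` models how long the comparison ran.
    """
    steps = 0
    if len(candidate) != len(secret):
        return False, steps
    for a, b in zip(candidate, secret):
        steps += 1
        if a != b:
            return False, steps
    return True, steps


def request_budget(key_length: int, charset_size: int, samples: int) -> int:
    """Upper bound on requests: every position x every candidate x every sample."""
    return key_length * charset_size * samples


secret = "s3cr3t-key"  # hypothetical key, for illustration only
no_prefix = "x" * len(secret)
three_prefix = secret[:3] + "x" * (len(secret) - 3)

print(early_exit_equal(no_prefix, secret))     # (False, 1)
print(early_exit_equal(three_prefix, secret))  # (False, 4)

alphabet = string.ascii_letters + string.digits  # 62 characters
print(request_budget(key_length=32, charset_size=len(alphabet), samples=100))  # 198400
```

The step count grows with the length of the correct prefix, which is the signal the attacker averages out of response times. Under these assumed parameters the worst case is roughly 200,000 requests, which is why rate limiting raises the cost of the attack without eliminating it.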
Weaknesses (CWE)
CWE-385 — Covert Timing Channel: Covert timing channels convey information by modulating some aspect of system behavior over time, so that the program receiving the information can observe system behavior and infer protected information.
- [Architecture and Design] Whenever possible, specify implementation strategies that do not introduce time variances in operations.
- [Implementation] Often one can artificially manipulate the time which operations take or -- when operations occur -- can remove information from the attacker.
Source: MITRE CWE corpus.
CVSS Vector
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N
References
- github.com/advisories/GHSA-wr9h-g72x-mwhm
- nvd.nist.gov/vuln/detail/CVE-2025-59425
- github.com/vllm-project/vllm/blob/4b946d693e0af15740e9ca9c0e059d5f333b1083/vllm/entrypoints/openai/api_server.py (Product)
- github.com/vllm-project/vllm/commit/ee10d7e6ff5875386c7f136ce8b5f525c8fcef48 (Patch)
- github.com/vllm-project/vllm/releases/tag/v0.11.0 (Release)
- github.com/vllm-project/vllm/security/advisories/GHSA-wr9h-g72x-mwhm (Exploit, Vendor)
- github.com/fkie-cad/nvd-json-data-feeds (Exploit)
Related Vulnerabilities
- CVE-2024-9053 (9.8) vllm: RCE via unsafe pickle deserialization in RPC server. Same package: vllm
- CVE-2024-11041 (9.8) vllm: RCE via unsafe pickle deserialization in MessageQueue. Same package: vllm
- CVE-2026-25960 (9.8) vllm: SSRF allows internal network access. Same package: vllm
- CVE-2025-47277 (9.8) vLLM: RCE via exposed TCPStore in distributed inference. Same package: vllm
- CVE-2025-32444 (9.8) vLLM: RCE via pickle deserialization on ZeroMQ. Same package: vllm