CVE-2025-48887: vLLM: ReDoS in tool parser causes service outage

GHSA-w6q7-j642-7c25 | Severity: MEDIUM | PoC available | CISA SSVC: Track*
Published May 30, 2025
CISO Take

vLLM deployments with tool calling enabled are vulnerable to service disruption: any authenticated API user can send a crafted payload to trigger catastrophic regex backtracking in the tool call parser, taking down the inference service. Upgrade to vLLM 0.9.0 immediately; if patching is delayed, disable tool/function calling endpoints or add strict input length limits at the API gateway. Impact is limited to availability — no data exfiltration or code execution risk.

What is the risk?

Operational risk is higher than the CVSS score of 6.5 suggests for organizations running vLLM as a production inference endpoint. The attack is trivial to mount (a crafted string input), requires only authenticated access (PR:L), and is network-reachable. The affected code path (pythonic_tool_parser.py) is active when tool calling is served through the pythonic tool parser, and function/tool calling is a standard pattern in agentic pipelines. EPSS of 0.00122 reflects low exploit-in-the-wild activity today, but the technique is well understood and requires no ML expertise. The primary risk is to the availability of LLM inference infrastructure, not to confidentiality.

What systems are affected?

Package   Ecosystem   Vulnerable Range     Patched
vLLM      pip         -                    No patch
vLLM      pip         >= 0.6.4, < 0.9.0    0.9.0

How severe is it?

CVSS 3.1: 6.5 / 10
EPSS: 0.4% chance of exploitation in 30 days (higher than 34% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Trivial
Exploitation Confidence: medium
CISA SSVC: Public PoC
Public PoC indexed (trickest/cve)

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

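Attack Vector (AV): Network
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High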

What should I do?

  1. Patch

    Upgrade vLLM to >= 0.9.0 (commit 4fc1bf813ad80172c1db31264beaef7d93fe0601 contains the fix).

  2. Workaround

    If immediate upgrade is blocked, disable the pythonic tool parser by switching to an alternative tool_call_parser (e.g., --tool-call-parser hermes) or disabling tool calling in your deployment config.

  3. API gateway controls

    Enforce maximum request payload size and per-user rate limits at the reverse proxy (nginx/Envoy) to reduce blast radius.

  4. Detection

    Alert on abnormal response latency spikes in your vLLM service metrics — ReDoS manifests as CPU saturation with near-zero throughput on the serving process.

  5. Verify exposure

    Check if your deployment enables --enable-auto-tool-choice or uses --tool-call-parser pythonic; those are the critical indicators.
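The exposure check in step 5 can be sketched in a few lines of Python. This is a minimal sketch, not an official vLLM tool: the function names are hypothetical, the version bounds come from the vulnerable range listed in this advisory, and the version parsing is deliberately simplified (it reads only the leading X.Y.Z numbers, so a pre-release such as 0.9.0rc1 would be treated as 0.9.0).

```python
import re

# Vulnerable range from this advisory: >= 0.6.4, < 0.9.0
FIRST_VULNERABLE = (0, 6, 4)
FIRST_PATCHED = (0, 9, 0)


def release_tuple(version):
    """Extract the leading X.Y.Z numbers; suffixes like '.post1' are ignored."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def version_in_vulnerable_range(version):
    """True if the version falls inside the advisory's vulnerable range."""
    return FIRST_VULNERABLE <= release_tuple(version) < FIRST_PATCHED


def uses_pythonic_parser(argv):
    """True if the launch arguments select the pythonic tool parser."""
    for i, arg in enumerate(argv):
        if arg == "--tool-call-parser" and i + 1 < len(argv):
            return argv[i + 1] == "pythonic"
        if arg.startswith("--tool-call-parser="):
            return arg.split("=", 1)[1] == "pythonic"
    return False


def is_exposed(version, argv):
    """Hypothetical helper: vulnerable version AND pythonic parser selected."""
    return version_in_vulnerable_range(version) and uses_pythonic_parser(argv)
```

On a host where vLLM is installed, the version string could be obtained with `importlib.metadata.version("vllm")`, and `argv` would be the arguments your deployment passes when launching the server.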

What does CISA's SSVC say?

Decision: Track*
Exploitation: poc
Automatable: No
Technical Impact: partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act: Article 15 - Accuracy, robustness and cybersecurity
ISO 42001: A.6.2.6 - AI system availability and resilience
NIST AI RMF: MANAGE-2.2 - Risk treatment: availability and reliability
OWASP LLM Top 10: LLM04 - Model Denial of Service

Frequently Asked Questions

What is CVE-2025-48887?

CVE-2025-48887 is a Regular Expression Denial of Service (ReDoS) vulnerability in vLLM's pythonic tool parser (`vllm/entrypoints/openai/tool_parsers/pythonic_tool_parser.py`), affecting versions 0.6.4 up to but excluding 0.9.0. Any authenticated API user can send a crafted payload that triggers catastrophic regex backtracking in the tool call parser and takes down the inference service. Impact is limited to availability; there is no data exfiltration or code execution risk. Version 0.9.0 contains the fix.

Is CVE-2025-48887 actively exploited?

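The signals on this page do not indicate confirmed in-the-wild exploitation: CISA's SSVC assessment records the exploitation status as proof of concept. Proof-of-concept exploit code is publicly available for CVE-2025-48887, which increases the risk of exploitation.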

How to fix CVE-2025-48887?

1. **Patch**: Upgrade vLLM to >= 0.9.0 (commit 4fc1bf813ad80172c1db31264beaef7d93fe0601 contains the fix).
2. **Workaround**: If immediate upgrade is blocked, disable the pythonic tool parser by switching to an alternative tool_call_parser (e.g., `--tool-call-parser hermes`) or disabling tool calling in your deployment config.
3. **API gateway controls**: Enforce maximum request payload size and per-user rate limits at the reverse proxy (nginx/Envoy) to reduce blast radius.
4. **Detection**: Alert on abnormal response latency spikes in your vLLM service metrics; ReDoS manifests as CPU saturation with near-zero throughput on the serving process.
5. **Verify exposure**: Check if your deployment enables `--enable-auto-tool-choice` or uses `--tool-call-parser pythonic`; those are the critical indicators.
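The detection signal described above (CPU saturation together with near-zero throughput) could be turned into an alert condition along these lines. This is only a sketch: the `Sample` type, the function name, and every threshold are hypothetical and would need tuning against your own baseline, and the samples are assumed to come from whatever metrics pipeline you already run.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    cpu_percent: float        # CPU use of the serving process
    tokens_per_second: float  # generation throughput over the same interval


def looks_like_redos(samples, cpu_threshold=95.0, throughput_floor=1.0,
                     min_consecutive=3):
    """Return True if min_consecutive samples in a row show saturated CPU
    and stalled throughput. All thresholds are illustrative assumptions."""
    streak = 0
    for s in samples:
        if s.cpu_percent >= cpu_threshold and s.tokens_per_second <= throughput_floor:
            streak += 1
            if streak >= min_consecutive:
                return True
        else:
            streak = 0  # a healthy sample resets the run
    return False
```

Requiring several consecutive samples is a design choice to avoid alerting on a single busy interval; a legitimately heavy batch normally keeps throughput high while CPU is saturated.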

What systems are affected by CVE-2025-48887?

This vulnerability affects the following AI/ML architecture patterns: LLM inference serving, Agent frameworks, Function calling / tool use endpoints, OpenAI-compatible API deployments, Multi-tenant AI platforms.

What is the CVSS score for CVE-2025-48887?

CVE-2025-48887 has a CVSS v3.1 base score of 6.5 (MEDIUM). The EPSS exploitation probability is 0.43%.

What is the AI security impact?

Affected AI Architectures

LLM inference serving
Agent frameworks
Function calling / tool use endpoints
OpenAI-compatible API deployments
Multi-tenant AI platforms

MITRE ATLAS Techniques

AML.T0029 Denial of AI Service
AML.T0034 Cost Harvesting
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 15
ISO 42001: A.6.2.6
NIST AI RMF: MANAGE-2.2
OWASP LLM Top 10: LLM04

What are the technical details?

Original Advisory

vLLM, an inference and serving engine for large language models (LLMs), has a Regular Expression Denial of Service (ReDoS) vulnerability in the file `vllm/entrypoints/openai/tool_parsers/pythonic_tool_parser.py` of versions 0.6.4 up to but excluding 0.9.0. The root cause is the use of a highly complex and nested regular expression for tool call detection, which can be exploited by an attacker to cause severe performance degradation or make the service unavailable. The pattern contains multiple nested quantifiers, optional groups, and inner repetitions which make it vulnerable to catastrophic backtracking. Version 0.9.0 contains a patch for the issue.
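To illustrate why nested quantifiers are dangerous, the sketch below times a toy pattern against inputs that almost match. The pattern `(a+)+$` is a textbook example and is not the expression used in vLLM's parser; the function names are likewise illustrative. Each extra character roughly doubles the work for the nested pattern, while an equivalent pattern without nesting stays fast.

```python
import re
import time

NESTED = re.compile(r"(a+)+$")  # toy pattern with a nested quantifier
FLAT = re.compile(r"a+$")       # accepts the same strings, without nesting


def time_match(pattern, text):
    """Return (elapsed_seconds, matched) for one anchored match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return time.perf_counter() - start, matched


def demo(lengths=(12, 16, 20)):
    """Time both patterns on strings of 'a' followed by a character that
    forces the match to fail, which is what triggers the backtracking."""
    rows = []
    for n in lengths:
        text = "a" * n + "!"
        nested_s, _ = time_match(NESTED, text)
        flat_s, _ = time_match(FLAT, text)
        rows.append((n, nested_s, flat_s))
    return rows


if __name__ == "__main__":
    for n, nested_s, flat_s in demo():
        print(f"n={n:2d}  nested={nested_s:.6f}s  flat={flat_s:.6f}s")
```

Keep the lengths small when experimenting: with Python's backtracking `re` engine, inputs only slightly longer than these can keep a core busy for minutes.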

Exploitation Scenario

An adversary with legitimate API access to a vLLM-backed service (e.g., an internal AI platform user or a malicious external user of a public API) crafts a tool call payload with a pathological structure designed to trigger catastrophic backtracking in the nested quantifiers of the pythonic tool parser regex. The request is sent repeatedly via the OpenAI-compatible `/v1/chat/completions` endpoint. Each malicious request causes the regex engine to spin at near-100% CPU for an extended period. Within seconds of sustained requests, the vLLM serving process becomes unresponsive, denying service to all legitimate users. In agentic pipelines where the vLLM instance drives autonomous agents, this collapses the entire agent fleet — a single low-privilege user effectively disables production AI operations.

Weaknesses (CWE)

CWE-1333 — Inefficient Regular Expression Complexity: The product uses a regular expression with a worst-case computational complexity that is inefficient and possibly exponential.

  • [Architecture and Design] Use regular expressions that do not support backtracking, e.g. by removing nested quantifiers.
  • [System Configuration] Set backtracking limits in the configuration of the regular expression implementation, such as PHP's pcre.backtrack_limit. Also consider limits on execution time for the process.
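Python's built-in `re` module has no backtracking limit comparable to `pcre.backtrack_limit`, so the "limits on execution time" advice has to be applied from outside the regex engine. One way to sketch that is to run the match in a child process with a time budget. This is an illustration of the idea only, not how vLLM 0.9.0 fixes the issue; the helper name is hypothetical, and passing the text as a command-line argument limits it to modest input sizes.

```python
import subprocess
import sys

# Child program: exit code 0 if the pattern is found, 1 otherwise.
_CHILD = (
    "import re, sys\n"
    "sys.exit(0 if re.search(sys.argv[1], sys.argv[2]) else 1)\n"
)


def search_with_timeout(pattern, text, timeout_s=1.0):
    """Return True/False for match/no match, or None if the budget ran out."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", _CHILD, pattern, text],
            timeout=timeout_s,
            capture_output=True,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return None
    if result.returncode not in (0, 1):
        raise RuntimeError(result.stderr.decode(errors="replace"))
    return result.returncode == 0
```

Starting a process per match is far too slow for a hot request path; the point is only that a runaway match can be bounded and abandoned. Removing the nested quantifiers, as the first mitigation suggests, addresses the root cause.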

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
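As a worked check, the 6.5 base score can be reproduced from this vector with the CVSS v3.1 base-score equations. The sketch below covers only Scope: Unchanged vectors, which is all this CVE needs; the function names are illustrative and the metric weights are those published in the CVSS v3.1 specification.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value):
    """CVSS v3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Base score for a CVSS:3.1 vector with unchanged scope."""
    parts = vector.split("/")
    if parts[0] != "CVSS:3.1":
        raise ValueError("expected a CVSS:3.1 vector")
    m = dict(p.split(":", 1) for p in parts[1:])
    if m["S"] != "U":
        raise NotImplementedError("this sketch handles Scope: Unchanged only")
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * PR_UNCHANGED[m["PR"]] * UI[m["UI"]]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 6.5
```

Here the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is 8.22 × 0.85 × 0.77 × 0.62 × 0.85 ≈ 2.84; their sum, about 6.43, rounds up to 6.5.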

Timeline

Published: May 30, 2025
Last Modified: June 19, 2025
First Seen: May 30, 2025
