CVE-2025-4287: PyTorch NCCL: local DoS in distributed training reduce op
Severity: LOW | CISA SSVC decision: Track
Low-severity local DoS in PyTorch's NCCL reduce function (torch.cuda.nccl.reduce). Exploitation requires local access with unprivileged credentials; the primary risk is in shared GPU clusters or multi-tenant ML training environments, where a rogue user can crash distributed training jobs. Apply the upstream patch; if patching is blocked, restrict local access to training nodes.
Risk Assessment
Risk is low in typical deployments. The CVSS 3.3 score reflects the local-only attack vector and availability-only impact. Effective risk rises in shared HPC/GPU cluster environments where multiple teams share nodes; there, a low-privileged insider or a compromised account can disrupt expensive distributed training runs. The flaw is not exploitable remotely, no active exploitation has been observed, and there is no CISA KEV listing.
Recommended Action
1. Patch: apply upstream commit 5827d2061dcb4acd05ac5f8e65d8693a481ba0f5, or update PyTorch once a patched release ships.
2. Workaround: restrict local shell access on GPU training nodes to authorized users via SSH key controls and namespace isolation.
3. In Kubernetes/containerized training (e.g., Kubeflow, Ray), enforce pod security standards and limit inter-pod privilege escalation.
4. Detection: monitor for unexpected process terminations or hangs in distributed training jobs correlated with nccl.reduce call stacks; check NCCL logs and PyTorch DDP error traces (see the logging sketch after this list).
5. Inventory all PyTorch 2.6.0+cu124 deployments in training infrastructure.
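To make step 4 concrete, here is a minimal detection sketch that turns on NCCL's own debug logging and wraps a training step so collective failures are logged before the job dies. It assumes a standard torch.distributed/DDP launch (e.g., torchrun); the log path and logger name are illustrative assumptions, and the "NCCL" substring match should be tuned to whatever your build's error messages actually contain.

```python
# Detection sketch (step 4): surface NCCL collective failures loudly.
# Assumes a standard torch.distributed launch; log path is an assumption.
import logging
import os

# NCCL debug settings must be in the environment before the process group
# (and thus the NCCL communicator) is created. %h and %p expand to
# hostname and PID in the debug file name.
os.environ.setdefault("NCCL_DEBUG", "INFO")
os.environ.setdefault("NCCL_DEBUG_FILE", "/var/log/nccl.%h.%p.log")

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("nccl-watch")

def guarded_step(step_fn, *args, **kwargs):
    """Run one training step; log NCCL failures before re-raising."""
    try:
        return step_fn(*args, **kwargs)
    except RuntimeError as exc:
        # PyTorch generally surfaces failed NCCL collectives as RuntimeError
        # mentioning "NCCL"; adjust this match for your build's messages.
        if "NCCL" in str(exc):
            log.error("NCCL failure (possible CVE-2025-4287 DoS): %s", exc)
        raise
```

Correlating these entries with scheduler records (user, node, time) is what turns an anonymous hung job into an attributable event.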
CISA SSVC Assessment
Source: CISA Vulnrichment (SSVC v2.0). Decision: Track, reached via the CISA Coordinator decision tree.
Frequently Asked Questions
What is CVE-2025-4287?
Low-severity local DoS in PyTorch's NCCL reduce function (torch.cuda.nccl.reduce). Exploitation requires local access with unprivileged credentials; the primary risk is in shared GPU clusters or multi-tenant ML training environments, where a rogue user can crash distributed training jobs. Apply the upstream patch; if patching is blocked, restrict local access to training nodes.
Is CVE-2025-4287 actively exploited?
No confirmed active exploitation of CVE-2025-4287 has been reported, but organizations should still patch proactively.
How to fix CVE-2025-4287?
1. Patch: apply commit 5827d2061dcb4acd05ac5f8e65d8693a481ba0f5 or update PyTorch once a patched release ships. 2. Workaround: restrict local shell access to GPU training nodes to authorized users via SSH key controls and namespace isolation. 3. In Kubernetes/containerized training (e.g., Kubeflow, Ray), enforce pod security standards and limit inter-pod privilege escalation. 4. Detection: monitor for unexpected process terminations or hangs in distributed training jobs correlated with nccl.reduce call stacks (check NCCL logs and PyTorch DDP error traces). 5. Inventory all PyTorch 2.6.0+cu124 deployments in training infrastructure.
What systems are affected by CVE-2025-4287?
This vulnerability affects the following AI/ML architecture patterns: distributed training pipelines, multi-GPU model serving, training pipelines.
What is the CVSS score for CVE-2025-4287?
CVE-2025-4287 has a CVSS v3.1 base score of 3.3 (LOW). The EPSS exploitation probability is 0.08%.
Technical Details
NVD Description
A vulnerability was found in PyTorch 2.6.0+cu124. It has been rated as problematic. Affected by this issue is the function torch.cuda.nccl.reduce of the file torch/cuda/nccl.py. The manipulation leads to denial of service. It is possible to launch the attack on the local host. The exploit has been disclosed to the public and may be used. The patch is identified as 5827d2061dcb4acd05ac5f8e65d8693a481ba0f5. It is recommended to apply a patch to fix this issue.
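Since the description pins the affected build string, the inventory step above can start with a simple probe on each node. A minimal sketch, assuming the advisory's 2.6.0+cu124 string is the build to flag; confirm the full affected range upstream before relying on it.

```python
# Inventory sketch: flag interpreters running the build named in the NVD
# description. Treating it as the only affected build is an assumption.
import torch

AFFECTED_BUILDS = {"2.6.0+cu124"}

def is_flagged(version: str = torch.__version__) -> bool:
    """True if this interpreter's torch build matches the advisory."""
    return version in AFFECTED_BUILDS

if __name__ == "__main__":
    status = "AFFECTED" if is_flagged() else "not a flagged build"
    print(f"torch {torch.__version__}: {status}")
```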
Exploitation Scenario
An adversary with low-privileged local access to a shared GPU training node (e.g., via a shared HPC account or a compromised ML engineer credential) triggers torch.cuda.nccl.reduce with crafted input that causes improper resource release. This crashes or hangs the NCCL collective operation, causing the entire distributed training job to stall — potentially destroying hours or days of in-progress model training with no data corruption of stored checkpoints. In a multi-tenant GPU cluster scenario (e.g., research institution or internal ML platform), this could be used as sabotage against a competing team's training run.
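Because the impact is lost in-progress training rather than corrupted checkpoints, frequent checkpointing caps what such sabotage can destroy. A minimal sketch follows; the interval, path, and saved state are illustrative assumptions for a generic PyTorch job.

```python
# Mitigation sketch: periodic checkpointing so a crashed or hung collective
# costs at most one interval of work. Interval and path are assumptions.
import torch

def maybe_checkpoint(step, model, optimizer, every=500, path="ckpt.pt"):
    # Stored checkpoints are not corrupted by this DoS (availability-only
    # impact), so saving often bounds the work lost to a mid-run crash.
    # In DDP, call this from rank 0 only to avoid concurrent writes.
    if step > 0 and step % every == 0:
        torch.save(
            {
                "step": step,
                "model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
            },
            path,
        )
```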
Weaknesses (CWE)
CWE-404: Improper Resource Shutdown or Release
CVSS Vector
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:L
Related Vulnerabilities
CVE-2026-33660 (CVSS 10.0) TensorFlow: type confusion NPD in tensor conversion. Same attack type: DoS
CVE-2022-35939 (CVSS 9.8) TensorFlow: ScatterNd OOB write enables RCE/crash. Same attack type: DoS
CVE-2022-23587 (CVSS 9.8) TensorFlow: integer overflow in Grappler enables RCE. Same attack type: DoS
CVE-2022-41900 (CVSS 9.8) TensorFlow: heap OOB RCE in FractionalMaxPool op. Same attack type: DoS
CVE-2023-25668 (CVSS 9.8) TensorFlow: unauthenticated RCE via heap buffer overflow. Same attack type: DoS