CVE-2022-35989: TensorFlow: MaxPool GPU kernel DoS via oversized ksize

HIGH
Published September 16, 2022
CISO Take

Any TensorFlow deployment that accepts externally influenced inputs to GPU-accelerated MaxPool operations can be crashed remotely. Upgrade to TensorFlow 2.10.0, 2.9.1, 2.8.1, or 2.7.2 immediately; the advisory lists no known workarounds. Prioritize inference-serving infrastructure exposed to untrusted inputs.

What is the risk?

The CVSS 3.1 base score is 7.5 (High). The vector AV:N/AC:L/PR:N/UI:N describes an attack that an unauthenticated attacker can launch over the network with no user interaction. Exploitation is straightforward once the attacker can influence ksize: crafting an oversized array is trivial. Impact is limited to availability (confidentiality and integrity are unaffected), but for a production ML inference service that means complete service disruption. Real-world exposure depends on whether ksize parameters are user-controlled; direct TF Serving deployments without input sanitization are most at risk.

What systems are affected?

Package | Ecosystem | Vulnerable Range | Patched
TensorFlow | pip | not listed | 2.10.0, 2.9.1, 2.8.1, 2.7.2 (per the original advisory)

If you run an unpatched TensorFlow build that executes MaxPool on GPU, you are affected.

How severe is it?

CVSS 3.1: 7.5 / 10
EPSS: 0.4% chance of exploitation in 30 days (higher than 30% of all CVEs)
Exploitation Status: No known exploitation
Sophistication: Trivial

What is the attack surface?

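AV (Attack Vector): Network
AC (Attack Complexity): Low
PR (Privileges Required): None
UI (User Interaction): None
S (Scope): Unchanged
C (Confidentiality): None
I (Integrity): None
A (Availability): High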

What should I do?

  1. PATCH

    Upgrade to TensorFlow 2.10.0, or to the patched maintenance releases 2.9.1, 2.8.1, or 2.7.2, which carry the cherry-picked fix (commit 32d7bd3defd134f21a4e344c8dfd40099aaf6b18).

  2. VALIDATE INPUTS

    Enforce server-side validation that ksize dimensions do not exceed input tensor dimensions before GPU execution.

  3. NETWORK CONTROLS

    Restrict TF Serving endpoints to authenticated clients; reject malformed requests at the API gateway layer.

  4. DETECT

    Monitor for abnormal GPU process crashes or serving pod restarts — repeated crashes targeting MaxPool ops may indicate active exploitation.

  5. INVENTORY

    Audit all TF versions in use via SBOM or container image scanning; flag any 2.7.x, 2.8.x, or 2.9.x deployment below its patched release (2.7.2, 2.8.1, 2.9.1).
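
As an illustration of step 2, the following is a minimal sketch of a pre-dispatch check that a serving wrapper could run before a request reaches the GPU. The function and exception names are hypothetical, and the rule it enforces (a rank-4 ksize whose every entry is positive and no larger than the matching input dimension) is one reading of the advisory's "dimensions greater than its input tensor", not the logic of the upstream fix.

```python
from typing import Sequence


class InvalidPoolArgs(ValueError):
    """Raised when pooling arguments fail the pre-dispatch check (hypothetical name)."""


def check_maxpool_ksize(input_shape: Sequence[int], ksize: Sequence[int]) -> None:
    """Reject ksize values that are larger than the input along any axis.

    Assumption: both sequences describe the same 4-D layout (for example
    NCHW or NHWC), so axis i of ksize is compared with axis i of the input.
    """
    if len(input_shape) != 4 or len(ksize) != 4:
        raise InvalidPoolArgs("input_shape and ksize must both have 4 entries")
    for axis, (dim, k) in enumerate(zip(input_shape, ksize)):
        if k < 1:
            raise InvalidPoolArgs(f"ksize[{axis}]={k} must be at least 1")
        if k > dim:
            raise InvalidPoolArgs(
                f"ksize[{axis}]={k} exceeds input dimension {dim}"
            )


# A 2x2 window over an 8x8 feature map is accepted.
check_maxpool_ksize((1, 3, 8, 8), (1, 1, 2, 2))

# A 2x2 window over a 1x1 feature map is rejected before any GPU work.
try:
    check_maxpool_ksize((1, 1, 1, 1), (1, 1, 2, 2))
except InvalidPoolArgs as exc:
    print(f"rejected: {exc}")
```

A check like this is defence in depth only; the advisory lists no known workarounds, so it does not replace upgrading.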

What does CISA's SSVC say?

Decision Track
Exploitation none
Automatable No
Technical Impact partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, Robustness and Cybersecurity
ISO 42001
6.1.2 - AI Risk Assessment
9.1 - Monitoring, Measurement, Analysis and Evaluation
NIST AI RMF
GOVERN-6.1 - Policies for AI Risk
MANAGE-2.4 - Residual Risk Treatment

Frequently Asked Questions

What is CVE-2022-35989?

CVE-2022-35989 is a denial-of-service vulnerability in TensorFlow's GPU MaxPool kernel. When the ksize window array has dimensions greater than the input tensor, the kernel hits a CHECK failure that crashes the process. Any deployment that accepts externally influenced inputs to GPU-accelerated MaxPool operations can therefore be crashed remotely. The fix ships in TensorFlow 2.10.0, 2.9.1, 2.8.1, and 2.7.2; the advisory lists no known workarounds.

Is CVE-2022-35989 actively exploited?

No confirmed active exploitation of CVE-2022-35989 has been reported, but organizations should still patch proactively.

How to fix CVE-2022-35989?

1. PATCH: Upgrade to TensorFlow 2.10.0, or to the patched maintenance releases 2.9.1, 2.8.1, or 2.7.2, which carry the cherry-picked fix (commit 32d7bd3defd134f21a4e344c8dfd40099aaf6b18).
2. VALIDATE INPUTS: Enforce server-side validation that ksize dimensions do not exceed input tensor dimensions before GPU execution.
3. NETWORK CONTROLS: Restrict TF Serving endpoints to authenticated clients; reject malformed requests at the API gateway layer.
4. DETECT: Monitor for abnormal GPU process crashes or serving pod restarts; repeated crashes targeting MaxPool ops may indicate active exploitation.
5. INVENTORY: Audit all TF versions in use via SBOM or container image scanning; flag any 2.7.x, 2.8.x, or 2.9.x deployment below its patched release (2.7.2, 2.8.1, 2.9.1).

What systems are affected by CVE-2022-35989?

This vulnerability affects the following AI/ML architecture patterns: model serving, training pipelines, inference.

What is the CVSS score for CVE-2022-35989?

CVE-2022-35989 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.38%.

What is the AI security impact?

Affected AI Architectures

model serving, training pipelines, inference

MITRE ATLAS Techniques

AML.T0029 Denial of AI Service
AML.T0034 Cost Harvesting
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 15
ISO 42001: 6.1.2, 9.1
NIST AI RMF: GOVERN-6.1, MANAGE-2.4

What are the technical details?

Original Advisory

TensorFlow is an open source platform for machine learning. When `MaxPool` receives a window size input array `ksize` with dimensions greater than its input tensor `input`, the GPU kernel gives a `CHECK` fail that can be used to trigger a denial of service attack. We have patched the issue in GitHub commit 32d7bd3defd134f21a4e344c8dfd40099aaf6b18. The fix will be included in TensorFlow 2.10.0. We will also cherrypick this commit on TensorFlow 2.9.1, TensorFlow 2.8.1, and TensorFlow 2.7.2, as these are also affected and still in supported range. There are no known workarounds for this issue.
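
The release numbers in the advisory can be turned into a simple inventory check. The sketch below is illustrative only: `is_patched` is a hypothetical helper, it accepts plain `major.minor.patch` strings, and it treats every release older than 2.7 as unpatched. That last point is an assumption, since the advisory only names the supported branches.

```python
import re

# Minimum patched patch-level on each maintenance branch named in the advisory.
MIN_PATCH_BY_BRANCH = {(2, 7): 2, (2, 8): 1, (2, 9): 1}


def is_patched(version: str) -> bool:
    """Return True if a plain 'X.Y.Z' TensorFlow version carries the fix.

    Pre-release or suffixed strings (for example release candidates) raise
    ValueError so that they get reviewed by hand instead of being guessed at.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if (major, minor, patch) >= (2, 10, 0):
        return True
    min_patch = MIN_PATCH_BY_BRANCH.get((major, minor))
    if min_patch is None:
        # Assumption: branches older than 2.7 never received the fix.
        return False
    return patch >= min_patch


for candidate in ["2.7.1", "2.7.2", "2.9.0", "2.9.1", "2.10.0"]:
    status = "patched" if is_patched(candidate) else "VULNERABLE"
    print(candidate, status)
```

In practice the version strings would come from an SBOM or image scan; the sketch only shows the comparison logic.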

Exploitation Scenario

An adversary identifies a TensorFlow Serving endpoint (or custom inference API) that runs CNN-based models with MaxPool layers on GPU. They submit an inference request in which the ksize window-size parameter exceeds the spatial dimensions of the input tensor. The GPU kernel hits a CHECK assertion failure and crashes the serving process. In Kubernetes environments this triggers a pod restart, creating a brief outage window. Repeated requests sustain the DoS and degrade the SLA for downstream applications. On shared ML platforms, the same technique could be used to disrupt competing tenants or to mask other malicious activity during the outage window.
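
The repeated-restart pattern in this scenario lends itself to a simple sliding-window alert. The following is a minimal sketch, assuming restart timestamps (in seconds) are already being collected from the orchestrator or process supervisor; the class name, threshold, and window length are hypothetical choices, not values from the advisory.

```python
from collections import deque


class RestartBurstDetector:
    """Flags when too many restarts fall inside a sliding time window."""

    def __init__(self, threshold: int = 3, window_seconds: float = 60.0) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._events = deque()  # restart timestamps, oldest first

    def record(self, timestamp: float) -> bool:
        """Record one restart; return True if the burst threshold is reached."""
        self._events.append(timestamp)
        # Drop restarts that have aged out of the window.
        while self._events and timestamp - self._events[0] > self.window_seconds:
            self._events.popleft()
        return len(self._events) >= self.threshold


detector = RestartBurstDetector(threshold=3, window_seconds=60.0)
for ts in [0.0, 10.0, 20.0, 200.0]:
    if detector.record(ts):
        print(f"t={ts}: restart burst, investigate recent MaxPool requests")
```

A burst alone does not prove exploitation; correlating it with the requests received just before each crash is what ties it to this CVE.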

Weaknesses (CWE)

CWE-617 — Reachable Assertion: The product contains an assert() or similar statement that can be triggered by an attacker, which leads to an application exit or other behavior that is more severe than necessary.

  • [Implementation] Make sensitive open/close operations unreachable by directly user-controlled data (e.g. open/close resources)
  • [Implementation] Perform input validation on user data.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
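
To show how this vector yields the 7.5 base score, the sketch below applies the CVSS v3.1 base-score equations for the Scope: Unchanged case, using the metric weights from the published specification. The function names are illustrative, and only the metric values that appear in this vector are included.

```python
import math

# CVSS v3.1 weights for the metric values used in this vector only.
WEIGHTS = {
    "AV": {"N": 0.85},
    "AC": {"L": 0.77},
    "PR": {"N": 0.85},   # value for Scope: Unchanged
    "UI": {"N": 0.85},
    "C": {"N": 0.0, "H": 0.56},
    "I": {"N": 0.0, "H": 0.56},
    "A": {"N": 0.0, "H": 0.56},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(vector: str) -> float:
    """Compute the base score for a CVSS:3.1 vector with S:U."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only covers Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score_scope_unchanged("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H"))
```

Here the impact sub-score is about 3.60 and the exploitability sub-score about 3.89; their sum, roughly 7.48, rounds up to 7.5.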

Timeline

Published: September 16, 2022
Last Modified: November 21, 2024
First Seen: September 16, 2022
