CVE-2022-41897: TensorFlow: OOB read in FractionalMaxPoolGrad causes DoS

Severity: HIGH | PoC available | CISA SSVC: Track*
Published November 18, 2022
CISO Take

A network-reachable, zero-auth crash in TensorFlow's fractional max pooling gradient allows any attacker to bring down TF-based serving infrastructure by sending oversized sequence inputs. Upgrade to TensorFlow 2.11, 2.10.1, 2.9.3, or 2.8.4 immediately. If patching is blocked, add input size validation at the API gateway layer to reject abnormally large pooling sequence arrays.

What is the risk?

High exploitability: CVSS 7.5 with AV:N/AC:L/PR:N/UI:N means any unauthenticated attacker on the network can trigger this with a single crafted request. Impact is limited to availability (no confidentiality or integrity loss), but a persistent crash loop against a production ML serving endpoint constitutes a full service outage. The CVE is not in CISA KEV and there was no evidence of active exploitation as of disclosure, but the GitHub advisory is tagged 'Exploit', suggesting that a PoC exists. Risk is highest for organizations exposing TensorFlow inference or training APIs directly to untrusted networks.

What systems are affected?

Package | Ecosystem | Vulnerable Range | Patched
TensorFlow | pip | before 2.8.4; 2.9.x before 2.9.3; 2.10.x before 2.10.1 | 2.8.4, 2.9.3, 2.10.1, 2.11.0

Do you use TensorFlow? You're affected.
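
To check a given installation against the fixed releases, a version comparison along the following lines may help. This is a minimal sketch: the helper names `parse_version` and `is_patched` are hypothetical, the fixed versions are the ones named in the advisory (2.8.4, 2.9.3, 2.10.1, 2.11.0), and pre-release suffixes such as `rc1` are ignored, so release candidates should be checked by hand.

```python
import re

# Minimum fixed release for each release line that received a cherry-pick.
PATCHED_MINIMUMS = {
    (2, 8): (2, 8, 4),
    (2, 9): (2, 9, 3),
    (2, 10): (2, 10, 1),
}
# The advisory states the fix ships in 2.11, so 2.11 and later lines are treated as fixed.
FIRST_FIXED_LINE = (2, 11)


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from strings such as '2.10.0'.

    Assumption: anything after the numeric part (e.g. 'rc1') is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_patched(version: str) -> bool:
    """Return True if the version is at or above the fixed release for its line."""
    parsed = parse_version(version)
    line = parsed[:2]
    if line >= FIRST_FIXED_LINE:
        return True
    minimum = PATCHED_MINIMUMS.get(line)
    # Release lines older than 2.8 received no fix and are reported as unpatched.
    return minimum is not None and parsed >= minimum
```

In an environment where TensorFlow is installed, the check could be run as `is_patched(tf.__version__)`.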

How severe is it?

CVSS 3.1: 7.5 / 10
EPSS: 0.4% chance of exploitation in 30 days (higher than 35% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Trivial
Exploitation Confidence: medium
CISA SSVC: Public PoC
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

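Attack Vector (AV): Network
Attack Complexity (AC): Low
Privileges Required (PR): None
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High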

What should I do?

  1. PATCH

    Upgrade to TensorFlow 2.11.0, 2.10.1, 2.9.3, or 2.8.4. Apply commit d71090c3e5ca325bdf4b02eb236cfb3ee823e927 if building from source.

  2. WORKAROUND

    Add server-side validation to reject requests where row_pooling_sequence or col_pooling_sequence exceed expected bounds before reaching TF ops.

  3. HARDEN

    Place TF Serving behind an API gateway that enforces payload size limits and schema validation on tensor inputs.

  4. ISOLATE

    Ensure model training and serving processes run with minimal privileges and in containerized environments to limit blast radius of a crash.

  5. DETECT

    Alert on unexpected TensorFlow process exits or restart loops — these are the primary exploitation signal.
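
The WORKAROUND step above could be sketched as a pre-flight check that runs before any request reaches the TensorFlow op. This is an illustration under stated assumptions, not the upstream fix: it assumes 4-D shapes in `[batch, rows, cols, channels]` order, assumes each pooling sequence lists region boundaries and therefore has one more entry than the corresponding output dimension, and assumes every boundary lies between 0 and the input dimension. The function names are hypothetical.

```python
from typing import Sequence


def validate_pooling_sequence(
    seq: Sequence[int], input_len: int, output_len: int, name: str
) -> None:
    """Raise ValueError if seq is not a plausible pooling sequence.

    Assumptions (not taken from the advisory): a valid sequence has
    output_len + 1 non-decreasing boundaries, all within [0, input_len].
    """
    if len(seq) != output_len + 1:
        raise ValueError(
            f"{name}: expected {output_len + 1} entries, got {len(seq)}"
        )
    for value in seq:
        if value < 0 or value > input_len:
            raise ValueError(f"{name}: boundary {value} outside [0, {input_len}]")
    for previous, current in zip(seq, seq[1:]):
        if current < previous:
            raise ValueError(f"{name}: boundaries must be non-decreasing")


def validate_request(
    orig_input_shape: Sequence[int],
    orig_output_shape: Sequence[int],
    row_pooling_sequence: Sequence[int],
    col_pooling_sequence: Sequence[int],
) -> None:
    """Reject pooling sequences that do not match the declared tensor shapes."""
    if len(orig_input_shape) != 4 or len(orig_output_shape) != 4:
        raise ValueError("expected 4-D [batch, rows, cols, channels] shapes")
    validate_pooling_sequence(
        row_pooling_sequence, orig_input_shape[1], orig_output_shape[1],
        "row_pooling_sequence",
    )
    validate_pooling_sequence(
        col_pooling_sequence, orig_input_shape[2], orig_output_shape[2],
        "col_pooling_sequence",
    )
```

A request handler would call `validate_request` first and return a client error on `ValueError` instead of forwarding the tensors. Rejecting on length alone already blocks the oversized sequences described in the advisory; the range and ordering checks are additional hardening.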

What does CISA's SSVC say?

Decision: Track*
Exploitation: PoC
Automatable: No
Technical Impact: Partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
  • Art. 17 - Quality management system
  • Art. 9 - Risk management system
ISO 42001
  • 6.1.2 - AI risk assessment
  • 8.4 - AI system operation and monitoring
NIST AI RMF
  • GOVERN 1.4 - Organizational teams are committed to transparency and accountability
  • MANAGE 2.2 - Mechanisms to sustain AI risk management
OWASP LLM Top 10
  • LLM05 - Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2022-41897?

CVE-2022-41897 is an out-of-bounds read (CWE-125) in TensorFlow's fractional max pooling gradient op. Passing oversized row_pooling_sequence or col_pooling_sequence inputs crashes the TensorFlow process, which lets an unauthenticated, network-reachable attacker take down TF-based training or serving infrastructure. The issue is fixed in TensorFlow 2.11, 2.10.1, 2.9.3, and 2.8.4.

Is CVE-2022-41897 actively exploited?

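No active exploitation has been reported: CVE-2022-41897 is not in CISA KEV and there was no evidence of exploitation as of disclosure. Proof-of-concept exploit code is publicly available, however, which increases the risk of exploitation.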

How to fix CVE-2022-41897?

  1. PATCH: Upgrade to TensorFlow 2.11.0, 2.10.1, 2.9.3, or 2.8.4. Apply commit d71090c3e5ca325bdf4b02eb236cfb3ee823e927 if building from source.
  2. WORKAROUND: Add server-side validation to reject requests where row_pooling_sequence or col_pooling_sequence exceed expected bounds before reaching TF ops.
  3. HARDEN: Place TF Serving behind an API gateway that enforces payload size limits and schema validation on tensor inputs.
  4. ISOLATE: Ensure model training and serving processes run with minimal privileges and in containerized environments to limit blast radius of a crash.
  5. DETECT: Alert on unexpected TensorFlow process exits or restart loops, which are the primary exploitation signal.

What systems are affected by CVE-2022-41897?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, inference endpoints.

What is the CVSS score for CVE-2022-41897?

CVE-2022-41897 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.44%.

What is the AI security impact?

Affected AI Architectures

training pipelines, model serving, inference endpoints

MITRE ATLAS Techniques

AML.T0029 Denial of AI Service
AML.T0034 Cost Harvesting
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Art. 17, Art. 9
ISO 42001: 6.1.2, 8.4
NIST AI RMF: GOVERN 1.4, MANAGE 2.2
OWASP LLM Top 10: LLM05

What are the technical details?

Original Advisory

TensorFlow is an open source platform for machine learning. If `FractionMaxPoolGrad` is given outsize inputs `row_pooling_sequence` and `col_pooling_sequence`, TensorFlow will crash. We have patched the issue in GitHub commit d71090c3e5ca325bdf4b02eb236cfb3ee823e927. The fix will be included in TensorFlow 2.11. We will also cherrypick this commit on TensorFlow 2.10.1, 2.9.3, and TensorFlow 2.8.4, as these are also affected and still in supported range.

Exploitation Scenario

An attacker identifies a publicly accessible TensorFlow Serving endpoint or a training API (e.g., a custom training loop exposed via REST). They craft a request that invokes the FractionalMaxPoolGrad op (spelled FractionMaxPoolGrad in the advisory text) with row_pooling_sequence or col_pooling_sequence arrays larger than the op expects. TensorFlow performs an out-of-bounds read (CWE-125) during the gradient computation, and the process crashes immediately. In an autoscaling environment, repeated requests can outpace the restart policy, resulting in sustained unavailability. In a single-node training environment, the attack terminates an active training job and destroys any checkpoint state that has not been persisted. No ML knowledge is required: the attacker only needs to know that the API accepts pooling gradient operations.
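
The restart loop in this scenario is also the signal that the DETECT step recommends alerting on. A minimal sliding-window sketch of such an alert follows; the class name, the threshold of three exits, and the five-minute window are illustrative assumptions, and in practice the exit events would come from a process supervisor or container orchestrator.

```python
from collections import deque


class CrashLoopDetector:
    """Flag a crash loop when too many unexpected exits fall inside a time window."""

    def __init__(self, max_exits: int = 3, window_seconds: float = 300.0):
        # Both defaults are illustrative, not values from the advisory.
        self.max_exits = max_exits
        self.window_seconds = window_seconds
        self._exit_times = deque()

    def record_exit(self, timestamp: float) -> bool:
        """Record one unexpected exit; return True if the alert threshold is reached."""
        self._exit_times.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop exits that are older than the window.
        while self._exit_times and self._exit_times[0] < cutoff:
            self._exit_times.popleft()
        return len(self._exit_times) >= self.max_exits
```

Each unexpected exit of the TensorFlow process is passed to `record_exit` with its timestamp in seconds; a `True` return is the point at which an alert would fire.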

Weaknesses (CWE)

CWE-125 — Out-of-bounds Read: The product reads data past the end, or before the beginning, of the intended buffer.

  • [Implementation] Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does. When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, "boat" may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as "red" or "blue." Do not rely exclusively on looking for malicious or malformed inputs. This is likely to miss at least one undesirable input, especially if the code's environment changes. This can give attackers enough room to bypass the intended validation. However, denylis…
  • [Architecture and Design] Use a language that provides appropriate memory abstractions.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
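
As a worked check of the 7.5 score, the sketch below recomputes the base score from this vector using the CVSS v3.1 base equations. It is deliberately limited: only the Scope: Unchanged case is handled, temporal and environmental metrics are ignored, and the function names are hypothetical. The metric weights and the round-up rule follow the CVSS v3.1 specification.

```python
import math

# CVSS v3.1 base metric weights (PR values are those for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score of a CVSS:3.1 vector with Scope: Unchanged."""
    prefix, *fields = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("only CVSS:3.1 vectors are supported")
    metrics = dict(field.split(":") for field in fields)
    if metrics["S"] != "U":
        raise NotImplementedError("this sketch handles Scope: Unchanged only")

    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))
```

For this CVE the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is 8.22 × 0.85 × 0.77 × 0.85 × 0.85 ≈ 3.89; their sum, about 7.48, rounds up to 7.5.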

Timeline

Published: November 18, 2022
Last Modified: November 21, 2024
First Seen: November 18, 2022

Related Vulnerabilities