CVE-2021-41196: TensorFlow: integer underflow crashes Keras pooling layers

MEDIUM PoC AVAILABLE
Published November 5, 2021
CISO Take

A local attacker with low privileges can crash TensorFlow ML services by passing a zero pool size or a negative dimension to Keras pooling layers, which triggers an integer underflow and a segfault. Any TF 2.4.x–2.6.x deployment that accepts user-controlled model configurations or layer parameters is at risk of availability disruption. Upgrade to TF 2.7.0 or the respective backport (2.6.1, 2.5.2, 2.4.4); until the upgrade lands, validate pool dimension inputs at service boundaries as a compensating control.

Risk Assessment

Medium operational risk. The CVSS score of 5.5 reflects limited scope: local access is required and there is no confidentiality or integrity impact. In practice, ML inference services that run as shared infrastructure or accept external model configs raise the availability risk, because a single malformed request crashes the entire TF process. There is no evidence of active exploitation and no CISA KEV listing. Priority: patch during the next maintenance window rather than as an emergency response.

Affected Systems

Package     Ecosystem  Vulnerable Range                        Patched
tensorflow  pip        2.4.x to 2.6.x before the fixed builds  2.4.4, 2.5.2, 2.6.1, 2.7.0


Severity & Risk

CVSS 3.1
5.5 / 10
EPSS
0.05%
chance of exploitation in 30 days
Higher than 15% of all CVEs
Exploitation Status
Exploit Available
Exploitation: MEDIUM
Sophistication
Trivial
Exploitation Confidence
medium
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

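Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High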

Recommended Action

  1. PATCH

    Upgrade to TensorFlow 2.7.0 or apply backports 2.6.1, 2.5.2, 2.4.4 (commit 12b1ff82b3f26ff8de17e58703231d5a02ef1b8b).

  2. INPUT VALIDATION

    Enforce that pool_size > 0 and all spatial dimensions are strictly positive before instantiating any Keras pooling layer—especially when layer configs come from user input or external files.

  3. ISOLATION

    Run ML inference workloads in separate processes or containers; a segfault should not cascade to unrelated services.

  4. DETECTION

    Monitor for unexpected TF process crashes (exit code 139/SIGSEGV) in serving logs; alert on repeated crashes from the same client or model definition source.

  5. DEPENDENCY AUDIT

    Scan model-serving Dockerfiles and requirements.txt for pinned TF versions in the affected range.
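
The input validation in step 2 could look roughly like the sketch below. It is a minimal illustration, not TensorFlow's own fix: the helper name `validate_pooling_kwargs` is hypothetical, and it assumes that layer arguments reach your service as a plain dictionary using the `pool_size` and `strides` keyword names of the Keras pooling layers.

```python
def _check_positive_ints(name, value):
    """Raise ValueError unless value is a positive int or a non-empty sequence of them."""
    if isinstance(value, int):
        items = (value,)
    else:
        try:
            items = tuple(value)
        except TypeError:
            raise ValueError(
                f"{name} must be an int or a sequence of ints, got {value!r}"
            ) from None
    if not items:
        raise ValueError(f"{name} must not be empty")
    for item in items:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(item, bool) or not isinstance(item, int):
            raise ValueError(f"{name} entries must be ints, got {item!r}")
        if item <= 0:
            raise ValueError(f"{name} entries must be strictly positive, got {item!r}")


def validate_pooling_kwargs(kwargs):
    """Return a copy of kwargs after checking pool_size and strides (hypothetical helper)."""
    for key in ("pool_size", "strides"):
        value = kwargs.get(key)
        if value is not None:  # None means "use the layer default"
            _check_positive_ints(key, value)
    return dict(kwargs)
```

A service would call the helper before constructing the layer, for example `tf.keras.layers.MaxPooling2D(**validate_pooling_kwargs(user_kwargs))`, so that a bad value surfaces as a `ValueError` in Python instead of a crash in native code.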

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, Robustness and Cybersecurity
ISO 42001
A.9.2 - AI System Operational Controls
NIST AI RMF
MANAGE 2.2 - Incident Response and Recovery
MAP 5.1 - Likelihood of AI Risks

Frequently Asked Questions

What is CVE-2021-41196?

CVE-2021-41196 is a denial-of-service vulnerability in TensorFlow's Keras pooling layers. A pool size of 0 or a negative dimension causes a segfault because the sliding-window values are not checked to be strictly positive. A local attacker with low privileges can use this to crash any TF 2.4.x–2.6.x service that accepts user-controlled model configurations or layer parameters. The fix ships in TF 2.7.0 and in the backports 2.6.1, 2.5.2, and 2.4.4.

Is CVE-2021-41196 actively exploited?

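No active exploitation has been reported, and CVE-2021-41196 is not listed in CISA KEV. Proof-of-concept exploit code is publicly available (indexed in trickest/cve), which increases the risk of exploitation.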

How to fix CVE-2021-41196?

1. PATCH: Upgrade to TensorFlow 2.7.0 or apply backports 2.6.1, 2.5.2, 2.4.4 (commit 12b1ff82b3f26ff8de17e58703231d5a02ef1b8b).
2. INPUT VALIDATION: Enforce that pool_size > 0 and all spatial dimensions are strictly positive before instantiating any Keras pooling layer, especially when layer configs come from user input or external files.
3. ISOLATION: Run ML inference workloads in separate processes or containers; a segfault should not cascade to unrelated services.
4. DETECTION: Monitor for unexpected TF process crashes (exit code 139/SIGSEGV) in serving logs; alert on repeated crashes from the same client or model definition source.
5. DEPENDENCY AUDIT: Scan model-serving Dockerfiles and requirements.txt for pinned TF versions in the affected range.
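
Isolation and detection (steps 3 and 4) can be combined by running the risky workload in a child process and classifying how it ended. The sketch below is one possible shape, assuming a POSIX host; `run_isolated` and `DEMO_CRASH` are hypothetical names, and the demo command only simulates a segfault by signalling itself rather than invoking TensorFlow.

```python
import signal
import subprocess
import sys
from typing import Sequence

# Demo command that kills itself with SIGSEGV, standing in for a crashing TF worker.
DEMO_CRASH = [
    sys.executable,
    "-c",
    "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)",
]


def run_isolated(argv: Sequence[str], timeout: float = 60.0) -> dict:
    """Run argv in a child process and report whether it died from a segfault."""
    proc = subprocess.run(list(argv), capture_output=True, timeout=timeout)
    code = proc.returncode
    # subprocess reports death by signal as a negative return code (-11 for SIGSEGV);
    # a shell wrapper reports the same event as 128 + signal number (139).
    segfault = code == -signal.SIGSEGV or code == 128 + signal.SIGSEGV
    return {"returncode": code, "segfault": segfault}
```

Because the parent process survives the crash, it can log the event with the client or model identifier and feed the alerting described in step 4.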

What systems are affected by CVE-2021-41196?

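TensorFlow 2.4.x through 2.6.x releases earlier than the fixed builds (2.4.4, 2.5.2, 2.6.1) are affected; 2.7.0 includes the fix. The vulnerability affects the following AI/ML architecture patterns: model serving, training pipelines, inference.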

What is the CVSS score for CVE-2021-41196?

CVE-2021-41196 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.05%.

Technical Details

NVD Description

TensorFlow is an open source platform for machine learning. In affected versions the Keras pooling layers can trigger a segfault if the size of the pool is 0 or if a dimension is negative. This is due to the TensorFlow's implementation of pooling operations where the values in the sliding window are not checked to be strictly positive. The fix will be included in TensorFlow 2.7.0. We will also cherrypick this commit on TensorFlow 2.6.1, TensorFlow 2.5.2, and TensorFlow 2.4.4, as these are also affected and still in supported range.
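
The fixed releases named in the description can be turned into a simple version check for dependency audits. The following is a sketch under stated assumptions: `classify` and `audit_requirements` are hypothetical helpers, only exact `tensorflow==X.Y.Z` pins are recognized, pre-release suffixes are ignored, and releases older than 2.4 are reported as "unknown" because the description does not cover them.

```python
import re

# Fixed patch release per supported release line, as listed in the description.
FIXED = {(2, 4): (2, 4, 4), (2, 5): (2, 5, 2), (2, 6): (2, 6, 1)}

# Matches exact pins such as "tensorflow==2.6.0"; other specifiers are skipped.
PIN = re.compile(r"^\s*tensorflow\s*==\s*([0-9][^\s;#]*)", re.IGNORECASE)


def parse_version(text):
    """Parse 'X.Y' or 'X.Y.Z' into a tuple of three ints; suffixes are ignored."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text.strip())
    if not match:
        raise ValueError(f"unrecognized version: {text!r}")
    return (int(match.group(1)), int(match.group(2)), int(match.group(3) or 0))


def classify(version):
    """Return 'affected', 'fixed', or 'unknown' for a TensorFlow version string."""
    parsed = parse_version(version)
    release_line = parsed[:2]
    if release_line in FIXED:
        return "affected" if parsed < FIXED[release_line] else "fixed"
    if parsed >= (2, 7, 0):
        return "fixed"
    return "unknown"  # older than 2.4: not covered by the description


def audit_requirements(lines):
    """Return (version, status) for every exact tensorflow pin found in lines."""
    findings = []
    for line in lines:
        match = PIN.match(line)
        if match:
            findings.append((match.group(1), classify(match.group(1))))
    return findings
```

Passing the lines of a `requirements.txt` to `audit_requirements` yields one entry per pinned TensorFlow version, which can then be filtered for the "affected" status.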

Exploitation Scenario

An attacker with access to an ML platform (e.g., a shared Jupyter environment, an internal model registry, or a model upload endpoint) submits a Keras model definition with pool_size=0 in a MaxPooling2D layer. When the model is loaded and a prediction is requested, TensorFlow iterates over the sliding window without validating that the pool size is positive, triggering an integer underflow (CWE-191). The resulting segfault crashes the TF Serving worker process or notebook kernel. In a multi-tenant ML platform, this disrupts inference for all users sharing that worker. The result is a targeted DoS against ML production services that requires no elevated privileges.
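
One way to stop this scenario at the upload boundary is to inspect the serialized model definition before it is ever loaded. The sketch below walks a parsed JSON document and flags pooling layers with non-positive sizes. It assumes the nested `class_name` / `config` layout that Keras model JSON typically uses; the function name `find_bad_pooling` and the path notation in its output are illustrative only.

```python
import json


def find_bad_pooling(node, path="$"):
    """Recursively collect pooling-layer settings that are not strictly positive ints."""
    problems = []
    if isinstance(node, dict):
        class_name = node.get("class_name")
        config = node.get("config")
        # Assumption: pooling layers carry "Pool" in class_name (e.g. MaxPooling2D).
        if isinstance(class_name, str) and "Pool" in class_name and isinstance(config, dict):
            for key in ("pool_size", "strides"):
                value = config.get(key)
                if value is None:
                    continue
                dims = value if isinstance(value, list) else [value]
                if any(
                    not isinstance(d, int) or isinstance(d, bool) or d <= 0
                    for d in dims
                ):
                    problems.append(f"{path}.config.{key}={value!r}")
        for key, child in node.items():
            problems += find_bad_pooling(child, f"{path}.{key}")
    elif isinstance(node, list):
        for index, child in enumerate(node):
            problems += find_bad_pooling(child, f"{path}[{index}]")
    return problems


def check_model_json(text):
    """Parse a model definition string and return the list of offending settings."""
    return find_bad_pooling(json.loads(text))
```

An upload handler could reject the request whenever `check_model_json` returns a non-empty list, so the malformed definition never reaches the TensorFlow runtime.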

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
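
The 5.5 base score can be reproduced from this vector with the CVSS v3.1 base-score equations. The sketch below handles only the Scope: Unchanged case that applies here; the metric weights are the published v3.1 values, while the function names are chosen for illustration.

```python
import math

# CVSS v3.1 metric weights (Privileges Required values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a CVSS:3.1 vector with Scope: Unchanged."""
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only covers Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 5.5
```

For this vector the impact sub-score is about 3.60 and the exploitability sub-score about 1.83; their sum of roughly 5.43 rounds up to 5.5.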

Timeline

Published
November 5, 2021
Last Modified
November 21, 2024
First Seen
November 5, 2021

Related Vulnerabilities