CVE-2022-29195: TensorFlow: StagePeek DoS via unvalidated index scalar

MEDIUM PoC AVAILABLE CISA: TRACK*
Published May 20, 2022
CISO Take

A local attacker with low privileges can crash a TensorFlow process by passing a non-scalar tensor as the index argument to tf.raw_ops.StagePeek, which triggers a CHECK failure. Upgrade to TF 2.9.0, 2.8.1, 2.7.2, or 2.6.4. Prioritize shared ML platforms and notebook environments where multiple users execute TF ops; the flaw is not remotely exploitable, so isolated single-tenant deployments can patch on their normal schedule rather than as an emergency.

What is the risk?

Effective risk is medium-low for most organizations. The local attack vector and low-privilege requirement limit exposure to insider threat scenarios or multi-tenant ML infrastructure (shared Jupyter environments, ML platforms, MLflow/Kubeflow clusters). In those contexts, a malicious or compromised user could deliberately disrupt co-tenants' training jobs or inference services — a meaningful operational risk. No remote exploitation path exists; CVSS 5.5 accurately reflects the constrained scope.

What systems are affected?

Package      Ecosystem   Vulnerable Range                             Patched
TensorFlow   pip         versions prior to 2.6.4, 2.7.2, 2.8.1, 2.9.0   2.6.4, 2.7.2, 2.8.1, 2.9.0

Do you use TensorFlow? You are affected if you run a version earlier than 2.6.4, 2.7.2, 2.8.1, or 2.9.0 on its respective release branch.
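As a rough aid for inventory work, the version check can be sketched in a few lines of Python. The helper name `is_patched` is hypothetical, the fixed versions are the ones listed in the advisory, and two simplifications are assumptions of this sketch: pre-release suffixes such as `rc0` are ignored, and branches older than 2.6 are treated as unpatched because no fix is listed for them.

```python
import re

# First fixed patch release per minor branch, as listed in the advisory.
FIXED_PATCH = {(2, 6): 4, (2, 7): 2, (2, 8): 1}


def is_patched(version: str) -> bool:
    """Return True if a TensorFlow version string is at or above a fixed release."""
    parts = version.split(".")
    major, minor = int(parts[0]), int(parts[1])
    # Assumption: strip suffixes like "0rc1" and compare only the numeric part.
    patch_text = parts[2] if len(parts) > 2 else "0"
    match = re.match(r"\d+", patch_text)
    patch = int(match.group()) if match else 0

    if (major, minor) >= (2, 9):
        return True  # 2.9.0 and later contain the fix
    if (major, minor) in FIXED_PATCH:
        return patch >= FIXED_PATCH[(major, minor)]
    return False  # assumption: older branches have no listed fix


if __name__ == "__main__":
    for candidate in ("2.6.3", "2.7.2", "2.8.0", "2.9.0"):
        print(candidate, "patched" if is_patched(candidate) else "VULNERABLE")
```

In practice the input string could come from `tensorflow.__version__` or from `pip show tensorflow` on each host.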

How severe is it?

CVSS 3.1
5.5 / 10
EPSS
0.3%
chance of exploitation in 30 days
Higher than 23% of all CVEs
Exploitation Status
Exploit Available
Exploitation: MEDIUM
Sophistication
Trivial
Exploitation Confidence
medium
CISA SSVC: Public PoC
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

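Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High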

What should I do?

5 steps
  1. Patch: Upgrade TensorFlow to 2.9.0, 2.8.1, 2.7.2, or 2.6.4 — official patches exist for all branches.

  2. Inventory: Identify all deployments running TF in shared/multi-tenant environments (MLflow, Kubeflow, JupyterHub, managed notebook services).

  3. Isolation: Enforce user-level process isolation on shared ML platforms to prevent a crash from affecting other tenants.

  4. Input validation: If operating an API that proxies TF raw ops, validate that index arguments are scalar before passing to StagePeek.

  5. Detection: Alert on unexpected TF process crashes (CHECK-failure logs contain 'StagePeek') in production inference or training infrastructure.
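The detection step can be sketched as a simple log filter. This is a minimal illustration, not a production detector: the function name `find_stagepeek_crashes` is hypothetical, and the marker strings are assumptions based on the step above (a CHECK-failure line that mentions 'StagePeek'). Confirm the exact wording against a crash log from your own environment before alerting on it.

```python
def find_stagepeek_crashes(lines, markers=("Check failed", "StagePeek")):
    """Return log lines that contain every marker string.

    `markers` is an assumption about what the crash log looks like;
    adjust it to match the output observed in your own deployment.
    """
    return [
        line.rstrip("\n")
        for line in lines
        if all(marker in line for marker in markers)
    ]
```

Because the function accepts any iterable of strings, it can be fed an open file handle (`with open(path) as fh: find_stagepeek_crashes(fh)`) or lines pulled from a log aggregator.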

What does CISA's SSVC say?

Decision Track*
Exploitation poc
Automatable No
Technical Impact partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity for high-risk AI systems
ISO 42001
6.1.2 - AI risk treatment: third-party component vulnerabilities
8.4 - AI system operation and monitoring
NIST AI RMF
GOVERN-6.1 - Policies for third-party AI risk
MANAGE-2.2 - Mechanisms to sustain effectiveness of risk treatments

Frequently Asked Questions

What is CVE-2022-29195?

CVE-2022-29195 is a denial-of-service vulnerability in TensorFlow's tf.raw_ops.StagePeek. The op assumes its index argument is a scalar without validating it, so a local, low-privileged user who passes a non-scalar tensor triggers a CHECK failure that crashes the process. It is fixed in TF 2.9.0, 2.8.1, 2.7.2, and 2.6.4.

Is CVE-2022-29195 actively exploited?

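CISA's SSVC assessment records the exploitation status as proof-of-concept, not active exploitation. Proof-of-concept exploit code is publicly available for CVE-2022-29195, which increases the risk of exploitation.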

How to fix CVE-2022-29195?

  1. Patch: Upgrade TensorFlow to 2.9.0, 2.8.1, 2.7.2, or 2.6.4; official patches exist for all branches.

  2. Inventory: Identify all deployments running TF in shared/multi-tenant environments (MLflow, Kubeflow, JupyterHub, managed notebook services).

  3. Isolation: Enforce user-level process isolation on shared ML platforms to prevent a crash from affecting other tenants.

  4. Input validation: If operating an API that proxies TF raw ops, validate that index arguments are scalar before passing to StagePeek.

  5. Detection: Alert on unexpected TF process crashes (CHECK-failure logs contain 'StagePeek') in production inference or training infrastructure.

What systems are affected by CVE-2022-29195?

This vulnerability affects the following AI/ML architecture patterns: training pipelines, model serving, shared ML platforms.

What is the CVSS score for CVE-2022-29195?

CVE-2022-29195 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.32%.

What is the AI security impact?

Affected AI Architectures

training pipelines, model serving, shared ML platforms

MITRE ATLAS Techniques

AML.T0029 Denial of AI Service
AML.T0034 Cost Harvesting
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 15
ISO 42001: 6.1.2, 8.4
NIST AI RMF: GOVERN-6.1, MANAGE-2.2

What are the technical details?

Original Advisory

TensorFlow is an open source platform for machine learning. Prior to versions 2.9.0, 2.8.1, 2.7.2, and 2.6.4, the implementation of `tf.raw_ops.StagePeek` does not fully validate the input arguments. This results in a `CHECK`-failure which can be used to trigger a denial of service attack. The code assumes `index` is a scalar but there is no validation for this before accessing its value. Versions 2.9.0, 2.8.1, 2.7.2, and 2.6.4 contain a patch for this issue.
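For services that forward caller-supplied values to this op, the missing check can be approximated on the calling side. The sketch below is an illustration under stated assumptions, not the upstream fix: `ensure_scalar_index` is a hypothetical helper, and it relies on array-like inputs exposing a `shape` attribute whose length is the rank, as NumPy arrays and eager `tf.Tensor` objects with a known rank do.

```python
def ensure_scalar_index(index):
    """Raise ValueError unless `index` is an integer scalar (rank 0).

    Hypothetical guard intended to run before a value is passed on as the
    `index` argument of tf.raw_ops.StagePeek.
    """
    shape = getattr(index, "shape", None)
    if shape is not None:
        # Array-like input: a scalar has an empty shape (rank 0).
        if len(shape) != 0:
            raise ValueError(f"index must be a scalar, got rank {len(shape)}")
        return
    # Plain Python value: accept ints, reject bools, lists, floats, etc.
    if isinstance(index, int) and not isinstance(index, bool):
        return
    raise ValueError(f"index must be an integer scalar, got {type(index).__name__}")
```

A proxy would call `ensure_scalar_index(value)` first and return an error to the caller on `ValueError`, so a malformed request is rejected instead of reaching the op. This does not replace upgrading to a patched release.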

Exploitation Scenario

An adversary with access to a shared ML training platform (e.g., through a compromised data scientist account, or as a malicious insider) submits a training notebook that calls tf.raw_ops.StagePeek with a multi-dimensional tensor as the index parameter instead of a scalar. Because the op reads the index value without first validating its shape, a CHECK fails and the TF worker process aborts. On a shared Kubeflow or SageMaker training cluster, this can disrupt co-located training jobs, forcing restarts and stalling data pipelines: a targeted denial of service against another tenant's long-running training run.

Weaknesses (CWE)

CWE-20 — Improper Input Validation: The product receives input or data, but it does not validate or incorrectly validates that the input has the properties that are required to process the data safely and correctly.

  • [Architecture and Design] Consider using language-theoretic security (LangSec) techniques that characterize inputs using a formal language and build "recognizers" for that language. This effectively requires parsing to be a distinct layer that effectively enforces a boundary between raw input and internal data representations, instead of allowing parser code to be scattered throughout the program, where it could be subject to errors or inconsistencies that create weaknesses. [REF-1109] [REF-1110] [REF-1111]
  • [Architecture and Design] Use an input validation framework such as Struts or the OWASP ESAPI Validation API. Note that using a framework does not automatically address all input validation problems; be mindful of weaknesses that could arise from misusing the framework itself (CWE-1173).

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
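To show how this vector yields the 5.5 base score, the CVSS v3.1 base equations can be worked through in code. This is a minimal sketch that handles only Scope: Unchanged vectors (which is what this CVE uses); the metric weights and the round-up rule follow the CVSS v3.1 specification, while the function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged values for PR).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a Scope: Unchanged CVSS v3.1 vector."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}

    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 5.5
```

For this vector the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is 8.22 × 0.55 × 0.77 × 0.62 × 0.85 ≈ 1.83; their sum, about 5.43, rounds up to 5.5.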

Timeline

Published
May 20, 2022
Last Modified
November 21, 2024
First Seen
May 20, 2022

Related Vulnerabilities