CVE-2023-25676: TensorFlow: NULL ptr deref DoS in ParallelConcat op

HIGH
Published March 25, 2023
CISO Take

A remotely-triggerable crash in TensorFlow's XLA path allows any unauthenticated client to bring down a TF inference service with a single malformed request. No code execution, but a 100% availability kill against exposed ML endpoints. Patch to TF 2.11.1 or 2.12.0 immediately; if upgrade is blocked, disable XLA JIT compilation as a stopgap.

What is the risk?

CVSS 7.5 is accurate for availability-only impact. Real risk is higher in practice: ML inference APIs are commonly deployed internally with no authentication, making 'network-accessible, no-auth, no-interaction' a realistic attack surface. The crash is deterministic and trivially reproducible — a single crafted request kills the process. No exploitation sophistication required. Not in KEV and no known active exploitation, so risk is elevated for exposed services but lower for well-segmented deployments.

What systems are affected?

Package Ecosystem Vulnerable Range Patched
TensorFlow pip No patch
195.8K OpenSSF 7.1 3.7K dependents Pushed 3d ago 4% patched ~1372d to patch Full package profile →

Do you use TensorFlow? You're affected.

How severe is it?

CVSS 3.1
7.5 / 10
EPSS
0.4%
chance of exploitation in 30 days
Higher than 31% of all CVEs
Exploitation Status
No known exploitation
Sophistication
Trivial

What is the attack surface?

AV AC PR UI S C I A
AV Network
AC Low
PR None
UI None
S Unchanged
C None
I None
A High

What should I do?

5 steps
  1. PATCH

    Upgrade TensorFlow to 2.12.0 (all users) or 2.11.1 (LTS track). Commit da66bc6d5ff466aee084f9e7397980a24890cd15 is the fix.

  2. WORKAROUND (if upgrade blocked): Disable XLA with TF_XLA_FLAGS=--tf_xla_auto_jit=0 environment variable or avoid tf.function(jit_compile=True).

  3. INPUT VALIDATION

    Validate shape tensors at API boundary — reject any shape input with rank == 0 before passing to TF ops.

  4. DEFENSE IN DEPTH

    Place TF inference endpoints behind authenticated API gateways; never expose raw TF Serving gRPC/HTTP ports to untrusted networks.

  5. DETECTION

    Alert on unexpected TensorFlow process restarts or segfault entries in system logs (dmesg, container crash loops).

What does CISA's SSVC say?

Decision Track
Exploitation none
Automatable Yes
Technical Impact partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

How is it classified?

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Art.15 - Accuracy, robustness and cybersecurity
ISO 42001
A.6.2.6 - AI system availability and resilience
NIST AI RMF
GOVERN 1.7 - Processes for decommissioning and vulnerability response MANAGE 2.4 - Residual risks and documented response plans

Frequently Asked Questions

What is CVE-2023-25676?

A remotely-triggerable crash in TensorFlow's XLA path allows any unauthenticated client to bring down a TF inference service with a single malformed request. No code execution, but a 100% availability kill against exposed ML endpoints. Patch to TF 2.11.1 or 2.12.0 immediately; if upgrade is blocked, disable XLA JIT compilation as a stopgap.

Is CVE-2023-25676 actively exploited?

No confirmed active exploitation of CVE-2023-25676 has been reported, but organizations should still patch proactively.

How to fix CVE-2023-25676?

1. PATCH: Upgrade TensorFlow to 2.12.0 (all users) or 2.11.1 (LTS track). Commit da66bc6d5ff466aee084f9e7397980a24890cd15 is the fix. 2. WORKAROUND (if upgrade blocked): Disable XLA with `TF_XLA_FLAGS=--tf_xla_auto_jit=0` environment variable or avoid `tf.function(jit_compile=True)`. 3. INPUT VALIDATION: Validate shape tensors at API boundary — reject any shape input with rank == 0 before passing to TF ops. 4. DEFENSE IN DEPTH: Place TF inference endpoints behind authenticated API gateways; never expose raw TF Serving gRPC/HTTP ports to untrusted networks. 5. DETECTION: Alert on unexpected TensorFlow process restarts or segfault entries in system logs (`dmesg`, container crash loops).

What systems are affected by CVE-2023-25676?

This vulnerability affects the following AI/ML architecture patterns: model serving, training pipelines, inference APIs.

What is the CVSS score for CVE-2023-25676?

CVE-2023-25676 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.39%.

What is the AI security impact?

Affected AI Architectures

model servingtraining pipelinesinference APIs

MITRE ATLAS Techniques

AML.T0029 Denial of AI Service
AML.T0034 Cost Harvesting
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Art.15
ISO 42001: A.6.2.6
NIST AI RMF: GOVERN 1.7, MANAGE 2.4

What are the technical details?

Original Advisory

TensorFlow is an open source machine learning platform. When running versions prior to 2.12.0 and 2.11.1 with XLA, `tf.raw_ops.ParallelConcat` segfaults with a nullptr dereference when given a parameter `shape` with rank that is not greater than zero. A fix is available in TensorFlow 2.12.0 and 2.11.1.

Exploitation Scenario

An attacker targeting an organization's ML inference API (e.g., an internal TF Serving endpoint for an AI feature) discovers the service is running TensorFlow < 2.12.0 with XLA enabled. They craft a gRPC predict request that invokes `ParallelConcat` with a shape tensor of rank 0 (a scalar). The TF XLA kernel dereferences a null pointer, segfaults, and the serving process dies. In a Kubernetes deployment, the pod restarts in ~30 seconds — the attacker scripts this to send one request per 25 seconds, creating a continuous DoS that keeps the AI feature offline indefinitely. No credentials required; a single API probe reveals the crash behavior.

Weaknesses (CWE)

CWE-476 — NULL Pointer Dereference: The product dereferences a pointer that it expects to be valid but is NULL.

  • [Implementation] For any pointers that could have been modified or provided from a function that can return NULL, check the pointer for NULL before use. When working with a multithreaded or otherwise asynchronous environment, ensure that proper locking APIs are used to lock before the check, and unlock when it has finished [REF-1484].
  • [Requirements] Select a programming language that is not susceptible to these issues.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H

Timeline

Published
March 25, 2023
Last Modified
November 21, 2024
First Seen
March 25, 2023

Related Vulnerabilities