CVE-2023-25673: TensorFlow: FPE in TensorListSplit (XLA) remote DoS

HIGH
Published March 25, 2023
CISO Take

A remotely exploitable floating point exception in TensorFlow's XLA-compiled TensorListSplit operation allows unauthenticated attackers to crash a TensorFlow serving endpoint that feeds user-controlled tensor inputs into XLA-compiled graphs; no privileges or user interaction are required. If your model serving infrastructure runs a TensorFlow version earlier than 2.11.1 with XLA enabled, your production inference APIs are at risk of downtime (the fix ships in 2.11.1 and 2.12.0). Patch immediately, or disable XLA compilation as a temporary workaround.

What is the risk?

High severity for AI/ML production environments. CVSS 7.5 reflects unauthenticated network-accessible DoS with low attack complexity. Internet-facing TF Serving APIs, REST/gRPC model endpoints, and any pipeline that feeds user-controlled data into XLA-compiled graphs are directly exposed. The availability-only impact limits blast radius, but continuous exploitation can take down inference infrastructure with trivial automation. Severity is elevated in regulated environments where model availability underpins compliance SLAs.

What systems are affected?

Package      Ecosystem   Vulnerable Range   Patched
TensorFlow   pip         < 2.11.1           2.11.1, 2.12.0

Do you use TensorFlow? You are affected if you run a version earlier than 2.11.1 with XLA compilation enabled.

How severe is it?

CVSS 3.1
7.5 / 10
EPSS
0.4%
chance of exploitation in 30 days
Higher than 31% of all CVEs
Exploitation Status
No known exploitation
Sophistication
Moderate

What is the attack surface?

AV Network
AC Low
PR None
UI None
S Unchanged
C None
I None
A High

What should I do?

6 steps
  1. PATCH

    Upgrade to TensorFlow 2.12.0 or 2.11.1 — the only definitive fix per the vendor advisory.

  2. WORKAROUND

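    Disable XLA JIT compilation via tf.config.optimizer.set_jit(False) or the TF_XLA_FLAGS environment variable until patching is feasible. Note that set_jit(False) only turns off automatic clustering; functions explicitly marked with tf.function(jit_compile=True) are still compiled by XLA and need that argument removed.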

  3. INPUT VALIDATION

    Add server-side tensor shape and type validation before feeding inputs to XLA-compiled graphs — reject malformed or unexpected tensor list dimensions at the API boundary.

  4. RATE LIMITING

    Apply request rate limits and circuit breakers on model serving endpoints to reduce DoS impact.

  5. DETECTION

    Monitor for abnormal process crashes or restarts in TF Serving containers; alert on SIGFPE signals in model serving processes.

  6. ISOLATION

    Run model serving in isolated containers/processes with automatic restart policies to minimize downtime during exploitation attempts.
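The input validation step above can be sketched in plain Python at the API boundary, before a payload is converted to tensors. The advisory does not say which input triggers the exception, so this is only a generic shape gate under assumptions: the function names, the `expected_shape` convention (None meaning "any size") and the `max_dim` bound are all hypothetical and would need to match your model's real signature.

```python
from numbers import Real


def infer_shape(value):
    """Return the shape of a rectangular nested list of numbers, or raise ValueError."""
    # bool is a subclass of int, so reject it explicitly before the numeric check.
    if isinstance(value, bool) or not isinstance(value, (list, Real)):
        raise ValueError(f"unsupported element type: {type(value).__name__}")
    if isinstance(value, Real):
        return ()
    if not value:
        return (0,)
    child_shapes = [infer_shape(item) for item in value]
    if any(shape != child_shapes[0] for shape in child_shapes):
        raise ValueError("ragged input: sub-lists have different shapes")
    return (len(value),) + child_shapes[0]


def validate_input(value, expected_shape, max_dim=4096):
    """Reject payloads whose shape does not match expected_shape.

    A None entry in expected_shape accepts any size from 1 to max_dim
    (assumption: zero-sized dimensions are never legitimate for this model).
    """
    shape = infer_shape(value)
    if len(shape) != len(expected_shape):
        raise ValueError(f"expected rank {len(expected_shape)}, got rank {len(shape)}")
    for axis, (actual, expected) in enumerate(zip(shape, expected_shape)):
        if actual < 1 or actual > max_dim:
            raise ValueError(f"axis {axis} has size {actual}, outside 1..{max_dim}")
        if expected is not None and actual != expected:
            raise ValueError(f"axis {axis} has size {actual}, expected {expected}")
    return shape
```

A request handler would call `validate_input(payload, (None, 2))` and return an HTTP 400 on ValueError instead of forwarding the payload to the model. Validation of this kind narrows the attack surface but does not replace the patch.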

What does CISA's SSVC say?

Decision Track
Exploitation none
Automatable Yes
Technical Impact partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Article 9 - Risk Management System
ISO 42001
8.4 - AI System Lifecycle — Vulnerability Management
NIST AI RMF
GOVERN 6.1 - Policies for third-party AI software and dependencies
MANAGE 2.2 - Mechanisms to sustain AI system performance and availability

Frequently Asked Questions

What is CVE-2023-25673?

A remotely exploitable floating point exception in TensorFlow's XLA-compiled TensorListSplit operation allows unauthenticated attackers to crash a TensorFlow serving endpoint that feeds user-controlled tensor inputs into XLA-compiled graphs; no privileges or user interaction are required. If your model serving infrastructure runs a TensorFlow version earlier than 2.11.1 with XLA enabled, your production inference APIs are at risk of downtime (the fix ships in 2.11.1 and 2.12.0). Patch immediately, or disable XLA compilation as a temporary workaround.

Is CVE-2023-25673 actively exploited?

No confirmed active exploitation of CVE-2023-25673 has been reported, but organizations should still patch proactively.

How to fix CVE-2023-25673?

  1. PATCH: Upgrade to TensorFlow 2.12.0 or 2.11.1, the only definitive fix per the vendor advisory.
  2. WORKAROUND: Disable XLA JIT compilation via tf.config.optimizer.set_jit(False) or TF_XLA_FLAGS env var until patching is feasible.
  3. INPUT VALIDATION: Add server-side tensor shape and type validation before feeding inputs to XLA-compiled graphs; reject malformed or unexpected tensor list dimensions at the API boundary.
  4. RATE LIMITING: Apply request rate limits and circuit breakers on model serving endpoints to reduce DoS impact.
  5. DETECTION: Monitor for abnormal process crashes or restarts in TF Serving containers; alert on SIGFPE signals in model serving processes.
  6. ISOLATION: Run model serving in isolated containers/processes with automatic restart policies to minimize downtime during exploitation attempts.
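The rate-limiting step can be illustrated with a token bucket placed in front of the model endpoint. This is a minimal, single-threaded sketch, not a production limiter: the class name, the `rate` and `capacity` parameters and the injectable `clock` are assumptions made for illustration.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        """Return True and consume one token if a request may proceed."""
        now = self.clock()
        # Refill in proportion to the elapsed time, never beyond the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A gateway would keep one bucket per client identity and answer HTTP 429 whenever `allow()` returns False. Because a single crafted request is enough to crash a worker in the scenario this page describes, rate limiting only slows repeated crashes; it does not prevent the first one.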

What systems are affected by CVE-2023-25673?

This vulnerability affects the following AI/ML architecture patterns: model serving, training pipelines, inference endpoints, TPU/XLA-accelerated workloads.

What is the CVSS score for CVE-2023-25673?

CVE-2023-25673 has a CVSS v3.1 base score of 7.5 (HIGH). The EPSS exploitation probability is 0.40%.

What is the AI security impact?

Affected AI Architectures

model serving, training pipelines, inference endpoints, TPU/XLA-accelerated workloads

MITRE ATLAS Techniques

AML.T0010.001 AI Software
AML.T0029 Denial of AI Service
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 9
ISO 42001: 8.4
NIST AI RMF: GOVERN 6.1, MANAGE 2.2

What are the technical details?

Original Advisory

TensorFlow is an open source platform for machine learning. Versions prior to 2.12.0 and 2.11.1 have a Floating Point Exception in TensorListSplit with XLA. A fix is included in TensorFlow version 2.12.0 and version 2.11.1.
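The version boundary in the advisory can be turned into a simple inventory check. The sketch below is an assumption-laden illustration: the function names are hypothetical, only the leading MAJOR.MINOR.PATCH numbers are compared, and pre-release suffixes such as release candidates are ignored, so those builds should be reviewed by hand.

```python
import re
from importlib import metadata

# First release line carrying the fix according to the advisory (2.11.1; 2.12.0 also has it).
PATCHED_MINIMUM = (2, 11, 1)


def parse_release(version):
    """Extract (major, minor, patch) from a version string such as '2.11.0'."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version):
    """Return True if the version predates the patched releases."""
    return parse_release(version) < PATCHED_MINIMUM


def installed_tensorflow_is_affected():
    """Check the locally installed 'tensorflow' distribution; None if it is not installed."""
    try:
        return is_affected(metadata.version("tensorflow"))
    except metadata.PackageNotFoundError:
        return None
```

Note that the check only looks at the distribution named `tensorflow`; variant distributions published under other names would need their own lookup.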

Exploitation Scenario

An attacker identifies a publicly accessible TensorFlow Serving REST API endpoint (common in MLOps platforms, internal AI services, or SaaS products built on TF). Using knowledge of the target model's input schema, obtainable from the model metadata endpoint (/v1/models/<model_name>/metadata) or from error messages, the attacker crafts a request containing a TensorList with edge-case numeric parameters that trigger a division by zero or invalid comparison in the XLA TensorListSplit operation. The resulting SIGFPE kills the serving worker process. The attacker automates the request with a simple curl or Python script and re-sends it each time the server restarts, achieving sustained DoS. No authentication is required, and no ML expertise beyond basic knowledge of the TF API.
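On the defensive side, the crash in this scenario shows up to a supervisor as a process terminated by SIGFPE. The sketch below shows how a POSIX supervisor written in Python could tell that apart from an ordinary exit; `classify_exit` and `DEMO_CHILD` are hypothetical names, and the demo child is a stand-in that merely signals itself, not a TensorFlow process.

```python
import signal
import subprocess
import sys

# Stand-in for a crashing worker: a child interpreter that sends SIGFPE to itself.
DEMO_CHILD = [sys.executable, "-c", "import os, signal; os.kill(os.getpid(), signal.SIGFPE)"]


def classify_exit(cmd):
    """Run cmd to completion and report ('signal', name) or ('exit', code)."""
    proc = subprocess.run(cmd)
    # On POSIX, subprocess reports death by signal N as returncode == -N.
    if proc.returncode < 0:
        try:
            name = signal.Signals(-proc.returncode).name
        except ValueError:
            name = f"signal {-proc.returncode}"
        return ("signal", name)
    return ("exit", proc.returncode)
```

A supervisor could raise an alert whenever the result is `("signal", "SIGFPE")`. Shells and many container runtimes conventionally report the same event as exit status 128 plus the signal number, which for SIGFPE (signal 8 on Linux) is 136, so that value is worth watching for in container restart logs as well.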

Weaknesses (CWE)

CWE-697 — Incorrect Comparison: The product compares two entities in a security-relevant context, but the comparison is incorrect.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
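To show how this vector maps to the 7.5 base score, the following sketch implements the CVSS v3.1 base-score equations with the metric weights from the FIRST specification. It covers base metrics only and does no validation beyond the version prefix; the function names are illustrative.

```python
import math

AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """CVSS v3.1 Roundup: smallest one-decimal number greater than or equal to value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the CVSS v3.1 base score from a vector string."""
    prefix, *pairs = vector.split("/")
    if prefix != "CVSS:3.1":
        raise ValueError("only CVSS:3.1 vectors are supported")
    m = dict(pair.split(":") for pair in pairs)
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    # Impact sub-score: combine confidentiality, integrity and availability impact.
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))
```

For the vector above, the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is about 3.89; their sum of roughly 7.48 rounds up to 7.5. The whole score comes from the availability metric, since C and I are both None.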

Timeline

Published
March 25, 2023
Last Modified
November 21, 2024
First Seen
March 25, 2023

Related Vulnerabilities