CVE-2020-15202: TensorFlow: Shard API int truncation enables memory corruption

CRITICAL PoC AVAILABLE
Published September 25, 2020
CISO Take

Any TensorFlow deployment on versions prior to 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1 is vulnerable to memory corruption attacks via crafted large-batch inputs to parallelized operations. Patch immediately — this affects both training infrastructure and inference serving endpoints. If immediate patching is not possible, restrict network access to TensorFlow Serving endpoints and audit for anomalous large-batch requests.

Risk Assessment

CVSS 9.0 with network vector and no privilege requirement makes this high-priority despite the high attack complexity rating. The S:C (scope changed) flag indicates blast radius extends beyond the TF process itself. In AI/ML infrastructure, memory corruption vulnerabilities are particularly dangerous because silent data corruption during training can produce subtly compromised models without triggering obvious failures — making detection difficult and impact potentially long-tailed. Organizations running TF Serving exposed to untrusted networks face the highest risk.

Affected Systems

Package      Ecosystem   Vulnerable Range   Patched
tensorflow   pip         -                  No patch
tensorflow   leap        -                  No patch

Severity & Risk

CVSS 3.1: 9.0 / 10
EPSS: 0.5% chance of exploitation in 30 days (higher than 66% of all CVEs)
Exploitation Status: Exploit Available; exploitation likelihood MEDIUM
Sophistication: Moderate
Exploitation Confidence: Medium (public PoC indexed in trickest/cve)
Composite signal derived from CISA KEV, CISA SSVC, EPSS, trickest/cve, and Nuclei templates.

Attack Surface

Attack Vector (AV): Network
Attack Complexity (AC): High
Privileges Required (PR): None
User Interaction (UI): None
Scope (S): Changed
Confidentiality (C): High
Integrity (I): High
Availability (A): High

Recommended Action

5 steps
  1. PATCH

    Upgrade to TF 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1 (whichever matches your minor line) immediately. Verify the installed version via `pip show tensorflow`.

  2. NETWORK CONTROLS

    Place TF Serving behind an API gateway with input validation; enforce maximum batch-size limits at the gateway layer.

  3. DETECTION

    Monitor for anomalous large-batch requests (unusually high tensor dimensions), segfault/OOM logs in TF Serving processes, and unexpected process crashes.

  4. CONTAINERIZATION

    Ensure TF workloads run in isolated containers with seccomp profiles to limit damage from memory corruption primitives.

  5. INVENTORY

    Enumerate all TF versions in use across training pipelines, inference servers, and notebook environments — ML teams often run pinned older versions.
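Steps 1 and 5 can be sketched as a small version check for inventory scripts. This is a hypothetical helper, not part of any official tooling; it assumes plain `major.minor.patch` version strings (release candidates and dev builds would need extra handling):

```python
# Minimal sketch: flag TensorFlow versions vulnerable to CVE-2020-15202.
# First patched release on each affected minor line, per the advisory.
PATCHED = {
    (1, 15): (1, 15, 4),
    (2, 0): (2, 0, 3),
    (2, 1): (2, 1, 2),
    (2, 2): (2, 2, 1),
    (2, 3): (2, 3, 1),
}

def is_vulnerable(version: str) -> bool:
    """Return True if this TensorFlow version predates its patched release."""
    parts = tuple(int(p) for p in version.split("."))
    line = parts[:2]
    if line in PATCHED:
        return parts < PATCHED[line]
    # Lines before 1.15 never received the fix; 2.4+ ship with it.
    return parts < (1, 15)

print(is_vulnerable("2.2.0"))   # vulnerable minor line, pre-patch
print(is_vulnerable("2.3.1"))   # first patched release on the 2.3 line
```

Feeding this the output of `pip show tensorflow` across environments gives a quick pass over pinned older versions.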

Classification

Compliance Impact

This CVE is relevant to:

EU AI Act
Article 9 - Risk Management System
ISO 42001
A.6.2.6 - AI system updates and maintenance
A.9.2 - Information security in AI system development
NIST AI RMF
GOVERN-1.7 - Processes and procedures are in place for decommissioning and phasing out AI systems
MANAGE-2.2 - Mechanisms are in place and applied to sustain the value of deployed AI systems
OWASP LLM Top 10
LLM03:2025 - Supply Chain
LLM05:2025 - Improper Output Handling

Frequently Asked Questions

What is CVE-2020-15202?

CVE-2020-15202 is a critical integer-truncation flaw in TensorFlow's `Shard` API: when the amount of work to be parallelized is large enough, int64 bounds are truncated to int32, leading to out-of-bounds reads and writes and memory corruption. Any deployment on versions prior to 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1 is vulnerable via crafted large-batch inputs to parallelized operations, affecting both training infrastructure and inference serving endpoints. Patch immediately; if immediate patching is not possible, restrict network access to TensorFlow Serving endpoints and audit for anomalous large-batch requests.

Is CVE-2020-15202 actively exploited?

Proof-of-concept exploit code is publicly available for CVE-2020-15202, increasing the risk of exploitation.

How to fix CVE-2020-15202?

1. PATCH: Upgrade to TF 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1 immediately. Verify via `pip show tensorflow`. 2. NETWORK CONTROLS: Place TF Serving behind an API gateway with input validation; enforce maximum batch-size limits at the gateway layer. 3. DETECTION: Monitor for anomalous large-batch requests (unusually high tensor dimensions), segfault/OOM logs in TF Serving processes, and unexpected process crashes. 4. CONTAINERIZATION: Run TF workloads in isolated containers with seccomp profiles to limit damage from memory corruption primitives. 5. INVENTORY: Enumerate all TF versions in use across training pipelines, inference servers, and notebook environments; ML teams often run pinned older versions.

What systems are affected by CVE-2020-15202?

This vulnerability affects the following AI/ML architecture patterns: model serving, training pipelines, distributed training, ML notebook environments, batch inference pipelines.

What is the CVSS score for CVE-2020-15202?

CVE-2020-15202 has a CVSS v3.1 base score of 9.0 (CRITICAL). The EPSS exploitation probability is 0.50%.

Technical Details

NVD Description

In Tensorflow before versions 1.15.4, 2.0.3, 2.1.2, 2.2.1 and 2.3.1, the `Shard` API in TensorFlow expects the last argument to be a function taking two `int64` (i.e., `long long`) arguments. However, there are several places in TensorFlow where a lambda taking `int` or `int32` arguments is being used. In these cases, if the amount of work to be parallelized is large enough, integer truncation occurs. Depending on how the two arguments of the lambda are used, this can result in segfaults, read/write outside of heap allocated arrays, stack overflows, or data corruption. The issue is patched in commits 27b417360cbd671ef55915e4bb6bb06af8b8a832 and ca8c013b5e97b1373b3bb1c97ea655e69f31a575, and is released in TensorFlow versions 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1.
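The truncation described above can be illustrated outside of TensorFlow. This sketch uses Python's `ctypes` to mimic what happens when an `int64` work bound is passed through a 32-bit `int` lambda parameter; the values are illustrative, not taken from a real exploit:

```python
import ctypes

# An int64 work count large enough to overflow a 32-bit int, as could arise
# when Shard parallelizes a very large batch (illustrative values only).
total_work = 2**32 + 5           # fits comfortably in int64

# Forcing it through a 32-bit parameter silently wraps, as in C++.
truncated = ctypes.c_int32(total_work).value
print(total_work, "->", truncated)   # 4294967301 -> 5

# A bound like 2**31 even turns negative, so loop limits and index
# arithmetic derived from it can land before or past the allocated buffer.
print(ctypes.c_int32(2**31).value)   # -2147483648
```

Whether the result is a segfault, an out-of-bounds read/write, or silent data corruption depends on how the truncated bounds are used inside the parallelized lambda, which matches the range of outcomes listed in the NVD description.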

Exploitation Scenario

An adversary targeting an organization's ML inference API crafts a prediction request with extremely large batch dimensions — sufficient to cause the int64-to-int32 truncation in the Shard API's lambda callbacks. When TensorFlow parallelizes this workload across threads, the truncated iteration bounds cause reads or writes outside heap-allocated buffers. In a worst-case scenario against TF Serving: the attacker achieves RCE within the serving container, enabling exfiltration of proprietary model weights, poisoning of serving responses, or lateral movement into ML training infrastructure. Alternatively, repeated exploitation causes serving process crashes, degrading inference availability for downstream AI-powered products.
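The gateway-level mitigation from the recommended actions can be sketched as a pre-dispatch check on request tensor shapes. The limits and request format here are assumptions for illustration, not a drop-in TF Serving component:

```python
# Sketch of a gateway-side guard that rejects suspiciously large batches
# before they reach TF Serving. Limits are illustrative assumptions and
# should be tuned to the model's legitimate input shapes.
MAX_BATCH = 1024
MAX_ELEMENTS = 2**24   # total elements per tensor, far below the int32 range

def validate_request(shapes):
    """shapes: list of per-tensor dimension lists, e.g. [[32, 224, 224, 3]]."""
    for dims in shapes:
        if any(d <= 0 for d in dims):
            return False, "non-positive dimension"
        if dims[0] > MAX_BATCH:
            return False, f"batch {dims[0]} exceeds limit {MAX_BATCH}"
        total = 1
        for d in dims:
            total *= d
        if total > MAX_ELEMENTS:
            return False, f"tensor of {total} elements exceeds element limit"
    return True, "ok"

print(validate_request([[32, 224, 224, 3]]))       # normal image batch: accepted
print(validate_request([[10**6, 224, 224, 3]]))    # oversized batch: rejected
```

Capping total element counts, not just the batch dimension, matters here: the truncation is driven by the overall amount of parallelized work, which any sufficiently large dimension can inflate.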

Weaknesses (CWE)

CVSS Vector

CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:C/C:H/I:H/A:H

Timeline

Published
September 25, 2020
Last Modified
November 21, 2024
First Seen
September 25, 2020

Related Vulnerabilities