CVE-2020-15202: TensorFlow: Shard API int truncation enables memory corruption

Severity: CRITICAL | PoC available
Published September 25, 2020
CISO Take

Any TensorFlow deployment on versions prior to 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1 is vulnerable to memory corruption attacks via crafted large-batch inputs to parallelized operations. Patch immediately — this affects both training infrastructure and inference serving endpoints. If immediate patching is not possible, restrict network access to TensorFlow Serving endpoints and audit for anomalous large-batch requests.

What is the risk?

CVSS 9.0 with network vector and no privilege requirement makes this high-priority despite the high attack complexity rating. The S:C (scope changed) flag indicates blast radius extends beyond the TF process itself. In AI/ML infrastructure, memory corruption vulnerabilities are particularly dangerous because silent data corruption during training can produce subtly compromised models without triggering obvious failures — making detection difficult and impact potentially long-tailed. Organizations running TF Serving exposed to untrusted networks face the highest risk.

What systems are affected?

Package      Ecosystem   Vulnerable Range                             Patched
TensorFlow   pip         before 1.15.4, 2.0.3, 2.1.2, 2.2.1, 2.3.1    1.15.4, 2.0.3, 2.1.2, 2.2.1, 2.3.1
leap         -           -                                            No patch

How severe is it?

CVSS 3.1: 9.0 / 10
EPSS: 1.2% chance of exploitation in 30 days (higher than 65% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Moderate
Exploitation Confidence: medium (public PoC indexed in trickest/cve)

Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

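Attack Vector (AV): Network
Attack Complexity (AC): High
Privileges Required (PR): None
User Interaction (UI): None
Scope (S): Changed
Confidentiality (C): High
Integrity (I): High
Availability (A): High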

What should I do?

5 steps
  1. PATCH

    Upgrade to TF 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1 immediately. Verify via 'pip show tensorflow'.

  2. NETWORK CONTROLS

    Place TF Serving behind API gateway with input validation; enforce maximum batch size limits at the gateway layer.

  3. DETECTION

    Monitor for anomalous large-batch requests (unusually high tensor dimensions), segfault/OOM logs in TF Serving processes, and unexpected process crashes.

  4. CONTAINERIZATION

    Ensure TF workloads run in isolated containers with seccomp profiles to limit damage from memory corruption primitives.

  5. INVENTORY

    Enumerate all TF versions in use across training pipelines, inference servers, and notebook environments — ML teams often run pinned older versions.
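
The PATCH and INVENTORY steps can be partly automated. The following is a minimal sketch, not an official tool: `PATCHED` and `is_patched` are hypothetical names, the patched versions are taken from the advisory text, and the treatment of release lines newer than 2.3 as fixed is an assumption.

```python
import re

# Patched releases named in the advisory, keyed by (major, minor) release line.
PATCHED = {(1, 15): 4, (2, 0): 3, (2, 1): 2, (2, 2): 1, (2, 3): 1}


def is_patched(version: str) -> bool:
    """Return True if `version` is at or beyond the patched release for its line."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(group) for group in match.groups())
    line = (major, minor)
    if line in PATCHED:
        return patch >= PATCHED[line]
    # Assumption: release lines newer than 2.3 contain the fix, while
    # lines older than 1.15 never received a patched release.
    return line > (2, 3)


if __name__ == "__main__":
    from importlib import metadata

    try:
        installed = metadata.version("tensorflow")
        status = "patched" if is_patched(installed) else "VULNERABLE"
        print(f"tensorflow {installed}: {status}")
    except metadata.PackageNotFoundError:
        print("tensorflow is not installed in this environment")
```

Running the script in each training, serving, and notebook environment gives a quick per-environment answer. Pre-release suffixes such as `rc0` are ignored by this sketch, so treat those results with care.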

How is it classified?

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Article 9 - Risk Management System
ISO 42001
A.6.2.6 - AI system updates and maintenance
A.9.2 - Information security in AI system development
NIST AI RMF
GOVERN-1.7 - Processes and procedures are in place for decommissioning and phasing out AI systems
MANAGE-2.2 - Mechanisms are in place and applied to sustain the value of deployed AI systems
OWASP LLM Top 10
LLM05:2025 - Improper Output Handling / Supply Chain Vulnerabilities

Frequently Asked Questions

What is CVE-2020-15202?

CVE-2020-15202 is an integer truncation flaw (CWE-197) in TensorFlow's `Shard` API. `Shard` expects a work function taking two `int64` arguments, but several call sites pass lambdas taking `int` or `int32`. When the amount of parallelized work is large enough, the arguments are truncated, which can cause segfaults, out-of-bounds heap reads or writes, stack overflows, or data corruption. It affects TensorFlow versions before 1.15.4, 2.0.3, 2.1.2, 2.2.1, and 2.3.1.

Is CVE-2020-15202 actively exploited?

Proof-of-concept exploit code is publicly available for CVE-2020-15202, increasing the risk of exploitation.

How to fix CVE-2020-15202?

  1. PATCH: Upgrade to TF 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1 immediately. Verify via 'pip show tensorflow'.
  2. NETWORK CONTROLS: Place TF Serving behind API gateway with input validation; enforce maximum batch size limits at the gateway layer.
  3. DETECTION: Monitor for anomalous large-batch requests (unusually high tensor dimensions), segfault/OOM logs in TF Serving processes, and unexpected process crashes.
  4. CONTAINERIZATION: Ensure TF workloads run in isolated containers with seccomp profiles to limit damage from memory corruption primitives.
  5. INVENTORY: Enumerate all TF versions in use across training pipelines, inference servers, and notebook environments; ML teams often run pinned older versions.
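
The batch-size limit in the NETWORK CONTROLS step could be enforced with a shape check before a request reaches the model server. This is a framework-agnostic sketch under stated assumptions: `check_tensor_shape`, `max_batch`, and `max_elements` are hypothetical names, the default limits are illustrative only, and the code does not model the actual TF Serving request format.

```python
import math
from typing import Sequence


def check_tensor_shape(shape: Sequence[int],
                       max_batch: int = 1024,
                       max_elements: int = 10_000_000) -> None:
    """Raise ValueError if a declared input shape exceeds the configured limits.

    The first dimension is treated as the batch dimension (an assumption).
    """
    if not shape:
        raise ValueError("empty shape is not accepted by this sketch")
    if any(not isinstance(dim, int) or dim < 0 for dim in shape):
        raise ValueError(f"dimensions must be non-negative integers: {list(shape)}")
    if shape[0] > max_batch:
        raise ValueError(f"batch size {shape[0]} exceeds limit {max_batch}")
    total = math.prod(shape)
    if total > max_elements:
        raise ValueError(f"{total} elements exceeds limit {max_elements}")


if __name__ == "__main__":
    check_tensor_shape([32, 224, 224, 3])      # accepted: within both limits
    try:
        check_tensor_shape([2**31 + 5, 1])     # rejected: batch far above limit
    except ValueError as err:
        print("rejected:", err)
```

Limits well below 2^31 elements keep the amount of work per request far from the range where the truncation described in this advisory can occur, though the right values depend on the model being served.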

What systems are affected by CVE-2020-15202?

This vulnerability affects the following AI/ML architecture patterns: model serving, training pipelines, distributed training, ML notebook environments, batch inference pipelines.

What is the CVSS score for CVE-2020-15202?

CVE-2020-15202 has a CVSS v3.1 base score of 9.0 (CRITICAL). The EPSS exploitation probability is 1.23%.

What is the AI security impact?

Affected AI Architectures

model serving, training pipelines, distributed training, ML notebook environments, batch inference pipelines

MITRE ATLAS Techniques

AML.T0001 Search Open AI Vulnerability Analysis
AML.T0010.001 AI Software
AML.T0031 Erode AI Model Integrity
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 9
ISO 42001: A.6.2.6, A.9.2
NIST AI RMF: GOVERN-1.7, MANAGE-2.2
OWASP LLM Top 10: LLM05:2025

What are the technical details?

Original Advisory

In Tensorflow before versions 1.15.4, 2.0.3, 2.1.2, 2.2.1 and 2.3.1, the `Shard` API in TensorFlow expects the last argument to be a function taking two `int64` (i.e., `long long`) arguments. However, there are several places in TensorFlow where a lambda taking `int` or `int32` arguments is being used. In these cases, if the amount of work to be parallelized is large enough, integer truncation occurs. Depending on how the two arguments of the lambda are used, this can result in segfaults, read/write outside of heap allocated arrays, stack overflows, or data corruption. The issue is patched in commits 27b417360cbd671ef55915e4bb6bb06af8b8a832 and ca8c013b5e97b1373b3bb1c97ea655e69f31a575, and is released in TensorFlow versions 1.15.4, 2.0.3, 2.1.2, 2.2.1, or 2.3.1.
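
The truncation the advisory describes can be illustrated outside TensorFlow. The sketch below is a simplified model, not TensorFlow's actual sharding code: `shard_ranges` and `truncate_to_int32` are hypothetical helpers, and `ctypes.c_int32` stands in for the narrowing that happens when a 64-bit range bound is passed to a lambda declared with `int` or `int32` parameters.

```python
import ctypes


def truncate_to_int32(value: int) -> int:
    """Mimic a C/C++ narrowing conversion from int64 to int32 (two's complement)."""
    return ctypes.c_int32(value).value


def shard_ranges(total: int, num_shards: int):
    """Split [0, total) into contiguous (start, limit) ranges (simplified model)."""
    step = -(-total // num_shards)  # ceiling division
    return [(start, min(start + step, total)) for start in range(0, total, step)]


if __name__ == "__main__":
    total = 2**31 + 10  # just past the int32 maximum of 2**31 - 1
    for start, limit in shard_ranges(total, 4):
        t_start, t_limit = truncate_to_int32(start), truncate_to_int32(limit)
        note = "ok" if (t_start, t_limit) == (start, limit) else "TRUNCATED"
        print(f"int64 [{start}, {limit}) -> int32 [{t_start}, {t_limit}) {note}")
```

In this model the last shard's limit wraps to a negative number. A loop such as `for (int i = start; i < limit; ++i)` would then do no work at all (silent data corruption), and any code that uses the wrapped values as array offsets would index outside the buffer, which matches the failure modes listed in the advisory.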

Exploitation Scenario

An adversary targeting an organization's ML inference API crafts a prediction request with extremely large batch dimensions — sufficient to cause the int64-to-int32 truncation in the Shard API's lambda callbacks. When TensorFlow parallelizes this workload across threads, the truncated iteration bounds cause reads or writes outside heap-allocated buffers. In a worst-case scenario against TF Serving: the attacker achieves RCE within the serving container, enabling exfiltration of proprietary model weights, poisoning of serving responses, or lateral movement into ML training infrastructure. Alternatively, repeated exploitation causes serving process crashes, degrading inference availability for downstream AI-powered products.

Weaknesses (CWE)

CWE-197 — Numeric Truncation Error: Truncation errors occur when a primitive is cast to a primitive of a smaller size and data is lost in the conversion.

  • [Implementation] Ensure that no casts, implicit or explicit, take place that move from a larger size primitive to a smaller size primitive.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:C/C:H/I:H/A:H
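
For readers who want to see how this vector yields the 9.0 base score, here is a sketch of the CVSS v3.1 base-score arithmetic. The metric weights and rounding rule follow the published CVSS v3.1 specification as best understood here; `base_score` and `roundup` are names chosen for this example, and the code handles base metrics only.

```python
import math

AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place using the integer method from CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score from a vector string."""
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    changed = metrics["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]
    iss = 1 - (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    exploitability = 8.22 * AV[metrics["AV"]] * AC[metrics["AC"]] * pr * UI[metrics["UI"]]
    if changed:
        return roundup(min(1.08 * (impact + exploitability), 10))
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:C/C:H/I:H/A:H"))  # 9.0
```

The High attack complexity (weight 0.44) lowers exploitability, but the Changed scope applies the 1.08 multiplier and the higher-impact formula, which is why the score still lands in the Critical band.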

Timeline

Published
September 25, 2020
Last Modified
November 21, 2024
First Seen
September 25, 2020

Related Vulnerabilities