CVE-2021-37669: TensorFlow: integer conversion DoS in NonMaxSuppression ops

MEDIUM
Published August 12, 2021
CISO Take

This is a local denial-of-service in TensorFlow's object detection operations (NonMaxSuppression, CombinedNonMaxSuppression) caused by a negative integer being implicitly cast to unsigned, crashing the process. If your model serving endpoints expose these ops to user-controlled inputs, an attacker can crash your inference service with a single malformed request. Patch to TF 2.6.0 or the backport releases (2.5.1, 2.4.3, 2.3.4) immediately.

What is the risk?

Medium risk in isolation, higher in model serving contexts. The CVSS score of 5.5 assumes a Local attack vector, which understates real-world exposure: a model serving API that accepts arbitrary inference requests effectively makes the flaw reachable over the network. The impact is limited to availability; the advisory describes no data exfiltration or code execution. Exploitation requires no AI/ML knowledge, just a negative integer in the right field. The CVE is not in CISA KEV and has no known active exploitation, but the technique is trivially reproducible from the public advisory.

What systems are affected?

Package      Ecosystem   Vulnerable Range                                     Patched
TensorFlow   pip         before 2.3.4; 2.4.x before 2.4.3; 2.5.x before 2.5.1   2.3.4, 2.4.3, 2.5.1, 2.6.0

If you run a TensorFlow release older than 2.6.0 that does not include the backported fix (2.5.1, 2.4.3, 2.3.4), you are affected.
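A quick way to triage an installed version is sketched below. The helper name `is_vulnerable` and the branch table are illustrative assumptions derived from the patched releases named in the advisory (2.3.4, 2.4.3, 2.5.1, 2.6.0). Branches older than 2.3 are treated as vulnerable because no backport is listed for them. Pre-release suffixes such as `rc0` are ignored, so check those builds by hand.

```python
import re

# Assumption: first fixed patch release per affected branch, as named in the advisory.
FIXED_BY_BRANCH = {(2, 3): 4, (2, 4): 3, (2, 5): 1}


def is_vulnerable(version: str) -> bool:
    """Return True if `version` predates the fix for CVE-2021-37669 (hypothetical helper)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())

    # 2.6.0 and later ship the fix.
    if (major, minor) >= (2, 6):
        return False
    # Backported branches are fixed from a specific patch release onward.
    if (major, minor) in FIXED_BY_BRANCH:
        return patch < FIXED_BY_BRANCH[(major, minor)]
    # Older branches have no listed backport; treated here as vulnerable.
    return True
```

In a deployed environment the string would typically come from `tf.__version__`, for example `is_vulnerable(tf.__version__)`.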

How severe is it?

CVSS 3.1: 5.5 / 10
EPSS: 0.2% chance of exploitation in 30 days (higher than 7% of all CVEs)
Exploitation Status: No known exploitation
Sophistication: Trivial

What is the attack surface?

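Attack Vector (AV): Local
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High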

What should I do?

5 steps
  1. PATCH

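    Upgrade to TensorFlow 2.6.0 or later. If you are pinned to an older branch, move to its backport release (2.5.1, 2.4.3, or 2.3.4). If you build from source, cherry-pick commits 3a7362750d5c (NonMaxSuppression fix) and b5cdbf12ffca (CombinedNonMaxSuppression fix).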

  2. INPUT VALIDATION

    Add server-side validation that rejects negative values for the output_size / max_output_size parameters before they reach TF ops.

  3. DETECTION

    Monitor inference service crash rates and restarts; a sudden spike on endpoints that serve detection models may indicate exploitation attempts.

  4. SANDBOXING

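    Run TF Serving instances in containers with restart policies so that a single crash causes only a brief outage. This does not stop an attacker who repeats the request, so treat it as a stopgap until the patch is applied.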

  5. AUDIT

    Inventory all internal services using tf.raw_ops.NonMaxSuppressionV5 or CombinedNonMaxSuppression directly.
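The input validation in step 2 could look roughly like the sketch below. The function name `validate_max_output_size`, the `minimum` parameter, and the `upper_bound` default are assumptions made for illustration, not part of TensorFlow or TF Serving. The check runs in the request handler, before the value is placed into a tensor.

```python
def validate_max_output_size(value, minimum=0, upper_bound=10_000):
    """Validate a client-supplied NMS size argument (hypothetical helper).

    Raises ValueError for anything that should never reach the op.
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("max_output_size must be an integer")
    # Negative values are the ones that trigger the crash; set minimum=1
    # if an empty detection result is never meaningful for the service.
    if value < minimum:
        raise ValueError(f"max_output_size must be >= {minimum}, got {value}")
    # Assumed service-level cap; also keeps the value well inside int32.
    if value > upper_bound:
        raise ValueError(f"max_output_size must be <= {upper_bound}, got {value}")
    return value
```

Rejecting the request with a client error at this layer keeps the malformed value away from the kernel entirely, which helps even on patched builds because the failure is reported by the service rather than by the op.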

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
  Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
  A.6.1.2 - AI risk assessment
  A.6.2.4 - AI system operation and monitoring
NIST AI RMF
  GOVERN 1.7 - Processes for AI risk monitoring
  MEASURE 2.5 - AI system availability and resiliency
OWASP LLM Top 10
  LLM04 - Model Denial of Service

Frequently Asked Questions

Is CVE-2021-37669 actively exploited?

No confirmed active exploitation of CVE-2021-37669 has been reported, but organizations should still patch proactively.

What systems are affected by CVE-2021-37669?

This vulnerability affects the following AI/ML architecture patterns: model serving, inference pipelines, object detection pipelines, training pipelines.

What is the CVSS score for CVE-2021-37669?

CVE-2021-37669 has a CVSS v3.1 base score of 5.5 (MEDIUM). The EPSS exploitation probability is 0.18%.

What is the AI security impact?

Affected AI Architectures

model serving, inference pipelines, object detection pipelines, training pipelines

MITRE ATLAS Techniques

AML.T0029 Denial of AI Service
AML.T0034 Cost Harvesting
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 15
ISO 42001: A.6.1.2, A.6.2.4
NIST AI RMF: GOVERN 1.7, MEASURE 2.5
OWASP LLM Top 10: LLM04

What are the technical details?

Original Advisory

TensorFlow is an end-to-end open source platform for machine learning. In affected versions an attacker can cause denial of service in applications serving models using `tf.raw_ops.NonMaxSuppressionV5` by triggering a division by 0. The [implementation](https://github.com/tensorflow/tensorflow/blob/460e000de3a83278fb00b61a16d161b1964f15f4/tensorflow/core/kernels/image/non_max_suppression_op.cc#L170-L271) uses a user controlled argument to resize a `std::vector`. However, as `std::vector::resize` takes the size argument as a `size_t` and `output_size` is an `int`, there is an implicit conversion to unsigned. If the attacker supplies a negative value, this conversion results in a crash. A similar issue occurs in `CombinedNonMaxSuppression`. We have patched the issue in GitHub commit 3a7362750d5c372420aa8f0caf7bf5b5c3d0f52d and commit b5cdbf12ffcaaffecf98f22a6be5a64bb96e4f58. The fix will be included in TensorFlow 2.6.0. We will also cherrypick this commit on TensorFlow 2.5.1, TensorFlow 2.4.3, and TensorFlow 2.3.4, as these are also affected and still in supported range.
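The implicit conversion described in the advisory can be reproduced without TensorFlow. The sketch below uses Python's `ctypes.c_size_t`, which performs no overflow checking, to mimic what happens when a C++ `int` is passed where a `size_t` is expected. The helper name `as_size_t` is an assumption for illustration, and the exact wrapped value depends on the platform's `size_t` width.

```python
import ctypes


def as_size_t(value: int) -> int:
    """Mimic C++'s implicit int -> size_t conversion (illustrative helper)."""
    # ctypes integer types wrap silently instead of raising on overflow.
    return ctypes.c_size_t(value).value


# Width of size_t on this platform, in bits (64 on typical server builds).
SIZE_T_BITS = 8 * ctypes.sizeof(ctypes.c_size_t)

# Non-negative values pass through unchanged.
print(as_size_t(5))
# A negative value wraps around to the top of the unsigned range.
print(as_size_t(-1) == 2**SIZE_T_BITS - 1)
```

A resize request of that magnitude cannot be satisfied, which is why a single negative argument is enough to bring the process down.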

Exploitation Scenario

An attacker probes a public-facing object detection API (e.g., a retail image tagging service or autonomous vehicle inference endpoint powered by TF Serving). If the served model exposes the maximum output size as a request input, they craft a POST request to the inference endpoint with that field (max_detections in this example) set to -1 or another negative integer. TensorFlow's NonMaxSuppression kernel receives the value and implicitly converts it to size_t, where it becomes a value near 2^64 on 64-bit platforms. The kernel then tries to resize a vector to that many elements and the process crashes. With no restart policy, the service goes down. With a simple restart policy, the attacker can loop requests to maintain a continuous DoS at negligible cost. This can be scripted in under 10 lines of Python using the TF Serving REST API.

Weaknesses (CWE)

CWE-681 — Incorrect Conversion between Numeric Types: When converting from one data type to another, such as long to integer, data can be omitted or translated in a way that produces unexpected values. If the resulting values are used in a sensitive context, then dangerous behaviors may occur.

  • [Implementation] Avoid making conversion between numeric types. Always check for the allowed ranges.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
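As a worked check, the 5.5 base score can be recomputed from this vector. The sketch below uses the metric weights and rounding rule as given in the CVSS v3.1 specification and handles only the Scope: Unchanged case. The function names are illustrative. The second vector, with AV:N, is a hypothetical variant used to show how the score moves when the op is reachable over the network; it is not an official rescoring.

```python
# Metric weights from the CVSS v3.1 specification (PR values for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(x: float) -> float:
    """Round up to one decimal place, following the v3.1 integer-based rule."""
    scaled = round(x * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (scaled // 10000 + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a Scope: Unchanged CVSS v3.1 vector."""
    # Drop the "CVSS:3.1" prefix, then map metric abbreviations to values.
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise NotImplementedError("only Scope: Unchanged is handled in this sketch")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}

    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 5.5
print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 6.5 (hypothetical AV:N variant)
```

For the published vector, the impact sub-score is about 3.60 and the exploitability sub-score about 1.83; their sum of roughly 5.43 rounds up to 5.5.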

Timeline

Published: August 12, 2021
Last Modified: November 21, 2024
First Seen: August 12, 2021

Related Vulnerabilities