CVE-2022-23585: TensorFlow: memory leak in PNG decode causes DoS

MEDIUM PoC AVAILABLE CISA: TRACK*
Published February 4, 2022
CISO Take

Authenticated users can crash TensorFlow image processing services by submitting malformed PNG files, exhausting memory without cleanup. If you expose TensorFlow-based image inference endpoints—CV models, image classifiers, multimodal pipelines—to any authenticated user or internal service, patch immediately to TF 2.8.0, 2.7.1, 2.6.3, or 2.5.3. No workaround exists short of input validation that rejects malformed PNGs before they reach the decoder.

What is the risk?

Medium severity in isolation, but operationally significant for production ML serving. Low attack complexity and low required privileges mean that any authenticated API user, or a compromised internal service account, can trigger the leak repeatedly to degrade or crash an inference node. The CVE is not listed in CISA KEV and there is no evidence of active exploitation, which keeps it out of the critical tier, but unpatched TF deployments that process untrusted image inputs face real DoS risk in multi-tenant or externally accessible environments.

What systems are affected?

Package | Ecosystem | Vulnerable Range | Patched
TensorFlow | pip | not listed | 2.5.3, 2.6.3, 2.7.1, 2.8.0
195.8K · OpenSSF 7.1 · 3.7K dependents · 4% patched · ~1372d to patch

Do you use TensorFlow? You're affected unless you run 2.8.0, 2.7.1, 2.6.3, 2.5.3, or a later release on those branches.

How severe is it?

CVSS 3.1: 6.5 / 10
EPSS: 0.9% chance of exploitation in 30 days (higher than 56% of all CVEs)
Exploitation Status: Exploit Available
Exploitation: MEDIUM
Sophistication: Trivial
Exploitation Confidence: medium
CISA SSVC: Public PoC
Public PoC indexed (trickest/cve)
Composite signal derived from CISA KEV, VulnCheck KEV, CISA SSVC, EPSS, Metasploit, Exploit-DB, trickest/cve, Nuclei templates, and inthewild.io exploitation reports.

What is the attack surface?

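Attack Vector (AV): Network
Attack Complexity (AC): Low
Privileges Required (PR): Low
User Interaction (UI): None
Scope (S): Unchanged
Confidentiality (C): None
Integrity (I): None
Availability (A): High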

What should I do?

5 steps
  1. PATCH

    Upgrade to TensorFlow 2.8.0, 2.7.1, 2.6.3, or 2.5.3 — the fix is cherry-picked across all supported branches.

  2. VALIDATE INPUTS

    Implement upstream PNG validation (e.g., Pillow's verify() or libpng header checks) before passing images to TensorFlow decoders.

  3. RESOURCE LIMITS

    Apply memory limits and OOM kill policies to TF Serving containers/pods so that a leak-induced crash stays bounded and the container restarts automatically.

  4. RATE LIMIT

    Throttle authenticated image submission endpoints to slow exhaustion attacks.

  5. MONITOR

    Alert on abnormal memory growth in TF serving processes — this leak is detectable via standard container memory metrics.
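
The MONITOR step can be approximated with a simple growth check over periodic memory samples. The sketch below is illustrative only: the function name `leak_suspected`, its thresholds, and the assumption that resident memory is sampled in MiB at a fixed interval are all hypothetical and would need tuning against a real serving workload.

```python
from typing import Sequence


def leak_suspected(
    rss_mib: Sequence[float],
    min_samples: int = 6,            # hypothetical window length
    min_growth_mib: float = 256.0,   # hypothetical growth threshold
    rising_fraction: float = 0.8,    # share of steps that must increase
) -> bool:
    """Return True when recent memory samples show sustained growth."""
    n = max(min_samples, 2)
    if len(rss_mib) < n:
        return False  # not enough history to judge
    window = list(rss_mib[-n:])
    steps = list(zip(window, window[1:]))
    rising = sum(1 for prev, cur in steps if cur > prev)
    total_growth = window[-1] - window[0]
    # Require both a large net increase and a mostly monotonic climb,
    # so that ordinary load spikes are less likely to trigger an alert.
    return total_growth >= min_growth_mib and rising / len(steps) >= rising_fraction
```

Feeding the function the last few container memory readings (for example, one per minute) gives a coarse signal; a steady climb that never recedes after requests complete is the pattern this leak would be expected to produce.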

What does CISA's SSVC say?

Decision Track*
Exploitation poc
Automatable No
Technical Impact partial

Source: CISA Vulnrichment (SSVC v2.0). Decision based on the CISA Coordinator decision tree.

Which compliance frameworks are affected?

This CVE is relevant to:

EU AI Act
Article 15 - Accuracy, robustness and cybersecurity
ISO 42001
A.9.3 - AI system robustness and availability
NIST AI RMF
MEASURE 2.5 - AI system resilience and robustness testing
OWASP LLM Top 10
LLM04 - Model Denial of Service

Frequently Asked Questions

What is CVE-2022-23585?

CVE-2022-23585 is a memory leak (CWE-401) in TensorFlow's PNG decoding path. When an invalid PNG is decoded, some error paths exit before the buffers allocated by `png::CommonInitDecode` are released with `png::CommonFreeDecode`, so each malformed image leaks memory. An authenticated user who submits such files repeatedly can exhaust memory and crash an image processing service. The fix ships in TensorFlow 2.8.0, 2.7.1, 2.6.3, and 2.5.3.

Is CVE-2022-23585 actively exploited?

No active exploitation has been reported, and the CVE is not listed in CISA KEV. However, proof-of-concept exploit code for CVE-2022-23585 is publicly available, which increases the risk of exploitation.

How to fix CVE-2022-23585?

1. PATCH: Upgrade to TensorFlow 2.8.0, 2.7.1, 2.6.3, or 2.5.3; the fix is cherry-picked across all supported branches.
2. VALIDATE INPUTS: Implement upstream PNG validation (e.g., Pillow's verify() or libpng header checks) before passing images to TensorFlow decoders.
3. RESOURCE LIMITS: Apply memory limits and OOM kill policies to TF Serving containers/pods so that a leak-induced crash stays bounded and the container restarts automatically.
4. RATE LIMIT: Throttle authenticated image submission endpoints to slow exhaustion attacks.
5. MONITOR: Alert on abnormal memory growth in TF serving processes; this leak is detectable via standard container memory metrics.
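
To confirm whether a given install already carries the fix, the version string can be compared against the patched releases listed above. This is a minimal sketch: `is_patched` is a hypothetical helper, it accepts only plain `X.Y.Z` strings, and it assumes that branches older than 2.5 (for which the advisory lists no fix) should be treated as unpatched.

```python
import re

# First fixed patch release on each supported branch, per the advisory.
FIXED_PATCH = {(2, 5): 3, (2, 6): 3, (2, 7): 1}
FIRST_FIXED_BRANCH = (2, 8)  # 2.8.0 and later include the fix


def is_patched(version: str) -> bool:
    """Return True if a plain X.Y.Z TensorFlow version includes the fix."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        # Pre-release or local builds (e.g. "2.8.0rc0") are not classified
        # here; the caller has to decide how to treat them.
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if (major, minor) >= FIRST_FIXED_BRANCH:
        return True
    if (major, minor) in FIXED_PATCH:
        return patch >= FIXED_PATCH[(major, minor)]
    # Assumption: older branches received no fix and remain affected.
    return False
```

In a deployed environment the input would typically be `tensorflow.__version__`; the helper itself does not import TensorFlow, so it can also be run against an inventory of pinned versions.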

What systems are affected by CVE-2022-23585?

This vulnerability affects the following AI/ML architecture patterns: model serving, training pipelines, inference APIs, image processing pipelines.

What is the CVSS score for CVE-2022-23585?

CVE-2022-23585 has a CVSS v3.1 base score of 6.5 (MEDIUM). The EPSS exploitation probability is 0.93%.

What is the AI security impact?

Affected AI Architectures

model serving, training pipelines, inference APIs, image processing pipelines

MITRE ATLAS Techniques

AML.T0029 Denial of AI Service
AML.T0034 Cost Harvesting
AML.T0049 Exploit Public-Facing Application

Compliance Controls Affected

EU AI Act: Article 15
ISO 42001: A.9.3
NIST AI RMF: MEASURE 2.5
OWASP LLM Top 10: LLM04

What are the technical details?

Original Advisory

TensorFlow is an Open Source Machine Learning Framework. When decoding PNG images, TensorFlow can produce a memory leak if the image is invalid. After calling `png::CommonInitDecode(..., &decode)`, the `decode` value contains allocated buffers which can only be freed by calling `png::CommonFreeDecode(&decode)`. However, several error cases in the function implementation invoke the `OP_REQUIRES` macro, which immediately terminates the execution of the function without allowing the memory free to occur. The fix will be included in TensorFlow 2.8.0. We will also cherrypick this commit on TensorFlow 2.7.1, TensorFlow 2.6.3, and TensorFlow 2.5.3, as these are also affected and still in supported range.
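
The control-flow pattern behind the leak can be illustrated with a small Python analogy. This is not TensorFlow's code: `DecodeBuffers`, `decode_leaky`, and `decode_safe` are hypothetical stand-ins, and the counter only simulates buffers that would be leaked in the real C++ implementation. The early `return` plays the role of `OP_REQUIRES` exiting the function before the free call is reached.

```python
class DecodeBuffers:
    """Hypothetical stand-in for buffers allocated by the decoder init step."""

    live = 0  # buffer sets allocated but not yet freed

    def __init__(self) -> None:
        DecodeBuffers.live += 1

    def free(self) -> None:
        DecodeBuffers.live -= 1


def decode_leaky(data: bytes):
    buffers = DecodeBuffers()      # analogous to CommonInitDecode
    if len(data) < 8:              # validation fails on an invalid image
        return None                # early exit: free() is never reached
    result = len(data)
    buffers.free()                 # analogous to CommonFreeDecode
    return result


def decode_safe(data: bytes):
    buffers = DecodeBuffers()
    try:
        if len(data) < 8:
            return None
        return len(data)
    finally:
        buffers.free()             # runs on every exit path
```

Calling `decode_leaky` with a short input leaves `DecodeBuffers.live` incremented, while `decode_safe` returns it to its previous value. The general remedy is to tie the release to every exit path, whatever mechanism the actual patch uses.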

Exploitation Scenario

An adversary with low-privilege API access to a computer vision inference service (e.g., an image classification endpoint used for product moderation or medical imaging) crafts a batch of syntactically invalid PNG files, such as files with malformed IHDR chunks or truncated image data, that pass basic size checks but fail TensorFlow's internal decode validation. They submit these in rapid succession via the API. Each request triggers the memory leak in `decode_image_op.cc`, and the leaked buffers are never released. Over minutes to hours, the TF Serving instance exhausts available memory and crashes with OOM errors. In a Kubernetes environment without proper restart limits, this creates a degradation-of-service loop. In a training pipeline context, an adversary with write access to a shared training dataset store poisons it with invalid PNGs, causing training jobs to crash and forcing costly re-runs.
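
Some of the malformed inputs in this scenario (bad IHDR chunks, truncated data) can be rejected by a structural pre-check before the bytes reach TensorFlow. The following is a sketch using only the Python standard library, not a replacement for a full decoder: `png_structure_ok` and its `max_pixels` limit are hypothetical, it checks only signature, chunk layout, CRCs, and IHDR dimensions, and a file that passes may still be rejected by TensorFlow's decoder.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_structure_ok(data: bytes, max_pixels: int = 64_000_000) -> bool:
    """Return True if data has a well-formed PNG chunk layout."""
    if not data.startswith(PNG_SIGNATURE):
        return False
    pos = len(PNG_SIGNATURE)
    seen_ihdr = False
    seen_idat = False
    # Each chunk is: 4-byte length, 4-byte type, body, 4-byte CRC.
    while pos + 12 <= len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        ctype = data[pos + 4:pos + 8]
        end = pos + 12 + length
        if end > len(data):
            return False  # chunk claims more bytes than are present
        body = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[end - 4:end])
        if zlib.crc32(ctype + body) != crc:
            return False  # corrupted chunk
        if not seen_ihdr:
            # IHDR must come first and is always 13 bytes long.
            if ctype != b"IHDR" or length != 13:
                return False
            width, height = struct.unpack(">II", body[:8])
            if width == 0 or height == 0 or width * height > max_pixels:
                return False
            seen_ihdr = True
        elif ctype == b"IDAT":
            seen_idat = True
        elif ctype == b"IEND":
            # IEND must be empty, follow image data, and end the file.
            return seen_idat and length == 0 and end == len(data)
        pos = end
    return False  # ran out of data before reaching IEND
```

An endpoint could call this on the uploaded bytes and return a client error when it yields `False`. The check does not decompress the IDAT stream, so it should be treated as a first filter alongside patching rather than a complete mitigation.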

Weaknesses (CWE)

CWE-401 — Missing Release of Memory after Effective Lifetime: The product does not sufficiently track and release allocated memory after it has been used, making the memory unavailable for reallocation and reuse.

  • [Implementation] Choose a language or tool that provides automatic memory management, or makes manual memory management less error-prone. For example, glibc in Linux provides protection against free of invalid pointers. When using Xcode to target OS X or iOS, enable automatic reference counting (ARC) [REF-391]. To help correctly and consistently manage memory when programming in C++, consider using a smart pointer class such as std::auto_ptr (defined by ISO/IEC ISO/IEC 14882:2003), std::shared_ptr and std::unique_ptr (specified by an upcoming revision of the C++ standard, informally referred to as C++ 1x), or equivalent solutions such as Boost.
  • [Architecture and Design] Use an abstraction library to abstract away risky APIs. Not a complete solution.

Source: MITRE CWE corpus.

CVSS Vector

CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
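
The 6.5 base score can be reproduced from this vector with the CVSS v3.1 base-score equations. The sketch below covers only Scope Unchanged vectors, which is all this CVE needs; the function names are illustrative, and the metric weights and rounding rule are taken from the CVSS v3.1 specification.

```python
import math

# CVSS v3.1 metric weights (PR values are those for Scope Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place as defined by CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a Scope Unchanged CVSS v3.1 vector."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch handles Scope Unchanged only")
    c, i, a = (WEIGHTS["CIA"][metrics[m]] for m in ("C", "I", "A"))
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    impact = 6.42 * iss
    exploitability = 8.22 * math.prod(
        WEIGHTS[m][metrics[m]] for m in ("AV", "AC", "PR", "UI")
    )
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # 6.5
```

For this vector the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is about 2.84; their sum, roughly 6.43, rounds up to 6.5.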

Timeline

Published: February 4, 2022
Last Modified: November 21, 2024
First Seen: February 4, 2022

Related Vulnerabilities