Inference
Inference servers are the most actively-exploited component of the AI stack because they sit between the model and the public internet and they hold the GPU. The shape of the bugs is mostly web-app classes magnified by the cost of compute: missing auth on /v1 endpoints, SSRF that escapes the sandbox onto the platform's control plane, unsafe deserialization on model-loading paths, and path traversal in artifact-management endpoints. vLLM, Triton, TGI, BentoML, Ray Serve, and Ollama have each shipped multiple high-severity CVEs since 2023; CVE-2024-11041 in vLLM was a notable example combining prompt injection with code execution. Multi-tenant deployments are particularly exposed because a single bug typically crosses tenant boundaries. Defenses: aggressive patching, mandatory auth, network segmentation between inference and control plane, and per-tenant resource quotas to bound abuse.
| Severity | CVE | Headline | Package | CVSS |
|---|---|---|---|---|
| MEDIUM | CVE-2021-29554 | TensorFlow: divide-by-zero DoS in DenseCountSparseOutput | tensorflow | 5.5 |
| HIGH | CVE-2021-29513 | TensorFlow: type confusion → null ptr deref (CVSS 7.8) | tensorflow | 7.8 |
| HIGH | CVE-2021-29514 | TensorFlow: heap buffer overflow in RaggedBincount op | tensorflow | 7.8 |
| HIGH | CVE-2021-29515 | TensorFlow: NULL ptr deref in MatrixDiag ops (crash/RCE) | tensorflow | 7.8 |
| MEDIUM | CVE-2021-29517 | TensorFlow: Conv3D div-by-zero crashes ML processes | tensorflow | 5.5 |
| HIGH | CVE-2021-29518 | TensorFlow: null ptr deref in session ops, local RCE | tensorflow | 7.8 |
| HIGH | CVE-2021-29525 | TensorFlow: div-by-zero DoS in Conv2DBackpropInput | tensorflow | 7.8 |
| MEDIUM | CVE-2021-29526 | TensorFlow: Conv2D divide-by-zero crashes ML workloads | tensorflow | 5.5 |
| MEDIUM | CVE-2021-29527 | TensorFlow: divide-by-zero DoS in QuantizedConv2D | tensorflow | 5.5 |
| MEDIUM | CVE-2021-29528 | TensorFlow: DoS via division-by-zero in QuantizedMul | tensorflow | 5.5 |
| HIGH | CVE-2021-29529 | TensorFlow: heap buffer overflow in quantized image resize | tensorflow | 7.8 |
| HIGH | CVE-2021-29532 | TensorFlow: heap OOB read via RaggedCross op | tensorflow | 7.1 |
| MEDIUM | CVE-2021-29533 | TensorFlow: DoS via empty image in DrawBoundingBoxes | tensorflow | 5.5 |
| MEDIUM | CVE-2021-29534 | TensorFlow: DoS via CHECK-fail in SparseConcat op | tensorflow | 5.5 |
| HIGH | CVE-2021-29535 | TensorFlow: heap overflow in QuantizedMul op | tensorflow | 7.8 |
| HIGH | CVE-2021-29536 | TensorFlow: heap overflow in QuantizedReshape op | tensorflow | 7.8 |
| HIGH | CVE-2021-29537 | TensorFlow: heap overflow in QuantizedResizeBilinear op | tensorflow | 7.8 |
| MEDIUM | CVE-2021-29539 | TensorFlow: type confusion in ImmutableConst causes DoS | tensorflow | 5.5 |
| MEDIUM | CVE-2021-29542 | TensorFlow: StringNGrams heap overflow crashes ML process | tensorflow | 5.5 |
| MEDIUM | CVE-2021-29543 | TensorFlow: DoS via assertion fail in CTCGreedyDecoder | tensorflow | 5.5 |