AI Component

API

AI APIs are the boundary between the application and the model. Self-hosted inference servers (vLLM, Triton, Ollama, TGI) and third-party gateways (LiteLLM, OpenRouter) expose OpenAI-compatible endpoints, and the familiar web-app vulnerability classes reappear here: missing or weak authentication on /v1/chat/completions, broken authorization between tenants, absent rate limiting that lets an attacker drain quota or burn GPU time, and overly permissive CORS policies that expose API keys used in browser-side calls. The blast radius is unusual: a single authentication bypass on an inference endpoint exposes both data and compute and, for paid hosted models, directly costs money. We have seen production CVEs across most popular self-hosted servers in the last 18 months. Defenses: require authentication on every endpoint, enforce per-tenant rate limits, issue separate key scopes for read and admin operations, and pin server versions aggressively.
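The first three defenses can be sketched as a small gate that runs in front of an inference endpoint. This is a minimal illustration using only the Python standard library, not the API of any server named above: the names `ApiKey`, `TokenBucket`, and `Gateway`, the scope strings "read" and "admin", and the choice of HTTP status codes are all assumptions made for the example.

```python
import hmac
import time


class ApiKey:
    """Hypothetical key record: one tenant, one secret, a set of scopes."""

    def __init__(self, tenant, secret, scopes):
        self.tenant = tenant
        self.secret = secret
        self.scopes = frozenset(scopes)


class TokenBucket:
    """Per-tenant token bucket: `capacity` burst, `refill_per_sec` sustained."""

    def __init__(self, capacity, refill_per_sec):
        self.capacity = float(capacity)
        self.refill_per_sec = float(refill_per_sec)
        self.tokens = float(capacity)
        self.updated = None  # timestamp of the last refill

    def allow(self, now):
        if self.updated is None:
            self.updated = now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class Gateway:
    """Illustrative gate: authenticate, check scope, then rate limit."""

    def __init__(self, keys, capacity=5, refill_per_sec=1.0):
        self._keys = list(keys)
        self._buckets = {}  # tenant name -> TokenBucket
        self._capacity = capacity
        self._refill = refill_per_sec

    def _authenticate(self, presented):
        # Compare against every key with a constant-time check so response
        # timing does not reveal how much of a guessed secret matched.
        match = None
        for key in self._keys:
            if hmac.compare_digest(key.secret.encode(), presented.encode()):
                match = key
        return match

    def authorize(self, presented, required_scope, now=None):
        """Return an HTTP-style status code for the request."""
        now = time.monotonic() if now is None else now
        key = self._authenticate(presented or "")
        if key is None:
            return 401  # no key or unknown key
        if required_scope not in key.scopes:
            return 403  # valid key, but not scoped for this operation
        bucket = self._buckets.setdefault(
            key.tenant, TokenBucket(self._capacity, self._refill)
        )
        return 200 if bucket.allow(now) else 429  # quota is shared per tenant


gateway = Gateway(
    [ApiKey("acme", "k-read", {"read"}), ApiKey("acme", "k-admin", {"read", "admin"})],
    capacity=2,
    refill_per_sec=1.0,
)
print(gateway.authorize(None, "read", now=0.0))       # unauthenticated request
print(gateway.authorize("k-read", "admin", now=0.0))  # read key on an admin route
print(gateway.authorize("k-read", "read", now=0.0))   # permitted request
```

In this sketch the bucket is keyed by tenant rather than by key, so issuing more keys does not raise a tenant's quota. A real deployment would likely also throttle failed authentication attempts and store only hashed secrets, both of which are omitted here.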

Total CVEs: 325
Pages: 17
Current page: 9 of 17
Severity  CVE                  CVSS
MEDIUM    CVE-2025-61620       6.5
MEDIUM    CVE-2026-33401       6.5
HIGH      CVE-2024-7036        7.5
MEDIUM    GHSA-hf3c-wxg2-49q9  6.5
HIGH      CVE-2024-8984        7.5
HIGH      CVE-2024-6982        8.4
MEDIUM    CVE-2024-7035        6.9
HIGH      CVE-2024-8020        7.5
HIGH      CVE-2024-7990        8.4
HIGH      CVE-2024-8060        8.1
HIGH      CVE-2024-8053        7.5
HIGH      CVE-2024-7983        7.5
HIGH      CVE-2024-7806        8.0
HIGH      GHSA-6wj5-5pgr-jwq8  7.5
HIGH      GHSA-w466-2wfc-8g58  7.5
HIGH      CVE-2024-7053        7.6
MEDIUM    CVE-2024-7046        4.3
HIGH      CVE-2024-12534       7.5
MEDIUM    CVE-2024-7034        6.5
HIGH      CVE-2024-7039        8.3