AI Component

API

AI APIs are the boundary between the application and the model. Self-hosted inference servers (vLLM, Triton, Ollama, TGI) and third-party gateways (LiteLLM, OpenRouter) expose OpenAI-compatible endpoints, and the familiar web-application vulnerability classes appear here: missing or weak authentication on /v1/chat/completions, broken authorization between tenants, absent rate limiting that lets an attacker drain quota or burn GPU time, and overly permissive CORS that exposes API keys used in browser-side calls. The blast radius is unusual: a single authentication bypass on an inference endpoint exposes both data and compute and, for paid hosted models, directly costs money. We have seen production CVEs across most popular self-hosted servers in the last 18 months. Defenses: require authentication on every endpoint, enforce per-tenant rate limits, issue separate key scopes for read and admin operations, and pin server versions aggressively.
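As a rough illustration of three of the defenses above (authentication on every request, scoped keys, and per-tenant rate limits), here is a minimal Python sketch using only the standard library. Everything in it is an assumption made for illustration: the key table, the scope names "inference" and "admin", the `Gateway` class, and the token-bucket parameters are hypothetical and do not come from any of the servers named above. A real deployment would store hashed keys in a secret store and would normally put this logic in a reverse proxy or API gateway.

```python
import hmac
import time

# Hypothetical key table: API key -> (tenant, scopes). Illustration only;
# real keys belong in a secret store, ideally hashed.
API_KEYS = {
    "key-tenant-a-read": ("tenant-a", {"inference"}),
    "key-tenant-a-admin": ("tenant-a", {"inference", "admin"}),
    "key-tenant-b-read": ("tenant-b", {"inference"}),
}


def lookup_key(presented):
    """Return (tenant, scopes) for a known key, else None.

    Each comparison uses hmac.compare_digest so it runs in constant time.
    """
    for known, info in API_KEYS.items():
        if hmac.compare_digest(known.encode(), presented.encode()):
            return info
    return None


class TokenBucket:
    """Simple token bucket: `capacity` burst, refilled at `refill_per_sec`."""

    def __init__(self, capacity, refill_per_sec, now):
        self.capacity = float(capacity)
        self.refill_per_sec = float(refill_per_sec)
        self.tokens = float(capacity)
        self.updated = now

    def allow(self, now):
        # Add tokens for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class Gateway:
    """Checks auth, then scope, then the tenant's rate limit, in that order."""

    def __init__(self, capacity=2, refill_per_sec=1.0, clock=time.monotonic):
        self._capacity = capacity
        self._refill = refill_per_sec
        self._clock = clock
        self._buckets = {}  # tenant -> TokenBucket

    def authorize(self, authorization_header, required_scope):
        """Return an HTTP-style status code: 200, 401, 403, or 429."""
        prefix = "Bearer "
        if not authorization_header or not authorization_header.startswith(prefix):
            return 401  # no credentials presented
        info = lookup_key(authorization_header[len(prefix):])
        if info is None:
            return 401  # unknown key
        tenant, scopes = info
        if required_scope not in scopes:
            return 403  # valid key, insufficient scope
        now = self._clock()
        bucket = self._buckets.get(tenant)
        if bucket is None:
            bucket = TokenBucket(self._capacity, self._refill, now)
            self._buckets[tenant] = bucket
        return 200 if bucket.allow(now) else 429
```

In this sketch the limit is keyed on the tenant and not on the individual key, so a tenant cannot raise its quota by minting more keys. Requests that fail authentication or scope checks never touch a bucket, which means they are not throttled here; limiting those by source address would be a separate control.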

Total CVEs: 325
Pages: 17
Page 2 of 17
Severity  CVE             CVSS
CRITICAL  CVE-2023-38896  9.8
HIGH      CVE-2024-28088  8.1
CRITICAL  CVE-2024-7042   9.8
MEDIUM    CVE-2025-6854   4.3
CRITICAL  CVE-2025-45150  9.8
MEDIUM    CVE-2023-1651   5.4
CRITICAL  CVE-2023-3686   9.8
HIGH      CVE-2024-34527  7.5
MEDIUM    CVE-2024-0451   5.0
HIGH      CVE-2024-0452   7.7
HIGH      CVE-2024-0453   7.7
MEDIUM    CVE-2024-4858   5.3
LOW       CVE-2024-40594  2.3
HIGH      CVE-2024-6587   7.5
MEDIUM    CVE-2024-6845   5.3
HIGH      CVE-2024-7714   7.5
CRITICAL  CVE-2024-52384  9.9
HIGH      CVE-2024-32965  8.6
MEDIUM    CVE-2024-11896  6.4
UNKNOWN   CVE-2024-56516  -
