AI Security Glossary
20 terms covering AI attack techniques and components, backed by 7,220 real CVE mappings.
Attack Types
Remote code execution vulnerabilities in AI frameworks and inference servers allow attackers to run arbitrary code on hosts running ML inference, training, or agent workloads — often via unsafe deserialization or template injection.
Data extraction attacks against AI systems exfiltrate training data, model weights, user conversations, system prompts, or other sensitive material through inference-time queries or vulnerable APIs.
Supply chain attacks against AI compromise the software, model, or data dependencies of an ML system — including poisoned PyPI/npm packages, malicious HuggingFace model uploads, and tampered training data distributed through trusted channels.
Authentication bypass vulnerabilities in AI platforms and inference servers let attackers reach protected model endpoints, admin interfaces, or other tenants' data without valid credentials.
Denial-of-service attacks against AI systems exploit resource-intensive operations — long-context inference, expensive tokenization, or recursive agent loops — to exhaust compute, memory, or API budget.
Data leakage vulnerabilities allow unauthorized access to sensitive data processed by AI systems — including PII memorised in training data, API keys included in prompts, or confidential information returned in model responses.
Privacy violations in AI systems involve unauthorized collection, processing, or exposure of personal data through model memorization, training-data leaks, third-party API logging, or inadequate consent management.
Prompt injection is an attack where adversaries craft malicious input to manipulate an LLM into ignoring system instructions, exfiltrating data, or executing unauthorized actions. It is the most common attack vector against generative AI.
AI-enhanced social engineering uses generative models to scale phishing, impersonation, deepfake fraud, and deceptive content — making attacks that previously required skilled humans cheap and ubiquitous.
Model poisoning corrupts a machine-learning model during training by injecting malicious data, modifying weights, or tampering with the training pipeline to plant backdoors or degrade specific behaviours.
Adversarial examples are inputs deliberately perturbed to cause a machine-learning model to produce wrong outputs while appearing normal to humans, exploiting the high-dimensional geometry of neural networks.
Jailbreaking refers to techniques that bypass safety guardrails and content filters in language models, enabling generation of harmful, restricted, or policy-violating content the model was trained to refuse.
AI Components
AI/ML frameworks (LangChain, LlamaIndex, PyTorch, TensorFlow, Hugging Face Transformers) are the foundational libraries for building AI applications. Vulnerabilities here have wide blast radius due to high adoption.
Inference-layer vulnerabilities target the serving infrastructure that runs ML models in production — including vLLM, TensorRT, Triton, BentoML, Ray Serve, and Ollama — where bugs expose compute, data, and other tenants.
AI agent frameworks (AutoGPT, LangGraph, CrewAI, AutoGen) orchestrate LLM-driven autonomous actions over tools and APIs. Their tool-use capabilities create attack surfaces not present in simple chat interfaces.
AI API vulnerabilities affect the interfaces used to interact with language models and ML services — including authentication, rate limiting, input validation, and response handling — and often expose paid compute to abuse.
Model-level vulnerabilities affect the trained weights, architectures, or inference behaviour of AI/ML models — including adversarial robustness, backdoor attacks, model extraction, and unsafe model-file formats.
Plugin and tool vulnerabilities affect the external integrations that extend AI systems — browser tools, code interpreters, API connectors, file-system access — and are a primary lever for prompt-injection escalation in agents.
Training-data vulnerabilities involve poisoned datasets, data theft, privacy violations in training corpora, and unauthorized use of copyrighted or sensitive content during model training.
RAG (Retrieval-Augmented Generation) vulnerabilities target the vector store, embedding pipeline, or retrieval logic that grounds LLM responses in external knowledge — exposing the application to data poisoning and indirect prompt injection.