How-to: Set up GPU resources (Innovation Release)
This documentation covers the current Innovation Release of
EDB Postgres AI. See also:
- Hybrid Manager dual release strategy
- Documentation for the current Long-term support release
Prerequisite: Access to the Hybrid Manager UI with AI Factory enabled. See AI Factory in Hybrid Manager.
Use this guide to prepare GPU resources in your Kubernetes cluster (Hybrid Manager or compatible) to support Model Serving with KServe.
Goal
Prepare your cluster to run GPU-based Model Serving workloads using KServe.
Estimated time
20–40 minutes (provisioning depends on your cloud provider).
What you accomplish
- Provision GPU node groups/pools in your cluster.
- Label and taint GPU nodes correctly.
- Deploy the NVIDIA device plugin DaemonSet.
- Store your NVIDIA API key as a Kubernetes secret.
- Enable your cluster to run NIM model containers in KServe.
Prerequisites
- Access to a Kubernetes cluster with appropriate permissions.
- Administrative access to provision node groups (AWS EKS / GCP GKE / RHOS).
- NVIDIA API key for accessing NIM models.
- Familiarity with kubectl.
Provision GPU nodes
Provision GPU node groups (EKS) or node pools (GKE/RHOS):
- Use instances with L40S or A100 GPUs (for example, g6e.12xlarge on AWS or a2-highgpu-4g on GCP).
- Recommended: at least one node with four GPUs for large models.
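As an illustration, a GPU node group on EKS could be provisioned with eksctl. This is a sketch, not the only supported path: the cluster name, region, and node group name below are placeholders you must replace for your environment.

```shell
# Hypothetical example: create a GPU node group on AWS EKS with eksctl.
# Cluster name, region, and node group name are placeholders.
eksctl create nodegroup \
  --cluster my-cluster \
  --region us-east-1 \
  --name gpu-nodes \
  --node-type g6e.12xlarge \
  --nodes 1 \
  --nodes-min 1 \
  --nodes-max 2
```

On GKE, the equivalent is `gcloud container node-pools create` with an accelerator-capable machine type such as a2-highgpu-4g.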
Label and taint GPU nodes
```shell
kubectl label node <gpu-node-name> nvidia.com/gpu=true
kubectl taint nodes <gpu-node-name> nvidia.com/gpu=true:NoSchedule
```
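The label and taint are what GPU workloads later match against. A minimal sketch of the corresponding scheduling stanza, as it might appear in a pod or InferenceService predictor spec (field placement depends on your spec), is:

```yaml
# Illustrative scheduling fragment -- shows how a serving pod targets the
# labeled/tainted GPU nodes prepared above.
nodeSelector:
  nvidia.com/gpu: "true"     # matches the label applied above
tolerations:
- key: nvidia.com/gpu
  operator: Equal
  value: "true"
  effect: NoSchedule         # tolerates the taint applied above
resources:
  limits:
    nvidia.com/gpu: 1        # GPUs are requested via resource limits
```

The taint keeps non-GPU workloads off these (typically expensive) nodes; only pods that explicitly tolerate it can be scheduled there.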
Deploy the NVIDIA device plugin
```shell
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.1/nvidia-device-plugin.yml
kubectl get ds -n kube-system nvidia-device-plugin-daemonset
```
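Once the DaemonSet pods are running, the device plugin advertises GPUs as an allocatable resource on each node. One way to verify (node name is a placeholder):

```shell
# Print the number of allocatable GPUs the device plugin reports for a node.
# Dots in the resource key must be escaped inside the jsonpath expression.
kubectl get node <gpu-node-name> \
  -o jsonpath='{.status.allocatable.nvidia\.com/gpu}'
```

A nonzero value (for example, 4 on a four-GPU node) confirms the plugin is working; an empty result usually means the DaemonSet pod on that node is not yet ready.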
Store NVIDIA API key as Kubernetes secret
```shell
kubectl create secret generic nvidia-nim-secrets \
  --from-literal=NGC_API_KEY=<your_NVIDIA_API_KEY>
```
This secret is used by the ClusterServingRuntime for NIM models.
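For reference, a runtime typically consumes such a secret as container environment variables. The fragment below is only a sketch of how the secret name and key line up; the actual ClusterServingRuntime for NIM is managed by the platform.

```yaml
# Illustrative container-spec fragment -- not the managed runtime definition.
env:
- name: NGC_API_KEY
  valueFrom:
    secretKeyRef:
      name: nvidia-nim-secrets   # secret created above
      key: NGC_API_KEY           # key set via --from-literal
```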