How-to: Set up GPU resources v1.3.6 (LTS)
Prerequisite: Access to the Hybrid Manager UI with AI Factory enabled. See AI Factory in Hybrid Manager.
Use this guide to prepare GPU resources in your Kubernetes cluster (Hybrid Manager or compatible) to support Model Serving with KServe.
Goal
Prepare your cluster to run GPU-based Model Serving workloads using KServe.
Estimated time
20–40 minutes (provisioning depends on your cloud provider).
What you accomplish
- Provision GPU node groups/pools in your cluster.
- Label and taint GPU nodes correctly.
- Deploy the NVIDIA device plugin DaemonSet.
- Store your NVIDIA API key as a Kubernetes secret.
- Enable your cluster to run NIM model containers in KServe.
Prerequisites
- Access to a Kubernetes cluster with appropriate permissions.
- Administrative access to provision node groups (AWS EKS / GCP GKE / RHOS).
- NVIDIA API key for accessing NIM models.
- Familiarity with kubectl.
Provision GPU nodes
Provision GPU node groups (EKS) or node pools (GKE/RHOS):
- Use instances with L40S or A100 GPUs (for example, g6e.12xlarge on AWS or a2-highgpu-4g on GCP).
- Recommended: at least one node with four GPUs for large models.
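On AWS, a GPU node group can be declared with an eksctl configuration file. The sketch below is illustrative only: the cluster name, region, and scaling limits are assumptions you should replace with your own values.

```yaml
# Illustrative eksctl sketch -- cluster name, region, and sizing are assumptions.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-hm-cluster          # assumed cluster name
  region: us-east-1            # assumed region
managedNodeGroups:
  - name: gpu-nodes
    instanceType: g6e.12xlarge # four L40S GPUs per node
    desiredCapacity: 1
    minSize: 1
    maxSize: 2
```

Applied with `eksctl create nodegroup --config-file=<file>`, this adds the GPU nodes to an existing cluster; GKE and OpenShift have equivalent node-pool workflows in their own tooling.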
Label and taint GPU nodes
kubectl label node <gpu-node-name> nvidia.com/gpu=true
kubectl taint nodes <gpu-node-name> nvidia.com/gpu=true:NoSchedule
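With the label and taint in place, only workloads that opt in can land on GPU nodes. A pod spec needs a nodeSelector matching the label and a toleration matching the taint; the sketch below uses a placeholder pod name and image.

```yaml
# Sketch: pod that targets the labeled and tainted GPU nodes.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload            # placeholder name
spec:
  nodeSelector:
    nvidia.com/gpu: "true"      # matches the label applied above
  tolerations:
    - key: nvidia.com/gpu
      operator: Equal
      value: "true"
      effect: NoSchedule        # matches the taint applied above
  containers:
    - name: app
      image: example/image:latest   # placeholder image
```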
Deploy the NVIDIA device plugin
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.1/nvidia-device-plugin.yml
kubectl get ds -n kube-system nvidia-device-plugin-daemonset
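Once the DaemonSet is running, the nodes advertise the `nvidia.com/gpu` resource. A quick way to confirm scheduling works end to end is a one-off pod that requests a GPU and runs `nvidia-smi`; this is a smoke-test sketch, and the CUDA image tag is an assumption.

```yaml
# Sketch: one-off pod to confirm a GPU is schedulable after the plugin is up.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test          # placeholder name
spec:
  restartPolicy: Never
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule
  containers:
    - name: cuda
      image: nvidia/cuda:12.3.1-base-ubuntu22.04   # assumed public CUDA image tag
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: 1     # resource advertised by the device plugin
```

If the pod completes and its logs show the GPUs, the device plugin is working.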
Store NVIDIA API key as Kubernetes secret
kubectl create secret generic nvidia-nim-secrets --from-literal=NGC_API_KEY=<your_NVIDIA_API_KEY>
This secret is used by the ClusterServingRuntime for NIM models.
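A serving runtime can surface the key to a NIM container through a secret reference. The fragment below is illustrative of the mechanism only, not the exact runtime shipped with Hybrid Manager:

```yaml
# Illustrative fragment: exposing the NGC key to a container via the secret.
spec:
  containers:
    - name: kserve-container
      env:
        - name: NGC_API_KEY
          valueFrom:
            secretKeyRef:
              name: nvidia-nim-secrets   # secret created above
              key: NGC_API_KEY
```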