Model Serving How-To Guides

This section provides step-by-step guides to help you deploy, configure, and manage models with the AI Factory Model Serving capability.

These guides focus on real-world tasks you will perform when using KServe-based model serving within Hybrid Manager Kubernetes infrastructure.

These guides cover the following areas:

- Getting started
- Deployment
- GPU configuration and tuning
- Monitoring and observability

For broader architecture context, see AI Factory Concepts.

Configure ServingRuntime

Learn how to configure a ClusterServingRuntime in KServe to define an AI model serving environment on Kubernetes.
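
As a quick sketch of what such a runtime looks like: a ClusterServingRuntime pairs a container image with the model formats it can serve, and an InferenceService later selects it by format or by name. The example below is a minimal, hypothetical sketch; the runtime name, image tag, and model format name are placeholders, not AI Factory-specific values.

```shell
# Minimal sketch of a ClusterServingRuntime wrapping a NIM image.
# All names and the image tag are hypothetical placeholders.
kubectl apply -f - <<EOF
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: nim-llama3-8b-runtime
spec:
  containers:
    - name: kserve-container
      image: nvcr.io/nim/meta/llama3-8b-instruct:1.0.0
      ports:
        - containerPort: 8000
          protocol: TCP
      resources:
        limits:
          nvidia.com/gpu: "1"
  supportedModelFormats:
    - name: nim-llama3-8b
      autoSelect: true
EOF
```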

Create InferenceService

Learn how to create an InferenceService that deploys an NVIDIA NIM container with KServe on Kubernetes.
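
A minimal sketch of an InferenceService that selects the hypothetical runtime from the previous example; the service name and model format are placeholders.

```shell
# Minimal sketch of an InferenceService that selects the runtime above.
kubectl apply -f - <<EOF
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llama3-8b
spec:
  predictor:
    model:
      modelFormat:
        name: nim-llama3-8b          # must match a supportedModelFormats entry
      runtime: nim-llama3-8b-runtime # the ClusterServingRuntime to use
      resources:
        limits:
          nvidia.com/gpu: "1"
EOF
```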

Deploy NIM Container

Learn how to deploy an NVIDIA NIM container using KServe on a Kubernetes cluster, understand the core concepts, and prepare to use this capability in EDB Hybrid Manager AI Factory.
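
NIM images are pulled from NVIDIA NGC, which typically requires an image pull secret. A hedged sketch, assuming your NGC API key is in the NGC_API_KEY environment variable and reusing the hypothetical service name from the examples above; the secret must also be referenced, for example from the runtime's imagePullSecrets or the namespace's default service account.

```shell
# Create a pull secret for nvcr.io; '$oauthtoken' is the literal NGC username.
kubectl create secret docker-registry ngc-secret \
  --docker-server=nvcr.io \
  --docker-username='$oauthtoken' \
  --docker-password="${NGC_API_KEY}"

# Watch the InferenceService come up; READY turns True once the pod is serving.
kubectl get inferenceservice llama3-8b -w
```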

FAQ

Frequently Asked Questions about using Model Serving in AI Factory with KServe.

Monitor model serving

Learn how to monitor deployed AI models using KServe, check status and resource utilization, and prepare for integration with Hybrid Manager AI Factory observability.
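
A minimal sketch of the basic status checks, using the hypothetical service name from the examples above:

```shell
# Check readiness and the service URL.
kubectl get inferenceservice llama3-8b

# Inspect conditions and events when the service is not ready.
kubectl describe inferenceservice llama3-8b

# View CPU/memory usage of the predictor pods (requires metrics-server).
kubectl top pods -l serving.kserve.io/inferenceservice=llama3-8b
```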

Observability

Learn how to monitor and observe your Model Serving workloads in AI Factory.
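
For a first look at runtime behavior, the model server's own logs are often the quickest signal. A sketch using the serving.kserve.io/inferenceservice label that KServe attaches to predictor pods, again with the hypothetical service name:

```shell
# Tail logs from the predictor container of the InferenceService.
kubectl logs -l serving.kserve.io/inferenceservice=llama3-8b \
  -c kserve-container --tail=100 -f
```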

Update GPU Resources

Learn how to update the GPU resource allocation of an NVIDIA NIM InferenceService deployed with KServe.
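
A minimal sketch using a JSON merge patch, assuming the hypothetical service name from the examples above; the predictor pods are recreated with the new allocation.

```shell
# Raise the predictor's GPU limit to 2.
kubectl patch inferenceservice llama3-8b --type merge -p '
spec:
  predictor:
    model:
      resources:
        limits:
          nvidia.com/gpu: "2"
'
```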

