Model Serving How-To Guides

This section provides step-by-step guides to help you deploy, configure, and manage models with the AI Factory Model Serving capability.

These guides focus on real-world tasks you will perform when using KServe-based model serving within Hybrid Manager Kubernetes infrastructure.

These guides cover the following areas:

- Getting started
- Deployment
- GPU configuration and tuning
- Monitoring and observability

For broader architecture context, see AI Factory Concepts.

Configure ServingRuntime

Learn how to configure a ClusterServingRuntime in KServe to define an AI model serving environment on Kubernetes.
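
As a quick sketch of what such a runtime looks like: a ClusterServingRuntime pairs a container image with the model formats it can serve, and an InferenceService later selects it by format or by name. The example below is a minimal, hypothetical sketch; the runtime name, image tag, and model format name are placeholders, not AI Factory-specific values.

```shell
# Minimal sketch of a ClusterServingRuntime wrapping a NIM image.
# All names and the image tag are hypothetical placeholders.
kubectl apply -f - <<EOF
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: nim-llama3-8b-runtime
spec:
  containers:
    - name: kserve-container
      image: nvcr.io/nim/meta/llama3-8b-instruct:1.0.0
      ports:
        - containerPort: 8000
          protocol: TCP
      resources:
        limits:
          nvidia.com/gpu: "1"
  supportedModelFormats:
    - name: nim-llama3-8b
      autoSelect: true
EOF
```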

Create InferenceService

Learn how to create an InferenceService that deploys an NVIDIA NIM container with KServe on Kubernetes.
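
A minimal sketch of an InferenceService that selects the hypothetical runtime from the previous example; the service name and model format are placeholders.

```shell
# Minimal sketch of an InferenceService that selects the runtime above.
kubectl apply -f - <<EOF
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llama3-8b
spec:
  predictor:
    model:
      modelFormat:
        name: nim-llama3-8b          # must match a supportedModelFormats entry
      runtime: nim-llama3-8b-runtime # the ClusterServingRuntime to use
      resources:
        limits:
          nvidia.com/gpu: "1"
EOF
```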

Deploy NIM Container

Learn how to deploy an NVIDIA NIM container using KServe on a Kubernetes cluster, understand the core concepts, and prepare to use this capability in EDB Hybrid Manager AI Factory.
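
NIM images are pulled from NVIDIA NGC, which typically requires an image pull secret. A hedged sketch, assuming your NGC API key is in the NGC_API_KEY environment variable and reusing the hypothetical service name from the examples above; the secret must also be referenced, for example from the runtime's imagePullSecrets or the namespace's default service account.

```shell
# Create a pull secret for nvcr.io; '$oauthtoken' is the literal NGC username.
kubectl create secret docker-registry ngc-secret \
  --docker-server=nvcr.io \
  --docker-username='$oauthtoken' \
  --docker-password="${NGC_API_KEY}"

# Watch the InferenceService come up; READY turns True once the pod is serving.
kubectl get inferenceservice llama3-8b -w
```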

FAQ

Frequently Asked Questions about using Model Serving in AI Factory with KServe.

Monitor model serving

Learn how to monitor deployed AI models using KServe, check status and resource utilization, and prepare for integration with Hybrid Manager AI Factory observability.
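
A minimal sketch of the basic status checks, using the hypothetical service name from the examples above:

```shell
# Check readiness and the service URL.
kubectl get inferenceservice llama3-8b

# Inspect conditions and events when the service is not ready.
kubectl describe inferenceservice llama3-8b

# View CPU/memory usage of the predictor pods (requires metrics-server).
kubectl top pods -l serving.kserve.io/inferenceservice=llama3-8b
```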

Observability

Learn how to monitor and observe your Model Serving workloads in AI Factory.
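
For a first look at runtime behavior, the model server's own logs are often the quickest signal. A sketch using the serving.kserve.io/inferenceservice label that KServe attaches to predictor pods, again with the hypothetical service name:

```shell
# Tail logs from the predictor container of the InferenceService.
kubectl logs -l serving.kserve.io/inferenceservice=llama3-8b \
  -c kserve-container --tail=100 -f
```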

Update GPU Resources

Learn how to update the GPU resource allocation of an NVIDIA NIM InferenceService deployed with KServe.
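
A minimal sketch using a JSON merge patch, assuming the hypothetical service name from the examples above; the predictor pods are recreated with the new allocation.

```shell
# Raise the predictor's GPU limit to 2.
kubectl patch inferenceservice llama3-8b --type merge -p '
spec:
  predictor:
    model:
      resources:
        limits:
          nvidia.com/gpu: "2"
'
```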

