Model Serving How-To Guides
This section provides step-by-step guides to help you deploy, configure, and manage models with the AI Factory Model Serving capability.
These guides focus on real-world tasks you will perform when using KServe-based model serving on Hybrid Manager Kubernetes infrastructure.
The guides cover:
- Getting started
- Deployment
- GPU configuration and tuning
- Monitoring and observability
Related Concepts
For broader architecture context, see AI Factory Concepts.
Configure a ClusterServingRuntime
Learn how to configure a ClusterServingRuntime in KServe to define an AI model serving environment on Kubernetes.
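To give a sense of what the guide walks through, here is a minimal sketch of a ClusterServingRuntime manifest. The runtime name, model-format label, and container image are illustrative placeholders, not values prescribed by AI Factory; see the guide for the full set of options.

```yaml
# Minimal ClusterServingRuntime sketch (names and image are hypothetical).
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: nvidia-nim-runtime            # placeholder runtime name
spec:
  supportedModelFormats:
    - name: nvidia-nim                # placeholder format label, matched by InferenceServices
      version: "1"
      autoSelect: true
  containers:
    - name: kserve-container          # KServe's conventional container name
      image: nvcr.io/nim/meta/llama3-8b-instruct:latest   # example NIM image; substitute your own
      ports:
        - containerPort: 8000
          protocol: TCP
      resources:
        limits:
          nvidia.com/gpu: "1"
```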
Create InferenceService
How to create an InferenceService to deploy an NVIDIA NIM container with KServe on Kubernetes.
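As a sketch of the shape such a resource takes, the following InferenceService references the runtime from the previous guide. The service name, namespace, and model-format label are assumptions for illustration only.

```yaml
# Minimal InferenceService sketch (names are hypothetical).
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llama3-nim                    # placeholder service name
  namespace: model-serving            # placeholder namespace
spec:
  predictor:
    model:
      modelFormat:
        name: nvidia-nim              # must match a supportedModelFormats entry in the runtime
      runtime: nvidia-nim-runtime     # the ClusterServingRuntime to run this model on
      resources:
        limits:
          nvidia.com/gpu: "1"
```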
Deploy NIM Container
Learn how to deploy an NVIDIA NIM container using KServe on a Kubernetes cluster. Understand core concepts and prepare for using this capability in EDB Hybrid Manager AI Factory.
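Deployment itself comes down to applying the manifest and waiting for the service to report ready. A sketch, assuming the hypothetical file, name, and namespace used above:

```shell
# Apply the manifest and check readiness (file, name, and namespace are placeholders).
kubectl apply -f nim-inferenceservice.yaml
kubectl get inferenceservice llama3-nim -n model-serving
# READY shows True and URL is populated once the NIM container is serving.
```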
FAQ
Frequently Asked Questions about using Model Serving in AI Factory with KServe.
Monitor model serving
Learn how to monitor deployed AI models using KServe, check status and resource utilization, and prepare for integration with Hybrid Manager AI Factory observability.
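At the kubectl level, status and resource checks typically look like the following sketch. The service name and namespace are placeholders, and `kubectl top` assumes metrics-server is installed in the cluster.

```shell
# Check InferenceService status, its pods, and their resource usage.
kubectl get inferenceservices -n model-serving
kubectl get pods -n model-serving -l serving.kserve.io/inferenceservice=llama3-nim
kubectl top pod -n model-serving      # requires metrics-server
```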
Observability
Learn how to monitor and observe your Model Serving workloads in AI Factory.
Update GPU Resources
How to update GPU resource allocation for an NVIDIA NIM InferenceService deployed with KServe.
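One way to do this is to patch the predictor's resource limits in place, as in this sketch; the service name, namespace, and GPU count are assumptions, and the guide covers the full procedure.

```shell
# Raise the predictor's GPU limit from 1 to 2 (values are illustrative).
kubectl patch inferenceservice llama3-nim -n model-serving --type merge -p \
  '{"spec":{"predictor":{"model":{"resources":{"limits":{"nvidia.com/gpu":"2"}}}}}}'
# KServe rolls out new predictor pods with the updated GPU allocation.
```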