Model Serving in Hybrid Manager
Model Serving in Hybrid Manager provides a scalable, Kubernetes-native way to serve AI models as production-grade inference services.
It is implemented using KServe and runs on GPU-enabled nodes in your Hybrid Manager project’s Kubernetes cluster. Model Serving enables Gen AI applications, Knowledge Bases, and custom pipelines to use high-performance models under your control.
How Model Serving fits in the Hybrid Manager architecture
Model Serving is a core capability of Hybrid Manager’s AI Factory workload:
- Models are deployed as KServe InferenceServices within the project’s Kubernetes cluster.
- Model Serving is powered by GPU-enabled infrastructure that you provision and manage.
- Model images come from the Asset Library (formerly Model Library), backed by Hybrid Manager’s image governance.
- Model endpoints (HTTP/gRPC) are available to:
  - Gen AI Builder Assistants.
  - AIDB Knowledge Bases.
  - External applications and APIs.
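As a sketch, an external application can call a served model over REST using KServe's v1 inference protocol. The endpoint URL and model name below are placeholders; substitute the values for your own InferenceService:

```python
import json
from urllib import request

# Hypothetical values: replace with your InferenceService URL and model name.
ENDPOINT = "http://my-embedding-model.my-project.example.com"
MODEL = "my-embedding-model"

def build_predict_request(instances):
    """Build a KServe v1-protocol predict payload from a list of inputs."""
    return {"instances": instances}

def predict(instances):
    """POST the payload to the model's v1 predict endpoint and return the JSON response."""
    payload = json.dumps(build_predict_request(instances)).encode("utf-8")
    req = request.Request(
        f"{ENDPOINT}/v1/models/{MODEL}:predict",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)  # e.g. {"predictions": [...]}
```

The same endpoint can be reached from Gen AI Builder Assistants or AIDB Knowledge Bases, which handle this invocation for you.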
All model serving under Hybrid Manager is therefore governed, auditable, and runs securely within your own infrastructure, enabling Sovereign AI patterns.
How it works in Hybrid Manager
- KServe is installed and managed by Hybrid Manager within your project’s Kubernetes cluster.
- You must provision GPU node groups or node pools to support high-performance model serving.
- GPU nodes must be correctly labeled and configured to support KServe workloads.
- Models are deployed from the Asset Library via ClusterServingRuntime and InferenceService definitions.
- Your applications and AI Factory workloads can invoke model endpoints via REST or gRPC.
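The deployment step can be sketched as a minimal InferenceService manifest. Everything here is illustrative: the name, model format, storage URI, and GPU resource request must match the runtimes, Asset Library images, and node configuration in your own project.

```yaml
# Illustrative sketch only; values depend on your Hybrid Manager project.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: my-embedding-model            # hypothetical name
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface             # must match a ClusterServingRuntime in the cluster
      storageUri: "oci://registry.example.com/models/my-embedding-model"  # image from the Asset Library
      resources:
        limits:
          nvidia.com/gpu: "1"         # schedules the predictor onto a GPU node
```

Once the InferenceService reports ready, its HTTP/gRPC endpoint is what applications and AI Factory workloads invoke.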
Key Hybrid Manager considerations
- GPU infrastructure is required for most advanced models, including LLMs, embedding models, and vision models.
- Hybrid Manager enables full observability of model serving, including Prometheus metrics and Kubernetes-native monitoring.
- Model serving endpoints are secured and managed within your Hybrid Manager project scope.
- Governance for model images and deployment comes from Hybrid Manager’s integrated Asset Library and image controls.
Typical use cases
- Power Gen AI Builder Assistants with LLM or embedding models.
- Enable AIDB Knowledge Bases with GPU-accelerated embedding pipelines.
- Serve image models (OCR, vision) as part of multi-modal retrieval systems.
- Expose enterprise-grade model APIs to downstream applications.