Model Serving

Model Serving in EDB Postgres® AI (EDB PG AI) enables you to deploy and scale AI model inference services — turning AI models into production-ready APIs and services.

Important: Model Serving requires Hybrid Manager and the Asset Library to manage model images. Serving is performed within your Hybrid Manager Kubernetes environment.

What Is Model Serving?

Model Serving lets you take AI models — including large language models (LLMs), vision models, and embedding models — and deploy them as scalable, observable InferenceServices:

  • In-cluster or on-prem, controlled within your environment
  • Integrated with EDB PG AI Pipelines, Knowledge Bases, and Gen AI Builder
  • With full control over model lifecycle, scaling, and usage

EDB PG AI Model Serving is powered by the open source KServe engine and deeply integrated into Hybrid Manager.

Why Use Model Serving?

Model Serving provides the backbone for operationalizing AI models:

  • Sovereignty — Run models on your infrastructure, not external APIs
  • Governance — Ensure models used in production are trusted and auditable
  • Scalability — Automatically scale inference services based on load
  • Observability — Monitor and troubleshoot model serving behavior
  • Integration — Seamlessly connect Pipelines, Knowledge Bases, and Gen AI assistants to served models

How It Works

Model Serving works in combination with Hybrid Manager and the Asset Library:

  • You register model images in the Asset Library
  • You create InferenceServices via Hybrid Manager or the API
  • Hybrid Manager deploys the model as a containerized service using KServe
  • EDB PG AI components (such as Pipelines, Gen AI Builder, and Knowledge Bases) connect to these InferenceServices at runtime
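Under the hood, each deployment is a KServe InferenceService resource in the Hybrid Manager Kubernetes environment. The sketch below illustrates the general shape of such a resource; the name, image reference, and model format are hypothetical placeholders, and the exact fields Hybrid Manager expects for Asset Library images may differ:

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: my-embedding-model            # hypothetical service name
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface             # example model format
      # Image registered in the Asset Library (hypothetical reference)
      image: registry.example.com/models/embed:latest
      resources:
        limits:
          nvidia.com/gpu: "1"         # request a GPU if the model needs one
```

In practice you would create this through Hybrid Manager or its API rather than applying the manifest by hand, but the underlying resource KServe manages looks like this.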

InferenceServices can support:

  • Open embedding models for Pipelines and RAG
  • Text generation models (LLMs) for Gen AI
  • Vision models (CLIP, OCR, image embedding) for multi-modal Knowledge Bases
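Once deployed, clients such as Pipelines reach these models over HTTP. As a minimal sketch, the helpers below build and parse requests in the shape of the KServe v1 inference protocol (`POST /v1/models/<name>:predict` with an `instances` list); the model name, host, and instance schema are hypothetical, and your InferenceService may expose a different protocol:

```python
import json

# Hypothetical service details -- substitute your own InferenceService
# name and in-cluster host.
MODEL_NAME = "my-embedding-model"
BASE_URL = "http://my-embedding-model.models.example.internal"

def predict_url(base_url: str, model_name: str) -> str:
    """URL of the KServe v1 predict endpoint for a model."""
    return f"{base_url}/v1/models/{model_name}:predict"

def build_predict_request(texts: list[str]) -> str:
    """Serialize a batch of texts as a KServe v1 'instances' payload."""
    return json.dumps({"instances": [{"text": t} for t in texts]})

def parse_predictions(response_body: str) -> list:
    """Extract the 'predictions' list from a KServe v1 response body."""
    return json.loads(response_body)["predictions"]

# An actual call would POST build_predict_request(...) to
# predict_url(BASE_URL, MODEL_NAME) with Content-Type: application/json,
# e.g. via urllib.request, and pass the body to parse_predictions.
```

The same request shape works for embedding, text-generation, and vision models; only the contents of each instance (and of the returned predictions) differ per model.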

When to Use

You should use Model Serving when:

  • You want to run your own models, not rely on external API calls
  • You need to ensure model inference runs where your data resides (Sovereign AI)
  • You want to manage model lifecycle centrally across Pipelines, Knowledge Bases, and assistants
  • You need scalable, observable model serving integrated with Hybrid Manager

With Model Serving, EDB PG AI gives you the tools to operationalize AI models with full control, scalability, and integration — powering intelligent applications on top of your Postgres and data platforms.
