EDB Docs - EDB Postgres AI Innovation Release

Context

An external inference service connects Hybrid Manager (HM) to a remote model provider hosted outside the cluster — such as OpenAI, Google Gemini, Anthropic, or NVIDIA NIM. You configure the provider's URL, model name, and API key once; HM stores the credentials in Kubernetes secrets and handles authentication transparently for every downstream request.

This page covers:

Viewing the Inference Services list
Registering a new external inference service
Getting the details of a registered service
Updating a registered service
Deregistering a service

Inference Services list

Open the Inference Services list page from the Estate → Inference Services menu in your HM project.

The list displays each service's name, model, and current status. When the status isn't Ready, a short Status Message explains the cause.

Status

Status	Meaning
Ready	The service is healthy and accepting requests.
Failed	The most recent health check failed. The service detail page shows a Status Message with the diagnosis (for example, `API key is missing or invalid`, `endpoint unreachable`, or `account quota exhausted`).
Pending	The endpoint hasn't yet been verified by the background health check. Typically seen during the first few minutes after an HM restart, while the cache warms up. The Status Message reads `Remote connection check is pending — upstream has not yet been verified`.
Unknown	The underlying Kubernetes resources haven't reported a status yet — usually a transient state right after creation.

Status is refreshed every 5 minutes by a background health check. After you create or edit a service, the connectivity probe runs inline so the result is reflected immediately.

Register an external inference service

Navigate to Estate → Quick Actions → Register External Inference Service in your HM project.

Tip

You can also reach the form from the Inference Services list page via the Quick Actions menu.

Prerequisites

Before registering, confirm you have:

The provider's base URL (scheme, host, and any required path prefix, without a trailing /v1).
The model name exactly as the provider expects it (case-sensitive).
A valid API key for the provider.
Network reachability from the HM cluster to the upstream hostname. Your HM administrator may need to allow egress to the provider's domain.

Quick reference by provider

Use this table to fill in the form for the most common providers. The columns map directly to the fields described below.

Provider	Model base URL	API protocol version	Example model name	Notes
OpenAI	`https://api.openai.com`	`OPENAI_V1`	`gpt-4o-mini`	API key validated at registration.
OpenRouter	`https://openrouter.ai/api`	`OPENAI_V1`	`openai/gpt-4o-mini`	API key not validated at registration.
Google Gemini	`https://generativelanguage.googleapis.com`	`GEMINI_V1_BETA`	`gemini-2.5-pro`	API key validated at registration.
Anthropic	`https://api.anthropic.com`	`ANTHROPIC_V1`	`claude-sonnet-4-5`	API key validated at registration.
NVIDIA NIM	`https://integrate.api.nvidia.com`	`OPENAI_V1`	`meta/llama-3.1-8b-instruct`	API key not validated at registration.
Self-hosted / vLLM	Your internal service URL, for example `http://vllm-svc.inference:8000`	`OPENAI_V1`	The model name your vLLM server is serving	API key not validated at registration when vLLM runs unauthenticated.
Other HM (federation)	Another HM's external inference endpoint URL	`OPENAI_V1`	Same model name as on the upstream HM	Enable Allow Insecure Connection. See Allow Insecure Connection below.

For providers not listed here, see the form field descriptions below.

Form fields

External Service Name (required)

A unique identifier for this service within HM. Must follow DNS-style naming rules:

Use only lowercase letters, digits, hyphens, and dots.
Hyphens (-) are allowed within segments but not at the start or end.
Dots (.) are allowed as segment separators.
No uppercase letters, underscores, or spaces.
Maximum 63 characters.

Examples: openai-gpt-4o-mini, azure.gpt-4o.prod.
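
The naming rules above can be checked locally before you submit the form. The following is a minimal sketch, not HM's own validator: the function name `is_valid_service_name` is hypothetical, and it assumes that "not at the start or end" applies to each dot-separated segment.

```python
import re

# One segment: starts and ends with a lowercase letter or digit;
# hyphens may appear only between those characters.
# Assumption: the hyphen rule applies per dot-separated segment.
_SEGMENT = r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
_NAME_RE = re.compile(rf"{_SEGMENT}(?:\.{_SEGMENT})*")
MAX_LENGTH = 63  # maximum length stated in the rules above


def is_valid_service_name(name: str) -> bool:
    """Return True if `name` satisfies the DNS-style rules listed above."""
    return len(name) <= MAX_LENGTH and _NAME_RE.fullmatch(name) is not None
```

Under these assumptions, both example names pass, while names such as `OpenAI`, `my_service`, or `-gpt` are rejected.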

Tags (optional)

Reuse existing HM tags to group and filter services. Tags have no effect on request routing or authentication.

Model name (required)

The exact identifier the upstream provider expects, as documented by the provider. This value is case-sensitive. Slash-separated names (such as meta/llama-3.1-8b-instruct or openai/gpt-4o-mini) are supported. See the Quick reference by provider table above for examples.

API Key (required for most providers)

The API key only — don't include the Authorization: Bearer … prefix. HM adds the correct auth header automatically based on the API Protocol Version you select.

Model Base URL (required)

The scheme and host (plus any required path prefix) for the provider's API. See the Quick reference by provider table above for the correct value for common providers.

Don't include /v1. Consumer applications append /v1 (or /v1beta for Gemini) themselves, so including it here produces duplicated paths such as /v1/v1/chat/completions, which return a 404. A trailing slash is tolerated and stripped automatically.
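
To illustrate how a base URL combines with the path that a consumer appends, here is a small sketch. The helper names `normalize_base_url` and `chat_completions_url` are hypothetical, and rejecting a trailing /v1beta as well as /v1 is an assumption made for this example rather than documented HM behavior.

```python
def normalize_base_url(url: str) -> str:
    """Strip trailing slashes and reject a trailing version segment."""
    cleaned = url.strip().rstrip("/")
    last_segment = cleaned.rsplit("/", 1)[-1]
    # Assumption: /v1beta is treated like /v1, since consumers append it for Gemini.
    if last_segment in ("v1", "v1beta"):
        raise ValueError(
            f"base URL must not end with /{last_segment}; "
            "the consumer appends the version path itself"
        )
    return cleaned


def chat_completions_url(base_url: str) -> str:
    """Build the path an OpenAI-style consumer would call for chat completions."""
    return f"{normalize_base_url(base_url)}/v1/chat/completions"
```

For example, `https://openrouter.ai/api` keeps its required `/api` prefix and resolves to `https://openrouter.ai/api/v1/chat/completions`, whereas `https://api.openai.com/v1` is rejected before it can produce a `/v1/v1/...` path.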

Tip

Self-hosted vLLM servers return 404 on the per-model endpoint /v1/models/{name}, which normally causes the registration probe to fail. HM handles this by falling back to GET /v1/models and matching the model name in the list, so vLLM-served models register successfully.
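
The fallback described in this tip can be sketched as follows. This is an illustration of the logic only, not HM's implementation: `model_is_served` and the injected `http_get` callable are hypothetical, and the sketch assumes the list endpoint returns the OpenAI-style shape `{"data": [{"id": ...}]}`.

```python
from typing import Any, Callable

# Hypothetical transport: takes a URL path and returns
# (HTTP status code, decoded JSON body).
HttpGet = Callable[[str], tuple[int, Any]]


def model_is_served(http_get: HttpGet, model_name: str) -> bool:
    """Probe the per-model endpoint, falling back to the model list on 404."""
    status, _ = http_get(f"/v1/models/{model_name}")
    if status == 200:
        return True
    if status != 404:
        # Any other failure (401, 5xx, ...) is not handled by the fallback.
        return False
    # Fallback: list all models and look for a matching id.
    status, body = http_get("/v1/models")
    if status != 200:
        return False
    return any(entry.get("id") == model_name for entry in body.get("data", []))
```

Injecting the transport keeps the sketch testable without network access; a real caller would wrap an HTTP client of its choice.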

Functions (optional, multi-select)

Capability tags that consumer applications filter on when discovering available models. Use the predefined values below for HM's built-in consumers; for your own applications, any string is valid.

Built-in consumer	Required function tag
HM chatbot	`chatbot-gen-content`
AIDB pipeline step	The matching `aidb-*` tag (see your AIDB pipeline documentation)

Leave this field empty if you're exposing the service exclusively to custom applications that perform their own model selection.
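
As a rough picture of how a consumer might filter on function tags, consider the sketch below. The `Service` record and `discover` function are hypothetical and don't reflect an HM API; restricting results to Ready services is also an assumption of this example.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """Hypothetical in-memory view of a registered inference service."""
    name: str
    status: str
    functions: set[str] = field(default_factory=set)


def discover(services: list[Service], required_function: str) -> list[str]:
    """Return names of Ready services that carry the required function tag."""
    return [
        s.name
        for s in services
        if s.status == "Ready" and required_function in s.functions
    ]
```

A service registered with an empty Functions field never matches a tag-based lookup like this one, which is why leaving the field empty suits services used only by applications that select models on their own.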

API Protocol Version (required)

Controls both the request body format and the outbound authentication header. Choose the option that matches the provider's native API.

Option	Request body shape	Auth header sent	Use for
`OPENAI_V1`	OpenAI Chat Completions	`Authorization: Bearer <key>`	OpenAI, NVIDIA NIM, vLLM, OpenRouter, any OpenAI-compatible endpoint
`GEMINI_V1_BETA`	Google Gemini	`x-goog-api-key: <key>`	Google Gemini native API only
`ANTHROPIC_V1`	Anthropic Messages	`x-api-key: <key>` + `anthropic-version: 2023-06-01`	Anthropic Claude
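
The header column of the table above can be expressed as a small lookup. This sketch only mirrors the table; `build_auth_headers` is a hypothetical helper, not a function that HM exposes.

```python
def build_auth_headers(protocol: str, api_key: str) -> dict[str, str]:
    """Return the outbound auth headers for an API Protocol Version."""
    if protocol == "OPENAI_V1":
        return {"Authorization": f"Bearer {api_key}"}
    if protocol == "GEMINI_V1_BETA":
        return {"x-goog-api-key": api_key}
    if protocol == "ANTHROPIC_V1":
        # Anthropic expects a version header alongside the key.
        return {"x-api-key": api_key, "anthropic-version": "2023-06-01"}
    raise ValueError(f"unsupported API protocol version: {protocol!r}")
```

Because the header is derived from the protocol, the API Key field takes the bare key only, without any prefix.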

Allow Insecure Connection (optional, default off)

Disables TLS certificate verification on outbound calls to the upstream. Enable this only if the upstream uses a self-signed certificate or a certificate signed by a CA not trusted by the HM cluster.

Use this toggle when registering another HM's external inference endpoint (for example, HM2 federating through HM1). HM1's ingress typically presents a self-signed certificate, which causes TLS errors for both the registration probe and runtime forwarding. HM-to-HM federation also requires both ends to use OPENAI_V1 as the API Protocol Version.

Warning

This setting is create-only: you can't toggle it after registration. To change it, deregister the service and register it again. Disabling TLS verification reduces security, so enable it only for development environments or for upstreams whose self-signed certificates you trust.

After clicking Register

HM validates the endpoint before creating any infrastructure. For OpenAI (OPENAI_V1), Anthropic (ANTHROPIC_V1), and Google Gemini (GEMINI_V1_BETA), HM performs a live connectivity probe that checks both reachability and credential validity. If the endpoint is unreachable or the API key is rejected, registration fails immediately with an error — no resources are created.

Note

For some OPENAI_V1 providers (such as NVIDIA NIM, HuggingFace, OpenRouter, and self-hosted vLLM running without authentication), the models endpoint doesn't require an API key. HM still performs the connectivity probe, but the probe can return HTTP 200 even with a wrong API key, so key validity isn't guaranteed at registration time for these providers.

After registration, the status is refreshed every 5 minutes by a background health check. If the upstream becomes unreachable or starts rejecting credentials, the service flips to Failed (with an explanatory Status Message) at the next refresh tick.

Use the service

Once the service's status is Ready, it's available to:

HM chatbot — The chatbot picks up services tagged with chatbot-gen-content automatically.
Pipeline Designer — Registered external models appear in the model picker alongside HM-hosted models. For details, see External inference services in Pipeline Designer.
Gen AI Builder — Models are available as inference targets in Gen AI Builder pipelines once registered.

Retrieve inference service details

Click a service name in the Inference Services list to open its detail view, which shows the service's configuration, current status, and available actions.

Details

Field	Description
External Service Name	The unique identifier assigned at registration.
Model name	The model identifier forwarded to the upstream provider.
Model Base URL	The upstream endpoint the proxy routes requests to.
API Protocol Version	The request format and authentication header in use (`OPENAI_V1`, `GEMINI_V1_BETA`, or `ANTHROPIC_V1`).
Functions	The capability tags currently assigned to the service.
Allow Insecure Connection	Whether TLS certificate verification is disabled for outbound calls.
Status	Current health of the service: Ready, Failed, Pending, or Unknown.
Status Message	A human-readable explanation that accompanies the status. Empty when the service is Ready; carries the probe diagnosis (for example, `API key is missing or invalid (status 401 from ...)` or `endpoint unreachable: ...`) when Failed; carries the cache warm-up text when Pending.

Note

The API key isn't displayed in the service detail page. To replace it, open Quick Actions → Edit Service and enter a new value. HM rotates the underlying secret automatically.

Troubleshooting from the Status Message

When a service is Failed, the Status Message tells you what to fix. Common patterns:

Status message	Likely cause	What to do
`API key is missing or invalid`	Wrong or expired API key.	Open Quick Actions → Edit Service and replace the API Key value.
`API key is missing/invalid or account credits are exhausted` (Anthropic)	Anthropic returns the same status for a bad key and for depleted credits.	Verify the key first; if it's correct, top up the Anthropic account.
`account quota exhausted — check billing and credits` (OpenAI)	OpenAI billing balance is empty.	Top up the OpenAI account; no change to the HM service is needed.
`API key lacks permission for this endpoint`	Key is valid but doesn't have access to the model (project scoping, tier).	Grant model access on the provider's console, or use a different key.
`endpoint not found — check the base URL and model name`	Typo in Model Base URL or wrong model name.	Verify both values. Make sure Base URL doesn't end with `/v1`.
`endpoint unreachable: ... x509: ...`	TLS error against an untrusted certificate.	Delete and re-register with Allow Insecure Connection enabled (the toggle is create-only).
`endpoint unreachable: ...` (other)	Network egress blocked, DNS failure, or upstream down.	Check egress rules from the HM cluster and confirm the provider is online.
`rate limited by upstream`	Provider is throttling requests.	Wait and retry; recurring rate limits indicate the upstream account tier needs upgrading.
`upstream is temporarily unavailable` / `upstream server error`	Provider-side outage.	Wait; the next refresh tick will recover automatically once upstream is healthy.
`Remote connection check is pending — upstream has not yet been verified`	Background health check hasn't run yet (typically after an HM restart).	Wait up to 5 minutes for the next refresh tick.
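
If you script around Status Messages (for example, in an alerting hook), the table above can be reduced to substring matching. The sketch below is hypothetical: `suggest_action` isn't part of HM, and it assumes the message fragments appear verbatim as listed in the table.

```python
# Each rule: (substrings that must all appear in the message, suggested action).
# Order matters: more specific patterns come before more general ones.
_RULES: list[tuple[tuple[str, ...], str]] = [
    (("endpoint unreachable", "x509"),
     "Delete and re-register with Allow Insecure Connection enabled."),
    (("endpoint unreachable",),
     "Check egress rules from the HM cluster and confirm the provider is online."),
    (("API key is missing/invalid or account credits are exhausted",),
     "Verify the key; if it's correct, top up the Anthropic account."),
    (("API key is missing or invalid",), "Replace the API Key value."),
    (("account quota exhausted",), "Top up the OpenAI account."),
    (("API key lacks permission",),
     "Grant model access on the provider's console, or use a different key."),
    (("endpoint not found",), "Verify the Model Base URL and Model name."),
    (("rate limited by upstream",), "Wait and retry."),
    (("upstream is temporarily unavailable",), "Wait for the next refresh tick."),
    (("upstream server error",), "Wait for the next refresh tick."),
    (("Remote connection check is pending",),
     "Wait up to 5 minutes for the next refresh tick."),
]


def suggest_action(status_message: str) -> str:
    """Map a Status Message to the first matching action from the table."""
    for needles, action in _RULES:
        if all(needle in status_message for needle in needles):
            return action
    return "No known pattern; inspect the full Status Message."
```

Listing the x509 rule ahead of the generic unreachable rule matters, because both messages start with the same prefix.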

Update inference service parameters

Some parameters can be updated after registration; others require deregistering and re-registering the service.

What can and cannot be changed

Field	Editable after registration?
Functions	Yes
API Key	Yes — HM rotates the underlying Kubernetes secret automatically
External Service Name	No — delete and re-register
Model name	No — delete and re-register
Model Base URL	No — delete and re-register
API Protocol Version	No — delete and re-register
Allow Insecure Connection	No — delete and re-register
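
For automation that decides between an edit and a fresh registration, the table above can be encoded directly. This is a sketch under the assumption that the field names match the table; `requires_reregistration` is a hypothetical helper.

```python
# Field names copied from the table above.
EDITABLE = frozenset({"Functions", "API Key"})
CREATE_ONLY = frozenset({
    "External Service Name",
    "Model name",
    "Model Base URL",
    "API Protocol Version",
    "Allow Insecure Connection",
})


def requires_reregistration(changed_fields: set[str]) -> bool:
    """Return True if any changed field can't be edited after registration."""
    changed = set(changed_fields)
    unknown = changed - EDITABLE - CREATE_ONLY
    if unknown:
        raise ValueError(f"unknown field(s): {sorted(unknown)}")
    return bool(changed & CREATE_ONLY)
```

A change that touches only Functions or API Key can go through the edit flow; anything else means deregistering and registering again.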

Note

API Protocol Version is locked after registration. Each protocol probes a different endpoint path with a different authentication header, and the connectivity probe that ran at registration only passed because the original protocol matched the Model Base URL and Model name. Since those two fields can't be changed, any attempt to switch to a different protocol fails the connectivity probe and the update is rejected. To use a different protocol, deregister the service and register a new one.

How to edit a service parameter

Open the service detail page and select Quick Actions → Edit Service, or click the pencil icon on the Inference Services list.

Note

HM runs a connectivity probe before applying the update. If the endpoint is unreachable or the new API key is rejected, the update fails and no changes are applied.

Deregister an external inference service

Warning

Deregistration is permanent. All associated Kubernetes resources (namespace, secret, ServingRuntime, InferenceService) are removed immediately. This action can't be undone.

How to deregister

To deregister a service, either open the service detail page and select Quick Actions → Deregister External Inference Service, or click the trash icon on the Inference Services list.

HM blocks deregistration if the service is currently referenced by one or more pipelines. Remove or update those pipelines first, then retry.

When deregistration succeeds, HM:

Removes all Kubernetes resources backing the service (including the API key secret).
Removes the service record from the database.
Clears all tags associated with the service.

External inference services Innovation Release
