Completions

Model name: completions

Model aliases:

  • openai_completions
  • nim_completions

About completions

Completions enables the use of any OpenAI API-compatible text-generation model. It's suitable for chat/text transforms, text completion, and other text generation tasks.

The model provider sets defaults based on the name under which the model is invoked:

  • When invoked as completions or openai_completions, the model provider defaults to using the OpenAI API.

  • When invoked as nim_completions, the model provider defaults to using the NVIDIA NIM API.

Supported aidb operations

  • decode_text
  • decode_text_batch
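
Assuming a completions model named my_openai_model has already been created with aidb.create_model, the two operations might be invoked as sketched below. The prompts are illustrative, and the exact signatures may vary by aidb version:

SELECT aidb.decode_text(
    'my_openai_model',
    'Summarize the benefits of vector search in one sentence.'
);

-- Batch variant: send several prompts in one call.
SELECT aidb.decode_text_batch(
    'my_openai_model',
    ARRAY[
        'Translate "hello" to French.',
        'Translate "hello" to Spanish.'
    ]
);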

Supported models

  • Any text generation model that's supported by the provider.

Supported OpenAI models

See a list of supported OpenAI models here.

Supported NIM models

Creating the default model

There's no default model for completions. You can create any supported model using the aidb.create_model function.

Creating an OpenAI model

You can create any supported OpenAI model using the aidb.create_model function.

This example creates a GPT-4o model with the name my_openai_model:

SELECT aidb.create_model(
  'my_openai_model',
  'openai_completions',
  '{"model": "gpt-4o"}'::JSONB,
  '{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB 
);

Creating a NIM model

SELECT aidb.create_model(
          'my_nim_completions', 
          'nim_completions',
          '{"model": "meta/llama-3.2-1b-instruct"}'::JSONB,
          credentials=>'{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB);

Model configuration settings

The following configuration settings are available for OpenAI models:

  • model The model to use.
  • url The URL of the model to use. This setting is optional and can be used to specify a custom model URL.
    • If openai_completions (or completions) is the model, url defaults to https://api.openai.com/v1/chat/completions.
    • If nim_completions is the model, url defaults to https://integrate.api.nvidia.com/v1/chat/completions.
  • max_concurrent_requests The maximum number of concurrent requests to make to the OpenAI model. The default is 25.
  • max_tokens The maximum number of tokens generated by the model. The default is NULL.
    • If set to NULL, the maximum tokens parameter is omitted, and the model defaults to its internal maximum.
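
As an illustration, these settings can be combined into the single configuration object passed to aidb.create_model. This is a sketch: the model name, endpoint URL, and values are placeholders for a self-hosted OpenAI API-compatible service:

-- Sketch: create a completions model against a custom endpoint,
-- overriding the default url and tuning concurrency and output length.
SELECT aidb.create_model(
    'my_custom_model',
    'completions',
    '{"model": "my-local-model",
      "url": "http://localhost:8000/v1/chat/completions",
      "max_concurrent_requests": 10,
      "max_tokens": {"format": "legacy", "size": 512}}'::JSONB,
    '{"api_key": "sk-abc123xyz456def789ghi012jkl345mn"}'::JSONB
);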

Maximum tokens

The max_tokens parameter sets the maximum number of tokens the model is allowed to generate. It operates as an upper bound, letting you manage token budget constraints.

Parameters

  • size The maximum number of tokens generated by the model.
  • format Defines how the maximum tokens parameter is sent to the model.
    • If set to default, the max_completion_tokens parameter is provided in the request payload. This is ideal for models provided by OpenAI.
    • If set to legacy, the max_tokens parameter is provided in the request payload. This is more common among non-OpenAI models.
    • If set to both, both max_completion_tokens and max_tokens are provided in the request payload.

Example Usage

SELECT aidb.create_model(
          'my_completions_model',
          'completions',
          '{"model": "meta/llama-3.2-1b-instruct", "max_tokens": {"format": "default", "size": 1024}}'::JSONB
);

Model credentials

The following credentials may be required by the service providing these models. Note: api_key and basic_auth are exclusive. You can use only one of these two options.

  • api_key The API key to use for Bearer Token authentication. The api_key is sent in a header field as Authorization: Bearer <api_key>.
  • basic_auth Credentials for HTTP Basic authentication. The credentials provided here are sent verbatim as Authorization: Basic <basic_auth>. If your server requires username/password with HTTP Basic authentication, use the syntax username:password. If only a token is required, provide only the token.
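
For example, a server behind HTTP Basic authentication might be configured as sketched below. The username and password are placeholders:

-- Sketch: authenticate with basic_auth instead of api_key.
-- The value is sent in the Authorization: Basic header, so for
-- username/password authentication use the username:password form.
SELECT aidb.create_model(
    'my_basic_auth_model',
    'completions',
    '{"model": "my-model", "url": "https://example.com/v1/chat/completions"}'::JSONB,
    '{"basic_auth": "alice:s3cret"}'::JSONB
);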
