Reranking (NIM)

Model name: nim_reranking

About reranking

Reranking is a text-search technique that reorders an initial set of results by their relevance to the query. A reranking model scores each candidate document against the query, typically using a cross-attention mechanism, and those scores are used to refine the initial search results.

Supported aidb operations

  • rerank_text

Supported models

NVIDIA NGC

  • nvidia/llama-3.2-nv-rerankqa-1b-v2 (default)

Example

The rerank_text function accepts the model name, a string as the "rerank query", and an array of texts to rerank. The id column in the output is the zero-based index of the text in the input array.

SELECT *
FROM aidb.rerank_text(
    'my_nim_reranker',
    'how can I open a door?',
    '{Ask for help, Push the handle, Lie down and wait, Shout at it}'::text[]
)
ORDER BY logit_score DESC;
Output
       text        | logit_score  | id
-------------------+--------------+----
 Push the handle   | -3.697265625 |  1
 Ask for help      |   -6.2578125 |  0
 Shout at it       |  -7.39453125 |  3
 Lie down and wait |      -11.375 |  2
(4 rows)
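
When only the best candidates matter, the same call can be trimmed and its scores rescaled. The following is a minimal sketch rather than a documented aidb feature: it assumes logit_score holds raw logits (as the column name suggests) and applies the standard logistic function to map them into the 0 to 1 range. The relevance alias and the LIMIT of 2 are illustrative choices.

```sql
-- Keep the two highest-scoring texts and add a 0-1 relevance value.
-- Assumption: logit_score is a raw logit, so 1 / (1 + exp(-x)) is a valid rescaling.
SELECT
    id,
    text,
    logit_score,
    1 / (1 + exp(-logit_score::double precision)) AS relevance  -- illustrative alias
FROM aidb.rerank_text(
    'my_nim_reranker',
    'how can I open a door?',
    '{Ask for help, Push the handle, Lie down and wait, Shout at it}'::text[]
)
ORDER BY logit_score DESC
LIMIT 2;
```

Because the logistic function is monotonic, ordering by relevance gives the same ranking as ordering by logit_score.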

Creating the default model

SELECT aidb.create_model(
    'my_nim_reranker',
    'nim_reranking',
    '{"url":"http://nim-nv-rerankqa-llama-l-1xgpu-g6-predictor.default.svc.cluster.local/v1/ranking", "model": "nvidia/llama-3.2-nv-rerankqa-1b-v2"}'
    );

This example uses a locally deployed NIM model that doesn't require credentials. Credentials and other configuration can be provided as described in Using models.

Model configuration settings

The following configuration settings are available for NIM models:

  • model: The NIM model to use. The default, nvidia/llama-3.2-nv-rerankqa-1b-v2, is the only supported model.

  • url: The URL of the model endpoint. This setting is optional and can be used to point to a custom deployment. The default is https://ai.api.nvidia.com/v1/retrieval.

  • tls_config: Optional TLS configuration for secure HTTPS endpoints. This field accepts an object with:

      • insecure_skip_verify (boolean): If true, disables certificate verification (useful for development or self-signed certificates).

      • ca_path (string): Path to a PEM-encoded certificate authority file to trust when validating HTTPS connections.

The following example creates a model with certificate verification disabled:

SELECT aidb.create_model(
    'my_nim_reranker_notsecure',
    'nim_reranking',
    '{
      "model": "nvidia/llama-3.2-nv-rerankqa-1b-v2",
      "url": "https://localhost/v1/ranking",
      "tls_config": {
        "insecure_skip_verify": true
      }
    }'::jsonb
);

The following example creates a model that validates the endpoint's certificate against a custom certificate authority file:

SELECT aidb.create_model(
    'my_nim_reranker',
    'nim_reranking',
    '{
      "model": "nvidia/llama-3.2-nv-rerankqa-1b-v2",
      "url": "https://my.secure.endpoint/v1/ranking",
      "tls_config": {
        "ca_path": "/etc/aidb/myCA.pem"
      }
    }'::jsonb
);

Model credentials

The following credential is required when the model is hosted on NVIDIA NGC:

  • api_key: The NVIDIA Cloud API key to use for authentication.
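
For an NVIDIA-hosted model, the API key can be supplied when the model is created. The sketch below assumes that aidb.create_model accepts the credentials as a fourth JSON argument, in the manner covered in Using models. The model name my_nim_reranker_cloud and the key value are placeholders. The url setting is omitted so that the default applies.

```sql
-- Assumption: the fourth argument carries the credentials as JSON.
SELECT aidb.create_model(
    'my_nim_reranker_cloud',                                  -- hypothetical model name
    'nim_reranking',
    '{"model": "nvidia/llama-3.2-nv-rerankqa-1b-v2"}'::jsonb,
    '{"api_key": "<your NVIDIA API key>"}'::jsonb             -- placeholder, not a real key
);
```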