Pipelines knowledge bases usage
Creating a new knowledge base configuration
There are two types of knowledge bases: table and volume. Given the different nature of the data sources and the options required for each, there are different functions to create them.
Knowledge base for a table data source
The aidb.create_table_knowledge_base()
function is used to create a knowledge base for a table data source. Below you can find a minimal example. See aidb.create_table_knowledge_base
reference for a full list of options.
Example: Creating a knowledge base
SELECT aidb.create_table_knowledge_base( name => 'test_knowledge_base', model_name => 'bert', source_table => 'test_source_table', source_data_column => 'content', source_data_format => 'Text' );
If you're using only Postgres tables, skip to Creating the embeddings.
If you're using external data sources, you need to create a volume and create a knowledge base for it, which is explained next.
Knowledge base for a volume data source
Creating a new volume
Before you can create a knowledge base for a volume, you need to create a volume and a PGFS storage location. See volumes for more detail.
Here, we create a PGFS storage location for a public S3 bucket:
SELECT pgfs.create_storage_location('aidb_text_bucket', 's3://aidb-rag-app', options => '{"region": "eu-central-1", "skip_signature": "true"}' );
The aidb.create_volume function is used to create a volume.
Example: Creating a volume
SELECT aidb.create_volume('aidb_text', 'aidb_text_bucket', '/', 'Text');
The server_name
comes from calling PGFS functions to create a storage location pgfs.create_storage_location. The path
is the path to the data in the storage location. The last argument indicates the data format; Text
or Image
.
Creating a knowledge base for a volume
The aidb.create_volume_knowledge_base()
function is used to create a knowledge base for a volume data source. Below you can find a minimal example. See aidb.create_volume_knowledge_base
reference for a full list of options.
Example: Creating a knowledge base for a volume
SELECT aidb.create_volume_knowledge_base( name => 'test_knowledge_base_volume', model_name => 'bert', source_volume_name => 'aidb_text' );
Creating the embeddings
You can use bulk embedding to manually run embedding computation:
SELECT aidb.bulk_embedding('test_knowledge_base');
and for the volume example:
SELECT aidb.bulk_embedding('test_knowledge_base_volume');
Enable live auto-processing (automated embeddings) for any future changes:
SELECT aidb.set_auto_knowledge_base('test_knowledge_base', 'Live');
You can disable auto-processing as well (It is 'Disabled' by default when creating new knowledge bases):
SELECT aidb.set_auto_knowledge_base('test_knowledge_base', 'Disabled');
Retrieving
A basic key knowledge base, aidb.retrieve_key
is available. It doesn't look up the source data but returns the ID/key of the matching embeddings.
Retrieving the key
aidb.retrieve_key(<knowledge base name>, <query string>, <optional number of results>);
Example: Retrieving the key
SELECT * FROM aidb.retrieve_key('test_knowledge_base', 'shoes', 2);
key | distance -------+-------------------- 43941 | 0.2938963414490189 19337 | 0.3023805122617119 (2 rows)
You can use this command if you want to do a join/lookup yourself based on the key. For knowledge bases with external (volume) data sources, this capability is especially useful. Usually the application itself wants to do the retrieval from the external data source. Or you might want to push down the actual retrieval to a client application.
Retrieving the text
The retrieve_text
function joins the embeddings with the source data and directly returns the results:
aidb.retrieve_text(<knowledge base name>, <query string>, <optional number of results>);
Example
SELECT * FROM aidb.retrieve_text('test_knowledge_base', 'jacket', 2);
key | value | distance -------+----------------------------------------------------+-------------------- 19337 | United Colors of Benetton Men Stripes Black Jacket | 0.2994317672742334 55018 | Lakme 3 in 1 Orchid Aqua Shine Lip Color | 0.3804609668507203 (2 rows)
Listing the knowledge bases
A view is available that lists all the knowledge_bases. aidb.knowledge_bases also includes some of the knowledge bases configuration:
SELECT * FROM aidb.knowledge_bases;
id | name | vector_table_name | vector_table_key_column | vector_table_vector_column | model_name | topk | distance_operator | options | source_table_name | source_table_data_column | source_table_data_column_type | source_table_key_column | source_volume_name ----+---------------------+----------------------------+-------------------------+----------------------------+--------------+------+-------------------+---------+-------------------+--------------------------+-------------------------------+-------------------------+-------------------- 2 | test_knowledge_base | test_knowledge_base_vector | id | embeddings | simple_model | 5 | InnerProduct | {} | test_source_table | content | Text | id | 5 | test_knowledge_base_cosa | test_knowledge_base_cosa_vector | id | embeddings | simple_model | 1 | L2 | {} | test_source_table | content | Text | id | 3 | test_knowledge_base_cos | test_knowledge_base_cos_vector | id | embeddings | simple_model | 5 | Cosine | {} | test_source_table | content | Text | id | (3 rows)
We recommend that you select only the columns you're interested in:
SELECT name, source_table_name FROM aidb.knowledge_bases;
name | source_table_name ---------------------+------------------- test_knowledge_base | test_source_table test_knowledge_base_cos | test_source_table test_knowledge_base_cosa | test_source_table (3 rows)
Deleting a knowledge base
This command doesn't delete the vector table or anything else. It only deletes the configuration:
SELECT aidb.delete_knowledge_base(<name>);
End-to-end example
This example is an end-to-end example for a table/text knowledge base.
← Prev
Knowledge bases concepts
↑ Up
AI Accelerator Pipelines knowledge bases
Next →
Knowledge bases end-to-end example usage
Could this page be better? Report a problem or suggest an addition!