Preparer chaining example
Example of chaining multiple preparers together with auto-processing using the ChunkText and SummarizeText operations in AI Accelerator.
Create the first preparer to chunk text
-- Create source test table
CREATE TABLE source_table__1321 (
    id INT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    content TEXT NOT NULL
);

SELECT aidb.create_table_preparer(
    name => 'chunking_preparer__1321',
    operation => 'ChunkText',
    source_table => 'source_table__1321',
    source_key_column => 'id',
    source_data_column => 'content',
    destination_table => 'chunked_data__1321',
    destination_data_column => 'chunk',
    destination_key_column => 'id',
    options => '{"desired_length": 1000}'::JSONB -- Configuration for the ChunkText operation
);
Create the second preparer to summarize the chunked text
-- Create the model. It must support the decode_text and decode_text_batch operations.
SELECT aidb.create_model('model__1321', 't5_local');

SELECT aidb.create_table_preparer(
    name => 'summarizing_preparer__1321',
    operation => 'SummarizeText',
    source_table => 'chunked_data__1321',        -- Reference the output from the ChunkText preparer
    source_key_column => 'unique_id',            -- Reference the unique column from the output of the ChunkText preparer
    source_data_column => 'chunk',               -- Reference the output from the ChunkText preparer
    destination_table => 'summarized_data__1321',
    destination_data_column => 'summary',
    destination_key_column => 'chunk_unique_id',
    options => '{"model": "model__1321"}'::JSONB -- Configuration for the SummarizeText operation
);
Tip
This operation transforms the shape of the data, automatically unnesting collections by introducing a part_id column. See the unnesting concept for more detail.
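Because the unnested rows keep the source key in the id column, you can join the chunked output back to the source table to see which chunks came from which source row. This sketch uses only the tables created above:

-- Join each chunk back to its originating source row
SELECT s.id AS source_id, c.part_id, c.unique_id, c.chunk
FROM source_table__1321 s
JOIN chunked_data__1321 c ON c.id = s.id
ORDER BY s.id, c.part_id;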
Set both preparers to live auto-processing
SELECT aidb.set_auto_preparer('chunking_preparer__1321', 'Live');
SELECT aidb.set_auto_preparer('summarizing_preparer__1321', 'Live');
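You can also switch auto-processing off again later. The mode name below is an assumption based on the auto-processing modes used elsewhere in Pipelines; check the preparer reference for the modes available in your version:

-- Stop live auto-processing for the chunking preparer (mode name assumed; see the preparer reference)
SELECT aidb.set_auto_preparer('chunking_preparer__1321', 'Disabled');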
Insert data for processing
When you insert data into the source data table, you see processed results flowing automatically:
INSERT INTO source_table__1321
    VALUES (1, 'This is a significantly longer text example that might require splitting into smaller chunks. The purpose of this function is to partition text data into segments of a specified maximum length, for example, this sentence is 145 characters. This enables processing or storage of data in manageable parts.');
The chunks are calculated automatically:
SELECT * FROM chunked_data__1321;
Output
 id | part_id | unique_id | chunk
----+---------+-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------
  1 | 0       | 1.part.0  | This is a significantly longer text example that might require splitting into smaller chunks.
  1 | 1       | 1.part.1  | The purpose of this function is to partition text data into segments of a specified maximum length, for example, this sentence is 145 characters.
  1 | 2       | 1.part.2  | This enables processing or storage of data in manageable parts.
(3 rows)
Summaries of the chunks are calculated automatically:
SELECT * FROM summarized_data__1321;
Output
 chunk_unique_id | summary
-----------------+------------------------------------------------------------------------------------------------------
 1.part.0        | text example might require splitting into smaller chunks .
 1.part.1        | the purpose of this function is to partition text data into segments of a specified maximum length .
 1.part.2        | enables processing or storage of data in manageable parts .
(3 rows)
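To see each chunk next to the summary generated for it, you can join the two destination tables on the key that links them (chunked_data__1321.unique_id matches summarized_data__1321.chunk_unique_id). This is plain SQL over the tables shown above:

-- Show each chunk alongside its generated summary
SELECT c.unique_id, c.chunk, s.summary
FROM chunked_data__1321 c
JOIN summarized_data__1321 s ON s.chunk_unique_id = c.unique_id
ORDER BY c.unique_id;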
The same automatic processing applies to deletions:
DELETE FROM source_table__1321 WHERE id = 1;
SELECT * FROM chunked_data__1321;
Output
 id | part_id | unique_id | chunk
----+---------+-----------+-------
(0 rows)
SELECT * FROM summarized_data__1321;
Output
 chunk_unique_id | summary
-----------------+---------
(0 rows)
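To clean up after trying the example, you can drop the tables it created. In a real setup, remove the two preparers and the model first using the functions documented in the preparer and model references; the statement below only covers the tables used in this example:

-- Remove the example tables (delete the preparers and the model first in a real setup)
DROP TABLE IF EXISTS source_table__1321, chunked_data__1321, summarized_data__1321;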