Preparer chaining example
Example of chaining multiple preparers together with auto-processing using the ChunkText and SummarizeText operations in AI Accelerator.
Create the first preparer to chunk text
-- Create source test table
CREATE TABLE source_table__1321 (
    id INT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    content TEXT NOT NULL
);

SELECT aidb.create_table_preparer(
    name => 'chunking_preparer__1321',
    operation => 'ChunkText',
    source_table => 'source_table__1321',
    source_key_column => 'id',
    source_data_column => 'content',
    destination_table => 'chunked_data__1321',
    destination_data_column => 'chunk',
    destination_key_column => 'id',
    options => '{"desired_length": 1000}'::JSONB -- Configuration for the ChunkText operation
);
Create the second preparer to summarize the chunked text
-- Create the model. It must support the decode_text and decode_text_batch operations.
SELECT aidb.create_model('model__1321', 't5_local');

SELECT aidb.create_table_preparer(
    name => 'summarizing_preparer__1321',
    operation => 'SummarizeText',
    source_table => 'chunked_data__1321',        -- Reference the output from the ChunkText preparer
    source_key_column => 'unique_id',            -- Reference the unique column from the output of the ChunkText preparer
    source_data_column => 'chunk',               -- Reference the output from the ChunkText preparer
    destination_table => 'summarized_data__1321',
    destination_data_column => 'summary',
    destination_key_column => 'chunk_unique_id',
    options => '{"model": "model__1321"}'::JSONB -- Configuration for the SummarizeText operation
);
Tip
This operation transforms the shape of the data, automatically unnesting collections by introducing a part_id column. See the unnesting concept for more detail.
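Because the unnested rows keep the source key in the id column, you can join the chunked output back to the source table to see which chunks came from which source row. This sketch uses only the tables created above:

-- Join each chunk back to its originating source row
SELECT s.id AS source_id, c.part_id, c.unique_id, c.chunk
FROM source_table__1321 s
JOIN chunked_data__1321 c ON c.id = s.id
ORDER BY s.id, c.part_id;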
Set both preparers to live auto-processing
SELECT aidb.set_auto_preparer('chunking_preparer__1321', 'Live');
SELECT aidb.set_auto_preparer('summarizing_preparer__1321', 'Live');
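You can also switch auto-processing off again later. The mode name below is an assumption based on the auto-processing modes used elsewhere in Pipelines; check the preparer reference for the modes available in your version:

-- Stop live auto-processing for the chunking preparer (mode name assumed; see the preparer reference)
SELECT aidb.set_auto_preparer('chunking_preparer__1321', 'Disabled');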
Insert data for processing
When you insert data into the source data table, you see processed results flowing automatically:
INSERT INTO source_table__1321
    VALUES (1, 'This is a significantly longer text example that might require splitting into smaller chunks. The purpose of this function is to partition text data into segments of a specified maximum length, for example, this sentence is 145 characters. This enables processing or storage of data in manageable parts.');
The chunks are calculated automatically:
SELECT * FROM chunked_data__1321;
Output
 id | part_id | unique_id | chunk
----+---------+-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------
  1 | 0       | 1.part.0  | This is a significantly longer text example that might require splitting into smaller chunks.
  1 | 1       | 1.part.1  | The purpose of this function is to partition text data into segments of a specified maximum length, for example, this sentence is 145 characters.
  1 | 2       | 1.part.2  | This enables processing or storage of data in manageable parts.
(3 rows)
Summaries of the chunks are calculated automatically:
SELECT * FROM summarized_data__1321;
Output
 chunk_unique_id | summary
-----------------+------------------------------------------------------------------------------------------------------
 1.part.0        | text example might require splitting into smaller chunks .
 1.part.1        | the purpose of this function is to partition text data into segments of a specified maximum length .
 1.part.2        | enables processing or storage of data in manageable parts .
(3 rows)
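To see each chunk next to the summary generated for it, you can join the two destination tables on the key that links them (chunked_data__1321.unique_id matches summarized_data__1321.chunk_unique_id). This is plain SQL over the tables shown above:

-- Show each chunk alongside its generated summary
SELECT c.unique_id, c.chunk, s.summary
FROM chunked_data__1321 c
JOIN summarized_data__1321 s ON s.chunk_unique_id = c.unique_id
ORDER BY c.unique_id;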
The same automatic processing applies to deletions:
DELETE FROM source_table__1321 WHERE id = 1;
SELECT * FROM chunked_data__1321;
Output
 id | part_id | unique_id | chunk
----+---------+-----------+-------
(0 rows)
SELECT * FROM summarized_data__1321;
Output
 chunk_unique_id | summary
-----------------+---------
(0 rows)
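To clean up after trying the example, you can drop the tables it created. In a real setup, remove the two preparers and the model first using the functions documented in the preparer and model references; the statement below only covers the tables used in this example:

-- Remove the example tables (delete the preparers and the model first in a real setup)
DROP TABLE IF EXISTS source_table__1321, chunked_data__1321, summarized_data__1321;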