Preparers auto-processing example
This example uses preparer auto-processing with the ChunkText operation in AI Accelerator.
Tip
This operation transforms the shape of the data, automatically unnesting collections by introducing a part_id column. See the unnesting concept for more detail.
Preparer with table data source
-- Create source test table
CREATE TABLE source_table__1628 (
    id INT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    content TEXT NOT NULL
);

SELECT aidb.create_table_preparer(
    name => 'preparer__1628',
    operation => 'ChunkText',
    source_table => 'source_table__1628',
    source_data_column => 'content',
    destination_table => 'chunked_data__1628',
    destination_data_column => 'chunk',
    source_key_column => 'id',
    destination_key_column => 'id',
    options => '{"desired_length": 150}'::JSONB  -- Configuration for the ChunkText operation
);

SELECT aidb.set_auto_preparer('preparer__1628', 'Live');

INSERT INTO source_table__1628
VALUES (1, 'This is a significantly longer text example that might require splitting into smaller chunks. The purpose of this function is to partition text data into segments of a specified maximum length, for example, this sentence is 145 characters. This enables processing or storage of data in manageable parts.');
SELECT * FROM chunked_data__1628;
Output
 id | part_id | unique_id |                                                                       chunk
----+---------+-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------
  1 |       0 | 1.part.0  | This is a significantly longer text example that might require splitting into smaller chunks.
  1 |       1 | 1.part.1  | The purpose of this function is to partition text data into segments of a specified maximum length, for example, this sentence is 145 characters.
  1 |       2 | 1.part.2  | This enables processing or storage of data in manageable parts.
(3 rows)
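As the tip above describes, each source row is unnested into several destination rows keyed by id and part_id. If you want to view the chunks joined back together per source row, a query along the following lines (an illustrative sketch, not part of the original example) reassembles them in part order:

-- Illustrative only: group the chunks per source row, ordered by part_id,
-- to approximate the original content.
SELECT id,
       string_agg(chunk, ' ' ORDER BY part_id) AS reassembled
FROM chunked_data__1628
GROUP BY id
ORDER BY id;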
INSERT INTO source_table__1628 VALUES (2, 'This sentence is short enough for desired_length=150 so it wont be chunked. But this second sentence will be in another chunk since both together are over 150 characters.');
SELECT * FROM chunked_data__1628;
Output
 id | part_id | unique_id |                                                                       chunk
----+---------+-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------
  1 |       0 | 1.part.0  | This is a significantly longer text example that might require splitting into smaller chunks.
  1 |       1 | 1.part.1  | The purpose of this function is to partition text data into segments of a specified maximum length, for example, this sentence is 145 characters.
  1 |       2 | 1.part.2  | This enables processing or storage of data in manageable parts.
  2 |       0 | 2.part.0  | This sentence is short enough for desired_length=150 so it wont be chunked.
  2 |       1 | 2.part.1  | But this second sentence will be in another chunk since both together are over 150 characters.
(5 rows)
DELETE FROM source_table__1628 WHERE id = 1;
SELECT * FROM chunked_data__1628;
Output
 id | part_id | unique_id |                                             chunk
----+---------+-----------+-------------------------------------------------------------------------------------------------
  2 |       0 | 2.part.0  | This sentence is short enough for desired_length=150 so it wont be chunked.
  2 |       1 | 2.part.1  | But this second sentence will be in another chunk since both together are over 150 characters.
(2 rows)
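Because the preparer is set to Live, deleting the source row also removed its chunks from the destination table. A quick check like the following (illustrative, not part of the original example) should report a count of 0 for the deleted source id:

-- Illustrative only: confirm that the chunks for source row 1 are gone.
SELECT count(*) AS remaining_chunks
FROM chunked_data__1628
WHERE id = 1;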
SELECT aidb.set_auto_preparer('preparer__1628', 'Disabled');