Configure PGD node group for analytics offload

Suggest edits

After configuring a PGFS storage location, you must configure your PGD node group to offload analytics data for Tiered Tables.

You can choose to:

Offload without a catalog (simpler setup — filesystem-based Iceberg tables), or
Offload to an Iceberg catalog (recommended — enables interoperability with external tools)

Why configure PGD node group for offload?

By configuring this:

You enable automatic offloading of cold partitions created by BDR AutoPartition.
You can control where your offloaded data is stored — either in raw filesystem paths or as catalog-managed Iceberg tables.
This step is required to implement full Tiered Tables behavior.

For background, see: Tiered Tables in Hybrid Manager

Prerequisites

Before configuring PGD node group for offload:

You have a PGD cluster provisioned in Hybrid Manager.
You have created a PGFS storage location: Configure PGFS storage for Tiered Tables
(Optional) You have configured an Iceberg catalog connection (recommended): Configure an Iceberg catalog connection

Offloading without a catalog

If you want PGD to offload directly to raw Iceberg filesystem paths:

Step 1: Identify your PGD node group

SELECT node_group_name FROM bdr.node_group;

Example result: edb_pgdx_node_group_1

Step 2: Set the node group option `analytics_storage_location`

SELECT bdr.alter_node_group_option(
'edb_pgdx_node_group_1',
'analytics_storage_location',
'hm_tiered_analytics_store'
);

Replace 'hm_tiered_analytics_store' with the name of your PGFS storage location.

Result:

PGD will offload eligible partitions into filesystem-based Iceberg tables at this location.

Offloading with an Iceberg catalog (recommended)

If you want PGD to offload into Iceberg catalog-managed tables:

Step 1: Ensure your catalog is configured and attached

SELECT pgaa.add_catalog(
'hm_main_lakekeeper',
'iceberg-rest',
'{
"url": "https://hm.example.com/catalog/v1",
"token": "your_hm_api_key",
"warehouse": "lakehouse_warehouse_1"
}'
);

SELECT pgaa.attach_catalog('hm_main_lakekeeper');

Step 2: Set the node group option `analytics_write_catalog`

SELECT bdr.alter_node_group_option(
'edb_pgdx_node_group_1',
'analytics_write_catalog',
'hm_main_lakekeeper'
);

Replace 'hm_main_lakekeeper' with the name of your attached catalog.

Result:

PGD will offload eligible partitions as Iceberg tables managed by this catalog.

Validate the configuration

Run this to verify:

SELECT * FROM bdr.node_group_option WHERE node_group_name = 'edb_pgdx_node_group_1';

You should see either:

analytics_storage_location set (no catalog), or
analytics_write_catalog set (using a catalog).

Notes

You must set either analytics_storage_location or analytics_write_catalog, not both.
Using a catalog is strongly recommended if you want offloaded data to be queryable by:
Lakehouse clusters
External analytics engines (Spark, Trino, etc.)
When using no catalog, offloaded tables will be written into filesystem paths in your PGFS location — suitable for internal use, but harder to interoperate.

Next steps

Now that you have configured PGD node group offload:

You can configure AutoPartition to begin partitioning and triggering offload: Configure BDR AutoPartition with analytics offload
You can monitor Tiered Table offload and storage savings: Monitor Tiered Tables status and storage savings
You can query Tiered Tables from PGD parent table or Lakehouse: Query Tiered Tables from PGD and Lakehouse
If you need to adjust storage location or catalog later, simply run bdr.alter_node_group_option() again.

For an architecture view of how this fits into the Tiered Tables flow, see: Tiered Tables in Hybrid Manager

← Prev

Configure BDR AutoPartition with analytics offload

↑ Up

How-To Guides for Analytics in Hybrid Manager

Configure PGFS storage for Tiered Tables

Could this page be better? Report a problem or suggest an addition!