Configure PGD node group for analytics offload

After configuring a PGFS storage location, you must configure your PGD node group to offload analytics data for Tiered Tables.

You can choose to:

  • Offload without a catalog (simpler setup — filesystem-based Iceberg tables), or
  • Offload to an Iceberg catalog (recommended — enables interoperability with external tools)

Why configure PGD node group for offload?

Configuring offload on the node group:

  • Enables automatic offloading of cold partitions created by BDR AutoPartition.
  • Lets you control where offloaded data is stored: in raw filesystem paths or as catalog-managed Iceberg tables.
  • Is required to implement full Tiered Tables behavior.

For background, see: Tiered Tables in Hybrid Manager

Prerequisites

Before configuring the PGD node group for offload, make sure you have configured a PGFS storage location.

Offloading without a catalog

If you want PGD to offload directly to raw Iceberg filesystem paths:

Step 1: Identify your PGD node group

SELECT node_group_name FROM bdr.node_group;

Example result: edb_pgdx_node_group_1

Step 2: Set the node group option analytics_storage_location

SELECT bdr.alter_node_group_option(
    'edb_pgdx_node_group_1',
    'analytics_storage_location',
    'hm_tiered_analytics_store'
);

  • Replace 'hm_tiered_analytics_store' with the name of your PGFS storage location.

Result:

  • PGD will offload eligible partitions into filesystem-based Iceberg tables at this location.

Offloading to an Iceberg catalog

If you want PGD to offload into Iceberg catalog-managed tables:

Step 1: Ensure your catalog is configured and attached

SELECT pgaa.add_catalog(
    'hm_main_lakekeeper',
    'iceberg-rest',
    '{
        "url": "https://hm.example.com/catalog/v1",
        "token": "your_hm_api_key",
        "warehouse": "lakehouse_warehouse_1"
    }'
);

SELECT pgaa.attach_catalog('hm_main_lakekeeper');
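
Because the catalog options are passed as a JSON string, it can help to check the string before using it. The following is a minimal sketch that uses only core PostgreSQL (the jsonb cast and the ?& operator), not any pgaa function. The key list is copied from the example above and is an assumption for illustration, not a statement of which keys pgaa.add_catalog requires.

```sql
-- Sketch: sanity-check the catalog options string with core PostgreSQL only.
-- The cast to jsonb raises an error if the string is not valid JSON.
-- The ?& operator returns true when every listed key exists at the top level.
-- Assumption: the key list mirrors the example above; it is not taken from
-- the pgaa.add_catalog specification.
SELECT opts ?& array['url', 'token', 'warehouse'] AS has_expected_keys
FROM (
    SELECT '{
        "url": "https://hm.example.com/catalog/v1",
        "token": "your_hm_api_key",
        "warehouse": "lakehouse_warehouse_1"
    }'::jsonb AS opts
) AS s;
```

Substitute your own options string for the literal. A result of true only means the JSON parses and contains these three keys; it does not confirm that the URL, token, or warehouse are valid.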

Step 2: Set the node group option analytics_write_catalog

SELECT bdr.alter_node_group_option(
    'edb_pgdx_node_group_1',
    'analytics_write_catalog',
    'hm_main_lakekeeper'
);

  • Replace 'hm_main_lakekeeper' with the name of your attached catalog.

Result:

  • PGD will offload eligible partitions as Iceberg tables managed by this catalog.

Validate the configuration

Run this to verify:

SELECT * FROM bdr.node_group_option WHERE node_group_name = 'edb_pgdx_node_group_1';

You should see either:

  • analytics_storage_location set (no catalog), or
  • analytics_write_catalog set (using a catalog).
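
If the full listing is noisy, the check can be narrowed to the two offload options. The following is a minimal sketch that assumes only what the verification query already uses: the bdr.node_group_option relation and its node_group_name column. The names of the remaining columns are not shown on this page, so the sketch converts each row to JSON with core PostgreSQL's to_jsonb and matches on the option names instead of filtering on a specific column.

```sql
-- Sketch: show only the rows that mention one of the two offload options.
-- Assumption: bdr.node_group_option has a node_group_name column, as in the
-- verification query. No other column names are assumed; to_jsonb(o)
-- serializes the whole row so the regular expression can match whichever
-- column holds the option name.
SELECT to_jsonb(o) AS option_row
FROM bdr.node_group_option AS o
WHERE o.node_group_name = 'edb_pgdx_node_group_1'
  AND to_jsonb(o)::text ~ 'analytics_(storage_location|write_catalog)';
```

How the result looks depends on how the relation lays out its options, so treat this as a convenience filter rather than a definitive check.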

Notes

  • You must set either analytics_storage_location or analytics_write_catalog, not both.
  • Using a catalog is strongly recommended if you want offloaded data to be queryable by:
      • Lakehouse clusters
      • External analytics engines (Spark, Trino, etc.)
  • When using no catalog, offloaded tables will be written into filesystem paths in your PGFS location — suitable for internal use, but harder to interoperate.

Next steps

With PGD node group offload configured, see how this step fits into the overall Tiered Tables flow: Tiered Tables in Hybrid Manager

