Tiered Tables in Hybrid Manager v1.3.5

Tiered Tables in Hybrid Manager (HM) provide an automated solution for managing large time-series or historical datasets by moving older data from PGD clusters to cost-effective object storage in Apache Iceberg® format.


Hybrid Manager integrates this capability across:

  • PGD clusters with BDR AutoPartition
  • PGAA extensions for offloading
  • Lakehouse Clusters for querying
  • Centralized catalog services (Lakekeeper or external Iceberg catalogs)

Why use Tiered Tables with Hybrid Manager

  • Lower storage costs: Offload "cold" data to object storage (Iceberg) and shrink primary PGD transactional tables.
  • Faster transactional performance: Keep "hot" data partitions small for efficient PGD operations.
  • Automated lifecycle management: Move data across tiers automatically based on age.
  • Transparent analytics: Query both hot and cold data through the PGD parent table or a Lakehouse Cluster.
  • Unified management: Configure and monitor all components through Hybrid Manager.

Key terms and architecture overview

When should I use Tiered Tables in Hybrid Manager?

Use Tiered Tables in Hybrid Manager when you want to:

  • Manage large time-series datasets in a cost-efficient way.
  • Keep PGD operational tables lean for better performance.
  • Meet compliance needs by keeping older data available but outside of PGD storage.
  • Enable BI users to run historical trend queries without impacting production databases.
  • Automate your data lifecycle with minimal manual intervention.

Use cases for Tiered Tables

  • Time-series data: Logging, IoT sensor readings, application telemetry.
  • Archival: Long-term retention of cold data for compliance.
  • Historical trend analysis: BI tools querying years of data without impacting PGD performance.
  • Large, append-mostly tables: Keep transactional footprint small while retaining full analytical access.

How Tiered Tables work in your HM architecture

  • PGD clusters: Manage partitioning and automatic offload of old partitions to Iceberg.
  • PGFS storage locations: Define object storage targets for offload.
  • Iceberg catalogs: Optionally manage offloaded tables in a catalog (Lakekeeper or external).
  • Lakehouse Clusters: Provide scalable analytical compute to query offloaded Iceberg data.
  • Monitoring: Use HM monitoring tools and observability queries to track offload status and storage savings.

Prerequisites within EDB Hybrid Manager

Before implementing Tiered Tables in HM:

  • Active Hybrid Manager instance
  • Provisioned PGD cluster: Version 6.0+ with PGAA and PGFS extensions enabled
  • Lakehouse Cluster (recommended): For querying offloaded data
  • Catalog service: Optional but recommended. Use either the HM-managed Lakekeeper or an external REST-compatible catalog
  • Machine user for the catalog (if you use one): With the appropriate catalog data writer/reader permissions
  • Object storage: S3-compatible, with credentials if private
  • User permissions: Database user must have create/alter/execute privileges for BDR and PGAA functions

Main capabilities

  • Automated partitioning: Define BDR AutoPartition strategy and analytics_offload_period.
  • Storage tiering: Use PGFS or Iceberg catalog targets for offloaded data.
  • Transparent querying: Queries against the PGD parent table can read from both the local and Iceberg tiers. Lakehouse Clusters can query Iceberg tables directly.
  • Monitor status: Track offload progress, validate Iceberg content, and observe space savings.
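
To make the age-based tiering idea concrete, here is a minimal Python sketch of how partitions might be split into hot and cold sets for a given offload period. It is an illustration only, not how PGD implements offload: the Partition class, the classify_partitions helper, the partition names, and the rule "a partition is cold once its upper bound is older than today minus the offload period" are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Partition:
    name: str    # hypothetical partition name
    lower: date  # inclusive lower bound of the partition's range
    upper: date  # exclusive upper bound of the partition's range


def classify_partitions(partitions, offload_period, today):
    """Split partitions into hot (stay in PGD) and cold (eligible for offload).

    Assumed rule: a partition is cold once its upper bound is at or before
    today - offload_period, so every row in it is past the offload age.
    """
    cutoff = today - offload_period
    hot, cold = [], []
    for p in partitions:
        (cold if p.upper <= cutoff else hot).append(p.name)
    return hot, cold


# Monthly partitions of a hypothetical sensor readings table.
partitions = [
    Partition("readings_2023_01", date(2023, 1, 1), date(2023, 2, 1)),
    Partition("readings_2024_01", date(2024, 1, 1), date(2024, 2, 1)),
    Partition("readings_2024_06", date(2024, 6, 1), date(2024, 7, 1)),
]
hot, cold = classify_partitions(
    partitions, offload_period=timedelta(days=365), today=date(2025, 1, 15)
)
print("hot:", hot)
print("cold:", cold)
```

With a one-year offload period and these sample dates, only the January 2023 partition falls entirely before the cutoff, so it is the only one classified as cold. In a real deployment the offload period is set through the AutoPartition policy, and PGD decides when each partition moves.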

Getting started with Tiered Tables in Hybrid Manager

To begin using Tiered Tables with Hybrid Manager:

  1. Configure PGFS for object storage access.
  2. Learn Tiered Tables concepts and apply policies (AutoPartition, offload period) in PGD.
  3. (Optional) Use a catalog (Lakekeeper or external) when interoperability across engines is needed.
  4. Query offloaded Iceberg data via Lakehouse Clusters.

Observability tips

  • Use HM dashboards for PGD cluster health and offload progress.
  • Run analytics queries on bdr.analytics_table and partition views.
  • Use pg_total_relation_size() to observe space reclaimed on PGD nodes.
  • Use your cloud storage console or storage analytics to track the growth of Iceberg objects.
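
As a sketch of the pg_total_relation_size() tip, the Python snippet below holds a standard PostgreSQL catalog query that lists the size of each child partition, plus a small helper that compares two size snapshots taken before and after an offload. The parent table name public.sensor_readings, the reclaimed_bytes helper, and the byte counts are hypothetical. How an offloaded partition appears in the catalog is not covered here, so the helper simply treats a partition missing from the later snapshot as fully reclaimed.

```python
# Standard PostgreSQL catalog query: total on-disk size of each child partition.
# "public.sensor_readings" is a hypothetical parent table name.
PARTITION_SIZES_SQL = """
SELECT c.relname AS partition_name,
       pg_total_relation_size(c.oid) AS total_bytes
FROM pg_inherits AS i
JOIN pg_class AS c ON c.oid = i.inhrelid
WHERE i.inhparent = 'public.sensor_readings'::regclass
ORDER BY c.relname;
"""


def reclaimed_bytes(before, after):
    """Return per-partition bytes freed between two size snapshots.

    Each snapshot maps partition name -> total_bytes, as returned by the
    query above. A partition missing from `after` counts as fully reclaimed;
    partitions that grew or stayed the same are left out of the result.
    """
    freed = {}
    for name, size in before.items():
        delta = size - after.get(name, 0)
        if delta > 0:
            freed[name] = delta
    return freed


# Illustrative numbers only, not measurements.
before = {"readings_2023_01": 5_000_000, "readings_2024_06": 7_000_000}
after = {"readings_2023_01": 16_384, "readings_2024_06": 7_100_000}
freed = reclaimed_bytes(before, after)
print(freed, sum(freed.values()))
```

Run the query on a PGD node with any PostgreSQL client before and after the offload window, then feed the two result sets to the helper to see which partitions shrank and by how much.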

Next topic

Lakehouse Clusters in Hybrid Manager