EDB Postgres Lakehouse Clusters on Hybrid Manager

EDB Postgres Lakehouse Clusters provide high-speed analytical compute for querying open lakehouse table formats (Apache Iceberg, Delta Lake) in object storage.

Hybrid Manager (HM) enables you to provision and manage Lakehouse Clusters through its UI and API, integrating them with your existing PGD clusters and broader analytics ecosystem.

For foundational concepts about EDB Postgres Lakehouse architecture, see EDB Postgres Lakehouse: An Overview.

Why use Lakehouse Clusters in Hybrid Manager

  • Managed analytical compute: Provision Lakehouse Clusters in HM to run fast queries on object storage.
  • Separation of compute and storage: Keep data in object storage and scale analytical compute as needed.
  • Integration with PGD: Use Lakehouse Clusters with PGD Tiered Tables and offloaded data.
  • Interoperability: Lakehouse Clusters can query Iceberg and Delta Lake tables shared with other analytics engines.
  • Familiar Postgres interface: Use standard SQL clients and tools with EDB Postgres Lakehouse Clusters.

Key terms and architecture overview

For definitions of core analytics terms used in Hybrid Manager, such as PGFS, PGAA, Lakehouse Cluster, and Analytics Offload, see Analytics Concepts in Hybrid Manager.

When to use Lakehouse Clusters in Hybrid Manager

Use Lakehouse Clusters in Hybrid Manager when you want to:

  • Run fast, scalable analytics on large datasets stored in object storage.
  • Query data from PGD Tiered Tables or offloaded PGD transactional data.
  • Integrate Postgres with external data lakes built on Iceberg or Delta Lake.
  • Give BI tools and ad hoc users Postgres SQL access to data lake content.
  • Manage analytical compute separately from transactional databases and scale it only when needed.

Key capabilities of Lakehouse Clusters in Hybrid Manager

Running fast queries on object storage

What: Run analytical queries on Apache Iceberg and Delta Lake tables stored in object storage.

Why: Enable fast, scalable analytics on large datasets without moving them into Postgres storage.

How: Provision Lakehouse Clusters in HM and query data using PGAA with vectorized query execution.

Where: S3-compatible object storage buckets connected via PGFS.

How-To: Query existing Iceberg tables

How-To: Query Delta Lake tables
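As a rough illustration of the How and Where items above, the following Python sketch generates the two SQL statements typically involved: registering an object storage bucket with PGFS, then defining a PGAA-backed table over a path in that bucket. The names `pgfs.create_storage_location` and the `pgaa.storage_location`, `pgaa.path`, and `pgaa.format` options are recalled from EDB's PGFS and PGAA documentation and should be treated as assumptions to verify against the linked how-to guides for your version. The bucket URL, location name, and table name are hypothetical.

```python
import json


def sql_literal(value: str) -> str:
    """Quote a Python string as a SQL string literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def storage_location_sql(name: str, url: str, options: dict) -> str:
    """Build the statement that registers an object storage location with PGFS.

    ASSUMPTION: the function name and (name, url, options) argument order are
    recalled from the PGFS docs. Option keys depend on the storage provider,
    so an empty dict is used in the example below.
    """
    return (
        "SELECT pgfs.create_storage_location("
        f"{sql_literal(name)}, {sql_literal(url)}, {sql_literal(json.dumps(options))});"
    )


def lakehouse_table_sql(table: str, location: str, path: str, fmt: str) -> str:
    """Build the statement that defines a PGAA table over Iceberg or Delta data.

    ASSUMPTION: the USING PGAA clause and pgaa.* option names are recalled from
    the PGAA docs. `table` is interpolated as-is, so pass only trusted identifiers.
    """
    if fmt not in ("iceberg", "delta"):
        raise ValueError(f"unsupported table format: {fmt}")
    return (
        f"CREATE TABLE {table} () USING PGAA WITH ("
        f"pgaa.storage_location = {sql_literal(location)}, "
        f"pgaa.path = {sql_literal(path)}, "
        f"pgaa.format = {sql_literal(fmt)});"
    )


if __name__ == "__main__":
    # Hypothetical bucket, location, and table names, for illustration only.
    print(storage_location_sql("sales_lake", "s3://example-bucket/lake", {}))
    print(lakehouse_table_sql("public.orders", "sales_lake", "warehouse/orders", "delta"))
```

Generating the statements rather than executing them keeps the sketch runnable without a cluster. You can paste the printed SQL into any Postgres client connected to the Lakehouse Cluster.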

Supporting PGD Tiered Tables and offloading

What: Use Lakehouse Clusters as the query target for PGD offloaded and tiered data.

Why: Optimize PGD operational storage while keeping historical data queryable.

How: Configure PGD node groups for analytics replication to object storage in Iceberg format.

Where: Offloaded PGD data in Iceberg, queried through Lakehouse Clusters.

How-To: Offload PGD data to Apache Iceberg

How-To: Query Tiered Tables

Querying external data lakes

What: Query Iceberg or Delta Lake tables created by other tools (Spark, Trino, Flink).

Why: Avoid data duplication and ETL by querying external data lakes from Postgres.

How: Connect Lakehouse Clusters to external Iceberg catalogs or directly to Delta Lake storage.

Where: S3-compatible object storage, with optional catalog integration.

How-To: Configure an Iceberg REST catalog connection

How-To: Query existing Iceberg tables

How-To: Query Delta Lake tables

Centralized management of analytical compute

What: Provision, monitor, and manage analytical clusters in HM alongside your PGD clusters.

Why: Manage analytical and transactional compute in one control plane.

How: Use HM UI or API to create and configure Lakehouse Clusters.

Where: Lakehouse Clusters run in cloud environments managed by HM.

How-To: Create a Lakehouse Cluster

Getting started with Lakehouse Clusters in Hybrid Manager

To begin using Lakehouse Clusters with Hybrid Manager:

  1. Provision a Lakehouse Cluster.
  2. Configure PGFS storage locations pointing to object storage.
  3. (Optional) Configure an Iceberg catalog connection.
  4. (Optional) Offload PGD data to Iceberg using Tiered Tables.
  5. Create Lakehouse tables or external readers for Iceberg/Delta data.
  6. Query your data using Postgres clients.
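Step 6 in the list above can be sketched in Python. Because a Lakehouse Cluster exposes a standard Postgres interface, any Postgres driver should work. This minimal example assumes the third-party `psycopg` (version 3) package. The host name, database, user, and table are hypothetical placeholders, and the password is expected to come from `PGPASSWORD` or `~/.pgpass` rather than the URI.

```python
from urllib.parse import quote, urlencode


def build_conninfo(host: str, dbname: str, user: str,
                   port: int = 5432, sslmode: str = "require") -> str:
    """Return a libpq-style connection URI for a Postgres-compatible endpoint."""
    query = urlencode({"sslmode": sslmode})
    return (
        f"postgresql://{quote(user, safe='')}@{host}:{port}/"
        f"{quote(dbname, safe='')}?{query}"
    )


def run_query(conninfo: str, sql: str) -> list:
    """Open a connection, run one query, and return all rows."""
    import psycopg  # third-party driver, imported lazily so the helper above has no dependencies

    with psycopg.connect(conninfo) as conn:
        return conn.execute(sql).fetchall()


if __name__ == "__main__":
    # Hypothetical endpoint and table; replace with values from your HM project.
    conninfo = build_conninfo("lakehouse.example.internal", "analytics", "analyst")
    print(conninfo)
    # Uncomment once the endpoint and table exist:
    # print(run_query(conninfo, "SELECT count(*) FROM public.orders"))
```

The same connection URI works with `psql` and with BI tools that accept a Postgres connection string.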

Next topic

Apache Iceberg in Hybrid Manager

