Create a Lakehouse cluster v1.2
This guide explains how to create a Lakehouse cluster in Hybrid Manager (HM). Lakehouse clusters provide scalable analytical compute for fast queries on Apache Iceberg® or Delta Lake tables stored in object storage.
For architectural background, see Lakehouse clusters in Hybrid Manager.
Why create a Lakehouse cluster?
By creating a Lakehouse cluster:
- You unlock the ability to run fast Postgres SQL queries on large datasets in object storage.
- You can query Tiered Tables offloaded from PGD or external Iceberg/Delta Lake tables.
- You provide compute for BI tools and ad-hoc analytics on your lakehouse data.
This is a foundational step for building your analytics architecture in Hybrid Manager.
Goals
After completing this How-To, you will be able to:
- Deploy a Lakehouse cluster.
- Connect to it using standard Postgres clients.
- Configure storage and catalogs for querying Iceberg and Delta Lake tables.
- Use the cluster to query existing or newly defined lakehouse tables.
Prerequisites
Before you begin:
- You have access to a Hybrid Manager environment with permissions to create Lakehouse clusters.
- You have object storage available (S3, GCS, MinIO, etc.).
- You have Iceberg or Delta Lake tables available, or you plan to define them after the cluster is created.
- If you are working with Iceberg tables, you have an Iceberg catalog prepared (see Configure an Iceberg REST catalog connection).
Steps
Step 1: Initiate cluster creation
- In the HM dashboard, click Create New.
- Select Lakehouse Analytics or Analytical Cluster.
- Confirm that you are creating an Analytics Cluster for querying open lakehouse formats.
Step 2: Choose your path
You will be prompted to choose:
- Templates — Use a pre-configured template (if available).
- Custom build — Click Start from Scratch for full configuration control (recommended for most cases).
Step 3: Configure cluster settings
Cluster Settings
- Cluster name: Enter a unique name (e.g. sales_analytics_lakehouse).
- Password: Enter a strong superuser password.
- Tags: Optionally add tags (e.g. environment:production,project:q3-reporting).
- Deployment location: Select the cloud provider and region.
- Database type: Choose EDB Postgres Extended Server or EDB Postgres Advanced Server.
- Postgres version: Select a supported version that includes the Lakehouse extensions.
- Instance size: Select a compute size appropriate for your workload.
Additional Settings
- Networking: Configure the VPC, subnet, security groups, and IP whitelisting.
- Backups: Configure backup settings if applicable.
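Before filling in the form, it can help to write the planned settings down and check them. The following is a minimal Python sketch under stated assumptions: the field names simply mirror the form labels above and are not an HM API, and the tag format is assumed to be the comma-separated key:value style shown in the example. The password is deliberately left out so that it is not stored in a planning file.

```python
from dataclasses import dataclass, field


def parse_tags(raw: str) -> dict[str, str]:
    """Parse 'key:value,key:value' into a dict, rejecting malformed entries."""
    tags: dict[str, str] = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty input and trailing commas
        key, sep, value = item.partition(":")
        if not sep or not key.strip() or not value.strip():
            raise ValueError(f"malformed tag: {item!r}")
        tags[key.strip()] = value.strip()
    return tags


@dataclass
class LakehouseClusterSettings:
    # Hypothetical field names that mirror the form labels; not an HM API.
    cluster_name: str
    deployment_location: str
    database_type: str
    postgres_version: str
    instance_size: str
    tags: dict[str, str] = field(default_factory=dict)

    def missing_fields(self) -> list[str]:
        """Return the names of required settings that are still blank."""
        required = (
            "cluster_name",
            "deployment_location",
            "database_type",
            "postgres_version",
            "instance_size",
        )
        return [name for name in required if not getattr(self, name).strip()]
```

Calling missing_fields() on a partially filled plan lists the settings you still need to decide before you reach the review step.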
Step 4: Review and create
- Review the Cluster Summary.
- Confirm all settings.
- Click Create Cluster.
- Monitor progress in the HM dashboard.
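If you prefer to script the wait rather than watch the dashboard, the general pattern is a polling loop with a timeout. The sketch below is not tied to any HM interface: get_status is a hypothetical callable that you would supply yourself, and the state names "ready" and "failed" are assumptions rather than documented HM values.

```python
import time
from typing import Callable


def wait_until_ready(
    get_status: Callable[[], str],
    ready_states: frozenset[str] = frozenset({"ready"}),    # assumed state name
    failed_states: frozenset[str] = frozenset({"failed"}),  # assumed state name
    timeout_s: float = 1800.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it reports a ready state, a failure, or a timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status().lower()
        if status in ready_states:
            return status
        if status in failed_states:
            raise RuntimeError(f"cluster provisioning ended in state {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"cluster not ready after {timeout_s} s (last state: {status!r})"
            )
        sleep(interval_s)  # wait before asking again
```

Passing sleep as a parameter keeps the loop easy to test without real delays.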
What you can do next
Now that your Lakehouse cluster is provisioned, you can:
- Connect to the cluster to run SQL queries. Use standard Postgres clients (psql, DBeaver, pgAdmin) with the connection details that HM provides.
- Configure storage and catalogs to access data:
  - Configure a PGFS storage location if you are querying Delta Lake or filesystem-based Iceberg tables.
  - Configure an Iceberg REST catalog connection if you are querying catalog-managed Iceberg tables or using PGD Tiered Tables.
- Define and query Lakehouse tables.
- Query Tiered Tables offloaded from PGD.
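For the connection step, most Postgres clients accept a libpq-style connection URI. The sketch below assembles one from the connection details that HM provides. The host, user, and database values in the usage note are placeholders, and sslmode=require is an assumption about your environment rather than an HM requirement.

```python
from urllib.parse import quote


def build_postgres_uri(
    host: str,
    port: int,
    dbname: str,
    user: str,
    password: str,
    sslmode: str = "require",  # assumption: adjust to your environment's TLS policy
) -> str:
    """Build a libpq connection URI, percent-encoding user, password, and dbname."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(dbname, safe='')}?sslmode={sslmode}"
    )
```

For example, build_postgres_uri("lakehouse.example.internal", 5432, "postgres", "admin_user", "p@ss/word") percent-encodes the special characters in the password, and the resulting string can be passed to psql as its connection argument.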