EDB Postgres Lakehouse

EDB Postgres® Lakehouse brings modern lakehouse architecture to Postgres-based analytics.

It enables fast SQL-based analytics on data stored in object storage (S3-compatible), using open table formats such as Apache Iceberg and Delta Lake.

For implementation and management in Hybrid Manager (HM), see Lakehouse clusters in EDB Hybrid Manager.

What is EDB Postgres Lakehouse

EDB Postgres Lakehouse is an architecture pattern and set of capabilities that:

Integrate Postgres with modern lakehouse patterns
Provide fast, scalable analytics on data in object storage
Use vectorized query execution and columnar storage formats
Leverage open table formats for interoperability and data governance
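
To make the point about columnar storage and batch-style execution concrete, here is a minimal Python sketch. It is a toy illustration only, not how the Lakehouse engine is implemented: the table, field names, and both helper functions are hypothetical. It contrasts summing one field across row-oriented records with scanning a single contiguous column, which is roughly the access pattern that columnar formats such as Parquet make cheap.

```python
from array import array

# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"order_id": 1, "region": "eu", "amount": 10.0},
    {"order_id": 2, "region": "us", "amount": 25.5},
    {"order_id": 3, "region": "eu", "amount": 4.5},
]

# Column-oriented layout: each field is stored as one contiguous sequence.
# (Hypothetical in-memory stand-in for a columnar file.)
columns = {
    "order_id": array("q", [1, 2, 3]),
    "region": ["eu", "us", "eu"],
    "amount": array("d", [10.0, 25.5, 4.5]),
}


def total_amount_row_wise(records):
    """Visit every record, one at a time, to sum a single field."""
    total = 0.0
    for record in records:
        total += record["amount"]
    return total


def total_amount_columnar(cols):
    """Scan only the 'amount' column and aggregate it as one batch."""
    return sum(cols["amount"])


row_total = total_amount_row_wise(rows)
col_total = total_amount_columnar(columns)
print(row_total, col_total)
```

Both functions return the same total; the difference is that the columnar version never reads order_id or region, which matters when a table has many columns and a query needs only a few.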

Related concept: Data lakehouse

Why Lakehouse matters for EDB analytics

Lakehouse enables Analytics Accelerator users to:

Run fast SQL analytics directly on data in object storage, with no data movement required
Implement separation of storage and compute for cost efficiency
Support interoperability with external tools (Spark, Trino, Flink, data science frameworks)
Query PGD Tiered Tables offloaded to Iceberg seamlessly
Enable unified OLTP + OLAP architectures with Postgres at the center
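
The idea of querying recent data and offloaded data through one interface can be sketched in a few lines of Python. This is a conceptual model only: hot_rows, cold_rows, and query_tiered are hypothetical names, and in a real deployment the tiers would live in Postgres and in object storage rather than in Python lists.

```python
from datetime import date

# Hypothetical stand-ins: recent ("hot") rows kept in Postgres and
# older ("cold") rows offloaded to object storage.
hot_rows = [
    {"event_day": date(2025, 6, 1), "amount": 30},
    {"event_day": date(2025, 6, 2), "amount": 12},
]
cold_rows = [
    {"event_day": date(2024, 1, 15), "amount": 7},
    {"event_day": date(2024, 3, 9), "amount": 21},
]


def query_tiered(since):
    """Return rows on or after `since` from both tiers, oldest first.

    The caller does not need to know which tier a row came from.
    """
    matches = [r for r in cold_rows + hot_rows if r["event_day"] >= since]
    return sorted(matches, key=lambda r: r["event_day"])


result = query_tiered(date(2024, 3, 1))
total = sum(r["amount"] for r in result)
print(len(result), total)
```

The caller issues one query and receives rows from both tiers, which is the behavior the tiered-table bullet above describes at a much larger scale.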

Related concept: Analytics Accelerator concepts

How EDB implements Lakehouse architecture

Core components:

EDB Postgres Lakehouse Nodes:
  Stateless analytical compute nodes
  Provisioned and managed via Hybrid Manager (HM) or self-managed
Vectorized query engine:
  Powered by Apache DataFusion
  Processes data in Parquet and other columnar formats efficiently
PGAA:
  Postgres extensions enabling Lakehouse behavior:
    External tables over Iceberg and Delta Lake
    Unified access to PGD hot data + offloaded cold data
PGFS:
  Unified access layer to object storage
  Supports S3, GCS, MinIO, Azure Data Lake Storage, and compatible systems
Open table formats:
  Apache Iceberg (full support including catalogs)
  Delta Lake (read-only support)
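
The difference in table format support listed above can be modeled as a small capability check. The Python sketch below is illustrative only: the Access enum, the FORMATS mapping, and the can() helper are hypothetical names, and it assumes that "full support" for Iceberg means both reads and writes.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    READ = "read"
    WRITE = "write"


@dataclass(frozen=True)
class TableFormat:
    name: str
    allowed: frozenset  # set of Access values this format supports


# Assumption: "full support" is modeled as read + write,
# and "read-only support" as read alone.
FORMATS = {
    "iceberg": TableFormat("Apache Iceberg", frozenset({Access.READ, Access.WRITE})),
    "delta": TableFormat("Delta Lake", frozenset({Access.READ})),
}


def can(format_key, access):
    """Return True if the given format supports the requested access."""
    return access in FORMATS[format_key].allowed


print(can("iceberg", Access.WRITE))
print(can("delta", Access.WRITE))
```

A check like this is a useful mental model when planning a pipeline: data that must be written back from Postgres belongs in Iceberg, while existing Delta Lake tables can be queried in place.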

Common use cases

Use case	EDB Postgres Lakehouse capability
Business intelligence & reporting	Run fast, scalable SQL on data in object storage
Historical analytics	Seamlessly query offloaded PGD Tiered Tables
Data lake analytics	Query existing Iceberg and Delta Lake tables without ETL
Data science pipelines	Provide efficient access to training data and features
Hybrid architectures	Enable Postgres-centered OLTP + OLAP patterns

Role-based guidance

Database administrators (DBAs)

Analytics Accelerator for your role: DBA

Data scientists / analysts

Analytics Accelerator for your role: Data scientist / analyst

DevOps / SRE

Analytics Accelerator for your role: DevOps / SRE

Application developers

Analytics Accelerator for your role: Application developer

Next steps

For Hybrid Manager users

Lakehouse clusters in EDB Hybrid Manager

How-To guides

Explore more in the Analytics Accelerator learning guide.
