EDB Postgres Lakehouse

EDB Postgres® Lakehouse brings modern lakehouse architecture to Postgres-based analytics.

It enables fast SQL-based analytics on data stored in object storage (S3-compatible), using open table formats such as Apache Iceberg and Delta Lake.

For implementation and management in Hybrid Manager (HM), see Lakehouse clusters in EDB Hybrid Manager.

What is EDB Postgres Lakehouse

EDB Postgres Lakehouse is an architecture pattern and set of capabilities that:

  • Integrate Postgres with modern lakehouse patterns
  • Provide fast, scalable analytics on data in object storage
  • Use vectorized query execution and columnar storage formats
  • Leverage open table formats for interoperability and data governance
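
To make the columnar-storage point concrete, here is a toy Python sketch of why a column-oriented layout suits analytical queries: an aggregate with a filter only has to read the two columns it needs. The sample rows and the helper names `to_columns` and `sum_where` are hypothetical and exist only for illustration; this is not how the Lakehouse engine is implemented.

```python
# Row-oriented layout: all fields of a record are stored together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list per column (a columnar layout)."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_where(columns, value_col, filter_col, wanted):
    """Sum value_col for entries whose filter_col equals wanted.

    Only the two named columns are read; every other column is never touched.
    """
    values = columns[value_col]
    flags = columns[filter_col]
    return sum(v for v, f in zip(values, flags) if f == wanted)


columns = to_columns(rows)
eu_total = sum_where(columns, "amount", "region", "EU")
print(eu_total)
```

A real engine applies the same idea to batches of values read from Parquet files, so the benefit grows with the number of columns a query can skip.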

Related concept: Data lakehouse

Why Lakehouse matters for EDB analytics

Lakehouse enables Analytics Accelerator users to:

  • Run fast SQL analytics directly on data in object storage, with no data movement required
  • Implement separation of storage and compute for cost efficiency
  • Support interoperability with external tools (Spark, Trino, Flink, data science frameworks)
  • Seamlessly query PGD Tiered Tables that have been offloaded to Iceberg
  • Enable unified OLTP + OLAP architectures with Postgres at the center

Related concept: Analytics Accelerator concepts

How EDB implements Lakehouse architecture

Core components:

  • EDB Postgres Lakehouse Nodes:
      • Stateless analytical compute nodes
      • Provisioned and managed via Hybrid Manager (HM) or self-managed
  • Vectorized query engine:
      • Powered by Apache DataFusion
      • Processes data in Parquet and other columnar formats efficiently
  • PGAA:
      • Postgres extensions enabling Lakehouse behavior:
          • External tables over Iceberg and Delta Lake
          • Unified access to PGD hot data + offloaded cold data
  • PGFS:
      • Unified access layer to object storage
      • Supports S3, GCS, MinIO, Azure Data Lake Storage, and compatible systems
  • Open table formats:
      • Apache Iceberg (full support including catalogs)
      • Delta Lake (read-only support)
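
As a rough illustration of how PGFS and PGAA fit together, the Python sketch below only builds and prints SQL text: one statement that registers an object storage location and one that declares an external table over a Delta Lake directory. The function `pgfs.create_storage_location`, the `USING PGAA` clause, and the `pgaa.storage_location`, `pgaa.path`, and `pgaa.format` option names are assumptions about the extension interface, not something this page confirms; the bucket, path, and table names are hypothetical. Check the PGAA and PGFS reference before running any generated statement.

```python
import re

# ASSUMPTION: the format keywords accepted by the extension; not confirmed here.
SUPPORTED_FORMATS = ("iceberg", "delta")

# Accept only plain or schema-qualified identifiers such as "public.orders".
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def quote_literal(value: str) -> str:
    """Quote a string as a SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def storage_location_sql(name: str, url: str) -> str:
    """Build the statement that registers an object storage location.

    ASSUMPTION: pgfs.create_storage_location(name, url) is the PGFS entry point.
    """
    return "SELECT pgfs.create_storage_location({}, {});".format(
        quote_literal(name), quote_literal(url)
    )


def external_table_sql(table: str, location: str, path: str, fmt: str) -> str:
    """Build the statement that declares an external table over a lake directory.

    ASSUMPTION: the USING PGAA clause and the pgaa.* option names.
    """
    if not _IDENTIFIER.fullmatch(table):
        raise ValueError(f"unsupported table name: {table!r}")
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported table format: {fmt!r}")
    return (
        f"CREATE TABLE {table} () USING PGAA WITH ("
        f"pgaa.storage_location = {quote_literal(location)}, "
        f"pgaa.path = {quote_literal(path)}, "
        f"pgaa.format = {quote_literal(fmt)});"
    )


statements = [
    storage_location_sql("example_lake", "s3://example-bucket"),
    external_table_sql("public.orders", "example_lake", "sales/orders", "delta"),
]
for statement in statements:
    print(statement)
```

The sketch keeps the two concerns separate on purpose: the storage location belongs to the object storage access layer, while the table definition belongs to the extension that exposes lake data as tables.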

Related concepts:

Common use cases

Use case                             EDB Postgres Lakehouse capability
Business intelligence & reporting    Run fast, scalable SQL on data in object storage
Historical analytics                 Seamlessly query offloaded PGD Tiered Tables
Data lake analytics                  Query existing Iceberg and Delta Lake tables without ETL
Data science pipelines               Provide efficient access to training data and features
Hybrid architectures                 Enable Postgres-centered OLTP + OLAP patterns

Role-based guidance

Database administrators (DBAs)

Data scientists / analysts

DevOps / SRE

Application developers

Learning paths

Next steps

For Hybrid Manager users

How-To guides

Explore more in the Analytics Accelerator learning guide.

