EDB Postgres Lakehouse
EDB Postgres® Lakehouse brings modern lakehouse architecture to Postgres-based analytics.
It enables fast SQL-based analytics on data stored in object storage (S3-compatible), using open table formats such as Apache Iceberg and Delta Lake.
For implementation and management in Hybrid Manager (HM), see Lakehouse clusters in EDB Hybrid Manager.
What is EDB Postgres Lakehouse
EDB Postgres Lakehouse is an architecture pattern and set of capabilities that:
- Integrate Postgres with modern lakehouse patterns
- Provide fast, scalable analytics on data in object storage
- Use vectorized query execution and columnar storage formats
- Leverage open table formats for interoperability and data governance
Related concept: Data lakehouse
Why Lakehouse matters for EDB analytics
Lakehouse enables Analytics Accelerator users to:
- Run fast SQL analytics directly on object storage, with no data movement required
- Implement separation of storage and compute for cost efficiency
- Support interoperability with external tools (Spark, Trino, Flink, data science frameworks)
- Query PGD Tiered Tables offloaded to Iceberg seamlessly
- Enable unified OLTP + OLAP architectures with Postgres at the center
Related concept: Analytics Accelerator concepts
How EDB implements Lakehouse architecture
Core components:
- EDB Postgres Lakehouse Nodes:
  - Stateless analytical compute nodes
  - Provisioned and managed via Hybrid Manager (HM) or self-managed
- Vectorized query engine:
  - Powered by Apache DataFusion
  - Processes data in Parquet and other columnar formats efficiently
- PGAA:
  - Postgres extensions enabling Lakehouse behavior:
    - External tables over Iceberg and Delta Lake
    - Unified access to PGD hot data + offloaded cold data
- PGFS:
  - Unified access layer to object storage
  - Supports S3, GCS, MinIO, Azure Data Lake Storage, and compatible systems
- Open table formats:
  - Apache Iceberg (full support including catalogs)
  - Delta Lake (read-only support)
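To make the "vectorized query engine over columnar formats" component more concrete, here is a minimal conceptual sketch in plain Python. It is not how Apache DataFusion or EDB Postgres Lakehouse is implemented; the table, the column names, and the helpers `batches` and `sum_where` are hypothetical and exist only for illustration. It shows the two ideas the list names: data laid out column by column (as in Parquet), and a query that processes fixed-size batches of only the columns it references.

```python
from typing import Dict, Iterator, List

# Hypothetical columnar table: each column is stored as one contiguous list,
# loosely mirroring how a columnar file format groups values by column.
Columns = Dict[str, List[float]]


def batches(table: Columns, columns: List[str], batch_size: int) -> Iterator[Columns]:
    """Yield fixed-size batches containing only the requested columns."""
    n_rows = len(table[columns[0]])
    for start in range(0, n_rows, batch_size):
        yield {name: table[name][start:start + batch_size] for name in columns}


def sum_where(table: Columns, value_col: str, filter_col: str,
              threshold: float, batch_size: int = 4) -> float:
    """Roughly: SELECT sum(value_col) FROM table WHERE filter_col > threshold."""
    total = 0.0
    for batch in batches(table, [value_col, filter_col], batch_size):
        # Evaluate the predicate over the whole batch, then aggregate the
        # matching values. Columns not named in the query are never read.
        mask = [f > threshold for f in batch[filter_col]]
        total += sum(v for v, keep in zip(batch[value_col], mask) if keep)
    return total


# Illustrative data only.
orders: Columns = {
    "amount":   [10.0, 25.0, 5.0, 40.0, 15.0, 30.0],
    "quantity": [1.0, 3.0, 1.0, 5.0, 2.0, 4.0],
    "discount": [0.0, 0.1, 0.0, 0.2, 0.0, 0.1],
}

result = sum_where(orders, "amount", "quantity", threshold=2.0)
print(result)
```

In this sketch the `discount` column is never touched, which is the main reason columnar layouts suit analytical queries that read a few columns across many rows. A real vectorized engine additionally applies each operation to a batch using tight, CPU-friendly loops rather than Python list comprehensions.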
Common use cases
Use case | EDB Postgres Lakehouse capability |
---|---|
Business intelligence & reporting | Run fast, scalable SQL on data in object storage |
Historical analytics | Seamlessly query offloaded PGD Tiered Tables |
Data lake analytics | Query existing Iceberg and Delta Lake tables without ETL |
Data science pipelines | Provide efficient access to training data and features |
Hybrid architectures | Enable Postgres-centered OLTP + OLAP patterns |
Role-based guidance
Database administrators (DBAs)
Data scientists / analysts
DevOps / SRE
Application developers
Learning paths
- Analytics Accelerator 101: Foundational concepts
- Analytics Accelerator 201: Practical application
- Analytics Accelerator 301: Advanced techniques and optimization
Related concepts
- Data lakehouse
- Separation of storage and compute
- Vectorized query engines
- Columnar storage formats
- Open table formats
Next steps
For Hybrid Manager users
How-To guides
Explore more in the Analytics Accelerator learning guide.