Analytics/Lakehouse
Use the Analytics Accelerator to explore the analytical capabilities built on EDB Postgres®. This accelerator helps you understand core concepts, explore key technologies such as EDB Postgres® Lakehouse, and learn how to implement analytics with EDB Hybrid Manager (HM).
We integrate modern data architectures and open standards with the reliability and flexibility of Postgres to help you unlock valuable insights.
Navigating the analytics accelerator
The accelerator organizes content into four areas:
Conceptual foundations Build your understanding of analytics principles and EDB’s approach.
EDB core analytics technologies Learn about EDB solutions and technologies that power our analytics offerings.
Practical guidance and solutions Find use cases, persona-based guides, how-to articles, and tutorials.
Product-specific implementations Access documentation for how these analytics capabilities surface and are managed in EDB products, such as EDB Hybrid Manager.
Conceptual foundations
Understand the principles and strategies behind modern data analytics and EDB’s approach.
Generic analytics concepts Learn about data architectures (Data Warehouse, Data Lake, Lakehouse) and foundational technologies (columnar storage, vectorized engines, and others).
EDB analytics concepts Explore EDB’s vision for Postgres® analytics and how EDB leverages core technologies.
Explained: Analytics Review in-depth explanations of EDB analytical features, design choices, and advanced topics. (Coming soon)
EDB core analytics technologies
Learn about EDB’s analytics technologies and how they extend Postgres®.
Lakehouse clusters (EDB Postgres Lakehouse overview) Review the EDB Postgres® Lakehouse solution and its components for enabling analytics on object storage.
Apache Iceberg with EDB solutions Understand how EDB solutions use Apache Iceberg to manage large analytical datasets.
Delta Lake with EDB solutions Learn how EDB Postgres® interacts with Delta tables to enable reliable data lakes.
Tiered tables with EDB Postgres Manage data across storage tiers using EDB Postgres Distributed (PGD) and Lakehouse capabilities to optimize cost and performance.
Practical guidance and solutions
Apply EDB’s analytics capabilities to meet your needs.
- Analytics for your role (persona-based guide) Follow learning paths for DBAs, DevOps engineers, data scientists, and application developers.
Product-specific implementations
Review how EDB analytics concepts and technologies are implemented in EDB products.
- HM Analytics Spoke (Analytics in EDB Hybrid Manager) Access documentation for analytics features in EDB Hybrid Manager. This includes HM Lakehouse clusters, using Iceberg, Delta, and tiered tables in HM, and HM-specific tutorials.
Where to start
- Start with Generic analytics concepts and Lakehouse overview to understand core ideas.
- Explore practical guidance when available.
- Use product-specific documentation when working with EDB Hybrid Manager.
Postgres Lakehouse is built using a number of technologies:
- PostgreSQL
- Seafowl, an analytical database
- Apache DataFusion, the query engine used by Seafowl
- Delta Lake (and specifically delta-rs), for implementing the storage and retrieval layer of Delta Tables
Level 100
The most important thing to understand about Postgres Lakehouse is that it separates storage from compute. This design lets you scale each independently, which is ideal for analytical workloads where queries can be unpredictable and spiky. You wouldn't want to keep a machine mostly idle just to hold data on its attached hard drives. Instead, you can keep data in object storage, in highly compressible formats, and provision compute only when you need to query it.
On the compute side, a vectorized query engine is optimized to query Lakehouse tables but can still fall back to Postgres for full compatibility.
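One way to picture the fallback behavior is as a "try the fast path first" dispatch. The following Python sketch is purely illustrative and does not reflect how Postgres Lakehouse is implemented: the function names, the `UnsupportedOperation` exception, and the set of supported aggregates are all hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple


class UnsupportedOperation(Exception):
    """Raised when the hypothetical fast path cannot handle a request."""


# Hypothetical fast path: supports only a limited set of aggregates,
# each applied to a whole column of values at once.
FAST_AGGREGATES: Dict[str, Callable[[Sequence[float]], float]] = {
    "sum": sum,
    "min": min,
    "max": max,
}


def fast_aggregate(name: str, column: Sequence[float]) -> float:
    """Run an aggregate on the fast path, or signal that it is unsupported."""
    function = FAST_AGGREGATES.get(name)
    if function is None:
        raise UnsupportedOperation(name)
    return function(column)


def general_aggregate(name: str, column: Sequence[float]) -> float:
    """Hypothetical general-purpose path that covers everything else."""
    if name == "avg":
        return sum(column) / len(column)
    raise ValueError(f"unknown aggregate: {name}")


def run_aggregate(name: str, column: Sequence[float]) -> Tuple[str, float]:
    """Try the fast path first; fall back to the general path if needed."""
    try:
        return "fast", fast_aggregate(name, column)
    except UnsupportedOperation:
        return "fallback", general_aggregate(name, column)


amounts = [10.0, 20.0, 30.0]
print(run_aggregate("sum", amounts))  # handled by the fast path
print(run_aggregate("avg", amounts))  # handled by the fallback path
```

The caller gets the same kind of result either way. Only the execution path differs, which is the essence of keeping full compatibility while optimizing the common case.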
On the storage side, Lakehouse tables are stored using highly compressible columnar storage formats optimized for analytics.
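To see why a columnar layout tends to compress well, consider this minimal Python sketch. It uses run-length encoding as a stand-in for the encodings that real columnar formats apply. The sample records and helper names are hypothetical and chosen only for illustration.

```python
from itertools import groupby
from typing import Iterable, List, Tuple


def run_length_encode(values: Iterable[object]) -> List[Tuple[object, int]]:
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


# Hypothetical sample data: (order_id, region) records.
rows = [(0, "eu"), (1, "eu"), (2, "eu"), (3, "us"), (4, "us"), (5, "us")]

# Row-oriented layout: values from different columns are interleaved,
# so equal values are rarely adjacent.
row_layout = [value for row in rows for value in row]

# Column-oriented layout: each column's values are stored contiguously,
# so repeated values in a column sit next to each other.
order_ids, regions = (list(column) for column in zip(*rows))

row_runs = run_length_encode(row_layout)
region_runs = run_length_encode(regions)

print(len(row_runs), "runs for the interleaved row layout")
print(len(region_runs), "runs for the region column:", region_runs)
```

In the row layout, every value forms its own run, so the encoding saves nothing. Stored as a column, the six region values collapse into two runs. Low-cardinality columns in analytical tables benefit from the same effect at much larger scale.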
Level 200
Here's a slightly more comprehensive diagram of how these services fit together:
Level 300
Here's the more detailed, zoomed-in view of "what's in the box":
Quick start
Launch a Lakehouse node and query sample data.
Tiered Tables
Understand how Tiered Tables enable cost-efficient analytics and data management with EDB Postgres Distributed (PGD) and the Analytics Accelerator.
Reference
Things to know about EDB Postgres® AI Lakehouse
Delta Lake
Understand how Delta Lake enhances modern data lakes and how Analytics Accelerator leverages it within the EDB Postgres ecosystem.
Iceberg
Understand how Apache Iceberg enables scalable, reliable data lake storage and how Analytics Accelerator leverages it within the EDB Postgres ecosystem.
Lakehouse
Understand how the EDB Postgres Lakehouse architecture supports fast, scalable analytics on modern data lake storage.
Learning Resources
Navigate Analytics Accelerator documentation with explanations, tutorials, how-to guides, use cases, persona-based guidance, and structured learning paths.
WarehousePG and EDB Postgres AI support for Greenplum workloads
Learn about the open source WarehousePG and how EDB Postgres AI supports Greenplum workloads with WarehousePG.