Architecture overview v2
Hadoop is a framework that allows you to store a large data set in a distributed file system.
The Hadoop data wrapper provides an interface between a Hadoop file system and a Postgres database. The Hadoop data wrapper transforms a Postgres SELECT statement into a query that is understood by the HiveQL or Spark SQL interface.
When possible, the foreign data wrapper asks the Hive or Spark server to perform the actions associated with the WHERE clause of a SELECT statement. Pushing down the WHERE clause improves performance by decreasing the amount of data moving across the network.