Reference

When you create the PGFS extension, the system automatically initializes a dedicated environment within your database to manage external files.
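As a minimal sketch, assuming the extension is registered under the name pgfs (inferred from the schema name, not stated in this reference), creating it and confirming that it is installed might look like this:

```sql
-- Assumption: the extension name is "pgfs"; adjust if your installation differs.
CREATE EXTENSION IF NOT EXISTS pgfs;

-- pg_extension is the standard PostgreSQL catalog of installed extensions.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'pgfs';
```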

pgfs schema

To keep your database organized and avoid naming conflicts with your existing data, all internal PGFS objects, such as tables, views, and functions, are isolated within the pgfs schema.
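To see what the schema contains on your system, you can query the standard PostgreSQL catalogs. The sketch below uses only built-in catalog views and functions. The rows returned depend on the installed PGFS version, so no output is shown.

```sql
-- Tables, views, and foreign tables that live in the pgfs schema.
SELECT table_name, table_type
FROM information_schema.tables
WHERE table_schema = 'pgfs'
ORDER BY table_name;

-- Functions in the pgfs schema, with their declared return types.
SELECT p.proname AS function_name,
       pg_get_function_result(p.oid) AS return_type
FROM pg_proc AS p
JOIN pg_namespace AS n ON n.oid = p.pronamespace
WHERE n.nspname = 'pgfs'
ORDER BY p.proname;
```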

Catalog tables

The extension creates internal catalog tables that track your storage locations. These tables store:

  • Endpoint URLs: Where the storage is located, for example, AWS S3, Google Cloud Storage (GCS), or Azure Blob Storage.

  • Authentication configuration: References to static credentials or the keys of IAM roles needed to access the data.

  • Path mapping: Definitions of which virtual paths in Postgres map to which physical paths in the cloud.

Integration points for Pipelines

PGFS is a dependency for AI Accelerator Pipelines. It creates the hooks necessary for:

  • Volumes: Virtual storage abstractions that Pipelines use as data sources.

  • Format handlers: The logic required to identify and process different data formats, such as text or images, stored in the cloud.

Functions

The pgfs schema provides the following functions to create, manage, and view storage locations:

| Function name | Return type | Description |
|---|---|---|
| pgfs.create_storage_location | TEXT | Creates a foreign server and user mapping for cloud storage or a local file system, using a URL, options, and credentials. |
| pgfs.create_foreign_table | TEXT | Creates a foreign table with a standard schema (key, size, last_modified, body) on a specified server. |
| pgfs.create_storage_location_with_foreign_table | TEXT | Creates a storage location in the database and associates it with a foreign table. |
| pgfs.delete_storage_location | VOID | Removes the foreign server and user mapping for a storage location. |
| pgfs.get_default_storage_location | TABLE | Retrieves details of the default storage location. |
| pgfs.get_storage_location | TABLE | Retrieves details for a specific storage location by name. |
| pgfs.list_storage_locations | TABLE | Returns a list of all PGFS storage locations, including their URLs, IDs, options, and credentials. |
| pgfs.set_default_storage_location | TEXT | Updates the pgfs.config table to set a global default storage location. |
| pgfs.update_storage_location | TEXT | Replaces an existing storage location with new configuration details. |

pgfs.create_storage_location

Creates a storage location in the database using the following parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| name | TEXT | None | Name for the storage location. |
| url | TEXT | None | URL for this storage location in the prefix://path format. The prefix can be s3:, gs:, file:, az:, adl:, azure:, abs:, abfss:, or https:. |
| options | JSON | Null | Options for the storage location (optional). |
| credentials | JSON | Null | Credentials for the storage location (optional). |

The values of the options and credentials parameters vary by storage provider. For more information, see the Storage specific provider section.
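The following sketch shows how a call might look, using positional arguments in the order listed in the parameter table. The location names, the bucket URL, the local path, and the option and credential keys (region, access_key_id, secret_access_key) are illustrative assumptions. Check the provider-specific documentation for the keys your storage provider accepts.

```sql
-- Hypothetical S3 location. All names, keys, and values below are placeholders.
SELECT pgfs.create_storage_location(
    'my_s3_location',                          -- name
    's3://my-example-bucket/data',             -- url in prefix://path format
    '{"region": "us-east-1"}',                 -- options (assumed key)
    '{"access_key_id": "<access-key-id>", "secret_access_key": "<secret-access-key>"}'  -- credentials (assumed keys)
);

-- Hypothetical local file system location, relying on the Null defaults
-- for options and credentials.
SELECT pgfs.create_storage_location(
    'my_local_location',
    'file:///tmp/pgfs_demo'
);
```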

pgfs.create_foreign_table

Creates a foreign table in the database using the following parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| table_name | TEXT | None | The name of the table to be created. |
| server_name | TEXT | None | The storage location server to link to. |
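As a sketch, assuming a storage location named my_s3_location already exists and that its name is also the server name to pass (both are assumptions for illustration), creating and querying a foreign table might look like this. The column names come from the standard schema described in the function summary.

```sql
-- Hypothetical table and server names.
SELECT pgfs.create_foreign_table(
    'my_s3_files',      -- table_name
    'my_s3_location'    -- server_name
);

-- List a few objects without fetching their contents (the body column).
SELECT key, size, last_modified
FROM my_s3_files
LIMIT 10;
```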

pgfs.create_storage_location_with_foreign_table

Creates a storage location in the database and associates it with a foreign table using the following parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| storage_location_name | TEXT | None | Name for the storage location and the table prefix. |
| url | TEXT | None | URL for this storage location in the prefix://path format. The prefix can be s3:, gs:, file:, az:, adl:, azure:, abs:, abfss:, or https:. |
| options | JSON | Null | Connection parameters for the storage location (optional). |
| credentials | JSON | Null | Authentication parameters for the storage location (optional). |

The values of the options and credentials parameters vary by storage provider. For more information, see the Storage specific provider section.
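A call might look like the following sketch. The location name and URL are placeholders. The name of the resulting foreign table is derived from the storage_location_name prefix in a way this reference does not spell out, so the example does not query it.

```sql
-- Hypothetical name and URL. Options and credentials are left at their Null defaults.
SELECT pgfs.create_storage_location_with_foreign_table(
    'my_gcs_location',              -- storage_location_name (also the table prefix)
    'gs://my-example-bucket/docs'   -- url in prefix://path format
);
```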

pgfs.delete_storage_location

Deletes a storage location from the database using the following parameter:

| Parameter | Type | Default | Description |
|---|---|---|---|
| storage_location_name | TEXT | None | Name of the storage location to delete. |
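For example, assuming a storage location with the hypothetical name my_local_location exists, it could be removed like this:

```sql
-- Removes the foreign server and user mapping for the named location.
-- 'my_local_location' is a placeholder name.
SELECT pgfs.delete_storage_location('my_local_location');
```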

pgfs.get_default_storage_location

Returns the default storage location.

| Column | Type | Description |
|---|---|---|
| default_storage_location | TEXT | Returns the details of the default storage location. |
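Because the function returns a table, it can be called in the FROM clause. The sketch below assumes the function takes no arguments, since this reference lists none.

```sql
-- Assumption: no arguments are required.
SELECT default_storage_location
FROM pgfs.get_default_storage_location();
```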

pgfs.get_storage_location

Returns information about a storage location using the following parameter:

| Parameter | Type | Default | Description |
|---|---|---|---|
| storage_location_name | TEXT | None | Name of the storage location. |

It returns the following values:

| Column | Type | Description |
|---|---|---|
| name | TEXT | Name of the storage location. |
| url | TEXT | URL for this storage location in the prefix://path format. The prefix can be s3:, gs:, file:, az:, adl:, azure:, abs:, abfss:, or https:. |
| options | JSON | Options used to connect to the given storage location. |
| credentials | JSON | Credentials used to connect to the given storage location. |
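As a sketch, looking up a location with the hypothetical name my_s3_location might look like this. The query selects only the non-sensitive columns so that credentials are not echoed to the client.

```sql
-- 'my_s3_location' is a placeholder name.
SELECT name, url, options
FROM pgfs.get_storage_location('my_s3_location');
```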

pgfs.list_storage_locations

Lists all storage locations in the database with the following values:

| Column | Type | Description |
|---|---|---|
| name | TEXT | Name of the storage location. |
| url | TEXT | URL for this storage location in the prefix://path format. The prefix can be s3:, gs:, file:, az:, adl:, azure:, abs:, abfss:, or https:. |
| options | JSON | Options used for the storage location. |
| credentials | JSON | Credentials used for the storage location. |
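A listing query might look like the following sketch, which assumes the function takes no arguments, since this reference lists none, and again leaves out the credentials column:

```sql
-- Assumption: no arguments are required.
SELECT name, url
FROM pgfs.list_storage_locations()
ORDER BY name;
```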

pgfs.set_default_storage_location

Sets the default storage location using the following parameter:

| Parameter | Type | Default | Description |
|---|---|---|---|
| storage_location_name | TEXT | None | The name of the location to set as default. |
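For instance, assuming a storage location with the hypothetical name my_s3_location exists, it could be made the default like this:

```sql
-- 'my_s3_location' is a placeholder name.
SELECT pgfs.set_default_storage_location('my_s3_location');
```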

pgfs.update_storage_location

Updates a storage location in the database using the following parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| name | TEXT | None | The name of the storage location to update. |
| url | TEXT | None | The new endpoint URL. |
| options | JSON | Null | New connection parameters. |
| credentials | JSON | Null | New authentication parameters. |
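An update call might look like the following sketch. The location name, URL, and the option and credential keys are illustrative assumptions. Because the function replaces the existing configuration, the example supplies every value again rather than only the ones that changed.

```sql
-- Hypothetical values throughout. The option and credential keys are assumed.
SELECT pgfs.update_storage_location(
    'my_s3_location',                          -- name of the existing location
    's3://my-example-bucket/archive',          -- new url
    '{"region": "eu-central-1"}',              -- new options (assumed key)
    '{"access_key_id": "<access-key-id>", "secret_access_key": "<secret-access-key>"}'  -- new credentials (assumed keys)
);
```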
