Pipelines PGFS with S3-compatible storage

Overview: S3-compatible storage

PGFS uses the s3: prefix to indicate an S3-compatible storage system.

The general syntax for using S3 is:

select pgfs.create_storage_location(
               'storage_location_name',
               's3://bucket_name',
               options => '{}',
               credentials => '{}'
       );

Settings for the options argument in JSON format settings

OptionDescription
bucketThe bucket name. Used to explicitly provide the bucket name if it can't be passed in the url.
skip_signatureDisable HMAC authentication. (Set this to "true" when you're not providing access_key_id/secret_access_key in the credentials.)
regionThe region of the S3-compatible storage system. If the region isn't specified, the client attempts auto-discovery.
endpointThe endpoint of the S3-compatible storage system.
allow_httpWhether the endpoint uses plain HTTP (rather than HTTPS/TLS). Set this to true if your endpoint starts with http://.

Settings for the credentials argument in JSON format

OptionDescription
access_key_idHMAC credentials (often the "username" in non-AWS S3 providers)
secret_access_keyHMAC credentials (often the "password" in non-AWS S3 providers)
session_tokenA session token that can be used instead of HMAC credentials

Example: AWS S3 public bucket

This example uses a public bucket on AWS S3. Public buckets don't require authentication.

SELECT pgfs.create_storage_location('edb_ai_example_images', 's3://public-ai-team',
                                    options => '{"region": "eu-central-1", "skip_signature": "true"}'
       );

Example: AWS S3 private bucket

This example uses a private bucket on AWS S3. Private buckets require authentication. The example uses HMAC credentials.

SELECT pgfs.create_storage_location('internal_ai_project', 's3://my-company-ai-images',
                                    options => '{"region": "eu-central-1"}',
                                    credentials => '{"access_key_id": "secret", "secret_access_key":"secret!"}'
       );

Example: non-AWS S3 / S3-compatible with HTTPS

This example uses an S3-compatible system like minIO. The endpoint must be provided in this case. You can omit it only when using AWS S3.

SELECT pgfs.create_storage_location('ai_images_local_minio', 's3://my-ai-images',
                                    options => '{"endpoint": "https://minio-api.apps.local"}',
                                    credentials => '{"access_key_id": "my_username", "secret_access_key":"my_password"}'
       );

Example: non-AWS S3 / S3-compatible with HTTP

This example uses an S3-compatible system like minIO. The endpoint must be provided in this case. You can omit it only be when using AWS S3.

In this case, the server doesn't use TLS encryption, so the code configures a plain HTTP connection.

SELECT pgfs.create_storage_location('ai_images_local_minio', 
                                    's3://my-ai-images',
                                    options => '{"endpoint": "http://minio-api.apps.local", "allow_http":"true"}',
                                    credentials => '{"access_key_id": "my_username", "secret_access_key":"my_password"}'
       );

Could this page be better? Report a problem or suggest an addition!