Enabling Data Catalogs on Hybrid Manager

The Hybrid Manager's data catalog requires additional configuration to manage Lakehouse data storage. This involves creating a Kubernetes secret with a confounding key that the Lakekeeper service can use to store encrypted data.

Creating a secret with the key

For EKS installations that use the eks-install-secrets.sh script, the script handles key and secret creation, so you can go straight to "Backing up the confounding key". For other EKS setups and operating systems, create the secret with the confounding key manually.

  1. Create a confounding key and store it in a variable:

    PG_CONFOUNDING_KEY=$(dd if=/dev/urandom bs=32 count=1 2>/dev/null | base64)
    Note
    • A confounding key is a randomized string that is at least 32 bytes long.
    • Create a new confounding key per Hybrid Manager deployment.
  2. Create a namespace for the service:

    kubectl create namespace upm-lakekeeper
  3. Create a secret in the dedicated namespace that contains the confounding key you created:

    kubectl apply -f - <<EOF
      apiVersion: v1
      kind: Secret
      metadata:
        name: pg-confounding-key
        namespace: upm-lakekeeper
      stringData:
        PG_CONFOUNDING_KEY: ${PG_CONFOUNDING_KEY}
    EOF

    After you have configured the secret, continue with the Hybrid Manager installation.
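
Before continuing, it can help to confirm that the generated key meets the length requirement from the note in step 1. The following is a minimal sketch, not part of the official procedure. The function names key_byte_length and check_confounding_key are hypothetical, and the sketch assumes that the 32-byte minimum refers to the random bytes before base64 encoding and that your base64 tool accepts the -d flag for decoding.

```shell
# Hypothetical helper: prints the number of raw bytes encoded in a base64 string.
key_byte_length() {
  printf '%s' "$1" | base64 -d | wc -c | tr -d ' '
}

# Hypothetical helper: succeeds only if the key decodes to at least 32 bytes.
check_confounding_key() {
  [ "$(key_byte_length "$1")" -ge 32 ]
}
```

For example, running `check_confounding_key "$PG_CONFOUNDING_KEY" && echo "key length OK"` in the same shell session where you created the key prints the message only when the key is long enough.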

Backing up the confounding key

The Hybrid Manager administrator must keep the confounding key safe and back it up. If the key is lost in a disaster scenario, there is no mechanism for accessing the Lakehouse data managed by the Hybrid Manager data catalog. Recovery would require the administrator to create and store a new key, restart the upm-lakekeeper/lakekeeper workload, and carefully rebuild all of the existing data catalogs without deleting them. This procedure is risky and requires support from the EDB PG AI Professional Services team.

Fetch the key so you can store it safely:

kubectl get secrets -n upm-lakekeeper pg-confounding-key -o yaml
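
In the YAML output, Kubernetes shows the secret value base64-encoded under the data field. If you want to back up the key exactly as it was created, a sketch like the following may be useful. The function names and the backup file path are hypothetical; the sketch assumes the secret name, namespace, and field name used earlier on this page and a base64 tool that accepts the -d flag.

```shell
# Hypothetical helper: prints the confounding key stored in the secret,
# decoded from the base64 form that Kubernetes returns.
fetch_confounding_key() {
  kubectl get secret pg-confounding-key -n upm-lakekeeper \
    -o jsonpath='{.data.PG_CONFOUNDING_KEY}' | base64 -d
}

# Hypothetical helper: writes the key to a file readable only by the current user.
backup_confounding_key() {
  ( umask 077 && fetch_confounding_key > "$1" )
}
```

Calling `backup_confounding_key ./pg-confounding-key.backup` writes the key to a local file, which you can then move to whatever secure storage your organization uses. The decoded value is the same string that was held in the PG_CONFOUNDING_KEY variable when the secret was created.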
