Creating a Red Hat OpenShift cluster for use with Hybrid Manager

After you've installed your system, you're ready to create a Red Hat OpenShift (RHOS) cluster for use with Hybrid Manager (HM).

Preparing an OpenShift cluster for HM installation

To successfully install HM on a Red Hat OpenShift (RHOS) on-premises cluster, certain prerequisites and configurations must be in place.

Preconfiguration cluster requirements

  • HM currently supports:

    • OpenShift 4.17 with Kubernetes 1.30

      or:

    • OpenShift 4.18 with Kubernetes 1.31

Step 1: Configure networking

  • Define network policies. Ensure appropriate network policies are set up to manage traffic between the Kubernetes control plane, worker nodes, and external systems.

  • Define ingress/egress rules.

  • Ensure communication between cluster components and the HM stack is secure.

    • Validate ingress/egress network policies to allow cluster components to communicate while restricting unnecessary external access.

  • Set up load balancer: If using an external load balancer, configure it to handle OpenShift API traffic and worker node communication.

  • Configure DNS: Ensure DNS records are set up for the cluster, including internal service discovery for HM.

  • Ensure the cluster has adequate private network isolation for worker nodes. OpenShift typically handles this via the software-defined network (SDN) configuration.
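
To illustrate, here is a minimal sketch of namespace-scoped network policies: deny all ingress by default, then allow traffic between pods in the same namespace. The namespace name (edbpgai-bootstrap, created later in Step 5) and the policy scope are assumptions to adapt to your environment, not HM-mandated policies.

```bash
# Minimal sketch: default-deny ingress plus allow same-namespace traffic
# (namespace name and policy scope are assumptions; adapt to your environment)
oc apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: edbpgai-bootstrap
spec:
  podSelector: {}
  policyTypes:
  - Ingress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: edbpgai-bootstrap
spec:
  podSelector: {}
  ingress:
  - from:
    - podSelector: {}
  policyTypes:
  - Ingress
EOF
```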

Step 2: Configure node size and resources

To support HM and its associated components, the nodes in your OpenShift cluster must meet the resource requirements.

Node sizing

  • Master nodes: OpenShift control plane nodes must meet the standard requirements for resource allocation (for example, CPU, memory, and disk size).

    • General recommendations:
      • 3 master nodes each with:
        • CPU:
          • Minimum: 4 vCPUs (for up to 10 worker nodes)
          • Recommended: 8 vCPUs for medium-sized clusters (10-50 worker nodes)
          • For >50 worker nodes: 16+ vCPUs
        • Memory:
          • Minimum: 32 GB RAM (for up to 10 worker nodes)
          • Recommended: 64 GB RAM or more for larger workloads (10-50 worker nodes)
          • For >50 worker nodes: 64+ GB RAM
        • Disk
          • Minimum: 100 GB SSD
          • Recommended: 200 GB SSD for larger workloads
          • For >50 worker nodes: >200 GB SSD
  • Worker nodes: Allocate sufficient resources to worker nodes to handle the HM control components and Postgres clusters. For smaller workloads (5-20 Postgres clusters), 3 worker nodes for the HM control components and 3 worker nodes for the Postgres workloads are recommended. As you scale up Postgres workloads, you may need additional worker nodes for the Postgres workloads and, eventually, additional control worker nodes to support the growing number of Postgres workload worker nodes.

    • General recommendations:
      • 6 worker nodes each with:
        • CPU:
          • Minimum: 4 vCPUs (this could be higher depending on the type of workload)
        • Memory:
          • Minimum: 32 GB of RAM per node
        • Disk:
          • Minimum: 100 GB of persistent storage per node (adjust based on database and logging requirements). Use fast disks (SSD) for optimal performance.

    To optionally use different node types for the HM control nodes and for the HM Postgres data nodes that host the Postgres workloads, RHOS requires additional machine sets (a sketch for labeling existing nodes directly follows these examples):

    • Control nodes machine set:

      spec:
        replicas: 3
        template:
          spec:
            metadata:
              labels:
                edbaiplatform.io/control-plane: "true"
            taints:
            - key: edbplatform.io/control-plane
              value: "true"
              effect: NoSchedule
    • Data nodes hosting Postgres workloads machine set:

      spec:
        replicas: 3
        template:
          spec:
            metadata:
              labels:
                edbaiplatform.io/control-plane: "true"
            taints:
            - key: edbplatform.io/control-plane
              value: "true"
              effect: NoSchedule
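
If you prefer to label and taint existing nodes directly rather than creating new machine sets, the same labels and taints can be applied with oc. This is a sketch; the node name is a placeholder, and the label and taint keys are taken from the machine set examples above.

```bash
# Label and taint an existing node for HM control components
# (node name is a placeholder; label/taint keys match the machine set examples above)
oc label node <control-node-name> edbaiplatform.io/control-plane=true
oc adm taint nodes <control-node-name> edbplatform.io/control-plane=true:NoSchedule

# Verify which nodes carry the HM control-plane label
oc get nodes -l edbaiplatform.io/control-plane=true
```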

Resource reservations

  • To prevent contention with HM workloads, reserve CPU and memory for system components.

  • Allocate additional resources for HM monitoring and observability tools (for example, Grafana, Loki, and Prometheus).

  • Node pool configuration:

    • If desired, use taints and tolerations to isolate HM workloads on specific nodes.

    • Label nodes for specific tasks (for example, "monitoring", "database") to optimize resource allocation.

  • Scaling:

    • Set up autoscaling to ensure the cluster can dynamically handle increased demand. OpenShift's Horizontal Pod Autoscaler (HPA) can be used to scale pods as needed.
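
As an illustration of HPA-based scaling, the following sketch autoscales a hypothetical deployment; the deployment name, namespace, and thresholds are placeholders, not HM defaults.

```bash
# Autoscale a hypothetical deployment between 3 and 6 replicas at 70% average CPU utilization
# (deployment name, namespace, and thresholds are illustrative placeholders)
oc autoscale deployment <hm-component-deployment> \
    --namespace <hm-namespace> \
    --min=3 --max=6 --cpu-percent=70

# Review the resulting HorizontalPodAutoscaler
oc get hpa --namespace <hm-namespace>
```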

Networking bandwidth

Ensure each node has adequate networking capacity to handle HM communication and external data transfer (for example, S3 backups).

Baseline recommendations:

| Component | Recommended bandwidth | Justification |
|-----------|-----------------------|---------------|
| K8s control plane nodes | 1 Gbps+ per node | Handles internal Kubernetes traffic, API server requests, logs/metrics, and orchestration tasks. |
| Worker nodes (HM) | 1 Gbps+ per node | For HM control components, PostgreSQL replication, internode communication, and metrics/logs. |
| External bandwidth | 1-20 Gbps (aggregated) | S3 backups and inter-cluster replication may require high throughput. |

Step 3: Install the required CSI drivers

HM uses storage and snapshot features that depend on CSI drivers. Verify the following drivers are installed and properly configured in your OpenShift cluster:

  • Persistent storage driver: Install a storage class driver compatible with your storage backend (for example, OpenShift Container Storage, Ceph, or other on-prem solutions).

  • Snapshot controller: Configure the CSI snapshot controller for volume snapshot management.
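
Before proceeding, you can check which CSI drivers and snapshot classes are available in the cluster. These are generic verification commands, not HM-specific steps.

```bash
# List installed CSI drivers and available volume snapshot classes
oc get csidrivers
oc get volumesnapshotclasses

# Confirm the CSI snapshot controller is running
# (on OpenShift it's managed by the cluster storage operator)
oc get pods -n openshift-cluster-storage-operator
```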

Step 4: Configure storage classes

Default storage class: Set up a default storage class optimized for HM workloads.

Custom storage classes: Create additional storage classes if specific workloads require custom configurations.
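
For example, you can mark an existing storage class as the cluster default with an annotation; the storage class name below is a placeholder for whichever class matches your storage backend.

```bash
# Mark an existing storage class as the cluster default
# (the storage class name is a placeholder)
oc patch storageclass <storage-class-name> \
    -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'

# Confirm which storage class is now the default
oc get storageclass
```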

Step 5: Prepare namespace and secrets

Namespace for HM: Create a dedicated namespace for HM components to ensure isolation and manageability.

```bash
oc create ns edbpgai-bootstrap
```

Secrets: Create Kubernetes secrets for any required credentials, such as object storage credentials, database access tokens, or other sensitive information. The following are HM-specific secrets and configurations. The ImagePullSecret is required for HM to function properly. The Griptape secret, the DeltaLake object storage configuration, and the Catalog secret are optional, depending on whether you're setting up your cluster for GenAI Builder or Catalog.

  • Configure ImagePullSecret and namespace (necessary for HM)

    1. Set environment variables.

      To make logging in to the registry with oc simpler, set a few container registry variables:

      export CONTAINER_REGISTRY_URI=<local-registry-uri>
      export CONTAINER_REGISTRY_USERNAME=<container-registry-username>
      export CONTAINER_REGISTRY_PASSWORD=<container-registry-password>
    2. Prepare the pull secret file.

      Prepare the directory for the image pull secret:

      tmpdir=$(mktemp -d /tmp/pull-secret-XXXXXX)
      chmod 0700 "$tmpdir"
    3. Log in to the registry and update the secret.

      Log in to OC Registry:

      oc registry login \
          --registry="${CONTAINER_REGISTRY_URI}" \
          --auth-basic="${CONTAINER_REGISTRY_USERNAME}:${CONTAINER_REGISTRY_PASSWORD}" \
          --to="$tmpdir/pull-secret"

      Then update the secret:

      oc set data secret/pull-secret -n openshift-config \
          --from-file=.dockerconfigjson="$tmpdir/pull-secret"
    4. Clean up.

      Clean up the temporary directory you made for the secret:

      rm -rf "$tmpdir"
  • Create a secret for Griptape and configure DeltaLake object storage (optional for GenAI Builder)

  • Create a secret for Catalog (optional for using Catalog)

  • Create custom secrets for Migration Portal (optional to secure internal communications with Migration Portal)

Step 6: Configure authentication and security keys

HM and its underlying components require secure authentication mechanisms to ensure proper communication between components and to protect sensitive data.

  1. Generate AES-256 encryption key.

    HM uses AES-256 encryption to secure sensitive data in transit and at rest (for example, database credentials and tokens). To generate a random AES-256 encryption key:

    export AES_256_KEY=$(openssl rand -base64 32)
  2. Store the key in Kubernetes.

    To make the key accessible to HM and associated services, create a Kubernetes secret in the appropriate namespace:

    1. Run the following command to create the secret:

      kubectl create secret generic hm-auth-key \
          --namespace <hm-namespace> \
          --from-literal=aes-256-key=$AES_256_KEY
    2. Verify the secret:

      kubectl get secret hm-auth-key --namespace <hm-namespace>

Step 7: Configure object storage

Object storage backend: To support HM backups, connect the cluster to a dedicated object storage solution (for example, MinIO, Ceph Object Gateway, or AWS S3-compatible storage).

Among other things, Hybrid Manager uses this bucket to store:

  • Managed storage locations
  • Postgres WALs + backups
  • Logs (both Postgres and internal services)
  • Metrics (both Postgres and internal services)

Create a bucket

Create a MinIO or Ceph bucket.
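
For example, with MinIO you can create the bucket using the MinIO client (mc); the alias, endpoint, credentials, and bucket name below are placeholders, and Ceph Object Gateway offers equivalent S3-compatible tooling.

```bash
# Register the MinIO endpoint under a local alias and create the bucket
# (alias, endpoint, credentials, and bucket name are placeholders)
mc alias set hm-minio https://<minio-endpoint> <admin-access-key> <admin-secret-key>
mc mb hm-minio/<hm-bucket-name>
```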

Create a bucket user

After creating a dedicated bucket for Hybrid Manager with your provider, create an identity user with full access to the bucket, as well as AWS static credentials for that user so that Hybrid Manager can access the bucket.
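
Continuing the MinIO example above, a dedicated user with a bucket-scoped policy might look like the following sketch. The user name, keys, and policy are placeholders, and the exact mc admin subcommands vary between MinIO client versions.

```bash
# Create a dedicated identity user for Hybrid Manager
# (access key and secret key are placeholders; record them for the bucket-access secret below)
mc admin user add hm-minio <hm-access-key> <hm-secret-key>

# Define a policy granting full access to the HM bucket only, then attach it to the user
cat > hm-bucket-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:*"],
      "Resource": [
        "arn:aws:s3:::<hm-bucket-name>",
        "arn:aws:s3:::<hm-bucket-name>/*"
      ]
    }
  ]
}
EOF
mc admin policy create hm-minio hm-bucket-policy hm-bucket-policy.json
mc admin policy attach hm-minio hm-bucket-policy --user <hm-access-key>
```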

Create a secret for bucket access

After preparing the dedicated user for HM access to the bucket, create a secret that associates that user with Hybrid Manager and provides access to the object storage.

apiVersion: v1
kind: Secret
metadata:
  name: edb-object-storage # name cannot be changed
  namespace: default # namespace cannot be changed
stringData:
  auth_type: credentials

  # Optional: Used only when the object storage server's certificate is not issued by a well-known CA
  #
  # Base64 string of the CA bundle for the certificate used by the object storage server
  aws_ca_bundle_base64: xxx

  # Required: Endpoint URL of the object storage
  aws_endpoint_url_s3: xxx

  # Required: AWS static credentials - AWS_ACCESS_KEY_ID
  aws_access_key_id: xxx

  # Required: AWS static credentials - AWS_SECRET_ACCESS_KEY
  aws_secret_access_key: xxx

  # Required: Bucket name
  bucket_name: xxx

  # Required: Region of the bucket
  aws_region: xxx

  # Optional: true or false
  #
  # Set this to true to disable server-side encryption. By default, it's false, indicating that server-side encryption is enabled.
  server_side_encryption_disabled: bool
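
After saving the manifest (for example, as edb-object-storage.yaml, a filename chosen here for illustration), apply it and confirm the secret exists:

```bash
# Apply the secret manifest and confirm it was created
# (the filename is an illustrative choice)
oc apply -f edb-object-storage.yaml
oc get secret edb-object-storage -n default
```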

Step 8: Validate network security

  • Firewall rules: Ensure firewall rules allow necessary communication between the OpenShift cluster, external object storage, and any external dependencies of HM.

  • Ingress/egress rules: Restrict access to only the required endpoints and IP ranges to minimize exposure.
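
One quick way to validate egress is to probe the object storage endpoint from a temporary pod; the image and endpoint URL below are illustrative placeholders.

```bash
# Probe the object storage endpoint from inside the cluster
# (image and endpoint URL are illustrative placeholders)
oc run egress-check --rm -it --restart=Never \
    --image=registry.access.redhat.com/ubi9/ubi -- \
    curl -sSI https://<object-storage-endpoint>
```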

Step 9: Adjust cluster configurations

Some default OpenShift configurations may need adjustments:

  • Node pool tuning: Optimize node configurations for HM workloads, including resource allocation and taints/tolerations.

  • Storage class adjustments: Replace or modify default storage classes as needed.

  • Networking policies: Refine networking configurations to ensure compatibility with HM requirements.

Step 10: Install required add-ons

  • Install and configure any required OpenShift operators or add-ons:

    • Service mesh: HM relies on the Istio service mesh.

    • Monitoring and logging: Set up OpenShift monitoring and logging solutions to integrate with HM observability components.
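
After installing the operators, you can confirm that the service mesh and monitoring components are running. The namespaces below are common defaults and may differ in your installation.

```bash
# Confirm service mesh and monitoring components are running
# (namespace names are common defaults and may differ in your installation)
oc get pods -n istio-system
oc get pods -n openshift-monitoring
```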

Proceed to setting the necessary environment variables

With the cluster configured, proceed to the next phase of the pre-installation: setting the necessary environment variables for installation.

