Configuring EDB Pgpool-II v4
The configuration files are created in the /etc/sysconfig/edb/pgpool<x.y> directory, where <x.y> is the EDB Pgpool-II release version. By default, .sample is appended to the configuration file name. remove .sample from the configuration file after copying the file to create your custom configuration.
Note
The configuration options for Pgpool-II are extensive. Consider the options listed here as a starting point. For more information about configuring and using Pgpool-II, consult the project website.
Commonly used parameters
The table lists pgpool.conf parameters to use when implementing connection pooling:
| Parameter name | Description |
|---|---|
listen_addresses | Hostname or IP address used by EDB Pgpool-II to listen for connections. The default is localhost. Change to '*' for all addresses. |
port | Port for EDB Pgpool-II connections. The default is 9999. |
pcp_port | Port for PCP connections. The default is 9898. |
backend_hostname0 | Hostname or IP address for backend 0. You can specify '' if the backend and EDB Pgpool-II are running on the same host. |
backend_port0 | Port number for backend 0. |
backend_weight0 | Weight for backend 0 (only in load balancing mode). Specify 1 for each backend if you want to balance the load equally or decimal values (.9, .1, etc.) to weigh the load toward specific backends. |
backend_data_directory0 | Data directory for backend 0. |
enable_pool_hba | Set to on to use pool_hba.conf for client authentication. |
num_init_children | Number of pools. Default is 32. |
max_pool | Number of connections per pool. Default is 4. |
connection_cache | Set to on to enable connection pooling. |
pool_conn_dbname | Database name to which EDB Pgpool-II connects. By default, EDB Pgpool-II connects with Postgres. As of v4.3, the deprecated pool_conn_dbname parameter is removed. |
sr_check_user | User name to perform streaming replication check. Required as of EDB Pgpool-II v4.3. |
sr_check_password | Password of the sr_check_user user to perform the streaming replication checks. Required as of EDB Pgpool-II v4.3. |
The following table lists pgpool.conf parameters to use when implementing replication and load balancing:
| Parameter name | Description |
|---|---|
Allow_sql_comments | If on, ignore SQL comments. Changes to this parameter require reloading the pgpool.conf file. |
load_balance_mode | Set to on to activate load balancing mode. If load_balance_mode is on and replicate_select is off, SELECT statements are sent to one backend. The parameter backend_weight<N>.z determines the proportion of SELECT statements each backend receives. |
ignore_leading_white_space | Ignore leading white spaces of each query. Certain APIs such as DBI/DBD::Pg for Perl add white space that you can't control. Default is on. |
Configuring connection pooling
EDB Pgpool-II provides a set of child processes that maintain cached connections to one or more database servers. When a client connects, EDB Pgpool-II attempts to reuse a connection from its pool, thus avoiding the overhead of opening and closing client connections.
You can reuse a connection in the pool only if the target database and the connection user match a prior connection that is currently in the pool. The pgpool.conf file specifies the connection pooling configuration options (such as the number of child processes and the maximum number of cached connections per child).
To configure connection pooling with one database server:
Configure the
pg_hba.conffile on thePgpool-IIhost to permit connections between the clients and the server.Copy the
pgpool.conf.samplefile topgpool.conf, modify the file, set theconnection_cacheparameter toon, and specify connection properties for your database server.The following example shows how to connect with the EDB Postgres Advanced Server:
connection_cache = on backend_hostname0 = 'localhost' backend_port0 = 5444 backend_weight0 = 1 backend_data_directory0 = '/var/lib/edb/as14/data' sr_check_user = 'enterprisedb' sr_check_password = 'enterprisedb_password'
The following example shows how to connect with the PostgreSQL Server:
connection_cache = on backend_hostname0 = 'localhost' backend_port0 = 5432 backend_weight0 = 1 backend_data_directory0 = '/var/lib/pgsql/14/data' sr_check_user = 'enterprisedb' sr_check_password = 'enterprisedb_password'
Note
In the
pgpool.conffile, connection parameters have an appended digit that specifies a cluster node identifier. Database node0specifies values for the primary node.Optionally, configure EDB Pgpool-II client authentication.
Optionally, configure the PCP administrative interface.
Start EDB Pgpool-II:
systemctl start edb-pgpool-<x.y>.service
<x.y> is the EDB Pgpool release version.
Run the following platform-specific command to connect to Pgpool43:
On EDB Postgres Advanced Server for CentOS 7:
./psql -d edb -p 9999 -U enterprisedb -h /tmp
On EDB Postgres Advanced Server for Debian:
./psql -d edb -p 9999 -U enterprisedb
On PostgreSQL Server for CentOS 7:
./psql -d postgres -p 9999 -U postgres -h /tmp
On PostgreSQL Server for Debian:
./psql -d postgres -p 9999 -U postgres
Configuring load balancing
EDB supports replication scenarios that use EDB Pgpool-II load balancing with PostgreSQL streaming replication or Slony replication. The supported replication methods ensure that database updates made by client applications apply to multiple backend servers. For detailed information about the benefits of each replication method and configuration instructions, see the project documentation for each utility.
When load balancing is enabled, EDB Pgpool-II distributes some types of SELECT statements to backend servers, allowing multiple database servers and hosts to share the processing load of SELECT statements issued by client applications.
When configuring EDB Pgpool-II load balancing, the initial database environments in all backend servers must be identical:
- Tables must have the same name, definition, and row content.
- Schemas must exist in each backend application database.
- Roles and privileges on each backend server must be configured to ensure the result set of SQL statements are identical on all servers.
If you use password authentication, assign the same password to an associated user name on each database server. Use the same user name/password pair to connect EDB Pgpool-II to each backend connection.
In a replication scenario, each backend is uniquely identified by the hostname (or IP address) and the port number on which the database server instance is listening for connections. Make sure that the pool_hba.conf and pg_hba.conf files allow a connection between that server and the host on which EDB Pgpool-II is running.
The following example shows how to implement EDB Pgpool-II load balancing with two servers (the primary and replica nodes) in a streaming replication scenario. Configuring EDB Pgpool-II load balancing for a Slony replication scenario is similar. See the Slony documentation for information about configuring Slony replication.
Configuring the primary node of the replication scenario
Open an SSH session with the primary node of the replication scenario, and modify the pg_hba.conf file (located in the /var/lib/edb/as14/data directory). Add connection information for the replication user. (In the example that follows, edbrepuser resides on a standby node with an IP address of 107.178.217.178):
host replication edbrepuser 107.178.217.178/32 md5
The connection information must specify the address of the replication scenario's standby node and your preferred authentication method.
Modify the postgresql.conf file (located in /var/lib/edb/as14/data), adding the following replication parameter and values to the end of the file:
wal_level = replica max_wal_senders = 10 checkpoint_segments = 8 wal_keep_segments = 0
Save the configuration file, and restart the server:
To restart on RHEL/Rocky Linux/AlmaLinux 8 platforms:
systemctl restart edb-as-14
To restart on the Debian platform:
/usr/edb/as14/bin/epas_ctlcluster 14 main restartUse the sudo su - command to assume the identity of the enterprisedb database superuser:
sudo su - enterprisedb
Then, start a psql session, connecting to the edb database:
psql -d edbAt the psql command line, create a user with the replication attribute:
CREATE ROLE edbrepuser WITH REPLICATION LOGIN PASSWORD 'password';
Configuring the standby node of the replication scenario
Open an SSH session with the standby server and assume the identity of the database superuser (enterprisedb):
sudo su - enterprisedb
With your choice of editor, create a .pgpass file in the home directory of the enterprisedb user. The .pgpass file holds the password of the replication user in plain-text form. If you're using a .pgpass file, make sure that only trusted users have access to it:
Add an entry that specifies connection information for the replication user:
*:5444:*:edbrepuser:password
The server enforces restrictive permissions on the .pgpass file. Use the following command to set the file permissions:
chmod 600 .pgpass
Relinquish the identity of the database superuser:
exitThen, assume superuser privileges:
sudo su -
Use your platform-specific command to stop the database server before replacing the data directory on the standby node with the data directory of the primary node.
Then, delete the data directory on the standby node:
rm -rf /var/lib/edb/as14/data
After deleting the existing data directory, use the pg_basebackup utility to copy the data directory of the primary node to the standby:
pg_basebackup --format=p --label=standby --host=146.148.46.44 --username=edbrepuser --password --wal-method=stream -R
The call to pg_basebackup specifies the IP address of the primary node and the name of the replication user created on the primary node.
Including the -R option creates the standby.signal file and appends connection settings to postgresql.auto.conf in the output directory (or into the base archive file when using tar format) to ease setting up a standby server.
For more information about the options available with the pg_basebackup utility, see the PostgreSQL core documentation.
When prompted by pg_basebackup, provide the password associated with the replication user.
After copying the data directory, change ownership of the directory to the database superuser (enterprisedb):
chown -R enterprisedb /var/lib/edb/as14/data
Modify the postgresql.conf file (located in /var/lib/edb/as14/data), specifying the following values at the end of the file:
wal_level = replica hot_standby = on
The data file has been copied from the primary node and contains the replication parameters specified previously.
Then, restart the server. At this point, the primary node is replicating data to the standby node.
Configuring EDB Pgpool-II load balancing
In the pgpool.conf file, modify the parameter settings to specify that load balancing is enabled:
load_balance_mode = on
Then, specify the connections settings for the primary database node in the parameter set that ends with a 0. For example:
backend_hostname0 = '146.148.46.44' backend_port0 = 5444 backend_weight0 = 1 backend_data_directory0 = '/var/lib/edb/as14/data'
Then, specify the connections settings for each node to which queries are distributed. Increment the number that follows the parameter name for each node, and provide connection details:
backend_hostname1 = '107.178.217.178' backend_port1 = 5444 backend_weight1 = 1 backend_data_directory1 = '/var/lib/edb/as14/data'
Use the backend_weight parameter to specify how to distribute queries distributed among the nodes. Specify a value of 1 to indicate that you want (qualified) queries to be equally distributed across the nodes of the replication scenario.
Restart EDB Pgpool-II
systemctl restart edb-pgpool-<x.y>.service
<x.y> is the EDB Pgpool release version.
Configuring client authentication
When EDB Pgpool-II is enabled, client applications connect to EDB Pgpool-II, which acts as a middleman for a Postgres server. A connecting client application is first authenticated with the EDB Pgpool-II server and then with the Postgres server.
Parameter settings in the pool_hba.conf configuration file determine the EDB Pgpool-II authentication properties. The pool_hba.conf file is similar in format and function to the Postgres pg_hba.conf configuration file. See the Pgpool-II documentation for detailed information about pool_hba.conf entries.
To enable EDB Pgpool-II authentication:
- Copy the
pool_hba.conf.samplefile topool_hba.conf. - Modify the
pool_hba.conffile, specifying authentication information for servers or users you want to connect. Entries must follow the same format used in thepg_hba.conffile. - Modify the
pgpool.conffile, setting theenable_pool_hbaparameter toon. - Restart EDB Pgpool-II to reload the EDB Pgpool-II configuration files.
Note
When authenticating with the database server, use the user names and passwords specified in the pool_hba.conf file. You must also specify these user names and passwords in the database server's pg_hba.conf file.
Configuring PCP
PCP is an administrative interface for EDB Pgpool-II that allows you to retrieve information about database nodes, EDB Pgpool-II child processes, and other information. Issue PCP commands from the Linux command line.
pcp.conf is the password configuration file for the PCP client. Before using PCP commands, modify the pcp.conf file, providing the user names and passwords you provide when invoking a PCP command. The user names in the pcp.conf file are independent of the database server user names and passwords.
Use the following steps to configure PCP:
Copy the
pcp.conf.samplefile topcp.conf.Add an entry to the
pcp.conffile in the following form:username:md5_password
usernameis a PCP user name.md5_passwordis the PCP password inmd5format.You can use the
pg_md5program to generate the encrypted password from the clear-text form:$ pg_md5 mypassword 34819d7beeabb9260a5c854bc85b3e44
For example, the entry in the
pcp.conffile for a PCP user named pcpuser with the password ofmypasswordis:# USERID:MD5PASSWD pcpuser:34819d7beeabb9260a5c854bc85b3e44
Restart the EDB Pgpool service.
When issuing a PCP command, specify the PCP user name and the unencrypted form of the password:
$ pcp_node_info 5 localhost 9898 pcpuser mypassword 0 localhost 5444 1 1.000000
After configuring PCP, you can use the following PCP commands to control EDB Pgpool-II and retrieve information.
| PCP command | Description |
|---|---|
pcp_common_options | Common options used in PCP commands |
pcp_node_count | Displays the total number of database nodes |
pcp_node_info | Displays the information on the given node ID |
pcp_health_check_stats | Displays health check statistics data on given node ID |
pcp_watchdog_info | Displays the watchdog status of the EDB Pgpool-II |
pcp_proc_count | Displays the list of EDB Pgpool-II children process IDs |
pcp_proc_info | Displays the information on the given EDB Pgpool-II child process ID |
pcp_pool_status | Displays the parameter values as defined in pgpool.conf |
pcp_detach_node | Detaches the given node from EDB Pgpool-II, forcing existing connections to EDB Pgpool-II to be disconnected |
pcp_attach_node | Attaches the given node to EDB Pgpool-II |
pcp_promote_node | Promotes the given node as new main to EDB Pgpool-II |
pcp_stop_pgpool | Terminates the EDB Pgpool-II process |
pcp_reload_config | Reloads EDB Pgpool-II config file |
pcp_recovery_node | Attaches the given backend node with recovery |
Note
pcp_health_check_stats and pcp_reload_config commands are available from EDB Pgpool version 4.2 onwards.
To view more information about PCP command options, see the Pgpool project site.
Configuring number of connections and pooling
EDB Pgpool-II has some configuration to tune the pooling and connection processing. Depending on this
configuration, you must also set the Postgres configuration for max_connections to ensure
all connections can be accepted as required. Furthermore, the cloud architecture works
with active/active instances, which needs to spread num_init_children over all EDB Pgpool
instances (divide the normally used value by the number of active instances).
max_pool: Generally, advised to set max_pool to 1. Alternatively, for applications with many reconnects, you can set max_pool to the number of distinct combinations of users, databases,
and connection options for the application connections. All but one connection in the pool are stale connections, which consume a connection slot from Postgres without adding to the performance.
We therefore recommend that you don't configure max_pool beyond 4 to preserve a healthy ratio
between active and stale connections. As an example, for an application that constantly reconnects
and uses two distinct users, both connecting to their own database, set it to 2. If both users
can connect to both databases, set it to 4. Increasing max_pool requires
that you tune down num_init_children in EDB Pgpool or tune up max_connections in Postgres.
num_init_children: We recommend setting num_init_children to the number of connections
that could be running active in parallel, but the divide value by the number of active
EDB Pgpool-II instances (one with the on-premise architecture and all instances for the cloud architecture). As an example: in an architecture with three EDB Pgpool instances, to allow the application to
have 100 active connections in parallel, set num_init_children to 100 for the on-premises architecture, and set num_init_children to 33 for the cloud architecture. Increasing
num_init_children generally requires that you tune up max_connections in Postgres.
listen_backlog_multiplier: Can be set to multiply the number of open connections (as perceived by the application) with the number of active connections (num_init_children).
As an example, when the application might open 500 connections, of which 100 should be
active in parallel, with the on-premises architecture, set num_init_children to
100 and listen_backlog_multiplier to 4. This setup can process
100 connections active in parallel, and another 400 (listen_backlog_multiplier x
num_init_children) connections are queued before connections are blocked. The
application perceives a total of 500 open connections, and Postgres processes the load of
100 connections maximum at all times. Increasing listen_backlog_multiplier causes the application to perceive more connections but doesn't increase the number of
parallel active connections (which is determined by num_init_children).
max_connections: We recommend setting max_connections in Postgres higher
than [number of active pgpool instances] x [max_pool] x [num_init_children] +
[superuser_reserved_connections] (Postgres). As an example: in the on-premises setup
with three instances active/passive, max_pool set to 2, num_init_children set to 100, and
superuser_reserved_connections (Postgres) set to 5, set Postgres max_connections equal to or higher than [1 x 2 x 100+5], which is 205 connections, or higher. A similar setup in the cloud
setup runs with three active instances, max_pool set to 2, num_init_children
set to 33, and superuser_reserved_connections (Postgres) set to 5. In this case
set Postgres max_connections equal or higher than [3x 2 x 33+5] which is 203 or
higher. Configuring below the advised setting can cause issues opening new connections
and, in combination with max_pool, can cause unexpected behavior (low or no active connections but still connection issues due to stale pooled connections using connection slots from
Postgres.
EDB Pgpool-II host setup
After modifying the parameter settings that implement EDB Pgpool-II functionality for your installation, start the EDB Pgpool-II service.
When EDB Pgpool-II starts, it records its process ID in a pgpool.conf file whose name is determined by the pid_file_name configuration parameter. The initial value of the pid_file_name parameter in the sample file is:
pid_file_name = /var/run/edb/pgpool<x.y>/edb-pgpool-<x.y>.pid
<x.y> is the EDB Pgpool release version.
Note
The operating system might remove the contents of the /var/run directory (including the pgpool directory) during a reboot. Don't use the /var/run/edb/pgpool directory as the location for the pgpool.pid file. Modify the pid_file_name parameter to specify a safer directory location.