flow_server.json reference

The flow_server.json file configures the FlowServer service. Pass it to FlowServer at startup using the -c flag, as described in Configure FlowServer.

{
    "Host": "",
    "Port": 6060,
    "Gpfdist": {
        "Host": "",
        "Port": 6070,
        "ReuseTables": true
    },
    "Prometheus": {
        "Host": "",
        "Port": 9080,
        "MetricsPath": "/flow_metrics"
    },
    "DebugPort": 6080,
    "Logging": {
        "SplitLogByJob": false,
        "FrontendLevel": "debug",
        "BackendLevel": "info"
    }
}

Parameters

  • Host: The hostname or IP address of the server. The default is an empty string, which means it will listen on all interfaces.
  • Port: The port number on which the server listens for incoming connections. The default is 6060.
  • Gpfdist: Configuration options for the gpfdist service.
    • Host: The hostname or IP address of the gpfdist service. The default is an empty string, which means it will listen on all interfaces.
    • Port: The port number on which the gpfdist service listens. The default is 6070.
    • ReuseTables: Whether to reuse existing external tables in the database. The default is false. When set to false, FlowServer drops the external table associated with a load operation (if one exists) and creates a new external table when you start or restart the job; the external table name is based on the job name. When set to true, FlowServer reuses external tables and generates the external table name using a hash of various load configuration property values.
  • Prometheus: Configuration options for the Prometheus metrics endpoint. When this block is present, FlowServer automatically starts an HTTP server to expose metrics for scraping.
    • Host: The hostname or IP address on which the metrics HTTP server listens. The default is an empty string, which means it listens on all interfaces.
    • Port: The port number on which the metrics HTTP server listens. The default is 9080.
    • MetricsPath: The path to the metrics endpoint. The default is /metrics.
  • DebugPort: The port number for the debug server. When set, FlowServer starts an HTTP listener, which exposes runtime diagnostics including CPU profiles, heap usage, and goroutine stacks at http://<flowserver_host>:<DebugPort>/debug/pprof/. If not set, the debug server doesn't start.
  • Logging: Configuration options for logging. The supported values for FrontendLevel and BackendLevel are debug, info, warn, error, and fatal.
    • SplitLogByJob: Whether to split logs by job. The default is true, meaning logs are separated by job.
    • FrontendLevel: The logging level for the frontend/stdout. The default is info.
    • BackendLevel: The logging level for the backend/log file. The default is debug.
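
To make the defaults above concrete, the following Python sketch fills a partial flow_server.json with the documented default values and derives the metrics and debug URLs. It is an illustration only, not FlowServer's own loading logic: the function names effective_config and endpoints are hypothetical, and the sketch assumes that Prometheus defaults apply only when the Prometheus block is present and that an empty Host can be reached through the FlowServer host name you supply.

```python
import copy
import json

# Defaults as listed in the parameter descriptions on this page.
DEFAULTS = {
    "Host": "",
    "Port": 6060,
    "Gpfdist": {"Host": "", "Port": 6070, "ReuseTables": False},
    "Logging": {
        "SplitLogByJob": True,
        "FrontendLevel": "info",
        "BackendLevel": "debug",
    },
}
# Assumption: these apply only when the Prometheus block is present.
PROMETHEUS_DEFAULTS = {"Host": "", "Port": 9080, "MetricsPath": "/metrics"}
LOG_LEVELS = {"debug", "info", "warn", "error", "fatal"}


def effective_config(text):
    """Parse flow_server.json text and fill in the documented defaults."""
    user = json.loads(text)
    cfg = copy.deepcopy(DEFAULTS)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(cfg.get(key), dict):
            cfg[key].update(value)  # merge nested blocks key by key
        else:
            cfg[key] = value
    if "Prometheus" in user:
        cfg["Prometheus"] = {**PROMETHEUS_DEFAULTS, **user["Prometheus"]}
    for name in ("FrontendLevel", "BackendLevel"):
        level = cfg["Logging"][name]
        if level not in LOG_LEVELS:
            raise ValueError(f"Logging.{name}: unsupported level {level!r}")
    return cfg


def endpoints(cfg, flowserver_host):
    """Build the metrics and pprof URLs for the optional listeners."""
    urls = {}
    if "Prometheus" in cfg:
        prom = cfg["Prometheus"]
        host = prom["Host"] or flowserver_host  # empty Host: all interfaces
        urls["metrics"] = f"http://{host}:{prom['Port']}{prom['MetricsPath']}"
    if "DebugPort" in cfg:  # the debug server starts only when DebugPort is set
        urls["pprof"] = f"http://{flowserver_host}:{cfg['DebugPort']}/debug/pprof/"
    return urls
```

For example, calling effective_config on a file that contains only a Prometheus block with MetricsPath set to /flow_metrics keeps Port at 9080, and endpoints(cfg, "localhost") then reports the metrics URL with that port and path.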

Prometheus metrics

FlowServer exposes the following metrics at the configured Host:Port/MetricsPath endpoint.

| Metric | Description |
|---|---|
| whpg_flow_server_job_failed | Number of jobs that have failed. |
| whpg_flow_server_job_running | Number of jobs currently running. |
| whpg_flow_server_job_total | Total number of jobs processed by the server. |
| whpg_flow_server_process_cpu_seconds_total | Total user and system CPU time spent in seconds. |
| whpg_flow_server_process_max_fds | Maximum number of open file descriptors. |
| whpg_flow_server_process_network_receive_bytes_total | Number of bytes received by the process over the network. |
| whpg_flow_server_process_network_transmit_bytes_total | Number of bytes sent by the process over the network. |
| whpg_flow_server_process_open_fds | Number of open file descriptors. |
| whpg_flow_server_process_resident_memory_bytes | Resident memory size in bytes. |
| whpg_flow_server_process_start_time_seconds | Start time of the process since Unix epoch in seconds. |
| whpg_flow_server_process_virtual_memory_bytes | Virtual memory size in bytes. |
| whpg_flow_server_process_virtual_memory_max_bytes | Maximum amount of virtual memory available in bytes. |
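
As a quick way to inspect these metrics without a Prometheus server, the following Python sketch fetches the endpoint and reads the job counters from the Prometheus text exposition format. It is a minimal sketch under stated assumptions: the helper names fetch_metrics, parse_metrics, and failure_ratio are hypothetical, the URL you pass in must match your own Host, Port, and MetricsPath settings, and the parser assumes the metrics are exposed without labels.

```python
import urllib.request


def fetch_metrics(url, timeout=5):
    """Fetch the raw metrics text from the configured endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_metrics(text, prefix="whpg_flow_server_"):
    """Collect unlabeled samples whose names start with the given prefix."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        parts = line.split()
        name = parts[0]
        # Assumption: FlowServer metrics carry no labels, so lines such as
        # name{label="x"} are ignored by this sketch.
        if name.startswith(prefix) and "{" not in name and len(parts) >= 2:
            values[name] = float(parts[1])
    return values


def failure_ratio(values):
    """Return failed jobs divided by total jobs, or 0.0 when no jobs ran."""
    total = values.get("whpg_flow_server_job_total", 0.0)
    failed = values.get("whpg_flow_server_job_failed", 0.0)
    return failed / total if total else 0.0
```

With the sample configuration at the top of this page, you might call parse_metrics(fetch_metrics("http://localhost:9080/flow_metrics")), replacing localhost with the host on which FlowServer runs.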
