Observability (Agent Factory on HM) v1.4.0 (LTS)

Observability in Hybrid Manager

When you deploy Agent Factory components on Hybrid Manager, the platform gives you integrated observability for both Model Serving (KServe) and Langflow workloads.

Model Serving (KServe)

  • Use the platform dashboards and logs to monitor InferenceServices and GPU workloads.
  • Metrics include request latency, throughput, GPU/CPU utilization, and error rates.
  • See: Monitor InferenceService

Langflow

  • Monitor Langflow logs, pod health, replica count, and flow availability via Prometheus and Grafana.
  • See: Langflow troubleshooting

Accessing metrics and logs

  • Dashboards. Hybrid Manager surfaces Agent Factory metrics through the built-in observability dashboards (Grafana).
  • Logs. Use kubectl logs or the HM UI log viewer to inspect Langflow pods, KServe InferenceServices, and retriever jobs.

Typical commands

For cluster-level troubleshooting or custom dashboards, you can also pull metrics and logs directly:

# Check logs for a running InferenceService
kubectl logs -n <project-namespace> svc/<model-service-name>

# Port-forward Grafana to reach it from your workstation at http://localhost:3000
kubectl port-forward -n observability svc/grafana 3000:3000
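
When you want logs from every pod behind an InferenceService rather than a single one, selecting by label can help. The following is a minimal sketch, not a platform-provided tool: `isvc_logs` is a hypothetical helper name, and it assumes the pods carry the `serving.kserve.io/inferenceservice` label and run the model in a container named `kserve-container`, as in upstream KServe. Verify both in your cluster first.

```shell
# Hypothetical helper: tail logs from every pod backing an InferenceService.
# Assumes the upstream KServe pod label and container name; confirm them with
# `kubectl get pods -n <project-namespace> --show-labels` before relying on this.
# $1 = project namespace, $2 = InferenceService name, $3 = lines per pod (default 100)
isvc_logs() {
  namespace="$1"
  name="$2"
  lines="${3:-100}"
  kubectl logs -n "$namespace" \
    -l "serving.kserve.io/inferenceservice=$name" \
    -c kserve-container \
    --tail="$lines"
}
```

For example, `isvc_logs <project-namespace> <model-service-name> 50` prints the last 50 log lines from each matching pod.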

Grafana dashboard setup for Agent Factory metrics

Hybrid Manager includes a Grafana instance in its observability stack. You can open it from the Launchpad and configure dashboards for Agent Factory-specific metrics.

Access Grafana

In the Hybrid Manager Portal, open the Launchpad and select the Grafana tile.

Build or import a dashboard with the following panels to cover Agent Factory health at a glance:

| Panel | Metric | Visualisation |
| --- | --- | --- |
| Inference request rate | rate(kserve_request_duration_seconds_count[5m]) | Time series |
| P95 inference latency | histogram_quantile(0.95, rate(kserve_request_duration_seconds_bucket[5m])) | Time series |
| Inference error rate | rate(kserve_request_duration_seconds_count{response_code!~"2.."}[5m]) | Time series |
| Model server replica count | kube_deployment_status_replicas_ready{deployment=~".*inferenceservice.*"} | Stat |
| Langflow replica count | kube_deployment_status_replicas_ready{deployment=~".*langflow.*"} | Stat |
| GPU utilisation | DCGM_FI_DEV_GPU_UTIL | Gauge |
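
Before adding a panel, it can be useful to run its expression against the Prometheus HTTP API and confirm that it returns data. The following is a minimal sketch: `prom_query` is a hypothetical helper name, and the base URL is an assumption that depends on how Prometheus is exposed in your cluster (for instance through a port-forward). The endpoint is the standard Prometheus instant-query endpoint, `/api/v1/query`.

```shell
# Hypothetical helper: run an instant PromQL query via the Prometheus HTTP API.
# $1 = Prometheus base URL (an assumption: depends on how your cluster exposes it)
# $2 = PromQL expression, sent URL-encoded as the `query` parameter
prom_query() {
  base_url="$1"
  expr="$2"
  curl --silent --fail --get \
    --data-urlencode "query=$expr" \
    "${base_url%/}/api/v1/query"
}
```

For example, with Prometheus forwarded to an illustrative `http://localhost:9090`, running `prom_query http://localhost:9090 'DCGM_FI_DEV_GPU_UTIL'` returns a JSON document. An empty `data.result` array means no series currently match the expression, which usually points to a missing exporter or a different metric name in your environment.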

Import a pre-built dashboard

KServe publishes a community Grafana dashboard. To import it:

  1. In Grafana, go to Dashboards > Import.
  2. Enter the dashboard ID or paste the JSON from the KServe community dashboards.
  3. Select your Prometheus data source.
  4. Select Import.

Key takeaways

  • Model Serving → monitored as KServe InferenceServices, with request latency, throughput, GPU/CPU utilization, and error-rate metrics.
  • Langflow → pod health, replica count, and flow availability monitored via Prometheus and Grafana.
  • Hybrid Manager provides a unified observability layer, so there is no separate monitoring stack to configure.
  • Enterprise monitoring → extend the built-in stack with custom Grafana dashboards.