Observability in Hybrid Manager
When you deploy Agent Factory components on Hybrid Manager, the platform gives you integrated observability for both Model Serving (KServe) and Langflow workloads.
Model Serving (KServe)
- Use the platform dashboards and logs to monitor InferenceServices and GPU workloads.
- Metrics include request latency, throughput, GPU/CPU utilization, and error rates.
- See: Monitor InferenceService
Langflow
- Monitor Langflow pod health, replica count, and flow availability via Prometheus and Grafana, and inspect Langflow logs with `kubectl logs` or the HM UI log viewer.
- See: Langflow troubleshooting
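For a quick check outside Grafana, the same pod-health and replica signals are visible from kubectl. The following is a minimal sketch, not an official Hybrid Manager tool: the function name `langflow_status` is hypothetical, and the default `app=langflow` label selector is an assumption. Confirm the labels your deployment actually uses with `kubectl get pods --show-labels`.

```shell
# Hypothetical helper: list Langflow deployments (ready vs. desired replicas)
# and pods (status, restarts, node) in one namespace.
# $1 = namespace
# $2 = label selector (the default app=langflow is an assumption)
langflow_status() {
  kubectl get deployments,pods -n "$1" -l "${2:-app=langflow}" -o wide
}

# Example call (namespace is a placeholder):
#   langflow_status <project-namespace>
```

Passing a second argument overrides the selector if your Langflow pods are labeled differently.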
Accessing metrics and logs
- Dashboards. Hybrid Manager surfaces Agent Factory metrics through the built-in observability dashboards (Grafana).
- Logs. Use `kubectl logs` or the HM UI log viewer to inspect Langflow pods, KServe InferenceServices, and retriever jobs.
Typical commands
For cluster-level troubleshooting or custom dashboards, you can also pull metrics and logs directly:
    # Check logs for a running InferenceService
    kubectl logs -n <project-namespace> svc/<model-service-name>

    # Port-forward Grafana if running locally
    kubectl port-forward -n observability svc/grafana 3000:3000
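For repeated troubleshooting, the log and status checks can be wrapped in small shell helpers. This is a minimal sketch under stated assumptions: the function names `isvc_status` and `isvc_logs` are hypothetical, and the log helper assumes the pods carry the `serving.kserve.io/inferenceservice` label that KServe normally applies. Verify the label in your cluster with `kubectl get pods --show-labels` before relying on it.

```shell
# Hypothetical helpers; names and defaults are illustrative assumptions.

# Show readiness and URL for every InferenceService in a namespace.
# $1 = namespace
isvc_status() {
  kubectl get inferenceservices -n "$1"
}

# Print recent logs from all pods and containers behind one InferenceService.
# Assumes KServe's usual pod label serving.kserve.io/inferenceservice=<name>.
# $1 = namespace, $2 = InferenceService name, $3 = look-back window (default 15m)
isvc_logs() {
  kubectl logs -n "$1" -l "serving.kserve.io/inferenceservice=$2" \
    --all-containers --prefix --since="${3:-15m}" --tail=200
}

# Example calls (names are placeholders):
#   isvc_status <project-namespace>
#   isvc_logs <project-namespace> <model-service-name> 1h
```

Selecting by label rather than by service returns logs from every replica, and `--prefix` marks each line with its source pod and container, which helps when only one replica is failing.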
Grafana dashboard setup for Agent Factory metrics
Hybrid Manager includes a Grafana instance in the observability stack. You can access it through the Launchpad and configure dashboards there for Agent Factory-specific metrics.
Access Grafana
Navigate to the Launchpad in the Hybrid Manager Portal and select the Grafana tile.
Recommended dashboard panels
Build or import a dashboard with the following panels to cover Agent Factory health at a glance:
| Panel | Metric | Visualization |
|---|---|---|
| Inference request rate | rate(kserve_request_duration_seconds_count[5m]) | Time series |
| P95 inference latency | histogram_quantile(0.95, rate(kserve_request_duration_seconds_bucket[5m])) | Time series |
| Inference error rate | rate(kserve_request_duration_seconds_count{response_code!~"2.."}[5m]) | Time series |
| Model server replica count | kube_deployment_status_replicas_ready{deployment=~".*inferenceservice.*"} | Stat |
| Langflow replica count | kube_deployment_status_replicas_ready{deployment=~".*langflow.*"} | Stat |
| GPU utilization | DCGM_FI_DEV_GPU_UTIL | Gauge |
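Before adding a panel, it can help to confirm that its expression returns data. The sketch below sends a PromQL expression to the Prometheus HTTP API endpoint `/api/v1/query`. The function name `promql_query` and the `PROM_URL` default are assumptions: the default presumes you have port-forwarded the Prometheus service of the observability stack to `localhost:9090`, and the actual service name and port in your installation may differ.

```shell
# Base URL of Prometheus. localhost:9090 assumes an active port-forward
# (an assumption, adjust to your environment).
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Run an instant PromQL query and print the raw JSON response.
# $1 = PromQL expression
promql_query() {
  curl -sS -G "${PROM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Example call, using an expression from the table above:
#   promql_query 'rate(kserve_request_duration_seconds_count[5m])'
```

If the response contains an empty `result` array, no series matched the expression. That usually means the metric name or a label filter does not match what your cluster exports, so check the names in Grafana's metric browser before building the panel.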
Import a pre-built dashboard
KServe publishes a community Grafana dashboard. To import it:
- In Grafana, go to Dashboards → Import.
- Enter the dashboard ID or paste the JSON from the KServe community dashboards.
- Select your Prometheus data source.
- Select Import.
Key takeaways
- Model Serving → monitored as KServe InferenceServices, with request latency, throughput, GPU/CPU utilization, and error-rate metrics.
- Langflow → pod health, replica count, and flow availability monitored via Prometheus and Grafana.
- Hybrid Manager provides a unified observability layer, so there is no separate monitoring stack to configure.
- Enterprise monitoring → extend the built-in stack with custom Grafana dashboards.