What are the recommended Prometheus metrics to scrape?

My YB cluster has a Prometheus 2.2.1 instance running. I want to send some of its metrics to DataDog for monitoring. Does the YB team have a recommended set of metrics to send?

Right now, I send the “up” and “node_filesystem_avail” metrics to DataDog. There are a number of "handler_latency_yb_" metrics. We use YCQL, but the handler_latency_yb_ metrics do not include YCQL. We also use YEDIS, but I am not sure which metrics to send.

So, I am looking for a set of YCQL and YEDIS metrics to send to DataDog. The purpose is to monitor cluster health.

Hi @Steve_Liang

Have you seen https://docs.yugabyte.com/latest/explore/observability/prometheus-integration/linux/ ?

Hello @Steve_Liang!

Please use the documentation provided by @dorian_yugabyte to set up the relabel configs for Prometheus if you haven’t already.

A full list of useful metrics should be available in the Grafana JSON config at https://grafana.com/grafana/dashboards/12620. I would recommend bringing up the Grafana dashboard after configuring Prometheus as described at https://docs.yugabyte.com/latest/explore/observability/prometheus-integration/linux/ so you can see the key metrics organized by category (YCQL, YSQL, DocDB, etc.).
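If you would rather get a plain list of metric names out of that dashboard JSON than read it by eye, something like the rough Python sketch below might help. It assumes the dashboard stores its Prometheus queries under "expr" keys (the usual layout for Grafana Prometheus targets) and uses a simple heuristic, not a real PromQL parser, so treat the result as a starting point. The file name dashboard.json and the metric name example_ycql_metric_count are placeholders I made up for illustration; they are not taken from the actual dashboard.

```python
import json
import re

# PromQL keywords that are not metric names (illustrative, not exhaustive).
PROMQL_KEYWORDS = {"and", "or", "unless", "offset", "bool", "by", "without",
                   "on", "ignoring", "group_left", "group_right"}


def load_dashboard(path):
    """Load a Grafana dashboard JSON export from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def collect_exprs(node):
    """Recursively gather every string stored under an "expr" key."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "expr" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_exprs(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_exprs(item))
    return found


def metric_names(exprs):
    """Heuristically extract metric names from PromQL expressions."""
    names = set()
    for expr in exprs:
        text = re.sub(r"\$\w+", "", expr)          # Grafana variables like $cluster
        text = re.sub(r"\{[^}]*\}", "", text)      # label matchers {...}
        text = re.sub(r"\[[^\]]*\]", "", text)     # range selectors like [5m]
        text = re.sub(                             # grouping clauses like by (label)
            r"\b(?:by|without|on|ignoring|group_left|group_right)\s*\([^)]*\)",
            "", text)
        for match in re.finditer(r"[A-Za-z_:][A-Za-z0-9_:]*", text):
            token = match.group()
            rest = text[match.end():].lstrip()
            if rest.startswith("(") or token in PROMQL_KEYWORDS:
                continue  # function/aggregation call or keyword, not a metric
            names.add(token)
    return sorted(names)


# Tiny made-up dashboard standing in for the real export.
sample_dashboard = {
    "panels": [
        {"title": "YCQL ops", "targets": [{
            "expr": 'sum(rate(example_ycql_metric_count{node_prefix="$cluster"}[5m]))'
                    ' by (exported_instance)'}]},
        {"type": "row", "panels": [{"targets": [
            {"expr": "avg(node_filesystem_avail)"}, {"expr": "up"}]}]},
    ]
}

if __name__ == "__main__":
    for name in metric_names(collect_exprs(sample_dashboard)):
        print(name)
```

To run it against the real dashboard, download the JSON from the Grafana page and replace sample_dashboard with load_dashboard("dashboard.json"). The resulting names could then serve as a candidate list for what to forward to DataDog.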