dashboards: updates operator dashboard (#7139)

* Replaces deprecated graphs with Timeseries panels
* Adds new latency dashboards for rest client and golang scheduler
* Adds new overview panels
* Adds VM Datasource version of dashboard

---------

Signed-off-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
This commit is contained in:
Nikolay 2024-09-30 15:35:39 +02:00 committed by GitHub
parent 0b2d3d7752
commit fbaa026ae6
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
4 changed files with 3396 additions and 561 deletions

View file

@ -15,3 +15,4 @@ dashboards-sync:
SRC=vmagent.json D_UID=G7Z9GzMGz TITLE="VictoriaMetrics - vmagent" $(MAKE) dashboard-copy SRC=vmagent.json D_UID=G7Z9GzMGz TITLE="VictoriaMetrics - vmagent" $(MAKE) dashboard-copy
SRC=vmalert.json D_UID=LzldHAVnz TITLE="VictoriaMetrics - vmalert" $(MAKE) dashboard-copy SRC=vmalert.json D_UID=LzldHAVnz TITLE="VictoriaMetrics - vmalert" $(MAKE) dashboard-copy
SRC=vmauth.json D_UID=nbuo5Mr4k TITLE="VictoriaMetrics - vmauth" $(MAKE) dashboard-copy SRC=vmauth.json D_UID=nbuo5Mr4k TITLE="VictoriaMetrics - vmauth" $(MAKE) dashboard-copy
SRC=operator.json D_UID=1H179hunk TITLE="VictoriaMetrics - operator" $(MAKE) dashboard-copy

File diff suppressed because it is too large Load diff

2149
dashboards/vm/operator.json Normal file

File diff suppressed because it is too large Load diff

View file

@ -31,6 +31,7 @@ See also [LTS releases](https://docs.victoriametrics.com/lts-releases/).
* FEATURE: [vmui](https://docs.victoriametrics.com/#vmui): add link to vmalert when proxy is enabled. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5924). * FEATURE: [vmui](https://docs.victoriametrics.com/#vmui): add link to vmalert when proxy is enabled. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5924).
* FEATURE: [vmui](https://docs.victoriametrics.com/#vmui): keep selected columns in table view on page reloads. Before, selected columns were reset on each update. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7016). * FEATURE: [vmui](https://docs.victoriametrics.com/#vmui): keep selected columns in table view on page reloads. Before, selected columns were reset on each update. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7016).
* FEATURE: [dashboards](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/dashboards) for VM single-node, cluster, vmalert, vmagent, VictoriaLogs: add `Go scheduling latency` panel to show the 99th quantile of Go goroutines scheduling. This panel should help identifying insufficient CPU resources for the service. It is especially useful if CPU gets throttled, which now should be visible on this panel. * FEATURE: [dashboards](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/dashboards) for VM single-node, cluster, vmalert, vmagent, VictoriaLogs: add `Go scheduling latency` panel to show the 99th quantile of Go goroutines scheduling. This panel should help identifying insufficient CPU resources for the service. It is especially useful if CPU gets throttled, which now should be visible on this panel.
* FEATURE: [dashboards](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/dashboards): update [dashboard for VictoriaMetrics k8s operator](https://grafana.com/grafana/dashboards/17869-victoriametrics-operator/): replace deprecated TimeSeries panels, add latency panels for rest client and Golang scheduler, add overview panels. The dashboard is now also available for [VictoriaMetrics Grafana datasource](https://github.com/VictoriaMetrics/victoriametrics-datasource).
* FEATURE: [alerts](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts-health.yml): add alerting rule to track the Go scheduling latency for goroutines. It should notify users if VM component doesn't have enough CPU to run or gets throttled. * FEATURE: [alerts](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts-health.yml): add alerting rule to track the Go scheduling latency for goroutines. It should notify users if VM component doesn't have enough CPU to run or gets throttled.
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent/) and [Single-node VictoriaMetrics](https://docs.victoriametrics.com/): hide jobs that contain only healthy targets when `show_only_unhealthy` filter is enabled. Before, jobs without unhealthy targets were still displayed on the page. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3536). * FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent/) and [Single-node VictoriaMetrics](https://docs.victoriametrics.com/): hide jobs that contain only healthy targets when `show_only_unhealthy` filter is enabled. Before, jobs without unhealthy targets were still displayed on the page. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3536).
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add service discovery support for [OVH Cloud VPS](https://www.ovhcloud.com/en/vps/) and [OVH Cloud dedicated server](https://ovhcloud.com/en/bare-metal/). See [these docs](https://docs.victoriametrics.com/sd_configs/#ovhcloud_sd_configs) and [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6071). * FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add service discovery support for [OVH Cloud VPS](https://www.ovhcloud.com/en/vps/) and [OVH Cloud dedicated server](https://ovhcloud.com/en/bare-metal/). See [these docs](https://docs.victoriametrics.com/sd_configs/#ovhcloud_sd_configs) and [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6071).