dashboards: update cluster dashboard

* add panels for detailed visualization of traffic usage between vmstorage, vminsert, and vmselect
components and their clients. New panels are available in the rows dedicated to specific components.

* update "Slow Queries" panel to show the percentage of slow queries relative to the total number of read queries
served by vmselect. The percentage should make it clearer to users whether there is a service degradation.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
hagen1778 2024-01-04 10:37:13 +01:00
parent eb08f5c7e5
commit 463455665b
GPG key ID: 3BF75F3741CA9640
3 changed files with 1592 additions and 702 deletions

File diff suppressed because it is too large

File diff suppressed because it is too large

@@ -39,6 +39,8 @@ The sandbox cluster installation is running under the constant load generated by
* FEATURE: [vmctl](https://docs.victoriametrics.com/vmctl.html): rename cmd-line flag `vm-native-disable-retries` to `vm-native-disable-per-metric-migration` to better reflect its meaning.
* FEATURE: all VictoriaMetrics components: add ability to specify arbitrary HTTP headers to send with every request to `-pushmetrics.url`. See [`push metrics` docs](https://docs.victoriametrics.com/#push-metrics).
* FEATURE: all VictoriaMetrics components: add `-metrics.exposeMetadata` command-line flag, which allows displaying `TYPE` and `HELP` metadata at `/metrics` page exposed at `-httpListenAddr`. This may be needed when the `/metrics` page is scraped by collector, which requires the `TYPE` and `HELP` metadata such as [Google Cloud Managed Prometheus](https://cloud.google.com/stackdriver/docs/managed-prometheus/troubleshooting#missing-metric-type).
* FEATURE: dashboards/cluster: add panels for detailed visualization of traffic usage between vmstorage, vminsert, vmselect components and their clients. New panels are available in the rows dedicated to specific components.
* FEATURE: dashboards/cluster: update "Slow Queries" panel to show percentage of the slow queries to the total number of read queries served by vmselect. The percentage value should make it more clear for users whether there is a service degradation.
* BUGFIX: [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html): properly return full results when `-search.skipSlowReplicas` command-line flag is passed to `vmselect` and when [vmstorage groups](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#vmstorage-groups-at-vmselect) are in use. Previously partial results could be returned in this case.
* BUGFIX: `vminsert`: properly accept samples via [OpenTelemetry data ingestion protocol](https://docs.victoriametrics.com/#sending-data-via-opentelemetry) when these samples have no [resource attributes](https://opentelemetry.io/docs/instrumentation/go/resources/). Previously such samples were silently skipped.