The change list is the following:
* bump Grafana version to 9.2.6;
* replace old "Graph" panel with "TimeSeries" panel;
* show % usage of memory and CPU in addition to absolute values;
* `Caches` row was removed. All needed info for caches is now part of `Troubleshooting`;
* add Annotations for Alert triggers. Not all alerts are supposed to be displayed
on the dashboard, but only those with label `show_at: dashboard`.
See `alerts.yml` change.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
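The label-based selection of alerts for annotations can be sketched as follows. This is only an illustration in Python, assuming the alerting rules are already loaded as dictionaries with a `labels` mapping; the helper name `alerts_for_dashboard` and the sample rules are hypothetical and are not taken from `alerts.yml`.

```python
def alerts_for_dashboard(rules):
    """Return the names of rules that carry the label show_at=dashboard."""
    return [
        rule["alert"]
        for rule in rules
        if rule.get("labels", {}).get("show_at") == "dashboard"
    ]


# Hypothetical sample rules, for illustration only.
sample_rules = [
    {"alert": "ExampleAlertA", "labels": {"severity": "critical", "show_at": "dashboard"}},
    {"alert": "ExampleAlertB", "labels": {"severity": "warning"}},
]

selected = alerts_for_dashboard(sample_rules)
print(selected)
```

Rules without the label, or without any labels at all, are simply skipped.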
* some unexpected DS UIDs were removed;
* replace `$instance.*` filter with `$instance` since we respect
the instance port anyway;
* remove predefined datasource for `clusterbytenant`
in favour of datasource variable `ds`.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/cluster: few updates
* apply consistent formatting across panels;
* make resource usage panels per component more detailed;
* add extra panels to vmselect for displaying
`vm_rows_read_per_query`, `vm_rows_scanned_per_query`,
`vm_rows_read_per_series` and `vm_series_read_per_query` metrics.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/single: few updates
* apply consistent formatting across panels;
* add extra panels to Performance for displaying
`vm_rows_read_per_query`, `vm_rows_scanned_per_query`,
`vm_rows_read_per_series` and `vm_series_read_per_query` metrics.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/vmagent: few updates
* apply consistent formatting across panels;
* add panels for showing number of samples ingested
or scraped;
* adapt resource usage panels for multiple selected jobs/instances;
* add adhoc variable;
* display vmagent's version in Stats.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/vmalert: few updates
* apply consistent formatting across panels;
* adapt resource usage panels for multiple selected jobs/instances;
* show vmalert version in Stats section.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards: remove index filter from stats panel for DiskUsage
The diskUsage stats panel was showing disk usage without including
size of the index, which is not correct. The filter was removed
to reflect the total disk usage.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2368
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards: add adhoc filter to dashboard variables
The adhoc filter makes it possible to quickly apply global filters without
modifying the panels.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards: add new panel `IndexDB items rate`
The new panel is supposed to reflect the pressure on indexDB
caused by churn rate or new series registration.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards: rm "Deferred merges" panel since it could be misleading
See more context here https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1682#issuecomment-938608067
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards: replace fixed interval of `5m` for `rate` expressions
Previously, we used a fixed `5m` interval for expressions with the `rate` func.
Unfortunately, this interval didn't fit all cases, so we
switch to `$__rate_interval` instead.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
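For context, Grafana's documentation describes `$__rate_interval` as `max($__interval + scrape interval, 4 * scrape interval)`. The sketch below reproduces that formula in Python to show how the window adapts to the dashboard resolution instead of staying at `5m`; the function name and the sample values are hypothetical.

```python
def rate_interval(interval_s, scrape_interval_s):
    """Lookbehind window in seconds, following the documented formula
    max($__interval + scrape interval, 4 * scrape interval)."""
    return max(interval_s + scrape_interval_s, 4 * scrape_interval_s)


# Zoomed in: the step equals the scrape interval, so the 4x lower bound applies.
zoomed_in = rate_interval(15, 15)

# Zoomed out: the step dominates and the window grows with it.
zoomed_out = rate_interval(300, 15)

print(zoomed_in, zoomed_out)
```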
* dashboards: bump version requirement
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards: rm `vm_indexdb_items_added_size_bytes_total` expression
Rate over `vm_indexdb_items_added_size_bytes_total` doesn't seem to be useful
on the dashboard panel.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
The new `Caches` row adds more visibility into cache utilization by VM.
It replaces the old `Cache size` panel.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards: plot cpu limits for vmagent, vmalert and vm-single dashboards
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* alerts: add `TooHighCPUUsage` alert for all VM components
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards: bump components version requirements
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/vmsingle: add "Merges deferred" panel
The new panel is supposed to show whether merges were deferred
due to insufficient disk space.
It comes together with an alerting rule which is supposed to send a signal
in such cases.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/vmsingle: add "Cache usage" panel
The new panel is supposed to show the % of used cache
compared to the allowed size, by cache type.
It should help to identify underutilized cache types.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
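The calculation behind such a panel can be sketched in Python as below. This is not the dashboard query itself; the function name, the input shape (plain dictionaries keyed by cache type) and the sample numbers are assumptions made for illustration.

```python
def cache_usage_percent(used_bytes, max_bytes):
    """Return the % of allowed cache size in use, keyed by cache type."""
    usage = {}
    for cache_type, limit in max_bytes.items():
        if limit <= 0:
            # Skip cache types without a known limit to avoid division by zero.
            continue
        usage[cache_type] = 100.0 * used_bytes.get(cache_type, 0) / limit
    return usage


# Hypothetical cache types and sizes, for illustration only.
used = {"type_a": 256, "type_b": 900}
allowed = {"type_a": 1024, "type_b": 1000, "type_c": 0}

usage = cache_usage_percent(used, allowed)
print(usage)
```

A cache type that stays at a low percentage for a long time is a candidate for being underutilized.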
* dashboards/vmsingle: bump version requirement
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/vmsingle: rm alert for `vm_merge_need_free_disk_space`
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboard: replace `null` datasources
A `null` datasource value may confuse Grafana and make it drop the panel query in some
versions.
* docker: bump grafana image version
* dashboards: add URL variable selector to vmagent dashboard
* dashboards: add new panel `Remote write connection saturation` to vmagent dashboard
* alerts: add new alert for `Remote write connection saturation` panel of vmagent dashboard
* dashboards: add "Logging rate" panel to vmagent dashboard
* dashboard: update single version dash
The update contains the following changes:
* display anonymous memory usage metric. This metric is supposed to reflect
memory usage of the process which can't be freed by the OS;
* add legends to all panels. This is important for cases when users share
the screenshots;
* modify panels for Grafana v8.0.0
* dashboard: update single version dash tags
* dashboard: update vmagent dash
The update contains the following changes:
* display anonymous memory usage metric. This metric is supposed to reflect
memory usage of the process which can't be freed by the OS;
* add legends to all panels. This is important for cases when users share
the screenshots;
* modify panels for Grafana v8.0.0
* dashboard: update single node dashboard
* add number of new series created over last 24h;
* bump version requirements.
* dashboard: update vmagent dashboard
* add panel for open file descriptors;
* add panel for disk I/O;
* add panel for `vmagent_remotewrite_packets_dropped_total` metric;
* bump version requirements.
* dashboard: update single node dashboard
* add panel `Open FDs` for file descriptors metrics;
* add panel `Disk writes/reads` to show the real read/write
load on storage layer;
* add `process_resident_memory_bytes` metric to memory usage panel;
* add stats panel to show available CPUs, memory and disk space;
* rm flags panel since it didn't prove its usefulness.
* alerts: add alert for reaching FDs limit
* dashboard: add `Storage full ETA` panel
The new panel is supposed to help estimate the time left before running out of free
disk space.
Thx to @belm0 @hekmon
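One way to think about such an estimate is to divide the free disk space by the rate at which stored data grows. The Python sketch below assumes that growth can be approximated as ingestion rate multiplied by the average on-disk size of a datapoint; this is an illustrative simplification, not the panel's actual query, and all names and numbers are hypothetical.

```python
def storage_full_eta_seconds(free_bytes, rows_per_second, bytes_per_row):
    """Estimate seconds until the disk is full at the current growth rate."""
    growth_bytes_per_second = rows_per_second * bytes_per_row
    if growth_bytes_per_second <= 0:
        # No growth means the disk never fills up under this model.
        return float("inf")
    return free_bytes / growth_bytes_per_second


# Hypothetical numbers: 86.4 MB free, 1000 rows/s, 0.5 bytes per row on disk.
eta = storage_full_eta_seconds(86_400_000, 1000, 0.5)
print(eta / 86_400, "days")
```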
* disable legend for `Storage full ETA` panel
* dashboard: rename var `datasource` to `ds` for consistency
Dashboards for the cluster version and vmagent operate with a datasource variable
named `ds`. For consistency's sake, we rename this variable in the single node version
as well.
* dashboard: add instance variable picker
See dashboard reviews here https://grafana.com/grafana/dashboards/10229/reviews
* dashboard: limit number of buckets in histogram to 12 for vmagent dashboard
* dashboard: bump version requirement in description for single version
* dashboard: drop extra series override for single version
* dashboard: set Y-min to zero for most panels in vmagent dashboard
vmagent replaces Prometheus for performing scrapes and writing
into the VictoriaMetrics installation. The Prometheus datasource was
dropped, but its config was reused to feed vmagent.
The change also simplifies dashboard propagation
to the Grafana container by removing excessive JSON manipulation
steps.
* Slow metrics load panel was removed since it is hard to interpret without
additional metrics and stats;
* Slow inserts panel was updated to display the percentage of slow inserts compared
to the total number of inserts, to show the real impact.
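The percentage shown by the updated panel boils down to a simple ratio. A minimal Python sketch, assuming both counters are measured over the same time window; the function name and sample values are hypothetical.

```python
def slow_inserts_percent(slow_inserts, total_inserts):
    """Return the share of slow inserts as a percentage of all inserts."""
    if total_inserts == 0:
        # Nothing was inserted, so nothing could have been slow.
        return 0.0
    return 100.0 * slow_inserts / total_inserts


# 5 slow inserts out of 200 is a small share, even if 5 looks alarming alone.
share = slow_inserts_percent(5, 200)
print(share)
```

Showing the ratio instead of the raw count makes it easier to judge whether slow inserts actually matter at the current ingestion volume.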
The update introduces a new row "Troubleshooting" that
contains panels for churn rate and slow-queries/inserts/loads metrics. This row is supposed to reveal the cause of low performance or other issues.
Panels for storage were updated with "bytes-per-datapoint" and "remaining disk size" panels.
The way the regex for column style in the Table panel is applied has changed in Grafana version 6.7. The change is supposed to fix the Flags panel column styles accordingly.