* dashboards: add `CPU percentage` panel for cluster dashboards
The new panel `CPU percentage` was added instead of adding a limit
to the existing `CPU` panel because the dashboard may display a large number
of components, each with its own limits. The separate panel should provide
a clear display of CPU load.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
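To illustrate what the `CPU percentage` panel conveys, here is a minimal Python sketch. It assumes each component reports its used CPU cores and its own CPU limit; the function name, component list and sample values are hypothetical and are not taken from the dashboard queries.

```python
def cpu_percentage(used_cores: float, limit_cores: float) -> float:
    """Return CPU usage as a percentage of the component's own limit."""
    if limit_cores <= 0:
        raise ValueError("limit_cores must be positive")
    return 100.0 * used_cores / limit_cores


# Hypothetical sample readings: (used cores, limit in cores) per component.
components = {
    "vminsert": (0.5, 2.0),
    "vmselect": (3.0, 4.0),
    "vmstorage": (6.0, 8.0),
}

# Normalizing by each component's own limit makes the values comparable
# on a single panel, even though the absolute limits differ.
percentages = {
    name: cpu_percentage(used, limit)
    for name, (used, limit) in components.items()
}
```

With these sample values, `vmselect` and `vmstorage` both sit at the same percentage although their absolute usage differs, which is what a single shared limit line on the `CPU` panel could not show.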
* dashboards: sync vmagent and vmalert changes from single version
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* docker: remove unsupported param from vmagent config
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* alerts: add `TooHighCPUUsage` alert for all VM components
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/cluster: add panels for vmstorage in read-only mode
A vmstorage readonly status panel was added to the "vmstorage" row.
One more panel showing vminsert->vmstorage readonly status
was added to the troubleshooting row.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/cluster: add "Cache usage" panel
The new panel is supposed to show the % of the used cache
compared to the allowed size, by cache type.
It should help to identify underutilized cache types.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
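The idea behind the "Cache usage" panel can be sketched in Python as follows. This is only an illustration: the cache type names, byte values and the 25% threshold are hypothetical assumptions, not values used by the dashboard.

```python
def cache_usage_percent(used_bytes: int, max_bytes: int) -> float:
    """Return the used share of a cache as a percentage of its allowed size."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    return 100.0 * used_bytes / max_bytes


# Hypothetical cache types with (used bytes, allowed bytes).
caches = {
    "cache_a": (512, 1024),
    "cache_b": (100, 1000),
}

# Assumed threshold: caches filled below this percentage count as underutilized.
UNDERUTILIZED_BELOW = 25.0

underutilized = sorted(
    name
    for name, (used, allowed) in caches.items()
    if cache_usage_percent(used, allowed) < UNDERUTILIZED_BELOW
)
```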
* dashboards/cluster: add "Merges deferred" panel
The new panel is supposed to show whether there were deferred merges
due to insufficient disk space.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/cluster: update Network panel for vminsert
* delete bytes_written query, since in most cases it is insignificant
* change display type to Stack
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/cluster: bump version requirement
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/vmagent: shuffle panels for better visibility
More important error/dropped panels were moved higher on the main row.
Network usage panel moved to Resource usage row.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/vmagent: add Troubleshooting row to show top 5 instances/jobs by churn rate
New panels are supposed to show the top 5 jobs or targets which generate
most of the churn rate. They were placed into a new row "Troubleshooting".
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/vmagent: add panels for showing persistent queue saturation
New panels were added to the Troubleshooting row to show the persistent queue
saturation. The corresponding alerts were added and linked to these
panels as well.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/vmagent: add alert "RejectedRemoteWriteDataBlocksAreDropped"
The new alert is supposed to send a notification when vmagent starts to drop
data blocks rejected by the configured remote write destination.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboard: replace `null` datasources
A `null` datasource value may confuse Grafana and make it drop the panel query in some
versions.
* docker: bump grafana image version
* dashboards: add URL variable selector to vmagent dashboard
* dashboards: add new panel `Remote write connection saturation` to vmagent dashboard
* alerts: add new alert for `Remote write connection saturation` panel of vmagent dashboard
* dashboards: add "Logging rate" panel to vmagent dashboard
* rm cumulative visualisation for panel `Disk space used`.
It uses % threshold and cumulative display breaks it.
* remove area filling for resource usage row;
* add job name for panels in resource usage row.
* dashboard: update vmagent dash
The update contains the following changes:
* display anonymous memory usage metric. This metric is supposed to reflect
memory usage of the process which can't be freed by the OS;
* add legends to all panels. This is important for cases when users share
the screenshots;
* modify panels for Grafana v8.0.0
* dashboard: update cluster dash
The update contains the following changes:
* move stats panels to Configuration row, so it can be collapsed;
* display anonymous memory usage metric. This metric is supposed to reflect
memory usage of the process which can't be freed by the OS;
* add legends to all panels. This is important for cases when users share
the screenshots;
* modify panels for Grafana v8.0.0
* [draft] per tenant statistic
* updates metric name
update graph
adds link and example config
* quick fix
* adds grafana dashboard
adds example alert
Co-authored-by: f41gh7 <nik@victoriametrics.com>
* dashboard: change FreeDiskSpace panel to show percentage of used space instead
* dashboard: disable area fill for Cache hit ratio
* dashboard: minor display updates
* dashboard: add panel `Concurrent flushes on disk`
* dashboard: add `Rows ignored` panel
* dashboard: update ChurnRate panel with proper description and additional query over 24h time window
* dashboard: update single node dashboard
* add number of new series created over last 24h;
* bump version requirements.
* dashboard: update vmagent dashboard
* add panel for open file descriptors;
* add panel for disk I/O;
* add panel for `vmagent_remotewrite_packets_dropped_total` metric;
* bump version requirements.
* add panel `Open FDs` for file descriptors metrics;
* add panel `Disk writes/reads` to show the real read/write
load on storage layer;
* add stats panel to show available CPUs, memory and disk space.
* dashboard: rename var `datasource` to `ds` for consistency
Dashboards for the cluster version and vmagent operate with a datasource variable
named `ds`. For consistency's sake we rename this variable in the single node version
as well.
* dashboard: add instance variable picker
See dashboard reviews here https://grafana.com/grafana/dashboards/10229/reviews
* dashboard: limit number of buckets in histogram to 12 for vmagent dashboard
* dashboard: bump version requirement in description for single version
* dashboard: drop extra series override for single version
* dashboard: set Y-min to zero for most of panels in vmagent dashboard
The `vmagent` Grafana dashboard is supposed to provide basic observability over multiple
`vmagent` instances. The dashboard is saved in Grafana export format so it can be easily
imported. It was also integrated into the docker-compose environment.
* Slow metrics load panel was removed since it is hard to interpret without
additional metrics and stats;
* Slow inserts panel was updated to display the percentage of slow inserts compared
to the total number of inserts to show the real impact.
* The update introduces a new row "Troubleshooting" that
contains panels for churn rate and slow-queries/inserts/loads metrics. This row is supposed to reveal the cause of low performance or other issues;
* CPU panel got `short` units instead of `seconds`;
* Overview row was updated with panel showing bytes-per-datapoint stat;
* Overview row was updated with panel showing free disk space.
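The slow inserts percentage mentioned in the list above can be sketched in Python as follows. This is a minimal illustration under assumed inputs: the function name and the sample counters are hypothetical, and the actual panel query may differ.

```python
def slow_inserts_percent(slow_inserts: int, total_inserts: int) -> float:
    """Return slow inserts as a percentage of all inserts.

    Returns 0.0 when there were no inserts at all, so an idle
    instance is not reported as having slow inserts.
    """
    if total_inserts == 0:
        return 0.0
    return 100.0 * slow_inserts / total_inserts


# Hypothetical counters over the same time window.
impact = slow_inserts_percent(slow_inserts=50, total_inserts=10_000)
```

A relative value like this shows the real impact: 50 slow inserts matter far less out of 10,000 total inserts than out of 100.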
The list of changes is as follows:
* fix Uptime panel column styles according to changes introduced in 6.7 Grafana version
* fix panel `vminsert/Rows per insert` due to metric rename - see #336
* change default datasource to VictoriaMetrics since dashboard now uses MetricsQL for `vminsert/Rows per insert` panel