Commit graph

78 commits

Author SHA1 Message Date
hagen1778
9b173c2f01
dashboards: follow-up 4369bc1df2
* add more details to changelog
* simplify panels description
* remove capacity planning recommendation, as it proves it incompetent

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-08 12:55:42 +02:00
Hui Wang
0cd0ddc1c1
deployment/dashboards: fix Storage full ETA panels (#5747)
During background downsampling, rate(vm_deduplicated_samples_total{type="merge"}) could be much bigger than 
rate(vm_rows_added_to_storage_total) and it could last quite some time,
 which causes negative values of Storage full ETA and confuses users, see playground.

Instead of trying to get more accurate results during downsampling, I think it's ok to ignore 
vm_deduplicated_samples_total at all, it's more reasonable to see Storage full ETA increase after downsampling.

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-02-08 12:54:31 +02:00
hagen1778
bdbab7bed5
dashboards/all: add new panel CPU spent on GC
It should help identifying cases when too much CPU is spent on garbage collection,
 and advice users on how this can be addressed.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-05 11:42:28 +02:00
hagen1778
3dab94a6c1
dashboards: update to grafana/grafana:10.3.1
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-05 10:50:36 +02:00
hagen1778
151247c9b9
dashboards/single: fix typo in query for version annotation
The typo falsely produced many version change events.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-31 10:28:24 +02:00
hagen1778
5aa0f77d8c
dashboards: specify where to see details about dropped labels
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-29 17:23:38 +01:00
hagen1778
a2d3fce05f
deployment/dashboards: change title VictoriaMetrics to VictoriaMetrics - single-node
The new title should provide better understanding of this dashboard purpose.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-17 01:06:07 +02:00
Aliaksandr Valialkin
5c43f2261e
dashboards: remove path!="/favicon.ico" filter from requests rate graphs
The `path!="/favicon.ico"` filter has little sense, since there are many other special paths,
which may be filtered out - /metrics, /flags, /health, /ping, /robots.txt, /-/healthy, /-/ready, /reload, etc.
See /lib/httpserver/httpserver.go for more details.
It will be hard or impossible to maintain filters for all these paths, so it is better to drop this filter
in order to simplify queries and improve the consistency of these queries.
2023-11-16 19:29:46 +01:00
hagen1778
7d72474a38
dashboards: use version instead of short_version in annotations
`version` label won't show the difference if various flavors of the same
version were deployed. But `short_version` will.

For example, on the sandbox env we test VM builds before new version release.
Without this change, the version update won't be visible on dashboard.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit d389a4fcf3)
2023-11-16 09:27:42 +01:00
hagen1778
72a40539b0
dashboards: update description for RSS and anonymous memory panels to be consistent for single-node, cluster and vmagent dashboards.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit d3ae2b2f62)
2023-11-14 10:00:11 +01:00
hagen1778
777424082b
deployment/dashboards: respect job and instance filters for alerts annotation in cluster and single-node dashboards
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit d6ae082598)
2023-11-14 10:00:11 +01:00
hagen1778
f2195cb914
dashboards/victoriametrics: account for instance filter in annotations
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-09-21 09:36:35 +02:00
hagen1778
05b4fbf0b5
dashboards: correctly calculate Bytes per point value
Correctly calculate `Bytes per point` value for single-server and cluster VM dashboards.
Before, the calculation mistakenly accounted for the number of entries in indexdb in
denominator, which could have shown lower values than expected.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-08-11 04:53:56 -07:00
Roman Khavronenko
ecd7ec4832
Dashboard upd (#4438)
dashboards: update dashboard for single-node version
* add anonymous mem usage panel;
* add syscall rate panel;
* add location to logs panel;
* update legend for panels to reflect instance name;
* update queries to aggregate per instance.

dashboards: update dashboard for cluster version
* add syscall rate panel;
* add drilldown to logs panel.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-06 16:49:42 -07:00
Aliaksandr Valialkin
531b35b6c0
docs/Troubleshooting.md: document an additional case, which could result in slow inserts
If `-cacheExpireDuration` is lower than the interval between ingested samples for the same time series,
then vm_slow_row_inserts_total` metric is increased.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183
2023-03-20 14:33:27 -07:00
Aliaksandr Valialkin
3db8d7cb01
dashboards: typo fix Datapoints scanned per series -> Datapoints scanned per query 2023-02-03 19:12:42 -08:00
Aliaksandr Valialkin
b275983403
lib/writeconcurrencylimiter: improve the logic behind -maxConcurrentInserts limit
Previously the -maxConcurrentInserts was limiting the number of established client connections,
which write data to VictoriaMetrics. Some of these connections could be idle.
Such connections do not consume big amounts of CPU and RAM, so there is a little sense in limiting
the number of such connections. So now the -maxConcurrentInserts command-line option
limits the number of concurrently executed insert requests, not including idle connections.

It is recommended removing -maxConcurrentInserts command-line option, since the default value
for this option should work good for most cases.
2023-01-06 22:07:16 -08:00
Thomas Danielsson
ec1f6811a1
dashboards: fix operator datasource variable (#3604)
Got "Failed to upgrade legacy queries Datasource $ds was not found" in
Grafana on operator dashboard.
It's datasource variable was incorrectly named `datasource`.

Also made the rest of the dashboards have homogeneous datasource-variable
names and selections, matching vmagent dashboard.
2023-01-05 16:49:19 -08:00
Roman Khavronenko
4917c9ad8a
dashboards: add VersionChange annotation (#3473)
The new annotation is hidden by default and suppose to show
component `short_version` label change on the panels.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-12 14:41:46 -08:00
Roman Khavronenko
5539beddb5
dashboards: remove DataLinks from single version (#3456)
Those data links were copy&paste artifact from cluster version
and aren't needed on the dash.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-07 09:49:32 -08:00
Aliaksandr Valialkin
d2e34b8052
{dashboards,alerts}: subtitute {type="indexdb"} with {type=~"indexdb.*"} inside queries after 8189770c50
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2022-12-05 16:00:42 -08:00
Roman Khavronenko
b2f45b4856
dashboards: update VM single dash (#3400)
The change list is the following:
* bump Grafana version to 9.2.6;
* replace old "Graph" panel with "TimeSeries" panel;
* show % usage of Mem and CPU additionally to of absolute values;
* `Caches` row was removed. All needed info for caches is now part of `Troubleshooting`;
* add Annotations for Alert triggers. Not all alerts are supposed to be displayed
on the dashboard, but only those with label `show_at: dashboard`.
See `alerts.yml` change.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-29 20:39:05 -08:00
Timur Bakeyev
b6064dd645
Update datasource entries consistently contain type prometheus and uid $ds. (#3393)
Co-authored-by: Timour I. Bakeev <tbakeev@ripe.net>
2022-11-28 16:43:58 -08:00
Roman Khavronenko
ed39d0d11c
dashboards: cleanup & remove artifacts (#3387)
* some unexpected DS UIDs were removed;
* replace `$instance.*` filter with `$instance` since we respect
the instance port anyway;
* remove predefined datasource for `clusterbytenant`
in favour of datasource variable `ds`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-25 07:28:24 -08:00
Roman Khavronenko
0efc20d7b8
dashboards: replace Index size panel with Active series (#3157)
Panel `Index size` showed itself impractical for users. So
replacing it with `Active series` panel.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/776#issuecomment-1255823734
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-26 08:48:25 +03:00
Roman Khavronenko
5dfe63e102
Dashboards (#3120)
* dashboards/cluster: few updates

* apply consistent formatting across panels;
* make resource usage panels per component more detailed;
* add extra panels to vmselect for displaying
`vm_rows_read_per_query`, `vm_rows_scanned_per_query`,
`vm_rows_read_per_series` and `vm_series_read_per_query` metrics.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/single: few updates

* apply consistent formatting across panels;
* add extra panels to Performance for displaying
`vm_rows_read_per_query`, `vm_rows_scanned_per_query`,
`vm_rows_read_per_series` and `vm_series_read_per_query` metrics.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmagent: few updates

* apply consistent formatting across panels;
* add panels for showing number of samples ingested
or scraped;
* adapt resource usage panels for multiple selected jobs/instances;
* add adhoc variable;
* display vmagent's version in Stats.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmalert: few updates

* apply consistent formatting across panels;
* adapt resource usage panels for multiple selected jobs/instances;
* show vmalert version in Stats section.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-19 15:04:37 +03:00
Max Golionko
e07f23a1b9
moved cluster dashboard to master (#3074)
dashboards: move cluster dashboard to master branch

This change should simplify dashboards management.
2022-09-08 11:47:25 +03:00
Roman Khavronenko
3c583c16a1
dashboards: add Cache usage % panel to Caches row (#2960)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2941
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-08 11:45:17 +02:00
Roman Khavronenko
fc03950efa
dashboards: update cluster dashboard (#2773)
* dashboards: update cluster dashboard

* add assisted merges panel https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2754
* add mem panel per each component
* remove lines filling for some panels for clarity

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update dashboards/victoriametrics.json
2022-06-23 09:46:28 +02:00
Roman Khavronenko
246d2df361
dashboards: add cpu usage panels per each component type (#2723)
The change adds extra panel per each component, showing
the amount of used CPU cores and the limit (summary of all instances).

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2696

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-16 20:49:55 +03:00
Roman Khavronenko
d956f6f68e
Dashboar cluster update (#2674)
* dashboard: fix query for `CPU percentage` panel

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboard: replace Uptime panel with Version panel

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-02 16:03:48 +02:00
Roman Khavronenko
46c06334ee
dashboards: use vm_concurrent_select_current instead of vm_concurrent_queries (#2655)
Using metric `vm_concurrent_queries` in relation to `vm_concurrent_select_capacity`
is incorrect. Switching to `vm_concurrent_select_current` in `Concurrent selects` panel.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-30 12:16:24 +03:00
Roman Khavronenko
8c30640828
dashboards: bump version requirement for cluster dashboard (#2537)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-05-05 13:40:07 +03:00
hagen1778
e856d74b7b dashboards: replace fixed interval of 5m for rate expressions
Before we used fixed `5m` interval for expressions with `rate` func.
Unfortunately, this interval wasn't a fit for all the cases. So we
switch to `$__rate_interval` instead.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-04-24 23:25:32 +03:00
hagen1778
16b3374874 dashboards: add new panel IndexDB items rate
The new panel supposed to reflect the pressure on indexDB
caused by churn rate or new series registration.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-04-24 23:25:32 +03:00
hagen1778
1762256c7e dashboards: mention that Rows.Sent can be affected by replication
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-04-24 23:25:32 +03:00
hagen1778
4255cb7559 dashboards: rm "Deferred merges" panel since it could be misleading
See more context here https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1682#issuecomment-938608067

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-04-24 23:25:32 +03:00
hagen1778
0bbc7221f3 dashboards: add adhoc filter to dasbhoard variables
The adhoc filter allows to quickly apply global filters without
modifying the panels.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-04-24 23:25:32 +03:00
hagen1778
80e8413f3a dashboards: remove index filter from stats panel for DiskUsage
The diskUsage stats panel was showing disk usage without including
size of the index, which is not correct. The filter was removed
to reflect the total disk usage.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2368

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-04-24 23:25:32 +03:00
Roman Khavronenko
3569352fe0
dashboards: update the threshold for slow inserts % on the dashboard (#2198)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-02-15 21:57:21 +02:00
Roman Khavronenko
3458a3d593
Monitoring cluster (#2191)
* dashboards: add `CPU percentage` panel for cluster dashboards

The new panel `CPU percentage` was added instead if adding a limit
to the existing `CPU` panel because dasbhoard may display big number
of components each with own limits. The separate panel should provide
a clear display of CPU load.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards: sync vmagent and vmalert changes from single version

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* docker: remove unsupported param from vmagent config

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* alerts: add `TooHighCPUUsage` alert for all VM components

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-02-15 11:57:58 +02:00
Roman Khavronenko
4010f548b5
dashboards: migrate from old table panel in cluster dashboard (#1993)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-22 11:21:06 +02:00
Roman Khavronenko
0311d3cc89
Dashboards cluster (#1983)
* dashboards/cluster: add panels for vmstorage in read-only mode

vmstorage readonly status panel was addded to "vmstorage" row.

A one more panel for showing vminsert->vmstorage readonly status
was added to troubleshooting row.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/cluster: add "Cache usage" panel

The new panel supposed to show the % of the used cache
compared to allowed size by type.
It should help to determine underutilized types of caches.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/cluster: add "Merges deferred" panel

The new panel supposed to show if there were deferred merges
due to insufficient disk space.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/cluster: update Network panel for vminsert

* delete bytes_written query, since in most cases it is insiginificant
* change display type to Stack

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/cluster: bump version requirement

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-20 17:32:05 +02:00
Aliaksandr Valialkin
6346b78fa8
dashboards: consistently use regexp filters for template vars (#1799)
Template vars may contain regexp when `all` is selected (.*) or when multiple values are selected (foo|bar).
So they must be passed to regexp filters.
2021-11-09 16:50:08 +02:00
Roman Khavronenko
18313f3f8e
Cluster dashboard update (#1594)
* dashboards: sync `vmagent` updates from master branch

* dashboards: add new `Storage connection saturation` panel for cluster dashboard

* dashboards: add new cluster alert for corresponding `Storage connection saturation` panel
2021-09-01 17:05:17 +03:00
Roman Khavronenko
c6cf821600
dashboard: several minor fixes (#1418)
* move panel `Disk writes/reads` to `Resource usage` row
* rename row `storage` to `vmstorage`
* remove cumulative display for `Storage ETA` panel
2021-07-01 05:45:35 +03:00
Roman Khavronenko
5cb378f5b5
dasbhoard: display tweaks (#1387)
* rm cumulative visualisation for panel `Disk space used`.
It uses % threshold and cumulative display breaks it.
* remove area filling for resource usage row;
* add job name for panels in resource usage row.
2021-06-18 10:48:40 +03:00
Roman Khavronenko
db39c4a7d1
dashboard: bump version requirements (#1379) 2021-06-14 13:32:32 +03:00
Roman Khavronenko
1053d3e5a9
Dashboard cluster (#1375)
* dashboard: update vmagent dash

The update contains the following changes:
* display anonymous memory usage metric. This metric suppose to reflect
memory usage of the process which can't be freed by OS;
* add legends to all panels. This is important for cases when users share
the screenshots;
* modify panels for Grafana v8.0.0

* dashboard: update cluster dash

The update contains the following changes:
* move stats panels to Configuration row, so it can be collapsed;
* display anonymous memory usage metric. This metric suppose to reflect
memory usage of the process which can't be freed by OS;
* add legends to all panels. This is important for cases when users share
the screenshots;
* modify panels for Grafana v8.0.0
2021-06-14 13:03:54 +03:00
Aliaksandr Valialkin
1c09e71f5b app/vminsert: add -disableRerouting command-line flag for disabling re-routing if some vmstorage nodes have lower performance than the others
Refactor the rerouting mechanism and make it more resilient to cases when some of vmstorage nodes are temporarily unavailable.

Reduce the probability of rerouting storm.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/791
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1054
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1165
2021-06-04 04:33:52 +03:00