github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
Roman Khavronenko	4d0b41e63b	deployment: add panel and alerts for displaying go scheduler latency (#7078 ) The panel and alerting rule should help to understand whether a VM component doesn't have enough CPU resources or gets throttled. The alert is applicable for all VM components. The panel was added to vmalert, vmagent, vmsingle, vm cluster and victorialogs dashes. ------------------- This alerting rule should have helped us identify resource shortage for sandbox vmagent - see [this link](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/#/?g0.range_input=23d13h25m25s424ms&g0.end_input=2024-09-23T14%3A11%3A00&g0.relative_time=none&g0.tab=0&g0.expr=histogram_quantile%280.99%2C+sum%28rate%28go_sched_latencies_seconds_bucket%7Bjob%3D%22vmagent-monitoring-vmagent%22%7D%5B5m%5D%29%29+by+%28le%2C+job%2C+instance%29%29+%3E+0.1) for example. We weren't aware of resource shortage, because VM metrics assumed this vmagent had 1vCPU while in fact its limit was 0.2vCPU. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-09-23 16:54:42 +02:00
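The linked query applies `histogram_quantile(0.99, ...)` to `go_sched_latencies_seconds_bucket` and compares the result with 0.1 seconds. The following is a minimal Python sketch of how such a quantile can be estimated from cumulative `le` buckets, assuming classic Prometheus-style linear interpolation. The bucket values are made up for illustration, and the implementation inside VictoriaMetrics may differ in edge cases.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    The buckets mirror the `le` label: counts are cumulative and the last
    bucket is expected to have upper_bound == math.inf.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The quantile falls into the +Inf bucket: report the
                # highest finite bound instead of infinity.
                return prev_bound
            # Linear interpolation inside the matching bucket.
            share = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * share
        prev_bound, prev_count = bound, count
    return prev_bound


SCHED_LATENCY_THRESHOLD = 0.1  # seconds, taken from the linked expression

# Hypothetical per-second bucket rates (not real measurements).
throttled = [(0.001, 50.0), (0.01, 80.0), (0.1, 90.0), (1.0, 100.0), (math.inf, 100.0)]
healthy = [(0.001, 99.5), (0.01, 100.0), (0.1, 100.0), (1.0, 100.0), (math.inf, 100.0)]

p99_throttled = histogram_quantile(0.99, throttled)
p99_healthy = histogram_quantile(0.99, healthy)
alert_fires = p99_throttled > SCHED_LATENCY_THRESHOLD
```

In this sketch the throttled sample crosses the threshold while the healthy one stays far below it, which is the kind of signal the commit message describes.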
hagen1778	9dd9b4442f	dashboards: use `$__interval` variable for offsets and look-behind windows in annotations This should improve precision of `restarts` and `version change` annotations when zooming-in/zooming-out on the dashboards. The change also makes `restarts` dashboard visible on the panels, so user can disable it from displaying if needed. This could be useful when restarts overlap with version change events. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-05-22 16:32:51 +02:00
hagen1778	9256df17fa	deployment: bump Grafana version to 10.4.2 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-04-29 12:10:24 +02:00
hagen1778	8606b48ce5	dashboards: add `Network Usage` panel to `Resource Usage` row https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4478 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-04-29 11:54:17 +02:00
Dima Lazerka	564463259a	deployment/dashboards: properly show version for non-stable docker images (#6150 ) re: `.*-(?:tags|heads)-(.*)-(?:0|dirty)-.*` cases: victoria-metrics-20240419-160209-heads-enterprise-single-node-0-g08f933ab0c -> enterprise-single-node; victoria-metrics-20240201-133950-tags-v1.97.1-enterprise-0-g760a8733b -> v1.97.1-enterprise; victoria-metrics-20240419-160209-heads-rotation-part-2-0-ge2367b6d1-dirty-848b54cd -> rotation-part-2-0-ge2367b6d1; victoria-metrics-20240419-160209-heads-lts-1.93-enterprise-search-contention-0-g30ef4aad21-amd64 -> lts-1.93-enterprise-search-contention; victoria-metrics-20240425-150852-tags-v1.101.0-enterprise-0-g718138c64 -> v1.101.0-enterprise Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Dzmitry Lazerka <dlazerka@gmail.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-04-29 11:11:28 +02:00
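The version-extraction cases listed in this commit can be checked with a short Python sketch. It assumes the pattern is `.*-(?:tags|heads)-(.*)-(?:0|dirty)-.*` (the asterisks appear to have been lost when the message was rendered) and that the capture group is what the dashboard displays. The helper name `display_version` is hypothetical, and Python's `re` stands in for whatever regex engine the dashboard uses.

```python
import re

# Assumed pattern: the regex from the commit message with asterisks restored.
VERSION_RE = re.compile(r".*-(?:tags|heads)-(.*)-(?:0|dirty)-.*")


def display_version(raw):
    """Return the captured tag or branch name, or the input if nothing matches."""
    match = VERSION_RE.fullmatch(raw)
    return match.group(1) if match else raw


# Input/output pairs copied from the commit message.
cases = {
    "victoria-metrics-20240419-160209-heads-enterprise-single-node-0-g08f933ab0c":
        "enterprise-single-node",
    "victoria-metrics-20240201-133950-tags-v1.97.1-enterprise-0-g760a8733b":
        "v1.97.1-enterprise",
    "victoria-metrics-20240419-160209-heads-rotation-part-2-0-ge2367b6d1-dirty-848b54cd":
        "rotation-part-2-0-ge2367b6d1",
    "victoria-metrics-20240419-160209-heads-lts-1.93-enterprise-search-contention-0-g30ef4aad21-amd64":
        "lts-1.93-enterprise-search-contention",
    "victoria-metrics-20240425-150852-tags-v1.101.0-enterprise-0-g718138c64":
        "v1.101.0-enterprise",
}

results = {raw: display_version(raw) for raw in cases}
```

The capture group is greedy, so it extends to the last `-0-` or `-dirty-` separator. That is why the third case keeps `-0-ge2367b6d1` in its result.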
Zakhar Bessarab	6b493582da	dashboards/victoria-metrics-single: allow selecting multiple instance values (#5870 ) Selecting multiple instance IPs makes it much easier to view metrics for longer periods of time in dynamic environments such as Kubernetes. In k8s, an update will also cause the IP to change, making it harder to use the dashboard to check the status. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5869 --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-04-29 10:34:36 +02:00
Aliaksandr Valialkin	8eeb045d3f	all: replace old https://docs.victoriametrics.com/MetricsQL.html url with the new one - https://docs.victoriametrics.com/metricsql/	2024-04-18 02:14:53 +02:00
hagen1778	0ab1069363	dashboards: update links in various panels * use docs.victoriametrics.com instead of github docs * add links to common terms used in VictoriaMetrics Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-03-04 15:43:31 +01:00
hagen1778	3380043424	dashboards: follow-up `4369bc1df2` * add more details to changelog * simplify panels description * remove capacity planning recommendation, as it proves it incompetent Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-08 09:51:43 +01:00
Hui Wang	4369bc1df2	deployment/dashboards: fix `Storage full ETA` panels (#5747 ) During background downsampling, rate(vm_deduplicated_samples_total{type="merge"}) could be much bigger than rate(vm_rows_added_to_storage_total) and it could last quite some time, which causes negative values of Storage full ETA and confuses users, see playground. Instead of trying to get more accurate results during downsampling, I think it's ok to ignore vm_deduplicated_samples_total at all, it's more reasonable to see Storage full ETA increase after downsampling. --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-08 09:43:39 +01:00
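The negative `Storage full ETA` values described in this commit follow from simple arithmetic. The sketch below is a simplified model and not the actual panel query: it assumes the ETA is free disk space divided by net row growth times bytes per row, and all numbers are hypothetical.

```python
import math


def storage_full_eta_seconds(free_bytes, rows_added_rate, bytes_per_row, dedup_rate=0.0):
    """Seconds until the disk is full at the current net ingestion rate.

    rows_added_rate stands for rate(vm_rows_added_to_storage_total) and
    dedup_rate for rate(vm_deduplicated_samples_total{type="merge"}).
    """
    growth_bytes_per_sec = (rows_added_rate - dedup_rate) * bytes_per_row
    if growth_bytes_per_sec == 0:
        return math.inf
    return free_bytes / growth_bytes_per_sec


# Hypothetical values observed during background downsampling.
free_bytes = 500e9
rows_added_rate = 100_000.0   # rows per second
dedup_rate = 250_000.0        # rows per second, temporarily above ingestion
bytes_per_row = 1.0

# Old behaviour: subtracting the dedup rate flips the sign of the ETA.
old_eta = storage_full_eta_seconds(free_bytes, rows_added_rate, bytes_per_row, dedup_rate)

# New behaviour: the dedup term is ignored, so the ETA stays positive.
new_eta = storage_full_eta_seconds(free_bytes, rows_added_rate, bytes_per_row)
```

With the dedup term dropped, the estimate is more conservative during downsampling and increases afterwards, once the freed space shows up as free disk space.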
hagen1778	487a94565b	dashboards/all: add new panel `CPU spent on GC` It should help identify cases when too much CPU is spent on garbage collection, and advise users on how this can be addressed. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-02 16:21:21 +01:00
hagen1778	db11b94e30	dashboards: update to grafana/grafana:10.3.1 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-02 15:41:08 +01:00
hagen1778	02492bc1a4	dashboards/single: fix typo in query for `version` annotation The typo falsely produced many version change events. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-31 09:13:46 +01:00
hagen1778	c23e8bee89	dashboards: specify where to see details about dropped labels Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-29 07:37:51 +01:00
hagen1778	b0287867fe	deployment/dashboards: change title `VictoriaMetrics` to `VictoriaMetrics - single-node` The new title should provide better understanding of this dashboard purpose. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-16 20:39:52 +01:00
Aliaksandr Valialkin	aefd744abb	dashboards: remove `path!="/favicon.ico"` filter from `requests rate` graphs The `path!="/favicon.ico"` filter makes little sense, since there are many other special paths, which may be filtered out - /metrics, /flags, /health, /ping, /robots.txt, /-/healthy, /-/ready, /reload, etc. See /lib/httpserver/httpserver.go for more details. It will be hard or impossible to maintain filters for all these paths, so it is better to drop this filter in order to simplify queries and improve the consistency of these queries.	2023-11-16 19:28:49 +01:00
hagen1778	d389a4fcf3	dashboards: use `version` instead of `short_version` in annotations `short_version` label won't show the difference if various flavors of the same version were deployed. But `version` will. For example, on the sandbox env we test VM builds before new version release. Without this change, the version update won't be visible on dashboard. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-16 09:26:47 +01:00
hagen1778	d3ae2b2f62	dashboards: update description for RSS and anonymous memory panels to be consistent for single-node, cluster and vmagent dashboards. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-14 09:50:06 +01:00
hagen1778	d6ae082598	deployment/dashboards: respect `job` and `instance` filters for `alerts` annotation in cluster and single-node dashboards Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-14 09:38:15 +01:00
hagen1778	0c60228fea	dashboards/victoriametrics: account for instance filter in annotations Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-09-20 14:50:03 +02:00
hagen1778	d890038a94	dashboards: correctly calculate `Bytes per point` value Correctly calculate `Bytes per point` value for single-server and cluster VM dashboards. Before, the calculation mistakenly accounted for the number of entries in indexdb in denominator, which could have shown lower values than expected. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-08-03 16:22:50 +02:00
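The effect of the denominator mistake described in this commit can be illustrated with a few lines of Python. This is only a sketch of the arithmetic with hypothetical numbers. The real panel queries are not shown here.

```python
def bytes_per_point(data_bytes, datapoints, indexdb_entries=0):
    """Average on-disk bytes per stored datapoint.

    Passing indexdb_entries > 0 reproduces the old behaviour, where index
    entries were mistakenly counted in the denominator.
    """
    return data_bytes / (datapoints + indexdb_entries)


# Hypothetical sizes and counts.
data_bytes = 8e9
datapoints = 10e9
indexdb_entries = 2e9

old_value = bytes_per_point(data_bytes, datapoints, indexdb_entries)  # too low
fixed_value = bytes_per_point(data_bytes, datapoints)
```

Counting index entries inflates the denominator, so the old value is lower than the corrected one, as the commit message notes.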
Roman Khavronenko	ccaa9571ef	Dashboard upd (#4438 ) dashboards: update dashboard for single-node version * add anonymous mem usage panel; * add syscall rate panel; * add location to logs panel; * update legend for panels to reflect instance name; * update queries to aggregate per instance. dashboards: update dashboard for cluster version * add syscall rate panel; * add drilldown to logs panel. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-06-12 15:58:47 +02:00
Aliaksandr Valialkin	91533531f5	docs/Troubleshooting.md: document an additional case, which could result in slow inserts If `-cacheExpireDuration` is lower than the interval between ingested samples for the same time series, then `vm_slow_row_inserts_total` metric is increased. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183	2023-03-20 13:28:36 -07:00
Aliaksandr Valialkin	88fed0232c	dashboards: typo fix `Datapoints scanned per series` -> `Datapoints scanned per query`	2023-02-03 19:12:33 -08:00
Aliaksandr Valialkin	c63755c316	lib/writeconcurrencylimiter: improve the logic behind -maxConcurrentInserts limit Previously the -maxConcurrentInserts was limiting the number of established client connections, which write data to VictoriaMetrics. Some of these connections could be idle. Such connections do not consume big amounts of CPU and RAM, so there is little sense in limiting the number of such connections. So now the -maxConcurrentInserts command-line option limits the number of concurrently executed insert requests, not including idle connections. It is recommended to remove the -maxConcurrentInserts command-line option, since the default value for this option should work well for most cases.	2023-01-06 22:20:19 -08:00
Thomas Danielsson	9d1104d812	dashboards: fix operator datasource variable (#3604 ) Got "Failed to upgrade legacy queries Datasource $ds was not found" in Grafana on operator dashboard. Its datasource variable was incorrectly named `datasource`. Also made the rest of the dashboards have homogeneous datasource-variable names and selections, matching vmagent dashboard.	2023-01-05 14:59:56 +01:00
Roman Khavronenko	eb275be99d	dashboards: add VersionChange annotation (#3473 ) The new annotation is hidden by default and is supposed to show component `short_version` label change on the panels. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-12-12 16:32:26 +01:00
Roman Khavronenko	0b6b6d52bf	dashboards: remove DataLinks from single version (#3456 ) Those data links were copy&paste artifact from cluster version and aren't needed on the dash. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-12-07 14:35:52 +01:00
Aliaksandr Valialkin	f3e84b4dea	{dashboards,alerts}: substitute `{type="indexdb"}` with `{type=~"indexdb.*"}` inside queries after `8189770c50` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337	2022-12-05 16:00:22 -08:00
Roman Khavronenko	bdd0683c4a	dashboards: update VM single dash (#3400 ) The change list is the following: * bump Grafana version to 9.2.6; * replace old "Graph" panel with "TimeSeries" panel; * show % usage of Mem and CPU in addition to absolute values; * `Caches` row was removed. All needed info for caches is now part of `Troubleshooting`; * add Annotations for Alert triggers. Not all alerts are supposed to be displayed on the dashboard, but only those with label `show_at: dashboard`. See `alerts.yml` change. Signed-off-by: hagen1778 <roman@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-11-29 19:28:22 +01:00
Timur Bakeyev	9ad578214e	Update `datasource` entries consistently contain type `prometheus` and uid `$ds`. (#3393 ) Co-authored-by: Timour I. Bakeev <tbakeev@ripe.net>	2022-11-28 08:37:39 +01:00
Roman Khavronenko	42e63fe0fd	dashboards: cleanup & remove artifacts (#3387 ) * some unexpected DS UIDs were removed; * replace `$instance.` filter with `$instance` since we respect the instance port anyway; remove predefined datasource for `clusterbytenant` in favour of datasource variable `ds`. Signed-off-by: hagen1778 <roman@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-11-25 09:28:14 +01:00
Roman Khavronenko	908fe6a623	dashboards: replace `Index size` panel with `Active series` (#3157 ) Panel `Index size` showed itself impractical for users. So replacing it with `Active series` panel. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/776#issuecomment-1255823734 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-09-25 21:49:18 +02:00
Roman Khavronenko	b4410b1c63	Dashboards (#3120 ) * dashboards/cluster: few updates * apply consistent formatting across panels; * make resource usage panels per component more detailed; * add extra panels to vmselect for displaying `vm_rows_read_per_query`, `vm_rows_scanned_per_query`, `vm_rows_read_per_series` and `vm_series_read_per_query` metrics. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards/single: few updates * apply consistent formatting across panels; * add extra panels to Performance for displaying `vm_rows_read_per_query`, `vm_rows_scanned_per_query`, `vm_rows_read_per_series` and `vm_series_read_per_query` metrics. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards/vmagent: few updates * apply consistent formatting across panels; * add panels for showing number of samples ingested or scraped; * adapt resource usage panels for multiple selected jobs/instances; * add adhoc variable; * display vmagent's version in Stats. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards/vmalert: few updates * apply consistent formatting across panels; * adapt resource usage panels for multiple selected jobs/instances; * show vmalert version in Stats section. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-09-16 21:24:32 +02:00
Roman Khavronenko	289a4862ba	dashboards: add `Cache usage %` panel to Caches row (#2964 ) https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2941 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-08-08 19:37:34 +03:00
Roman Khavronenko	4c1fbcd6b0	Single dashboards (#2492 ) * dashboards: remove index filter from stats panel for DiskUsage The diskUsage stats panel was showing disk usage without including size of the index, which is not correct. The filter was removed to reflect the total disk usage. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2368 Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards: add adhoc filter to dashboard variables The adhoc filter allows quickly applying global filters without modifying the panels. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards: add new panel `IndexDB items rate` The new panel is supposed to reflect the pressure on indexDB caused by churn rate or new series registration. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards: rm "Deferred merges" panel since it could be misleading See more context here https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1682#issuecomment-938608067 Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards: replace fixed interval of `5m` for `rate` expressions Before we used fixed `5m` interval for expressions with `rate` func. Unfortunately, this interval wasn't a fit for all the cases. So we switch to `$__rate_interval` instead. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards: bump version requirement Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards: rm `vm_indexdb_items_added_size_bytes_total` expression Rate over `vm_indexdb_items_added_size_bytes_total` doesn't seem to be useful on the dashboard panel. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-04-24 23:27:56 +03:00
Roman Khavronenko	ea86716d06	dashboards: add row Caches to single node dashboard (#2208 ) The new row Caches adds more visibility for cache utilization by VM. It replaces the old `Cache size` panel. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-02-18 13:40:19 +02:00
Roman Khavronenko	445edcc6ac	dashboards: update the threshold for slow inserts % on the dashboard (#2197 ) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-02-15 21:56:53 +02:00
Roman Khavronenko	e29b2b8444	Monitoring single (#2190 ) * dashboards: plot cpu limits for vmagent, vmalert and vm-single dashboards Signed-off-by: hagen1778 <roman@victoriametrics.com> * alerts: add `TooHighCPUUsage` alert for all VM components Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards: bump components version requirements Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-02-15 11:54:28 +02:00
Roman Khavronenko	52a3b2d77e	Dashboards vmsingle (#1980 ) * dashboards/vmsingle: add "Merges deferred" panel The new panel is supposed to show if there were deferred merges due to insufficient disk space. It goes with the alerting rule, which is supposed to send a signal in such cases. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards/vmsingle: add "Cache usage" panel The new panel is supposed to show the % of the used cache compared to allowed size by type. It should help to determine underutilized types of caches. Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards/vmsingle: bump version requirement Signed-off-by: hagen1778 <roman@victoriametrics.com> * dashboards/vmsingle: rm alert for `vm_merge_need_free_disk_space` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2021-12-20 17:28:35 +02:00
Aliaksandr Valialkin	802f05f73f	dashboards: consistently use regexp filters for template vars (#1798 ) Template vars may contain regexp when `all` is selected (.*) or when multiple values are selected (foo\|bar). So they must be passed to regexp filters.	2021-11-09 16:50:21 +02:00
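The reasoning in this commit can be sketched in Python: an equality filter compares the whole template value as a literal string, while a regexp filter matches it as a fully anchored pattern. The helper names and instance values are hypothetical, and Python's `re` is used as a stand-in for the regex engine of the query language, which behaves the same for these simple patterns.

```python
import re


def matches_exact(label_value, template_value):
    """Semantics of `label="$var"`: plain string equality."""
    return label_value == template_value


def matches_regex(label_value, template_value):
    """Semantics of `label=~"$var"`: fully anchored regular expression match."""
    return re.fullmatch(template_value, label_value) is not None


instances = ["foo", "bar", "baz"]  # hypothetical label values

multi_value = "foo|bar"  # what the variable expands to for two selected values
all_value = ".*"         # what the variable expands to when `all` is selected

exact_multi = [i for i in instances if matches_exact(i, multi_value)]
regex_multi = [i for i in instances if matches_regex(i, multi_value)]
exact_all = [i for i in instances if matches_exact(i, all_value)]
regex_all = [i for i in instances if matches_regex(i, all_value)]
```

With equality filters both expansions select nothing, because no label value is literally `foo|bar` or `.*`. With regexp filters they select the intended series.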
Roman Khavronenko	0f4bcc00b2	Single dashboards upd (#1593 ) * dashboard: replace `null` datasources null datasource value may confuse Grafana and make it drop panel query in some versions. * docker: bump grafana image version * dashboards: add URL variable selector to vmagent dashboard * dashboards: add new panel `Remote write connection saturation` to vmagent dashboard * alerts: add new alert for `Remote write connection saturation` panel of vmagent dashboard * dashboards: add "Logging rate" panel to vmagent dashboard	2021-09-01 11:46:22 +03:00
Roman Khavronenko	a38a6fe8ad	dashboard: move panel `Disk writes/reads` to `Resource usage` row (#1417 ) * dashboard: move panel `Disk writes/reads` to `Resource usage` row * dashboard: make Stats panel consistent with Cluster dashboard	2021-07-01 05:46:26 +03:00
Roman Khavronenko	a90012ef26	dashboard: bump version requirements (#1378 )	2021-06-14 13:31:59 +03:00
Roman Khavronenko	b8526e88d3	Dashboard single (#1374 ) * dashboard: update single version dash The update contains the following changes: * display anonymous memory usage metric. This metric is supposed to reflect memory usage of the process which can't be freed by OS; * add legends to all panels. This is important for cases when users share the screenshots; * modify panels for Grafana v8.0.0 * dashboard: update single version dash tags * dashboard: update vmagent dash The update contains the following changes: * display anonymous memory usage metric. This metric is supposed to reflect memory usage of the process which can't be freed by OS; * add legends to all panels. This is important for cases when users share the screenshots; * modify panels for Grafana v8.0.0	2021-06-14 13:03:23 +03:00
Aliaksandr Valialkin	6bc52fe41a	all: rename https://victoriametrics.github.io to https://docs.victoriametrics.com	2021-04-20 20:16:17 +03:00
Roman Khavronenko	b955fe0038	dashboard: use unit `short` for `Labels limit exceeded` panel (#1227 )	2021-04-19 13:33:21 +03:00
Roman Khavronenko	f80156d9df	dashboard: fix avg GC duration expression (#1228 ) Previous expression was not correct.	2021-04-19 13:28:41 +03:00
Aliaksandr Valialkin	edd1590ac7	dashboards/victoriametrics.json: typo fix: `chur rate` -> `churn rate`	2021-04-08 09:35:50 +03:00
Roman Khavronenko	b1e49bab52	Dashboards update (#1153 ) * dashboard: update single node dashboard * add number of new series created over last 24h; * bump version requirements. * dashboard: update vmagent dashboard * add panel for open file descriptors; * add panel for disk I/O; * add panel for `vmagent_remotewrite_packets_dropped_total` metric; * bump version requirements.	2021-03-29 12:37:17 +03:00

75 commits