Commit graph

165 commits

Author SHA1 Message Date
hagen1778
309a767fc5
dashboards: fix wrong templating for vmauth
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit e45d80cd79)
2024-07-02 14:37:15 +02:00
Andrii Chubatiuk
937ae2ca90
lib/streamaggr: added stale samples metric, added metrics labels (#6462)
### Describe Your Changes

- added stale metrics counters for input and output samples
- added labels for aggregator metrics =>
`name="{rwctx}:{aggrId}:{aggrSuffix}"`
   - rwctx - global or number starting from 1
   - aggrid - aggregator id starting from 1
   - aggrSuffix - <interval>_(by|without)_label1_label2_labeln
   e.g: `name="global:1:1m_without_instance_pod"`
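
The naming scheme above can be sketched as follows. This is a minimal illustration, not the `lib/streamaggr` implementation: the function `aggr_metric_name` and its parameters are hypothetical, and `aggrid` in the list is assumed to be the same field as `{aggrId}`.

```python
def aggr_metric_name(rwctx, aggr_id, interval, labels, without=True):
    """Compose the `name` label value as {rwctx}:{aggrId}:{aggrSuffix}.

    rwctx    -- "global" or a number starting from 1 (per the commit message)
    aggr_id  -- aggregator id starting from 1
    interval -- aggregation interval, e.g. "1m"
    labels   -- labels listed in the `by` or `without` clause
    """
    mode = "without" if without else "by"
    # aggrSuffix has the form <interval>_(by|without)_label1_label2_labeln
    suffix = "_".join([interval, mode, *labels])
    return f"{rwctx}:{aggr_id}:{suffix}"


print(aggr_metric_name("global", 1, "1m", ["instance", "pod"]))
# prints: global:1:1m_without_instance_pod
```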

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 861852f262)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-07-01 15:01:49 +02:00
Artem Navoiev
19c4dfd72c
dashboards: update statistic by tenant dashboard, fix billing disk usage pie panel (#6521)
- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit c9f496bdd0)
2024-06-27 09:32:13 +02:00
Nikolay
bf1464fc33
dashboards: add dashboard and alerts for vmauth (#6491)
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit 14b9ef1e4d)
2024-06-25 11:17:04 +02:00
hagen1778
63c15d76cd
dashboards: fix typo in panel descriptions for vmagent
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit b201d1722d)
2024-06-21 11:44:29 +02:00
Hui Wang
7a21e6cb6b
vmalert-dashboard: replace variable query metric (#6505)
`vmalert_iteration_total` has 4 times fewer series than
`vmalert_iteration_duration_seconds`, so the queries will be lighter.

(cherry picked from commit 75ad6c1b49)
2024-06-19 10:37:10 +02:00
James Rhoat
f4b52b8137
updating operator dashboard chart to be titled working instead of wokring (#6455)
### Describe Your Changes

Corrected spelling mistake in the operator json to be "working" instead
of "wokring"

### Checklist

The following checks are **mandatory**:

- [ x ] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

(cherry picked from commit fbd4b8e1ab)
2024-06-11 17:05:03 +02:00
Andrii Chubatiuk
6fd314d8ba
vmagent: updated dashboard and alert for stream aggregation (#6427)
### Describe Your Changes

Added streaming aggregation section to vmagent dashboards
Added alert for streaming aggregation and deduplication flush timeouts
Removed deprecated compose versions from compose files

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 2da45a8368)
2024-06-10 12:37:22 +02:00
hagen1778
89819f2054
dashboards: use $__interval variable for offsets and look-behind windows in annotations
This should improve the precision of `restarts` and `version change` annotations when
zooming in or out on the dashboards.

The change also makes the `restarts` annotation visible on the panels, so users can hide it
if needed. This could be useful when restarts overlap with version change events.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 9dd9b4442f)
2024-05-22 16:40:08 +02:00
Hui Wang
5b8c3fc9d0
app/vmalert: support DNS SRV record in -remoteWrite.url (#6299)
part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6053,
supports [DNS SRV](https://en.wikipedia.org/wiki/SRV_record) address in
`-remoteWrite.url` command-line option.

(cherry picked from commit d7b5062917)
2024-05-22 10:53:22 +02:00
hagen1778
0dd3fec2b7
deployment/dashboards: fix AnnotationQueryRunner error in Grafana
The error appears when executing the annotations query against a Prometheus backend,
because the query itself doesn't specify a look-behind window (omitting it is allowed
in the VictoriaMetrics query engine).

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6309
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit c746ba154d)
2024-05-21 16:37:23 +02:00
hagen1778
d87c8757cf
dashboards: add new panel Concurrent selects to vmstorage row
The panel will show how many ongoing select queries are processed by vmstorage
and should help to identify resource bottlenecks. See panel description for more details.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit d386a68b59)
2024-04-30 10:30:08 +02:00
hagen1778
0d77a55961
deployment: update per-tenant-statistic dashboard to be compatible with Grafana 10
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 9e18724036)
2024-04-30 10:30:06 +02:00
hagen1778
a0698d92c3
deployment: update backupmanager dashboard to be compatible with Grafana 10
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 5917ac003e)
2024-04-30 10:30:04 +02:00
hagen1778
40af0fa179
deployment: update operator dashboard to be compatible with Grafana 10
- Use TimeSeries panel instead of deprecated Graph
- Update panel styles
- Fix version panel

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit f0c4d372bd)
2024-04-30 10:30:02 +02:00
hagen1778
0f72ab8ef6
deployment: bump Grafana version to 10.4.2
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 9256df17fa)
2024-04-30 10:30:00 +02:00
hagen1778
8f48747802
dashboards: add Network Usage panel to Resource Usage row
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4478
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 8606b48ce5)
2024-04-30 10:29:58 +02:00
Dima Lazerka
d821e13c24
deployment/dashboards: properly show version for non-stable docker images (#6150)
re: .*-(?:tags|heads)-(.*)-(?:0|dirty)-.*

cases:
victoria-metrics-20240419-160209-heads-enterprise-single-node-0-g08f933ab0c
enterprise-single-node

victoria-metrics-20240201-133950-tags-v1.97.1-enterprise-0-g760a8733b
v1.97.1-enterprise

victoria-metrics-20240419-160209-heads-rotation-part-2-0-ge2367b6d1-dirty-848b54cd
rotation-part-2-0-ge2367b6d1

victoria-metrics-20240419-160209-heads-lts-1.93-enterprise-search-contention-0-g30ef4aad21-amd64
lts-1.93-enterprise-search-contention

victoria-metrics-20240425-150852-tags-v1.101.0-enterprise-0-g718138c64
v1.101.0-enterprise
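
The cases above can be checked mechanically. The sketch below applies the regular expression from this commit message to each listed version string and compares the captured group with the listed result. It assumes the expression must match the whole string; the helper name `display_version` is hypothetical and is not part of the dashboard.

```python
import re

# Regular expression quoted in the commit message.
VERSION_RE = re.compile(r".*-(?:tags|heads)-(.*)-(?:0|dirty)-.*")

# Input version string -> expected captured value, copied from the cases above.
CASES = {
    "victoria-metrics-20240419-160209-heads-enterprise-single-node-0-g08f933ab0c":
        "enterprise-single-node",
    "victoria-metrics-20240201-133950-tags-v1.97.1-enterprise-0-g760a8733b":
        "v1.97.1-enterprise",
    "victoria-metrics-20240419-160209-heads-rotation-part-2-0-ge2367b6d1-dirty-848b54cd":
        "rotation-part-2-0-ge2367b6d1",
    "victoria-metrics-20240419-160209-heads-lts-1.93-enterprise-search-contention-0-g30ef4aad21-amd64":
        "lts-1.93-enterprise-search-contention",
    "victoria-metrics-20240425-150852-tags-v1.101.0-enterprise-0-g718138c64":
        "v1.101.0-enterprise",
}


def display_version(version):
    """Return the captured part of the version string, or None if it does not match."""
    match = VERSION_RE.fullmatch(version)
    return match.group(1) if match else None


for raw, expected in CASES.items():
    assert display_version(raw) == expected, (raw, display_version(raw))
```

The capture group is greedy, so when both `-0-` and `-dirty-` are present (third case) it extends up to the last of them.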

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Dzmitry Lazerka <dlazerka@gmail.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 564463259a)
2024-04-30 10:29:56 +02:00
Zakhar Bessarab
88677f179c
dashboards/victoria-metrics-single: allow selecting multiple instance values (#5870)
Selecting multiple instance IPs makes it much easier to view
metrics over longer periods of time in dynamic environments such as
Kubernetes. In k8s, an update also causes the IP to change, making it harder
to use the dashboard to check the status.

See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5869

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 6b493582da)
2024-04-30 10:29:54 +02:00
hagen1778
d4b56d467f
dashboards: show max number of active merges instead of cumulative
The cumulative number of active merges could be a red herring,
as its value depends on the number of vmstorages.
For example, a vmstorage could be added or removed, and this would affect
the panel.
Or, each vmstorage could start a merging process (e.g. for downsampling),
and visually it could look like a massive change.
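
A small sketch of the difference, using made-up per-node values (the node names and numbers are hypothetical): adding a node changes the cumulative value even though no node is merging more than before, while the maximum stays the same.

```python
def cumulative_merges(per_node):
    """Sum of active merges across all vmstorage nodes (the old panel behaviour)."""
    return sum(per_node.values())


def max_merges(per_node):
    """Largest number of active merges on any single vmstorage node."""
    return max(per_node.values())


three_nodes = {"vmstorage-1": 2, "vmstorage-2": 2, "vmstorage-3": 2}
four_nodes = {**three_nodes, "vmstorage-4": 2}

print(cumulative_merges(three_nodes), cumulative_merges(four_nodes))  # prints: 6 8
print(max_merges(three_nodes), max_merges(four_nodes))                # prints: 2 2
```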

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 035de57e5e)
2024-04-24 17:08:15 +02:00
Aliaksandr Valialkin
87641fa7e7
all: replace old https://docs.victoriametrics.com/Troubleshooting.html url with the new one - https://docs.victoriametrics.com/troubleshooting/
2024-04-18 03:27:18 +02:00
Aliaksandr Valialkin
a21d1fcf57
all: replace old https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html url with the new one - https://docs.victoriametrics.com/cluster-victoriametrics/
2024-04-18 02:56:28 +02:00
Aliaksandr Valialkin
64938732e3
all: replace old https://docs.victoriametrics.com/MetricsQL.html url with the new one - https://docs.victoriametrics.com/metricsql/
2024-04-18 02:15:33 +02:00
Aliaksandr Valialkin
a99005eff6
all: replace old https://docs.victoriametrics.com/vmalert.html url with the new one - https://docs.victoriametrics.com/vmalert/
2024-04-18 01:44:54 +02:00
Aliaksandr Valialkin
c0457ac11a
all: replace remaining https://docs.victoriametrics.com/vmagent.html urls with the new one - https://docs.victoriametrics.com/vmagent/
2024-04-18 01:36:20 +02:00
Vadim Rutkovsky
59fc201aee
dashboards: fix typo in VictoriaLogs panel (#6102)
Comprasion -> compression

(cherry picked from commit 66c5fc3243)
2024-04-16 09:58:33 +02:00
Artem Navoiev
8f22c44db6
dashboards: statistic per tenant dashboard use variable for datasource in pie charts
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-03-17 23:22:35 +02:00
hagen1778
764fc566ff
dashboards: add more context to cluster dashboard panels
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-03-06 13:34:10 +02:00
hagen1778
51745ec5ff
dashboards: update links in various panels
* use docs.victoriametrics.com instead of github docs
* add links to common terms used in VictoriaMetrics

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-03-04 17:00:54 +02:00
Artem Navoiev
53c1c603d2
dashboards: update statistic per tenant dashboard. Change to timeseries panel, add churn rate over 24h and query duration, add billing section
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2024-02-29 02:41:27 +02:00
hagen1778
f4578826b3
dashboards: add legend details to network panels in cluster dash
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit ecccd2a1cc)
2024-02-16 15:31:57 +01:00
hagen1778
9b173c2f01
dashboards: follow-up 4369bc1df2
* add more details to changelog
* simplify panels description
* remove capacity planning recommendation, as it proved inadequate

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-08 12:55:42 +02:00
Hui Wang
0cd0ddc1c1
deployment/dashboards: fix Storage full ETA panels (#5747)
During background downsampling, rate(vm_deduplicated_samples_total{type="merge"}) could be much bigger than
rate(vm_rows_added_to_storage_total), and this could last quite some time,
which causes negative values of Storage full ETA and confuses users, see playground.

Instead of trying to get more accurate results during downsampling, I think it's OK to ignore
vm_deduplicated_samples_total altogether; it's more reasonable to see Storage full ETA increase after downsampling.
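
The arithmetic behind the negative values can be sketched as follows. The formula is an assumption made for illustration (free space divided by net row growth times bytes per row), not the panel's actual query, and all numbers are made up.

```python
def storage_full_eta_seconds(free_bytes, rows_added_per_s, bytes_per_row,
                             dedup_rows_per_s=0.0):
    """Estimate seconds until the disk is full (illustrative formula only).

    dedup_rows_per_s models rate(vm_deduplicated_samples_total{type="merge"});
    passing 0 corresponds to ignoring that metric, as this commit does.
    """
    net_rows_per_s = rows_added_per_s - dedup_rows_per_s
    if net_rows_per_s == 0:
        return float("inf")  # no net growth, the disk never fills up
    return free_bytes / (net_rows_per_s * bytes_per_row)


# During downsampling the dedup rate may exceed the ingestion rate.
before = storage_full_eta_seconds(1_000_000, rows_added_per_s=1000,
                                  bytes_per_row=1, dedup_rows_per_s=3000)
after = storage_full_eta_seconds(1_000_000, rows_added_per_s=1000,
                                 bytes_per_row=1)
print(before)  # prints: -500.0
print(after)   # prints: 1000.0
```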

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-02-08 12:54:31 +02:00
hagen1778
bdbab7bed5
dashboards/all: add new panel CPU spent on GC
It should help identify cases when too much CPU is spent on garbage collection,
and advise users on how this can be addressed.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-05 11:42:28 +02:00
hagen1778
2206309439
dashboards: add Targets scraped/s
A new stat panel shows the number of targets scraped by vmagent per second.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-05 10:51:35 +02:00
hagen1778
3dab94a6c1
dashboards: update to grafana/grafana:10.3.1
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-05 10:50:36 +02:00
hagen1778
151247c9b9
dashboards/single: fix typo in query for version annotation
The typo falsely produced many version change events.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-31 10:28:24 +02:00
hagen1778
5aa0f77d8c
dashboards: specify where to see details about dropped labels
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-29 17:23:38 +01:00
hagen1778
6ef6b83b33
dashboards: reflect dashboard rename in copy script
This is a follow-up for ff33e60a3d

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-22 18:44:18 +02:00
hagen1778
a2d3fce05f
deployment/dashboards: change title VictoriaMetrics to VictoriaMetrics - single-node
The new title should provide a better understanding of this dashboard's purpose.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-17 01:06:07 +02:00
hagen1778
12acea2584
dashboards: update cluster dashboard
* add panels for detailed visualization of traffic usage between vmstorage, vminsert, vmselect
components and their clients. New panels are available in the rows dedicated to specific components.

* update "Slow Queries" panel to show the percentage of slow queries relative to the total number of read queries
served by vmselect. The percentage value should make it clearer to users whether there is a service degradation.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 463455665b)
2024-01-08 11:58:56 +01:00
Dmytro Kozlov
6a41e1ec0c
app/vmalert: replace error metrics for gauges with counter metrics (#5217)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5160

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 935bec447b)
2023-12-06 19:41:34 +01:00
Hui Wang
065f5a7f9e
vmagent: add vm_promscrape_scrape_pool_targets for scrape jobs like… (#5335)
* vmagent: export `vm_promscrape_scrape_pool_targets` metric to track the number of targets that each scrape_job discovers

* add extra panel for new metric
2023-12-06 14:46:02 +02:00
Aliaksandr Valialkin
5c43f2261e
dashboards: remove path!="/favicon.ico" filter from requests rate graphs
The `path!="/favicon.ico"` filter makes little sense, since there are many other special paths,
which may be filtered out - /metrics, /flags, /health, /ping, /robots.txt, /-/healthy, /-/ready, /reload, etc.
See /lib/httpserver/httpserver.go for more details.
It will be hard or impossible to maintain filters for all these paths, so it is better to drop this filter
in order to simplify queries and improve the consistency of these queries.
2023-11-16 19:29:46 +01:00
hagen1778
7d72474a38
dashboards: use version instead of short_version in annotations
The `short_version` label won't show the difference if various flavors of the same
version were deployed. But `version` will.

For example, on the sandbox env we test VM builds before new version release.
Without this change, the version update won't be visible on dashboard.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit d389a4fcf3)
2023-11-16 09:27:42 +01:00
hagen1778
72a40539b0
dashboards: update description for RSS and anonymous memory panels to be consistent for single-node, cluster and vmagent dashboards.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit d3ae2b2f62)
2023-11-14 10:00:11 +01:00
hagen1778
777424082b
deployment/dashboards: respect job and instance filters for alerts annotation in cluster and single-node dashboards
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit d6ae082598)
2023-11-14 10:00:11 +01:00
hagen1778
8c3bac8f40
dashboards/cluster: fix description about max threshold for Concurrent selects panel.
Before, it mistakenly implied that `max` is equal to double the number of available CPUs.

Addresses https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5214

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-31 19:03:21 +01:00
hagen1778
9debdb497c
dashboards/vmalert: add new panel Missed evaluations
The new panel is supposed to indicate alerting groups that miss their evaluations.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit aaf9e3d526)
2023-10-31 10:35:57 +01:00
hagen1778
497c708aaa
dashboards: fix Errors rate to Alertmanager filter
The panel `Errors rate to Alertmanager` had `group` label filter
applied to the expression, while the metric `vmalert_alerts_send_errors_total`
doesn't have that label. This resulted in always-empty results.
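
Why such a filter returns nothing can be sketched with a toy label matcher. The `select` helper and the label values below are hypothetical; the sketch relies only on the Prometheus-style rule that a missing label is treated as an empty value.

```python
def select(series, **matchers):
    """Keep series whose labels equal every matcher.

    A label that is absent from a series is treated as an empty string,
    so an equality matcher with a non-empty value can never match it.
    """
    return [
        s for s in series
        if all(s.get(name, "") == value for name, value in matchers.items())
    ]


# Illustrative series: it carries no `group` label.
series = [
    {"__name__": "vmalert_alerts_send_errors_total", "addr": "http://alertmanager:9093"},
]

print(select(series, group="my-group"))  # prints: []
print(len(select(series)))               # prints: 1
```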

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 8874b525b7)
2023-10-31 10:35:57 +01:00