Commit graph

19 commits

Author SHA1 Message Date
Artem Fetishev
683f8c2780
dashboards: add Restarts panel (#7394)
Reopening PR #7373 from a branch in VictoriaMetrics repo in order to
enable edits and rebase.

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-10-30 16:44:08 +01:00
Roman Khavronenko
4d0b41e63b
deployment: add panel and alerts for displying go scheduler latency (#7078)
The panel and alerting rule should help to understand whether VM
component doesn't have enough CPU resources or gets throttled. The alert
is applicable for all VM components.
The panel was added to vmalert, vmagent, vmsingle, vm clusert and
victorialogs dashes.

-------------------

This alerting rule should have help us identify resource shortage for
sandbox vmagent - see [this
link](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/#/?g0.range_input=23d13h25m25s424ms&g0.end_input=2024-09-23T14%3A11%3A00&g0.relative_time=none&g0.tab=0&g0.expr=histogram_quantile%280.99%2C+sum%28rate%28go_sched_latencies_seconds_bucket%7Bjob%3D%22vmagent-monitoring-vmagent%22%7D%5B5m%5D%29%29+by+%28le%2C+job%2C+instance%29%29+%3E+0.1)
for example. We weren't aware of resource shortage, because VM metrics
assumed this vmagent had 1vCPU while in fact its limit was 0.2vCPU.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-09-23 16:54:42 +02:00
Hui Wang
75ad6c1b49
vmalert-dashboard: replace variable query metric (#6505)
`vmalert_iteration_total` series number is 4 time less than
`vmalert_iteration_duration_seconds`, queries will be lighter.
2024-06-19 09:40:34 +02:00
hagen1778
9dd9b4442f
dashboards: use $__interval variable for offsets and look-behind windows in annotations
This should improve precision of `restarts` and `version change` annotations when
 zooming-in/zooming-out on the dashboards.

 The change also makes `restarts` dashboard visible on the panels, so user can disable it from
 displaying if needed. This could be useful when restarts overlap with version change events.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-22 16:32:51 +02:00
Hui Wang
d7b5062917
app/vmalert: support DNS SRV record in -remoteWrite.url (#6299)
part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6053,
supports [DNS SRV](https://en.wikipedia.org/wiki/SRV_record) address in
`-remoteWrite.url` command-line option.
2024-05-22 10:52:51 +02:00
hagen1778
c746ba154d
deployment/dashboards: fix AnnotationQueryRunner error in Grafana
The error appears when executing annotations query against Prometheus backend
because the query itself hasn't specified look-behind window (which is allowed
in VictoriaMetrics query engine).

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6309
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-21 11:39:02 +02:00
hagen1778
9256df17fa
deployment: bump Grafana version to 10.4.2
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-04-29 12:10:24 +02:00
Aliaksandr Valialkin
59d495d469
all: replace old https://docs.victoriametrics.com/Troubleshooting.html url with the new one - https://docs.victoriametrics.com/troubleshooting/ 2024-04-18 03:26:36 +02:00
Aliaksandr Valialkin
b4fac26360
all: replace old https://docs.victoriametrics.com/vmalert.html url with the new one - https://docs.victoriametrics.com/vmalert/ 2024-04-18 01:44:12 +02:00
Aliaksandr Valialkin
4927e64700
all: replace remaining https://docs.victoriametrics.com/vmagent.html urls with the new one - https://docs.victoriametrics.com/vmagent/ 2024-04-18 01:36:13 +02:00
hagen1778
0ab1069363
dashboards: update links in various panels
* use docs.victoriametrics.com instead of github docs
* add links to common terms used in VictoriaMetrics

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-03-04 15:43:31 +01:00
hagen1778
487a94565b
dashboards/all: add new panel CPU spent on GC
It should help identifying cases when too much CPU is spent on garbage collection,
 and advice users on how this can be addressed.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-02 16:21:21 +01:00
hagen1778
db11b94e30
dashboards: update to grafana/grafana:10.3.1
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-02-02 15:41:08 +01:00
Dmytro Kozlov
935bec447b
app/vmalert: replace error metrics for gauges with counter metrics (#5217)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5160

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2023-12-06 19:39:35 +01:00
hagen1778
aaf9e3d526
dashboards/vmalert: add new panel Missed evaluations
The new panel supposed to indicate alerting groups that miss their evaluations.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-31 10:35:19 +01:00
hagen1778
8874b525b7
dashboards: fix Errors rate to Alertmanager filter
The panel `Errors rate to Alertmanager` had `group` label filter
applied to the expression, while the metric `vmalert_alerts_send_errors_total`
doesn't have that label. This resulted into always empty results.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-31 10:16:45 +01:00
hagen1778
c2d252c045
dashboards/vmalert: respect job and instance filters in No data errors
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-17 09:40:39 +02:00
hagen1778
edba9f6266
dashboards/vmalert: use desc sorting for tooltips on panels
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-17 09:31:09 +02:00
Roman Khavronenko
e8db78eaa4
dashboards: provide copies of Grafana dashboards alternated with Vict… (#4905)
dashboards: provide copies of Grafana dashboards alternated with VictoriaMetrics datasource

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-08-29 11:06:55 +02:00