Commit graph

9 commits

Author SHA1 Message Date
hagen1778
c47138e1b0
dashboards: add panels for absoulte value of mem and cpu usage by vmalert
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4627

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-08-03 11:14:14 +02:00
Roman Khavronenko
3eebe52a06
Dashboards upd (#3942)
* dashboards/cluser: use `quantile` since `median` isn't supported by PromQL

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/*: add `restarts` annotation to show when there were restarts

The cluster's annotation query is aggregated `by job`,
while vmagent/vmalert are aggregated `by job, instance`.
This is because cluster dashboard can contains too many instances
and annotation could become too noisy.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/*: support instance filter in Version annotation

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-03-10 17:13:19 +01:00
Thomas Danielsson
9d1104d812
dashboards: fix operator datasource variable (#3604)
Got "Failed to upgrade legacy queries Datasource $ds was not found" in
Grafana on operator dashboard.
It's datasource variable was incorrectly named `datasource`.

Also made the rest of the dashboards have homogeneous datasource-variable
names and selections, matching vmagent dashboard.
2023-01-05 14:59:56 +01:00
Roman Khavronenko
eb275be99d
dashboards: add VersionChange annotation (#3473)
The new annotation is hidden by default and suppose to show
component `short_version` label change on the panels.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-12 16:32:26 +01:00
Roman Khavronenko
5d835a6d64
dashboards: update vmalert dash (#3404)
The change list is the following:
* bump Grafana version to 9.2.6;
* replace old Graph panel with TimeSeries panel;
* add RemoteWrite section;
* allow configuring topK elements for some of the panels;
* Preer grouping by job instead of grouping by instance.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-29 19:26:31 +01:00
Timur Bakeyev
9ad578214e
Update datasource entries consistently contain type prometheus and uid $ds. (#3393)
Co-authored-by: Timour I. Bakeev <tbakeev@ripe.net>
2022-11-28 08:37:39 +01:00
Roman Khavronenko
b4410b1c63
Dashboards (#3120)
* dashboards/cluster: few updates

* apply consistent formatting across panels;
* make resource usage panels per component more detailed;
* add extra panels to vmselect for displaying
`vm_rows_read_per_query`, `vm_rows_scanned_per_query`,
`vm_rows_read_per_series` and `vm_series_read_per_query` metrics.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/single: few updates

* apply consistent formatting across panels;
* add extra panels to Performance for displaying
`vm_rows_read_per_query`, `vm_rows_scanned_per_query`,
`vm_rows_read_per_series` and `vm_series_read_per_query` metrics.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmagent: few updates

* apply consistent formatting across panels;
* add panels for showing number of samples ingested
or scraped;
* adapt resource usage panels for multiple selected jobs/instances;
* add adhoc variable;
* display vmagent's version in Stats.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmalert: few updates

* apply consistent formatting across panels;
* adapt resource usage panels for multiple selected jobs/instances;
* show vmalert version in Stats section.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-09-16 21:24:32 +02:00
Roman Khavronenko
e29b2b8444
Monitoring single (#2190)
* dashboards: plot cpu limits for vmagent, vmalert and vm-single dashboards

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* alerts: add `TooHighCPUUsage` alert for all VM components

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards: bump components version requirements

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-02-15 11:54:28 +02:00
Roman Khavronenko
eff940aa76
Vmalert metrics update (#1580)
* vmalert: remove `vmalert_execution_duration_seconds` metric

The summary for `vmalert_execution_duration_seconds` metric gives no additional
value comparing to `vmalert_iteration_duration_seconds` metric.

* vmalert: update config reload success metric properly

Previously, if there was unsuccessfull attempt to reload config and then
rollback to previous version - the metric remained set to 0.

* vmalert: add Grafana dashboard to overview application metrics

* docker: include vmalert target into list for scraping

* vmalert: extend notifier metrics with addr label

The change adds an `addr` label to metrics for alerts_sent and alerts_send_errors
to identify which exact address is having issues.
The according change was made to vmalert dashboard.

* vmalert: update documentation and docker environment for vmalert's dashboard

Mention Grafana's dashboard in vmalert's README in a new section #Monitoring.

Update docker-compose env to automatically add vmalert's dashboard.
Update docker-compose README with additional info about services.
2021-08-31 12:28:02 +03:00