The error appears when executing the annotations query against a Prometheus
backend because the query itself doesn't specify a look-behind window (omitting
it is allowed in the VictoriaMetrics query engine).
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6309
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* use docs.victoriametrics.com instead of github docs
* add links to common terms used in VictoriaMetrics
Signed-off-by: hagen1778 <roman@victoriametrics.com>
It should help identify cases when too much CPU is spent on garbage collection,
and advise users on how this can be addressed.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
The panel `Errors rate to Alertmanager` had a `group` label filter
applied to the expression, while the metric `vmalert_alerts_send_errors_total`
doesn't have that label. As a result, the panel was always empty.
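
The empty result follows from how label matchers behave: a series that lacks
a label is treated as having an empty value for it, so an equality matcher
with a non-empty value can never match. A minimal Python sketch of that rule;
the `matches` helper and the sample label values are hypothetical, and only
equality matchers are modelled:

```python
def matches(series_labels, matchers):
    """Return True if every equality matcher is satisfied.

    A label missing from the series is treated as the empty string.
    """
    return all(series_labels.get(name, "") == value
               for name, value in matchers.items())

# Hypothetical series: the metric has a `job` label but no `group` label.
series = {"__name__": "vmalert_alerts_send_errors_total", "job": "vmalert"}

# Filter as the panel had it (with `group`) and as it should be (without).
with_group = matches(series, {"job": "vmalert", "group": "my-group"})
without_group = matches(series, {"job": "vmalert"})
print(with_group, without_group)  # False True
```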
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/cluster: use `quantile` since `median` isn't supported by PromQL
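
The replacement is safe because a 0.5 quantile gives the same value as a
median. A minimal Python sketch, assuming the quantile is computed by linear
interpolation between neighbouring ranks; the `quantile` helper and the sample
values are hypothetical, not taken from the dashboards:

```python
import math
import statistics

def quantile(q, values):
    """Quantile by linear interpolation between ranks (assumed rule)."""
    values = sorted(values)
    n = len(values)
    rank = q * (n - 1)
    lower = int(math.floor(rank))
    upper = min(n - 1, lower + 1)
    weight = rank - lower
    return values[lower] * (1 - weight) + values[upper] * weight

samples_odd = [3.0, 1.0, 2.0]
samples_even = [4.0, 1.0, 3.0, 2.0]

# The 0.5 quantile agrees with the median for odd and even sample counts.
print(quantile(0.5, samples_odd), statistics.median(samples_odd))    # 2.0 2.0
print(quantile(0.5, samples_even), statistics.median(samples_even))  # 2.5 2.5
```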
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/*: add `restarts` annotation to show when there were restarts
The cluster's annotation query is aggregated `by job`,
while vmagent/vmalert are aggregated `by job, instance`.
This is because the cluster dashboard can contain too many instances
and the annotation could become too noisy.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/*: support instance filter in Version annotation
Signed-off-by: hagen1778 <roman@victoriametrics.com>
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Got "Failed to upgrade legacy queries Datasource $ds was not found" in
Grafana on operator dashboard.
Its datasource variable was incorrectly named `datasource`.
Also made the rest of the dashboards have homogeneous datasource-variable
names and selections, matching vmagent dashboard.
The new annotation is hidden by default and is supposed to show
changes of the component's `short_version` label on the panels.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
The change list is the following:
* bump Grafana version to 9.2.6;
* replace old Graph panel with TimeSeries panel;
* add RemoteWrite section;
* allow configuring topK elements for some of the panels;
* prefer grouping by job instead of grouping by instance.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/cluster: few updates
* apply consistent formatting across panels;
* make resource usage panels per component more detailed;
* add extra panels to vmselect for displaying
`vm_rows_read_per_query`, `vm_rows_scanned_per_query`,
`vm_rows_read_per_series` and `vm_series_read_per_query` metrics.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/single: few updates
* apply consistent formatting across panels;
* add extra panels to Performance for displaying
`vm_rows_read_per_query`, `vm_rows_scanned_per_query`,
`vm_rows_read_per_series` and `vm_series_read_per_query` metrics.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/vmagent: few updates
* apply consistent formatting across panels;
* add panels for showing number of samples ingested
or scraped;
* adapt resource usage panels for multiple selected jobs/instances;
* add adhoc variable;
* display vmagent's version in Stats.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards/vmalert: few updates
* apply consistent formatting across panels;
* adapt resource usage panels for multiple selected jobs/instances;
* show vmalert version in Stats section.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards: plot cpu limits for vmagent, vmalert and vm-single dashboards
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* alerts: add `TooHighCPUUsage` alert for all VM components
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards: bump components version requirements
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* vmalert: remove `vmalert_execution_duration_seconds` metric
The summary for the `vmalert_execution_duration_seconds` metric gives no additional
value compared to the `vmalert_iteration_duration_seconds` metric.
* vmalert: update config reload success metric properly
Previously, if there was an unsuccessful attempt to reload the config followed
by a rollback to the previous version, the metric remained set to 0.
* vmalert: add Grafana dashboard to overview application metrics
* docker: include vmalert target into list for scraping
* vmalert: extend notifier metrics with addr label
The change adds an `addr` label to the alerts_sent and alerts_send_errors metrics
to identify exactly which address is having issues.
The corresponding change was made to the vmalert dashboard.
* vmalert: update documentation and docker environment for vmalert's dashboard
Mention Grafana's dashboard in vmalert's README in a new section #Monitoring.
Update docker-compose env to automatically add vmalert's dashboard.
Update docker-compose README with additional info about services.