alerts: add docs section for the full list of alerting rules

The change also updates all references to the alerting rules
in other docs.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
hagen1778 2023-08-03 10:21:18 +02:00
parent e311a7bf80
commit 1043fc1fd9
9 changed files with 29 additions and 14 deletions

View file

@ -1782,7 +1782,7 @@ created by community.
Graphs on the dashboards contain useful hints - hover the `i` icon in the top left corner of each graph to read it.
We recommend setting up [alerts](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts.yml)
We recommend setting up [alerts](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker#alerts)
via [vmalert](https://docs.victoriametrics.com/vmalert.html) or via Prometheus.
VictoriaMetrics exposes currently running queries and their execution times at `/api/v1/status/active_queries` page.

View file

@ -109,3 +109,22 @@ Grafana is provisioned by default with following entities:
* `VictoriaMetrics - vmalert` dashboard
Remember to pick `VictoriaMetrics - cluster` datasource when viewing `VictoriaMetrics - cluster` dashboard.
## Alerts
Below is a list of recommended alerting rules for running various VictoriaMetrics components in production.
Some of the thresholds in these rules are only recommendations and may need adjusting for a given installation.
The rule files are the following:
* [alerts-health.yml](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts-health.yml):
alerting rules related to all VictoriaMetrics components for tracking their "health" state;
* [alerts.yml](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts.yml):
alerting rules related to [single-server VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html) installation;
* [alerts-cluster.yml](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts-cluster.yml):
alerting rules related to [cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html);
* [alerts-vmagent.yml](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts-vmagent.yml):
alerting rules related to [vmagent](https://docs.victoriametrics.com/vmagent.html) component;
* [alerts-vmalert.yml](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts-vmalert.yml):
alerting rules related to [vmalert](https://docs.victoriametrics.com/vmalert.html) component.
Please also see [how to monitor](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#monitoring)
VictoriaMetrics installations.
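
For readers unfamiliar with the rule-file format, the fragment below is a minimal sketch of what one rule group in such a file might look like. It is not copied from any of the files listed above: the group name, alert name, labels and the `5m` threshold are all hypothetical and only show where a threshold would be adjusted. It assumes the Prometheus-compatible rules format, which both vmalert and Prometheus read.

```yaml
# Hypothetical rule group for illustration only; not taken from the repository files.
groups:
  - name: example-health            # assumed group name
    rules:
      - alert: ExampleTargetDown    # assumed alert name
        # `up` is written by the scraper for every target; 0 means the last scrape failed.
        expr: up == 0
        # Threshold: how long the condition must hold before the alert fires.
        # This is the kind of value that may need adjusting per installation.
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "{{ $labels.instance }} of job {{ $labels.job }} is down"
```

A shorter `for` makes the alert fire faster at the cost of more noise from brief scrape failures. A longer one does the opposite.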

View file

@ -2139,7 +2139,7 @@ in front of VictoriaMetrics. [Contact us](mailto:sales@victoriametrics.com) if y
Released at 2021-01-13
* FEATURE: provide a sample list of alerting rules for VictoriaMetrics components. It is available [here](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts.yml).
* FEATURE: provide a sample list of alerting rules for VictoriaMetrics components. It is available [here](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker#alerts).
* FEATURE: disable final merge for data for the previous month at the beginning of new month, since it may result in high disk IO and CPU usage. Final merge can be enabled by setting `-finalMergeDelay` command-line flag to positive duration.
* FEATURE: add `tfirst_over_time(m[d])` and `tlast_over_time(m[d])` functions to [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) for returning timestamps for the first and the last data point in `m` over `d` duration.
* FEATURE: add ability to pass multiple labels to `sort_by_label()` and `sort_by_label_desc()` functions. See <https://github.com/VictoriaMetrics/VictoriaMetrics/issues/992> .

View file

@ -294,7 +294,7 @@ or Prometheus to scrape `/metrics` pages from all the cluster components, so the
with [the official Grafana dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11176-victoriametrics-cluster/)
or [an alternative dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11831). Graphs on these dashboards contain useful hints - hover the `i` icon at the top left corner of each graph in order to read it.
It is recommended setting up alerts in [vmalert](https://docs.victoriametrics.com/vmalert.html) or in Prometheus from [this config](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/deployment/docker/alerts.yml).
It is recommended to set up alerts in [vmalert](https://docs.victoriametrics.com/vmalert.html) or in Prometheus from [this list](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker#alerts).
See more details in the article [VictoriaMetrics Monitoring](https://victoriametrics.com/blog/victoriametrics-monitoring/).
## Cardinality limiter

View file

@ -145,8 +145,7 @@ VictoriaMetric team prepared a list of [Grafana dashboards](https://grafana.com/
for the main components. Each dashboard contains a lot of useful information and tips. It is recommended
to have these dashboards installed and up to date.
The list of alerts for [single](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts.yml)
and [cluster](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/deployment/docker/alerts.yml)
Using the [recommended alerting rules](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker#alerts) for the single-node and cluster
versions would also help to identify and notify about issues with the system.
The rule of thumb is to have a separate installation of VictoriaMetrics or any other monitoring system

View file

@ -1785,7 +1785,7 @@ created by community.
Graphs on the dashboards contain useful hints - hover the `i` icon in the top left corner of each graph to read it.
We recommend setting up [alerts](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts.yml)
We recommend setting up [alerts](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker#alerts)
via [vmalert](https://docs.victoriametrics.com/vmalert.html) or via Prometheus.
VictoriaMetrics exposes currently running queries and their execution times at `/api/v1/status/active_queries` page.

View file

@ -1793,7 +1793,7 @@ created by community.
Graphs on the dashboards contain useful hints - hover the `i` icon in the top left corner of each graph to read it.
We recommend setting up [alerts](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts.yml)
We recommend setting up [alerts](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker#alerts)
via [vmalert](https://docs.victoriametrics.com/vmalert.html) or via Prometheus.
VictoriaMetrics exposes currently running queries and their execution times at `/api/v1/status/active_queries` page.

View file

@ -414,9 +414,8 @@ would help identify and prevent most of the issues listed above.
[Grafana dashboards](https://grafana.com/orgs/victoriametrics/dashboards) contain panels reflecting the
health state, resource usage and other specific metrics for VictoriaMetrics components.
Alerting rules for [single-node](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts.yml)
and [cluster](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/deployment/docker/alerts.yml) versions
of VictoriaMetrics will notify about issues with Victoriametrics components and provide recommendations for how to solve them.
The list of [recommended alerting rules](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker#alerts)
for VictoriaMetrics components will notify about issues and provide recommendations for how to solve them.
Internally, we heavily rely both on dashboards and alerts, and constantly improve them.
It is important to stay up to date with such changes.

View file

@ -75,10 +75,8 @@ You can set up vmalert in each Ground control region that evaluates recording an
For alert deduplication, please use [cluster mode in Alertmanager](https://prometheus.io/docs/alerting/latest/alertmanager/#high-availability).
We also recommend adopting these alerts:
* VictoriaMetrics Single - [https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts.yml](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts.yml)
* VictoriaMetrics Cluster - [https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/deployment/docker/alerts.yml](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/deployment/docker/alerts.yml)
We also recommend adopting the list of [alerting rules](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker#alerts)
for VictoriaMetrics components.
### Monitoring