diff --git a/app/vmalert/README.md b/app/vmalert/README.md
index 9e8ad625a..67a72b7d7 100644
--- a/app/vmalert/README.md
+++ b/app/vmalert/README.md
@@ -746,7 +746,7 @@ the latter will have higher priority.
 
 ### Notifier configuration file
 
-Notifier also supports configuration vai file specified with flag `notifier.config`:
+Notifier also supports configuration via file specified with flag `notifier.config`:
 ```
 ./bin/vmalert -rule=app/vmalert/config/testdata/rules.good.rules \
   -datasource.url=http://localhost:8428 \
diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md
index 183ec9311..785a89a82 100644
--- a/docs/CHANGELOG.md
+++ b/docs/CHANGELOG.md
@@ -13,6 +13,8 @@ sort: 15
   * Aggregate functions. For example, `sum(foo{a="b"}) by (c) + bar{c="d"}` is now optimized to `sum(foo{a="b",c="d"}) by (c) + bar{c="d"}`
 * FEATURE [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html): optimize joining with `*_info` labels. For example: `kube_pod_created{namespace="prod"} * on (uid) group_left(node) kube_pod_info` now automatically adds the needed filters on `uid` label to `kube_pod_info` before selecting series for the right side of `*` operation. This may save CPU, RAM and disk IO resources. See [this article](https://www.robustperception.io/exposing-the-software-version-to-prometheus) for details on `*_info` labels.
 * FEATURE: all: expose `process_cpu_cores_available` metric, which shows the number of CPU cores available to the app. The number can be fractional if the corresponding cgroup limit is set to a fractional value. This metric is useful for alerting on CPU saturation. For example, the following query alerts when the app uses more than 90% of CPU during the last 5 minutes: `rate(process_cpu_seconds_total[5m]) / process_cpu_cores_available > 0.9` . See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2107).
+* FEATURE: [vmalert](https://docs.victoriametrics.com/vmalert.html): add ability to configure notifiers (e.g. alertmanager) via a file in the way similar to Prometheus. See [these docs](https://docs.victoriametrics.com/vmalert.html#notifier-configuration-file), [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2127).
+* FEATURE: [vmalert](https://docs.victoriametrics.com/vmalert.html): add support for Consul service discovery for notifiers. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1947).
 * BUGFIX: return proper results from `highestMax()` function at [Graphite render API](https://docs.victoriametrics.com/#graphite-render-api-usage). Previously it was incorrectly returning timeseries with min peaks instead of max peaks.
 * BUGFIX: properly limit indexdb cache sizes. Previously they could exceed values set via `-memory.allowedPercent` and/or `-memory.allowedBytes` when `indexdb` contained many data parts. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007).
 
diff --git a/docs/vmalert.md b/docs/vmalert.md
index 5d9aa775f..4b1996a03 100644
--- a/docs/vmalert.md
+++ b/docs/vmalert.md
@@ -47,7 +47,8 @@ To start using `vmalert` you will need the following things:
 * list of rules - PromQL/MetricsQL expressions to execute;
 * datasource address - reachable MetricsQL endpoint to run queries against;
 * notifier address [optional] - reachable [Alert Manager](https://github.com/prometheus/alertmanager) instance for processing,
-aggregating alerts, and sending notifications.
+aggregating alerts, and sending notifications. Please note, notifier address also supports Consul Service Discovery via
+[config file](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/config.go).
 * remote write address [optional] - [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations)
 compatible storage to persist rules and alerts state info;
 * remote read address [optional] - MetricsQL compatible datasource to restore alerts state from.
@@ -591,6 +592,9 @@ The shortlist of configuration flags is the following:
   -notifier.basicAuth.password array
      Optional basic auth password for -notifier.url
      Supports an array of values separated by comma or specified via multiple flags.
+  -notifier.basicAuth.passwordFile array
+     Optional path to basic auth password file for -notifier.url
+     Supports an array of values separated by comma or specified via multiple flags.
   -notifier.basicAuth.username array
      Optional basic auth username for -notifier.url
      Supports an array of values separated by comma or specified via multiple flags.
@@ -693,8 +697,8 @@ The shortlist of configuration flags is the following:
      absolute path to all .yaml files in root.
      Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars.
      Supports an array of values separated by comma or specified via multiple flags.
-  -rule.configCheckInterval duration
-     Interval for checking for changes in '-rule' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes
+  -configCheckInterval duration
+     Interval for checking for changes in '-rule' or '-notifier.config' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes
   -rule.maxResolveDuration duration
      Limits the maximum duration for automatic alert expiration, which is by default equal to 3 evaluation intervals of the parent group.
   -rule.validateExpressions
@@ -707,6 +711,14 @@ The shortlist of configuration flags is the following:
      Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower
   -tlsKeyFile string
      Path to file with TLS key. Used only if -tls is set
+  -promscrape.consul.waitTime duration
+     Wait time used by Consul service discovery. Default value is used if not set
+  -promscrape.consulSDCheckInterval duration
+     Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s)
+  -promscrape.discovery.concurrency int
+     The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) (default 100)
+  -promscrape.discovery.concurrentWaitTime duration
+     The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s)
   -version
      Show VictoriaMetrics version
 ```
@@ -715,7 +727,7 @@ The shortlist of configuration flags is the following:
 `vmalert` supports "hot" config reload via the following methods:
 * send SIGHUP signal to `vmalert` process;
 * send GET request to `/-/reload` endpoint;
-* configure `-rule.configCheckInterval` flag for periodic reload
+* configure `-configCheckInterval` flag for periodic reload
 on config change.
 
 ### URL params
@@ -736,6 +748,88 @@ Please note, `params` are used only for executing rules expressions (requests to
 If there would be a conflict between URL params set in `datasource.url` flag and params in group definition
 the latter will have higher priority.
 
+### Notifier configuration file
+
+Notifier also supports configuration via file specified with flag `notifier.config`:
+```
+./bin/vmalert -rule=app/vmalert/config/testdata/rules.good.rules \
+  -datasource.url=http://localhost:8428 \
+  -notifier.config=app/vmalert/notifier/testdata/consul.good.yaml
+```
+
+The configuration file allows to configure static notifiers or discover notifiers via
+[Consul](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config).
+For example:
+```
+static_configs:
+  - targets:
+      - localhost:9093
+      - localhost:9095
+
+consul_sd_configs:
+  - server: localhost:8500
+    services:
+      - alertmanager
+```
+
+The list of configured or discovered Notifiers can be explored via [UI](#Web).
+
+The configuration file [specification](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/config.go)
+is the following:
+```
+# Per-target Notifier timeout when pushing alerts.
+[ timeout: | default = 10s ]
+
+# Prefix for the HTTP path alerts are pushed to.
+[ path_prefix: | default = / ]
+
+# Configures the protocol scheme used for requests.
+[ scheme: | default = http ]
+
+# Sets the `Authorization` header on every request with the
+# configured username and password.
+# password and password_file are mutually exclusive.
+basic_auth:
+  [ username: ]
+  [ password: ]
+  [ password_file: ]
+
+# Optional `Authorization` header configuration.
+authorization:
+  # Sets the authentication type.
+  [ type: | default: Bearer ]
+  # Sets the credentials. It is mutually exclusive with
+  # `credentials_file`.
+  [ credentials: ]
+  # Sets the credentials to the credentials read from the configured file.
+  # It is mutually exclusive with `credentials`.
+  [ credentials_file: ]
+
+# Configures the scrape request's TLS settings.
+# see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config
+tls_config:
+  [ ]
+
+# List of labeled statically configured Notifiers.
+static_configs:
+  targets:
+    [ - '' ]
+
+# List of Consul service discovery configurations.
+# See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config
+consul_sd_configs:
+  [ - ... ]
+
+# List of relabel configurations.
+# Supports the same relabeling features as the rest of VictoriaMetrics components.
+# See https://docs.victoriametrics.com/vmagent.html#relabeling
+relabel_configs:
+  [ - ... ]
+
+```
+
+The configuration file can be [hot-reloaded](#hot-config-reload).
+
 ## Contributing