Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files

Aliaksandr Valialkin 2022-05-20 14:34:02 +03:00
commit 5a60387eea
No known key found for this signature in database
GPG key ID: A72BEC6CD3D0DED1
149 changed files with 10528 additions and 1108 deletions

View file

@@ -283,7 +283,7 @@ golangci-lint: install-golangci-lint
 	golangci-lint run --exclude '(SA4003|SA1019|SA5011):' -D errcheck -D structcheck --timeout 2m
 
 install-golangci-lint:
-	which golangci-lint || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v1.45.1
+	which golangci-lint || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v1.46.1
 
 install-wwhrd:
 	which wwhrd || GO111MODULE=off go get github.com/frapposelli/wwhrd

View file

@@ -1055,6 +1055,25 @@ It is recommended leaving the following amounts of spare resources:
 * 50% of spare CPU for reducing the probability of slowdowns during temporary spikes in workload.
 * At least 30% of free storage space at the directory pointed by `-storageDataPath` command-line flag. See also `-storage.minFreeDiskSpaceBytes` command-line flag description [here](#list-of-command-line-flags).
+
+See also [resource usage limits docs](#resource-usage-limits).
+
+## Resource usage limits
+
+By default VictoriaMetrics is tuned for optimal resource usage under typical workloads. Some workloads may need fine-grained resource usage limits. In these cases the following command-line flags may be useful:
+
+- `-memory.allowedPercent` and `-memory.allowedBytes` limit the amounts of memory, which may be used for various internal caches at VictoriaMetrics. Note that VictoriaMetrics may use more memory, since these flags don't limit additional memory, which may be needed on a per-query basis.
+- `-search.maxUniqueTimeseries` limits the number of unique time series a single query can find and process. VictoriaMetrics keeps in memory some metainformation about the time series located by each query and spends some CPU time for processing the found time series. This means that the maximum memory and CPU usage a single query can take is proportional to `-search.maxUniqueTimeseries`.
+- `-search.maxQueryDuration` limits the duration of a single query. If the query takes longer than the given duration, then it is canceled. This allows saving CPU and RAM when executing unexpected heavy queries.
+- `-search.maxConcurrentRequests` limits the number of concurrent requests VictoriaMetrics can process. A bigger number of concurrent requests usually means bigger memory usage. For example, if a single query needs 100 MiB of additional memory during its execution, then 100 concurrent queries may need `100 * 100 MiB = 10 GiB` of additional memory. So it is better to limit the number of concurrent queries, while suspending additional incoming queries if the concurrency limit is reached. VictoriaMetrics provides the `-search.maxQueueDuration` command-line flag for limiting the max wait time for suspended queries.
+- `-search.maxSamplesPerSeries` limits the number of raw samples the query can process per each time series. VictoriaMetrics sequentially processes raw samples per each found time series during the query. It unpacks raw samples on the selected time range per each time series into memory and then applies the given [rollup function](https://docs.victoriametrics.com/MetricsQL.html#rollup-functions). The `-search.maxSamplesPerSeries` command-line flag allows limiting memory usage in the case when the query is executed on a time range, which contains hundreds of millions of raw samples per each located time series.
+- `-search.maxSamplesPerQuery` limits the number of raw samples a single query can process. This allows limiting CPU usage for heavy queries.
+- `-search.maxSeries` limits the number of time series, which may be returned from [/api/v1/series](https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers). This endpoint is used mostly by Grafana for auto-completion of metric names, label names and label values. Queries to this endpoint may take big amounts of CPU time and memory when the database contains a big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set `-search.maxSeries` to a quite low value in order to limit CPU and memory usage.
+- `-search.maxTagKeys` limits the number of items, which may be returned from [/api/v1/labels](https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names). This endpoint is used mostly by Grafana for auto-completion of label names. Queries to this endpoint may take big amounts of CPU time and memory when the database contains a big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set `-search.maxTagKeys` to a quite low value in order to limit CPU and memory usage.
+- `-search.maxTagValues` limits the number of items, which may be returned from [/api/v1/label/.../values](https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values). This endpoint is used mostly by Grafana for auto-completion of label values. Queries to this endpoint may take big amounts of CPU time and memory when the database contains a big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set `-search.maxTagValues` to a quite low value in order to limit CPU and memory usage.
+
+See also [capacity planning docs](#capacity-planning).
 
 ## High availability
 
 * Install multiple VictoriaMetrics instances in distinct datacenters (availability zones).
@@ -1682,7 +1701,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
   -influxDBLabel string
     	Default label for the DB name sent over '?db={db_name}' query parameter (default "db")
   -influxListenAddr string
-    	TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write
+    	TCP and UDP address to listen for InfluxDB line protocol data. Usually :8089 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write
   -influxMeasurementFieldSeparator string
     	Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_")
   -influxSkipMeasurement
@@ -1745,7 +1764,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
   -promscrape.cluster.membersCount int
    	The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets
   -promscrape.cluster.replicationFactor int
-    	The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 2, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1)
+    	The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 1, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1)
   -promscrape.config string
    	Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details
   -promscrape.config.dryRun
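
The queueing semantics implied by the new `-search.maxConcurrentRequests` and `-search.maxQueueDuration` flags boil down to a bounded worker pool with a bounded wait: a request either grabs a free execution slot or waits for one, but only up to the queue duration. A minimal Go sketch of these semantics (an illustration only, not VictoriaMetrics' actual implementation):

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// limiter mimics -search.maxConcurrentRequests plus -search.maxQueueDuration:
// up to cap(slots) requests run concurrently; an extra request waits for a
// free slot, but no longer than maxQueueDuration.
type limiter struct {
	slots            chan struct{}
	maxQueueDuration time.Duration
}

func newLimiter(maxConcurrent int, maxQueueDuration time.Duration) *limiter {
	return &limiter{
		slots:            make(chan struct{}, maxConcurrent),
		maxQueueDuration: maxQueueDuration,
	}
}

func (l *limiter) do(f func()) error {
	select {
	case l.slots <- struct{}{}: // a slot is free - run now
	case <-time.After(l.maxQueueDuration): // waited too long in the queue
		return errors.New("too many concurrent requests")
	}
	defer func() { <-l.slots }()
	f()
	return nil
}

func main() {
	l := newLimiter(2, 100*time.Millisecond)
	for i := 0; i < 4; i++ {
		i := i
		go func() {
			err := l.do(func() { time.Sleep(time.Second) })
			fmt.Println("request", i, "err:", err)
		}()
	}
	time.Sleep(2 * time.Second) // let the goroutines finish
}
```

With two slots and four one-second requests, the first two run immediately while the other two give up after 100ms, mirroring how queries beyond the concurrency limit are rejected once the max queue duration expires.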

View file

@@ -765,7 +765,7 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
   -influxDBLabel string
    	Default label for the DB name sent over '?db={db_name}' query parameter (default "db")
   -influxListenAddr string
-    	TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<vmagent>:8429/write
+    	TCP and UDP address to listen for InfluxDB line protocol data. Usually :8089 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<vmagent>:8429/write
   -influxMeasurementFieldSeparator string
    	Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_")
   -influxSkipMeasurement
@@ -846,7 +846,7 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
   -promscrape.cluster.membersCount int
    	The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets
   -promscrape.cluster.replicationFactor int
-    	The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 2, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1)
+    	The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 1, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1)
   -promscrape.config string
    	Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details
   -promscrape.config.dryRun

View file

@@ -43,7 +43,7 @@ var (
 	httpListenAddr = flag.String("httpListenAddr", ":8429", "TCP address to listen for http connections. "+
 		"Set this flag to empty value in order to disable listening on any port. This mode may be useful for running multiple vmagent instances on the same server. "+
 		"Note that /targets and /metrics pages aren't available if -httpListenAddr=''")
-	influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. "+
+	influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for InfluxDB line protocol data. Usually :8089 must be set. Doesn't work if empty. "+
 		"This flag isn't needed when ingesting data over HTTP - just send it to http://<vmagent>:8429/write")
 	graphiteListenAddr = flag.String("graphiteListenAddr", "", "TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty")
 	opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpentTSDB metrics. "+

View file

@@ -70,6 +70,8 @@ var (
 		"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
 	awsAccessKey = flagutil.NewArray("remoteWrite.aws.accessKey", "Optional AWS AccessKey to use for -remoteWrite.url if -remoteWrite.aws.useSigv4 is set. "+
 		"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
+	awsService = flagutil.NewArray("remoteWrite.aws.serice", "Optional AWS Service to use for -remoteWrite.url if -remoteWrite.aws.useSigv4 is set. "+
+		"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url. Defaults to \"aps\".")
 	awsSecretKey = flagutil.NewArray("remoteWrite.aws.secretKey", "Optional AWS SecretKey to use for -remoteWrite.url if -remoteWrite.aws.useSigv4 is set. "+
 		"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
 )
@@ -232,7 +234,8 @@ func getAWSAPIConfig(argIdx int) (*awsapi.Config, error) {
 	roleARN := awsRoleARN.GetOptionalArg(argIdx)
 	accessKey := awsAccessKey.GetOptionalArg(argIdx)
 	secretKey := awsSecretKey.GetOptionalArg(argIdx)
-	cfg, err := awsapi.NewConfig(region, roleARN, accessKey, secretKey)
+	service := awsService.GetOptionalArg(argIdx)
+	cfg, err := awsapi.NewConfig(region, roleARN, accessKey, secretKey, service)
 	if err != nil {
 		return nil, err
 	}
@@ -307,7 +310,7 @@ again:
 		req.Header.Set("Authorization", ah)
 	}
 	if c.awsCfg != nil {
-		if err := c.awsCfg.SignRequest(req, "aps", sigv4Hash); err != nil {
+		if err := c.awsCfg.SignRequest(req, sigv4Hash); err != nil {
 			// there is no need in retry, request will be rejected by client.Do and retried by code below
 			logger.Warnf("cannot sign remoteWrite request with AWS sigv4: %s", err)
 		}
 	}
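
The net effect of the two hunks above is that the sigv4 signing service is no longer hard-coded to "aps" at the call site but carried inside the AWS config, with "aps" remaining the documented default. A sketch of how the defaulting inside `awsapi.NewConfig` could look (an assumption for illustration; the real implementation may differ):

```go
package awsapi

// Config is a trimmed-down stand-in for the real awsapi.Config;
// only the fields relevant to service defaulting are shown.
type Config struct {
	region    string
	roleARN   string
	accessKey string
	secretKey string
	service   string
}

// NewConfig accepts the new service argument and falls back to "aps"
// (Amazon Managed Service for Prometheus) when it is empty, matching
// the flag's documented default.
func NewConfig(region, roleARN, accessKey, secretKey, service string) (*Config, error) {
	if service == "" {
		service = "aps"
	}
	return &Config{
		region:    region,
		roleARN:   roleARN,
		accessKey: accessKey,
		secretKey: secretKey,
		service:   service,
	}, nil
}
```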

View file

@@ -62,13 +62,14 @@ publish-vmalert:
 
 test-vmalert:
 	go test -v -race -cover ./app/vmalert -loggerLevel=ERROR
+	go test -v -race -cover ./app/vmalert/templates
 	go test -v -race -cover ./app/vmalert/datasource
 	go test -v -race -cover ./app/vmalert/notifier
 	go test -v -race -cover ./app/vmalert/config
 	go test -v -race -cover ./app/vmalert/remotewrite
 
 run-vmalert: vmalert
-	./bin/vmalert -rule=app/vmalert/config/testdata/rules2-good.rules \
+	./bin/vmalert -rule=app/vmalert/config/testdata/rules/rules2-good.rules \
 		-datasource.url=http://localhost:8428 \
 		-notifier.url=http://localhost:9093 \
 		-notifier.url=http://127.0.0.1:9093 \
@@ -77,7 +78,7 @@ run-vmalert: vmalert
 		-external.label=cluster=east-1 \
 		-external.label=replica=a \
 		-evaluationInterval=3s \
-		-rule.configCheckInterval=10s
+		-configCheckInterval=10s
 
 run-vmalert-sd: vmalert
 	./bin/vmalert -rule=app/vmalert/config/testdata/rules2-good.rules \

View file

@@ -13,23 +13,24 @@ implementation and aims to be compatible with its syntax.
 * Integration with [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics) TSDB;
 * VictoriaMetrics [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html)
   support and expressions validation;
 * Prometheus [alerting rules definition format](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/#defining-alerting-rules)
   support;
 * Integration with [Alertmanager](https://github.com/prometheus/alertmanager) starting from [Alertmanager v0.16.0-aplha](https://github.com/prometheus/alertmanager/releases/tag/v0.16.0-alpha.0);
 * Keeps the alerts [state on restarts](#alerts-state-on-restarts);
 * Graphite datasource can be used for alerting and recording rules. See [these docs](#graphite);
 * Recording and Alerting rules backfilling (aka `replay`). See [these docs](#rules-backfilling);
 * Lightweight without extra dependencies.
+* Supports [reusable templates](#reusable-templates) for annotations.
 
 ## Limitations
 
 * `vmalert` execute queries against remote datasource which has reliability risks because of the network.
   It is recommended to configure alerts thresholds and rules expressions with the understanding that network
   requests may fail;
 * by default, rules execution is sequential within one group, but persistence of execution results to remote
   storage is asynchronous. Hence, user shouldn't rely on chaining of recording rules when result of previous
   recording rule is reused in the next one;
 
 ## QuickStart
@@ -48,8 +49,8 @@ To start using `vmalert` you will need the following things:
 * list of rules - PromQL/MetricsQL expressions to execute;
 * datasource address - reachable MetricsQL endpoint to run queries against;
 * notifier address [optional] - reachable [Alert Manager](https://github.com/prometheus/alertmanager) instance for processing,
   aggregating alerts, and sending notifications. Please note, notifier address also supports Consul and DNS Service Discovery via
   [config file](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/config.go).
 * remote write address [optional] - [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations)
   compatible storage to persist rules and alerts state info;
 * remote read address [optional] - MetricsQL compatible datasource to restore alerts state from.
@@ -146,12 +147,12 @@ expression and then act according to the Rule type.
 There are two types of Rules:
 
 * [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) -
   Alerting rules allow defining alert conditions via `expr` field and to send notifications to
   [Alertmanager](https://github.com/prometheus/alertmanager) if execution result is not empty.
 * [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) -
   Recording rules allow defining `expr` which result will be then backfilled to configured
   `-remoteWrite.url`. Recording rules are used to precompute frequently needed or computationally
   expensive expressions and save their result as a new set of time series.
 
 `vmalert` forbids defining duplicates - rules with the same combination of name, expression, and labels
 within one group.
@@ -184,10 +185,52 @@ annotations:
   [ <labelname>: <tmpl_string> ]
 ```
-It is allowed to use [Go templating](https://golang.org/pkg/text/template/) in annotations
-to format data, iterate over it or execute expressions.
+It is allowed to use [Go templating](https://golang.org/pkg/text/template/) in annotations to format data, iterate over it or execute expressions.
 Additionally, `vmalert` provides some extra templating functions
-listed [here](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/template_func.go).
+listed [here](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/template_func.go) and [reusable templates](#reusable-templates).
+
+#### Reusable templates
+
+Like in Alertmanager you can define [reusable templates](https://prometheus.io/docs/prometheus/latest/configuration/template_examples/#defining-reusable-templates)
+to share same templates across annotations. Just define the templates in a file and
+set the path via `-rule.templates` flag.
+
+For example, template `grafana.filter` can be defined as following:
+
+{% raw %}
+```
+{{ define "grafana.filter" -}}
+  {{- $labels := .arg0 -}}
+  {{- range $name, $label := . -}}
+    {{- if (ne $name "arg0") -}}
+      {{- ( or (index $labels $label) "All" ) | printf "&var-%s=%s" $label -}}
+    {{- end -}}
+  {{- end -}}
+{{- end -}}
+```
+{% endraw %}
+
+And then used in annotations:
+
+{% raw %}
+```yaml
+groups:
+  - name: AlertGroupName
+    rules:
+      - alert: AlertName
+        expr: any_metric > 100
+        for: 30s
+        labels:
+          alertname: 'Any metric is too high'
+          severity: 'warning'
+        annotations:
+          dashboard: '{{ $externalURL }}/d/dashboard?orgId=1{{ template "grafana.filter" (args .CommonLabels "account_id" "any_label") }}'
+```
+{% endraw %}
+
+The `-rule.templates` flag supports wildcards so multiple files with templates can be loaded.
+The content of `-rule.templates` can be also [hot reloaded](#hot-config-reload).
 
 #### Recording rules
 
@@ -215,11 +258,11 @@ For recording rules to work `-remoteWrite.url` must be specified.
 the process alerts state will be lost. To avoid this situation, `vmalert` should be configured via the following flags:
 
 * `-remoteWrite.url` - URL to VictoriaMetrics (Single) or vminsert (Cluster). `vmalert` will persist alerts state
   into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
   These are regular time series and may be queried from VM just as any other time series.
   The state is stored to the configured address on every rule evaluation.
 * `-remoteRead.url` - URL to VictoriaMetrics (Single) or vmselect (Cluster). `vmalert` will try to restore alerts state
   from configured address by querying time series with name `ALERTS_FOR_STATE`.
 
 Both flags are required for proper state restoration. Restore process may fail if time series are missing
 in configured `-remoteRead.url`, weren't updated in the last `1h` (controlled by `-remoteRead.lookback`)
@@ -275,7 +318,7 @@ for different scenarios.
 Please note, not all flags in examples are required:
 
 * `-remoteWrite.url` and `-remoteRead.url` are optional and are needed only if
   you have recording rules or want to store [alerts state](#alerts-state-on-restarts) on `vmalert` restarts;
 * `-notifier.url` is optional and is needed only if you have alerting rules.
 
 #### Single-node VictoriaMetrics
@@ -341,6 +384,7 @@ Alertmanagers.
 
 To avoid recording rules results and alerts state duplication in VictoriaMetrics server
 don't forget to configure [deduplication](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#deduplication).
+The recommended value for `-dedup.minScrapeInterval` must be greater than or equal to vmalert's `evaluation_interval`.
 
 Alertmanager will automatically deduplicate alerts with identical labels, so ensure that
 all `vmalert`s are having the same config.
@@ -384,7 +428,7 @@ See also [downsampling docs](https://docs.victoriametrics.com/#downsampling).
 * `http://<vmalert-addr>/api/v1/rules` - list of all loaded groups and rules;
 * `http://<vmalert-addr>/api/v1/alerts` - list of all active alerts;
 * `http://<vmalert-addr>/api/v1/<groupID>/<alertID>/status` - get alert status by ID.
   Used as alert source in AlertManager.
 * `http://<vmalert-addr>/metrics` - application metrics.
 * `http://<vmalert-addr>/-/reload` - hot configuration reload.
@@ -473,17 +517,17 @@ Execute the query against storage which was used for `-remoteWrite.url` during t
 There are following non-required `replay` flags:
 
 * `-replay.maxDatapointsPerQuery` - the max number of data points expected to receive in one request.
   In two words, it affects the max time range for every `/query_range` request. The higher the value,
   the fewer requests will be issued during `replay`.
 * `-replay.ruleRetryAttempts` - when datasource fails to respond vmalert will make this number of retries
   per rule before giving up.
 * `-replay.rulesDelay` - delay between sequential rules execution. Important in cases if there are chaining
   (rules which depend on each other) rules. It is expected, that remote storage will be able to persist
   previously accepted data during the delay, so data will be available for the subsequent queries.
   Keep it equal or bigger than `-remoteWrite.flushInterval`.
 * `replay.disableProgressBar` - whether to disable progress bar which shows progress work.
   Progress bar may generate a lot of log records, which is not formatted as standard VictoriaMetrics logger.
   It could break logs parsing by external system and generate additional load on it.
 
 See full description for these flags in `./vmalert --help`.
@@ -792,6 +836,11 @@ The shortlist of configuration flags is the following:
     absolute path to all .yaml files in root.
     Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars.
     Supports an array of values separated by comma or specified via multiple flags.
+  -rule.templates
+    Path or glob pattern to location with go template definitions for rules annotations templating. Flag can be specified multiple times.
+    Examples:
+      -rule.templates="/path/to/file". Path to a single file with go templates
+      -rule.templates="dir/*.tpl" -rule.templates="/*.tpl". Relative path to all .tpl files in "dir" folder, absolute path to all .tpl files in root.
   -rule.configCheckInterval duration
     Interval for checking for changes in '-rule' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes. DEPRECATED - see '-configCheckInterval' instead
   -rule.maxResolveDuration duration
@@ -822,7 +871,7 @@ The shortlist of configuration flags is the following:
 * send SIGHUP signal to `vmalert` process;
 * send GET request to `/-/reload` endpoint;
 * configure `-configCheckInterval` flag for periodic reload
   on config change.
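
What the glob-based `-rule.templates` loading with hot reload amounts to can be shown with a short Go sketch built on the standard `text/template` package; `loadTemplates` here is a hypothetical helper for illustration, not vmalert's actual loader:

```go
package main

import (
	"fmt"
	"path/filepath"
	"text/template"
)

// loadTemplates resolves each -rule.templates pattern and parses every
// matching file into one template namespace, so a template defined in one
// file can be referenced from any annotation. Re-running this function on
// SIGHUP or on a timer would give the hot-reload behavior described above.
func loadTemplates(patterns []string) (*template.Template, error) {
	tmpl := template.New("root")
	for _, p := range patterns {
		files, err := filepath.Glob(p)
		if err != nil {
			return nil, err
		}
		if len(files) == 0 {
			continue // pattern matched no files; nothing to parse
		}
		if _, err := tmpl.ParseFiles(files...); err != nil {
			return nil, err
		}
	}
	return tmpl, nil
}

func main() {
	tmpl, err := loadTemplates([]string{"dir/*.tpl", "/*.tpl"})
	if err != nil {
		panic(err)
	}
	// Prints every template registered via {{ define "name" }} blocks.
	fmt.Println(tmpl.DefinedTemplates())
}
```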
 
 ### URL params

View file

@@ -12,6 +12,7 @@ import (
 	"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
 	"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
 	"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
+	"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
 	"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
@@ -152,7 +153,7 @@ type labelSet struct {
 
 // toLabels converts labels from given Metric
 // to labelSet which contains original and processed labels.
-func (ar *AlertingRule) toLabels(m datasource.Metric, qFn notifier.QueryFn) (*labelSet, error) {
+func (ar *AlertingRule) toLabels(m datasource.Metric, qFn templates.QueryFn) (*labelSet, error) {
 	ls := &labelSet{
 		origin:    make(map[string]string, len(m.Labels)),
 		processed: make(map[string]string),
@@ -323,7 +324,7 @@ func (ar *AlertingRule) Exec(ctx context.Context, ts time.Time) ([]prompbmarshal
 			}
 			continue
 		}
-		if a.State == notifier.StatePending && time.Since(a.ActiveAt) >= ar.For {
+		if a.State == notifier.StatePending && ts.Sub(a.ActiveAt) >= ar.For {
 			a.State = notifier.StateFiring
 			a.Start = ts
 			alertsFired.Inc()
@@ -382,7 +383,7 @@ func hash(labels map[string]string) uint64 {
 	return hash.Sum64()
 }
 
-func (ar *AlertingRule) newAlert(m datasource.Metric, ls *labelSet, start time.Time, qFn notifier.QueryFn) (*notifier.Alert, error) {
+func (ar *AlertingRule) newAlert(m datasource.Metric, ls *labelSet, start time.Time, qFn templates.QueryFn) (*notifier.Alert, error) {
 	var err error
 	if ls == nil {
 		ls, err = ar.toLabels(m, qFn)

View file

@@ -10,18 +10,19 @@ import (
 	"gopkg.in/yaml.v2"
 
 	"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
-	"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
+	"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
 )
 
 func TestMain(m *testing.M) {
-	u, _ := url.Parse("https://victoriametrics.com/path")
-	notifier.InitTemplateFunc(u)
+	if err := templates.Load([]string{"testdata/templates/*good.tmpl"}, true); err != nil {
+		os.Exit(1)
+	}
 	os.Exit(m.Run())
 }
 
 func TestParseGood(t *testing.T) {
-	if _, err := Parse([]string{"testdata/*good.rules", "testdata/dir/*good.*"}, true, true); err != nil {
+	if _, err := Parse([]string{"testdata/rules/*good.rules", "testdata/dir/*good.*"}, true, true); err != nil {
 		t.Errorf("error parsing files %s", err)
 	}
 }
@@ -32,7 +33,7 @@ func TestParseBad(t *testing.T) {
 		expErr string
 	}{
 		{
-			[]string{"testdata/rules0-bad.rules"},
+			[]string{"testdata/rules/rules0-bad.rules"},
 			"unexpected token",
 		},
 		{
@@ -56,7 +57,7 @@ func TestParseBad(t *testing.T) {
 			"either `record` or `alert` must be set",
 		},
 		{
-			[]string{"testdata/rules1-bad.rules"},
+			[]string{"testdata/rules/rules1-bad.rules"},
 			"bad graphite expr",
 		},
 	}

View file

@@ -2,8 +2,6 @@ groups:
   - name: TestGroup
     interval: 2s
     concurrency: 2
-    extra_filter_labels: # deprecated param, use `params` instead
-      job: victoriametrics
     params:
       denyPartialResponse: ["true"]
       extra_label: ["env=dev"]
@@ -50,3 +48,11 @@ groups:
           sum(code:requests:rate5m{code="200"})
             /
           sum(code:requests:rate5m)
+      - record: code:requests:slo
+        labels:
+          recording: true
+        expr: 0.95
+      - record: time:current
+        labels:
+          recording: true
+        expr: time()

View file

@@ -0,0 +1,3 @@
+{{ define "template0" }}
+Visit {{ externalURL }}
+{{ end }}

View file

@@ -0,0 +1,3 @@
+{{ define "template1" }}
+{{ 1048576 | humanize1024 }}
+{{ end }}

View file

@@ -0,0 +1,3 @@
+{{ define "template2" }}
+{{ 1048576 | humanize1024 }}
+{{ end }}

View file

@@ -0,0 +1,3 @@
+{{ define "template3" }}
+{{ printf "%s to %s!" "welcome" "hell" | toUpper }}
+{{ end }}

View file

@@ -0,0 +1,3 @@
+{{ define "template3" }}
+{{ 1230912039102391023.0 | humanizeDuration }}
+{{ end }}

View file

@@ -12,7 +12,7 @@ import (
 var (
 	addr = flag.String("datasource.url", "", "VictoriaMetrics or vmselect url. Required parameter. "+
-		"E.g. http://127.0.0.1:8428")
+		"E.g. http://127.0.0.1:8428 . See also -remoteRead.disablePathAppend")
 	appendTypePrefix  = flag.Bool("datasource.appendTypePrefix", false, "Whether to add type prefix to -datasource.url based on the query type. Set to true if sending different query types to the vmselect URL.")
 	basicAuthUsername = flag.String("datasource.basicAuth.username", "", "Optional basic auth username for -datasource.url")

View file

@@ -24,20 +24,18 @@ type VMStorage struct {
 	dataSourceType     Type
 	evaluationInterval time.Duration
 	extraParams        url.Values
-	disablePathAppend  bool
 }
 
 // Clone makes clone of VMStorage, shares http client.
 func (s *VMStorage) Clone() *VMStorage {
 	return &VMStorage{
 		c:                s.c,
 		authCfg:          s.authCfg,
 		datasourceURL:    s.datasourceURL,
 		lookBack:         s.lookBack,
 		queryStep:        s.queryStep,
 		appendTypePrefix: s.appendTypePrefix,
 		dataSourceType:   s.dataSourceType,
-		disablePathAppend: s.disablePathAppend,
 	}
 }
@@ -57,16 +55,15 @@ func (s *VMStorage) BuildWithParams(params QuerierParams) Querier {
 }
 
 // NewVMStorage is a constructor for VMStorage
-func NewVMStorage(baseURL string, authCfg *promauth.Config, lookBack time.Duration, queryStep time.Duration, appendTypePrefix bool, c *http.Client, disablePathAppend bool) *VMStorage {
+func NewVMStorage(baseURL string, authCfg *promauth.Config, lookBack time.Duration, queryStep time.Duration, appendTypePrefix bool, c *http.Client) *VMStorage {
 	return &VMStorage{
 		c:                c,
 		authCfg:          authCfg,
 		datasourceURL:    strings.TrimSuffix(baseURL, "/"),
 		appendTypePrefix: appendTypePrefix,
 		lookBack:         lookBack,
 		queryStep:        queryStep,
 		dataSourceType:   NewPrometheusType(),
-		disablePathAppend: disablePathAppend,
 	}
 }

View file

@@ -2,12 +2,18 @@ package datasource
 
 import (
 	"encoding/json"
+	"flag"
 	"fmt"
 	"net/http"
 	"strconv"
 	"time"
 )
 
+var (
+	disablePathAppend = flag.Bool("remoteRead.disablePathAppend", false, "Whether to disable automatic appending of '/api/v1/query' path "+
+		"to the configured -datasource.url and -remoteRead.url")
+)
+
 type promResponse struct {
 	Status    string `json:"status"`
 	ErrorType string `json:"errorType"`
@@ -25,13 +31,6 @@ type promInstant struct {
 	} `json:"result"`
 }
 
-type promRange struct {
-	Result []struct {
-		Labels map[string]string `json:"metric"`
-		TVs    [][2]interface{}  `json:"values"`
-	} `json:"result"`
-}
-
 func (r promInstant) metrics() ([]Metric, error) {
 	var result []Metric
 	for i, res := range r.Result {
@@ -50,6 +49,13 @@ func (r promInstant) metrics() ([]Metric, error) {
 	return result, nil
 }
 
+type promRange struct {
+	Result []struct {
+		Labels map[string]string `json:"metric"`
+		TVs    [][2]interface{}  `json:"values"`
+	} `json:"result"`
+}
+
 func (r promRange) metrics() ([]Metric, error) {
 	var result []Metric
 	for i, res := range r.Result {
@@ -74,9 +80,22 @@ func (r promRange) metrics() ([]Metric, error) {
 	return result, nil
 }
 
+type promScalar [2]interface{}
+
+func (r promScalar) metrics() ([]Metric, error) {
+	var m Metric
+	f, err := strconv.ParseFloat(r[1].(string), 64)
+	if err != nil {
+		return nil, fmt.Errorf("metric %v, unable to parse float64 from %s: %w", r, r[1], err)
+	}
+	m.Values = append(m.Values, f)
+	m.Timestamps = append(m.Timestamps, int64(r[0].(float64)))
+	return []Metric{m}, nil
+}
+
 const (
 	statusSuccess, statusError = "success", "error"
-	rtVector, rtMatrix         = "vector", "matrix"
+	rtVector, rtMatrix, rScalar = "vector", "matrix", "scalar"
 )
 
 func parsePrometheusResponse(req *http.Request, resp *http.Response) ([]Metric, error) {
@@ -103,23 +122,23 @@ func parsePrometheusResponse(req *http.Request, resp *http.Response) ([]Metric,
 			return nil, err
 		}
 		return pr.metrics()
+	case rScalar:
+		var ps promScalar
+		if err := json.Unmarshal(r.Data.Result, &ps); err != nil {
+			return nil, err
+		}
+		return ps.metrics()
 	default:
 		return nil, fmt.Errorf("unknown result type %q", r.Data.ResultType)
 	}
 }
 
-const (
-	prometheusInstantPath = "/api/v1/query"
-	prometheusRangePath   = "/api/v1/query_range"
-	prometheusPrefix      = "/prometheus"
-)
-
 func (s *VMStorage) setPrometheusInstantReqParams(r *http.Request, query string, timestamp time.Time) {
 	if s.appendTypePrefix {
-		r.URL.Path += prometheusPrefix
+		r.URL.Path += "/prometheus"
 	}
-	if !s.disablePathAppend {
-		r.URL.Path += prometheusInstantPath
+	if !*disablePathAppend {
+		r.URL.Path += "/api/v1/query"
 	}
 	q := r.URL.Query()
 	if s.lookBack > 0 {
@@ -136,10 +155,10 @@ func (s *VMStorage) setPrometheusInstantReqParams(r *http.Request, query string,
 
 func (s *VMStorage) setPrometheusRangeReqParams(r *http.Request, query string, start, end time.Time) {
 	if s.appendTypePrefix {
-		r.URL.Path += prometheusPrefix
+		r.URL.Path += "/prometheus"
 	}
-	if !s.disablePathAppend {
-		r.URL.Path += prometheusRangePath
+	if !*disablePathAppend {
+		r.URL.Path += "/api/v1/query_range"
 	}
 	q := r.URL.Query()
 	q.Add("start", fmt.Sprintf("%d", start.Unix()))

View file

@@ -37,7 +37,7 @@ func TestVMInstantQuery(t *testing.T) {
 	mux.HandleFunc("/render", func(w http.ResponseWriter, request *http.Request) {
 		c++
 		switch c {
-		case 7:
+		case 8:
 			w.Write([]byte(`[{"target":"constantLine(10)","tags":{"name":"constantLine(10)"},"datapoints":[[10,1611758343],[10,1611758373],[10,1611758403]]}]`))
 		}
 	})
@@ -75,6 +75,8 @@ func TestVMInstantQuery(t *testing.T) {
 			w.Write([]byte(`{"status":"success","data":{"resultType":"matrix"}}`))
 		case 6:
 			w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"vm_rows"},"value":[1583786142,"13763"]},{"metric":{"__name__":"vm_requests"},"value":[1583786140,"2000"]}]}}`))
+		case 7:
+			w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "1"]}}`))
 		}
 	})
@@ -85,31 +87,26 @@ func TestVMInstantQuery(t *testing.T) {
 	if err != nil {
 		t.Fatalf("unexpected: %s", err)
 	}
-	s := NewVMStorage(srv.URL, authCfg, time.Minute, 0, false, srv.Client(), false)
+	s := NewVMStorage(srv.URL, authCfg, time.Minute, 0, false, srv.Client())
 
 	p := NewPrometheusType()
 	pq := s.BuildWithParams(QuerierParams{DataSourceType: &p, EvaluationInterval: 15 * time.Second})
 	ts := time.Now()
 
-	if _, err := pq.Query(ctx, query, ts); err == nil {
-		t.Fatalf("expected connection error got nil")
-	}
-	if _, err := pq.Query(ctx, query, ts); err == nil {
-		t.Fatalf("expected invalid response status error got nil")
-	}
-	if _, err := pq.Query(ctx, query, ts); err == nil {
-		t.Fatalf("expected response body error got nil")
-	}
-	if _, err := pq.Query(ctx, query, ts); err == nil {
-		t.Fatalf("expected error status got nil")
-	}
-	if _, err := pq.Query(ctx, query, ts); err == nil {
-		t.Fatalf("expected unknown status got nil")
-	}
-	if _, err := pq.Query(ctx, query, ts); err == nil {
-		t.Fatalf("expected non-vector resultType error got nil")
-	}
-	m, err := pq.Query(ctx, query, ts)
+	expErr := func(err string) {
+		if _, err := pq.Query(ctx, query, ts); err == nil {
+			t.Fatalf("expected %q got nil", err)
+		}
+	}
+
+	expErr("connection error")              // 0
+	expErr("invalid response status error") // 1
+	expErr("response body error")           // 2
+	expErr("error status")                  // 3
+	expErr("unknown status")                // 4
+	expErr("non-vector resultType error")   // 5
+
+	m, err := pq.Query(ctx, query, ts) // 6 - vector
 	if err != nil {
 		t.Fatalf("unexpected %s", err)
 	}
@@ -132,10 +129,27 @@ func TestVMInstantQuery(t *testing.T) {
 		t.Fatalf("unexpected metric %+v want %+v", m, expected)
 	}
 
+	m, err = pq.Query(ctx, query, ts) // 7 - scalar
+	if err != nil {
+		t.Fatalf("unexpected %s", err)
+	}
+	if len(m) != 1 {
+		t.Fatalf("expected 1 metrics got %d in %+v", len(m), m)
+	}
+	expected = []Metric{
+		{
+			Timestamps: []int64{1583786142},
+			Values:     []float64{1},
+		},
+	}
+	if !reflect.DeepEqual(m, expected) {
+		t.Fatalf("unexpected metric %+v want %+v", m, expected)
+	}
+
 	g := NewGraphiteType()
 	gq := s.BuildWithParams(QuerierParams{DataSourceType: &g})
 
-	m, err = gq.Query(ctx, queryRender, ts)
+	m, err = gq.Query(ctx, queryRender, ts) // 8 - graphite
 	if err != nil {
 		t.Fatalf("unexpected %s", err)
 	}
@@ -196,7 +210,7 @@ func TestVMRangeQuery(t *testing.T) {
 	if err != nil {
 		t.Fatalf("unexpected: %s", err)
 	}
-	s := NewVMStorage(srv.URL, authCfg, time.Minute, 0, false, srv.Client(), false)
+	s := NewVMStorage(srv.URL, authCfg, time.Minute, 0, false, srv.Client())
 
 	p := NewPrometheusType()
 	pq := s.BuildWithParams(QuerierParams{DataSourceType: &p, EvaluationInterval: 15 * time.Second})
@@ -252,18 +266,7 @@ func TestRequestParams(t *testing.T) {
 				dataSourceType: NewPrometheusType(),
 			},
 			func(t *testing.T, r *http.Request) {
-				checkEqualString(t, prometheusInstantPath, r.URL.Path)
-			},
-		},
-		{
-			"prometheus path with disablePathAppend",
-			false,
-			&VMStorage{
-				dataSourceType:    NewPrometheusType(),
-				disablePathAppend: true,
-			},
-			func(t *testing.T, r *http.Request) {
-				checkEqualString(t, "", r.URL.Path)
+				checkEqualString(t, "/api/v1/query", r.URL.Path)
 			},
 		},
 		{
@@ -274,19 +277,7 @@ func TestRequestParams(t *testing.T) {
 				appendTypePrefix: true,
 			},
 			func(t *testing.T, r *http.Request) {
-				checkEqualString(t, prometheusPrefix+prometheusInstantPath, r.URL.Path)
-			},
-		},
-		{
-			"prometheus prefix with disablePathAppend",
-			false,
-			&VMStorage{
-				dataSourceType:    NewPrometheusType(),
-				appendTypePrefix:  true,
-				disablePathAppend: true,
-			},
-			func(t *testing.T, r *http.Request) {
-				checkEqualString(t, prometheusPrefix, r.URL.Path)
+				checkEqualString(t, "/prometheus/api/v1/query", r.URL.Path)
 			},
 		},
 		{
@@ -296,18 +287,7 @@ func TestRequestParams(t *testing.T) {
 				dataSourceType: NewPrometheusType(),
 			},
 			func(t *testing.T, r *http.Request) {
-				checkEqualString(t, prometheusRangePath, r.URL.Path)
-			},
-		},
-		{
-			"prometheus range path with disablePathAppend",
-			true,
-			&VMStorage{
-				dataSourceType:    NewPrometheusType(),
-				disablePathAppend: true,
-			},
-			func(t *testing.T, r *http.Request) {
-				checkEqualString(t, "", r.URL.Path)
+				checkEqualString(t, "/api/v1/query_range", r.URL.Path)
 			},
 		},
 		{
@@ -318,19 +298,7 @@ func TestRequestParams(t *testing.T) {
 				appendTypePrefix: true,
 			},
 			func(t *testing.T, r *http.Request) {
-				checkEqualString(t, prometheusPrefix+prometheusRangePath, r.URL.Path)
-			},
-		},
-		{
-			"prometheus range prefix with disablePathAppend",
-			true,
-			&VMStorage{
-				dataSourceType:    NewPrometheusType(),
-				appendTypePrefix:  true,
-				disablePathAppend: true,
-			},
-			func(t *testing.T, r *http.Request) {
-				checkEqualString(t, prometheusPrefix, r.URL.Path)
+				checkEqualString(t, "/prometheus/api/v1/query_range", r.URL.Path)
 			},
 		},
 		{

View file

@@ -237,8 +237,6 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
 		notifiers:                nts,
 		previouslySentSeriesToRW: make(map[uint64]map[string][]prompbmarshal.Label)}
 
-	evalTS := time.Now()
-
 	// Spread group rules evaluation over time in order to reduce load on VictoriaMetrics.
 	if !skipRandSleepOnGroupStart {
 		randSleep := uint64(float64(g.Interval) * (float64(g.ID()) / (1 << 64)))
@@ -259,6 +257,8 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
 		}
 	}
 
+	evalTS := time.Now()
+
 	logger.Infof("group %q started; interval=%v; concurrency=%d", g.Name, g.Interval, g.Concurrency)
 
 	eval := func(ts time.Time) {
@@ -303,6 +303,10 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
 				g.mu.Unlock()
 				continue
 			}
+
+			// ensure that staleness is tracked for existing rules only
+			e.purgeStaleSeries(g.Rules)
+
 			if g.Interval != ng.Interval {
 				g.Interval = ng.Interval
 				t.Stop()
@ -457,6 +461,30 @@ func (e *executor) getStaleSeries(rule Rule, tss []prompbmarshal.TimeSeries, tim
return staleS return staleS
} }
// purgeStaleSeries deletes references in tracked
// previouslySentSeriesToRW list to Rules which aren't present
// in the given activeRules list. The method is used when the list
// of loaded rules has changed and executor has to remove
// references to non-existing rules.
func (e *executor) purgeStaleSeries(activeRules []Rule) {
newPreviouslySentSeriesToRW := make(map[uint64]map[string][]prompbmarshal.Label)
e.previouslySentSeriesToRWMu.Lock()
for _, rule := range activeRules {
id := rule.ID()
prev, ok := e.previouslySentSeriesToRW[id]
if ok {
// keep previous series for staleness detection
newPreviouslySentSeriesToRW[id] = prev
}
}
e.previouslySentSeriesToRW = newPreviouslySentSeriesToRW
e.previouslySentSeriesToRWMu.Unlock()
}
func labelsToString(labels []prompbmarshal.Label) string { func labelsToString(labels []prompbmarshal.Label) string {
var b strings.Builder var b strings.Builder
b.WriteRune('{') b.WriteRune('{')

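A standalone sketch of the pruning idea above (with simplified hypothetical types, not the actual vmalert structs): series previously sent to remote write are remembered per rule ID, and entries for rules dropped from the config are discarded on reload so the map cannot grow without bound.

```go
package main

import "fmt"

type executor struct {
	// ruleID -> series fingerprint -> labels, mirroring previouslySentSeriesToRW
	previouslySentSeries map[uint64]map[string][]string
}

func (e *executor) purgeStaleSeries(activeRuleIDs []uint64) {
	next := make(map[uint64]map[string][]string)
	for _, id := range activeRuleIDs {
		if prev, ok := e.previouslySentSeries[id]; ok {
			next[id] = prev // keep series of rules that survived the reload
		}
	}
	e.previouslySentSeries = next
}

func main() {
	e := &executor{previouslySentSeries: map[uint64]map[string][]string{
		1: {`{job="foo"}`: {"job", "foo"}},
		2: {`{job="bar"}`: {"job", "bar"}},
	}}
	e.purgeStaleSeries([]uint64{2})          // rule 1 was removed from the config
	fmt.Println(len(e.previouslySentSeries)) // 1: only rule 2 is still tracked
}
```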
View file

@ -157,7 +157,7 @@ func TestUpdateWith(t *testing.T) {
func TestGroupStart(t *testing.T) { func TestGroupStart(t *testing.T) {
// TODO: make parsing from string instead of file // TODO: make parsing from string instead of file
groups, err := config.Parse([]string{"config/testdata/rules1-good.rules"}, true, true) groups, err := config.Parse([]string{"config/testdata/rules/rules1-good.rules"}, true, true)
if err != nil { if err != nil {
t.Fatalf("failed to parse rules: %s", err) t.Fatalf("failed to parse rules: %s", err)
} }
@ -355,3 +355,61 @@ func TestGetStaleSeries(t *testing.T) {
[][]prompbmarshal.Label{toPromLabels(t, "__name__", "job:foo", "job", "bar")}, [][]prompbmarshal.Label{toPromLabels(t, "__name__", "job:foo", "job", "bar")},
nil) nil)
} }
func TestPurgeStaleSeries(t *testing.T) {
ts := time.Now()
labels := toPromLabels(t, "__name__", "job:foo", "job", "foo")
tss := []prompbmarshal.TimeSeries{newTimeSeriesPB([]float64{1}, []int64{ts.Unix()}, labels)}
f := func(curRules, newRules, expStaleRules []Rule) {
t.Helper()
e := &executor{
previouslySentSeriesToRW: make(map[uint64]map[string][]prompbmarshal.Label),
}
// seed executor with series for
// current rules
for _, rule := range curRules {
e.getStaleSeries(rule, tss, ts)
}
e.purgeStaleSeries(newRules)
if len(e.previouslySentSeriesToRW) != len(expStaleRules) {
t.Fatalf("expected to get %d stale series, got %d",
len(expStaleRules), len(e.previouslySentSeriesToRW))
}
for _, exp := range expStaleRules {
if _, ok := e.previouslySentSeriesToRW[exp.ID()]; !ok {
t.Fatalf("expected to have rule %d; got nil instead", exp.ID())
}
}
}
f(nil, nil, nil)
f(
nil,
[]Rule{&AlertingRule{RuleID: 1}},
nil,
)
f(
[]Rule{&AlertingRule{RuleID: 1}},
nil,
nil,
)
f(
[]Rule{&AlertingRule{RuleID: 1}},
[]Rule{&AlertingRule{RuleID: 2}},
nil,
)
f(
[]Rule{&AlertingRule{RuleID: 1}, &AlertingRule{RuleID: 2}},
[]Rule{&AlertingRule{RuleID: 2}},
[]Rule{&AlertingRule{RuleID: 2}},
)
f(
[]Rule{&AlertingRule{RuleID: 1}, &AlertingRule{RuleID: 2}},
[]Rule{&AlertingRule{RuleID: 1}, &AlertingRule{RuleID: 2}},
[]Rule{&AlertingRule{RuleID: 1}, &AlertingRule{RuleID: 2}},
)
}

View file

@ -15,6 +15,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier" "github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remoteread" "github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remoteread"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite" "github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo" "github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag" "github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime" "github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
@ -34,6 +35,13 @@ Examples:
absolute path to all .yaml files in root. absolute path to all .yaml files in root.
Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars.`) Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars.`)
ruleTemplatesPath = flagutil.NewArray("rule.templates", `Path or glob pattern to location with go template definitions
for rules annotations templating. Flag can be specified multiple times.
Examples:
-rule.templates="/path/to/file". Path to a single file with go templates
-rule.templates="dir/*.tpl" -rule.templates="/*.tpl". Relative path to all .tpl files in "dir" folder,
absolute path to all .tpl files in root.`)
rulesCheckInterval = flag.Duration("rule.configCheckInterval", 0, "Interval for checking for changes in '-rule' files. "+ rulesCheckInterval = flag.Duration("rule.configCheckInterval", 0, "Interval for checking for changes in '-rule' files. "+
"By default the checking is disabled. Send SIGHUP signal in order to force config check for changes. DEPRECATED - see '-configCheckInterval' instead") "By default the checking is disabled. Send SIGHUP signal in order to force config check for changes. DEPRECATED - see '-configCheckInterval' instead")
@ -73,10 +81,12 @@ func main() {
envflag.Parse() envflag.Parse()
buildinfo.Init() buildinfo.Init()
logger.Init() logger.Init()
err := templates.Load(*ruleTemplatesPath, true)
if err != nil {
logger.Fatalf("failed to parse %q: %s", *ruleTemplatesPath, err)
}
if *dryRun { if *dryRun {
u, _ := url.Parse("https://victoriametrics.com/")
notifier.InitTemplateFunc(u)
groups, err := config.Parse(*rulePath, true, true) groups, err := config.Parse(*rulePath, true, true)
if err != nil { if err != nil {
logger.Fatalf("failed to parse %q: %s", *rulePath, err) logger.Fatalf("failed to parse %q: %s", *rulePath, err)
@ -91,7 +101,7 @@ func main() {
if err != nil { if err != nil {
logger.Fatalf("failed to init `external.url`: %s", err) logger.Fatalf("failed to init `external.url`: %s", err)
} }
notifier.InitTemplateFunc(eu)
alertURLGeneratorFn, err = getAlertURLGenerator(eu, *externalAlertSource, *validateTemplates) alertURLGeneratorFn, err = getAlertURLGenerator(eu, *externalAlertSource, *validateTemplates)
if err != nil { if err != nil {
logger.Fatalf("failed to init `external.alert.source`: %s", err) logger.Fatalf("failed to init `external.alert.source`: %s", err)
@ -105,7 +115,6 @@ func main() {
if rw == nil { if rw == nil {
logger.Fatalf("remoteWrite.url can't be empty in replay mode") logger.Fatalf("remoteWrite.url can't be empty in replay mode")
} }
notifier.InitTemplateFunc(eu)
groupsCfg, err := config.Parse(*rulePath, *validateTemplates, *validateExpressions) groupsCfg, err := config.Parse(*rulePath, *validateTemplates, *validateExpressions)
if err != nil { if err != nil {
logger.Fatalf("cannot parse configuration file: %s", err) logger.Fatalf("cannot parse configuration file: %s", err)
@ -127,7 +136,6 @@ func main() {
if err != nil { if err != nil {
logger.Fatalf("failed to init: %s", err) logger.Fatalf("failed to init: %s", err)
} }
logger.Infof("reading rules configuration file from %q", strings.Join(*rulePath, ";")) logger.Infof("reading rules configuration file from %q", strings.Join(*rulePath, ";"))
groupsCfg, err := config.Parse(*rulePath, *validateTemplates, *validateExpressions) groupsCfg, err := config.Parse(*rulePath, *validateTemplates, *validateExpressions)
if err != nil { if err != nil {
@ -170,7 +178,7 @@ func newManager(ctx context.Context) (*manager, error) {
return nil, fmt.Errorf("failed to init datasource: %w", err) return nil, fmt.Errorf("failed to init datasource: %w", err)
} }
labels := make(map[string]string, 0) labels := make(map[string]string)
for _, s := range *externalLabels { for _, s := range *externalLabels {
if len(s) == 0 { if len(s) == 0 {
continue continue
@ -281,7 +289,11 @@ func configReload(ctx context.Context, m *manager, groupsCfg []config.Group, sig
case <-ctx.Done(): case <-ctx.Done():
return return
case <-sighupCh: case <-sighupCh:
logger.Infof("SIGHUP received. Going to reload rules %q ...", *rulePath) tmplMsg := ""
if len(*ruleTemplatesPath) > 0 {
tmplMsg = fmt.Sprintf("and templates %q ", *ruleTemplatesPath)
}
logger.Infof("SIGHUP received. Going to reload rules %q %s...", *rulePath, tmplMsg)
configReloads.Inc() configReloads.Inc()
case <-configCheckCh: case <-configCheckCh:
} }
@ -291,6 +303,13 @@ func configReload(ctx context.Context, m *manager, groupsCfg []config.Group, sig
logger.Errorf("failed to reload notifier config: %s", err) logger.Errorf("failed to reload notifier config: %s", err)
continue continue
} }
err := templates.Load(*ruleTemplatesPath, false)
if err != nil {
configReloadErrors.Inc()
configSuccess.Set(0)
logger.Errorf("failed to load new templates: %s", err)
continue
}
newGroupsCfg, err := config.Parse(*rulePath, *validateTemplates, *validateExpressions) newGroupsCfg, err := config.Parse(*rulePath, *validateTemplates, *validateExpressions)
if err != nil { if err != nil {
configReloadErrors.Inc() configReloadErrors.Inc()
@ -299,6 +318,7 @@ func configReload(ctx context.Context, m *manager, groupsCfg []config.Group, sig
continue continue
} }
if configsEqual(newGroupsCfg, groupsCfg) { if configsEqual(newGroupsCfg, groupsCfg) {
templates.Reload()
// set success to 1 since previous reload // set success to 1 since previous reload
// could have been unsuccessful // could have been unsuccessful
configSuccess.Set(1) configSuccess.Set(1)
@ -311,6 +331,7 @@ func configReload(ctx context.Context, m *manager, groupsCfg []config.Group, sig
logger.Errorf("error while reloading rules: %s", err) logger.Errorf("error while reloading rules: %s", err)
continue continue
} }
templates.Reload()
groupsCfg = newGroupsCfg groupsCfg = newGroupsCfg
configSuccess.Set(1) configSuccess.Set(1)
configTimestamp.Set(fasttime.UnixTimestamp()) configTimestamp.Set(fasttime.UnixTimestamp())

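Taken together, the reload flow above amounts to the following contract; this is a sketch under the assumption that it runs inside the VictoriaMetrics module so the vmalert templates package is importable, and the glob path is illustrative:

```go
package main

import (
	"log"

	"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
)

func main() {
	patterns := []string{"/etc/vmalert/templates/*.tmpl"} // illustrative path

	// Startup: load and activate template definitions immediately (overwrite=true).
	if err := templates.Load(patterns, true); err != nil {
		log.Fatalf("initial template load failed: %s", err)
	}

	// On SIGHUP: stage the new definitions without activating them (overwrite=false)...
	if err := templates.Load(patterns, false); err != nil {
		log.Fatalf("failed to stage new templates: %s", err)
	}
	// ...and promote the staged definitions only after the rules config parsed OK.
	templates.Reload()
}
```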
View file

@ -3,7 +3,6 @@ package main
import ( import (
"context" "context"
"math/rand" "math/rand"
"net/url"
"os" "os"
"strings" "strings"
"sync" "sync"
@ -14,11 +13,13 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource" "github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier" "github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite" "github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
) )
func TestMain(m *testing.M) { func TestMain(m *testing.M) {
u, _ := url.Parse("https://victoriametrics.com/path") if err := templates.Load([]string{"testdata/templates/*good.tmpl"}, true); err != nil {
notifier.InitTemplateFunc(u) os.Exit(1)
}
os.Exit(m.Run()) os.Exit(m.Run())
} }
@ -47,9 +48,9 @@ func TestManagerUpdateConcurrent(t *testing.T) {
"config/testdata/dir/rules0-bad.rules", "config/testdata/dir/rules0-bad.rules",
"config/testdata/dir/rules1-good.rules", "config/testdata/dir/rules1-good.rules",
"config/testdata/dir/rules1-bad.rules", "config/testdata/dir/rules1-bad.rules",
"config/testdata/rules0-good.rules", "config/testdata/rules/rules0-good.rules",
"config/testdata/rules1-good.rules", "config/testdata/rules/rules1-good.rules",
"config/testdata/rules2-good.rules", "config/testdata/rules/rules2-good.rules",
} }
evalInterval := *evaluationInterval evalInterval := *evaluationInterval
defer func() { *evaluationInterval = evalInterval }() defer func() { *evaluationInterval = evalInterval }()
@ -125,7 +126,7 @@ func TestManagerUpdate(t *testing.T) {
}{ }{
{ {
name: "update good rules", name: "update good rules",
initPath: "config/testdata/rules0-good.rules", initPath: "config/testdata/rules/rules0-good.rules",
updatePath: "config/testdata/dir/rules1-good.rules", updatePath: "config/testdata/dir/rules1-good.rules",
want: []*Group{ want: []*Group{
{ {
@ -150,18 +151,18 @@ func TestManagerUpdate(t *testing.T) {
}, },
{ {
name: "update good rules from 1 to 2 groups", name: "update good rules from 1 to 2 groups",
initPath: "config/testdata/dir/rules1-good.rules", initPath: "config/testdata/dir/rules/rules1-good.rules",
updatePath: "config/testdata/rules0-good.rules", updatePath: "config/testdata/rules/rules0-good.rules",
want: []*Group{ want: []*Group{
{ {
File: "config/testdata/rules0-good.rules", File: "config/testdata/rules/rules0-good.rules",
Name: "groupGorSingleAlert", Name: "groupGorSingleAlert",
Type: datasource.NewPrometheusType(), Type: datasource.NewPrometheusType(),
Rules: []Rule{VMRows}, Rules: []Rule{VMRows},
Interval: defaultEvalInterval, Interval: defaultEvalInterval,
}, },
{ {
File: "config/testdata/rules0-good.rules", File: "config/testdata/rules/rules0-good.rules",
Interval: defaultEvalInterval, Interval: defaultEvalInterval,
Type: datasource.NewPrometheusType(), Type: datasource.NewPrometheusType(),
Name: "TestGroup", Rules: []Rule{ Name: "TestGroup", Rules: []Rule{
@ -172,18 +173,18 @@ func TestManagerUpdate(t *testing.T) {
}, },
{ {
name: "update with one bad rule file", name: "update with one bad rule file",
initPath: "config/testdata/rules0-good.rules", initPath: "config/testdata/rules/rules0-good.rules",
updatePath: "config/testdata/dir/rules2-bad.rules", updatePath: "config/testdata/dir/rules2-bad.rules",
want: []*Group{ want: []*Group{
{ {
File: "config/testdata/rules0-good.rules", File: "config/testdata/rules/rules0-good.rules",
Name: "groupGorSingleAlert", Name: "groupGorSingleAlert",
Type: datasource.NewPrometheusType(), Type: datasource.NewPrometheusType(),
Interval: defaultEvalInterval, Interval: defaultEvalInterval,
Rules: []Rule{VMRows}, Rules: []Rule{VMRows},
}, },
{ {
File: "config/testdata/rules0-good.rules", File: "config/testdata/rules/rules0-good.rules",
Interval: defaultEvalInterval, Interval: defaultEvalInterval,
Name: "TestGroup", Name: "TestGroup",
Type: datasource.NewPrometheusType(), Type: datasource.NewPrometheusType(),
@ -196,17 +197,17 @@ func TestManagerUpdate(t *testing.T) {
{ {
name: "update empty dir rules from 0 to 2 groups", name: "update empty dir rules from 0 to 2 groups",
initPath: "config/testdata/empty/*", initPath: "config/testdata/empty/*",
updatePath: "config/testdata/rules0-good.rules", updatePath: "config/testdata/rules/rules0-good.rules",
want: []*Group{ want: []*Group{
{ {
File: "config/testdata/rules0-good.rules", File: "config/testdata/rules/rules0-good.rules",
Name: "groupGorSingleAlert", Name: "groupGorSingleAlert",
Type: datasource.NewPrometheusType(), Type: datasource.NewPrometheusType(),
Interval: defaultEvalInterval, Interval: defaultEvalInterval,
Rules: []Rule{VMRows}, Rules: []Rule{VMRows},
}, },
{ {
File: "config/testdata/rules0-good.rules", File: "config/testdata/rules/rules0-good.rules",
Interval: defaultEvalInterval, Interval: defaultEvalInterval,
Type: datasource.NewPrometheusType(), Type: datasource.NewPrometheusType(),
Name: "TestGroup", Rules: []Rule{ Name: "TestGroup", Rules: []Rule{

View file

@ -5,9 +5,10 @@ import (
"fmt" "fmt"
"io" "io"
"strings" "strings"
"text/template" textTpl "text/template"
"time" "time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils" "github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal" "github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel" "github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
@ -90,26 +91,38 @@ var tplHeaders = []string{
// map of annotations. // map of annotations.
// Every alert could have a different datasource, so function // Every alert could have a different datasource, so function
// requires a queryFunction as an argument. // requires a queryFunction as an argument.
func (a *Alert) ExecTemplate(q QueryFn, labels, annotations map[string]string) (map[string]string, error) { func (a *Alert) ExecTemplate(q templates.QueryFn, labels, annotations map[string]string) (map[string]string, error) {
tplData := AlertTplData{Value: a.Value, Labels: labels, Expr: a.Expr} tplData := AlertTplData{Value: a.Value, Labels: labels, Expr: a.Expr}
return templateAnnotations(annotations, tplData, funcsWithQuery(q), true) tmpl, err := templates.GetWithFuncs(templates.FuncsWithQuery(q))
if err != nil {
return nil, fmt.Errorf("error getting a template: %w", err)
}
return templateAnnotations(annotations, tplData, tmpl, true)
} }
// ExecTemplate executes the given template for given annotations map. // ExecTemplate executes the given template for given annotations map.
func ExecTemplate(q QueryFn, annotations map[string]string, tpl AlertTplData) (map[string]string, error) { func ExecTemplate(q templates.QueryFn, annotations map[string]string, tplData AlertTplData) (map[string]string, error) {
return templateAnnotations(annotations, tpl, funcsWithQuery(q), true) tmpl, err := templates.GetWithFuncs(templates.FuncsWithQuery(q))
if err != nil {
return nil, fmt.Errorf("error cloning template: %w", err)
}
return templateAnnotations(annotations, tplData, tmpl, true)
} }
// ValidateTemplates validate annotations for possible template error, uses empty data for template population // ValidateTemplates validate annotations for possible template error, uses empty data for template population
func ValidateTemplates(annotations map[string]string) error { func ValidateTemplates(annotations map[string]string) error {
_, err := templateAnnotations(annotations, AlertTplData{ tmpl, err := templates.Get()
if err != nil {
return err
}
_, err = templateAnnotations(annotations, AlertTplData{
Labels: map[string]string{}, Labels: map[string]string{},
Value: 0, Value: 0,
}, tmplFunc, false) }, tmpl, false)
return err return err
} }
func templateAnnotations(annotations map[string]string, data AlertTplData, funcs template.FuncMap, execute bool) (map[string]string, error) { func templateAnnotations(annotations map[string]string, data AlertTplData, tmpl *textTpl.Template, execute bool) (map[string]string, error) {
var builder strings.Builder var builder strings.Builder
var buf bytes.Buffer var buf bytes.Buffer
eg := new(utils.ErrGroup) eg := new(utils.ErrGroup)
@ -122,7 +135,7 @@ func templateAnnotations(annotations map[string]string, data AlertTplData, funcs
builder.Grow(len(header) + len(text)) builder.Grow(len(header) + len(text))
builder.WriteString(header) builder.WriteString(header)
builder.WriteString(text) builder.WriteString(text)
if err := templateAnnotation(&buf, builder.String(), tData, funcs, execute); err != nil { if err := templateAnnotation(&buf, builder.String(), tData, tmpl, execute); err != nil {
r[key] = text r[key] = text
eg.Add(fmt.Errorf("key %q, template %q: %w", key, text, err)) eg.Add(fmt.Errorf("key %q, template %q: %w", key, text, err))
continue continue
@ -138,11 +151,17 @@ type tplData struct {
ExternalURL string ExternalURL string
} }
func templateAnnotation(dst io.Writer, text string, data tplData, funcs template.FuncMap, execute bool) error { func templateAnnotation(dst io.Writer, text string, data tplData, tmpl *textTpl.Template, execute bool) error {
t := template.New("").Funcs(funcs).Option("missingkey=zero") tpl, err := tmpl.Clone()
tpl, err := t.Parse(text)
if err != nil { if err != nil {
return fmt.Errorf("error parsing annotation: %w", err) return fmt.Errorf("error cloning template before parse annotation: %w", err)
}
tpl, err = tpl.Parse(text)
if err != nil {
return fmt.Errorf("error parsing annotation template: %w", err)
}
if !execute {
return nil
} }
if !execute { if !execute {
return nil return nil

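The Clone-before-Parse pattern above matters because the master template is shared: parsing an annotation straight into it would mutate it for every other caller. A minimal self-contained sketch of the same idea using plain text/template, not the vmalert types:

```go
package main

import (
	"bytes"
	"fmt"
	textTpl "text/template"
)

func main() {
	// Shared master template; vmalert keeps one of these behind a mutex.
	master := textTpl.Must(textTpl.New("").Option("missingkey=zero").
		Parse(`{{- define "default.template" -}}{{- end -}}`))

	render := func(text string) string {
		tpl, err := master.Clone() // per-annotation copy keeps the master untouched
		if err != nil {
			panic(err)
		}
		tpl = textTpl.Must(tpl.Parse(text))
		var buf bytes.Buffer
		if err := tpl.Execute(&buf, map[string]string{"Instance": "host1"}); err != nil {
			panic(err)
		}
		return buf.String()
	}

	fmt.Println(render(`up: {{ .Instance }}`))   // up: host1
	fmt.Println(render(`down: {{ .Instance }}`)) // down: host1; master stays unchanged between calls
}
```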
View file

@ -11,7 +11,7 @@ import (
) )
func TestAlert_ExecTemplate(t *testing.T) { func TestAlert_ExecTemplate(t *testing.T) {
extLabels := make(map[string]string, 0) extLabels := make(map[string]string)
const ( const (
extCluster = "prod" extCluster = "prod"
extDC = "east" extDC = "east"

View file

@ -3,9 +3,11 @@ package notifier
import ( import (
"flag" "flag"
"fmt" "fmt"
"net/url"
"strings" "strings"
"time" "time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil" "github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth" "github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal" "github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
@ -83,6 +85,12 @@ func Init(gen AlertURLGenerator, extLabels map[string]string, extURL string) (fu
externalURL = extURL externalURL = extURL
externalLabels = extLabels externalLabels = extLabels
eu, err := url.Parse(externalURL)
if err != nil {
return nil, fmt.Errorf("failed to parse external URL: %s", err)
}
templates.UpdateWithFuncs(templates.FuncsWithExternalURL(eu))
if *configPath == "" && len(*addrs) == 0 { if *configPath == "" && len(*addrs) == 0 {
return nil, nil return nil, nil
@ -102,7 +110,6 @@ func Init(gen AlertURLGenerator, extLabels map[string]string, extURL string) (fu
return staticNotifiersFn, nil return staticNotifiersFn, nil
} }
var err error
cw, err = newWatcher(*configPath, gen) cw, err = newWatcher(*configPath, gen)
if err != nil { if err != nil {
return nil, fmt.Errorf("failed to init config watcher: %s", err) return nil, fmt.Errorf("failed to init config watcher: %s", err)

View file

@ -1,13 +1,14 @@
package notifier package notifier
import ( import (
"net/url" "github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
"os" "os"
"testing" "testing"
) )
func TestMain(m *testing.M) { func TestMain(m *testing.M) {
u, _ := url.Parse("https://victoriametrics.com/path") if err := templates.Load([]string{"testdata/templates/*good.tmpl"}, true); err != nil {
InitTemplateFunc(u) os.Exit(1)
}
os.Exit(m.Run()) os.Exit(m.Run())
} }

View file

@ -34,8 +34,6 @@ var (
oauth2ClientSecretFile = flag.String("remoteRead.oauth2.clientSecretFile", "", "Optional OAuth2 clientSecretFile to use for -remoteRead.url.") oauth2ClientSecretFile = flag.String("remoteRead.oauth2.clientSecretFile", "", "Optional OAuth2 clientSecretFile to use for -remoteRead.url.")
oauth2TokenURL = flag.String("remoteRead.oauth2.tokenUrl", "", "Optional OAuth2 tokenURL to use for -remoteRead.url. ") oauth2TokenURL = flag.String("remoteRead.oauth2.tokenUrl", "", "Optional OAuth2 tokenURL to use for -remoteRead.url. ")
oauth2Scopes = flag.String("remoteRead.oauth2.scopes", "", "Optional OAuth2 scopes to use for -remoteRead.url. Scopes must be delimited by ';'.") oauth2Scopes = flag.String("remoteRead.oauth2.scopes", "", "Optional OAuth2 scopes to use for -remoteRead.url. Scopes must be delimited by ';'.")
disablePathAppend = flag.Bool("remoteRead.disablePathAppend", false, "Whether to disable automatic appending of '/api/v1/query' path to the configured -remoteRead.url.")
) )
// Init creates a Querier from provided flag values. // Init creates a Querier from provided flag values.
@ -57,5 +55,5 @@ func Init() (datasource.QuerierBuilder, error) {
return nil, fmt.Errorf("failed to configure auth: %w", err) return nil, fmt.Errorf("failed to configure auth: %w", err)
} }
c := &http.Client{Transport: tr} c := &http.Client{Transport: tr}
return datasource.NewVMStorage(*addr, authCfg, 0, 0, false, c, *disablePathAppend), nil return datasource.NewVMStorage(*addr, authCfg, 0, 0, false, c), nil
} }

View file

@ -39,8 +39,6 @@ var (
oauth2ClientSecretFile = flag.String("remoteWrite.oauth2.clientSecretFile", "", "Optional OAuth2 clientSecretFile to use for -remoteWrite.url.") oauth2ClientSecretFile = flag.String("remoteWrite.oauth2.clientSecretFile", "", "Optional OAuth2 clientSecretFile to use for -remoteWrite.url.")
oauth2TokenURL = flag.String("remoteWrite.oauth2.tokenUrl", "", "Optional OAuth2 tokenURL to use for -notifier.url.") oauth2TokenURL = flag.String("remoteWrite.oauth2.tokenUrl", "", "Optional OAuth2 tokenURL to use for -notifier.url.")
oauth2Scopes = flag.String("remoteWrite.oauth2.scopes", "", "Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'.") oauth2Scopes = flag.String("remoteWrite.oauth2.scopes", "", "Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'.")
disablePathAppend = flag.Bool("remoteWrite.disablePathAppend", false, "Whether to disable automatic appending of '/api/v1/write' path to the configured -remoteWrite.url.")
) )
// Init creates Client object from given flags. // Init creates Client object from given flags.

View file

@ -3,6 +3,7 @@ package remotewrite
import ( import (
"bytes" "bytes"
"context" "context"
"flag"
"fmt" "fmt"
"io/ioutil" "io/ioutil"
"net/http" "net/http"
@ -19,17 +20,20 @@ import (
"github.com/VictoriaMetrics/metrics" "github.com/VictoriaMetrics/metrics"
) )
var (
disablePathAppend = flag.Bool("remoteWrite.disablePathAppend", false, "Whether to disable automatic appending of '/api/v1/write' path to the configured -remoteWrite.url.")
)
// Client is an asynchronous HTTP client for writing // Client is an asynchronous HTTP client for writing
// timeseries via remote write protocol. // timeseries via remote write protocol.
type Client struct { type Client struct {
addr string addr string
c *http.Client c *http.Client
authCfg *promauth.Config authCfg *promauth.Config
input chan prompbmarshal.TimeSeries input chan prompbmarshal.TimeSeries
flushInterval time.Duration flushInterval time.Duration
maxBatchSize int maxBatchSize int
maxQueueSize int maxQueueSize int
disablePathAppend bool
wg sync.WaitGroup wg sync.WaitGroup
doneCh chan struct{} doneCh chan struct{}
@ -70,8 +74,6 @@ const (
defaultWriteTimeout = 30 * time.Second defaultWriteTimeout = 30 * time.Second
) )
const writePath = "/api/v1/write"
// NewClient returns asynchronous client for // NewClient returns asynchronous client for
// writing timeseries via remotewrite protocol. // writing timeseries via remotewrite protocol.
func NewClient(ctx context.Context, cfg Config) (*Client, error) { func NewClient(ctx context.Context, cfg Config) (*Client, error) {
@ -102,14 +104,13 @@ func NewClient(ctx context.Context, cfg Config) (*Client, error) {
Timeout: cfg.WriteTimeout, Timeout: cfg.WriteTimeout,
Transport: cfg.Transport, Transport: cfg.Transport,
}, },
addr: strings.TrimSuffix(cfg.Addr, "/"), addr: strings.TrimSuffix(cfg.Addr, "/"),
authCfg: cfg.AuthCfg, authCfg: cfg.AuthCfg,
flushInterval: cfg.FlushInterval, flushInterval: cfg.FlushInterval,
maxBatchSize: cfg.MaxBatchSize, maxBatchSize: cfg.MaxBatchSize,
maxQueueSize: cfg.MaxQueueSize, maxQueueSize: cfg.MaxQueueSize,
doneCh: make(chan struct{}), doneCh: make(chan struct{}),
input: make(chan prompbmarshal.TimeSeries, cfg.MaxQueueSize), input: make(chan prompbmarshal.TimeSeries, cfg.MaxQueueSize),
disablePathAppend: cfg.DisablePathAppend,
} }
for i := 0; i < cc; i++ { for i := 0; i < cc; i++ {
@ -240,8 +241,8 @@ func (c *Client) send(ctx context.Context, data []byte) error {
req.Header.Set("Authorization", auth) req.Header.Set("Authorization", auth)
} }
} }
if !c.disablePathAppend { if !*disablePathAppend {
req.URL.Path = path.Join(req.URL.Path, writePath) req.URL.Path = path.Join(req.URL.Path, "/api/v1/write")
} }
resp, err := c.c.Do(req.WithContext(ctx)) resp, err := c.c.Do(req.WithContext(ctx))
if err != nil { if err != nil {

View file

@ -7,13 +7,12 @@ import (
"strings" "strings"
"time" "time"
"github.com/cheggaaa/pb/v3"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config" "github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource" "github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite" "github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger" "github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal" "github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/dmitryk-dk/pb/v3"
) )
var ( var (

View file

@ -11,26 +11,118 @@
// See the License for the specific language governing permissions and // See the License for the specific language governing permissions and
// limitations under the License. // limitations under the License.
package notifier package templates
import ( import (
"errors" "errors"
"fmt" "fmt"
htmlTpl "html/template"
"io/ioutil"
"math" "math"
"net" "net"
"net/url" "net/url"
"path/filepath"
"regexp" "regexp"
"sort" "sort"
"strconv"
"strings" "strings"
"sync"
"time" "time"
htmlTpl "html/template"
textTpl "text/template" textTpl "text/template"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource" "github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils" "github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
) )
// go template execution fails when its tree is empty
const defaultTemplate = `{{- define "default.template" -}}{{- end -}}`
var tplMu sync.RWMutex
type textTemplate struct {
current *textTpl.Template
replacement *textTpl.Template
}
var masterTmpl textTemplate
func newTemplate() *textTpl.Template {
tmpl := textTpl.New("").Option("missingkey=zero").Funcs(templateFuncs())
return textTpl.Must(tmpl.Parse(defaultTemplate))
}
// Load loads templates from the multiple globs specified in pathPatterns and either
// sets them directly as the current template if the latter is undefined or overwrite=true,
// or stages them as replacement templates while adding templates with new names to the current one
func Load(pathPatterns []string, overwrite bool) error {
var err error
tmpl := newTemplate()
for _, tp := range pathPatterns {
p, err := filepath.Glob(tp)
if err != nil {
return fmt.Errorf("failed to retrieve a template glob %q: %w", tp, err)
}
if len(p) > 0 {
tmpl, err = tmpl.ParseGlob(tp)
if err != nil {
return fmt.Errorf("failed to parse template glob %q: %w", tp, err)
}
}
}
if len(tmpl.Templates()) > 0 {
err := tmpl.Execute(ioutil.Discard, nil)
if err != nil {
return fmt.Errorf("failed to execute template: %w", err)
}
}
tplMu.Lock()
defer tplMu.Unlock()
if masterTmpl.current == nil || overwrite {
masterTmpl.replacement = nil
masterTmpl.current = newTemplate()
} else {
masterTmpl.replacement = newTemplate()
if err = copyTemplates(tmpl, masterTmpl.replacement, overwrite); err != nil {
return err
}
}
return copyTemplates(tmpl, masterTmpl.current, overwrite)
}
func copyTemplates(from *textTpl.Template, to *textTpl.Template, overwrite bool) error {
if from == nil {
return nil
}
if to == nil {
to = newTemplate()
}
tmpl, err := from.Clone()
if err != nil {
return err
}
for _, t := range tmpl.Templates() {
if to.Lookup(t.Name()) == nil || overwrite {
to, err = to.AddParseTree(t.Name(), t.Tree)
if err != nil {
return fmt.Errorf("failed to add template %q: %w", t.Name(), err)
}
}
}
return nil
}
// Reload replaces the current template with the replacement template,
// if one was staged by Load with overwrite=false
func Reload() {
tplMu.Lock()
defer tplMu.Unlock()
if masterTmpl.replacement != nil {
masterTmpl.current = masterTmpl.replacement
masterTmpl.replacement = nil
}
}
// metric is private copy of datasource.Metric, // metric is private copy of datasource.Metric,
// it is used for templating annotations, // it is used for templating annotations,
// Labels as map simplifies templates evaluation. // Labels as map simplifies templates evaluation.
@ -60,12 +152,62 @@ func datasourceMetricsToTemplateMetrics(ms []datasource.Metric) []metric {
// for templating functions. // for templating functions.
type QueryFn func(query string) ([]datasource.Metric, error) type QueryFn func(query string) ([]datasource.Metric, error)
var tmplFunc textTpl.FuncMap // UpdateWithFuncs updates existing or sets a new function map for a template
func UpdateWithFuncs(funcs textTpl.FuncMap) {
tplMu.Lock()
defer tplMu.Unlock()
masterTmpl.current = masterTmpl.current.Funcs(funcs)
}
// InitTemplateFunc initiates template helper functions // GetWithFuncs returns a copy of current template with additional FuncMap
func InitTemplateFunc(externalURL *url.URL) { // provided with funcs argument
func GetWithFuncs(funcs textTpl.FuncMap) (*textTpl.Template, error) {
tplMu.RLock()
defer tplMu.RUnlock()
tmpl, err := masterTmpl.current.Clone()
if err != nil {
return nil, err
}
return tmpl.Funcs(funcs), nil
}
// Get returns a copy of the current template
func Get() (*textTpl.Template, error) {
tplMu.RLock()
defer tplMu.RUnlock()
return masterTmpl.current.Clone()
}
// FuncsWithQuery returns a function map that depends on metric data
func FuncsWithQuery(query QueryFn) textTpl.FuncMap {
return textTpl.FuncMap{
"query": func(q string) ([]metric, error) {
result, err := query(q)
if err != nil {
return nil, err
}
return datasourceMetricsToTemplateMetrics(result), nil
},
}
}
// FuncsWithExternalURL returns a function map that depends on externalURL value
func FuncsWithExternalURL(externalURL *url.URL) textTpl.FuncMap {
return textTpl.FuncMap{
"externalURL": func() string {
return externalURL.String()
},
"pathPrefix": func() string {
return externalURL.Path
},
}
}
// templateFuncs initiates template helper functions
func templateFuncs() textTpl.FuncMap {
// See https://prometheus.io/docs/prometheus/latest/configuration/template_reference/ // See https://prometheus.io/docs/prometheus/latest/configuration/template_reference/
tmplFunc = textTpl.FuncMap{ return textTpl.FuncMap{
/* Strings */ /* Strings */
// reReplaceAll ReplaceAllString returns a copy of src, replacing matches of the Regexp with // reReplaceAll ReplaceAllString returns a copy of src, replacing matches of the Regexp with
@ -117,9 +259,13 @@ func InitTemplateFunc(externalURL *url.URL) {
// humanize converts given number to a human readable format // humanize converts given number to a human readable format
// by adding metric prefixes https://en.wikipedia.org/wiki/Metric_prefix // by adding metric prefixes https://en.wikipedia.org/wiki/Metric_prefix
"humanize": func(v float64) string { "humanize": func(i interface{}) (string, error) {
v, err := toFloat64(i)
if err != nil {
return "", err
}
if v == 0 || math.IsNaN(v) || math.IsInf(v, 0) { if v == 0 || math.IsNaN(v) || math.IsInf(v, 0) {
return fmt.Sprintf("%.4g", v) return fmt.Sprintf("%.4g", v), nil
} }
if math.Abs(v) >= 1 { if math.Abs(v) >= 1 {
prefix := "" prefix := ""
@ -130,7 +276,7 @@ func InitTemplateFunc(externalURL *url.URL) {
prefix = p prefix = p
v /= 1000 v /= 1000
} }
return fmt.Sprintf("%.4g%s", v, prefix) return fmt.Sprintf("%.4g%s", v, prefix), nil
} }
prefix := "" prefix := ""
for _, p := range []string{"m", "u", "n", "p", "f", "a", "z", "y"} { for _, p := range []string{"m", "u", "n", "p", "f", "a", "z", "y"} {
@ -140,13 +286,17 @@ func InitTemplateFunc(externalURL *url.URL) {
prefix = p prefix = p
v *= 1000 v *= 1000
} }
return fmt.Sprintf("%.4g%s", v, prefix) return fmt.Sprintf("%.4g%s", v, prefix), nil
}, },
// humanize1024 converts given number to a human readable format with 1024 as base // humanize1024 converts given number to a human readable format with 1024 as base
"humanize1024": func(v float64) string { "humanize1024": func(i interface{}) (string, error) {
v, err := toFloat64(i)
if err != nil {
return "", err
}
if math.Abs(v) <= 1 || math.IsNaN(v) || math.IsInf(v, 0) { if math.Abs(v) <= 1 || math.IsNaN(v) || math.IsInf(v, 0) {
return fmt.Sprintf("%.4g", v) return fmt.Sprintf("%.4g", v), nil
} }
prefix := "" prefix := ""
for _, p := range []string{"ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi"} { for _, p := range []string{"ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi"} {
@ -156,16 +306,20 @@ func InitTemplateFunc(externalURL *url.URL) {
prefix = p prefix = p
v /= 1024 v /= 1024
} }
return fmt.Sprintf("%.4g%s", v, prefix) return fmt.Sprintf("%.4g%s", v, prefix), nil
}, },
// humanizeDuration converts given seconds to a human readable duration // humanizeDuration converts given seconds to a human readable duration
"humanizeDuration": func(v float64) string { "humanizeDuration": func(i interface{}) (string, error) {
v, err := toFloat64(i)
if err != nil {
return "", err
}
if math.IsNaN(v) || math.IsInf(v, 0) { if math.IsNaN(v) || math.IsInf(v, 0) {
return fmt.Sprintf("%.4g", v) return fmt.Sprintf("%.4g", v), nil
} }
if v == 0 { if v == 0 {
return fmt.Sprintf("%.4gs", v) return fmt.Sprintf("%.4gs", v), nil
} }
if math.Abs(v) >= 1 { if math.Abs(v) >= 1 {
sign := "" sign := ""
@ -179,16 +333,16 @@ func InitTemplateFunc(externalURL *url.URL) {
days := int64(v) / 60 / 60 / 24 days := int64(v) / 60 / 60 / 24
// For days to minutes, we display seconds as an integer. // For days to minutes, we display seconds as an integer.
if days != 0 { if days != 0 {
return fmt.Sprintf("%s%dd %dh %dm %ds", sign, days, hours, minutes, seconds) return fmt.Sprintf("%s%dd %dh %dm %ds", sign, days, hours, minutes, seconds), nil
} }
if hours != 0 { if hours != 0 {
return fmt.Sprintf("%s%dh %dm %ds", sign, hours, minutes, seconds) return fmt.Sprintf("%s%dh %dm %ds", sign, hours, minutes, seconds), nil
} }
if minutes != 0 { if minutes != 0 {
return fmt.Sprintf("%s%dm %ds", sign, minutes, seconds) return fmt.Sprintf("%s%dm %ds", sign, minutes, seconds), nil
} }
// For seconds, we display 4 significant digits. // For seconds, we display 4 significant digits.
return fmt.Sprintf("%s%.4gs", sign, v) return fmt.Sprintf("%s%.4gs", sign, v), nil
} }
prefix := "" prefix := ""
for _, p := range []string{"m", "u", "n", "p", "f", "a", "z", "y"} { for _, p := range []string{"m", "u", "n", "p", "f", "a", "z", "y"} {
@ -198,33 +352,51 @@ func InitTemplateFunc(externalURL *url.URL) {
prefix = p prefix = p
v *= 1000 v *= 1000
} }
return fmt.Sprintf("%.4g%ss", v, prefix) return fmt.Sprintf("%.4g%ss", v, prefix), nil
}, },
// humanizePercentage converts given ratio value to a fraction of 100 // humanizePercentage converts given ratio value to a fraction of 100
"humanizePercentage": func(v float64) string { "humanizePercentage": func(i interface{}) (string, error) {
return fmt.Sprintf("%.4g%%", v*100) v, err := toFloat64(i)
if err != nil {
return "", err
}
return fmt.Sprintf("%.4g%%", v*100), nil
}, },
// humanizeTimestamp converts given timestamp to a human readable time equivalent // humanizeTimestamp converts given timestamp to a human readable time equivalent
"humanizeTimestamp": func(v float64) string { "humanizeTimestamp": func(i interface{}) (string, error) {
v, err := toFloat64(i)
if err != nil {
return "", err
}
if math.IsNaN(v) || math.IsInf(v, 0) { if math.IsNaN(v) || math.IsInf(v, 0) {
return fmt.Sprintf("%.4g", v) return fmt.Sprintf("%.4g", v), nil
} }
t := TimeFromUnixNano(int64(v * 1e9)).Time().UTC() t := TimeFromUnixNano(int64(v * 1e9)).Time().UTC()
return fmt.Sprint(t) return fmt.Sprint(t), nil
}, },
/* URLs */ /* URLs */
// externalURL returns value of `external.url` flag // externalURL returns value of `external.url` flag
"externalURL": func() string { "externalURL": func() string {
return externalURL.String() // the externalURL function is supposed to be substituted at FuncsWithExternalURL().
// it is present here only for validation purposes, before the real
// external URL value is injected.
//
// return an empty string to pass validation with chained functions in template
return ""
}, },
// pathPrefix returns a Path segment from the URL value in `external.url` flag // pathPrefix returns a Path segment from the URL value in `external.url` flag
"pathPrefix": func() string { "pathPrefix": func() string {
return externalURL.Path // pathPrefix function supposed to be substituted at FuncsWithExteralURL().
// it is present here only for validation purposes, when there is no
// provided datasource.
//
// return non-empty slice to pass validation with chained functions in template
return ""
}, },
// pathEscape escapes the string so it can be safely placed inside a URL path segment, // pathEscape escapes the string so it can be safely placed inside a URL path segment,
@ -259,7 +431,7 @@ func InitTemplateFunc(externalURL *url.URL) {
// execute "/api/v1/query?query=foo" request and will return // execute "/api/v1/query?query=foo" request and will return
// the first value in response. // the first value in response.
"query": func(q string) ([]metric, error) { "query": func(q string) ([]metric, error) {
// query function supposed to be substituted at funcsWithQuery(). // query function supposed to be substituted at FuncsWithQuery().
// it is present here only for validation purposes, when there is no // it is present here only for validation purposes, when there is no
// provided datasource. // provided datasource.
// //
@ -316,21 +488,6 @@ func InitTemplateFunc(externalURL *url.URL) {
} }
} }
func funcsWithQuery(query QueryFn) textTpl.FuncMap {
fm := make(textTpl.FuncMap)
for k, fn := range tmplFunc {
fm[k] = fn
}
fm["query"] = func(q string) ([]metric, error) {
result, err := query(q)
if err != nil {
return nil, err
}
return datasourceMetricsToTemplateMetrics(result), nil
}
return fm
}
// Time is the number of milliseconds since the epoch // Time is the number of milliseconds since the epoch
// (1970-01-01 00:00 UTC) excluding leap seconds. // (1970-01-01 00:00 UTC) excluding leap seconds.
type Time int64 type Time int64
@ -355,3 +512,28 @@ const second = int64(time.Second / minimumTick)
func (t Time) Time() time.Time { func (t Time) Time() time.Time {
return time.Unix(int64(t)/second, (int64(t)%second)*nanosPerTick) return time.Unix(int64(t)/second, (int64(t)%second)*nanosPerTick)
} }
func toFloat64(v interface{}) (float64, error) {
switch i := v.(type) {
case float64:
return i, nil
case float32:
return float64(i), nil
case int64:
return float64(i), nil
case int32:
return float64(i), nil
case int:
return float64(i), nil
case uint64:
return float64(i), nil
case uint32:
return float64(i), nil
case uint:
return float64(i), nil
case string:
return strconv.ParseFloat(i, 64)
default:
return 0, fmt.Errorf("unexpected value type %v", i)
}
}
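The switch to `interface{}` arguments means template pipelines can now feed the humanize* functions strings and assorted integer types, not just float64. A standalone illustration using a simplified copy of toFloat64 rather than an import of the vmalert code:

```go
package main

import (
	"fmt"
	"strconv"
)

// simplified copy of toFloat64 covering a few of the supported types
func toFloat64(v interface{}) (float64, error) {
	switch i := v.(type) {
	case float64:
		return i, nil
	case int:
		return float64(i), nil
	case uint64:
		return float64(i), nil
	case string:
		return strconv.ParseFloat(i, 64)
	default:
		return 0, fmt.Errorf("unexpected value type %v", i)
	}
}

func main() {
	for _, v := range []interface{}{1536, "0.25", 3.5} {
		f, err := toFloat64(v)
		fmt.Println(f, err) // 1536 <nil>; 0.25 <nil>; 3.5 <nil>
	}
}
```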

View file

@ -0,0 +1,275 @@
package templates
import (
"strings"
"testing"
textTpl "text/template"
)
func mkTemplate(current, replacement interface{}) textTemplate {
tmpl := textTemplate{}
if current != nil {
switch val := current.(type) {
case string:
tmpl.current = textTpl.Must(newTemplate().Parse(val))
}
}
if replacement != nil {
switch val := replacement.(type) {
case string:
tmpl.replacement = textTpl.Must(newTemplate().Parse(val))
}
}
return tmpl
}
func equalTemplates(tmpls ...*textTpl.Template) bool {
var cmp *textTpl.Template
for i, tmpl := range tmpls {
if i == 0 {
cmp = tmpl
} else {
if cmp == nil || tmpl == nil {
if cmp != tmpl {
return false
}
continue
}
if len(tmpl.Templates()) != len(cmp.Templates()) {
return false
}
for _, t := range tmpl.Templates() {
tp := cmp.Lookup(t.Name())
if tp == nil {
return false
}
if tp.Root.String() != t.Root.String() {
return false
}
}
}
}
return true
}
func TestTemplates_Load(t *testing.T) {
testCases := []struct {
name string
initialTemplate textTemplate
pathPatterns []string
overwrite bool
expectedTemplate textTemplate
expErr string
}{
{
"non existing path undefined template override",
mkTemplate(nil, nil),
[]string{
"templates/non-existing/good-*.tpl",
"templates/absent/good-*.tpl",
},
true,
mkTemplate(``, nil),
"",
},
{
"non existing path defined template override",
mkTemplate(`
{{- define "test.1" -}}
{{- printf "value" -}}
{{- end -}}
`, nil),
[]string{
"templates/non-existing/good-*.tpl",
"templates/absent/good-*.tpl",
},
true,
mkTemplate(``, nil),
"",
},
{
"existing path undefined template override",
mkTemplate(nil, nil),
[]string{
"templates/other/nested/good0-*.tpl",
"templates/test/good0-*.tpl",
},
false,
mkTemplate(`
{{- define "good0-test.tpl" -}}{{- end -}}
{{- define "test.0" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}
{{- define "test.1" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}
{{- define "test.2" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}
{{- define "test.3" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}
`, nil),
"",
},
{
"existing path defined template override",
mkTemplate(`
{{- define "test.1" -}}
{{ printf "Hello %s!" "world" }}
{{- end -}}
`, nil),
[]string{
"templates/other/nested/good0-*.tpl",
"templates/test/good0-*.tpl",
},
false,
mkTemplate(`
{{- define "good0-test.tpl" -}}{{- end -}}
{{- define "test.0" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}
{{- define "test.1" -}}
{{ printf "Hello %s!" "world" }}
{{- end -}}
{{- define "test.2" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}
{{- define "test.3" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}
`, `
{{- define "good0-test.tpl" -}}{{- end -}}
{{- define "test.0" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}
{{- define "test.1" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}
{{- define "test.2" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}
{{- define "test.3" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}
`),
"",
},
{
"load template with syntax error",
mkTemplate(`
{{- define "test.1" -}}
{{ printf "Hello %s!" "world" }}
{{- end -}}
`, nil),
[]string{
"templates/other/nested/bad0-*.tpl",
"templates/test/good0-*.tpl",
},
false,
mkTemplate(`
{{- define "test.1" -}}
{{ printf "Hello %s!" "world" }}
{{- end -}}
`, nil),
"failed to parse template glob",
},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
masterTmpl = tc.initialTemplate
err := Load(tc.pathPatterns, tc.overwrite)
if tc.expErr == "" && err != nil {
t.Errorf("unexpected error: %v", err)
}
if tc.expErr != "" && err == nil {
t.Error("expected error, but got nil")
}
if err != nil && !strings.Contains(err.Error(), tc.expErr) {
t.Errorf("error %q doesn't contain expected substring %q", err, tc.expErr)
}
if !equalTemplates(masterTmpl.replacement, tc.expectedTemplate.replacement) {
t.Fatalf("replacement template is not as expected")
}
if !equalTemplates(masterTmpl.current, tc.expectedTemplate.current) {
t.Fatalf("current template is not as expected")
}
})
}
}
func TestTemplates_Reload(t *testing.T) {
testCases := []struct {
name string
initialTemplate textTemplate
expectedTemplate textTemplate
}{
{
"empty current and replacement templates",
mkTemplate(nil, nil),
mkTemplate(nil, nil),
},
{
"empty current template only",
mkTemplate(`
{{- define "test.1" -}}
{{- printf "value" -}}
{{- end -}}
`, nil),
mkTemplate(`
{{- define "test.1" -}}
{{- printf "value" -}}
{{- end -}}
`, nil),
},
{
"empty replacement template only",
mkTemplate(nil, `
{{- define "test.1" -}}
{{- printf "value" -}}
{{- end -}}
`),
mkTemplate(`
{{- define "test.1" -}}
{{- printf "value" -}}
{{- end -}}
`, nil),
},
{
"defined both templates",
mkTemplate(`
{{- define "test.0" -}}
{{- printf "value" -}}
{{- end -}}
{{- define "test.1" -}}
{{- printf "before" -}}
{{- end -}}
`, `
{{- define "test.1" -}}
{{- printf "after" -}}
{{- end -}}
`),
mkTemplate(`
{{- define "test.1" -}}
{{- printf "after" -}}
{{- end -}}
`, nil),
},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
masterTmpl = tc.initialTemplate
Reload()
if !equalTemplates(masterTmpl.replacement, tc.expectedTemplate.replacement) {
t.Fatalf("replacement template is not as expected")
}
if !equalTemplates(masterTmpl.current, tc.expectedTemplate.current) {
t.Fatalf("current template is not as expected")
}
})
}
}

View file

@ -0,0 +1,3 @@
{{- define "test.1" -}}
{{ printf "Hello %s!" externalURL" }}
{{- end -}}

View file

@ -0,0 +1,9 @@
{{- define "test.1" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}
{{- define "test.0" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}
{{- define "test.3" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}

View file

@ -0,0 +1,9 @@
{{- define "test.2" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}
{{- define "test.0" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}
{{- define "test.3" -}}
{{ printf "Hello %s!" externalURL }}
{{- end -}}

View file

@ -69,7 +69,7 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
case "/alerts": case "/alerts":
WriteListAlerts(w, pathPrefix, rh.groupAlerts()) WriteListAlerts(w, pathPrefix, rh.groupAlerts())
return true return true
case "/groups": case "/groups", "/rules":
WriteListGroups(w, rh.groups()) WriteListGroups(w, rh.groups())
return true return true
case "/notifiers": case "/notifiers":

View file

@ -139,3 +139,122 @@ info app/vmbackupmanager/retention.go:106 daily backups to delete [daily/2
The result on the GCS bucket. We see only 3 daily backups: The result on the GCS bucket. We see only 3 daily backups:
![daily](vmbackupmanager_rp_daily_2.png) ![daily](vmbackupmanager_rp_daily_2.png)
## Configuration
### Flags
Pass `-help` to `vmbackupmanager` in order to see the full list of supported
command-line flags with their descriptions.
The shortlist of configuration flags is the following:
```
vmbackupmanager performs regular backups according to the provided configs.
-concurrency int
The number of concurrent workers. Higher concurrency may reduce backup duration (default 10)
-configFilePath string
Path to file with S3 configs. Configs are loaded from default location if not set.
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-configProfile string
Profile name for S3 configs. If not set, the value of the environment variable will be loaded (AWS_PROFILE or AWS_DEFAULT_PROFILE), or if both are not set, DefaultSharedConfigProfile is used
-credsFilePath string
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
-customS3Endpoint string
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
-disableDaily
Disable daily run. Default false
-disableHourly
Disable hourly run. Default false
-disableMonthly
Disable monthly run. Default false
-disableWeekly
Disable weekly run. Default false
-dst string
The root folder of Victoria Metrics backups. Example: gs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-eula
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
-httpListenAddr string
Address to listen for http connections (default ":8300")
-keepLastDaily int
Keep last N daily backups. If 0 is specified next retention cycle removes all backups for given time period. (default -1)
-keepLastHourly int
Keep last N hourly backups. If 0 is specified next retention cycle removes all backups for given time period. (default -1)
-keepLastMonthly int
Keep last N monthly backups. If 0 is specified next retention cycle removes all backups for given time period. (default -1)
-keepLastWeekly int
Keep last N weekly backups. If 0 is specified next retention cycle removes all backups for given time period. (default -1)
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
-loggerLevel string
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
-loggerOutput string
Output for the logs. Supported values: stderr, stdout (default "stderr")
-loggerTimezone string
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-maxBytesPerSecond int
The maximum upload speed. There is no limit if it is set to 0
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings
-pprofAuthKey string
Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings
-runOnStart
Upload backups immediately after start of the service. Otherwise the backup starts at the beginning of the next hour
-s3ForcePathStyle
Prefixing endpoint with bucket name when set to false, true by default. (default true)
-snapshot.createURL string
VictoriaMetrics create snapshot URL. When this is given, a snapshot will automatically be created during backup. Example: http://victoriametrics:8428/snapshot/create
-snapshot.deleteURL string
VictoriaMetrics delete snapshot URL. Optional. Will be generated from snapshot.createURL if not provided. All created snapshots will be automatically deleted. Example: http://victoriametrics:8428/snapshot/delete
-storageDataPath string
Path to VictoriaMetrics data. Must match -storageDataPath from VictoriaMetrics or vmstorage (default "victoria-metrics-data")
-tls
Whether to enable TLS for incoming HTTP requests at -httpListenAddr (aka https). -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated
-tlsCipherSuites array
Optional list of TLS cipher suites for incoming requests over HTTPS if -tls is set. See the list of supported cipher suites at https://pkg.go.dev/crypto/tls#pkg-constants
Supports an array of values separated by comma or specified via multiple flags.
-tlsKeyFile string
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
-version
Show VictoriaMetrics version
```
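
For example, a typical invocation combining the snapshot, destination and retention flags might look as follows (all paths, bucket names and URLs are illustrative):

```
./vmbackupmanager \
  -eula \
  -storageDataPath=/victoria-metrics-data \
  -snapshot.createURL=http://victoriametrics:8428/snapshot/create \
  -dst=gs://bucket/path/to/backup/dir \
  -credsFilePath=/path/to/creds.json \
  -keepLastHourly=24 -keepLastDaily=7 -keepLastWeekly=5 -keepLastMonthly=12
```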

View file

@ -16,7 +16,7 @@ To see the full list of supported modes
run the following command: run the following command:
```bash ```bash
./vmctl --help $ ./vmctl --help
NAME: NAME:
vmctl - VictoriaMetrics command-line tool vmctl - VictoriaMetrics command-line tool
@ -35,7 +35,7 @@ Each mode has its own unique set of flags specific (e.g. prefixed with `influx`
to the data source and common list of flags for destination (prefixed with `vm` for VictoriaMetrics): to the data source and common list of flags for destination (prefixed with `vm` for VictoriaMetrics):
``` ```
./vmctl influx --help $ ./vmctl influx --help
OPTIONS: OPTIONS:
--influx-addr value InfluxDB server addr (default: "http://localhost:8086") --influx-addr value InfluxDB server addr (default: "http://localhost:8086")
--influx-user value InfluxDB user [$INFLUX_USERNAME] --influx-user value InfluxDB user [$INFLUX_USERNAME]
@ -55,7 +55,7 @@ them below in corresponding sections.
For the destination flags see the full description by running the following command: For the destination flags see the full description by running the following command:
``` ```
./vmctl influx --help | grep vm- $ ./vmctl influx --help | grep vm-
``` ```
Some flags like [--vm-extra-label](#adding-extra-labels) or [--vm-significant-figures](#significant-figures) Some flags like [--vm-extra-label](#adding-extra-labels) or [--vm-significant-figures](#significant-figures)
@ -77,11 +77,11 @@ forget to specify the `--vm-account-id` flag. See more details for cluster versi
See `./vmctl opentsdb --help` for details and full list of flags. See `./vmctl opentsdb --help` for details and full list of flags.
*OpenTSDB migration is not possible without a functioning [meta](http://opentsdb.net/docs/build/html/user_guide/metadata.html) table to search for metrics/series.* **Important:** OpenTSDB migration is not possible without a functioning [meta](http://opentsdb.net/docs/build/html/user_guide/metadata.html) table to search for metrics/series. Check in the OpenTSDB config that the appropriate options are [activated](https://github.com/OpenTSDB/opentsdb/issues/681#issuecomment-177359563) and that the HBase meta tables are present. Without them the migration won't work.
OpenTSDB migration works like so: OpenTSDB migration works like so:
1. Find metrics based on selected filters (or the default filter set ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']) 1. Find metrics based on selected filters (or the default filter set `['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']`)
- e.g. `curl -Ss "http://opentsdb:4242/api/suggest?type=metrics&q=sys"` - e.g. `curl -Ss "http://opentsdb:4242/api/suggest?type=metrics&q=sys"`
@ -89,9 +89,11 @@ OpenTSDB migration works like so:
- e.g. `curl -Ss "http://opentsdb:4242/api/search/lookup?m=system.load5&limit=1000000"` - e.g. `curl -Ss "http://opentsdb:4242/api/search/lookup?m=system.load5&limit=1000000"`
Here the `results` field in the response should not be empty. Otherwise it means that the meta tables are absent and need to be enabled first.
3. Download data for each series in chunks defined in the CLI switches 3. Download data for each series in chunks defined in the CLI switches
- e.g. `-retention=sum-1m-avg:1h:90d` == - e.g. `-retention=sum-1m-avg:1h:90d` means
- `curl -Ss "http://opentsdb:4242/api/query?start=1h-ago&end=now&m=sum:1m-avg-none:system.load5\{host=host1\}"` - `curl -Ss "http://opentsdb:4242/api/query?start=1h-ago&end=now&m=sum:1m-avg-none:system.load5\{host=host1\}"`
- `curl -Ss "http://opentsdb:4242/api/query?start=2h-ago&end=1h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"` - `curl -Ss "http://opentsdb:4242/api/query?start=2h-ago&end=1h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"`
- `curl -Ss "http://opentsdb:4242/api/query?start=3h-ago&end=2h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"` - `curl -Ss "http://opentsdb:4242/api/query?start=3h-ago&end=2h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"`
@ -101,7 +103,7 @@ OpenTSDB migration works like so:
This means that we must stream data from OpenTSDB to VictoriaMetrics in chunks. This is where concurrency for OpenTSDB comes in. We can query multiple chunks at once, but we shouldn't perform too many chunks at a time to avoid overloading the OpenTSDB cluster. This means that we must stream data from OpenTSDB to VictoriaMetrics in chunks. This is where concurrency for OpenTSDB comes in. We can query multiple chunks at once, but we shouldn't perform too many chunks at a time to avoid overloading the OpenTSDB cluster.
``` ```
$ bin/vmctl opentsdb --otsdb-addr http://opentsdb:4242/ --otsdb-retentions sum-1m-avg:1h:1d --otsdb-filters system --otsdb-normalize --vm-addr http://victoria/ $ ./vmctl opentsdb --otsdb-addr http://opentsdb:4242/ --otsdb-retentions sum-1m-avg:1h:1d --otsdb-filters system --otsdb-normalize --vm-addr http://victoria:8428/
OpenTSDB import mode OpenTSDB import mode
2021/04/09 11:52:50 Will collect data starting at TS 1617990770 2021/04/09 11:52:50 Will collect data starting at TS 1617990770
2021/04/09 11:52:50 Loading all metrics from OpenTSDB for filters: [system] 2021/04/09 11:52:50 Loading all metrics from OpenTSDB for filters: [system]
@ -109,6 +111,14 @@ Found 9 metrics to import. Continue? [Y/n]
2021/04/09 11:52:51 Starting work on system.load1 2021/04/09 11:52:51 Starting work on system.load1
23 / 402200 [>____________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________] 0.01% 2 p/s 23 / 402200 [>____________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________] 0.01% 2 p/s
``` ```
Where `:8428` is the Prometheus-compatible port of VictoriaMetrics.
For a clustered VictoriaMetrics setup the `--vm-account-id` flag needs to be added, for example:
```
$ ./vmctl opentsdb --otsdb-addr http://opentsdb:4242/ --otsdb-retentions sum-1m-avg:1h:1d --otsdb-filters system --otsdb-normalize --vm-addr http://victoria:8480/ --vm-account-id 0
```
This time `:8480` is the vminsert port, which accepts Prometheus-compatible input.
### Retention strings ### Retention strings
View file
@ -3,7 +3,7 @@
// altogether. // altogether.
package barpool package barpool
import "github.com/cheggaaa/pb/v3" import "github.com/dmitryk-dk/pb/v3"
var pool = pb.NewPool() var pool = pb.NewPool()
View file
@ -202,6 +202,7 @@ const (
influxFilterTimeEnd = "influx-filter-time-end" influxFilterTimeEnd = "influx-filter-time-end"
influxMeasurementFieldSeparator = "influx-measurement-field-separator" influxMeasurementFieldSeparator = "influx-measurement-field-separator"
influxSkipDatabaseLabel = "influx-skip-database-label" influxSkipDatabaseLabel = "influx-skip-database-label"
influxPrometheusMode = "influx-prometheus-mode"
) )
var ( var (
@ -264,6 +265,11 @@ var (
Usage: "Wether to skip adding the label 'db' to timeseries.", Usage: "Wether to skip adding the label 'db' to timeseries.",
Value: false, Value: false,
}, },
&cli.BoolFlag{
Name: influxPrometheusMode,
Usage: "Wether to restore the original timeseries name previously written from Prometheus to InfluxDB v1 via remote_write.",
Value: false,
},
} }
) )
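For illustration, a hedged sketch of how the new flag might be invoked from the command line; the addresses are placeholders and `--influx-database` is assumed to be the usual vmctl flag for selecting the source database:

```
# Migrate data from InfluxDB v1, restoring the original Prometheus
# metric names that were written via remote_write:
$ ./vmctl influx --influx-addr http://influxdb:8086 \
    --influx-database prometheus \
    --influx-prometheus-mode \
    --vm-addr http://victoriametrics:8428
```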
View file
@ -17,9 +17,10 @@ type influxProcessor struct {
cc int cc int
separator string separator string
skipDbLabel bool skipDbLabel bool
promMode bool
} }
func newInfluxProcessor(ic *influx.Client, im *vm.Importer, cc int, separator string, skipDbLabel bool) *influxProcessor { func newInfluxProcessor(ic *influx.Client, im *vm.Importer, cc int, separator string, skipDbLabel bool, promMode bool) *influxProcessor {
if cc < 1 { if cc < 1 {
cc = 1 cc = 1
} }
@ -29,6 +30,7 @@ func newInfluxProcessor(ic *influx.Client, im *vm.Importer, cc int, separator st
cc: cc, cc: cc,
separator: separator, separator: separator,
skipDbLabel: skipDbLabel, skipDbLabel: skipDbLabel,
promMode: promMode,
} }
} }
@ -101,6 +103,8 @@ func (ip *influxProcessor) run(silent, verbose bool) error {
} }
const dbLabel = "db" const dbLabel = "db"
const nameLabel = "__name__"
const valueField = "value"
func (ip *influxProcessor) do(s *influx.Series) error { func (ip *influxProcessor) do(s *influx.Series) error {
cr, err := ip.ic.FetchDataPoints(s) cr, err := ip.ic.FetchDataPoints(s)
@ -122,6 +126,8 @@ func (ip *influxProcessor) do(s *influx.Series) error {
for i, lp := range s.LabelPairs { for i, lp := range s.LabelPairs {
if lp.Name == dbLabel { if lp.Name == dbLabel {
containsDBLabel = true containsDBLabel = true
} else if lp.Name == nameLabel && s.Field == valueField && ip.promMode {
name = lp.Value
} }
labels[i] = vm.LabelPair{ labels[i] = vm.LabelPair{
Name: lp.Name, Name: lp.Name,
View file
@ -105,7 +105,8 @@ func main() {
importer, importer,
c.Int(influxConcurrency), c.Int(influxConcurrency),
c.String(influxMeasurementFieldSeparator), c.String(influxMeasurementFieldSeparator),
c.Bool(influxSkipDatabaseLabel)) c.Bool(influxSkipDatabaseLabel),
c.Bool(influxPrometheusMode))
return processor.run(c.Bool(globalSilent), c.Bool(globalVerbose)) return processor.run(c.Bool(globalSilent), c.Bool(globalVerbose))
}, },
}, },
View file
@ -8,7 +8,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/opentsdb" "github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/opentsdb"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/vm" "github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/vm"
"github.com/cheggaaa/pb/v3" "github.com/dmitryk-dk/pb/v3"
) )
type otsdbProcessor struct { type otsdbProcessor struct {
@ -67,7 +67,7 @@ func (op *otsdbProcessor) run(silent, verbose bool) error {
queryRanges += len(rt.QueryRanges) queryRanges += len(rt.QueryRanges)
} }
for _, metric := range metrics { for _, metric := range metrics {
log.Println(fmt.Sprintf("Starting work on %s", metric)) log.Printf("Starting work on %s", metric)
serieslist, err := op.oc.FindSeries(metric) serieslist, err := op.oc.FindSeries(metric)
if err != nil { if err != nil {
return fmt.Errorf("couldn't retrieve series list for %s : %s", metric, err) return fmt.Errorf("couldn't retrieve series list for %s : %s", metric, err)
View file
@ -196,7 +196,7 @@ func (c Client) GetData(series Meta, rt RetentionMeta, start int64, end int64, m
3. bad format of response body 3. bad format of response body
*/ */
if resp.StatusCode != 200 { if resp.StatusCode != 200 {
log.Println(fmt.Sprintf("bad response code from OpenTSDB query %v for %q...skipping", resp.StatusCode, q)) log.Printf("bad response code from OpenTSDB query %v for %q...skipping", resp.StatusCode, q)
return Metric{}, nil return Metric{}, nil
} }
defer func() { _ = resp.Body.Close() }() defer func() { _ = resp.Body.Close() }()
@ -208,7 +208,7 @@ func (c Client) GetData(series Meta, rt RetentionMeta, start int64, end int64, m
var output []OtsdbMetric var output []OtsdbMetric
err = json.Unmarshal(body, &output) err = json.Unmarshal(body, &output)
if err != nil { if err != nil {
log.Println(fmt.Sprintf("couldn't marshall response body from OpenTSDB query (%s)...skipping", body)) log.Printf("couldn't marshall response body from OpenTSDB query (%s)...skipping", body)
return Metric{}, nil return Metric{}, nil
} }
/* /*
@ -309,7 +309,7 @@ func NewClient(cfg Config) (*Client, error) {
*/ */
offsetPrint = offsetPrint - offsetSecs offsetPrint = offsetPrint - offsetSecs
} }
log.Println(fmt.Sprintf("Will collect data starting at TS %v", offsetPrint)) log.Printf("Will collect data starting at TS %v", offsetPrint)
for _, r := range cfg.Retentions { for _, r := range cfg.Retentions {
ret, err := convertRetention(r, offsetSecs, cfg.MsecsTime) ret, err := convertRetention(r, offsetSecs, cfg.MsecsTime)
if err != nil { if err != nil {
View file
@ -13,11 +13,10 @@ import (
"sync" "sync"
"time" "time"
"github.com/cheggaaa/pb/v3"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/barpool" "github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/barpool"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/limiter" "github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/limiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal" "github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/dmitryk-dk/pb/v3"
) )
// Config contains list of params to configure // Config contains list of params to configure
View file
@ -36,7 +36,7 @@ import (
var ( var (
graphiteListenAddr = flag.String("graphiteListenAddr", "", "TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty") graphiteListenAddr = flag.String("graphiteListenAddr", "", "TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty")
influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. "+ influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for InfluxDB line protocol data. Usually :8089 must be set. Doesn't work if empty. "+
"This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write") "This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write")
opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpentTSDB metrics. "+ opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpentTSDB metrics. "+
"Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. "+ "Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. "+
View file
@ -2069,7 +2069,7 @@ func TestExecSuccess(t *testing.T) {
t.Parallel() t.Parallel()
q := `with ( q := `with (
x = ( x = (
label_set(time(), "foo", "123.456", "__name__", "aaa"), label_set(time() > 1500, "foo", "123.456", "__name__", "aaa"),
label_set(-time(), "foo", "bar", "__name__", "bbb"), label_set(-time(), "foo", "bar", "__name__", "bbb"),
label_set(-time(), "__name__", "bxs"), label_set(-time(), "__name__", "bxs"),
label_set(-time(), "foo", "45", "bar", "xs"), label_set(-time(), "foo", "45", "bar", "xs"),
@ -2093,7 +2093,7 @@ func TestExecSuccess(t *testing.T) {
} }
r2 := netstorage.Result{ r2 := netstorage.Result{
MetricName: metricNameExpected, MetricName: metricNameExpected,
Values: []float64{1123.456, 1323.456, 1523.456, 1723.456, 1923.456, 2123.456}, Values: []float64{nan, nan, nan, 1723.456, 1923.456, 2123.456},
Timestamps: timestampsExpected, Timestamps: timestampsExpected,
} }
r2.MetricName.Tags = []storage.Tag{ r2.MetricName.Tags = []storage.Tag{
View file
@ -1715,8 +1715,10 @@ func transformLabelValue(tfa *transformFuncArg) ([]*timeseries, error) {
v = nan v = nan
} }
values := ts.Values values := ts.Values
for i := range values { for i, vOrig := range values {
values[i] = v if !math.IsNaN(vOrig) {
values[i] = v
}
} }
} }
// Do not remove timeseries with only NaN values, so `default` could be applied to them: // Do not remove timeseries with only NaN values, so `default` could be applied to them:
View file
@ -1,14 +1,12 @@
{ {
"files": { "files": {
"main.css": "./static/css/main.d8362c27.css", "main.css": "./static/css/main.d8362c27.css",
"main.js": "./static/js/main.f64c8675.js", "main.js": "./static/js/main.a54e3212.js",
"static/js/362.1f16598a.chunk.js": "./static/js/362.1f16598a.chunk.js",
"static/js/27.939f971b.chunk.js": "./static/js/27.939f971b.chunk.js", "static/js/27.939f971b.chunk.js": "./static/js/27.939f971b.chunk.js",
"static/media/README.md": "./static/media/README.40ebc3a1f4adae949154.md",
"index.html": "./index.html" "index.html": "./index.html"
}, },
"entrypoints": [ "entrypoints": [
"static/css/main.d8362c27.css", "static/css/main.d8362c27.css",
"static/js/main.f64c8675.js" "static/js/main.a54e3212.js"
] ]
} }
View file
@ -1,3 +1,8 @@
### Setup
1. Create `.json` config file in a folder `dashboards`
2. Import your config file into the `dashboards/index.js`
3. Add imported variable into the array `window.__VMUI_PREDEFINED_DASHBOARDS__`
### Configuration options ### Configuration options
<br/> <br/>
View file
@ -0,0 +1,5 @@
import perJob from "./perJobUsage.json" assert { type: "json" };
window.__VMUI_PREDEFINED_DASHBOARDS__ = [
perJob
];
View file
@ -17,12 +17,12 @@
"title": "Per-job disk read", "title": "Per-job disk read",
"width": 6, "width": 6,
"expr": ["sum(rate(process_io_storage_read_bytes_total)) by (job)"] "expr": ["sum(rate(process_io_storage_read_bytes_total)) by (job)"]
},{ },
{
"title": "Per-job disk write", "title": "Per-job disk write",
"width": 6, "width": 6,
"expr": ["sum(rate(process_io_storage_written_bytes_total)) by (job)"] "expr": ["sum(rate(process_io_storage_written_bytes_total)) by (job)"]
} }
] ]
} }
] ]

View file
<!doctype html><html lang="en"><head><meta charset="utf-8"/><link rel="icon" href="./favicon.ico"/><meta name="viewport" content="width=device-width,initial-scale=1"/><meta name="theme-color" content="#000000"/><meta name="description" content="VM-UI is a metric explorer for Victoria Metrics"/><link rel="apple-touch-icon" href="./apple-touch-icon.png"/><link rel="icon" type="image/png" sizes="32x32" href="./favicon-32x32.png"><link rel="manifest" href="./manifest.json"/><title>VM UI</title><link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Roboto:300,400,500,700&display=swap"/><script defer="defer" src="./static/js/main.f64c8675.js"></script><link href="./static/css/main.d8362c27.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html> <!doctype html><html lang="en"><head><meta charset="utf-8"/><link href="./favicon.ico" rel="icon"/><meta content="width=device-width,initial-scale=1" name="viewport"/><meta content="#000000" name="theme-color"/><meta content="VM-UI is a metric explorer for Victoria Metrics" name="description"/><link href="./apple-touch-icon.png" rel="apple-touch-icon"/><link href="./favicon-32x32.png" rel="icon" sizes="32x32" type="image/png"><link href="./manifest.json" rel="manifest"/><title>VM UI</title><link href="https://fonts.googleapis.com/css?family=Roboto:300,400,500,700&display=swap" rel="stylesheet"/><script src="./dashboards/index.js" type="module"></script><script defer="defer" src="./static/js/main.a54e3212.js"></script><link href="./static/css/main.d8362c27.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>
View file
@ -1 +0,0 @@
"use strict";(self.webpackChunkvmui=self.webpackChunkvmui||[]).push([[362],{8362:function(e,a,s){e.exports=s.p+"static/media/README.40ebc3a1f4adae949154.md"}}]);
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because it is too large
View file
@ -37,7 +37,6 @@
"start": "react-app-rewired start", "start": "react-app-rewired start",
"build": "GENERATE_SOURCEMAP=false react-app-rewired build", "build": "GENERATE_SOURCEMAP=false react-app-rewired build",
"test": "react-app-rewired test", "test": "react-app-rewired test",
"eject": "react-scripts eject",
"lint": "eslint src --ext tsx,ts", "lint": "eslint src --ext tsx,ts",
"lint:fix": "eslint src --ext tsx,ts --fix" "lint:fix": "eslint src --ext tsx,ts --fix"
}, },
@ -66,5 +65,10 @@
"customize-cra": "^1.0.0", "customize-cra": "^1.0.0",
"eslint-plugin-react": "^7.29.4", "eslint-plugin-react": "^7.29.4",
"react-app-rewired": "^2.2.1" "react-app-rewired": "^2.2.1"
},
"overrides": {
"react-app-rewired": {
"nth-check": "^2.0.1"
}
} }
} }
View file
@ -1,3 +1,8 @@
### Setup
1. Create `.json` config file in a folder `dashboards`
2. Import your config file into the `dashboards/index.js`
3. Add imported variable into the array `window.__VMUI_PREDEFINED_DASHBOARDS__`
### Configuration options ### Configuration options
<br/> <br/>

View file
import perJob from "./perJobUsage.json" assert { type: "json" };
window.__VMUI_PREDEFINED_DASHBOARDS__ = [
perJob
];
View file
@ -0,0 +1,29 @@
{
"title": "per-job resource usage",
"rows": [
{
"panels": [
{
"title": "Per-job CPU usage",
"width": 6,
"expr": ["sum(rate(process_cpu_seconds_total)) by (job)"]
},
{
"title": "Per-job RSS usage",
"width": 6,
"expr": ["sum(process_resident_memory_bytes) by (job)"]
},
{
"title": "Per-job disk read",
"width": 6,
"expr": ["sum(rate(process_io_storage_read_bytes_total)) by (job)"]
},
{
"title": "Per-job disk write",
"width": 6,
"expr": ["sum(rate(process_io_storage_written_bytes_total)) by (job)"]
}
]
}
]
}
View file
@ -27,6 +27,7 @@
--> -->
<title>VM UI</title> <title>VM UI</title>
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Roboto:300,400,500,700&display=swap" /> <link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Roboto:300,400,500,700&display=swap" />
<script src="%PUBLIC_URL%/dashboards/index.js" type="module"></script>
</head> </head>
<body> <body>
<noscript>You need to enable JavaScript to run this app.</noscript> <noscript>You need to enable JavaScript to run this app.</noscript>
View file
@ -19,29 +19,29 @@ import DashboardsLayout from "./components/PredefinedPanels/DashboardsLayout";
const App: FC = () => { const App: FC = () => {
return <> return <>
<CssBaseline /> {/* CSS Baseline: kind of normalize.css made by materialUI team - can be scoped */} <HashRouter>
<LocalizationProvider dateAdapter={DayjsUtils}> {/* Allows datepicker to work with DayJS */} <CssBaseline /> {/* CSS Baseline: kind of normalize.css made by materialUI team - can be scoped */}
<StyledEngineProvider injectFirst> <LocalizationProvider dateAdapter={DayjsUtils}> {/* Allows datepicker to work with DayJS */}
<ThemeProvider theme={THEME}> {/* Material UI theme customization */} <StyledEngineProvider injectFirst>
<StateProvider> {/* Serialized into query string, common app settings */} <ThemeProvider theme={THEME}> {/* Material UI theme customization */}
<AuthStateProvider> {/* Auth related info - optionally persisted to Local Storage */} <StateProvider> {/* Serialized into query string, common app settings */}
<GraphStateProvider> {/* Graph settings */} <AuthStateProvider> {/* Auth related info - optionally persisted to Local Storage */}
<SnackbarProvider> {/* Display various snackbars */} <GraphStateProvider> {/* Graph settings */}
<HashRouter> <SnackbarProvider> {/* Display various snackbars */}
<Routes> <Routes>
<Route path={"/"} element={<HomeLayout/>}> <Route path={"/"} element={<HomeLayout/>}>
<Route path={router.home} element={<CustomPanel/>}/> <Route path={router.home} element={<CustomPanel/>}/>
<Route path={router.dashboards} element={<DashboardsLayout/>}/> <Route path={router.dashboards} element={<DashboardsLayout/>}/>
</Route> </Route>
</Routes> </Routes>
</HashRouter> </SnackbarProvider>
</SnackbarProvider> </GraphStateProvider>
</GraphStateProvider> </AuthStateProvider>
</AuthStateProvider> </StateProvider>
</StateProvider> </ThemeProvider>
</ThemeProvider> </StyledEngineProvider>
</StyledEngineProvider> </LocalizationProvider>
</LocalizationProvider> </HashRouter>
</>; </>;
}; };
View file
@ -9,10 +9,10 @@ import {SyntheticEvent} from "react";
export type DisplayType = "table" | "chart" | "code"; export type DisplayType = "table" | "chart" | "code";
const tabs = [ export const displayTypeTabs = [
{value: "chart", icon: <ShowChartIcon/>, label: "Graph"}, {value: "chart", icon: <ShowChartIcon/>, label: "Graph", prometheusCode: 0},
{value: "code", icon: <CodeIcon/>, label: "JSON"}, {value: "code", icon: <CodeIcon/>, label: "JSON"},
{value: "table", icon: <TableChartIcon/>, label: "Table"} {value: "table", icon: <TableChartIcon/>, label: "Table", prometheusCode: 1}
]; ];
export const DisplayTypeSwitch: FC = () => { export const DisplayTypeSwitch: FC = () => {
@ -29,7 +29,7 @@ export const DisplayTypeSwitch: FC = () => {
onChange={handleChange} onChange={handleChange}
sx={{minHeight: "0", marginBottom: "-1px"}} sx={{minHeight: "0", marginBottom: "-1px"}}
> >
{tabs.map(t => {displayTypeTabs.map(t =>
<Tab key={t.value} <Tab key={t.value}
icon={t.icon} icon={t.icon}
iconPosition="start" iconPosition="start"
View file
@ -20,7 +20,7 @@ const DashboardLayout: FC = () => {
}, [dashboards, tab]); }, [dashboards, tab]);
useEffect(() => { useEffect(() => {
getDashboardSettings().then(d => d.length && setDashboards(d)); setDashboards(getDashboardSettings());
}, []); }, []);
return <> return <>
View file
@ -1,14 +1,6 @@
import {DashboardSettings} from "../../types"; import {DashboardSettings} from "../../types";
const importModule = async (filename: string) => { export default (): DashboardSettings[] => {
const module = await import(`../../dashboards/${filename}`); return window.__VMUI_PREDEFINED_DASHBOARDS__ || [];
module.default.filename = filename;
return module.default as DashboardSettings;
};
export default async () => {
const context = require.context("../../dashboards", true, /\.json$/);
const filenames = context.keys().map(r => r.replace("./", ""));
return await Promise.all(filenames.map(async f => importModule(f)));
}; };
View file
@ -8,6 +8,8 @@ import {getAppModeEnable, getAppModeParams} from "../utils/app-mode";
import throttle from "lodash.throttle"; import throttle from "lodash.throttle";
import {DisplayType} from "../components/CustomPanel/Configurator/DisplayTypeSwitch"; import {DisplayType} from "../components/CustomPanel/Configurator/DisplayTypeSwitch";
import {CustomStep} from "../state/graph/reducer"; import {CustomStep} from "../state/graph/reducer";
import usePrevious from "./usePrevious";
import {arrayEquals} from "../utils/array";
interface FetchQueryParams { interface FetchQueryParams {
predefinedQuery?: string[] predefinedQuery?: string[]
@ -48,7 +50,6 @@ export const useFetchQuery = ({predefinedQuery, visible, display, customStep}: F
const controller = new AbortController(); const controller = new AbortController();
setFetchQueue([...fetchQueue, controller]); setFetchQueue([...fetchQueue, controller]);
setIsLoading(true); setIsLoading(true);
try { try {
const responses = await Promise.all(fetchUrl.map(url => fetch(url, {signal: controller.signal}))); const responses = await Promise.all(fetchUrl.map(url => fetch(url, {signal: controller.signal})));
const tempData = []; const tempData = [];
@ -114,12 +115,14 @@ export const useFetchQuery = ({predefinedQuery, visible, display, customStep}: F
}, },
[serverUrl, period, displayType, customStep]); [serverUrl, period, displayType, customStep]);
const prevFetchUrl = usePrevious(fetchUrl);
useEffect(() => { useEffect(() => {
fetchOptions(); fetchOptions();
}, [serverUrl]); }, [serverUrl]);
useEffect(() => { useEffect(() => {
if (!visible) return; if (!visible || (fetchUrl && prevFetchUrl && arrayEquals(fetchUrl, prevFetchUrl))) return;
throttledFetchData(fetchUrl, fetchQueue, (display || displayType)); throttledFetchData(fetchUrl, fetchQueue, (display || displayType));
}, [fetchUrl, visible]); }, [fetchUrl, visible]);
View file
@ -0,0 +1,10 @@
import { useRef, useEffect } from "react";
export default (value: any) => {
const ref = useRef();
useEffect(() => {
ref.current = value;
}, [value]);
return ref.current;
View file

@ -2,6 +2,7 @@ import React, {createContext, FC, useContext, useEffect, useMemo, useReducer} fr
import {Action, AppState, initialState, reducer} from "./reducer"; import {Action, AppState, initialState, reducer} from "./reducer";
import {getQueryStringValue, setQueryStringValue} from "../../utils/query-string"; import {getQueryStringValue, setQueryStringValue} from "../../utils/query-string";
import {Dispatch} from "react"; import {Dispatch} from "react";
import {useLocation} from "react-router-dom";
type StateContextType = { state: AppState, dispatch: Dispatch<Action> }; type StateContextType = { state: AppState, dispatch: Dispatch<Action> };
@ -17,12 +18,13 @@ export const initialPrepopulatedState = Object.entries(initialState)
}), {}) as AppState; }), {}) as AppState;
export const StateProvider: FC = ({children}) => { export const StateProvider: FC = ({children}) => {
const location = useLocation();
const [state, dispatch] = useReducer(reducer, initialPrepopulatedState); const [state, dispatch] = useReducer(reducer, initialPrepopulatedState);
useEffect(() => { useEffect(() => {
setQueryStringValue(state as unknown as Record<string, unknown>); setQueryStringValue(state as unknown as Record<string, unknown>);
}, [state]); }, [state, location]);
const contextValue = useMemo(() => { const contextValue = useMemo(() => {
return { state, dispatch }; return { state, dispatch };
View file
@ -1,5 +1,5 @@
/* eslint max-lines: 0 */ /* eslint max-lines: 0 */
import {DisplayType} from "../../components/CustomPanel/Configurator/DisplayTypeSwitch"; import {DisplayType, displayTypeTabs} from "../../components/CustomPanel/Configurator/DisplayTypeSwitch";
import {TimeParams, TimePeriod} from "../../types"; import {TimeParams, TimePeriod} from "../../types";
import { import {
dateFromSeconds, dateFromSeconds,
@ -62,10 +62,12 @@ const {duration, endInput, relativeTimeId} = getRelativeTime({
defaultEndInput: new Date(formatDateToLocal(getQueryStringValue("g0.end_input", getDateNowUTC()) as Date)), defaultEndInput: new Date(formatDateToLocal(getQueryStringValue("g0.end_input", getDateNowUTC()) as Date)),
}); });
const query = getQueryArray(); const query = getQueryArray();
const queryTab = getQueryStringValue("g0.tab", 0);
const displayType = displayTypeTabs.find(t => t.prometheusCode === queryTab || t.value === queryTab);
export const initialState: AppState = { export const initialState: AppState = {
serverUrl: getDefaultServer(), serverUrl: getDefaultServer(),
displayType: getQueryStringValue("g0.tab", "chart") as DisplayType || "chart", displayType: (displayType?.value || "chart") as DisplayType,
query: query, // demo_memory_usage_bytes query: query, // demo_memory_usage_bytes
queryHistory: query.map(q => ({index: 0, values: [q]})), queryHistory: query.map(q => ({index: 0, values: [q]})),
time: { time: {
View file
@ -1,5 +1,11 @@
import {MetricBase} from "../api/types"; import {MetricBase} from "../api/types";
declare global {
interface Window {
__VMUI_PREDEFINED_DASHBOARDS__: DashboardSettings[];
}
}
export interface TimeParams { export interface TimeParams {
start: number; // timestamp in seconds start: number; // timestamp in seconds
View file

@ -0,0 +1,4 @@
export const arrayEquals = (a: (string|number)[], b: (string|number)[]) => {
return a.length === b.length && a.every((val, index) => val === b[index]);
};
View file
@ -1,12 +1,18 @@
import qs from "qs"; import qs from "qs";
import get from "lodash.get"; import get from "lodash.get";
import router from "../router";
const stateToUrlParams = { const graphStateToUrlParams = {
"time.duration": "range_input", "time.duration": "range_input",
"time.period.date": "end_input", "time.period.date": "end_input",
"time.period.step": "step_input", "time.period.step": "step_input",
"time.relativeTime": "relative_time", "time.relativeTime": "relative_time",
"displayType": "tab" "displayType": "tab",
};
const stateToUrlParams = {
[router.home]: graphStateToUrlParams,
[router.dashboards]: graphStateToUrlParams,
}; };
// TODO need function for detect types. // TODO need function for detect types.
@ -32,14 +38,23 @@ const stateToUrlParams = {
export const setQueryStringWithoutPageReload = (qsValue: string): void => { export const setQueryStringWithoutPageReload = (qsValue: string): void => {
const w = window; const w = window;
if (w) { if (w) {
const newurl = `${w.location.protocol}//${w.location.host}${w.location.pathname}?${qsValue}${w.location.hash}`; const qs = qsValue ? `?${qsValue}` : "";
const newurl = `${w.location.protocol}//${w.location.host}${w.location.pathname}${qs}${w.location.hash}`;
w.history.pushState({ path: newurl }, "", newurl); w.history.pushState({ path: newurl }, "", newurl);
} }
}; };
export const setQueryStringValue = (newValue: Record<string, unknown>): void => { export const setQueryStringValue = (newValue: Record<string, unknown>): void => {
const queryMap = new Map(Object.entries(stateToUrlParams)); const route = window.location.hash.replace("#", "");
const query = get(newValue, "query", "") as string[]; const params = stateToUrlParams[route] || {};
const queryMap = new Map(Object.entries(params));
const isGraphRoute = route === router.home || route === router.dashboards;
const newQsValue = isGraphRoute ? getGraphQsValue(newValue, queryMap) : getQsValue(newValue, queryMap);
setQueryStringWithoutPageReload(newQsValue.join("&"));
};
const getGraphQsValue = (newValue: Record<string, unknown>, queryMap: Map<string, string>): string[] => {
const query = get(newValue, "query", []) as string[];
const newQsValue: string[] = []; const newQsValue: string[] = [];
query.forEach((q, i) => { query.forEach((q, i) => {
queryMap.forEach((queryKey, stateKey) => { queryMap.forEach((queryKey, stateKey) => {
@ -52,7 +67,20 @@ export const setQueryStringValue = (newValue: Record<string, unknown>): void =>
newQsValue.push(`g${i}.expr=${encodeURIComponent(q)}`); newQsValue.push(`g${i}.expr=${encodeURIComponent(q)}`);
}); });
setQueryStringWithoutPageReload(newQsValue.join("&")); return newQsValue;
};
const getQsValue = (newValue: Record<string, unknown>, queryMap: Map<string, string>): string[] => {
const newQsValue: string[] = [];
queryMap.forEach((queryKey, stateKey) => {
const value = get(newValue, stateKey, "") as string;
if (value) {
const valueEncoded = encodeURIComponent(value);
newQsValue.push(`${queryKey}=${valueEncoded}`);
}
});
return newQsValue;
}; };
export const getQueryStringValue = ( export const getQueryStringValue = (
View file
@ -34,7 +34,7 @@ app-via-docker: package-builder
--env GO111MODULE=on \ --env GO111MODULE=on \
$(DOCKER_OPTS) \ $(DOCKER_OPTS) \
$(BUILDER_IMAGE) \ $(BUILDER_IMAGE) \
go build $(RACE) -mod=vendor -trimpath \ go build $(RACE) -mod=vendor -trimpath -buildvcs=false \
-ldflags "-extldflags '-static' $(GO_BUILDINFO)" \ -ldflags "-extldflags '-static' $(GO_BUILDINFO)" \
-tags 'netgo osusergo nethttpomithttp2 musl' \ -tags 'netgo osusergo nethttpomithttp2 musl' \
-o bin/$(APP_NAME)$(APP_SUFFIX)-prod $(PKG_PREFIX)/app/$(APP_NAME) -o bin/$(APP_NAME)$(APP_SUFFIX)-prod $(PKG_PREFIX)/app/$(APP_NAME)
@ -50,7 +50,7 @@ app-via-docker-windows: package-builder
--env GO111MODULE=on \ --env GO111MODULE=on \
$(DOCKER_OPTS) \ $(DOCKER_OPTS) \
$(BUILDER_IMAGE) \ $(BUILDER_IMAGE) \
go build $(RACE) -mod=vendor -trimpath \ go build $(RACE) -mod=vendor -trimpath -buildvcs=false \
-ldflags "-s -w -extldflags '-static' $(GO_BUILDINFO)" \ -ldflags "-s -w -extldflags '-static' $(GO_BUILDINFO)" \
-tags 'netgo osusergo nethttpomithttp2' \ -tags 'netgo osusergo nethttpomithttp2' \
-o bin/$(APP_NAME)-windows$(APP_SUFFIX)-prod.exe $(PKG_PREFIX)/app/$(APP_NAME) -o bin/$(APP_NAME)-windows$(APP_SUFFIX)-prod.exe $(PKG_PREFIX)/app/$(APP_NAME)
View file
@ -15,6 +15,19 @@ The following tip changes can be tested by building VictoriaMetrics components f
## tip ## tip
* FEATURE: [vmalert](https://docs.victoriametrics.com/vmalert.html): support [reusable templates](https://prometheus.io/docs/prometheus/latest/configuration/template_examples/#defining-reusable-templates) for rules annotations. The path to the template files can be specified via `-rule.templates` flag. See more about this feature [here](https://docs.victoriametrics.com/vmalert.html#reusable-templates). Thanks to @AndrewChubatiuk for [the pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2532). See [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2510).
* FEATURE: [vmctl](https://docs.victoriametrics.com/vmctl.html): add `influx-prometheus-mode` command-line flag, which allows restoring the original time series written from Prometheus into InfluxDB during data migration from InfluxDB to VictoriaMetrics. See [this feature request](https://github.com/VictoriaMetrics/vmctl/issues/8). Thanks to @mback2k for [the pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2545).
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add ability to specify AWS service name when issuing requests to the AWS API. See [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2605). Thanks to @transacid for [the pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2604).
* BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert.html): support `scalar` result type in response. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2607).
* BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert.html): support strings in `humanize.*` template function in the same way as Prometheus does. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2569).
* BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert.html): proxy `/rules` requests to vmalert from Grafana's alerting UI. This removes errors in Grafana's UI for Grafana versions older than `8.5.*`. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2583)
* BUGFIX: [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html): do not return values from [label_value()](https://docs.victoriametrics.com/MetricsQL.html#label_value) function if the original time series has no values at the selected timestamps.
* BUGFIX: [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html): limit the number of concurrently established connections from vmselect to vmstorage. This should prevent potentially high spikes in the number of established connections after a temporary slowdown in the connection handshake procedure between vmselect and vmstorage caused by spikes in workload. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2552).
* BUGFIX: [vmctl](https://docs.victoriametrics.com/vmctl.html): fix build for Solaris / SmartOS. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1322#issuecomment-1120276146).
* BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert.html): do not add `/api/v1/query` suffix to `-datasource.url` if `-remoteRead.disablePathAppend` command-line flag is set. Previously this flag was applied only to `-remoteRead.url`, which could confuse users.
* BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert.html): prevent a possible resource leak on config update, which could lead to the slowdown of `vmalert` over time. See [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2577).
## [v1.77.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.77.1) ## [v1.77.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.77.1)
Released at 07-05-2022 Released at 07-05-2022
@ -44,7 +57,7 @@ Released at 05-05-2022
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add ability to attach node-level labels and annotations to discovered Kubernetes pod targets in the same way as Prometheus 2.35 does. See [this feature request](https://github.com/prometheus/prometheus/issues/9510) and [this pull request](https://github.com/prometheus/prometheus/pull/10080). * FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add ability to attach node-level labels and annotations to discovered Kubernetes pod targets in the same way as Prometheus 2.35 does. See [this feature request](https://github.com/prometheus/prometheus/issues/9510) and [this pull request](https://github.com/prometheus/prometheus/pull/10080).
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add support for `tls_config` and `proxy_url` options at `oauth2` section in the same way as Prometheus does. See [oauth2 docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#oauth2). * FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add support for `tls_config` and `proxy_url` options at `oauth2` section in the same way as Prometheus does. See [oauth2 docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#oauth2).
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add support for `min_version` option at `tls_config` section in the same way as Prometheus does. See [tls_config docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config). * FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add support for `min_version` option at `tls_config` section in the same way as Prometheus does. See [tls_config docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config).
* FEATURE: [vmagent](): expose `vmagent_remotewrite_rate_limit` metric at `http://vmagent:8429/metrics`, which can be used for alerting rules such as `rate(vmagent_remotewrite_conn_bytes_written_total) / vmagent_remotewrite_rate_limit > 0.8` when `-remoteWrite.rateLimit` command-line flag is set. See [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2521). * FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): expose `vmagent_remotewrite_rate_limit` metric at `http://vmagent:8429/metrics`, which can be used for alerting rules such as `rate(vmagent_remotewrite_conn_bytes_written_total) / vmagent_remotewrite_rate_limit > 0.8` when `-remoteWrite.rateLimit` command-line flag is set. See [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2521).
* FEATURE: [vmalert](https://docs.victoriametrics.com/vmalert.html): add support for DNS-based discovery for notifiers in the same way as Prometheus does (aka `dns_sd_configs`). See [these docs](https://docs.victoriametrics.com/vmalert.html#notifier-configuration-file) and [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2460). * FEATURE: [vmalert](https://docs.victoriametrics.com/vmalert.html): add support for DNS-based discovery for notifiers in the same way as Prometheus does (aka `dns_sd_configs`). See [these docs](https://docs.victoriametrics.com/vmalert.html#notifier-configuration-file) and [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2460).
* FEATURE: [vmalert](https://docs.victoriametrics.com/vmalert.html): add `-replay.disableProgressBar` command-line flag, which allows disabling progressbar in [rules' backfilling mode](https://docs.victoriametrics.com/vmalert.html#rules-backfilling). See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1761). * FEATURE: [vmalert](https://docs.victoriametrics.com/vmalert.html): add `-replay.disableProgressBar` command-line flag, which allows disabling progressbar in [rules' backfilling mode](https://docs.victoriametrics.com/vmalert.html#rules-backfilling). See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1761).
* FEATURE: allow specifying TLS cipher suites for incoming https requests via `-tlsCipherSuites` command-line flag. See [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2404). * FEATURE: allow specifying TLS cipher suites for incoming https requests via `-tlsCipherSuites` command-line flag. See [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2404).

View file
- The [replication](#replication-and-data-safety) increases the amounts of needed resources for the cluster by up to `N` times where `N` is replication factor. This is because `vminsert` stores `N` copies of every ingested sample on distinct `vmstorage` nodes. These copies are de-duplicated by `vmselect` during querying. The most cost-efficient and performant solution for data durability is to rely on replicated durable persistent disks such as [Google Compute persistent disks](https://cloud.google.com/compute/docs/disks#pdspecs) instead of using the [replication at VictoriaMetrics level](#replication-and-data-safety). - The [replication](#replication-and-data-safety) increases the amounts of needed resources for the cluster by up to `N` times where `N` is replication factor. This is because `vminsert` stores `N` copies of every ingested sample on distinct `vmstorage` nodes. These copies are de-duplicated by `vmselect` during querying. The most cost-efficient and performant solution for data durability is to rely on replicated durable persistent disks such as [Google Compute persistent disks](https://cloud.google.com/compute/docs/disks#pdspecs) instead of using the [replication at VictoriaMetrics level](#replication-and-data-safety).
- It is recommended to run a cluster with a big number of small `vmstorage` nodes instead of a cluster with a small number of big `vmstorage` nodes. This increases chances that the cluster remains available and stable when some of `vmstorage` nodes are temporarily unavailable during maintenance events such as upgrades, configuration changes or migrations. For example, when a cluster contains 10 `vmstorage` nodes and a single node becomes temporarily unavailable, then the workload on the remaining 9 nodes increases by `1/9=11%`. When a cluster contains 3 `vmstorage` nodes and a single node becomes temporarily unavailable, then the workload on the remaining 2 nodes increases by `1/2=50%`. The remaining `vmstorage` nodes may not have enough free capacity for handling the increased workload. In this case the cluster may become overloaded, which may result in decreased availability and stability. - It is recommended to run a cluster with a big number of small `vmstorage` nodes instead of a cluster with a small number of big `vmstorage` nodes. This increases chances that the cluster remains available and stable when some of `vmstorage` nodes are temporarily unavailable during maintenance events such as upgrades, configuration changes or migrations. For example, when a cluster contains 10 `vmstorage` nodes and a single node becomes temporarily unavailable, then the workload on the remaining 9 nodes increases by `1/9=11%`. When a cluster contains 3 `vmstorage` nodes and a single node becomes temporarily unavailable, then the workload on the remaining 2 nodes increases by `1/2=50%`. The remaining `vmstorage` nodes may not have enough free capacity for handling the increased workload. In this case the cluster may become overloaded, which may result in decreased availability and stability.
- Cluster capacity for [active time series](https://docs.victoriametrics.com/FAQ.html#what-is-an-active-time-series) can be increased by increasing RAM and CPU resources per each `vmstorage` node or by by adding new `vmstorage` nodes. - Cluster capacity for [active time series](https://docs.victoriametrics.com/FAQ.html#what-is-an-active-time-series) can be increased by increasing RAM and CPU resources per each `vmstorage` node or by adding new `vmstorage` nodes.
- Query latency can be reduced by increasing CPU resources per each `vmselect` node, since each incoming query is processed by a single `vmselect` node. Performance for heavy queries scales with the number of available CPU cores at `vmselect` node, since `vmselect` processes time series referred by the query on all the available CPU cores. - Query latency can be reduced by increasing CPU resources per each `vmselect` node, since each incoming query is processed by a single `vmselect` node. Performance for heavy queries scales with the number of available CPU cores at `vmselect` node, since `vmselect` processes time series referred by the query on all the available CPU cores.
- If the cluster needs to process incoming queries at a high rate, then its capacity can be increased by adding more `vmselect` nodes, so incoming queries could be spread among bigger number of `vmselect` nodes. - If the cluster needs to process incoming queries at a high rate, then its capacity can be increased by adding more `vmselect` nodes, so incoming queries could be spread among bigger number of `vmselect` nodes.
- By default `vminsert` compresses the data it sends to `vmstorage` in order to reduce network bandwidth usage. The compression takes additional CPU resources at `vminsert`. If `vminsert` nodes have limited CPU, then the compression can be disabled by passing `-rpc.disableCompression` command-line flag at `vminsert` nodes. - By default `vminsert` compresses the data it sends to `vmstorage` in order to reduce network bandwidth usage. The compression takes additional CPU resources at `vminsert`. If `vminsert` nodes have limited CPU, then the compression can be disabled by passing `-rpc.disableCompression` command-line flag at `vminsert` nodes.
- By default `vmstorage` compresses the data it sends to `vmselect` during queries in order to reduce network bandwidth usage. The compression takes additional CPU resources at `vmstorage`. If `vmstorage` nodes have limited CPU, then the compression can be disabled by passing `-rpc.disableCompression` command-line flag at `vmstorage` nodes. - By default `vmstorage` compresses the data it sends to `vmselect` during queries in order to reduce network bandwidth usage. The compression takes additional CPU resources at `vmstorage`. If `vmstorage` nodes have limited CPU, then the compression can be disabled by passing `-rpc.disableCompression` command-line flag at `vmstorage` nodes.
See also [resource usage limits docs](#resource-usage-limits).
## Resource usage limits
By default cluster components of VictoriaMetrics are tuned for an optimal resource usage under typical workloads. Some workloads may need fine-grained resource usage limits. In these cases the following command-line flags may be useful:
- `-memory.allowedPercent` and `-search.allowedBytes` limit the amounts of memory, which may be used for various internal caches at all the cluster components of VictoriaMetrics - `vminsert`, `vmselect` and `vmstorage`. Note that VictoriaMetrics components may use more memory, since these flags don't limit additional memory, which may be needed on a per-query basis.
- `-search.maxUniqueTimeseries` at `vmselect` component limits the number of unique time series a single query can find and process. `vmselect` passes the limit to `vmstorage` component, which keeps in memory some metainformation about the time series located by each query and spends some CPU time for processing the found time series. This means that the maximum memory usage and CPU usage a single query can use at `vmstorage` is proportional to `-search.maxUniqueTimeseries`.
- `-search.maxQueryDuration` at `vmselect` limits the duration of a single query. If the query takes longer than the given duration, then it is canceled. This allows saving CPU and RAM at `vmselect` and `vmstorage` when executing unexpected heavy queries.
- `-search.maxConcurrentRequests` at `vmselect` limits the number of concurrent requests a single `vmselect` node can process. Bigger number of concurrent requests usually means bigger memory usage at both `vmselect` and `vmstorage`. For example, if a single query needs 100 MiB of additional memory during its execution, then 100 concurrent queries may need `100 * 100 MiB = 10 GiB` of additional memory. So it is better to limit the number of concurrent queries, while suspending additional incoming queries if the concurrency limit is reached. `vmselect` provides `-search.maxQueueDuration` command-line flag for limiting the max wait time for suspended queries.
- `-search.maxSamplesPerSeries` at `vmselect` limits the number of raw samples the query can process per each time series. `vmselect` sequentially processes raw samples per each found time series during the query. It unpacks raw samples on the selected time range per each time series into memory and then applies the given [rollup function](https://docs.victoriametrics.com/MetricsQL.html#rollup-functions). The `-search.maxSamplesPerSeries` command-line flag allows limiting memory usage at `vmselect` in the case when the query is executed on a time range, which contains hundreds of millions of raw samples per each located time series.
- `-search.maxSamplesPerQuery` at `vmselect` limits the number of raw samples a single query can process. This allows limiting CPU usage at `vmselect` for heavy queries.
- `-search.maxSeries` at `vmselect` limits the number of time series, which may be returned from [/api/v1/series](https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers). This endpoint is used mostly by Grafana for auto-completion of metric names, label names and label values. Queries to this endpoint may take big amounts of CPU time and memory at `vmstorage` and `vmselect` when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxSeries` to quite a low value in order to limit CPU and memory usage.
- `-search.maxTagKeys` at `vmselect` limits the number of items, which may be returned from [/api/v1/labels](https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names). This endpoint is used mostly by Grafana for auto-completion of label names. Queries to this endpoint may take big amounts of CPU time and memory at `vmstorage` and `vmselect` when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxTagKeys` to quite a low value in order to limit CPU and memory usage.
- `-search.maxTagValues` at `vmselect` limits the number of items, which may be returned from [/api/v1/label/.../values](https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values). This endpoint is used mostly by Grafana for auto-completion of label values. Queries to this endpoint may take big amounts of CPU time and memory at `vmstorage` and `vmselect` when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxTagValues` to quite a low value in order to limit CPU and memory usage.
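For illustration, a sketch of a `vmselect` invocation combining several of the limits above; the concrete values are arbitrary placeholders and should be tuned per workload:

```
/path/to/vmselect \
  -search.maxUniqueTimeseries=300000 \
  -search.maxQueryDuration=30s \
  -search.maxConcurrentRequests=8 \
  -search.maxQueueDuration=10s \
  -search.maxSeries=10000 \
  -search.maxTagKeys=10000 \
  -search.maxTagValues=10000
```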
See also [capacity planning docs](#capacity-planning).
## High availability ## High availability
The database is considered highly available if it continues accepting new data and processing incoming queries when some of its components are temporarily unavailable. The database is considered highly available if it continues accepting new data and processing incoming queries when some of its components are temporarily unavailable.
@ -545,7 +563,7 @@ Below is the output for `/path/to/vminsert -help`:
-influxDBLabel string -influxDBLabel string
Default label for the DB name sent over '?db={db_name}' query parameter (default "db") Default label for the DB name sent over '?db={db_name}' query parameter (default "db")
-influxListenAddr string -influxListenAddr string
TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write TCP and UDP address to listen for InfluxDB line protocol data. Usually :8089 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write
-influxMeasurementFieldSeparator string -influxMeasurementFieldSeparator string
Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_") Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_")
-influxSkipMeasurement -influxSkipMeasurement
View file
@ -250,8 +250,8 @@ All the VictoriaMetrics components provide command-line flags to control the siz
Memory usage for VictoriaMetrics components can be tuned according to the following docs: Memory usage for VictoriaMetrics components can be tuned according to the following docs:
* [Capacity planning for single-node VictoriaMetrics](https://docs.victoriametrics.com/#capacity-planning) * [Resource usage limits for single-node VictoriaMetrics](https://docs.victoriametrics.com/#resource-usage-limits)
* [Capacity planning for cluster VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#capacity-planning) * [Resource usage limits for cluster VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#resource-usage-limits)
* [Troubleshooting for vmagent](https://docs.victoriametrics.com/vmagent.html#troubleshooting) * [Troubleshooting for vmagent](https://docs.victoriametrics.com/vmagent.html#troubleshooting)
* [Troubleshooting for single-node VictoriaMetrics](https://docs.victoriametrics.com/#troubleshooting) * [Troubleshooting for single-node VictoriaMetrics](https://docs.victoriametrics.com/#troubleshooting)

11
docs/Makefile Normal file
View file

@ -0,0 +1,11 @@
docs-install:
gem install jekyll bundler
bundle install --gemfile=Gemfile
# run local server for documentation website
# at http://127.0.0.1:4000/
# On first use, please run `make docs-install`
docs-up:
JEKYLL_GITHUB_TOKEN=blank PAGES_API_URL=http://0.0.0.0 bundle exec \
--gemfile=Gemfile \
jekyll server --livereload

View file

@ -60,6 +60,7 @@ to prevent limits exhaustion.
Here is an alert example for high churn rate by the tenant: Here is an alert example for high churn rate by the tenant:
{% raw %}
```yaml ```yaml
- alert: TooHighChurnRate - alert: TooHighChurnRate
@ -79,3 +80,4 @@ Here is an alert example for high churn rate by the tenant:
High Churn Rate is tightly connected with database performance and may High Churn Rate is tightly connected with database performance and may
result in unexpected OOM's or slow queries." result in unexpected OOM's or slow queries."
``` ```
{% endraw %}

View file

@ -4,7 +4,14 @@ sort: 13
# Quick start # Quick start
## Installation ## How to install
VictoriaMetrics is distributed in two forms:
* [Single-server-VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html) - all-in-one
binary, which is very easy to use and maintain.
Single-server-VictoriaMetrics perfectly scales vertically and easily handles millions of metrics/s;
* [VictoriaMetrics Cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) - a set of components
for building horizontally scalable clusters.
Single-server VictoriaMetrics is available as: Single-server VictoriaMetrics is available as:
@ -13,65 +20,153 @@ Single-server-VictoriaMetrics VictoriaMetrics is available as:
* [Snap packages](https://snapcraft.io/victoriametrics) * [Snap packages](https://snapcraft.io/victoriametrics)
* [Helm Charts](https://github.com/VictoriaMetrics/helm-charts#list-of-charts) * [Helm Charts](https://github.com/VictoriaMetrics/helm-charts#list-of-charts)
* [Binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) * [Binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
* [Source code](https://github.com/VictoriaMetrics/VictoriaMetrics). See [How to build from sources](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-build-from-sources) * [Source code](https://github.com/VictoriaMetrics/VictoriaMetrics).
See [How to build from sources](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-build-from-sources)
* [VictoriaMetrics on Linode](https://www.linode.com/marketplace/apps/victoriametrics/victoriametrics/) * [VictoriaMetrics on Linode](https://www.linode.com/marketplace/apps/victoriametrics/victoriametrics/)
* [VictoriaMetrics on DigitalOcean](https://marketplace.digitalocean.com/apps/victoriametrics-single) * [VictoriaMetrics on DigitalOcean](https://marketplace.digitalocean.com/apps/victoriametrics-single)
Just download VictoriaMetrics and follow [these instructions](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-start-victoriametrics). Just download VictoriaMetrics and follow
Then read [Prometheus setup](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#prometheus-setup) and [Grafana setup](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#grafana-setup) docs. [these instructions](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-start-victoriametrics).
Then read [Prometheus setup](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#prometheus-setup)
and [Grafana setup](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#grafana-setup) docs.
### Starting VM-Signle via Docker:
The following commands download the latest available [Docker image of VictoriaMetrics](https://hub.docker.com/r/victoriametrics/victoria-metrics) and start it at port 8428, while storing the ingested data at `victoria-metrics-data` subdirectory under the current directory: ### Starting VM-Single via Docker
The following commands download the latest available
[Docker image of VictoriaMetrics](https://hub.docker.com/r/victoriametrics/victoria-metrics)
and start it at port 8428, while storing the ingested data at `victoria-metrics-data` subdirectory
under the current directory:
<div class="with-copy" markdown="1">
```bash ```bash
docker pull victoriametrics/victoria-metrics:latest docker pull victoriametrics/victoria-metrics:latest
docker run -it --rm -v `pwd`/victoria-metrics-data:/victoria-metrics-data -p 8428:8428 victoriametrics/victoria-metrics:latest docker run -it --rm -v `pwd`/victoria-metrics-data:/victoria-metrics-data -p 8428:8428 victoriametrics/victoria-metrics:latest
``` ```
Open `http://localhost:8428` in web browser and read [these docs](https://docs.victoriametrics.com/#operation). </div>
There are also the following versions of VictoriaMetrics available: Open <a href="http://localhost:8428">http://localhost:8428</a> in web browser
and read [these docs](https://docs.victoriametrics.com/#operation).
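To quickly verify that the container is up, query the `/health` endpoint (a minimal check, assuming the port mapping from the `docker run` command above):

```bash
# Expect "OK" if VictoriaMetrics is up and running
curl http://localhost:8428/health
```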
* [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) - horizontally scalable VictoriaMetrics, which scales to multiple nodes. There is also [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html)
- a horizontally scalable installation, which scales to multiple nodes.
### Starting VM-Cluster via Docker: ### Starting VM-Cluster via Docker
The following commands clone the latest available [VictoriaMetrics cluster repository](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster) and start the docker container via 'docker-compose'. Further customization is possible by editing the [docker-compose.yaml](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/deployment/docker/docker-compose.yml) file. The following commands clone the latest available
[VictoriaMetrics cluster repository](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster)
and start the docker container via 'docker-compose'. Further customization is possible by editing
the [docker-compose.yaml](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/deployment/docker/docker-compose.yml)
file.
<div class="with-copy" markdown="1">
```bash ```bash
git clone https://github.com/VictoriaMetrics/VictoriaMetrics --branch cluster && cd VictoriaMetrics/deployment/docker && docker-compose up git clone https://github.com/VictoriaMetrics/VictoriaMetrics --branch cluster &&
cd VictoriaMetrics/deployment/docker &&
docker-compose up
``` ```
</div>
* [Cluster setup](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#cluster-setup) * [Cluster setup](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#cluster-setup)
## Writing data ## Write data
Data can be written to VictoriaMetrics in the following ways: There are two main models in monitoring for data collection:
[push](https://docs.victoriametrics.com/keyConcepts.html#push-model)
and [pull](https://docs.victoriametrics.com/keyConcepts.html#pull-model).
Both are used in modern monitoring and both are supported by VictoriaMetrics.
* [DataDog agent](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-datadog-agent) See more details on [writing data here](https://docs.victoriametrics.com/keyConcepts.html#write-data).
* [InfluxDB-compatible agents such as Telegraf](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf)
* [Graphite-compatible agents such as StatsD](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-graphite-compatible-agents-such-as-statsd)
* [OpenTSDB-compatible agents](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-opentsdb-compatible-agents)
* [Prometheus remote_write API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write)
* [In JSON line format](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-data-in-json-line-format)
* [Imported in CSV format](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-csv-data)
* [Imported in Prometheus exposition format](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-data-in-prometheus-exposition-format)
* `/api/v1/import` for importing data obtained from [/api/v1/export](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-export-data-in-json-line-format).
See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-data-in-json-line-format) for details.
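As a minimal illustration of the push model described above, a single sample in Prometheus text exposition format can be pushed over HTTP (a sketch; the address and the metric name `foo` are placeholders):

```bash
# Push one sample in Prometheus text format to single-node VictoriaMetrics
curl -d 'foo{bar="baz"} 123' -X POST 'http://localhost:8428/api/v1/import/prometheus'
```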
## Reading data
VictoriaMetrics provides various APIs for reading the data. [This document briefly describes these APIs](https://docs.victoriametrics.com/url-examples.html). ## Query data
### Grafana setup: VictoriaMetrics provides an
[HTTP API](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#prometheus-querying-api-usage)
for serving read queries. The API is used in various integrations such as
[Grafana](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#grafana-setup).
The same API is also used by
[VMUI](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#vmui) - graphical User Interface
for querying and visualizing metrics.
Create [Prometheus datasource](http://docs.grafana.org/features/datasources/prometheus/) in Grafana with the following url: [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) is the query language for executing read queries
in VictoriaMetrics. MetricsQL is a [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics)-like
query language with a powerful set of functions and features for working specifically with time series data.
```url See more details on [querying data here](https://docs.victoriametrics.com/keyConcepts.html#query-data).
http://<victoriametrics-addr>:8428
```
Substitute `<victoriametrics-addr>` with the hostname or IP address of VictoriaMetrics.
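For example, an instant query can be sent straight to the HTTP API (a sketch; `up` is a placeholder metric name):

```bash
# Instant query via the Prometheus-compatible querying API
curl -G 'http://localhost:8428/api/v1/query' --data-urlencode 'query=up'
```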
Then build graphs and dashboards for the created datasource using [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/) or [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html). ## Alerting
It is not possible to watch all changes on graphs all the time, which is why alerting exists.
In [vmalert](https://docs.victoriametrics.com/vmalert.html) it is possible to create a set of conditions
based on PromQL and MetricsQL queries that will send a notification when such conditions are met.
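A minimal `vmalert` invocation might look as follows (a sketch; the addresses and the rules file path are placeholders):

```bash
# Evaluate alerting rules against VictoriaMetrics and send alerts to Alertmanager
/path/to/vmalert \
  -rule=/etc/vmalert/alerts.yml \
  -datasource.url=http://victoriametrics:8428 \
  -notifier.url=http://alertmanager:9093
```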
## Data migration
Migrating data from other TSDBs to VictoriaMetrics is as simple as importing data via any of
[supported formats](https://docs.victoriametrics.com/keyConcepts.html#push-model).
The migration might get easier when using [vmctl](https://docs.victoriametrics.com/vmctl.html) - the
VictoriaMetrics command-line tool. It supports the following databases for migration to VictoriaMetrics:
* [Prometheus using snapshot API](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-prometheus);
* [Thanos](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-thanos);
* [InfluxDB](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-influxdb-1x);
* [OpenTSDB](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-opentsdb);
* [Migrate data between VictoriaMetrics single and cluster versions](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-victoriametrics).
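For instance, a migration from InfluxDB 1.x could be sketched as follows (the addresses and database name are placeholders):

```bash
# Migrate an InfluxDB 1.x database into VictoriaMetrics via vmctl
vmctl influx \
  --influx-addr=http://influxdb:8086 \
  --influx-database=telegraf \
  --vm-addr=http://victoriametrics:8428
```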
## Productionisation
When going to production with VictoriaMetrics, we recommend following the recommendations below.
### Monitoring
Each VictoriaMetrics component emits its own metrics with various details regarding performance
and health state. Docs for the components also contain a `Monitoring` section with an explanation
of what should be monitored and how. For example, see
[Single-server-VictoriaMetrics Monitoring](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#monitoring).
The VictoriaMetrics team prepared a list of [Grafana dashboards](https://grafana.com/orgs/victoriametrics/dashboards)
for the main components. Each dashboard contains a lot of useful information and tips. It is recommended
to have these dashboards installed and up to date.
The list of alerts for [single](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts.yml)
and [cluster](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/deployment/docker/alerts.yml)
versions would also help to identify and notify about issues with the system.
The rule of thumb is to have a separate installation of VictoriaMetrics or any other monitoring system
to monitor the production installation of VictoriaMetrics. This keeps monitoring independent and
helps identify problems with the main monitoring installation.
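Each component exposes its own metrics at the `/metrics` HTTP path, so a quick manual check of any instance is possible (the host and port below are placeholders):

```bash
# Inspect self-exposed metrics of a VictoriaMetrics instance
curl -s http://victoriametrics:8428/metrics | head -n 20
```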
### Capacity planning
See capacity planning sections in [docs](https://docs.victoriametrics.com) for
[Single-server-VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#capacity-planning)
and [VictoriaMetrics Cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#capacity-planning).
Capacity planning isn't possible without [monitoring](#monitoring), so consider configuring it first.
Understanding resource usage and performance of VictoriaMetrics also requires knowing the tech terms
[active series](https://docs.victoriametrics.com/FAQ.html#what-is-an-active-time-series),
[churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate),
[cardinality](https://docs.victoriametrics.com/FAQ.html#what-is-high-cardinality),
and [slow inserts](https://docs.victoriametrics.com/FAQ.html#what-is-a-slow-insert).
All of them are present in [Grafana dashboards](https://grafana.com/orgs/victoriametrics/dashboards).
### Data safety
It is recommended to read [Replication and data safety](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#replication-and-data-safety),
[Why replication doesn't save from disaster?](https://valyala.medium.com/speeding-up-backups-for-big-time-series-databases-533c1a927883)
and [backups](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#backups).
### Configuring limits
To avoid excessive resource usage or performance degradation, limits must be in place:
* [Resource usage limits](https://docs.victoriametrics.com/FAQ.html#how-to-set-a-memory-limit-for-victoriametrics-components);
* [Cardinality limiter](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cardinality-limiter).
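A sketch combining a memory limit with the cardinality limiter flags (the values below are illustrative assumptions, not recommendations):

```bash
# Cap cache memory and limit the number of unique series per hour/day
/path/to/victoria-metrics-prod \
  -memory.allowedPercent=60 \
  -storage.maxHourlySeries=1000000 \
  -storage.maxDailySeries=5000000
```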

View file

@ -1055,6 +1055,25 @@ It is recommended leaving the following amounts of spare resources:
* 50% of spare CPU for reducing the probability of slowdowns during temporary spikes in workload. * 50% of spare CPU for reducing the probability of slowdowns during temporary spikes in workload.
* At least 30% of free storage space at the directory pointed by `-storageDataPath` command-line flag. See also `-storage.minFreeDiskSpaceBytes` command-line flag description [here](#list-of-command-line-flags). * At least 30% of free storage space at the directory pointed by `-storageDataPath` command-line flag. See also `-storage.minFreeDiskSpaceBytes` command-line flag description [here](#list-of-command-line-flags).
See also [resource usage limits docs](#resource-usage-limits).
## Resource usage limits
By default VictoriaMetrics is tuned for an optimal resource usage under typical workloads. Some workloads may need fine-grained resource usage limits. In these cases the following command-line flags may be useful:
- `-memory.allowedPercent` and `-memory.allowedBytes` limit the amounts of memory, which may be used for various internal caches at VictoriaMetrics. Note that VictoriaMetrics may use more memory, since these flags don't limit additional memory, which may be needed on a per-query basis.
- `-search.maxUniqueTimeseries` limits the number of unique time series a single query can find and process. VictoriaMetrics keeps in memory some metainformation about the time series located by each query and spends some CPU time for processing the found time series. This means that the maximum memory usage and CPU usage a single query can use is proportional to `-search.maxUniqueTimeseries`.
- `-search.maxQueryDuration` limits the duration of a single query. If the query takes longer than the given duration, then it is canceled. This allows saving CPU and RAM when executing unexpected heavy queries.
- `-search.maxConcurrentRequests` limits the number of concurrent requests VictoriaMetrics can process. Bigger number of concurrent requests usually means bigger memory usage. For example, if a single query needs 100 MiB of additional memory during its execution, then 100 concurrent queries may need `100 * 100 MiB = 10 GiB` of additional memory. So it is better to limit the number of concurrent queries, while suspending additional incoming queries if the concurrency limit is reached. VictoriaMetrics provides `-search.maxQueueDuration` command-line flag for limiting the max wait time for suspended queries.
- `-search.maxSamplesPerSeries` limits the number of raw samples the query can process per each time series. VictoriaMetrics sequentially processes raw samples per each found time series during the query. It unpacks raw samples on the selected time range per each time series into memory and then applies the given [rollup function](https://docs.victoriametrics.com/MetricsQL.html#rollup-functions). The `-search.maxSamplesPerSeries` command-line flag allows limiting memory usage in the case when the query is executed on a time range, which contains hundreds of millions of raw samples per each located time series.
- `-search.maxSamplesPerQuery` limits the number of raw samples a single query can process. This allows limiting CPU usage for heavy queries.
- `-search.maxSeries` limits the number of time series, which may be returned from [/api/v1/series](https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers). This endpoint is used mostly by Grafana for auto-completion of metric names, label names and label values. Queries to this endpoint may take large amounts of CPU time and memory when the database contains a large number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set `-search.maxSeries` to a low value in order to limit CPU and memory usage.
- `-search.maxTagKeys` limits the number of items, which may be returned from [/api/v1/labels](https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names). This endpoint is used mostly by Grafana for auto-completion of label names. Queries to this endpoint may take large amounts of CPU time and memory when the database contains a large number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set `-search.maxTagKeys` to a low value in order to limit CPU and memory usage.
- `-search.maxTagValues` limits the number of items, which may be returned from [/api/v1/label/.../values](https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values). This endpoint is used mostly by Grafana for auto-completion of label values. Queries to this endpoint may take large amounts of CPU time and memory when the database contains a large number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set `-search.maxTagValues` to a low value in order to limit CPU and memory usage.
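As an illustration, a single-node instance could be started with several of these limits set explicitly (a sketch; all values are examples and should be tuned for the actual workload):

```bash
# Sketch: start single-node VictoriaMetrics with explicit resource limits.
# All values below are examples, not recommendations.
/path/to/victoria-metrics-prod \
  -storageDataPath=/var/lib/victoria-metrics \
  -memory.allowedPercent=60 \
  -search.maxUniqueTimeseries=300000 \
  -search.maxQueryDuration=30s \
  -search.maxConcurrentRequests=8 \
  -search.maxQueueDuration=10s
```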
See also [capacity planning docs](#capacity-planning).
## High availability ## High availability
* Install multiple VictoriaMetrics instances in distinct datacenters (availability zones). * Install multiple VictoriaMetrics instances in distinct datacenters (availability zones).
@ -1682,7 +1701,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-influxDBLabel string -influxDBLabel string
Default label for the DB name sent over '?db={db_name}' query parameter (default "db") Default label for the DB name sent over '?db={db_name}' query parameter (default "db")
-influxListenAddr string -influxListenAddr string
TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write TCP and UDP address to listen for InfluxDB line protocol data. Usually :8089 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write
-influxMeasurementFieldSeparator string -influxMeasurementFieldSeparator string
Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_") Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_")
-influxSkipMeasurement -influxSkipMeasurement
@ -1745,7 +1764,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-promscrape.cluster.membersCount int -promscrape.cluster.membersCount int
The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets
-promscrape.cluster.replicationFactor int -promscrape.cluster.replicationFactor int
The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 2, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1) The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 1, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1)
-promscrape.config string -promscrape.config string
Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details
-promscrape.config.dryRun -promscrape.config.dryRun

View file

@ -52,7 +52,7 @@ The helm chart repository [https://github.com/VictoriaMetrics/helm-charts/](http
3. Update `vmauth` chart version in [`values.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-auth/values.yaml) and [`Chart.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-auth/Chart.yaml) 3. Update `vmauth` chart version in [`values.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-auth/values.yaml) and [`Chart.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-auth/Chart.yaml)
4. Update `cluster` chart versions in [`values.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-cluster/values.yaml), bump version for `vmselect`, `vminsert` and `vmstorage` and [`Chart.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-cluster/Chart.yaml) 4. Update `cluster` chart versions in [`values.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-cluster/values.yaml), bump version for `vmselect`, `vminsert` and `vmstorage` and [`Chart.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-cluster/Chart.yaml)
5. Update `k8s-stack` chart versions in [`values.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-k8s-stack/values.yaml), bump version for `vmselect`, `vminsert`, `vmstorage`, `vmsingle`, `vmalert`, `vmagent` and [`Chart.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-k8s-stack/Chart.yaml) 5. Update `k8s-stack` chart versions in [`values.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-k8s-stack/values.yaml), bump version for `vmselect`, `vminsert`, `vmstorage`, `vmsingle`, `vmalert`, `vmagent` and [`Chart.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-k8s-stack/Chart.yaml)
6. Update `signle` chart version in [`values.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-single/values.yaml) and [`Chart.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-single/Chart.yaml) 6. Update `single-node` chart version in [`values.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-single/values.yaml) and [`Chart.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-single/Chart.yaml)
8. Run `make gen-doc` 8. Run `make gen-doc`
9. Run `make package` that creates or updates zip file with the packed chart 9. Run `make package` that creates or updates zip file with the packed chart
10. Run `make merge`. It creates or updates metadata for charts in index.yaml 10. Run `make merge`. It creates or updates metadata for charts in index.yaml

View file

@ -1059,6 +1059,25 @@ It is recommended leaving the following amounts of spare resources:
* 50% of spare CPU for reducing the probability of slowdowns during temporary spikes in workload. * 50% of spare CPU for reducing the probability of slowdowns during temporary spikes in workload.
* At least 30% of free storage space at the directory pointed by `-storageDataPath` command-line flag. See also `-storage.minFreeDiskSpaceBytes` command-line flag description [here](#list-of-command-line-flags). * At least 30% of free storage space at the directory pointed by `-storageDataPath` command-line flag. See also `-storage.minFreeDiskSpaceBytes` command-line flag description [here](#list-of-command-line-flags).
See also [resource usage limits docs](#resource-usage-limits).
## Resource usage limits
By default VictoriaMetrics is tuned for an optimal resource usage under typical workloads. Some workloads may need fine-grained resource usage limits. In these cases the following command-line flags may be useful:
- `-memory.allowedPercent` and `-memory.allowedBytes` limit the amounts of memory, which may be used for various internal caches at VictoriaMetrics. Note that VictoriaMetrics may use more memory, since these flags don't limit additional memory, which may be needed on a per-query basis.
- `-search.maxUniqueTimeseries` limits the number of unique time series a single query can find and process. VictoriaMetrics keeps in memory some metainformation about the time series located by each query and spends some CPU time for processing the found time series. This means that the maximum memory usage and CPU usage a single query can use is proportional to `-search.maxUniqueTimeseries`.
- `-search.maxQueryDuration` limits the duration of a single query. If the query takes longer than the given duration, then it is canceled. This allows saving CPU and RAM when executing unexpected heavy queries.
- `-search.maxConcurrentRequests` limits the number of concurrent requests VictoriaMetrics can process. Bigger number of concurrent requests usually means bigger memory usage. For example, if a single query needs 100 MiB of additional memory during its execution, then 100 concurrent queries may need `100 * 100 MiB = 10 GiB` of additional memory. So it is better to limit the number of concurrent queries, while suspending additional incoming queries if the concurrency limit is reached. VictoriaMetrics provides `-search.maxQueueDuration` command-line flag for limiting the max wait time for suspended queries.
- `-search.maxSamplesPerSeries` limits the number of raw samples the query can process per each time series. VictoriaMetrics sequentially processes raw samples per each found time series during the query. It unpacks raw samples on the selected time range per each time series into memory and then applies the given [rollup function](https://docs.victoriametrics.com/MetricsQL.html#rollup-functions). The `-search.maxSamplesPerSeries` command-line flag allows limiting memory usage in the case when the query is executed on a time range, which contains hundreds of millions of raw samples per each located time series.
- `-search.maxSamplesPerQuery` limits the number of raw samples a single query can process. This allows limiting CPU usage for heavy queries.
- `-search.maxSeries` limits the number of time series, which may be returned from [/api/v1/series](https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers). This endpoint is used mostly by Grafana for auto-completion of metric names, label names and label values. Queries to this endpoint may take large amounts of CPU time and memory when the database contains a large number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set `-search.maxSeries` to a low value in order to limit CPU and memory usage.
- `-search.maxTagKeys` limits the number of items, which may be returned from [/api/v1/labels](https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names). This endpoint is used mostly by Grafana for auto-completion of label names. Queries to this endpoint may take large amounts of CPU time and memory when the database contains a large number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set `-search.maxTagKeys` to a low value in order to limit CPU and memory usage.
- `-search.maxTagValues` limits the number of items, which may be returned from [/api/v1/label/.../values](https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values). This endpoint is used mostly by Grafana for auto-completion of label values. Queries to this endpoint may take large amounts of CPU time and memory when the database contains a large number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set `-search.maxTagValues` to a low value in order to limit CPU and memory usage.
See also [capacity planning docs](#capacity-planning).
## High availability ## High availability
* Install multiple VictoriaMetrics instances in distinct datacenters (availability zones). * Install multiple VictoriaMetrics instances in distinct datacenters (availability zones).
@ -1686,7 +1705,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-influxDBLabel string -influxDBLabel string
Default label for the DB name sent over '?db={db_name}' query parameter (default "db") Default label for the DB name sent over '?db={db_name}' query parameter (default "db")
-influxListenAddr string -influxListenAddr string
TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write TCP and UDP address to listen for InfluxDB line protocol data. Usually :8089 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write
-influxMeasurementFieldSeparator string -influxMeasurementFieldSeparator string
Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_") Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_")
-influxSkipMeasurement -influxSkipMeasurement
@ -1749,7 +1768,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
-promscrape.cluster.membersCount int -promscrape.cluster.membersCount int
The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets
-promscrape.cluster.replicationFactor int -promscrape.cluster.replicationFactor int
The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 2, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1) The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 1, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1)
-promscrape.config string -promscrape.config string
Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details
-promscrape.config.dryRun -promscrape.config.dryRun

5
docs/_includes/img.html Normal file
View file

@ -0,0 +1,5 @@
<p style="text-align: center">
<a href="{{ include.href }}" target="_blank">
<img src="{{ include.href }}">
</a>
</p>

Binary file not shown.

After

Width:  |  Height:  |  Size: 25 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 21 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 124 KiB

View file

@ -0,0 +1,270 @@
# Migrate from InfluxDB to VictoriaMetrics
InfluxDB is a well-known time series database built for
[IoT](https://en.wikipedia.org/wiki/Internet_of_things) monitoring, Application Performance Monitoring (APM) and
analytics. It has its own query language, a unique data model, and rich tooling for collecting and processing metrics.
Nowadays, the volume of time series data grows constantly, as well as the requirements for durable time series storage.
Sometimes, well-known solutions just can't keep up with the new expectations.
VictoriaMetrics is a high-performance open-source time series database specifically designed to deal with huge volumes of
monitoring data while remaining cost-efficient at the same time. Many companies are choosing to migrate from InfluxDB to
VictoriaMetrics specifically for performance and scalability reasons. Among them are the case studies provided by
[ARNES](https://docs.victoriametrics.com/CaseStudies.html#arnes)
and [Brandwatch](https://docs.victoriametrics.com/CaseStudies.html#brandwatch).
This guide covers the differences between the two solutions, the most commonly asked questions, and approaches for
migrating from InfluxDB to VictoriaMetrics.
## Data model differences
While readers are likely familiar
with [InfluxDB key concepts](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/), the data model of
VictoriaMetrics is something [new to explore](https://docs.victoriametrics.com/keyConcepts.html#data-model). Let's start
with similarities and differences:
* both solutions are **schemaless**, which means there is no need to define metrics or their tags in advance;
* multi-dimensional data support is implemented
via [tags](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#tags)
in InfluxDB and via [labels](https://docs.victoriametrics.com/keyConcepts.html#structure-of-a-metric) in
VictoriaMetrics. However, labels in VictoriaMetrics are always `strings`, while InfluxDB supports multiple data types;
* timestamps are stored with nanosecond resolution in InfluxDB, while in VictoriaMetrics it is **milliseconds**;
* in VictoriaMetrics a metric's value is always `float64`, while InfluxDB supports multiple data types;
* there are
no [measurements](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#measurement)
or [fields](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#field-key) in
VictoriaMetrics; the metric name contains it all. If a measurement contains more than one field, then for VictoriaMetrics
it becomes multiple metrics;
* there are no [buckets](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#bucket)
or [organizations](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#organization), all
data in VictoriaMetrics is stored in a global namespace or within
a [tenant](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy).
Let's consider the
following [sample data](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#sample-data)
borrowed from InfluxDB docs as an example:
| _measurement | _field | location | scientist | _value | _time |
|--------------|--------|----------|-------------|--------|----------------------|
| census | bees | klamath | anderson | 23 | 2019-08-18T00:00:00Z |
| census | ants | portland | mullen | 30 | 2019-08-18T00:00:00Z |
| census | bees | klamath | anderson | 28 | 2019-08-18T00:06:00Z |
| census | ants | portland | mullen | 32 | 2019-08-18T00:06:00Z |
In VictoriaMetrics data model this sample will have the following form:
| metric name | labels | value | time |
|-------------|:---------------------------------------------|-------|----------------------|
| census_bees | {location="klamath", scientist="anderson"} | 23 | 2019-08-18T00:00:00Z |
| census_ants | {location="portland", scientist="mullen"} | 30 | 2019-08-18T00:00:00Z |
| census_bees | {location="klamath", scientist="anderson"} | 28 | 2019-08-18T00:06:00Z |
| census_ants | {location="portland", scientist="mullen"} | 32 | 2019-08-18T00:06:00Z |
Actually, the metric name in VictoriaMetrics is also a label with the static name `__name__`, and the example above can be
converted to `{__name__="census_bees", location="klamath", scientist="anderson"}`. All labels are indexed by
VictoriaMetrics, so lookups by names or labels have the same query speed.
## Write data
VictoriaMetrics
supports [InfluxDB line protocol](https://docs.victoriametrics.com/#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf)
for data ingestion. For example, to write a measurement to VictoriaMetrics we need to send an HTTP POST request with
payload in a line protocol format:
```bash
curl -d 'census,location=klamath,scientist=anderson bees=23 1566079200000' -X POST 'http://<victoriametrics-addr>:8428/write'
```
_Hint: the timestamp in the example might be outside the configured retention period of VictoriaMetrics. If that is the
case, consider increasing the retention period or changing the timestamp._
Please note that an arbitrary number of lines delimited by `\n` (newline) can be sent in a single request.
To get the written data back, let's export all series matching the `location="klamath"` filter:
```bash
curl -G 'http://<victoriametric-addr>:8428/api/v1/export' -d 'match={location="klamath"}'
```
The expected response is the following:
```json
{
"metric": {
"__name__": "census_bees",
"location": "klamath",
"scientist": "anderson"
},
"values": [
23
],
"timestamps": [
1566079200000
]
}
```
Please note that VictoriaMetrics performed additional
[data mapping](https://docs.victoriametrics.com/#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf)
to the data ingested via InfluxDB line protocol.
Support of InfluxDB line protocol also means VictoriaMetrics is compatible with
[Telegraf](https://github.com/influxdata/telegraf). To configure Telegraf, simply
add the `http://<victoriametrics-addr>:8428` URL to Telegraf configs:
```
[[outputs.influxdb]]
urls = ["http://<victoriametrics-addr>:8428"]
```
In addition to InfluxDB line protocol, VictoriaMetrics supports many other ways for
[metrics collection](https://docs.victoriametrics.com/keyConcepts.html#write-data).
## Query data
VictoriaMetrics does not have a command-line interface (CLI). Instead, it provides
an [HTTP API](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#prometheus-querying-api-usage)
for serving read queries. This API is used in various integrations such as
[Grafana](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#grafana-setup). The same API is also used
by [VMUI](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#vmui) - a graphical User Interface for
querying and visualizing metrics:
{% include img.html href="migrate-from-influx-vmui.png" %}
See more about [how to query data in VictoriaMetrics](https://docs.victoriametrics.com/keyConcepts.html#query-data).
### Basic concepts
Let's take a closer look at querying specifics with the following data sample:
```
foo,instance=localhost bar=1.00 1652169600000000000
foo,instance=localhost bar=2.00 1652169660000000000
foo,instance=localhost bar=3.00 1652169720000000000
foo,instance=localhost bar=5.00 1652169840000000000
foo,instance=localhost bar=5.50 1652169960000000000
foo,instance=localhost bar=5.50 1652170020000000000
foo,instance=localhost bar=4.00 1652170080000000000
foo,instance=localhost bar=3.50 1652170260000000000
foo,instance=localhost bar=3.25 1652170320000000000
foo,instance=localhost bar=3.00 1652170380000000000
foo,instance=localhost bar=2.00 1652170440000000000
foo,instance=localhost bar=1.00 1652170500000000000
foo,instance=localhost bar=4.00 1652170560000000000
```
The data sample consists of data points for a measurement `foo`
and a field `bar` with an additional tag `instance=localhost`. If we plot this data as a time series in Grafana,
it might have the following look:
{% include img.html href="migrate-from-influx-data-sample-in-influx.png" %}
The query used for this panel is written in
[InfluxQL](https://docs.influxdata.com/influxdb/v1.8/query_language/):
```sql
SELECT last ("bar")
FROM "foo"
WHERE ("instance" = 'localhost')
AND $timeFilter
GROUP BY time (1m)
```
Having this, let's import the same data sample into VictoriaMetrics and plot it in Grafana as well. To understand how the
InfluxQL query might be translated to MetricsQL, let's break it into components first:
* `SELECT last("bar") FROM "foo"` - all requests
to [instant](https://docs.victoriametrics.com/keyConcepts.html#instant-query)
or [range](https://docs.victoriametrics.com/keyConcepts.html#range-query) VictoriaMetrics APIs are reads, so no need
to specify the `SELECT` statement. There are no `measurements` or `fields` in VictoriaMetrics, so the whole expression
can be replaced with `foo_bar` in MetricsQL;
* `WHERE ("instance" = 'localhost')`- [filtering by labels](https://docs.victoriametrics.com/keyConcepts.html#filtering)
in MetricsQL requires specifying the filter in curly braces next to the metric name. So in MetricsQL filter expression
will be translated to `{instance="localhost"}`;
* `WHERE $timeFilter` - filtering by time is done via request params sent along with query, so in MetricsQL no need to
specify this filter;
* `GROUP BY time(1m)` - grouping by time is done by default
in the [range](https://docs.victoriametrics.com/keyConcepts.html#range-query) API according to the specified `step` param.
This param is also part of the params sent along with the request. See how to perform additional
[aggregations and grouping via MetricsQL](https://docs.victoriametrics.com/keyConcepts.html#aggregation-and-grouping-functions).
As a result, executing the `foo_bar{instance="localhost"}` MetricsQL expression with `step=1m` for the same set of data in
Grafana will have the following form:
{% include img.html href="migrate-from-influx-data-sample-in-vm.png" %}
It is noticeable that visualizations from both databases are a bit different - VictoriaMetrics shows some extra points
filling the gaps in the graph. This behavior is described in more
detail [here](https://docs.victoriametrics.com/keyConcepts.html#range-query). In InfluxDB, we can achieve a similar
behavior by adding `fill(previous)` to the query.
### Advanced usage
The good thing is that knowing the basics and some aggregation functions is often enough for using MetricsQL or PromQL.
Let's consider one of the most popular Grafana
dashboards [Node Exporter Full](https://grafana.com/grafana/dashboards/1860). It has almost 15 million downloads and
about 230 PromQL queries in it! But a closer look at those queries shows the following:
* ~120 queries are just selecting a metric with label filters,
e.g. `node_textfile_scrape_error{instance="$node",job="$job"}`;
* ~80 queries are using the [rate](https://docs.victoriametrics.com/MetricsQL.html#rate) function for a selected metric,
e.g. `rate(node_netstat_Tcp_InSegs{instance="$node",job="$job"})`;
* and the rest
are [aggregation functions](https://docs.victoriametrics.com/keyConcepts.html#aggregation-and-grouping-functions)
like [sum](https://docs.victoriametrics.com/MetricsQL.html#sum)
or [count](https://docs.victoriametrics.com/MetricsQL.html#count).
To get a better understanding of how MetricsQL works, see the following resources:
* [MetricsQL concepts](https://docs.victoriametrics.com/keyConcepts.html#metricsql);
* [MetricsQL functions](https://docs.victoriametrics.com/MetricsQL.html);
* [PromQL tutorial for beginners](https://valyala.medium.com/promql-tutorial-for-beginners-9ab455142085).
## How to migrate current data from InfluxDB to VictoriaMetrics
Migrating data from other TSDBs to VictoriaMetrics is as simple as importing data via any of
[supported formats](https://docs.victoriametrics.com/keyConcepts.html#push-model).
But migration from InfluxDB might get easier when using [vmctl](https://docs.victoriametrics.com/vmctl.html) - the
VictoriaMetrics command-line tool. See more about
migrating [from InfluxDB v1.x versions](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-influxdb-1x).
Migrating data from InfluxDB v2.x is not supported yet, but there is a
useful [3rd party solution](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-influxdb-2x) for this.
Please note that data migration is a backfilling process, so please
consider [backfilling tips](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#backfilling).
## Frequently asked questions
* How does VictoriaMetrics compare to InfluxDB?
* _[Answer](https://docs.victoriametrics.com/FAQ.html#how-does-victoriametrics-compare-to-influxdb)_
* Why doesn't VictoriaMetrics support the Remote Read API, so I don't need to learn MetricsQL?
* _[Answer](https://docs.victoriametrics.com/FAQ.html#why-doesnt-victoriametrics-support-the-prometheus-remote-read-api)_
* The PromQL and MetricsQL are often mentioned together - why is that?
* _MetricsQL is a query language inspired by PromQL. MetricsQL is backward-compatible with PromQL, so Grafana
dashboards backed by a Prometheus datasource should work the same after switching from Prometheus to
VictoriaMetrics. Both languages mostly share the same concepts with slight differences._
* Query returns more data points than expected - why?
* _VictoriaMetrics may return non-existing data points if `step` param is lower than the actual data resolution. See
more about this [here](https://docs.victoriametrics.com/keyConcepts.html#range-query)._
* How do I get the `real` last data point, not `ephemeral`?
* _[last_over_time](https://docs.victoriametrics.com/MetricsQL.html#last_over_time) function can be used for
limiting the lookbehind window for calculated data. For example, `last_over_time(metric[10s])` would return
calculated samples only if the real samples are located closer than 10 seconds to the calculated timestamps
according to
`start`, `end` and `step` query args passed
to [range query](https://docs.victoriametrics.com/keyConcepts.html#range-query)._
* How do I get raw data points with MetricsQL?
* _For getting raw data points, specify the interval at which you want them in square brackets and send the request
as an [instant query](https://docs.victoriametrics.com/keyConcepts.html#instant-query). For
example, `GET api/v1/query?query="my_metric[5m]"&time=<time>` will return raw samples for `my_metric` in the interval
from `<time>-5m` to `<time>` (see the sketch after this list)._
* Can you have multiple aggregators in a MetricsQL query, e.g. `SELECT MAX(field), MIN(field) ...`?
* _Yes, try the following query `( alias(max(field), "max"), alias(min(field), "min") )`._
* How to translate Influx `percentile` function to MetricsQL?
* _[Answer](https://stackoverflow.com/questions/66431990/translate-influx-percentile-function-to-promql)_
* How to translate Influx `stddev` function to MetricsQL?
* _[Answer](https://stackoverflow.com/questions/66433143/translate-influx-stddev-to-promql)_
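Following up on the raw-samples question above, such a request could be sketched as follows (the address, metric name and time value are placeholders):

```bash
# Fetch raw samples of my_metric for the 5 minutes preceding the given time
curl -G 'http://localhost:8428/api/v1/query' \
  --data-urlencode 'query=my_metric[5m]' \
  -d 'time=2022-05-20T00:00:00Z'
```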

Some files were not shown because too many files have changed in this diff