mirror of
https://github.com/VictoriaMetrics/VictoriaMetrics.git
synced 2025-01-10 15:14:09 +00:00
Merge branch 'public-single-node' into pmm-6401-read-prometheus-data-files
This commit is contained in:
commit
5a60387eea
149 changed files with 10528 additions and 1108 deletions
2
Makefile
2
Makefile
|
@ -283,7 +283,7 @@ golangci-lint: install-golangci-lint
|
|||
golangci-lint run --exclude '(SA4003|SA1019|SA5011):' -D errcheck -D structcheck --timeout 2m
|
||||
|
||||
install-golangci-lint:
|
||||
which golangci-lint || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v1.45.1
|
||||
which golangci-lint || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v1.46.1
|
||||
|
||||
install-wwhrd:
|
||||
which wwhrd || GO111MODULE=off go get github.com/frapposelli/wwhrd
|
||||
|
|
23
README.md
23
README.md
|
@ -1055,6 +1055,25 @@ It is recommended leaving the following amounts of spare resources:
|
|||
* 50% of spare CPU for reducing the probability of slowdowns during temporary spikes in workload.
|
||||
* At least 30% of free storage space at the directory pointed by `-storageDataPath` command-line flag. See also `-storage.minFreeDiskSpaceBytes` command-line flag description [here](#list-of-command-line-flags).
|
||||
|
||||
See also [resource usage limits docs](#resource-usage-limits).
|
||||
|
||||
## Resource usage limits
|
||||
|
||||
By default VictoriaMetrics is tuned for an optimal resource usage under typical workloads. Some workloads may need fine-grained resource usage limits. In these cases the following command-line flags may be useful:
|
||||
|
||||
- `-memory.allowedPercent` and `-search.allowedBytes` limit the amounts of memory, which may be used for various internal caches at VictoriaMetrics. Note that VictoriaMetrics may use more memory, since these flags don't limit additional memory, which may be needed on a per-query basis.
|
||||
- `-search.maxUniqueTimeseries` limits the number of unique time series a single query can find and process. VictoriaMetrics keeps in memory some metainformation about the time series located by each query and spends some CPU time for processing the found time series. This means that the maximum memory usage and CPU usage a single query can use is proportional to `-search.maxUniqueTimeseries`.
|
||||
- `-search.maxQueryDuration` limits the duration of a single query. If the query takes longer than the given duration, then it is canceled. This allows saving CPU and RAM when executing unexpected heavy queries.
|
||||
- `-search.maxConcurrentRequests` limits the number of concurrent requests VictoriaMetrics can process. Bigger number of concurrent requests usually means bigger memory usage. For example, if a single query needs 100 MiB of additional memory during its execution, then 100 concurrent queries may need `100 * 100 MiB = 10 GiB` of additional memory. So it is better to limit the number of concurrent queries, while suspending additional incoming queries if the concurrency limit is reached. VictoriaMetrics provides `-search.maxQueueDuration` command-line flag for limiting the max wait time for suspended queries.
|
||||
- `-search.maxSamplesPerSeries` limits the number of raw samples the query can process per each time series. VictoriaMetrics sequentially processes raw samples per each found time series during the query. It unpacks raw samples on the selected time range per each time series into memory and then applies the given [rollup function](https://docs.victoriametrics.com/MetricsQL.html#rollup-functions). The `-search.maxSamplesPerSeries` command-line flag allows limiting memory usage in the case when the query is executed on a time range, which contains hundreds of millions of raw samples per each located time series.
|
||||
- `-search.maxSamplesPerQuery` limits the number of raw samples a single query can process. This allows limiting CPU usage for heavy queries.
|
||||
- `-search.maxSeries` limits the number of time series, which may be returned from [/api/v1/series](https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers). This endpoint is used mostly by Grafana for auto-completion of metric names, label names and label values. Queries to this endpoint may take big amounts of CPU time and memory when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxSeries` to quite low value in order limit CPU and memory usage.
|
||||
- `-search.maxTagKeys` limits the number of items, which may be returned from [/api/v1/labels](https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names). This endpoint is used mostly by Grafana for auto-completion of label names. Queries to this endpoint may take big amounts of CPU time and memory when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxTagKeys` to quite low value in order to limit CPU and memory usage.
|
||||
- `-search.maxTagValues` limits the number of items, which may be returned from [/api/v1/label/.../values](https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values). This endpoint is used mostly by Grafana for auto-completion of label values. Queries to this endpoint may take big amounts of CPU time and memory when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxTagValues` to quite low value in order to limit CPU and memory usage.
|
||||
|
||||
See also [capacity planning docs](#capacity-planning).
|
||||
|
||||
|
||||
## High availability
|
||||
|
||||
* Install multiple VictoriaMetrics instances in distinct datacenters (availability zones).
|
||||
|
@ -1682,7 +1701,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
|
|||
-influxDBLabel string
|
||||
Default label for the DB name sent over '?db={db_name}' query parameter (default "db")
|
||||
-influxListenAddr string
|
||||
TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write
|
||||
TCP and UDP address to listen for InfluxDB line protocol data. Usually :8089 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write
|
||||
-influxMeasurementFieldSeparator string
|
||||
Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_")
|
||||
-influxSkipMeasurement
|
||||
|
@ -1745,7 +1764,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
|
|||
-promscrape.cluster.membersCount int
|
||||
The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets
|
||||
-promscrape.cluster.replicationFactor int
|
||||
The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 2, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1)
|
||||
The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 1, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1)
|
||||
-promscrape.config string
|
||||
Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details
|
||||
-promscrape.config.dryRun
|
||||
|
|
|
@ -765,7 +765,7 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
|
|||
-influxDBLabel string
|
||||
Default label for the DB name sent over '?db={db_name}' query parameter (default "db")
|
||||
-influxListenAddr string
|
||||
TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<vmagent>:8429/write
|
||||
TCP and UDP address to listen for InfluxDB line protocol data. Usually :8089 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<vmagent>:8429/write
|
||||
-influxMeasurementFieldSeparator string
|
||||
Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_")
|
||||
-influxSkipMeasurement
|
||||
|
@ -846,7 +846,7 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
|
|||
-promscrape.cluster.membersCount int
|
||||
The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets
|
||||
-promscrape.cluster.replicationFactor int
|
||||
The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 2, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1)
|
||||
The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 1, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1)
|
||||
-promscrape.config string
|
||||
Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details
|
||||
-promscrape.config.dryRun
|
||||
|
|
|
@ -43,7 +43,7 @@ var (
|
|||
httpListenAddr = flag.String("httpListenAddr", ":8429", "TCP address to listen for http connections. "+
|
||||
"Set this flag to empty value in order to disable listening on any port. This mode may be useful for running multiple vmagent instances on the same server. "+
|
||||
"Note that /targets and /metrics pages aren't available if -httpListenAddr=''")
|
||||
influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. "+
|
||||
influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for InfluxDB line protocol data. Usually :8089 must be set. Doesn't work if empty. "+
|
||||
"This flag isn't needed when ingesting data over HTTP - just send it to http://<vmagent>:8429/write")
|
||||
graphiteListenAddr = flag.String("graphiteListenAddr", "", "TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty")
|
||||
opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpentTSDB metrics. "+
|
||||
|
|
|
@ -70,6 +70,8 @@ var (
|
|||
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
|
||||
awsAccessKey = flagutil.NewArray("remoteWrite.aws.accessKey", "Optional AWS AccessKey to use for -remoteWrite.url if -remoteWrite.aws.useSigv4 is set. "+
|
||||
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
|
||||
awsService = flagutil.NewArray("remoteWrite.aws.serice", "Optional AWS Service to use for -remoteWrite.url if -remoteWrite.aws.useSigv4 is set. "+
|
||||
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url. Defaults to \"aps\".")
|
||||
awsSecretKey = flagutil.NewArray("remoteWrite.aws.secretKey", "Optional AWS SecretKey to use for -remoteWrite.url if -remoteWrite.aws.useSigv4 is set. "+
|
||||
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
|
||||
)
|
||||
|
@ -232,7 +234,8 @@ func getAWSAPIConfig(argIdx int) (*awsapi.Config, error) {
|
|||
roleARN := awsRoleARN.GetOptionalArg(argIdx)
|
||||
accessKey := awsAccessKey.GetOptionalArg(argIdx)
|
||||
secretKey := awsSecretKey.GetOptionalArg(argIdx)
|
||||
cfg, err := awsapi.NewConfig(region, roleARN, accessKey, secretKey)
|
||||
service := awsService.GetOptionalArg(argIdx)
|
||||
cfg, err := awsapi.NewConfig(region, roleARN, accessKey, secretKey, service)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
|
@ -307,7 +310,7 @@ again:
|
|||
req.Header.Set("Authorization", ah)
|
||||
}
|
||||
if c.awsCfg != nil {
|
||||
if err := c.awsCfg.SignRequest(req, "aps", sigv4Hash); err != nil {
|
||||
if err := c.awsCfg.SignRequest(req, sigv4Hash); err != nil {
|
||||
// there is no need in retry, request will be rejected by client.Do and retried by code below
|
||||
logger.Warnf("cannot sign remoteWrite request with AWS sigv4: %s", err)
|
||||
}
|
||||
|
|
|
@ -62,13 +62,14 @@ publish-vmalert:
|
|||
|
||||
test-vmalert:
|
||||
go test -v -race -cover ./app/vmalert -loggerLevel=ERROR
|
||||
go test -v -race -cover ./app/vmalert/templates
|
||||
go test -v -race -cover ./app/vmalert/datasource
|
||||
go test -v -race -cover ./app/vmalert/notifier
|
||||
go test -v -race -cover ./app/vmalert/config
|
||||
go test -v -race -cover ./app/vmalert/remotewrite
|
||||
|
||||
run-vmalert: vmalert
|
||||
./bin/vmalert -rule=app/vmalert/config/testdata/rules2-good.rules \
|
||||
./bin/vmalert -rule=app/vmalert/config/testdata/rules/rules2-good.rules \
|
||||
-datasource.url=http://localhost:8428 \
|
||||
-notifier.url=http://localhost:9093 \
|
||||
-notifier.url=http://127.0.0.1:9093 \
|
||||
|
@ -77,7 +78,7 @@ run-vmalert: vmalert
|
|||
-external.label=cluster=east-1 \
|
||||
-external.label=replica=a \
|
||||
-evaluationInterval=3s \
|
||||
-rule.configCheckInterval=10s
|
||||
-configCheckInterval=10s
|
||||
|
||||
run-vmalert-sd: vmalert
|
||||
./bin/vmalert -rule=app/vmalert/config/testdata/rules2-good.rules \
|
||||
|
|
|
@ -13,23 +13,24 @@ implementation and aims to be compatible with its syntax.
|
|||
|
||||
* Integration with [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics) TSDB;
|
||||
* VictoriaMetrics [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html)
|
||||
support and expressions validation;
|
||||
support and expressions validation;
|
||||
* Prometheus [alerting rules definition format](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/#defining-alerting-rules)
|
||||
support;
|
||||
support;
|
||||
* Integration with [Alertmanager](https://github.com/prometheus/alertmanager) starting from [Alertmanager v0.16.0-aplha](https://github.com/prometheus/alertmanager/releases/tag/v0.16.0-alpha.0);
|
||||
* Keeps the alerts [state on restarts](#alerts-state-on-restarts);
|
||||
* Graphite datasource can be used for alerting and recording rules. See [these docs](#graphite);
|
||||
* Recording and Alerting rules backfilling (aka `replay`). See [these docs](#rules-backfilling);
|
||||
* Lightweight without extra dependencies.
|
||||
* Supports [reusable templates](#reusable-templates) for annotations.
|
||||
|
||||
## Limitations
|
||||
|
||||
* `vmalert` execute queries against remote datasource which has reliability risks because of the network.
|
||||
It is recommended to configure alerts thresholds and rules expressions with the understanding that network
|
||||
requests may fail;
|
||||
It is recommended to configure alerts thresholds and rules expressions with the understanding that network
|
||||
requests may fail;
|
||||
* by default, rules execution is sequential within one group, but persistence of execution results to remote
|
||||
storage is asynchronous. Hence, user shouldn't rely on chaining of recording rules when result of previous
|
||||
recording rule is reused in the next one;
|
||||
storage is asynchronous. Hence, user shouldn't rely on chaining of recording rules when result of previous
|
||||
recording rule is reused in the next one;
|
||||
|
||||
## QuickStart
|
||||
|
||||
|
@ -48,8 +49,8 @@ To start using `vmalert` you will need the following things:
|
|||
* list of rules - PromQL/MetricsQL expressions to execute;
|
||||
* datasource address - reachable MetricsQL endpoint to run queries against;
|
||||
* notifier address [optional] - reachable [Alert Manager](https://github.com/prometheus/alertmanager) instance for processing,
|
||||
aggregating alerts, and sending notifications. Please note, notifier address also supports Consul and DNS Service Discovery via
|
||||
[config file](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/config.go).
|
||||
aggregating alerts, and sending notifications. Please note, notifier address also supports Consul and DNS Service Discovery via
|
||||
[config file](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/config.go).
|
||||
* remote write address [optional] - [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations)
|
||||
compatible storage to persist rules and alerts state info;
|
||||
* remote read address [optional] - MetricsQL compatible datasource to restore alerts state from.
|
||||
|
@ -146,12 +147,12 @@ expression and then act according to the Rule type.
|
|||
There are two types of Rules:
|
||||
|
||||
* [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) -
|
||||
Alerting rules allow defining alert conditions via `expr` field and to send notifications to
|
||||
[Alertmanager](https://github.com/prometheus/alertmanager) if execution result is not empty.
|
||||
Alerting rules allow defining alert conditions via `expr` field and to send notifications to
|
||||
[Alertmanager](https://github.com/prometheus/alertmanager) if execution result is not empty.
|
||||
* [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) -
|
||||
Recording rules allow defining `expr` which result will be then backfilled to configured
|
||||
`-remoteWrite.url`. Recording rules are used to precompute frequently needed or computationally
|
||||
expensive expressions and save their result as a new set of time series.
|
||||
Recording rules allow defining `expr` which result will be then backfilled to configured
|
||||
`-remoteWrite.url`. Recording rules are used to precompute frequently needed or computationally
|
||||
expensive expressions and save their result as a new set of time series.
|
||||
|
||||
`vmalert` forbids defining duplicates - rules with the same combination of name, expression, and labels
|
||||
within one group.
|
||||
|
@ -184,10 +185,52 @@ annotations:
|
|||
[ <labelname>: <tmpl_string> ]
|
||||
```
|
||||
|
||||
It is allowed to use [Go templating](https://golang.org/pkg/text/template/) in annotations
|
||||
to format data, iterate over it or execute expressions.
|
||||
It is allowed to use [Go templating](https://golang.org/pkg/text/template/) in annotations to format data, iterate over it or execute expressions.
|
||||
Additionally, `vmalert` provides some extra templating functions
|
||||
listed [here](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/template_func.go).
|
||||
listed [here](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/template_func.go) and [reusable templates](#reusable-templates).
|
||||
|
||||
#### Reusable templates
|
||||
|
||||
Like in Alertmanager you can define [reusable templates](https://prometheus.io/docs/prometheus/latest/configuration/template_examples/#defining-reusable-templates)
|
||||
to share same templates across annotations. Just define the templates in a file and
|
||||
set the path via `-rule.templates` flag.
|
||||
|
||||
For example, template `grafana.filter` can be defined as following:
|
||||
|
||||
{% raw %}
|
||||
```
|
||||
{{ define "grafana.filter" -}}
|
||||
{{- $labels := .arg0 -}}
|
||||
{{- range $name, $label := . -}}
|
||||
{{- if (ne $name "arg0") -}}
|
||||
{{- ( or (index $labels $label) "All" ) | printf "&var-%s=%s" $label -}}
|
||||
{{- end -}}
|
||||
{{- end -}}
|
||||
{{- end -}}
|
||||
```
|
||||
{% endraw %}
|
||||
|
||||
And then used in annotations:
|
||||
|
||||
{% raw %}
|
||||
```yaml
|
||||
groups:
|
||||
- name: AlertGroupName
|
||||
rules:
|
||||
- alert: AlertName
|
||||
expr: any_metric > 100
|
||||
for: 30s
|
||||
labels:
|
||||
alertname: 'Any metric is too high'
|
||||
severity: 'warning'
|
||||
annotations:
|
||||
dashboard: '{{ $externalURL }}/d/dashboard?orgId=1{{ template "grafana.filter" (args .CommonLabels "account_id" "any_label") }}'
|
||||
```
|
||||
{% endraw %}
|
||||
|
||||
The `-rule.templates` flag supports wildcards so multiple files with templates can be loaded.
|
||||
The content of `-rule.templates` can be also [hot reloaded](#hot-config-reload).
|
||||
|
||||
|
||||
#### Recording rules
|
||||
|
||||
|
@ -215,11 +258,11 @@ For recording rules to work `-remoteWrite.url` must be specified.
|
|||
the process alerts state will be lost. To avoid this situation, `vmalert` should be configured via the following flags:
|
||||
|
||||
* `-remoteWrite.url` - URL to VictoriaMetrics (Single) or vminsert (Cluster). `vmalert` will persist alerts state
|
||||
into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
|
||||
These are regular time series and maybe queried from VM just as any other time series.
|
||||
The state is stored to the configured address on every rule evaluation.
|
||||
into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
|
||||
These are regular time series and maybe queried from VM just as any other time series.
|
||||
The state is stored to the configured address on every rule evaluation.
|
||||
* `-remoteRead.url` - URL to VictoriaMetrics (Single) or vmselect (Cluster). `vmalert` will try to restore alerts state
|
||||
from configured address by querying time series with name `ALERTS_FOR_STATE`.
|
||||
from configured address by querying time series with name `ALERTS_FOR_STATE`.
|
||||
|
||||
Both flags are required for proper state restoration. Restore process may fail if time series are missing
|
||||
in configured `-remoteRead.url`, weren't updated in the last `1h` (controlled by `-remoteRead.lookback`)
|
||||
|
@ -275,7 +318,7 @@ for different scenarios.
|
|||
Please note, not all flags in examples are required:
|
||||
|
||||
* `-remoteWrite.url` and `-remoteRead.url` are optional and are needed only if
|
||||
you have recording rules or want to store [alerts state](#alerts-state-on-restarts) on `vmalert` restarts;
|
||||
you have recording rules or want to store [alerts state](#alerts-state-on-restarts) on `vmalert` restarts;
|
||||
* `-notifier.url` is optional and is needed only if you have alerting rules.
|
||||
|
||||
#### Single-node VictoriaMetrics
|
||||
|
@ -341,6 +384,7 @@ Alertmanagers.
|
|||
|
||||
To avoid recording rules results and alerts state duplication in VictoriaMetrics server
|
||||
don't forget to configure [deduplication](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#deduplication).
|
||||
The recommended value for `-dedup.minScrapeInterval` must be greater or equal to vmalert's `evaluation_interval`.
|
||||
|
||||
Alertmanager will automatically deduplicate alerts with identical labels, so ensure that
|
||||
all `vmalert`s are having the same config.
|
||||
|
@ -384,7 +428,7 @@ See also [downsampling docs](https://docs.victoriametrics.com/#downsampling).
|
|||
* `http://<vmalert-addr>/api/v1/rules` - list of all loaded groups and rules;
|
||||
* `http://<vmalert-addr>/api/v1/alerts` - list of all active alerts;
|
||||
* `http://<vmalert-addr>/api/v1/<groupID>/<alertID>/status"` - get alert status by ID.
|
||||
Used as alert source in AlertManager.
|
||||
Used as alert source in AlertManager.
|
||||
* `http://<vmalert-addr>/metrics` - application metrics.
|
||||
* `http://<vmalert-addr>/-/reload` - hot configuration reload.
|
||||
|
||||
|
@ -473,17 +517,17 @@ Execute the query against storage which was used for `-remoteWrite.url` during t
|
|||
There are following non-required `replay` flags:
|
||||
|
||||
* `-replay.maxDatapointsPerQuery` - the max number of data points expected to receive in one request.
|
||||
In two words, it affects the max time range for every `/query_range` request. The higher the value,
|
||||
the fewer requests will be issued during `replay`.
|
||||
In two words, it affects the max time range for every `/query_range` request. The higher the value,
|
||||
the fewer requests will be issued during `replay`.
|
||||
* `-replay.ruleRetryAttempts` - when datasource fails to respond vmalert will make this number of retries
|
||||
per rule before giving up.
|
||||
per rule before giving up.
|
||||
* `-replay.rulesDelay` - delay between sequential rules execution. Important in cases if there are chaining
|
||||
(rules which depend on each other) rules. It is expected, that remote storage will be able to persist
|
||||
previously accepted data during the delay, so data will be available for the subsequent queries.
|
||||
Keep it equal or bigger than `-remoteWrite.flushInterval`.
|
||||
(rules which depend on each other) rules. It is expected, that remote storage will be able to persist
|
||||
previously accepted data during the delay, so data will be available for the subsequent queries.
|
||||
Keep it equal or bigger than `-remoteWrite.flushInterval`.
|
||||
* `replay.disableProgressBar` - whether to disable progress bar which shows progress work.
|
||||
Progress bar may generate a lot of log records, which is not formatted as standard VictoriaMetrics logger.
|
||||
It could break logs parsing by external system and generate additional load on it.
|
||||
Progress bar may generate a lot of log records, which is not formatted as standard VictoriaMetrics logger.
|
||||
It could break logs parsing by external system and generate additional load on it.
|
||||
|
||||
See full description for these flags in `./vmalert --help`.
|
||||
|
||||
|
@ -792,6 +836,11 @@ The shortlist of configuration flags is the following:
|
|||
absolute path to all .yaml files in root.
|
||||
Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars.
|
||||
Supports an array of values separated by comma or specified via multiple flags.
|
||||
-rule.templates
|
||||
Path or glob pattern to location with go template definitions for rules annotations templating. Flag can be specified multiple times.
|
||||
Examples:
|
||||
-rule.templates="/path/to/file". Path to a single file with go templates
|
||||
-rule.templates="dir/*.tpl" -rule.templates="/*.tpl". Relative path to all .tpl files in "dir" folder, absolute path to all .tpl files in root.
|
||||
-rule.configCheckInterval duration
|
||||
Interval for checking for changes in '-rule' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes. DEPRECATED - see '-configCheckInterval' instead
|
||||
-rule.maxResolveDuration duration
|
||||
|
@ -822,7 +871,7 @@ The shortlist of configuration flags is the following:
|
|||
* send SIGHUP signal to `vmalert` process;
|
||||
* send GET request to `/-/reload` endpoint;
|
||||
* configure `-configCheckInterval` flag for periodic reload
|
||||
on config change.
|
||||
on config change.
|
||||
|
||||
### URL params
|
||||
|
||||
|
|
|
@ -12,6 +12,7 @@ import (
|
|||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
|
||||
|
@ -152,7 +153,7 @@ type labelSet struct {
|
|||
|
||||
// toLabels converts labels from given Metric
|
||||
// to labelSet which contains original and processed labels.
|
||||
func (ar *AlertingRule) toLabels(m datasource.Metric, qFn notifier.QueryFn) (*labelSet, error) {
|
||||
func (ar *AlertingRule) toLabels(m datasource.Metric, qFn templates.QueryFn) (*labelSet, error) {
|
||||
ls := &labelSet{
|
||||
origin: make(map[string]string, len(m.Labels)),
|
||||
processed: make(map[string]string),
|
||||
|
@ -323,7 +324,7 @@ func (ar *AlertingRule) Exec(ctx context.Context, ts time.Time) ([]prompbmarshal
|
|||
}
|
||||
continue
|
||||
}
|
||||
if a.State == notifier.StatePending && time.Since(a.ActiveAt) >= ar.For {
|
||||
if a.State == notifier.StatePending && ts.Sub(a.ActiveAt) >= ar.For {
|
||||
a.State = notifier.StateFiring
|
||||
a.Start = ts
|
||||
alertsFired.Inc()
|
||||
|
@ -382,7 +383,7 @@ func hash(labels map[string]string) uint64 {
|
|||
return hash.Sum64()
|
||||
}
|
||||
|
||||
func (ar *AlertingRule) newAlert(m datasource.Metric, ls *labelSet, start time.Time, qFn notifier.QueryFn) (*notifier.Alert, error) {
|
||||
func (ar *AlertingRule) newAlert(m datasource.Metric, ls *labelSet, start time.Time, qFn templates.QueryFn) (*notifier.Alert, error) {
|
||||
var err error
|
||||
if ls == nil {
|
||||
ls, err = ar.toLabels(m, qFn)
|
||||
|
|
|
@ -10,18 +10,19 @@ import (
|
|||
"gopkg.in/yaml.v2"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
|
||||
)
|
||||
|
||||
func TestMain(m *testing.M) {
|
||||
u, _ := url.Parse("https://victoriametrics.com/path")
|
||||
notifier.InitTemplateFunc(u)
|
||||
if err := templates.Load([]string{"testdata/templates/*good.tmpl"}, true); err != nil {
|
||||
os.Exit(1)
|
||||
}
|
||||
os.Exit(m.Run())
|
||||
}
|
||||
|
||||
func TestParseGood(t *testing.T) {
|
||||
if _, err := Parse([]string{"testdata/*good.rules", "testdata/dir/*good.*"}, true, true); err != nil {
|
||||
if _, err := Parse([]string{"testdata/rules/*good.rules", "testdata/dir/*good.*"}, true, true); err != nil {
|
||||
t.Errorf("error parsing files %s", err)
|
||||
}
|
||||
}
|
||||
|
@ -32,7 +33,7 @@ func TestParseBad(t *testing.T) {
|
|||
expErr string
|
||||
}{
|
||||
{
|
||||
[]string{"testdata/rules0-bad.rules"},
|
||||
[]string{"testdata/rules/rules0-bad.rules"},
|
||||
"unexpected token",
|
||||
},
|
||||
{
|
||||
|
@ -56,7 +57,7 @@ func TestParseBad(t *testing.T) {
|
|||
"either `record` or `alert` must be set",
|
||||
},
|
||||
{
|
||||
[]string{"testdata/rules1-bad.rules"},
|
||||
[]string{"testdata/rules/rules1-bad.rules"},
|
||||
"bad graphite expr",
|
||||
},
|
||||
}
|
||||
|
|
|
@ -2,8 +2,6 @@ groups:
|
|||
- name: TestGroup
|
||||
interval: 2s
|
||||
concurrency: 2
|
||||
extra_filter_labels: # deprecated param, use `params` instead
|
||||
job: victoriametrics
|
||||
params:
|
||||
denyPartialResponse: ["true"]
|
||||
extra_label: ["env=dev"]
|
||||
|
@ -49,4 +47,12 @@ groups:
|
|||
expr: |2
|
||||
sum(code:requests:rate5m{code="200"})
|
||||
/
|
||||
sum(code:requests:rate5m)
|
||||
sum(code:requests:rate5m)
|
||||
- record: code:requests:slo
|
||||
labels:
|
||||
recording: true
|
||||
expr: 0.95
|
||||
- record: time:current
|
||||
labels:
|
||||
recording: true
|
||||
expr: time()
|
3
app/vmalert/config/testdata/templates/templates0-good.tmpl
vendored
Normal file
3
app/vmalert/config/testdata/templates/templates0-good.tmpl
vendored
Normal file
|
@ -0,0 +1,3 @@
|
|||
{{ define "template0" }}
|
||||
Visit {{ externalURL }}
|
||||
{{ end }}
|
3
app/vmalert/config/testdata/templates/templates1-good.tmpl
vendored
Normal file
3
app/vmalert/config/testdata/templates/templates1-good.tmpl
vendored
Normal file
|
@ -0,0 +1,3 @@
|
|||
{{ define "template1" }}
|
||||
{{ 1048576 | humanize1024 }}
|
||||
{{ end }}
|
3
app/vmalert/config/testdata/templates/templates2-good.tmpl
vendored
Normal file
3
app/vmalert/config/testdata/templates/templates2-good.tmpl
vendored
Normal file
|
@ -0,0 +1,3 @@
|
|||
{{ define "template2" }}
|
||||
{{ 1048576 | humanize1024 }}
|
||||
{{ end }}
|
3
app/vmalert/config/testdata/templates/templates3-good.tmpl
vendored
Normal file
3
app/vmalert/config/testdata/templates/templates3-good.tmpl
vendored
Normal file
|
@ -0,0 +1,3 @@
|
|||
{{ define "template3" }}
|
||||
{{ printf "%s to %s!" "welcome" "hell" | toUpper }}
|
||||
{{ end }}
|
3
app/vmalert/config/testdata/templates/templates4-good-tmpl
vendored
Normal file
3
app/vmalert/config/testdata/templates/templates4-good-tmpl
vendored
Normal file
|
@ -0,0 +1,3 @@
|
|||
{{ define "template3" }}
|
||||
{{ 1230912039102391023.0 | humanizeDuration }}
|
||||
{{ end }}
|
|
@ -12,7 +12,7 @@ import (
|
|||
|
||||
var (
|
||||
addr = flag.String("datasource.url", "", "VictoriaMetrics or vmselect url. Required parameter. "+
|
||||
"E.g. http://127.0.0.1:8428")
|
||||
"E.g. http://127.0.0.1:8428 . See also -remoteRead.disablePathAppend")
|
||||
appendTypePrefix = flag.Bool("datasource.appendTypePrefix", false, "Whether to add type prefix to -datasource.url based on the query type. Set to true if sending different query types to the vmselect URL.")
|
||||
|
||||
basicAuthUsername = flag.String("datasource.basicAuth.username", "", "Optional basic auth username for -datasource.url")
|
||||
|
|
|
@ -24,20 +24,18 @@ type VMStorage struct {
|
|||
dataSourceType Type
|
||||
evaluationInterval time.Duration
|
||||
extraParams url.Values
|
||||
disablePathAppend bool
|
||||
}
|
||||
|
||||
// Clone makes clone of VMStorage, shares http client.
|
||||
func (s *VMStorage) Clone() *VMStorage {
|
||||
return &VMStorage{
|
||||
c: s.c,
|
||||
authCfg: s.authCfg,
|
||||
datasourceURL: s.datasourceURL,
|
||||
lookBack: s.lookBack,
|
||||
queryStep: s.queryStep,
|
||||
appendTypePrefix: s.appendTypePrefix,
|
||||
dataSourceType: s.dataSourceType,
|
||||
disablePathAppend: s.disablePathAppend,
|
||||
c: s.c,
|
||||
authCfg: s.authCfg,
|
||||
datasourceURL: s.datasourceURL,
|
||||
lookBack: s.lookBack,
|
||||
queryStep: s.queryStep,
|
||||
appendTypePrefix: s.appendTypePrefix,
|
||||
dataSourceType: s.dataSourceType,
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -57,16 +55,15 @@ func (s *VMStorage) BuildWithParams(params QuerierParams) Querier {
|
|||
}
|
||||
|
||||
// NewVMStorage is a constructor for VMStorage
|
||||
func NewVMStorage(baseURL string, authCfg *promauth.Config, lookBack time.Duration, queryStep time.Duration, appendTypePrefix bool, c *http.Client, disablePathAppend bool) *VMStorage {
|
||||
func NewVMStorage(baseURL string, authCfg *promauth.Config, lookBack time.Duration, queryStep time.Duration, appendTypePrefix bool, c *http.Client) *VMStorage {
|
||||
return &VMStorage{
|
||||
c: c,
|
||||
authCfg: authCfg,
|
||||
datasourceURL: strings.TrimSuffix(baseURL, "/"),
|
||||
appendTypePrefix: appendTypePrefix,
|
||||
lookBack: lookBack,
|
||||
queryStep: queryStep,
|
||||
dataSourceType: NewPrometheusType(),
|
||||
disablePathAppend: disablePathAppend,
|
||||
c: c,
|
||||
authCfg: authCfg,
|
||||
datasourceURL: strings.TrimSuffix(baseURL, "/"),
|
||||
appendTypePrefix: appendTypePrefix,
|
||||
lookBack: lookBack,
|
||||
queryStep: queryStep,
|
||||
dataSourceType: NewPrometheusType(),
|
||||
}
|
||||
}
|
||||
|
||||
|
|
|
@ -2,12 +2,18 @@ package datasource
|
|||
|
||||
import (
|
||||
"encoding/json"
|
||||
"flag"
|
||||
"fmt"
|
||||
"net/http"
|
||||
"strconv"
|
||||
"time"
|
||||
)
|
||||
|
||||
var (
|
||||
disablePathAppend = flag.Bool("remoteRead.disablePathAppend", false, "Whether to disable automatic appending of '/api/v1/query' path "+
|
||||
"to the configured -datasource.url and -remoteRead.url")
|
||||
)
|
||||
|
||||
type promResponse struct {
|
||||
Status string `json:"status"`
|
||||
ErrorType string `json:"errorType"`
|
||||
|
@ -25,13 +31,6 @@ type promInstant struct {
|
|||
} `json:"result"`
|
||||
}
|
||||
|
||||
type promRange struct {
|
||||
Result []struct {
|
||||
Labels map[string]string `json:"metric"`
|
||||
TVs [][2]interface{} `json:"values"`
|
||||
} `json:"result"`
|
||||
}
|
||||
|
||||
func (r promInstant) metrics() ([]Metric, error) {
|
||||
var result []Metric
|
||||
for i, res := range r.Result {
|
||||
|
@ -50,6 +49,13 @@ func (r promInstant) metrics() ([]Metric, error) {
|
|||
return result, nil
|
||||
}
|
||||
|
||||
type promRange struct {
|
||||
Result []struct {
|
||||
Labels map[string]string `json:"metric"`
|
||||
TVs [][2]interface{} `json:"values"`
|
||||
} `json:"result"`
|
||||
}
|
||||
|
||||
func (r promRange) metrics() ([]Metric, error) {
|
||||
var result []Metric
|
||||
for i, res := range r.Result {
|
||||
|
@ -74,9 +80,22 @@ func (r promRange) metrics() ([]Metric, error) {
|
|||
return result, nil
|
||||
}
|
||||
|
||||
type promScalar [2]interface{}
|
||||
|
||||
func (r promScalar) metrics() ([]Metric, error) {
|
||||
var m Metric
|
||||
f, err := strconv.ParseFloat(r[1].(string), 64)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("metric %v, unable to parse float64 from %s: %w", r, r[1], err)
|
||||
}
|
||||
m.Values = append(m.Values, f)
|
||||
m.Timestamps = append(m.Timestamps, int64(r[0].(float64)))
|
||||
return []Metric{m}, nil
|
||||
}
|
||||
|
||||
const (
|
||||
statusSuccess, statusError = "success", "error"
|
||||
rtVector, rtMatrix = "vector", "matrix"
|
||||
statusSuccess, statusError = "success", "error"
|
||||
rtVector, rtMatrix, rScalar = "vector", "matrix", "scalar"
|
||||
)
|
||||
|
||||
func parsePrometheusResponse(req *http.Request, resp *http.Response) ([]Metric, error) {
|
||||
|
@ -103,23 +122,23 @@ func parsePrometheusResponse(req *http.Request, resp *http.Response) ([]Metric,
|
|||
return nil, err
|
||||
}
|
||||
return pr.metrics()
|
||||
case rScalar:
|
||||
var ps promScalar
|
||||
if err := json.Unmarshal(r.Data.Result, &ps); err != nil {
|
||||
return nil, err
|
||||
}
|
||||
return ps.metrics()
|
||||
default:
|
||||
return nil, fmt.Errorf("unknown result type %q", r.Data.ResultType)
|
||||
}
|
||||
}
|
||||
|
||||
const (
|
||||
prometheusInstantPath = "/api/v1/query"
|
||||
prometheusRangePath = "/api/v1/query_range"
|
||||
prometheusPrefix = "/prometheus"
|
||||
)
|
||||
|
||||
func (s *VMStorage) setPrometheusInstantReqParams(r *http.Request, query string, timestamp time.Time) {
|
||||
if s.appendTypePrefix {
|
||||
r.URL.Path += prometheusPrefix
|
||||
r.URL.Path += "/prometheus"
|
||||
}
|
||||
if !s.disablePathAppend {
|
||||
r.URL.Path += prometheusInstantPath
|
||||
if !*disablePathAppend {
|
||||
r.URL.Path += "/api/v1/query"
|
||||
}
|
||||
q := r.URL.Query()
|
||||
if s.lookBack > 0 {
|
||||
|
@ -136,10 +155,10 @@ func (s *VMStorage) setPrometheusInstantReqParams(r *http.Request, query string,
|
|||
|
||||
func (s *VMStorage) setPrometheusRangeReqParams(r *http.Request, query string, start, end time.Time) {
|
||||
if s.appendTypePrefix {
|
||||
r.URL.Path += prometheusPrefix
|
||||
r.URL.Path += "/prometheus"
|
||||
}
|
||||
if !s.disablePathAppend {
|
||||
r.URL.Path += prometheusRangePath
|
||||
if !*disablePathAppend {
|
||||
r.URL.Path += "/api/v1/query_range"
|
||||
}
|
||||
q := r.URL.Query()
|
||||
q.Add("start", fmt.Sprintf("%d", start.Unix()))
|
||||
|
|
|
@ -37,7 +37,7 @@ func TestVMInstantQuery(t *testing.T) {
|
|||
mux.HandleFunc("/render", func(w http.ResponseWriter, request *http.Request) {
|
||||
c++
|
||||
switch c {
|
||||
case 7:
|
||||
case 8:
|
||||
w.Write([]byte(`[{"target":"constantLine(10)","tags":{"name":"constantLine(10)"},"datapoints":[[10,1611758343],[10,1611758373],[10,1611758403]]}]`))
|
||||
}
|
||||
})
|
||||
|
@ -75,6 +75,8 @@ func TestVMInstantQuery(t *testing.T) {
|
|||
w.Write([]byte(`{"status":"success","data":{"resultType":"matrix"}}`))
|
||||
case 6:
|
||||
w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"vm_rows"},"value":[1583786142,"13763"]},{"metric":{"__name__":"vm_requests"},"value":[1583786140,"2000"]}]}}`))
|
||||
case 7:
|
||||
w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "1"]}}`))
|
||||
}
|
||||
})
|
||||
|
||||
|
@ -85,31 +87,26 @@ func TestVMInstantQuery(t *testing.T) {
|
|||
if err != nil {
|
||||
t.Fatalf("unexpected: %s", err)
|
||||
}
|
||||
s := NewVMStorage(srv.URL, authCfg, time.Minute, 0, false, srv.Client(), false)
|
||||
s := NewVMStorage(srv.URL, authCfg, time.Minute, 0, false, srv.Client())
|
||||
|
||||
p := NewPrometheusType()
|
||||
pq := s.BuildWithParams(QuerierParams{DataSourceType: &p, EvaluationInterval: 15 * time.Second})
|
||||
ts := time.Now()
|
||||
|
||||
if _, err := pq.Query(ctx, query, ts); err == nil {
|
||||
t.Fatalf("expected connection error got nil")
|
||||
expErr := func(err string) {
|
||||
if _, err := pq.Query(ctx, query, ts); err == nil {
|
||||
t.Fatalf("expected %q got nil", err)
|
||||
}
|
||||
}
|
||||
if _, err := pq.Query(ctx, query, ts); err == nil {
|
||||
t.Fatalf("expected invalid response status error got nil")
|
||||
}
|
||||
if _, err := pq.Query(ctx, query, ts); err == nil {
|
||||
t.Fatalf("expected response body error got nil")
|
||||
}
|
||||
if _, err := pq.Query(ctx, query, ts); err == nil {
|
||||
t.Fatalf("expected error status got nil")
|
||||
}
|
||||
if _, err := pq.Query(ctx, query, ts); err == nil {
|
||||
t.Fatalf("expected unknown status got nil")
|
||||
}
|
||||
if _, err := pq.Query(ctx, query, ts); err == nil {
|
||||
t.Fatalf("expected non-vector resultType error got nil")
|
||||
}
|
||||
m, err := pq.Query(ctx, query, ts)
|
||||
|
||||
expErr("connection error") // 0
|
||||
expErr("invalid response status error") // 1
|
||||
expErr("response body error") // 2
|
||||
expErr("error status") // 3
|
||||
expErr("unknown status") // 4
|
||||
expErr("non-vector resultType error") // 5
|
||||
|
||||
m, err := pq.Query(ctx, query, ts) // 6 - vector
|
||||
if err != nil {
|
||||
t.Fatalf("unexpected %s", err)
|
||||
}
|
||||
|
@ -132,10 +129,27 @@ func TestVMInstantQuery(t *testing.T) {
|
|||
t.Fatalf("unexpected metric %+v want %+v", m, expected)
|
||||
}
|
||||
|
||||
m, err = pq.Query(ctx, query, ts) // 7 - scalar
|
||||
if err != nil {
|
||||
t.Fatalf("unexpected %s", err)
|
||||
}
|
||||
if len(m) != 1 {
|
||||
t.Fatalf("expected 1 metrics got %d in %+v", len(m), m)
|
||||
}
|
||||
expected = []Metric{
|
||||
{
|
||||
Timestamps: []int64{1583786142},
|
||||
Values: []float64{1},
|
||||
},
|
||||
}
|
||||
if !reflect.DeepEqual(m, expected) {
|
||||
t.Fatalf("unexpected metric %+v want %+v", m, expected)
|
||||
}
|
||||
|
||||
g := NewGraphiteType()
|
||||
gq := s.BuildWithParams(QuerierParams{DataSourceType: &g})
|
||||
|
||||
m, err = gq.Query(ctx, queryRender, ts)
|
||||
m, err = gq.Query(ctx, queryRender, ts) // 8 - graphite
|
||||
if err != nil {
|
||||
t.Fatalf("unexpected %s", err)
|
||||
}
|
||||
|
@ -196,7 +210,7 @@ func TestVMRangeQuery(t *testing.T) {
|
|||
if err != nil {
|
||||
t.Fatalf("unexpected: %s", err)
|
||||
}
|
||||
s := NewVMStorage(srv.URL, authCfg, time.Minute, 0, false, srv.Client(), false)
|
||||
s := NewVMStorage(srv.URL, authCfg, time.Minute, 0, false, srv.Client())
|
||||
|
||||
p := NewPrometheusType()
|
||||
pq := s.BuildWithParams(QuerierParams{DataSourceType: &p, EvaluationInterval: 15 * time.Second})
|
||||
|
@ -252,18 +266,7 @@ func TestRequestParams(t *testing.T) {
|
|||
dataSourceType: NewPrometheusType(),
|
||||
},
|
||||
func(t *testing.T, r *http.Request) {
|
||||
checkEqualString(t, prometheusInstantPath, r.URL.Path)
|
||||
},
|
||||
},
|
||||
{
|
||||
"prometheus path with disablePathAppend",
|
||||
false,
|
||||
&VMStorage{
|
||||
dataSourceType: NewPrometheusType(),
|
||||
disablePathAppend: true,
|
||||
},
|
||||
func(t *testing.T, r *http.Request) {
|
||||
checkEqualString(t, "", r.URL.Path)
|
||||
checkEqualString(t, "/api/v1/query", r.URL.Path)
|
||||
},
|
||||
},
|
||||
{
|
||||
|
@ -274,19 +277,7 @@ func TestRequestParams(t *testing.T) {
|
|||
appendTypePrefix: true,
|
||||
},
|
||||
func(t *testing.T, r *http.Request) {
|
||||
checkEqualString(t, prometheusPrefix+prometheusInstantPath, r.URL.Path)
|
||||
},
|
||||
},
|
||||
{
|
||||
"prometheus prefix with disablePathAppend",
|
||||
false,
|
||||
&VMStorage{
|
||||
dataSourceType: NewPrometheusType(),
|
||||
appendTypePrefix: true,
|
||||
disablePathAppend: true,
|
||||
},
|
||||
func(t *testing.T, r *http.Request) {
|
||||
checkEqualString(t, prometheusPrefix, r.URL.Path)
|
||||
checkEqualString(t, "/prometheus/api/v1/query", r.URL.Path)
|
||||
},
|
||||
},
|
||||
{
|
||||
|
@ -296,18 +287,7 @@ func TestRequestParams(t *testing.T) {
|
|||
dataSourceType: NewPrometheusType(),
|
||||
},
|
||||
func(t *testing.T, r *http.Request) {
|
||||
checkEqualString(t, prometheusRangePath, r.URL.Path)
|
||||
},
|
||||
},
|
||||
{
|
||||
"prometheus range path with disablePathAppend",
|
||||
true,
|
||||
&VMStorage{
|
||||
dataSourceType: NewPrometheusType(),
|
||||
disablePathAppend: true,
|
||||
},
|
||||
func(t *testing.T, r *http.Request) {
|
||||
checkEqualString(t, "", r.URL.Path)
|
||||
checkEqualString(t, "/api/v1/query_range", r.URL.Path)
|
||||
},
|
||||
},
|
||||
{
|
||||
|
@ -318,19 +298,7 @@ func TestRequestParams(t *testing.T) {
|
|||
appendTypePrefix: true,
|
||||
},
|
||||
func(t *testing.T, r *http.Request) {
|
||||
checkEqualString(t, prometheusPrefix+prometheusRangePath, r.URL.Path)
|
||||
},
|
||||
},
|
||||
{
|
||||
"prometheus range prefix with disablePathAppend",
|
||||
true,
|
||||
&VMStorage{
|
||||
dataSourceType: NewPrometheusType(),
|
||||
appendTypePrefix: true,
|
||||
disablePathAppend: true,
|
||||
},
|
||||
func(t *testing.T, r *http.Request) {
|
||||
checkEqualString(t, prometheusPrefix, r.URL.Path)
|
||||
checkEqualString(t, "/prometheus/api/v1/query_range", r.URL.Path)
|
||||
},
|
||||
},
|
||||
{
|
||||
|
|
|
@ -237,8 +237,6 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
|
|||
notifiers: nts,
|
||||
previouslySentSeriesToRW: make(map[uint64]map[string][]prompbmarshal.Label)}
|
||||
|
||||
evalTS := time.Now()
|
||||
|
||||
// Spread group rules evaluation over time in order to reduce load on VictoriaMetrics.
|
||||
if !skipRandSleepOnGroupStart {
|
||||
randSleep := uint64(float64(g.Interval) * (float64(g.ID()) / (1 << 64)))
|
||||
|
@ -259,6 +257,8 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
|
|||
}
|
||||
}
|
||||
|
||||
evalTS := time.Now()
|
||||
|
||||
logger.Infof("group %q started; interval=%v; concurrency=%d", g.Name, g.Interval, g.Concurrency)
|
||||
|
||||
eval := func(ts time.Time) {
|
||||
|
@ -303,6 +303,10 @@ func (g *Group) start(ctx context.Context, nts func() []notifier.Notifier, rw *r
|
|||
g.mu.Unlock()
|
||||
continue
|
||||
}
|
||||
|
||||
// ensure that staleness is tracked or existing rules only
|
||||
e.purgeStaleSeries(g.Rules)
|
||||
|
||||
if g.Interval != ng.Interval {
|
||||
g.Interval = ng.Interval
|
||||
t.Stop()
|
||||
|
@ -457,6 +461,30 @@ func (e *executor) getStaleSeries(rule Rule, tss []prompbmarshal.TimeSeries, tim
|
|||
return staleS
|
||||
}
|
||||
|
||||
// purgeStaleSeries deletes references in tracked
|
||||
// previouslySentSeriesToRW list to Rules which aren't present
|
||||
// in the given activeRules list. The method is used when the list
|
||||
// of loaded rules has changed and executor has to remove
|
||||
// references to non-existing rules.
|
||||
func (e *executor) purgeStaleSeries(activeRules []Rule) {
|
||||
newPreviouslySentSeriesToRW := make(map[uint64]map[string][]prompbmarshal.Label)
|
||||
|
||||
e.previouslySentSeriesToRWMu.Lock()
|
||||
|
||||
for _, rule := range activeRules {
|
||||
id := rule.ID()
|
||||
prev, ok := e.previouslySentSeriesToRW[id]
|
||||
if ok {
|
||||
// keep previous series for staleness detection
|
||||
newPreviouslySentSeriesToRW[id] = prev
|
||||
}
|
||||
}
|
||||
e.previouslySentSeriesToRW = nil
|
||||
e.previouslySentSeriesToRW = newPreviouslySentSeriesToRW
|
||||
|
||||
e.previouslySentSeriesToRWMu.Unlock()
|
||||
}
|
||||
|
||||
func labelsToString(labels []prompbmarshal.Label) string {
|
||||
var b strings.Builder
|
||||
b.WriteRune('{')
|
||||
|
|
|
@ -157,7 +157,7 @@ func TestUpdateWith(t *testing.T) {
|
|||
|
||||
func TestGroupStart(t *testing.T) {
|
||||
// TODO: make parsing from string instead of file
|
||||
groups, err := config.Parse([]string{"config/testdata/rules1-good.rules"}, true, true)
|
||||
groups, err := config.Parse([]string{"config/testdata/rules/rules1-good.rules"}, true, true)
|
||||
if err != nil {
|
||||
t.Fatalf("failed to parse rules: %s", err)
|
||||
}
|
||||
|
@ -355,3 +355,61 @@ func TestGetStaleSeries(t *testing.T) {
|
|||
[][]prompbmarshal.Label{toPromLabels(t, "__name__", "job:foo", "job", "bar")},
|
||||
nil)
|
||||
}
|
||||
|
||||
func TestPurgeStaleSeries(t *testing.T) {
|
||||
ts := time.Now()
|
||||
labels := toPromLabels(t, "__name__", "job:foo", "job", "foo")
|
||||
tss := []prompbmarshal.TimeSeries{newTimeSeriesPB([]float64{1}, []int64{ts.Unix()}, labels)}
|
||||
|
||||
f := func(curRules, newRules, expStaleRules []Rule) {
|
||||
t.Helper()
|
||||
e := &executor{
|
||||
previouslySentSeriesToRW: make(map[uint64]map[string][]prompbmarshal.Label),
|
||||
}
|
||||
// seed executor with series for
|
||||
// current rules
|
||||
for _, rule := range curRules {
|
||||
e.getStaleSeries(rule, tss, ts)
|
||||
}
|
||||
|
||||
e.purgeStaleSeries(newRules)
|
||||
|
||||
if len(e.previouslySentSeriesToRW) != len(expStaleRules) {
|
||||
t.Fatalf("expected to get %d stale series, got %d",
|
||||
len(expStaleRules), len(e.previouslySentSeriesToRW))
|
||||
}
|
||||
|
||||
for _, exp := range expStaleRules {
|
||||
if _, ok := e.previouslySentSeriesToRW[exp.ID()]; !ok {
|
||||
t.Fatalf("expected to have rule %d; got nil instead", exp.ID())
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
f(nil, nil, nil)
|
||||
f(
|
||||
nil,
|
||||
[]Rule{&AlertingRule{RuleID: 1}},
|
||||
nil,
|
||||
)
|
||||
f(
|
||||
[]Rule{&AlertingRule{RuleID: 1}},
|
||||
nil,
|
||||
nil,
|
||||
)
|
||||
f(
|
||||
[]Rule{&AlertingRule{RuleID: 1}},
|
||||
[]Rule{&AlertingRule{RuleID: 2}},
|
||||
nil,
|
||||
)
|
||||
f(
|
||||
[]Rule{&AlertingRule{RuleID: 1}, &AlertingRule{RuleID: 2}},
|
||||
[]Rule{&AlertingRule{RuleID: 2}},
|
||||
[]Rule{&AlertingRule{RuleID: 2}},
|
||||
)
|
||||
f(
|
||||
[]Rule{&AlertingRule{RuleID: 1}, &AlertingRule{RuleID: 2}},
|
||||
[]Rule{&AlertingRule{RuleID: 1}, &AlertingRule{RuleID: 2}},
|
||||
[]Rule{&AlertingRule{RuleID: 1}, &AlertingRule{RuleID: 2}},
|
||||
)
|
||||
}
|
||||
|
|
|
@ -15,6 +15,7 @@ import (
|
|||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remoteread"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
|
||||
|
@ -34,6 +35,13 @@ Examples:
|
|||
absolute path to all .yaml files in root.
|
||||
Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars.`)
|
||||
|
||||
ruleTemplatesPath = flagutil.NewArray("rule.templates", `Path or glob pattern to location with go template definitions
|
||||
for rules annotations templating. Flag can be specified multiple times.
|
||||
Examples:
|
||||
-rule.templates="/path/to/file". Path to a single file with go templates
|
||||
-rule.templates="dir/*.tpl" -rule.templates="/*.tpl". Relative path to all .tpl files in "dir" folder,
|
||||
absolute path to all .tpl files in root.`)
|
||||
|
||||
rulesCheckInterval = flag.Duration("rule.configCheckInterval", 0, "Interval for checking for changes in '-rule' files. "+
|
||||
"By default the checking is disabled. Send SIGHUP signal in order to force config check for changes. DEPRECATED - see '-configCheckInterval' instead")
|
||||
|
||||
|
@ -73,10 +81,12 @@ func main() {
|
|||
envflag.Parse()
|
||||
buildinfo.Init()
|
||||
logger.Init()
|
||||
err := templates.Load(*ruleTemplatesPath, true)
|
||||
if err != nil {
|
||||
logger.Fatalf("failed to parse %q: %s", *ruleTemplatesPath, err)
|
||||
}
|
||||
|
||||
if *dryRun {
|
||||
u, _ := url.Parse("https://victoriametrics.com/")
|
||||
notifier.InitTemplateFunc(u)
|
||||
groups, err := config.Parse(*rulePath, true, true)
|
||||
if err != nil {
|
||||
logger.Fatalf("failed to parse %q: %s", *rulePath, err)
|
||||
|
@ -91,7 +101,7 @@ func main() {
|
|||
if err != nil {
|
||||
logger.Fatalf("failed to init `external.url`: %s", err)
|
||||
}
|
||||
notifier.InitTemplateFunc(eu)
|
||||
|
||||
alertURLGeneratorFn, err = getAlertURLGenerator(eu, *externalAlertSource, *validateTemplates)
|
||||
if err != nil {
|
||||
logger.Fatalf("failed to init `external.alert.source`: %s", err)
|
||||
|
@ -105,7 +115,6 @@ func main() {
|
|||
if rw == nil {
|
||||
logger.Fatalf("remoteWrite.url can't be empty in replay mode")
|
||||
}
|
||||
notifier.InitTemplateFunc(eu)
|
||||
groupsCfg, err := config.Parse(*rulePath, *validateTemplates, *validateExpressions)
|
||||
if err != nil {
|
||||
logger.Fatalf("cannot parse configuration file: %s", err)
|
||||
|
@ -127,7 +136,6 @@ func main() {
|
|||
if err != nil {
|
||||
logger.Fatalf("failed to init: %s", err)
|
||||
}
|
||||
|
||||
logger.Infof("reading rules configuration file from %q", strings.Join(*rulePath, ";"))
|
||||
groupsCfg, err := config.Parse(*rulePath, *validateTemplates, *validateExpressions)
|
||||
if err != nil {
|
||||
|
@ -170,7 +178,7 @@ func newManager(ctx context.Context) (*manager, error) {
|
|||
return nil, fmt.Errorf("failed to init datasource: %w", err)
|
||||
}
|
||||
|
||||
labels := make(map[string]string, 0)
|
||||
labels := make(map[string]string)
|
||||
for _, s := range *externalLabels {
|
||||
if len(s) == 0 {
|
||||
continue
|
||||
|
@ -281,7 +289,11 @@ func configReload(ctx context.Context, m *manager, groupsCfg []config.Group, sig
|
|||
case <-ctx.Done():
|
||||
return
|
||||
case <-sighupCh:
|
||||
logger.Infof("SIGHUP received. Going to reload rules %q ...", *rulePath)
|
||||
tmplMsg := ""
|
||||
if len(*ruleTemplatesPath) > 0 {
|
||||
tmplMsg = fmt.Sprintf("and templates %q ", *ruleTemplatesPath)
|
||||
}
|
||||
logger.Infof("SIGHUP received. Going to reload rules %q %s...", *rulePath, tmplMsg)
|
||||
configReloads.Inc()
|
||||
case <-configCheckCh:
|
||||
}
|
||||
|
@ -291,6 +303,13 @@ func configReload(ctx context.Context, m *manager, groupsCfg []config.Group, sig
|
|||
logger.Errorf("failed to reload notifier config: %s", err)
|
||||
continue
|
||||
}
|
||||
err := templates.Load(*ruleTemplatesPath, false)
|
||||
if err != nil {
|
||||
configReloadErrors.Inc()
|
||||
configSuccess.Set(0)
|
||||
logger.Errorf("failed to load new templates: %s", err)
|
||||
continue
|
||||
}
|
||||
newGroupsCfg, err := config.Parse(*rulePath, *validateTemplates, *validateExpressions)
|
||||
if err != nil {
|
||||
configReloadErrors.Inc()
|
||||
|
@ -299,6 +318,7 @@ func configReload(ctx context.Context, m *manager, groupsCfg []config.Group, sig
|
|||
continue
|
||||
}
|
||||
if configsEqual(newGroupsCfg, groupsCfg) {
|
||||
templates.Reload()
|
||||
// set success to 1 since previous reload
|
||||
// could have been unsuccessful
|
||||
configSuccess.Set(1)
|
||||
|
@ -311,6 +331,7 @@ func configReload(ctx context.Context, m *manager, groupsCfg []config.Group, sig
|
|||
logger.Errorf("error while reloading rules: %s", err)
|
||||
continue
|
||||
}
|
||||
templates.Reload()
|
||||
groupsCfg = newGroupsCfg
|
||||
configSuccess.Set(1)
|
||||
configTimestamp.Set(fasttime.UnixTimestamp())
|
||||
|
|
|
@ -3,7 +3,6 @@ package main
|
|||
import (
|
||||
"context"
|
||||
"math/rand"
|
||||
"net/url"
|
||||
"os"
|
||||
"strings"
|
||||
"sync"
|
||||
|
@ -14,11 +13,13 @@ import (
|
|||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
|
||||
)
|
||||
|
||||
func TestMain(m *testing.M) {
|
||||
u, _ := url.Parse("https://victoriametrics.com/path")
|
||||
notifier.InitTemplateFunc(u)
|
||||
if err := templates.Load([]string{"testdata/templates/*good.tmpl"}, true); err != nil {
|
||||
os.Exit(1)
|
||||
}
|
||||
os.Exit(m.Run())
|
||||
}
|
||||
|
||||
|
@ -47,9 +48,9 @@ func TestManagerUpdateConcurrent(t *testing.T) {
|
|||
"config/testdata/dir/rules0-bad.rules",
|
||||
"config/testdata/dir/rules1-good.rules",
|
||||
"config/testdata/dir/rules1-bad.rules",
|
||||
"config/testdata/rules0-good.rules",
|
||||
"config/testdata/rules1-good.rules",
|
||||
"config/testdata/rules2-good.rules",
|
||||
"config/testdata/rules/rules0-good.rules",
|
||||
"config/testdata/rules/rules1-good.rules",
|
||||
"config/testdata/rules/rules2-good.rules",
|
||||
}
|
||||
evalInterval := *evaluationInterval
|
||||
defer func() { *evaluationInterval = evalInterval }()
|
||||
|
@ -125,7 +126,7 @@ func TestManagerUpdate(t *testing.T) {
|
|||
}{
|
||||
{
|
||||
name: "update good rules",
|
||||
initPath: "config/testdata/rules0-good.rules",
|
||||
initPath: "config/testdata/rules/rules0-good.rules",
|
||||
updatePath: "config/testdata/dir/rules1-good.rules",
|
||||
want: []*Group{
|
||||
{
|
||||
|
@ -150,18 +151,18 @@ func TestManagerUpdate(t *testing.T) {
|
|||
},
|
||||
{
|
||||
name: "update good rules from 1 to 2 groups",
|
||||
initPath: "config/testdata/dir/rules1-good.rules",
|
||||
updatePath: "config/testdata/rules0-good.rules",
|
||||
initPath: "config/testdata/dir/rules/rules1-good.rules",
|
||||
updatePath: "config/testdata/rules/rules0-good.rules",
|
||||
want: []*Group{
|
||||
{
|
||||
File: "config/testdata/rules0-good.rules",
|
||||
File: "config/testdata/rules/rules0-good.rules",
|
||||
Name: "groupGorSingleAlert",
|
||||
Type: datasource.NewPrometheusType(),
|
||||
Rules: []Rule{VMRows},
|
||||
Interval: defaultEvalInterval,
|
||||
},
|
||||
{
|
||||
File: "config/testdata/rules0-good.rules",
|
||||
File: "config/testdata/rules/rules0-good.rules",
|
||||
Interval: defaultEvalInterval,
|
||||
Type: datasource.NewPrometheusType(),
|
||||
Name: "TestGroup", Rules: []Rule{
|
||||
|
@ -172,18 +173,18 @@ func TestManagerUpdate(t *testing.T) {
|
|||
},
|
||||
{
|
||||
name: "update with one bad rule file",
|
||||
initPath: "config/testdata/rules0-good.rules",
|
||||
initPath: "config/testdata/rules/rules0-good.rules",
|
||||
updatePath: "config/testdata/dir/rules2-bad.rules",
|
||||
want: []*Group{
|
||||
{
|
||||
File: "config/testdata/rules0-good.rules",
|
||||
File: "config/testdata/rules/rules0-good.rules",
|
||||
Name: "groupGorSingleAlert",
|
||||
Type: datasource.NewPrometheusType(),
|
||||
Interval: defaultEvalInterval,
|
||||
Rules: []Rule{VMRows},
|
||||
},
|
||||
{
|
||||
File: "config/testdata/rules0-good.rules",
|
||||
File: "config/testdata/rules/rules0-good.rules",
|
||||
Interval: defaultEvalInterval,
|
||||
Name: "TestGroup",
|
||||
Type: datasource.NewPrometheusType(),
|
||||
|
@ -196,17 +197,17 @@ func TestManagerUpdate(t *testing.T) {
|
|||
{
|
||||
name: "update empty dir rules from 0 to 2 groups",
|
||||
initPath: "config/testdata/empty/*",
|
||||
updatePath: "config/testdata/rules0-good.rules",
|
||||
updatePath: "config/testdata/rules/rules0-good.rules",
|
||||
want: []*Group{
|
||||
{
|
||||
File: "config/testdata/rules0-good.rules",
|
||||
File: "config/testdata/rules/rules0-good.rules",
|
||||
Name: "groupGorSingleAlert",
|
||||
Type: datasource.NewPrometheusType(),
|
||||
Interval: defaultEvalInterval,
|
||||
Rules: []Rule{VMRows},
|
||||
},
|
||||
{
|
||||
File: "config/testdata/rules0-good.rules",
|
||||
File: "config/testdata/rules/rules0-good.rules",
|
||||
Interval: defaultEvalInterval,
|
||||
Type: datasource.NewPrometheusType(),
|
||||
Name: "TestGroup", Rules: []Rule{
|
||||
|
|
|
@ -5,9 +5,10 @@ import (
|
|||
"fmt"
|
||||
"io"
|
||||
"strings"
|
||||
"text/template"
|
||||
textTpl "text/template"
|
||||
"time"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
|
||||
|
@ -90,26 +91,38 @@ var tplHeaders = []string{
|
|||
// map of annotations.
|
||||
// Every alert could have a different datasource, so function
|
||||
// requires a queryFunction as an argument.
|
||||
func (a *Alert) ExecTemplate(q QueryFn, labels, annotations map[string]string) (map[string]string, error) {
|
||||
func (a *Alert) ExecTemplate(q templates.QueryFn, labels, annotations map[string]string) (map[string]string, error) {
|
||||
tplData := AlertTplData{Value: a.Value, Labels: labels, Expr: a.Expr}
|
||||
return templateAnnotations(annotations, tplData, funcsWithQuery(q), true)
|
||||
tmpl, err := templates.GetWithFuncs(templates.FuncsWithQuery(q))
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("error getting a template: %w", err)
|
||||
}
|
||||
return templateAnnotations(annotations, tplData, tmpl, true)
|
||||
}
|
||||
|
||||
// ExecTemplate executes the given template for given annotations map.
|
||||
func ExecTemplate(q QueryFn, annotations map[string]string, tpl AlertTplData) (map[string]string, error) {
|
||||
return templateAnnotations(annotations, tpl, funcsWithQuery(q), true)
|
||||
func ExecTemplate(q templates.QueryFn, annotations map[string]string, tplData AlertTplData) (map[string]string, error) {
|
||||
tmpl, err := templates.GetWithFuncs(templates.FuncsWithQuery(q))
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("error cloning template: %w", err)
|
||||
}
|
||||
return templateAnnotations(annotations, tplData, tmpl, true)
|
||||
}
|
||||
|
||||
// ValidateTemplates validate annotations for possible template error, uses empty data for template population
|
||||
func ValidateTemplates(annotations map[string]string) error {
|
||||
_, err := templateAnnotations(annotations, AlertTplData{
|
||||
tmpl, err := templates.Get()
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
_, err = templateAnnotations(annotations, AlertTplData{
|
||||
Labels: map[string]string{},
|
||||
Value: 0,
|
||||
}, tmplFunc, false)
|
||||
}, tmpl, false)
|
||||
return err
|
||||
}
|
||||
|
||||
func templateAnnotations(annotations map[string]string, data AlertTplData, funcs template.FuncMap, execute bool) (map[string]string, error) {
|
||||
func templateAnnotations(annotations map[string]string, data AlertTplData, tmpl *textTpl.Template, execute bool) (map[string]string, error) {
|
||||
var builder strings.Builder
|
||||
var buf bytes.Buffer
|
||||
eg := new(utils.ErrGroup)
|
||||
|
@ -122,7 +135,7 @@ func templateAnnotations(annotations map[string]string, data AlertTplData, funcs
|
|||
builder.Grow(len(header) + len(text))
|
||||
builder.WriteString(header)
|
||||
builder.WriteString(text)
|
||||
if err := templateAnnotation(&buf, builder.String(), tData, funcs, execute); err != nil {
|
||||
if err := templateAnnotation(&buf, builder.String(), tData, tmpl, execute); err != nil {
|
||||
r[key] = text
|
||||
eg.Add(fmt.Errorf("key %q, template %q: %w", key, text, err))
|
||||
continue
|
||||
|
@ -138,11 +151,17 @@ type tplData struct {
|
|||
ExternalURL string
|
||||
}
|
||||
|
||||
func templateAnnotation(dst io.Writer, text string, data tplData, funcs template.FuncMap, execute bool) error {
|
||||
t := template.New("").Funcs(funcs).Option("missingkey=zero")
|
||||
tpl, err := t.Parse(text)
|
||||
func templateAnnotation(dst io.Writer, text string, data tplData, tmpl *textTpl.Template, execute bool) error {
|
||||
tpl, err := tmpl.Clone()
|
||||
if err != nil {
|
||||
return fmt.Errorf("error parsing annotation: %w", err)
|
||||
return fmt.Errorf("error cloning template before parse annotation: %w", err)
|
||||
}
|
||||
tpl, err = tpl.Parse(text)
|
||||
if err != nil {
|
||||
return fmt.Errorf("error parsing annotation template: %w", err)
|
||||
}
|
||||
if !execute {
|
||||
return nil
|
||||
}
|
||||
if !execute {
|
||||
return nil
|
||||
|
|
|
@ -11,7 +11,7 @@ import (
|
|||
)
|
||||
|
||||
func TestAlert_ExecTemplate(t *testing.T) {
|
||||
extLabels := make(map[string]string, 0)
|
||||
extLabels := make(map[string]string)
|
||||
const (
|
||||
extCluster = "prod"
|
||||
extDC = "east"
|
||||
|
|
|
@ -3,9 +3,11 @@ package notifier
|
|||
import (
|
||||
"flag"
|
||||
"fmt"
|
||||
"net/url"
|
||||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
|
||||
|
@ -83,6 +85,12 @@ func Init(gen AlertURLGenerator, extLabels map[string]string, extURL string) (fu
|
|||
|
||||
externalURL = extURL
|
||||
externalLabels = extLabels
|
||||
eu, err := url.Parse(externalURL)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("failed to parse external URL: %s", err)
|
||||
}
|
||||
|
||||
templates.UpdateWithFuncs(templates.FuncsWithExternalURL(eu))
|
||||
|
||||
if *configPath == "" && len(*addrs) == 0 {
|
||||
return nil, nil
|
||||
|
@ -102,7 +110,6 @@ func Init(gen AlertURLGenerator, extLabels map[string]string, extURL string) (fu
|
|||
return staticNotifiersFn, nil
|
||||
}
|
||||
|
||||
var err error
|
||||
cw, err = newWatcher(*configPath, gen)
|
||||
if err != nil {
|
||||
return nil, fmt.Errorf("failed to init config watcher: %s", err)
|
||||
|
|
|
@ -1,13 +1,14 @@
|
|||
package notifier
|
||||
|
||||
import (
|
||||
"net/url"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/templates"
|
||||
"os"
|
||||
"testing"
|
||||
)
|
||||
|
||||
func TestMain(m *testing.M) {
|
||||
u, _ := url.Parse("https://victoriametrics.com/path")
|
||||
InitTemplateFunc(u)
|
||||
if err := templates.Load([]string{"testdata/templates/*good.tmpl"}, true); err != nil {
|
||||
os.Exit(1)
|
||||
}
|
||||
os.Exit(m.Run())
|
||||
}
|
||||
|
|
|
@ -34,8 +34,6 @@ var (
|
|||
oauth2ClientSecretFile = flag.String("remoteRead.oauth2.clientSecretFile", "", "Optional OAuth2 clientSecretFile to use for -remoteRead.url.")
|
||||
oauth2TokenURL = flag.String("remoteRead.oauth2.tokenUrl", "", "Optional OAuth2 tokenURL to use for -remoteRead.url. ")
|
||||
oauth2Scopes = flag.String("remoteRead.oauth2.scopes", "", "Optional OAuth2 scopes to use for -remoteRead.url. Scopes must be delimited by ';'.")
|
||||
|
||||
disablePathAppend = flag.Bool("remoteRead.disablePathAppend", false, "Whether to disable automatic appending of '/api/v1/query' path to the configured -remoteRead.url.")
|
||||
)
|
||||
|
||||
// Init creates a Querier from provided flag values.
|
||||
|
@ -57,5 +55,5 @@ func Init() (datasource.QuerierBuilder, error) {
|
|||
return nil, fmt.Errorf("failed to configure auth: %w", err)
|
||||
}
|
||||
c := &http.Client{Transport: tr}
|
||||
return datasource.NewVMStorage(*addr, authCfg, 0, 0, false, c, *disablePathAppend), nil
|
||||
return datasource.NewVMStorage(*addr, authCfg, 0, 0, false, c), nil
|
||||
}
|
||||
|
|
|
@ -39,8 +39,6 @@ var (
|
|||
oauth2ClientSecretFile = flag.String("remoteWrite.oauth2.clientSecretFile", "", "Optional OAuth2 clientSecretFile to use for -remoteWrite.url.")
|
||||
oauth2TokenURL = flag.String("remoteWrite.oauth2.tokenUrl", "", "Optional OAuth2 tokenURL to use for -notifier.url.")
|
||||
oauth2Scopes = flag.String("remoteWrite.oauth2.scopes", "", "Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'.")
|
||||
|
||||
disablePathAppend = flag.Bool("remoteWrite.disablePathAppend", false, "Whether to disable automatic appending of '/api/v1/write' path to the configured -remoteWrite.url.")
|
||||
)
|
||||
|
||||
// Init creates Client object from given flags.
|
||||
|
|
|
@ -3,6 +3,7 @@ package remotewrite
|
|||
import (
|
||||
"bytes"
|
||||
"context"
|
||||
"flag"
|
||||
"fmt"
|
||||
"io/ioutil"
|
||||
"net/http"
|
||||
|
@ -19,17 +20,20 @@ import (
|
|||
"github.com/VictoriaMetrics/metrics"
|
||||
)
|
||||
|
||||
var (
|
||||
disablePathAppend = flag.Bool("remoteWrite.disablePathAppend", false, "Whether to disable automatic appending of '/api/v1/write' path to the configured -remoteWrite.url.")
|
||||
)
|
||||
|
||||
// Client is an asynchronous HTTP client for writing
|
||||
// timeseries via remote write protocol.
|
||||
type Client struct {
|
||||
addr string
|
||||
c *http.Client
|
||||
authCfg *promauth.Config
|
||||
input chan prompbmarshal.TimeSeries
|
||||
flushInterval time.Duration
|
||||
maxBatchSize int
|
||||
maxQueueSize int
|
||||
disablePathAppend bool
|
||||
addr string
|
||||
c *http.Client
|
||||
authCfg *promauth.Config
|
||||
input chan prompbmarshal.TimeSeries
|
||||
flushInterval time.Duration
|
||||
maxBatchSize int
|
||||
maxQueueSize int
|
||||
|
||||
wg sync.WaitGroup
|
||||
doneCh chan struct{}
|
||||
|
@ -70,8 +74,6 @@ const (
|
|||
defaultWriteTimeout = 30 * time.Second
|
||||
)
|
||||
|
||||
const writePath = "/api/v1/write"
|
||||
|
||||
// NewClient returns asynchronous client for
|
||||
// writing timeseries via remotewrite protocol.
|
||||
func NewClient(ctx context.Context, cfg Config) (*Client, error) {
|
||||
|
@ -102,14 +104,13 @@ func NewClient(ctx context.Context, cfg Config) (*Client, error) {
|
|||
Timeout: cfg.WriteTimeout,
|
||||
Transport: cfg.Transport,
|
||||
},
|
||||
addr: strings.TrimSuffix(cfg.Addr, "/"),
|
||||
authCfg: cfg.AuthCfg,
|
||||
flushInterval: cfg.FlushInterval,
|
||||
maxBatchSize: cfg.MaxBatchSize,
|
||||
maxQueueSize: cfg.MaxQueueSize,
|
||||
doneCh: make(chan struct{}),
|
||||
input: make(chan prompbmarshal.TimeSeries, cfg.MaxQueueSize),
|
||||
disablePathAppend: cfg.DisablePathAppend,
|
||||
addr: strings.TrimSuffix(cfg.Addr, "/"),
|
||||
authCfg: cfg.AuthCfg,
|
||||
flushInterval: cfg.FlushInterval,
|
||||
maxBatchSize: cfg.MaxBatchSize,
|
||||
maxQueueSize: cfg.MaxQueueSize,
|
||||
doneCh: make(chan struct{}),
|
||||
input: make(chan prompbmarshal.TimeSeries, cfg.MaxQueueSize),
|
||||
}
|
||||
|
||||
for i := 0; i < cc; i++ {
|
||||
|
@ -240,8 +241,8 @@ func (c *Client) send(ctx context.Context, data []byte) error {
|
|||
req.Header.Set("Authorization", auth)
|
||||
}
|
||||
}
|
||||
if !c.disablePathAppend {
|
||||
req.URL.Path = path.Join(req.URL.Path, writePath)
|
||||
if !*disablePathAppend {
|
||||
req.URL.Path = path.Join(req.URL.Path, "/api/v1/write")
|
||||
}
|
||||
resp, err := c.c.Do(req.WithContext(ctx))
|
||||
if err != nil {
|
||||
|
|
|
@ -7,13 +7,12 @@ import (
|
|||
"strings"
|
||||
"time"
|
||||
|
||||
"github.com/cheggaaa/pb/v3"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
|
||||
"github.com/dmitryk-dk/pb/v3"
|
||||
)
|
||||
|
||||
var (
|
||||
|
|
|
@ -11,26 +11,118 @@
|
|||
// See the License for the specific language governing permissions and
|
||||
// limitations under the License.
|
||||
|
||||
package notifier
|
||||
package templates
|
||||
|
||||
import (
|
||||
"errors"
|
||||
"fmt"
|
||||
htmlTpl "html/template"
|
||||
"io/ioutil"
|
||||
"math"
|
||||
"net"
|
||||
"net/url"
|
||||
"path/filepath"
|
||||
"regexp"
|
||||
"sort"
|
||||
"strconv"
|
||||
"strings"
|
||||
"sync"
|
||||
"time"
|
||||
|
||||
htmlTpl "html/template"
|
||||
textTpl "text/template"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
|
||||
)
|
||||
|
||||
// go template execution fails when it's tree is empty
|
||||
const defaultTemplate = `{{- define "default.template" -}}{{- end -}}`
|
||||
|
||||
var tplMu sync.RWMutex
|
||||
|
||||
type textTemplate struct {
|
||||
current *textTpl.Template
|
||||
replacement *textTpl.Template
|
||||
}
|
||||
|
||||
var masterTmpl textTemplate
|
||||
|
||||
func newTemplate() *textTpl.Template {
|
||||
tmpl := textTpl.New("").Option("missingkey=zero").Funcs(templateFuncs())
|
||||
return textTpl.Must(tmpl.Parse(defaultTemplate))
|
||||
}
|
||||
|
||||
// Load func loads templates from multiple globs specified in pathPatterns and either
|
||||
// sets them directly to current template if it's undefined or with overwrite=true
|
||||
// or sets replacement templates and adds templates with new names to a current
|
||||
func Load(pathPatterns []string, overwrite bool) error {
|
||||
var err error
|
||||
tmpl := newTemplate()
|
||||
for _, tp := range pathPatterns {
|
||||
p, err := filepath.Glob(tp)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to retrieve a template glob %q: %w", tp, err)
|
||||
}
|
||||
if len(p) > 0 {
|
||||
tmpl, err = tmpl.ParseGlob(tp)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to parse template glob %q: %w", tp, err)
|
||||
}
|
||||
}
|
||||
}
|
||||
if len(tmpl.Templates()) > 0 {
|
||||
err := tmpl.Execute(ioutil.Discard, nil)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to execute template: %w", err)
|
||||
}
|
||||
}
|
||||
tplMu.Lock()
|
||||
defer tplMu.Unlock()
|
||||
if masterTmpl.current == nil || overwrite {
|
||||
masterTmpl.replacement = nil
|
||||
masterTmpl.current = newTemplate()
|
||||
} else {
|
||||
masterTmpl.replacement = newTemplate()
|
||||
if err = copyTemplates(tmpl, masterTmpl.replacement, overwrite); err != nil {
|
||||
return err
|
||||
}
|
||||
}
|
||||
return copyTemplates(tmpl, masterTmpl.current, overwrite)
|
||||
}
|
||||
|
||||
func copyTemplates(from *textTpl.Template, to *textTpl.Template, overwrite bool) error {
|
||||
if from == nil {
|
||||
return nil
|
||||
}
|
||||
if to == nil {
|
||||
to = newTemplate()
|
||||
}
|
||||
tmpl, err := from.Clone()
|
||||
if err != nil {
|
||||
return err
|
||||
}
|
||||
for _, t := range tmpl.Templates() {
|
||||
if to.Lookup(t.Name()) == nil || overwrite {
|
||||
to, err = to.AddParseTree(t.Name(), t.Tree)
|
||||
if err != nil {
|
||||
return fmt.Errorf("failed to add template %q: %w", t.Name(), err)
|
||||
}
|
||||
}
|
||||
}
|
||||
return nil
|
||||
}
|
||||
|
||||
// Reload func replaces current template with a replacement template
|
||||
// which was set by Load with override=false
|
||||
func Reload() {
|
||||
tplMu.Lock()
|
||||
defer tplMu.Unlock()
|
||||
if masterTmpl.replacement != nil {
|
||||
masterTmpl.current = masterTmpl.replacement
|
||||
masterTmpl.replacement = nil
|
||||
}
|
||||
}
|
||||
|
||||
// metric is private copy of datasource.Metric,
|
||||
// it is used for templating annotations,
|
||||
// Labels as map simplifies templates evaluation.
|
||||
|
@ -60,12 +152,62 @@ func datasourceMetricsToTemplateMetrics(ms []datasource.Metric) []metric {
|
|||
// for templating functions.
|
||||
type QueryFn func(query string) ([]datasource.Metric, error)
|
||||
|
||||
var tmplFunc textTpl.FuncMap
|
||||
// UpdateWithFuncs updates existing or sets a new function map for a template
|
||||
func UpdateWithFuncs(funcs textTpl.FuncMap) {
|
||||
tplMu.Lock()
|
||||
defer tplMu.Unlock()
|
||||
masterTmpl.current = masterTmpl.current.Funcs(funcs)
|
||||
}
|
||||
|
||||
// InitTemplateFunc initiates template helper functions
|
||||
func InitTemplateFunc(externalURL *url.URL) {
|
||||
// GetWithFuncs returns a copy of current template with additional FuncMap
|
||||
// provided with funcs argument
|
||||
func GetWithFuncs(funcs textTpl.FuncMap) (*textTpl.Template, error) {
|
||||
tplMu.RLock()
|
||||
defer tplMu.RUnlock()
|
||||
tmpl, err := masterTmpl.current.Clone()
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
return tmpl.Funcs(funcs), nil
|
||||
}
|
||||
|
||||
// Get returns a copy of a template
|
||||
func Get() (*textTpl.Template, error) {
|
||||
tplMu.RLock()
|
||||
defer tplMu.RUnlock()
|
||||
return masterTmpl.current.Clone()
|
||||
}
|
||||
|
||||
// FuncsWithQuery returns a function map that depends on metric data
|
||||
func FuncsWithQuery(query QueryFn) textTpl.FuncMap {
|
||||
return textTpl.FuncMap{
|
||||
"query": func(q string) ([]metric, error) {
|
||||
result, err := query(q)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
return datasourceMetricsToTemplateMetrics(result), nil
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
// FuncsWithExternalURL returns a function map that depends on externalURL value
|
||||
func FuncsWithExternalURL(externalURL *url.URL) textTpl.FuncMap {
|
||||
return textTpl.FuncMap{
|
||||
"externalURL": func() string {
|
||||
return externalURL.String()
|
||||
},
|
||||
|
||||
"pathPrefix": func() string {
|
||||
return externalURL.Path
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
// templateFuncs initiates template helper functions
|
||||
func templateFuncs() textTpl.FuncMap {
|
||||
// See https://prometheus.io/docs/prometheus/latest/configuration/template_reference/
|
||||
tmplFunc = textTpl.FuncMap{
|
||||
return textTpl.FuncMap{
|
||||
/* Strings */
|
||||
|
||||
// reReplaceAll ReplaceAllString returns a copy of src, replacing matches of the Regexp with
|
||||
|
@ -117,9 +259,13 @@ func InitTemplateFunc(externalURL *url.URL) {
|
|||
|
||||
// humanize converts given number to a human readable format
|
||||
// by adding metric prefixes https://en.wikipedia.org/wiki/Metric_prefix
|
||||
"humanize": func(v float64) string {
|
||||
"humanize": func(i interface{}) (string, error) {
|
||||
v, err := toFloat64(i)
|
||||
if err != nil {
|
||||
return "", err
|
||||
}
|
||||
if v == 0 || math.IsNaN(v) || math.IsInf(v, 0) {
|
||||
return fmt.Sprintf("%.4g", v)
|
||||
return fmt.Sprintf("%.4g", v), nil
|
||||
}
|
||||
if math.Abs(v) >= 1 {
|
||||
prefix := ""
|
||||
|
@ -130,7 +276,7 @@ func InitTemplateFunc(externalURL *url.URL) {
|
|||
prefix = p
|
||||
v /= 1000
|
||||
}
|
||||
return fmt.Sprintf("%.4g%s", v, prefix)
|
||||
return fmt.Sprintf("%.4g%s", v, prefix), nil
|
||||
}
|
||||
prefix := ""
|
||||
for _, p := range []string{"m", "u", "n", "p", "f", "a", "z", "y"} {
|
||||
|
@ -140,13 +286,17 @@ func InitTemplateFunc(externalURL *url.URL) {
|
|||
prefix = p
|
||||
v *= 1000
|
||||
}
|
||||
return fmt.Sprintf("%.4g%s", v, prefix)
|
||||
return fmt.Sprintf("%.4g%s", v, prefix), nil
|
||||
},
|
||||
|
||||
// humanize1024 converts given number to a human readable format with 1024 as base
|
||||
"humanize1024": func(v float64) string {
|
||||
"humanize1024": func(i interface{}) (string, error) {
|
||||
v, err := toFloat64(i)
|
||||
if err != nil {
|
||||
return "", err
|
||||
}
|
||||
if math.Abs(v) <= 1 || math.IsNaN(v) || math.IsInf(v, 0) {
|
||||
return fmt.Sprintf("%.4g", v)
|
||||
return fmt.Sprintf("%.4g", v), nil
|
||||
}
|
||||
prefix := ""
|
||||
for _, p := range []string{"ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi"} {
|
||||
|
@ -156,16 +306,20 @@ func InitTemplateFunc(externalURL *url.URL) {
|
|||
prefix = p
|
||||
v /= 1024
|
||||
}
|
||||
return fmt.Sprintf("%.4g%s", v, prefix)
|
||||
return fmt.Sprintf("%.4g%s", v, prefix), nil
|
||||
},
|
||||
|
||||
// humanizeDuration converts given seconds to a human readable duration
|
||||
"humanizeDuration": func(v float64) string {
|
||||
"humanizeDuration": func(i interface{}) (string, error) {
|
||||
v, err := toFloat64(i)
|
||||
if err != nil {
|
||||
return "", err
|
||||
}
|
||||
if math.IsNaN(v) || math.IsInf(v, 0) {
|
||||
return fmt.Sprintf("%.4g", v)
|
||||
return fmt.Sprintf("%.4g", v), nil
|
||||
}
|
||||
if v == 0 {
|
||||
return fmt.Sprintf("%.4gs", v)
|
||||
return fmt.Sprintf("%.4gs", v), nil
|
||||
}
|
||||
if math.Abs(v) >= 1 {
|
||||
sign := ""
|
||||
|
@ -179,16 +333,16 @@ func InitTemplateFunc(externalURL *url.URL) {
|
|||
days := int64(v) / 60 / 60 / 24
|
||||
// For days to minutes, we display seconds as an integer.
|
||||
if days != 0 {
|
||||
return fmt.Sprintf("%s%dd %dh %dm %ds", sign, days, hours, minutes, seconds)
|
||||
return fmt.Sprintf("%s%dd %dh %dm %ds", sign, days, hours, minutes, seconds), nil
|
||||
}
|
||||
if hours != 0 {
|
||||
return fmt.Sprintf("%s%dh %dm %ds", sign, hours, minutes, seconds)
|
||||
return fmt.Sprintf("%s%dh %dm %ds", sign, hours, minutes, seconds), nil
|
||||
}
|
||||
if minutes != 0 {
|
||||
return fmt.Sprintf("%s%dm %ds", sign, minutes, seconds)
|
||||
return fmt.Sprintf("%s%dm %ds", sign, minutes, seconds), nil
|
||||
}
|
||||
// For seconds, we display 4 significant digits.
|
||||
return fmt.Sprintf("%s%.4gs", sign, v)
|
||||
return fmt.Sprintf("%s%.4gs", sign, v), nil
|
||||
}
|
||||
prefix := ""
|
||||
for _, p := range []string{"m", "u", "n", "p", "f", "a", "z", "y"} {
|
||||
|
@ -198,33 +352,51 @@ func InitTemplateFunc(externalURL *url.URL) {
|
|||
prefix = p
|
||||
v *= 1000
|
||||
}
|
||||
return fmt.Sprintf("%.4g%ss", v, prefix)
|
||||
return fmt.Sprintf("%.4g%ss", v, prefix), nil
|
||||
},
|
||||
|
||||
// humanizePercentage converts given ratio value to a fraction of 100
|
||||
"humanizePercentage": func(v float64) string {
|
||||
return fmt.Sprintf("%.4g%%", v*100)
|
||||
"humanizePercentage": func(i interface{}) (string, error) {
|
||||
v, err := toFloat64(i)
|
||||
if err != nil {
|
||||
return "", err
|
||||
}
|
||||
return fmt.Sprintf("%.4g%%", v*100), nil
|
||||
},
|
||||
|
||||
// humanizeTimestamp converts given timestamp to a human readable time equivalent
|
||||
"humanizeTimestamp": func(v float64) string {
|
||||
"humanizeTimestamp": func(i interface{}) (string, error) {
|
||||
v, err := toFloat64(i)
|
||||
if err != nil {
|
||||
return "", err
|
||||
}
|
||||
if math.IsNaN(v) || math.IsInf(v, 0) {
|
||||
return fmt.Sprintf("%.4g", v)
|
||||
return fmt.Sprintf("%.4g", v), nil
|
||||
}
|
||||
t := TimeFromUnixNano(int64(v * 1e9)).Time().UTC()
|
||||
return fmt.Sprint(t)
|
||||
return fmt.Sprint(t), nil
|
||||
},
|
||||
|
||||
/* URLs */
|
||||
|
||||
// externalURL returns value of `external.url` flag
|
||||
"externalURL": func() string {
|
||||
return externalURL.String()
|
||||
// externalURL function supposed to be substituted at FuncsWithExteralURL().
|
||||
// it is present here only for validation purposes, when there is no
|
||||
// provided datasource.
|
||||
//
|
||||
// return non-empty slice to pass validation with chained functions in template
|
||||
return ""
|
||||
},
|
||||
|
||||
// pathPrefix returns a Path segment from the URL value in `external.url` flag
|
||||
"pathPrefix": func() string {
|
||||
return externalURL.Path
|
||||
// pathPrefix function supposed to be substituted at FuncsWithExteralURL().
|
||||
// it is present here only for validation purposes, when there is no
|
||||
// provided datasource.
|
||||
//
|
||||
// return non-empty slice to pass validation with chained functions in template
|
||||
return ""
|
||||
},
|
||||
|
||||
// pathEscape escapes the string so it can be safely placed inside a URL path segment,
|
||||
|
@ -259,7 +431,7 @@ func InitTemplateFunc(externalURL *url.URL) {
|
|||
// execute "/api/v1/query?query=foo" request and will return
|
||||
// the first value in response.
|
||||
"query": func(q string) ([]metric, error) {
|
||||
// query function supposed to be substituted at funcsWithQuery().
|
||||
// query function supposed to be substituted at FuncsWithQuery().
|
||||
// it is present here only for validation purposes, when there is no
|
||||
// provided datasource.
|
||||
//
|
||||
|
@ -316,21 +488,6 @@ func InitTemplateFunc(externalURL *url.URL) {
|
|||
}
|
||||
}
|
||||
|
||||
func funcsWithQuery(query QueryFn) textTpl.FuncMap {
|
||||
fm := make(textTpl.FuncMap)
|
||||
for k, fn := range tmplFunc {
|
||||
fm[k] = fn
|
||||
}
|
||||
fm["query"] = func(q string) ([]metric, error) {
|
||||
result, err := query(q)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
return datasourceMetricsToTemplateMetrics(result), nil
|
||||
}
|
||||
return fm
|
||||
}
|
||||
|
||||
// Time is the number of milliseconds since the epoch
|
||||
// (1970-01-01 00:00 UTC) excluding leap seconds.
|
||||
type Time int64
|
||||
|
@ -355,3 +512,28 @@ const second = int64(time.Second / minimumTick)
|
|||
func (t Time) Time() time.Time {
|
||||
return time.Unix(int64(t)/second, (int64(t)%second)*nanosPerTick)
|
||||
}
|
||||
|
||||
func toFloat64(v interface{}) (float64, error) {
|
||||
switch i := v.(type) {
|
||||
case float64:
|
||||
return i, nil
|
||||
case float32:
|
||||
return float64(i), nil
|
||||
case int64:
|
||||
return float64(i), nil
|
||||
case int32:
|
||||
return float64(i), nil
|
||||
case int:
|
||||
return float64(i), nil
|
||||
case uint64:
|
||||
return float64(i), nil
|
||||
case uint32:
|
||||
return float64(i), nil
|
||||
case uint:
|
||||
return float64(i), nil
|
||||
case string:
|
||||
return strconv.ParseFloat(i, 64)
|
||||
default:
|
||||
return 0, fmt.Errorf("unexpected value type %v", i)
|
||||
}
|
||||
}
|
275
app/vmalert/templates/template_test.go
Normal file
275
app/vmalert/templates/template_test.go
Normal file
|
@ -0,0 +1,275 @@
|
|||
package templates
|
||||
|
||||
import (
|
||||
"strings"
|
||||
"testing"
|
||||
textTpl "text/template"
|
||||
)
|
||||
|
||||
func mkTemplate(current, replacement interface{}) textTemplate {
|
||||
tmpl := textTemplate{}
|
||||
if current != nil {
|
||||
switch val := current.(type) {
|
||||
case string:
|
||||
tmpl.current = textTpl.Must(newTemplate().Parse(val))
|
||||
}
|
||||
}
|
||||
if replacement != nil {
|
||||
switch val := replacement.(type) {
|
||||
case string:
|
||||
tmpl.replacement = textTpl.Must(newTemplate().Parse(val))
|
||||
}
|
||||
}
|
||||
return tmpl
|
||||
}
|
||||
|
||||
func equalTemplates(tmpls ...*textTpl.Template) bool {
|
||||
var cmp *textTpl.Template
|
||||
for i, tmpl := range tmpls {
|
||||
if i == 0 {
|
||||
cmp = tmpl
|
||||
} else {
|
||||
if cmp == nil || tmpl == nil {
|
||||
if cmp != tmpl {
|
||||
return false
|
||||
}
|
||||
continue
|
||||
}
|
||||
if len(tmpl.Templates()) != len(cmp.Templates()) {
|
||||
return false
|
||||
}
|
||||
for _, t := range tmpl.Templates() {
|
||||
tp := cmp.Lookup(t.Name())
|
||||
if tp == nil {
|
||||
return false
|
||||
}
|
||||
if tp.Root.String() != t.Root.String() {
|
||||
return false
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
return true
|
||||
}
|
||||
|
||||
func TestTemplates_Load(t *testing.T) {
|
||||
testCases := []struct {
|
||||
name string
|
||||
initialTemplate textTemplate
|
||||
pathPatterns []string
|
||||
overwrite bool
|
||||
expectedTemplate textTemplate
|
||||
expErr string
|
||||
}{
|
||||
{
|
||||
"non existing path undefined template override",
|
||||
mkTemplate(nil, nil),
|
||||
[]string{
|
||||
"templates/non-existing/good-*.tpl",
|
||||
"templates/absent/good-*.tpl",
|
||||
},
|
||||
true,
|
||||
mkTemplate(``, nil),
|
||||
"",
|
||||
},
|
||||
{
|
||||
"non existing path defined template override",
|
||||
mkTemplate(`
|
||||
{{- define "test.1" -}}
|
||||
{{- printf "value" -}}
|
||||
{{- end -}}
|
||||
`, nil),
|
||||
[]string{
|
||||
"templates/non-existing/good-*.tpl",
|
||||
"templates/absent/good-*.tpl",
|
||||
},
|
||||
true,
|
||||
mkTemplate(``, nil),
|
||||
"",
|
||||
},
|
||||
{
|
||||
"existing path undefined template override",
|
||||
mkTemplate(nil, nil),
|
||||
[]string{
|
||||
"templates/other/nested/good0-*.tpl",
|
||||
"templates/test/good0-*.tpl",
|
||||
},
|
||||
false,
|
||||
mkTemplate(`
|
||||
{{- define "good0-test.tpl" -}}{{- end -}}
|
||||
{{- define "test.0" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
||||
{{- define "test.1" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
||||
{{- define "test.2" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
||||
{{- define "test.3" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
||||
`, nil),
|
||||
"",
|
||||
},
|
||||
{
|
||||
"existing path defined template override",
|
||||
mkTemplate(`
|
||||
{{- define "test.1" -}}
|
||||
{{ printf "Hello %s!" "world" }}
|
||||
{{- end -}}
|
||||
`, nil),
|
||||
[]string{
|
||||
"templates/other/nested/good0-*.tpl",
|
||||
"templates/test/good0-*.tpl",
|
||||
},
|
||||
false,
|
||||
mkTemplate(`
|
||||
{{- define "good0-test.tpl" -}}{{- end -}}
|
||||
{{- define "test.0" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
||||
{{- define "test.1" -}}
|
||||
{{ printf "Hello %s!" "world" }}
|
||||
{{- end -}}
|
||||
{{- define "test.2" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
||||
{{- define "test.3" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
||||
`, `
|
||||
{{- define "good0-test.tpl" -}}{{- end -}}
|
||||
{{- define "test.0" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
||||
{{- define "test.1" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
||||
{{- define "test.2" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
||||
{{- define "test.3" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
||||
`),
|
||||
"",
|
||||
},
|
||||
{
|
||||
"load template with syntax error",
|
||||
mkTemplate(`
|
||||
{{- define "test.1" -}}
|
||||
{{ printf "Hello %s!" "world" }}
|
||||
{{- end -}}
|
||||
`, nil),
|
||||
[]string{
|
||||
"templates/other/nested/bad0-*.tpl",
|
||||
"templates/test/good0-*.tpl",
|
||||
},
|
||||
false,
|
||||
mkTemplate(`
|
||||
{{- define "test.1" -}}
|
||||
{{ printf "Hello %s!" "world" }}
|
||||
{{- end -}}
|
||||
`, nil),
|
||||
"failed to parse template glob",
|
||||
},
|
||||
}
|
||||
|
||||
for _, tc := range testCases {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
masterTmpl = tc.initialTemplate
|
||||
err := Load(tc.pathPatterns, tc.overwrite)
|
||||
if tc.expErr == "" && err != nil {
|
||||
t.Error("happened error that wasn't expected: %w", err)
|
||||
}
|
||||
if tc.expErr != "" && err == nil {
|
||||
t.Error("%+w", err)
|
||||
t.Error("expected error that didn't happend")
|
||||
}
|
||||
if err != nil && !strings.Contains(err.Error(), tc.expErr) {
|
||||
t.Error("%+w", err)
|
||||
t.Error("expected string doesn't exist in error message")
|
||||
}
|
||||
if !equalTemplates(masterTmpl.replacement, tc.expectedTemplate.replacement) {
|
||||
t.Fatalf("replacement template is not as expected")
|
||||
}
|
||||
if !equalTemplates(masterTmpl.current, tc.expectedTemplate.current) {
|
||||
t.Fatalf("current template is not as expected")
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
func TestTemplates_Reload(t *testing.T) {
|
||||
testCases := []struct {
|
||||
name string
|
||||
initialTemplate textTemplate
|
||||
expectedTemplate textTemplate
|
||||
}{
|
||||
{
|
||||
"empty current and replacement templates",
|
||||
mkTemplate(nil, nil),
|
||||
mkTemplate(nil, nil),
|
||||
},
|
||||
{
|
||||
"empty current template only",
|
||||
mkTemplate(`
|
||||
{{- define "test.1" -}}
|
||||
{{- printf "value" -}}
|
||||
{{- end -}}
|
||||
`, nil),
|
||||
mkTemplate(`
|
||||
{{- define "test.1" -}}
|
||||
{{- printf "value" -}}
|
||||
{{- end -}}
|
||||
`, nil),
|
||||
},
|
||||
{
|
||||
"empty replacement template only",
|
||||
mkTemplate(nil, `
|
||||
{{- define "test.1" -}}
|
||||
{{- printf "value" -}}
|
||||
{{- end -}}
|
||||
`),
|
||||
mkTemplate(`
|
||||
{{- define "test.1" -}}
|
||||
{{- printf "value" -}}
|
||||
{{- end -}}
|
||||
`, nil),
|
||||
},
|
||||
{
|
||||
"defined both templates",
|
||||
mkTemplate(`
|
||||
{{- define "test.0" -}}
|
||||
{{- printf "value" -}}
|
||||
{{- end -}}
|
||||
{{- define "test.1" -}}
|
||||
{{- printf "before" -}}
|
||||
{{- end -}}
|
||||
`, `
|
||||
{{- define "test.1" -}}
|
||||
{{- printf "after" -}}
|
||||
{{- end -}}
|
||||
`),
|
||||
mkTemplate(`
|
||||
{{- define "test.1" -}}
|
||||
{{- printf "after" -}}
|
||||
{{- end -}}
|
||||
`, nil),
|
||||
},
|
||||
}
|
||||
|
||||
for _, tc := range testCases {
|
||||
t.Run(tc.name, func(t *testing.T) {
|
||||
masterTmpl = tc.initialTemplate
|
||||
Reload()
|
||||
if !equalTemplates(masterTmpl.replacement, tc.expectedTemplate.replacement) {
|
||||
t.Fatalf("replacement template is not as expected")
|
||||
}
|
||||
if !equalTemplates(masterTmpl.current, tc.expectedTemplate.current) {
|
||||
t.Fatalf("current template is not as expected")
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
|
@ -0,0 +1,3 @@
|
|||
{{- define "test.1" -}}
|
||||
{{ printf "Hello %s!" externalURL" }}
|
||||
{{- end -}}
|
|
@ -0,0 +1,9 @@
|
|||
{{- define "test.1" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
||||
{{- define "test.0" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
||||
{{- define "test.3" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
9
app/vmalert/templates/templates/test/good0-test.tpl
Normal file
9
app/vmalert/templates/templates/test/good0-test.tpl
Normal file
|
@ -0,0 +1,9 @@
|
|||
{{- define "test.2" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
||||
{{- define "test.0" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
||||
{{- define "test.3" -}}
|
||||
{{ printf "Hello %s!" externalURL }}
|
||||
{{- end -}}
|
|
@ -69,7 +69,7 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
|
|||
case "/alerts":
|
||||
WriteListAlerts(w, pathPrefix, rh.groupAlerts())
|
||||
return true
|
||||
case "/groups":
|
||||
case "/groups", "/rules":
|
||||
WriteListGroups(w, rh.groups())
|
||||
return true
|
||||
case "/notifiers":
|
||||
|
|
|
@ -139,3 +139,122 @@ info app/vmbackupmanager/retention.go:106 daily backups to delete [daily/2
|
|||
The result on the GCS bucket. We see only 3 daily backups:
|
||||
|
||||
![daily](vmbackupmanager_rp_daily_2.png)
|
||||
|
||||
|
||||
## Configuration
|
||||
|
||||
### Flags
|
||||
|
||||
Pass `-help` to `vmbackupmanager` in order to see the full list of supported
|
||||
command-line flags with their descriptions.
|
||||
|
||||
The shortlist of configuration flags is the following:
|
||||
|
||||
```
|
||||
vmbackupmanager performs regular backups according to the provided configs.
|
||||
|
||||
-concurrency int
|
||||
The number of concurrent workers. Higher concurrency may reduce backup duration (default 10)
|
||||
-configFilePath string
|
||||
Path to file with S3 configs. Configs are loaded from default location if not set.
|
||||
See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
|
||||
-configProfile string
|
||||
Profile name for S3 configs. If no set, the value of the environment variable will be loaded (AWS_PROFILE or AWS_DEFAULT_PROFILE), or if both not set, DefaultSharedConfigProfile is used
|
||||
-credsFilePath string
|
||||
Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set.
|
||||
See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html
|
||||
-customS3Endpoint string
|
||||
Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set
|
||||
-disableDaily
|
||||
Disable daily run. Default false
|
||||
-disableHourly
|
||||
Disable hourly run. Default false
|
||||
-disableMonthly
|
||||
Disable monthly run. Default false
|
||||
-disableWeekly
|
||||
Disable weekly run. Default false
|
||||
-dst string
|
||||
The root folder of Victoria Metrics backups. Example: gs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir
|
||||
-enableTCP6
|
||||
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
|
||||
-envflag.enable
|
||||
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
|
||||
-envflag.prefix string
|
||||
Prefix for environment variables if -envflag.enable is set
|
||||
-eula
|
||||
By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf
|
||||
-fs.disableMmap
|
||||
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
|
||||
-http.connTimeout duration
|
||||
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
|
||||
-http.disableResponseCompression
|
||||
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
|
||||
-http.idleConnTimeout duration
|
||||
Timeout for incoming idle http connections (default 1m0s)
|
||||
-http.maxGracefulShutdownDuration duration
|
||||
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
|
||||
-http.pathPrefix string
|
||||
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
|
||||
-http.shutdownDelay duration
|
||||
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
|
||||
-httpAuth.password string
|
||||
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
|
||||
-httpAuth.username string
|
||||
Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password
|
||||
-httpListenAddr string
|
||||
Address to listen for http connections (default ":8300")
|
||||
-keepLastDaily int
|
||||
Keep last N daily backups. If 0 is specified next retention cycle removes all backups for given time period. (default -1)
|
||||
-keepLastHourly int
|
||||
Keep last N hourly backups. If 0 is specified next retention cycle removes all backups for given time period. (default -1)
|
||||
-keepLastMonthly int
|
||||
Keep last N monthly backups. If 0 is specified next retention cycle removes all backups for given time period. (default -1)
|
||||
-keepLastWeekly int
|
||||
Keep last N weekly backups. If 0 is specified next retention cycle removes all backups for given time period. (default -1)
|
||||
-loggerDisableTimestamps
|
||||
Whether to disable writing timestamps in logs
|
||||
-loggerErrorsPerSecondLimit int
|
||||
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
|
||||
-loggerFormat string
|
||||
Format for logs. Possible values: default, json (default "default")
|
||||
-loggerLevel string
|
||||
Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO")
|
||||
-loggerOutput string
|
||||
Output for the logs. Supported values: stderr, stdout (default "stderr")
|
||||
-loggerTimezone string
|
||||
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
|
||||
-loggerWarnsPerSecondLimit int
|
||||
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
|
||||
-maxBytesPerSecond int
|
||||
The maximum upload speed. There is no limit if it is set to 0
|
||||
-memory.allowedBytes size
|
||||
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
|
||||
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
|
||||
-memory.allowedPercent float
|
||||
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
|
||||
-metricsAuthKey string
|
||||
Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings
|
||||
-pprofAuthKey string
|
||||
Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings
|
||||
-runOnStart
|
||||
Upload backups immediately after start of the service. Otherwise the backup starts on new hour
|
||||
-s3ForcePathStyle
|
||||
Prefixing endpoint with bucket name when set false, true by default. (default true)
|
||||
-snapshot.createURL string
|
||||
VictoriaMetrics create snapshot url. When this is given a snapshot will automatically be created during backup.Example: http://victoriametrics:8428/snapshot/create
|
||||
-snapshot.deleteURL string
|
||||
VictoriaMetrics delete snapshot url. Optional. Will be generated from snapshot.createURL if not provided. All created snaphosts will be automatically deleted.Example: http://victoriametrics:8428/snapshot/delete
|
||||
-storageDataPath string
|
||||
Path to VictoriaMetrics data. Must match -storageDataPath from VictoriaMetrics or vmstorage (default "victoria-metrics-data")
|
||||
-tls
|
||||
Whether to enable TLS for incoming HTTP requests at -httpListenAddr (aka https). -tlsCertFile and -tlsKeyFile must be set if -tls is set
|
||||
-tlsCertFile string
|
||||
Path to file with TLS certificate if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated
|
||||
-tlsCipherSuites array
|
||||
Optional list of TLS cipher suites for incoming requests over HTTPS if -tls is set. See the list of supported cipher suites at https://pkg.go.dev/crypto/tls#pkg-constants
|
||||
Supports an array of values separated by comma or specified via multiple flags.
|
||||
-tlsKeyFile string
|
||||
Path to file with TLS key if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated
|
||||
-version
|
||||
Show VictoriaMetrics version
|
||||
```
|
|
@ -16,7 +16,7 @@ To see the full list of supported modes
|
|||
run the following command:
|
||||
|
||||
```bash
|
||||
./vmctl --help
|
||||
$ ./vmctl --help
|
||||
NAME:
|
||||
vmctl - VictoriaMetrics command-line tool
|
||||
|
||||
|
@ -35,7 +35,7 @@ Each mode has its own unique set of flags specific (e.g. prefixed with `influx`
|
|||
to the data source and common list of flags for destination (prefixed with `vm` for VictoriaMetrics):
|
||||
|
||||
```
|
||||
./vmctl influx --help
|
||||
$ ./vmctl influx --help
|
||||
OPTIONS:
|
||||
--influx-addr value InfluxDB server addr (default: "http://localhost:8086")
|
||||
--influx-user value InfluxDB user [$INFLUX_USERNAME]
|
||||
|
@ -55,7 +55,7 @@ them below in corresponding sections.
|
|||
For the destination flags see the full description by running the following command:
|
||||
|
||||
```
|
||||
./vmctl influx --help | grep vm-
|
||||
$ ./vmctl influx --help | grep vm-
|
||||
```
|
||||
|
||||
Some flags like [--vm-extra-label](#adding-extra-labels) or [--vm-significant-figures](#significant-figures)
|
||||
|
@ -77,11 +77,11 @@ forget to specify the `--vm-account-id` flag. See more details for cluster versi
|
|||
|
||||
See `./vmctl opentsdb --help` for details and full list of flags.
|
||||
|
||||
*OpenTSDB migration is not possible without a functioning [meta](http://opentsdb.net/docs/build/html/user_guide/metadata.html) table to search for metrics/series.*
|
||||
**Important:** OpenTSDB migration is not possible without a functioning [meta](http://opentsdb.net/docs/build/html/user_guide/metadata.html) table to search for metrics/series. Check in OpenTSDB config that appropriate options are [activated]( https://github.com/OpenTSDB/opentsdb/issues/681#issuecomment-177359563) and HBase meta tables are present. W/o them migration won't work.
|
||||
|
||||
OpenTSDB migration works like so:
|
||||
|
||||
1. Find metrics based on selected filters (or the default filter set ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z'])
|
||||
1. Find metrics based on selected filters (or the default filter set `['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']`)
|
||||
|
||||
- e.g. `curl -Ss "http://opentsdb:4242/api/suggest?type=metrics&q=sys"`
|
||||
|
||||
|
@ -89,9 +89,11 @@ OpenTSDB migration works like so:
|
|||
|
||||
- e.g. `curl -Ss "http://opentsdb:4242/api/search/lookup?m=system.load5&limit=1000000"`
|
||||
|
||||
Here `results` return field should not be empty. Otherwise it means that meta tables are absent and needs to be turned on previously.
|
||||
|
||||
3. Download data for each series in chunks defined in the CLI switches
|
||||
|
||||
- e.g. `-retention=sum-1m-avg:1h:90d` ==
|
||||
- e.g. `-retention=sum-1m-avg:1h:90d` means
|
||||
- `curl -Ss "http://opentsdb:4242/api/query?start=1h-ago&end=now&m=sum:1m-avg-none:system.load5\{host=host1\}"`
|
||||
- `curl -Ss "http://opentsdb:4242/api/query?start=2h-ago&end=1h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"`
|
||||
- `curl -Ss "http://opentsdb:4242/api/query?start=3h-ago&end=2h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"`
|
||||
|
@ -101,7 +103,7 @@ OpenTSDB migration works like so:
|
|||
This means that we must stream data from OpenTSDB to VictoriaMetrics in chunks. This is where concurrency for OpenTSDB comes in. We can query multiple chunks at once, but we shouldn't perform too many chunks at a time to avoid overloading the OpenTSDB cluster.
|
||||
|
||||
```
|
||||
$ bin/vmctl opentsdb --otsdb-addr http://opentsdb:4242/ --otsdb-retentions sum-1m-avg:1h:1d --otsdb-filters system --otsdb-normalize --vm-addr http://victoria/
|
||||
$ ./vmctl opentsdb --otsdb-addr http://opentsdb:4242/ --otsdb-retentions sum-1m-avg:1h:1d --otsdb-filters system --otsdb-normalize --vm-addr http://victoria:8428/
|
||||
OpenTSDB import mode
|
||||
2021/04/09 11:52:50 Will collect data starting at TS 1617990770
|
||||
2021/04/09 11:52:50 Loading all metrics from OpenTSDB for filters: [system]
|
||||
|
@ -109,6 +111,14 @@ Found 9 metrics to import. Continue? [Y/n]
|
|||
2021/04/09 11:52:51 Starting work on system.load1
|
||||
23 / 402200 [>____________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________] 0.01% 2 p/s
|
||||
```
|
||||
Where `:8428` is Prometheus port of VictoriaMetrics.
|
||||
|
||||
For clustered VictoriaMetrics setup `--vm-account-id` flag needs to be added, for example:
|
||||
|
||||
```
|
||||
$ ./vmctl opentsdb --otsdb-addr http://opentsdb:4242/ --otsdb-retentions sum-1m-avg:1h:1d --otsdb-filters system --otsdb-normalize --vm-addr http://victoria:8480/ --vm-account-id 0
|
||||
```
|
||||
This time `:8480` port is vminsert/Prometheus input port.
|
||||
|
||||
### Retention strings
|
||||
|
||||
|
|
|
@ -3,7 +3,7 @@
|
|||
// altogether.
|
||||
package barpool
|
||||
|
||||
import "github.com/cheggaaa/pb/v3"
|
||||
import "github.com/dmitryk-dk/pb/v3"
|
||||
|
||||
var pool = pb.NewPool()
|
||||
|
||||
|
|
|
@ -202,6 +202,7 @@ const (
|
|||
influxFilterTimeEnd = "influx-filter-time-end"
|
||||
influxMeasurementFieldSeparator = "influx-measurement-field-separator"
|
||||
influxSkipDatabaseLabel = "influx-skip-database-label"
|
||||
influxPrometheusMode = "influx-prometheus-mode"
|
||||
)
|
||||
|
||||
var (
|
||||
|
@ -264,6 +265,11 @@ var (
|
|||
Usage: "Wether to skip adding the label 'db' to timeseries.",
|
||||
Value: false,
|
||||
},
|
||||
&cli.BoolFlag{
|
||||
Name: influxPrometheusMode,
|
||||
Usage: "Wether to restore the original timeseries name previously written from Prometheus to InfluxDB v1 via remote_write.",
|
||||
Value: false,
|
||||
},
|
||||
}
|
||||
)
|
||||
|
||||
|
|
|
@ -17,9 +17,10 @@ type influxProcessor struct {
|
|||
cc int
|
||||
separator string
|
||||
skipDbLabel bool
|
||||
promMode bool
|
||||
}
|
||||
|
||||
func newInfluxProcessor(ic *influx.Client, im *vm.Importer, cc int, separator string, skipDbLabel bool) *influxProcessor {
|
||||
func newInfluxProcessor(ic *influx.Client, im *vm.Importer, cc int, separator string, skipDbLabel bool, promMode bool) *influxProcessor {
|
||||
if cc < 1 {
|
||||
cc = 1
|
||||
}
|
||||
|
@ -29,6 +30,7 @@ func newInfluxProcessor(ic *influx.Client, im *vm.Importer, cc int, separator st
|
|||
cc: cc,
|
||||
separator: separator,
|
||||
skipDbLabel: skipDbLabel,
|
||||
promMode: promMode,
|
||||
}
|
||||
}
|
||||
|
||||
|
@ -101,6 +103,8 @@ func (ip *influxProcessor) run(silent, verbose bool) error {
|
|||
}
|
||||
|
||||
const dbLabel = "db"
|
||||
const nameLabel = "__name__"
|
||||
const valueField = "value"
|
||||
|
||||
func (ip *influxProcessor) do(s *influx.Series) error {
|
||||
cr, err := ip.ic.FetchDataPoints(s)
|
||||
|
@ -122,6 +126,8 @@ func (ip *influxProcessor) do(s *influx.Series) error {
|
|||
for i, lp := range s.LabelPairs {
|
||||
if lp.Name == dbLabel {
|
||||
containsDBLabel = true
|
||||
} else if lp.Name == nameLabel && s.Field == valueField && ip.promMode {
|
||||
name = lp.Value
|
||||
}
|
||||
labels[i] = vm.LabelPair{
|
||||
Name: lp.Name,
|
||||
|
|
|
@ -105,7 +105,8 @@ func main() {
|
|||
importer,
|
||||
c.Int(influxConcurrency),
|
||||
c.String(influxMeasurementFieldSeparator),
|
||||
c.Bool(influxSkipDatabaseLabel))
|
||||
c.Bool(influxSkipDatabaseLabel),
|
||||
c.Bool(influxPrometheusMode))
|
||||
return processor.run(c.Bool(globalSilent), c.Bool(globalVerbose))
|
||||
},
|
||||
},
|
||||
|
|
|
@ -8,7 +8,7 @@ import (
|
|||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/opentsdb"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/vm"
|
||||
"github.com/cheggaaa/pb/v3"
|
||||
"github.com/dmitryk-dk/pb/v3"
|
||||
)
|
||||
|
||||
type otsdbProcessor struct {
|
||||
|
@ -67,7 +67,7 @@ func (op *otsdbProcessor) run(silent, verbose bool) error {
|
|||
queryRanges += len(rt.QueryRanges)
|
||||
}
|
||||
for _, metric := range metrics {
|
||||
log.Println(fmt.Sprintf("Starting work on %s", metric))
|
||||
log.Printf("Starting work on %s", metric)
|
||||
serieslist, err := op.oc.FindSeries(metric)
|
||||
if err != nil {
|
||||
return fmt.Errorf("couldn't retrieve series list for %s : %s", metric, err)
|
||||
|
|
|
@ -196,7 +196,7 @@ func (c Client) GetData(series Meta, rt RetentionMeta, start int64, end int64, m
|
|||
3. bad format of response body
|
||||
*/
|
||||
if resp.StatusCode != 200 {
|
||||
log.Println(fmt.Sprintf("bad response code from OpenTSDB query %v for %q...skipping", resp.StatusCode, q))
|
||||
log.Printf("bad response code from OpenTSDB query %v for %q...skipping", resp.StatusCode, q)
|
||||
return Metric{}, nil
|
||||
}
|
||||
defer func() { _ = resp.Body.Close() }()
|
||||
|
@ -208,7 +208,7 @@ func (c Client) GetData(series Meta, rt RetentionMeta, start int64, end int64, m
|
|||
var output []OtsdbMetric
|
||||
err = json.Unmarshal(body, &output)
|
||||
if err != nil {
|
||||
log.Println(fmt.Sprintf("couldn't marshall response body from OpenTSDB query (%s)...skipping", body))
|
||||
log.Printf("couldn't marshall response body from OpenTSDB query (%s)...skipping", body)
|
||||
return Metric{}, nil
|
||||
}
|
||||
/*
|
||||
|
@ -309,7 +309,7 @@ func NewClient(cfg Config) (*Client, error) {
|
|||
*/
|
||||
offsetPrint = offsetPrint - offsetSecs
|
||||
}
|
||||
log.Println(fmt.Sprintf("Will collect data starting at TS %v", offsetPrint))
|
||||
log.Printf("Will collect data starting at TS %v", offsetPrint)
|
||||
for _, r := range cfg.Retentions {
|
||||
ret, err := convertRetention(r, offsetSecs, cfg.MsecsTime)
|
||||
if err != nil {
|
||||
|
|
|
@ -13,11 +13,10 @@ import (
|
|||
"sync"
|
||||
"time"
|
||||
|
||||
"github.com/cheggaaa/pb/v3"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/barpool"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/limiter"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
|
||||
"github.com/dmitryk-dk/pb/v3"
|
||||
)
|
||||
|
||||
// Config contains list of params to configure
|
||||
|
|
|
@ -36,7 +36,7 @@ import (
|
|||
|
||||
var (
|
||||
graphiteListenAddr = flag.String("graphiteListenAddr", "", "TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty")
|
||||
influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. "+
|
||||
influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for InfluxDB line protocol data. Usually :8089 must be set. Doesn't work if empty. "+
|
||||
"This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write")
|
||||
opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpentTSDB metrics. "+
|
||||
"Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. "+
|
||||
|
|
|
@ -2069,7 +2069,7 @@ func TestExecSuccess(t *testing.T) {
|
|||
t.Parallel()
|
||||
q := `with (
|
||||
x = (
|
||||
label_set(time(), "foo", "123.456", "__name__", "aaa"),
|
||||
label_set(time() > 1500, "foo", "123.456", "__name__", "aaa"),
|
||||
label_set(-time(), "foo", "bar", "__name__", "bbb"),
|
||||
label_set(-time(), "__name__", "bxs"),
|
||||
label_set(-time(), "foo", "45", "bar", "xs"),
|
||||
|
@ -2093,7 +2093,7 @@ func TestExecSuccess(t *testing.T) {
|
|||
}
|
||||
r2 := netstorage.Result{
|
||||
MetricName: metricNameExpected,
|
||||
Values: []float64{1123.456, 1323.456, 1523.456, 1723.456, 1923.456, 2123.456},
|
||||
Values: []float64{nan, nan, nan, 1723.456, 1923.456, 2123.456},
|
||||
Timestamps: timestampsExpected,
|
||||
}
|
||||
r2.MetricName.Tags = []storage.Tag{
|
||||
|
|
|
@ -1715,8 +1715,10 @@ func transformLabelValue(tfa *transformFuncArg) ([]*timeseries, error) {
|
|||
v = nan
|
||||
}
|
||||
values := ts.Values
|
||||
for i := range values {
|
||||
values[i] = v
|
||||
for i, vOrig := range values {
|
||||
if !math.IsNaN(vOrig) {
|
||||
values[i] = v
|
||||
}
|
||||
}
|
||||
}
|
||||
// Do not remove timeseries with only NaN values, so `default` could be applied to them:
|
||||
|
|
|
@ -1,14 +1,12 @@
|
|||
{
|
||||
"files": {
|
||||
"main.css": "./static/css/main.d8362c27.css",
|
||||
"main.js": "./static/js/main.f64c8675.js",
|
||||
"static/js/362.1f16598a.chunk.js": "./static/js/362.1f16598a.chunk.js",
|
||||
"main.js": "./static/js/main.a54e3212.js",
|
||||
"static/js/27.939f971b.chunk.js": "./static/js/27.939f971b.chunk.js",
|
||||
"static/media/README.md": "./static/media/README.40ebc3a1f4adae949154.md",
|
||||
"index.html": "./index.html"
|
||||
},
|
||||
"entrypoints": [
|
||||
"static/css/main.d8362c27.css",
|
||||
"static/js/main.f64c8675.js"
|
||||
"static/js/main.a54e3212.js"
|
||||
]
|
||||
}
|
|
@ -1,3 +1,8 @@
|
|||
### Setup
|
||||
1. Create `.json` config file in a folder `dashboards`
|
||||
2. Import your config file into the `dashboards/index.js`
|
||||
3. Add imported variable into the array `window.__VMUI_PREDEFINED_DASHBOARDS__`
|
||||
|
||||
### Configuration options
|
||||
|
||||
<br/>
|
5
app/vmselect/vmui/dashboards/index.js
Normal file
5
app/vmselect/vmui/dashboards/index.js
Normal file
|
@ -0,0 +1,5 @@
|
|||
import perJob from "./perJobUsage.json" assert { type: "json" };
|
||||
|
||||
window.__VMUI_PREDEFINED_DASHBOARDS__ = [
|
||||
perJob
|
||||
];
|
|
@ -17,12 +17,12 @@
|
|||
"title": "Per-job disk read",
|
||||
"width": 6,
|
||||
"expr": ["sum(rate(process_io_storage_read_bytes_total)) by (job)"]
|
||||
},{
|
||||
},
|
||||
{
|
||||
"title": "Per-job disk write",
|
||||
"width": 6,
|
||||
"expr": ["sum(rate(process_io_storage_written_bytes_total)) by (job)"]
|
||||
}
|
||||
|
||||
]
|
||||
}
|
||||
]
|
|
@ -1 +1 @@
|
|||
<!doctype html><html lang="en"><head><meta charset="utf-8"/><link rel="icon" href="./favicon.ico"/><meta name="viewport" content="width=device-width,initial-scale=1"/><meta name="theme-color" content="#000000"/><meta name="description" content="VM-UI is a metric explorer for Victoria Metrics"/><link rel="apple-touch-icon" href="./apple-touch-icon.png"/><link rel="icon" type="image/png" sizes="32x32" href="./favicon-32x32.png"><link rel="manifest" href="./manifest.json"/><title>VM UI</title><link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Roboto:300,400,500,700&display=swap"/><script defer="defer" src="./static/js/main.f64c8675.js"></script><link href="./static/css/main.d8362c27.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>
|
||||
<!doctype html><html lang="en"><head><meta charset="utf-8"/><link href="./favicon.ico" rel="icon"/><meta content="width=device-width,initial-scale=1" name="viewport"/><meta content="#000000" name="theme-color"/><meta content="VM-UI is a metric explorer for Victoria Metrics" name="description"/><link href="./apple-touch-icon.png" rel="apple-touch-icon"/><link href="./favicon-32x32.png" rel="icon" sizes="32x32" type="image/png"><link href="./manifest.json" rel="manifest"/><title>VM UI</title><link href="https://fonts.googleapis.com/css?family=Roboto:300,400,500,700&display=swap" rel="stylesheet"/><script src="./dashboards/index.js" type="module"></script><script defer="defer" src="./static/js/main.a54e3212.js"></script><link href="./static/css/main.d8362c27.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>
|
||||
|
|
|
@ -1 +0,0 @@
|
|||
"use strict";(self.webpackChunkvmui=self.webpackChunkvmui||[]).push([[362],{8362:function(e,a,s){e.exports=s.p+"static/media/README.40ebc3a1f4adae949154.md"}}]);
|
2
app/vmselect/vmui/static/js/main.a54e3212.js
Normal file
2
app/vmselect/vmui/static/js/main.a54e3212.js
Normal file
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
669
app/vmui/packages/vmui/package-lock.json
generated
669
app/vmui/packages/vmui/package-lock.json
generated
File diff suppressed because it is too large
Load diff
|
@ -37,7 +37,6 @@
|
|||
"start": "react-app-rewired start",
|
||||
"build": "GENERATE_SOURCEMAP=false react-app-rewired build",
|
||||
"test": "react-app-rewired test",
|
||||
"eject": "react-scripts eject",
|
||||
"lint": "eslint src --ext tsx,ts",
|
||||
"lint:fix": "eslint src --ext tsx,ts --fix"
|
||||
},
|
||||
|
@ -66,5 +65,10 @@
|
|||
"customize-cra": "^1.0.0",
|
||||
"eslint-plugin-react": "^7.29.4",
|
||||
"react-app-rewired": "^2.2.1"
|
||||
},
|
||||
"overrides": {
|
||||
"react-app-rewired": {
|
||||
"nth-check": "^2.0.1"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
|
@ -1,3 +1,8 @@
|
|||
### Setup
|
||||
1. Create `.json` config file in a folder `dashboards`
|
||||
2. Import your config file into the `dashboards/index.js`
|
||||
3. Add imported variable into the array `window.__VMUI_PREDEFINED_DASHBOARDS__`
|
||||
|
||||
### Configuration options
|
||||
|
||||
<br/>
|
5
app/vmui/packages/vmui/public/dashboards/index.js
Normal file
5
app/vmui/packages/vmui/public/dashboards/index.js
Normal file
|
@ -0,0 +1,5 @@
|
|||
import perJob from "./perJobUsage.json" assert { type: "json" };
|
||||
|
||||
window.__VMUI_PREDEFINED_DASHBOARDS__ = [
|
||||
perJob
|
||||
];
|
29
app/vmui/packages/vmui/public/dashboards/perJobUsage.json
Normal file
29
app/vmui/packages/vmui/public/dashboards/perJobUsage.json
Normal file
|
@ -0,0 +1,29 @@
|
|||
{
|
||||
"title": "per-job resource usage",
|
||||
"rows": [
|
||||
{
|
||||
"panels": [
|
||||
{
|
||||
"title": "Per-job CPU usage",
|
||||
"width": 6,
|
||||
"expr": ["sum(rate(process_cpu_seconds_total)) by (job)"]
|
||||
},
|
||||
{
|
||||
"title": "Per-job RSS usage",
|
||||
"width": 6,
|
||||
"expr": ["sum(process_resident_memory_bytes) by (job)"]
|
||||
},
|
||||
{
|
||||
"title": "Per-job disk read",
|
||||
"width": 6,
|
||||
"expr": ["sum(rate(process_io_storage_read_bytes_total)) by (job)"]
|
||||
},
|
||||
{
|
||||
"title": "Per-job disk write",
|
||||
"width": 6,
|
||||
"expr": ["sum(rate(process_io_storage_written_bytes_total)) by (job)"]
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
|
@ -27,6 +27,7 @@
|
|||
-->
|
||||
<title>VM UI</title>
|
||||
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Roboto:300,400,500,700&display=swap" />
|
||||
<script src="%PUBLIC_URL%/dashboards/index.js" type="module"></script>
|
||||
</head>
|
||||
<body>
|
||||
<noscript>You need to enable JavaScript to run this app.</noscript>
|
||||
|
|
|
@ -19,29 +19,29 @@ import DashboardsLayout from "./components/PredefinedPanels/DashboardsLayout";
|
|||
const App: FC = () => {
|
||||
|
||||
return <>
|
||||
<CssBaseline /> {/* CSS Baseline: kind of normalize.css made by materialUI team - can be scoped */}
|
||||
<LocalizationProvider dateAdapter={DayjsUtils}> {/* Allows datepicker to work with DayJS */}
|
||||
<StyledEngineProvider injectFirst>
|
||||
<ThemeProvider theme={THEME}> {/* Material UI theme customization */}
|
||||
<StateProvider> {/* Serialized into query string, common app settings */}
|
||||
<AuthStateProvider> {/* Auth related info - optionally persisted to Local Storage */}
|
||||
<GraphStateProvider> {/* Graph settings */}
|
||||
<SnackbarProvider> {/* Display various snackbars */}
|
||||
<HashRouter>
|
||||
<HashRouter>
|
||||
<CssBaseline /> {/* CSS Baseline: kind of normalize.css made by materialUI team - can be scoped */}
|
||||
<LocalizationProvider dateAdapter={DayjsUtils}> {/* Allows datepicker to work with DayJS */}
|
||||
<StyledEngineProvider injectFirst>
|
||||
<ThemeProvider theme={THEME}> {/* Material UI theme customization */}
|
||||
<StateProvider> {/* Serialized into query string, common app settings */}
|
||||
<AuthStateProvider> {/* Auth related info - optionally persisted to Local Storage */}
|
||||
<GraphStateProvider> {/* Graph settings */}
|
||||
<SnackbarProvider> {/* Display various snackbars */}
|
||||
<Routes>
|
||||
<Route path={"/"} element={<HomeLayout/>}>
|
||||
<Route path={router.home} element={<CustomPanel/>}/>
|
||||
<Route path={router.dashboards} element={<DashboardsLayout/>}/>
|
||||
</Route>
|
||||
</Routes>
|
||||
</HashRouter>
|
||||
</SnackbarProvider>
|
||||
</GraphStateProvider>
|
||||
</AuthStateProvider>
|
||||
</StateProvider>
|
||||
</ThemeProvider>
|
||||
</StyledEngineProvider>
|
||||
</LocalizationProvider>
|
||||
</SnackbarProvider>
|
||||
</GraphStateProvider>
|
||||
</AuthStateProvider>
|
||||
</StateProvider>
|
||||
</ThemeProvider>
|
||||
</StyledEngineProvider>
|
||||
</LocalizationProvider>
|
||||
</HashRouter>
|
||||
</>;
|
||||
};
|
||||
|
||||
|
|
|
@ -9,10 +9,10 @@ import {SyntheticEvent} from "react";
|
|||
|
||||
export type DisplayType = "table" | "chart" | "code";
|
||||
|
||||
const tabs = [
|
||||
{value: "chart", icon: <ShowChartIcon/>, label: "Graph"},
|
||||
export const displayTypeTabs = [
|
||||
{value: "chart", icon: <ShowChartIcon/>, label: "Graph", prometheusCode: 0},
|
||||
{value: "code", icon: <CodeIcon/>, label: "JSON"},
|
||||
{value: "table", icon: <TableChartIcon/>, label: "Table"}
|
||||
{value: "table", icon: <TableChartIcon/>, label: "Table", prometheusCode: 1}
|
||||
];
|
||||
|
||||
export const DisplayTypeSwitch: FC = () => {
|
||||
|
@ -29,7 +29,7 @@ export const DisplayTypeSwitch: FC = () => {
|
|||
onChange={handleChange}
|
||||
sx={{minHeight: "0", marginBottom: "-1px"}}
|
||||
>
|
||||
{tabs.map(t =>
|
||||
{displayTypeTabs.map(t =>
|
||||
<Tab key={t.value}
|
||||
icon={t.icon}
|
||||
iconPosition="start"
|
||||
|
@ -37,4 +37,4 @@ export const DisplayTypeSwitch: FC = () => {
|
|||
sx={{minHeight: "41px"}}
|
||||
/>)}
|
||||
</Tabs>;
|
||||
};
|
||||
};
|
||||
|
|
|
@ -20,7 +20,7 @@ const DashboardLayout: FC = () => {
|
|||
}, [dashboards, tab]);
|
||||
|
||||
useEffect(() => {
|
||||
getDashboardSettings().then(d => d.length && setDashboards(d));
|
||||
setDashboards(getDashboardSettings());
|
||||
}, []);
|
||||
|
||||
return <>
|
||||
|
|
|
@ -1,14 +1,6 @@
|
|||
import {DashboardSettings} from "../../types";
|
||||
|
||||
const importModule = async (filename: string) => {
|
||||
const module = await import(`../../dashboards/${filename}`);
|
||||
module.default.filename = filename;
|
||||
return module.default as DashboardSettings;
|
||||
};
|
||||
|
||||
export default async () => {
|
||||
const context = require.context("../../dashboards", true, /\.json$/);
|
||||
const filenames = context.keys().map(r => r.replace("./", ""));
|
||||
return await Promise.all(filenames.map(async f => importModule(f)));
|
||||
export default (): DashboardSettings[] => {
|
||||
return window.__VMUI_PREDEFINED_DASHBOARDS__ || [];
|
||||
};
|
||||
|
||||
|
|
|
@ -8,6 +8,8 @@ import {getAppModeEnable, getAppModeParams} from "../utils/app-mode";
|
|||
import throttle from "lodash.throttle";
|
||||
import {DisplayType} from "../components/CustomPanel/Configurator/DisplayTypeSwitch";
|
||||
import {CustomStep} from "../state/graph/reducer";
|
||||
import usePrevious from "./usePrevious";
|
||||
import {arrayEquals} from "../utils/array";
|
||||
|
||||
interface FetchQueryParams {
|
||||
predefinedQuery?: string[]
|
||||
|
@ -48,7 +50,6 @@ export const useFetchQuery = ({predefinedQuery, visible, display, customStep}: F
|
|||
const controller = new AbortController();
|
||||
setFetchQueue([...fetchQueue, controller]);
|
||||
setIsLoading(true);
|
||||
|
||||
try {
|
||||
const responses = await Promise.all(fetchUrl.map(url => fetch(url, {signal: controller.signal})));
|
||||
const tempData = [];
|
||||
|
@ -114,12 +115,14 @@ export const useFetchQuery = ({predefinedQuery, visible, display, customStep}: F
|
|||
},
|
||||
[serverUrl, period, displayType, customStep]);
|
||||
|
||||
const prevFetchUrl = usePrevious(fetchUrl);
|
||||
|
||||
useEffect(() => {
|
||||
fetchOptions();
|
||||
}, [serverUrl]);
|
||||
|
||||
useEffect(() => {
|
||||
if (!visible) return;
|
||||
if (!visible || (fetchUrl && prevFetchUrl && arrayEquals(fetchUrl, prevFetchUrl))) return;
|
||||
throttledFetchData(fetchUrl, fetchQueue, (display || displayType));
|
||||
}, [fetchUrl, visible]);
|
||||
|
||||
|
|
10
app/vmui/packages/vmui/src/hooks/usePrevious.ts
Normal file
10
app/vmui/packages/vmui/src/hooks/usePrevious.ts
Normal file
|
@ -0,0 +1,10 @@
|
|||
import { useRef, useEffect } from "react";
|
||||
|
||||
export default (value: any) => {
|
||||
const ref = useRef();
|
||||
useEffect(() => {
|
||||
ref.current = value;
|
||||
}, [value]);
|
||||
|
||||
return ref.current;
|
||||
};
|
|
@ -2,6 +2,7 @@ import React, {createContext, FC, useContext, useEffect, useMemo, useReducer} fr
|
|||
import {Action, AppState, initialState, reducer} from "./reducer";
|
||||
import {getQueryStringValue, setQueryStringValue} from "../../utils/query-string";
|
||||
import {Dispatch} from "react";
|
||||
import {useLocation} from "react-router-dom";
|
||||
|
||||
type StateContextType = { state: AppState, dispatch: Dispatch<Action> };
|
||||
|
||||
|
@ -17,12 +18,13 @@ export const initialPrepopulatedState = Object.entries(initialState)
|
|||
}), {}) as AppState;
|
||||
|
||||
export const StateProvider: FC = ({children}) => {
|
||||
const location = useLocation();
|
||||
|
||||
const [state, dispatch] = useReducer(reducer, initialPrepopulatedState);
|
||||
|
||||
useEffect(() => {
|
||||
setQueryStringValue(state as unknown as Record<string, unknown>);
|
||||
}, [state]);
|
||||
}, [state, location]);
|
||||
|
||||
const contextValue = useMemo(() => {
|
||||
return { state, dispatch };
|
||||
|
|
|
@ -1,5 +1,5 @@
|
|||
/* eslint max-lines: 0 */
|
||||
import {DisplayType} from "../../components/CustomPanel/Configurator/DisplayTypeSwitch";
|
||||
import {DisplayType, displayTypeTabs} from "../../components/CustomPanel/Configurator/DisplayTypeSwitch";
|
||||
import {TimeParams, TimePeriod} from "../../types";
|
||||
import {
|
||||
dateFromSeconds,
|
||||
|
@ -62,10 +62,12 @@ const {duration, endInput, relativeTimeId} = getRelativeTime({
|
|||
defaultEndInput: new Date(formatDateToLocal(getQueryStringValue("g0.end_input", getDateNowUTC()) as Date)),
|
||||
});
|
||||
const query = getQueryArray();
|
||||
const queryTab = getQueryStringValue("g0.tab", 0);
|
||||
const displayType = displayTypeTabs.find(t => t.prometheusCode === queryTab || t.value === queryTab);
|
||||
|
||||
export const initialState: AppState = {
|
||||
serverUrl: getDefaultServer(),
|
||||
displayType: getQueryStringValue("g0.tab", "chart") as DisplayType || "chart",
|
||||
displayType: (displayType?.value || "chart") as DisplayType,
|
||||
query: query, // demo_memory_usage_bytes
|
||||
queryHistory: query.map(q => ({index: 0, values: [q]})),
|
||||
time: {
|
||||
|
|
|
@ -1,5 +1,11 @@
|
|||
import {MetricBase} from "../api/types";
|
||||
|
||||
declare global {
|
||||
interface Window {
|
||||
__VMUI_PREDEFINED_DASHBOARDS__: DashboardSettings[];
|
||||
}
|
||||
}
|
||||
|
||||
export interface TimeParams {
|
||||
start: number; // timestamp in seconds
|
||||
end: number; // timestamp in seconds
|
||||
|
|
4
app/vmui/packages/vmui/src/utils/array.ts
Normal file
4
app/vmui/packages/vmui/src/utils/array.ts
Normal file
|
@ -0,0 +1,4 @@
|
|||
export const arrayEquals = (a: (string|number)[], b: (string|number)[]) => {
|
||||
return a.length === b.length && a.every((val, index) => val === b[index]);
|
||||
};
|
||||
|
|
@ -1,12 +1,18 @@
|
|||
import qs from "qs";
|
||||
import get from "lodash.get";
|
||||
import router from "../router";
|
||||
|
||||
const stateToUrlParams = {
|
||||
const graphStateToUrlParams = {
|
||||
"time.duration": "range_input",
|
||||
"time.period.date": "end_input",
|
||||
"time.period.step": "step_input",
|
||||
"time.relativeTime": "relative_time",
|
||||
"displayType": "tab"
|
||||
"displayType": "tab",
|
||||
};
|
||||
|
||||
const stateToUrlParams = {
|
||||
[router.home]: graphStateToUrlParams,
|
||||
[router.dashboards]: graphStateToUrlParams,
|
||||
};
|
||||
|
||||
// TODO need function for detect types.
|
||||
|
@ -32,14 +38,23 @@ const stateToUrlParams = {
|
|||
export const setQueryStringWithoutPageReload = (qsValue: string): void => {
|
||||
const w = window;
|
||||
if (w) {
|
||||
const newurl = `${w.location.protocol}//${w.location.host}${w.location.pathname}?${qsValue}${w.location.hash}`;
|
||||
const qs = qsValue ? `?${qsValue}` : "";
|
||||
const newurl = `${w.location.protocol}//${w.location.host}${w.location.pathname}${qs}${w.location.hash}`;
|
||||
w.history.pushState({ path: newurl }, "", newurl);
|
||||
}
|
||||
};
|
||||
|
||||
export const setQueryStringValue = (newValue: Record<string, unknown>): void => {
|
||||
const queryMap = new Map(Object.entries(stateToUrlParams));
|
||||
const query = get(newValue, "query", "") as string[];
|
||||
const route = window.location.hash.replace("#", "");
|
||||
const params = stateToUrlParams[route] || {};
|
||||
const queryMap = new Map(Object.entries(params));
|
||||
const isGraphRoute = route === router.home || route === router.dashboards;
|
||||
const newQsValue = isGraphRoute ? getGraphQsValue(newValue, queryMap) : getQsValue(newValue, queryMap);
|
||||
setQueryStringWithoutPageReload(newQsValue.join("&"));
|
||||
};
|
||||
|
||||
const getGraphQsValue = (newValue: Record<string, unknown>, queryMap: Map<string, string>): string[] => {
|
||||
const query = get(newValue, "query", []) as string[];
|
||||
const newQsValue: string[] = [];
|
||||
query.forEach((q, i) => {
|
||||
queryMap.forEach((queryKey, stateKey) => {
|
||||
|
@ -52,7 +67,20 @@ export const setQueryStringValue = (newValue: Record<string, unknown>): void =>
|
|||
newQsValue.push(`g${i}.expr=${encodeURIComponent(q)}`);
|
||||
});
|
||||
|
||||
setQueryStringWithoutPageReload(newQsValue.join("&"));
|
||||
return newQsValue;
|
||||
};
|
||||
|
||||
const getQsValue = (newValue: Record<string, unknown>, queryMap: Map<string, string>): string[] => {
|
||||
const newQsValue: string[] = [];
|
||||
queryMap.forEach((queryKey, stateKey) => {
|
||||
const value = get(newValue, stateKey, "") as string;
|
||||
if (value) {
|
||||
const valueEncoded = encodeURIComponent(value);
|
||||
newQsValue.push(`${queryKey}=${valueEncoded}`);
|
||||
}
|
||||
});
|
||||
|
||||
return newQsValue;
|
||||
};
|
||||
|
||||
export const getQueryStringValue = (
|
||||
|
|
|
@ -34,7 +34,7 @@ app-via-docker: package-builder
|
|||
--env GO111MODULE=on \
|
||||
$(DOCKER_OPTS) \
|
||||
$(BUILDER_IMAGE) \
|
||||
go build $(RACE) -mod=vendor -trimpath \
|
||||
go build $(RACE) -mod=vendor -trimpath -buildvcs=false \
|
||||
-ldflags "-extldflags '-static' $(GO_BUILDINFO)" \
|
||||
-tags 'netgo osusergo nethttpomithttp2 musl' \
|
||||
-o bin/$(APP_NAME)$(APP_SUFFIX)-prod $(PKG_PREFIX)/app/$(APP_NAME)
|
||||
|
@ -50,7 +50,7 @@ app-via-docker-windows: package-builder
|
|||
--env GO111MODULE=on \
|
||||
$(DOCKER_OPTS) \
|
||||
$(BUILDER_IMAGE) \
|
||||
go build $(RACE) -mod=vendor -trimpath \
|
||||
go build $(RACE) -mod=vendor -trimpath -buildvcs=false \
|
||||
-ldflags "-s -w -extldflags '-static' $(GO_BUILDINFO)" \
|
||||
-tags 'netgo osusergo nethttpomithttp2' \
|
||||
-o bin/$(APP_NAME)-windows$(APP_SUFFIX)-prod.exe $(PKG_PREFIX)/app/$(APP_NAME)
|
||||
|
|
|
@ -15,6 +15,19 @@ The following tip changes can be tested by building VictoriaMetrics components f
|
|||
|
||||
## tip
|
||||
|
||||
* FEATURE: [vmalert](https://docs.victoriametrics.com/vmalert.html): support [reusable templates](https://prometheus.io/docs/prometheus/latest/configuration/template_examples/#defining-reusable-templates) for rules annotations. The path to the template files can be specified via `-rule.templates` flag. See more about this feature [here](https://docs.victoriametrics.com/vmalert.html#reusable-templates). Thanks to @AndrewChubatiuk for [the pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2532). See [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2510).
|
||||
* FEATURE: [vmctl](https://docs.victoriametrics.com/vmctl.html): add `influx-prometheus-mode` command-line flag, which allows to restore the original time series written from Prometheus into InfluxDB during data migration from InfluxDB to VictoriaMetrics. See [this feature request](https://github.com/VictoriaMetrics/vmctl/issues/8). Thanks to @mback2k for [the pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2545).
|
||||
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add ability to specify AWS service name when issuing requests to AWS api. See [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2605). Thanks to @transacid for [the pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2604).
|
||||
|
||||
* BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert.html): support `scalar` result type in response. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2607).
|
||||
* BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert.html): support strings in `humanize.*` template function in the same way as Prometheus does. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2569).
|
||||
* BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert.html): proxy `/rules` requests to vmalert from Grafana's alerting UI. This removes errors in Grafana's UI for Grafana versions older than `8.5.*`. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2583)
|
||||
* BUGFIX: [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html): do not return values from [label_value()](https://docs.victoriametrics.com/MetricsQL.html#label_value) function if the original time series has no values at the selected timestamps.
|
||||
* BUGFIX: [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html): limit the number of concurrently established connections from vmselect to vmstorage. This should prevent from potentially high spikes in the number of established connections after temporary slowdown in connection handshake procedure between vmselect and vmstorage because of spikes in workload. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2552).
|
||||
* BUGFIX: [vmctl](https://docs.victoriametrics.com/vmctl.html): fix build for Solaris / SmartOS. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1322#issuecomment-1120276146).
|
||||
* BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert.html): do not add `/api/v1/query` suffix to `-datasource.url` if `-remoteRead.disablePathAppend` command-line flag is set. Previously this flag was applied only to `-remoteRead.url`, which could confuse users.
|
||||
* BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert.html): prevent from possible resource leak on config update, which could lead to the slowdown of `vmalert` over time. See [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2577).
|
||||
|
||||
## [v1.77.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.77.1)
|
||||
|
||||
Released at 07-05-2022
|
||||
|
@ -44,7 +57,7 @@ Released at 05-05-2022
|
|||
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add ability to attach node-level labels and annotations to discovered Kubernetes pod targets in the same way as Prometheus 2.35 does. See [this feature request](https://github.com/prometheus/prometheus/issues/9510) and [this pull request](https://github.com/prometheus/prometheus/pull/10080).
|
||||
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add support for `tls_config` and `proxy_url` options at `oauth2` section in the same way as Prometheus does. See [oauth2 docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#oauth2).
|
||||
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add support for `min_version` option at `tls_config` section in the same way as Prometheus does. See [tls_config docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config).
|
||||
* FEATURE: [vmagent](): expose `vmagent_remotewrite_rate_limit` metric at `http://vmagent:8429/metrics`, which can be used for alerting rules such as `rate(vmagent_remotewrite_conn_bytes_written_total) / vmagent_remotewrite_rate_limit > 0.8` when `-remoteWrite.rateLimit` command-line flag is set. See [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2521).
|
||||
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): expose `vmagent_remotewrite_rate_limit` metric at `http://vmagent:8429/metrics`, which can be used for alerting rules such as `rate(vmagent_remotewrite_conn_bytes_written_total) / vmagent_remotewrite_rate_limit > 0.8` when `-remoteWrite.rateLimit` command-line flag is set. See [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2521).
|
||||
* FEATURE: [vmalert](https://docs.victoriametrics.com/vmalert.html): add support for DNS-based discovery for notifiers in the same way as Prometheus does (aka `dns_sd_configs`). See [these docs](https://docs.victoriametrics.com/vmalert.html#notifier-configuration-file) and [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2460).
|
||||
* FEATURE: [vmalert](https://docs.victoriametrics.com/vmalert.html): add `-replay.disableProgressBar` command-line flag, which allows disabling progressbar in [rules' backfilling mode](https://docs.victoriametrics.com/vmalert.html#rules-backfilling). See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1761).
|
||||
* FEATURE: allow specifying TLS cipher suites for incoming https requests via `-tlsCipherSuites` command-line flag. See [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2404).
|
||||
|
|
|
@ -326,12 +326,30 @@ Some capacity planning tips for VictoriaMetrics cluster:
|
|||
|
||||
- The [replication](#replication-and-data-safety) increases the amounts of needed resources for the cluster by up to `N` times where `N` is replication factor. This is because `vminsert` stores `N` copies of every ingested sample on distinct `vmstorage` nodes. These copies are de-duplicated by `vmselect` during querying. The most cost-efficient and performant solution for data durability is to rely on replicated durable persistent disks such as [Google Compute persistent disks](https://cloud.google.com/compute/docs/disks#pdspecs) instead of using the [replication at VictoriaMetrics level](#replication-and-data-safety).
|
||||
- It is recommended to run a cluster with big number of small `vmstorage` nodes instead of a cluster with small number of big `vmstorage` nodes. This increases chances that the cluster remains available and stable when some of `vmstorage` nodes are temporarily unavailable during maintenance events such as upgrades, configuration changes or migrations. For example, when a cluster contains 10 `vmstorage` nodes and a single node becomes temporarily unavailable, then the workload on the remaining 9 nodes increases by `1/9=11%`. When a cluster contains 3 `vmstorage` nodes and a single node becomes temporarily unavailable, then the workload on the remaining 2 nodes increases by `1/2=50%`. The remaining `vmstorage` nodes may have no enough free capacity for handling the increased workload. In this case the cluster may become overloaded, which may result to decreased availability and stability.
|
||||
- Cluster capacity for [active time series](https://docs.victoriametrics.com/FAQ.html#what-is-an-active-time-series) can be increased by increasing RAM and CPU resources per each `vmstorage` node or by by adding new `vmstorage` nodes.
|
||||
- Cluster capacity for [active time series](https://docs.victoriametrics.com/FAQ.html#what-is-an-active-time-series) can be increased by increasing RAM and CPU resources per each `vmstorage` node or by adding new `vmstorage` nodes.
|
||||
- Query latency can be reduced by increasing CPU resources per each `vmselect` node, since each incoming query is processed by a single `vmselect` node. Performance for heavy queries scales with the number of available CPU cores at `vmselect` node, since `vmselect` processes time series referred by the query on all the available CPU cores.
|
||||
- If the cluster needs to process incoming queries at a high rate, then its capacity can be increased by adding more `vmselect` nodes, so incoming queries could be spread among bigger number of `vmselect` nodes.
|
||||
- By default `vminsert` compresses the data it sends to `vmstorage` in order to reduce network bandwidth usage. The compression takes additional CPU resources at `vminsert`. If `vminsert` nodes have limited CPU, then the compression can be disabled by passing `-rpc.disableCompression` command-line flag at `vminsert` nodes.
|
||||
- By default `vmstorage` compresses the data it sends to `vmselect` during queries in order to reduce network bandwidth usage. The compression takes additional CPU resources at `vmstorage`. If `vmstorage` nodes have limited CPU, then the compression can be disabled by passing `-rpc.disableCompression` command-line flag at `vmstorage` nodes.
|
||||
|
||||
See also [resource usage limits docs](#resource-usage-limits).
|
||||
|
||||
## Resource usage limits
|
||||
|
||||
By default cluster components of VictoriaMetrics are tuned for an optimal resource usage under typical workloads. Some workloads may need fine-grained resource usage limits. In these cases the following command-line flags may be useful:
|
||||
|
||||
- `-memory.allowedPercent` and `-search.allowedBytes` limit the amounts of memory, which may be used for various internal caches at all the cluster components of VictoriaMetrics - `vminsert`, `vmselect` and `vmstorage`. Note that VictoriaMetrics components may use more memory, since these flags don't limit additional memory, which may be needed on a per-query basis.
|
||||
- `-search.maxUniqueTimeseries` at `vmselect` component limits the number of unique time series a single query can find and process. `vmselect` passes the limit to `vmstorage` component, which keeps in memory some metainformation about the time series located by each query and spends some CPU time for processing the found time series. This means that the maximum memory usage and CPU usage a single query can use at `vmstorage` is proportional to `-search.maxUniqueTimeseries`.
|
||||
- `-search.maxQueryDuration` at `vmselect` limits the duration of a single query. If the query takes longer than the given duration, then it is canceled. This allows saving CPU and RAM at `vmselect` and `vmstorage` when executing unexpected heavy queries.
|
||||
- `-search.maxConcurrentRequests` at `vmselect` limits the number of concurrent requests a single `vmselect` node can process. Bigger number of concurrent requests usually means bigger memory usage at both `vmselect` and `vmstorage`. For example, if a single query needs 100 MiB of additional memory during its execution, then 100 concurrent queries may need `100 * 100 MiB = 10 GiB` of additional memory. So it is better to limit the number of concurrent queries, while suspending additional incoming queries if the concurrency limit is reached. `vmselect` provides `-search.maxQueueDuration` command-line flag for limiting the max wait time for suspended queries.
|
||||
- `-search.maxSamplesPerSeries` at `vmselect` limits the number of raw samples the query can process per each time series. `vmselect` sequentially processes raw samples per each found time series during the query. It unpacks raw samples on the selected time range per each time series into memory and then applies the given [rollup function](https://docs.victoriametrics.com/MetricsQL.html#rollup-functions). The `-search.maxSamplesPerSeries` command-line flag allows limiting memory usage at `vmselect` in the case when the query is executed on a time range, which contains hundreds of millions of raw samples per each located time series.
|
||||
- `-search.maxSamplesPerQuery` at `vmselect` limits the number of raw samples a single query can process. This allows limiting CPU usage at `vmselect` for heavy queries.
|
||||
- `-search.maxSeries` at `vmselect` limits the number of time series, which may be returned from [/api/v1/series](https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers). This endpoint is used mostly by Grafana for auto-completion of metric names, label names and label values. Queries to this endpoint may take big amounts of CPU time and memory at `vmstorage` and `vmselect` when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxSeries` to quite low value in order limit CPU and memory usage.
|
||||
- `-search.maxTagKeys` at `vmselect` limits the number of items, which may be returned from [/api/v1/labels](https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names). This endpoint is used mostly by Grafana for auto-completion of label names. Queries to this endpoint may take big amounts of CPU time and memory at `vmstorage` and `vmselect` when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxTagKeys` to quite low value in order to limit CPU and memory usage.
|
||||
- `-search.maxTagValues` at `vmselect` limits the number of items, which may be returned from [/api/v1/label/.../values](https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values). This endpoint is used mostly by Grafana for auto-completion of label values. Queries to this endpoint may take big amounts of CPU time and memory at `vmstorage` and `vmselect` when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxTagValues` to quite low value in order to limit CPU and memory usage.
|
||||
|
||||
See also [capacity planning docs](#capacity-planning).
|
||||
|
||||
## High availability
|
||||
|
||||
The database is considered highly available if it continues accepting new data and processing incoming queries when some of its components are temporarily unavailable.
|
||||
|
@ -545,7 +563,7 @@ Below is the output for `/path/to/vminsert -help`:
|
|||
-influxDBLabel string
|
||||
Default label for the DB name sent over '?db={db_name}' query parameter (default "db")
|
||||
-influxListenAddr string
|
||||
TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write
|
||||
TCP and UDP address to listen for InfluxDB line protocol data. Usually :8089 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write
|
||||
-influxMeasurementFieldSeparator string
|
||||
Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_")
|
||||
-influxSkipMeasurement
|
||||
|
|
|
@ -250,8 +250,8 @@ All the VictoriaMetrics components provide command-line flags to control the siz
|
|||
|
||||
Memory usage for VictoriaMetrics components can be tuned according to the following docs:
|
||||
|
||||
* [Capacity planning for single-node VictoriaMetrics](https://docs.victoriametrics.com/#capacity-planning)
|
||||
* [Capacity planning for cluster VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#capacity-planning)
|
||||
* [Resource usage limits for single-node VictoriaMetrics](https://docs.victoriametrics.com/#resource-usage-limits)
|
||||
* [Resource usage limits for cluster VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#resource-usage-limits)
|
||||
* [Troubleshooting for vmagent](https://docs.victoriametrics.com/vmagent.html#troubleshooting)
|
||||
* [Troubleshooting for single-node VictoriaMetrics](https://docs.victoriametrics.com/#troubleshooting)
|
||||
|
||||
|
|
11
docs/Makefile
Normal file
11
docs/Makefile
Normal file
|
@ -0,0 +1,11 @@
|
|||
docs-install:
|
||||
gem install jekyll bundler
|
||||
bundle install --gemfile=Gemfile
|
||||
|
||||
# run local server for documentation website
|
||||
# at http://127.0.0.1:4000/
|
||||
# On first use, please run `make docs-install`
|
||||
docs-up:
|
||||
JEKYLL_GITHUB_TOKEN=blank PAGES_API_URL=http://0.0.0.0 bundle exec \
|
||||
--gemfile=Gemfile \
|
||||
jekyll server --livereload
|
|
@ -60,6 +60,7 @@ to prevent limits exhaustion.
|
|||
|
||||
Here is an alert example for high churn rate by the tenant:
|
||||
|
||||
{% raw %}
|
||||
```yaml
|
||||
|
||||
- alert: TooHighChurnRate
|
||||
|
@ -79,3 +80,4 @@ Here is an alert example for high churn rate by the tenant:
|
|||
High Churn Rate is tightly connected with database performance and may
|
||||
result in unexpected OOM's or slow queries."
|
||||
```
|
||||
{% endraw %}
|
||||
|
|
|
@ -4,74 +4,169 @@ sort: 13
|
|||
|
||||
# Quick start
|
||||
|
||||
## Installation
|
||||
## How to install
|
||||
|
||||
VictoriaMetrics is distributed in two forms:
|
||||
* [Single-server-VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html) - all-in-one
|
||||
binary, which is very easy to use and maintain.
|
||||
Single-server-VictoriaMetrics perfectly scales vertically and easily handles millions of metrics/s;
|
||||
* [VictoriaMetrics Cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) - set of components
|
||||
for building horizontally scalable clusters.
|
||||
|
||||
Single-server-VictoriaMetrics VictoriaMetrics is available as:
|
||||
|
||||
* [Managed VictoriaMetrics at AWS](https://aws.amazon.com/marketplace/pp/prodview-4tbfq5icmbmyc)
|
||||
* [Docker images](https://hub.docker.com/r/victoriametrics/victoria-metrics/)
|
||||
* [Docker images](https://hub.docker.com/r/victoriametrics/victoria-metrics/)
|
||||
* [Snap packages](https://snapcraft.io/victoriametrics)
|
||||
* [Helm Charts](https://github.com/VictoriaMetrics/helm-charts#list-of-charts)
|
||||
* [Binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
|
||||
* [Source code](https://github.com/VictoriaMetrics/VictoriaMetrics). See [How to build from sources](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-build-from-sources)
|
||||
* [Source code](https://github.com/VictoriaMetrics/VictoriaMetrics).
|
||||
See [How to build from sources](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-build-from-sources)
|
||||
* [VictoriaMetrics on Linode](https://www.linode.com/marketplace/apps/victoriametrics/victoriametrics/)
|
||||
* [VictoriaMetrics on DigitalOcean](https://marketplace.digitalocean.com/apps/victoriametrics-single)
|
||||
|
||||
Just download VictoriaMetrics and follow [these instructions](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-start-victoriametrics).
|
||||
Then read [Prometheus setup](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#prometheus-setup) and [Grafana setup](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#grafana-setup) docs.
|
||||
Just download VictoriaMetrics and follow
|
||||
[these instructions](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-start-victoriametrics).
|
||||
Then read [Prometheus setup](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#prometheus-setup)
|
||||
and [Grafana setup](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#grafana-setup) docs.
|
||||
|
||||
### Starting VM-Signle via Docker:
|
||||
|
||||
The following commands download the latest available [Docker image of VictoriaMetrics](https://hub.docker.com/r/victoriametrics/victoria-metrics) and start it at port 8428, while storing the ingested data at `victoria-metrics-data` subdirectory under the current directory:
|
||||
### Starting VM-Single via Docker
|
||||
|
||||
The following commands download the latest available
|
||||
[Docker image of VictoriaMetrics](https://hub.docker.com/r/victoriametrics/victoria-metrics)
|
||||
and start it at port 8428, while storing the ingested data at `victoria-metrics-data` subdirectory
|
||||
under the current directory:
|
||||
|
||||
<div class="with-copy" markdown="1">
|
||||
|
||||
```bash
|
||||
docker pull victoriametrics/victoria-metrics:latest
|
||||
docker run -it --rm -v `pwd`/victoria-metrics-data:/victoria-metrics-data -p 8428:8428 victoriametrics/victoria-metrics:latest
|
||||
```
|
||||
|
||||
Open `http://localhost:8428` in web browser and read [these docs](https://docs.victoriametrics.com/#operation).
|
||||
</div>
|
||||
|
||||
There are also the following versions of VictoriaMetrics available:
|
||||
Open <a href="http://localhost:8428">http://localhost:8428</a> in web browser
|
||||
and read [these docs](https://docs.victoriametrics.com/#operation).
|
||||
|
||||
* [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) - horizontally scalable VictoriaMetrics, which scales to multiple nodes.
|
||||
There is also [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html)
|
||||
- horizontally scalable installation, which scales to multiple nodes.
|
||||
|
||||
### Starting VM-Cluster via Docker:
|
||||
### Starting VM-Cluster via Docker
|
||||
|
||||
The following commands clone the latest available [VictoriaMetrics cluster repository](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster) and start the docker container via 'docker-compose'. Further customization is possible by editing the [docker-compose.yaml](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/deployment/docker/docker-compose.yml) file.
|
||||
The following commands clone the latest available
|
||||
[VictoriaMetrics cluster repository](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster)
|
||||
and start the docker container via 'docker-compose'. Further customization is possible by editing
|
||||
the [docker-compose.yaml](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/deployment/docker/docker-compose.yml)
|
||||
file.
|
||||
|
||||
<div class="with-copy" markdown="1">
|
||||
|
||||
```bash
|
||||
git clone https://github.com/VictoriaMetrics/VictoriaMetrics --branch cluster && cd VictoriaMetrics/deployment/docker && docker-compose up
|
||||
git clone https://github.com/VictoriaMetrics/VictoriaMetrics --branch cluster &&
|
||||
cd VictoriaMetrics/deployment/docker &&
|
||||
docker-compose up
|
||||
```
|
||||
|
||||
</div>
|
||||
|
||||
* [Cluster setup](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#cluster-setup)
|
||||
|
||||
## Writing data
|
||||
## Write data
|
||||
|
||||
Data can be written to VictoriaMetrics in the following ways:
|
||||
There are two main models in monitoring for data collection:
|
||||
[push](https://docs.victoriametrics.com/keyConcepts.html#push-model)
|
||||
and [pull](https://docs.victoriametrics.com/keyConcepts.html#pull-model).
|
||||
Both are used in modern monitoring and both are supported by VictoriaMetrics.
|
||||
|
||||
* [DataDog agent](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-datadog-agent)
|
||||
* [InfluxDB-compatible agents such as Telegraf](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf)
|
||||
* [Graphite-compatible agents such as StatsD](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-graphite-compatible-agents-such-as-statsd)
|
||||
* [OpenTSDB-compatible agents](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-opentsdb-compatible-agents)
|
||||
* [Prometheus remote_write API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write)
|
||||
* [In JSON line format](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-data-in-json-line-format)
|
||||
* [Imported in CSV format](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-csv-data)
|
||||
* [Imported in Prometheus exposition format](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-data-in-prometheus-exposition-format)
|
||||
* `/api/v1/import` for importing data obtained from [/api/v1/export](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-export-data-in-json-line-format).
|
||||
See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-data-in-json-line-format) for details.
|
||||
See more details on [writing data here](https://docs.victoriametrics.com/keyConcepts.html#write-data).
|
||||
|
||||
## Reading data
|
||||
|
||||
VictoriaMetrics various APIs for reading the data. [This document briefly describes these APIs](https://docs.victoriametrics.com/url-examples.html).
|
||||
## Query data
|
||||
|
||||
### Grafana setup:
|
||||
VictoriaMetrics provides an
|
||||
[HTTP API](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#prometheus-querying-api-usage)
|
||||
for serving read queries. The API is used in various integrations such as
|
||||
[Grafana](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#grafana-setup).
|
||||
The same API is also used by
|
||||
[VMUI](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#vmui) - graphical User Interface
|
||||
for querying and visualizing metrics.
|
||||
|
||||
Create [Prometheus datasource](http://docs.grafana.org/features/datasources/prometheus/) in Grafana with the following url:
|
||||
[MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) - is he query language for executing read queries
|
||||
in VictoriaMetrics. MetricsQL is a [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics)
|
||||
-like query language with a powerful set of functions and features for working specifically with time series data.
|
||||
|
||||
```url
|
||||
http://<victoriametrics-addr>:8428
|
||||
```
|
||||
See more details on [querying data here](https://docs.victoriametrics.com/keyConcepts.html#query-data)
|
||||
|
||||
Substitute `<victoriametrics-addr>` with the hostname or IP address of VictoriaMetrics.
|
||||
|
||||
Then build graphs and dashboards for the created datasource using [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/) or [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html).
|
||||
## Alerting
|
||||
|
||||
It is not possible to physically trace all changes on graphs all the time, that is why alerting exists.
|
||||
In [vmalert](https://docs.victoriametrics.com/vmalert.html) it is possible to create a set of conditions
|
||||
based on PromQL and MetricsQL queries that will send a notification when such conditions are met.
|
||||
|
||||
## Data migration
|
||||
|
||||
Migrating data from other TSDBs to VictoriaMetrics is as simple as importing data via any of
|
||||
[supported formats](https://docs.victoriametrics.com/keyConcepts.html#push-model).
|
||||
|
||||
The migration might get easier when using [vmctl](https://docs.victoriametrics.com/vmctl.html) - VictoriaMetrics
|
||||
command line tool. It supports the following databases for migration to VictoriaMetrics:
|
||||
* [Prometheus using snapshot API](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-prometheus);
|
||||
* [Thanos](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-thanos);
|
||||
* [InfluxDB](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-influxdb-1x);
|
||||
* [OpenTSDB](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-opentsdb);
|
||||
* [Migrate data between VictoriaMetrics single and cluster versions](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-victoriametrics).
|
||||
|
||||
## Productionisation
|
||||
|
||||
When going to production with VictoriaMetrics we recommend following the recommendations.
|
||||
|
||||
### Monitoring
|
||||
|
||||
Each VictoriaMetrics component emits its own metrics with various details regarding performance
|
||||
and health state. Docs for the components also contain a `Monitoring` section with an explanation
|
||||
of what and how should be monitored. For example,
|
||||
[Single-server-VictoriaMetrics Monitoring](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#monitoring).
|
||||
|
||||
VictoriaMetric team prepared a list of [Grafana dashboards](https://grafana.com/orgs/victoriametrics/dashboards)
|
||||
for the main components. Each dashboard contains a lot of useful information and tips. It is recommended
|
||||
to have these dashboards installed and up to date.
|
||||
|
||||
The list of alerts for [single](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts.yml)
|
||||
and [cluster](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/deployment/docker/alerts.yml)
|
||||
versions would also help to identify and notify about issues with the system.
|
||||
|
||||
The rule of the thumb is to have a separate installation of VictoriaMetrics or any other monitoring system
|
||||
to monitor the production installation of VictoriaMetrics. This would make monitoring independent and
|
||||
will help identify problems with the main monitoring installation.
|
||||
|
||||
|
||||
### Capacity planning
|
||||
|
||||
See capacity planning sections in [docs](https://docs.victoriametrics.com) for
|
||||
[Single-server-VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#capacity-planning).
|
||||
and [VictoriaMetrics Cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#capacity-planning).
|
||||
|
||||
Capacity planning isn't possible without [monitoring](#monitoring), so consider configuring it first.
|
||||
Understanding resource usage and performance of VictoriaMetrics also requires knowing the tech terms
|
||||
[active series](https://docs.victoriametrics.com/FAQ.html#what-is-an-active-time-series),
|
||||
[churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate),
|
||||
[cardinality](https://docs.victoriametrics.com/FAQ.html#what-is-high-cardinality),
|
||||
[slow inserts](https://docs.victoriametrics.com/FAQ.html#what-is-a-slow-insert).
|
||||
All of them are present in [Grafana dashboards](https://grafana.com/orgs/victoriametrics/dashboards).
|
||||
|
||||
|
||||
### Data safety
|
||||
|
||||
It is recommended to read [Replication and data safety](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#replication-and-data-safety),
|
||||
[Why replication doesn’t save from disaster?](https://valyala.medium.com/speeding-up-backups-for-big-time-series-databases-533c1a927883)
|
||||
and [backups](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#backups).
|
||||
|
||||
|
||||
### Configuring limits
|
||||
|
||||
To avoid excessive resource usage or performance degradation limits must be in place:
|
||||
* [Resource usage limits](https://docs.victoriametrics.com/FAQ.html#how-to-set-a-memory-limit-for-victoriametrics-components);
|
||||
* [Cardinality limiter](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cardinality-limiter).
|
|
@ -1055,6 +1055,25 @@ It is recommended leaving the following amounts of spare resources:
|
|||
* 50% of spare CPU for reducing the probability of slowdowns during temporary spikes in workload.
|
||||
* At least 30% of free storage space at the directory pointed by `-storageDataPath` command-line flag. See also `-storage.minFreeDiskSpaceBytes` command-line flag description [here](#list-of-command-line-flags).
|
||||
|
||||
See also [resource usage limits docs](#resource-usage-limits).
|
||||
|
||||
## Resource usage limits
|
||||
|
||||
By default VictoriaMetrics is tuned for an optimal resource usage under typical workloads. Some workloads may need fine-grained resource usage limits. In these cases the following command-line flags may be useful:
|
||||
|
||||
- `-memory.allowedPercent` and `-search.allowedBytes` limit the amounts of memory, which may be used for various internal caches at VictoriaMetrics. Note that VictoriaMetrics may use more memory, since these flags don't limit additional memory, which may be needed on a per-query basis.
|
||||
- `-search.maxUniqueTimeseries` limits the number of unique time series a single query can find and process. VictoriaMetrics keeps in memory some metainformation about the time series located by each query and spends some CPU time for processing the found time series. This means that the maximum memory usage and CPU usage a single query can use is proportional to `-search.maxUniqueTimeseries`.
|
||||
- `-search.maxQueryDuration` limits the duration of a single query. If the query takes longer than the given duration, then it is canceled. This allows saving CPU and RAM when executing unexpected heavy queries.
|
||||
- `-search.maxConcurrentRequests` limits the number of concurrent requests VictoriaMetrics can process. Bigger number of concurrent requests usually means bigger memory usage. For example, if a single query needs 100 MiB of additional memory during its execution, then 100 concurrent queries may need `100 * 100 MiB = 10 GiB` of additional memory. So it is better to limit the number of concurrent queries, while suspending additional incoming queries if the concurrency limit is reached. VictoriaMetrics provides `-search.maxQueueDuration` command-line flag for limiting the max wait time for suspended queries.
|
||||
- `-search.maxSamplesPerSeries` limits the number of raw samples the query can process per each time series. VictoriaMetrics sequentially processes raw samples per each found time series during the query. It unpacks raw samples on the selected time range per each time series into memory and then applies the given [rollup function](https://docs.victoriametrics.com/MetricsQL.html#rollup-functions). The `-search.maxSamplesPerSeries` command-line flag allows limiting memory usage in the case when the query is executed on a time range, which contains hundreds of millions of raw samples per each located time series.
|
||||
- `-search.maxSamplesPerQuery` limits the number of raw samples a single query can process. This allows limiting CPU usage for heavy queries.
|
||||
- `-search.maxSeries` limits the number of time series, which may be returned from [/api/v1/series](https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers). This endpoint is used mostly by Grafana for auto-completion of metric names, label names and label values. Queries to this endpoint may take big amounts of CPU time and memory when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxSeries` to quite low value in order limit CPU and memory usage.
|
||||
- `-search.maxTagKeys` limits the number of items, which may be returned from [/api/v1/labels](https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names). This endpoint is used mostly by Grafana for auto-completion of label names. Queries to this endpoint may take big amounts of CPU time and memory when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxTagKeys` to quite low value in order to limit CPU and memory usage.
|
||||
- `-search.maxTagValues` limits the number of items, which may be returned from [/api/v1/label/.../values](https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values). This endpoint is used mostly by Grafana for auto-completion of label values. Queries to this endpoint may take big amounts of CPU time and memory when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxTagValues` to quite low value in order to limit CPU and memory usage.
|
||||
|
||||
See also [capacity planning docs](#capacity-planning).
|
||||
|
||||
|
||||
## High availability
|
||||
|
||||
* Install multiple VictoriaMetrics instances in distinct datacenters (availability zones).
|
||||
|
@ -1682,7 +1701,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
|
|||
-influxDBLabel string
|
||||
Default label for the DB name sent over '?db={db_name}' query parameter (default "db")
|
||||
-influxListenAddr string
|
||||
TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write
|
||||
TCP and UDP address to listen for InfluxDB line protocol data. Usually :8089 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write
|
||||
-influxMeasurementFieldSeparator string
|
||||
Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_")
|
||||
-influxSkipMeasurement
|
||||
|
@ -1745,7 +1764,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
|
|||
-promscrape.cluster.membersCount int
|
||||
The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets
|
||||
-promscrape.cluster.replicationFactor int
|
||||
The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 2, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1)
|
||||
The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 1, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1)
|
||||
-promscrape.config string
|
||||
Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details
|
||||
-promscrape.config.dryRun
|
||||
|
|
|
@ -52,7 +52,7 @@ The helm chart repository [https://github.com/VictoriaMetrics/helm-charts/](http
|
|||
3. Update `vmauth` chart version in [`values.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-auth/values.yaml) and [`Chart.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-auth/Chart.yaml)
|
||||
4. Update `cluster` chart versions in [`values.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-cluster/values.yaml), bump version for `vmselect`, `vminsert` and `vmstorage` and [`Chart.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-cluster/Chart.yaml)
|
||||
5. Update `k8s-stack` chart versions in [`values.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-k8s-stack/values.yaml), bump version for `vmselect`, `vminsert`, `vmstorage`, `vmsingle`, `vmalert`, `vmagent` and [`Chart.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-k8s-stack/Chart.yaml)
|
||||
6. Update `signle` chart version in [`values.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-single/values.yaml) and [`Chart.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-single/Chart.yaml)
|
||||
6. Update `single-node` chart version in [`values.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-single/values.yaml) and [`Chart.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-single/Chart.yaml)
|
||||
8. Run `make gen-doc`
|
||||
9. Run `make package` that creates or updates zip file with the packed chart
|
||||
10. Run `make merge`. It creates or updates metadata for charts in index.yaml
|
||||
|
|
|
@ -1059,6 +1059,25 @@ It is recommended leaving the following amounts of spare resources:
|
|||
* 50% of spare CPU for reducing the probability of slowdowns during temporary spikes in workload.
|
||||
* At least 30% of free storage space at the directory pointed by `-storageDataPath` command-line flag. See also `-storage.minFreeDiskSpaceBytes` command-line flag description [here](#list-of-command-line-flags).
|
||||
|
||||
See also [resource usage limits docs](#resource-usage-limits).
|
||||
|
||||
## Resource usage limits
|
||||
|
||||
By default VictoriaMetrics is tuned for an optimal resource usage under typical workloads. Some workloads may need fine-grained resource usage limits. In these cases the following command-line flags may be useful:
|
||||
|
||||
- `-memory.allowedPercent` and `-search.allowedBytes` limit the amounts of memory, which may be used for various internal caches at VictoriaMetrics. Note that VictoriaMetrics may use more memory, since these flags don't limit additional memory, which may be needed on a per-query basis.
|
||||
- `-search.maxUniqueTimeseries` limits the number of unique time series a single query can find and process. VictoriaMetrics keeps in memory some metainformation about the time series located by each query and spends some CPU time for processing the found time series. This means that the maximum memory usage and CPU usage a single query can use is proportional to `-search.maxUniqueTimeseries`.
|
||||
- `-search.maxQueryDuration` limits the duration of a single query. If the query takes longer than the given duration, then it is canceled. This allows saving CPU and RAM when executing unexpected heavy queries.
|
||||
- `-search.maxConcurrentRequests` limits the number of concurrent requests VictoriaMetrics can process. Bigger number of concurrent requests usually means bigger memory usage. For example, if a single query needs 100 MiB of additional memory during its execution, then 100 concurrent queries may need `100 * 100 MiB = 10 GiB` of additional memory. So it is better to limit the number of concurrent queries, while suspending additional incoming queries if the concurrency limit is reached. VictoriaMetrics provides `-search.maxQueueDuration` command-line flag for limiting the max wait time for suspended queries.
|
||||
- `-search.maxSamplesPerSeries` limits the number of raw samples the query can process per each time series. VictoriaMetrics sequentially processes raw samples per each found time series during the query. It unpacks raw samples on the selected time range per each time series into memory and then applies the given [rollup function](https://docs.victoriametrics.com/MetricsQL.html#rollup-functions). The `-search.maxSamplesPerSeries` command-line flag allows limiting memory usage in the case when the query is executed on a time range, which contains hundreds of millions of raw samples per each located time series.
|
||||
- `-search.maxSamplesPerQuery` limits the number of raw samples a single query can process. This allows limiting CPU usage for heavy queries.
|
||||
- `-search.maxSeries` limits the number of time series, which may be returned from [/api/v1/series](https://prometheus.io/docs/prometheus/latest/querying/api/#finding-series-by-label-matchers). This endpoint is used mostly by Grafana for auto-completion of metric names, label names and label values. Queries to this endpoint may take big amounts of CPU time and memory when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxSeries` to quite low value in order limit CPU and memory usage.
|
||||
- `-search.maxTagKeys` limits the number of items, which may be returned from [/api/v1/labels](https://prometheus.io/docs/prometheus/latest/querying/api/#getting-label-names). This endpoint is used mostly by Grafana for auto-completion of label names. Queries to this endpoint may take big amounts of CPU time and memory when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxTagKeys` to quite low value in order to limit CPU and memory usage.
|
||||
- `-search.maxTagValues` limits the number of items, which may be returned from [/api/v1/label/.../values](https://prometheus.io/docs/prometheus/latest/querying/api/#querying-label-values). This endpoint is used mostly by Grafana for auto-completion of label values. Queries to this endpoint may take big amounts of CPU time and memory when the database contains big number of unique time series because of [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). In this case it might be useful to set the `-search.maxTagValues` to quite low value in order to limit CPU and memory usage.
|
||||
|
||||
See also [capacity planning docs](#capacity-planning).
|
||||
|
||||
|
||||
## High availability
|
||||
|
||||
* Install multiple VictoriaMetrics instances in distinct datacenters (availability zones).
|
||||
|
@ -1686,7 +1705,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
|
|||
-influxDBLabel string
|
||||
Default label for the DB name sent over '?db={db_name}' query parameter (default "db")
|
||||
-influxListenAddr string
|
||||
TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write
|
||||
TCP and UDP address to listen for InfluxDB line protocol data. Usually :8089 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://<victoriametrics>:8428/write
|
||||
-influxMeasurementFieldSeparator string
|
||||
Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_")
|
||||
-influxSkipMeasurement
|
||||
|
@ -1749,7 +1768,7 @@ Pass `-help` to VictoriaMetrics in order to see the list of supported command-li
|
|||
-promscrape.cluster.membersCount int
|
||||
The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets
|
||||
-promscrape.cluster.replicationFactor int
|
||||
The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 2, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1)
|
||||
The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 1, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1)
|
||||
-promscrape.config string
|
||||
Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details
|
||||
-promscrape.config.dryRun
|
||||
|
|
5
docs/_includes/img.html
Normal file
5
docs/_includes/img.html
Normal file
|
@ -0,0 +1,5 @@
|
|||
<p style="text-align: center">
|
||||
<a href="{{ include.href }}" target="_blank">
|
||||
<img src="{{ include.href }}">
|
||||
</a>
|
||||
</p>
|
BIN
docs/guides/migrate-from-influx-data-sample-in-influx.png
Normal file
BIN
docs/guides/migrate-from-influx-data-sample-in-influx.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 25 KiB |
BIN
docs/guides/migrate-from-influx-data-sample-in-vm.png
Normal file
BIN
docs/guides/migrate-from-influx-data-sample-in-vm.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 21 KiB |
BIN
docs/guides/migrate-from-influx-vmui.png
Normal file
BIN
docs/guides/migrate-from-influx-vmui.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 124 KiB |
270
docs/guides/migrate-from-influx.md
Normal file
270
docs/guides/migrate-from-influx.md
Normal file
|
@ -0,0 +1,270 @@
|
|||
# Migrate from InfluxDB to VictoriaMetrics
|
||||
|
||||
InfluxDB is a well-known time series database built for
|
||||
[IoT](https://en.wikipedia.org/wiki/Internet_of_things) monitoring, Application Performance Monitoring (APM) and
|
||||
analytics. It has its query language, unique data model, and rich tooling for collecting and processing metrics.
|
||||
|
||||
Nowadays, the volume of time series data grows constantly, as well as requirements for durable time series storage. And
|
||||
sometimes old known solutions just can't keep up with the new expectations.
|
||||
|
||||
VictoriaMetrics is a high-performance opensource time series database specifically designed to deal with huge volumes of
|
||||
monitoring data while remaining cost-efficient at the same time. Many companies are choosing to migrate from InfluxDB to
|
||||
VictoriaMetrics specifically for performance and scalability reasons. Along them see case studies provided by
|
||||
[ARNES](https://docs.victoriametrics.com/CaseStudies.html#arnes)
|
||||
and [Brandwatch](https://docs.victoriametrics.com/CaseStudies.html#brandwatch).
|
||||
|
||||
This guide will cover the differences between two solutions, most commonly asked questions, and approaches for migrating
|
||||
from InfluxDB to VictoriaMetrics.
|
||||
|
||||
## Data model differences
|
||||
|
||||
While readers are likely familiar
|
||||
with [InfluxDB key concepts](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/), the data model of
|
||||
VictoriaMetrics is something [new to explore](https://docs.victoriametrics.com/keyConcepts.html#data-model). Let's start
|
||||
with similarities and differences:
|
||||
|
||||
* both solutions are **schemaless**, which means there is no need to define metrics or their tags in advance;
|
||||
* multi-dimensional data support is implemented
|
||||
via [tags](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#tags)
|
||||
in InfluxDB and via [labels](https://docs.victoriametrics.com/keyConcepts.html#structure-of-a-metric) in
|
||||
VictoriaMetrics. However, labels in VictoriaMetrics are always `strings`, while InfluxDB supports multiple data types;
|
||||
* timestamps are stored with nanosecond resolution in InfluxDB, while in VictoriaMetrics it is **milliseconds**;
|
||||
* in VictoriaMetrics metric's value is always `float64`, while InfluxDB supports multiple data types.
|
||||
* there are
|
||||
no [measurements](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#measurement)
|
||||
or [fields](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#field-key) in
|
||||
VictoriaMetrics, metric name contains it all. If measurement contains more than 1 field, then for VictoriaMetrics
|
||||
it will be multiple metrics;
|
||||
* there are no [buckets](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#bucket)
|
||||
or [organizations](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#organization), all
|
||||
data in VictoriaMetrics is stored in a global namespace or within
|
||||
a [tenant](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy).
|
||||
|
||||
Let's consider the
|
||||
following [sample data](https://docs.influxdata.com/influxdb/v2.2/reference/key-concepts/data-elements/#sample-data)
|
||||
borrowed from InfluxDB docs as an example:
|
||||
|
||||
| _measurement | _field | location | scientist | _value | _time |
|
||||
|--------------|--------|----------|-------------|--------|----------------------|
|
||||
| census | bees | klamath | anderson | 23 | 2019-08-18T00:00:00Z |
|
||||
| census | ants | portland | mullen | 30 | 2019-08-18T00:00:00Z |
|
||||
| census | bees | klamath | anderson | 28 | 2019-08-18T00:06:00Z |
|
||||
| census | ants | portland | mullen | 32 | 2019-08-18T00:06:00Z |
|
||||
|
||||
In VictoriaMetrics data model this sample will have the following form:
|
||||
|
||||
| metric name | labels | value | time |
|
||||
|-------------|:---------------------------------------------|-------|----------------------|
|
||||
| census_bees | {location="klamath", scientist="anderson"} | 23 | 2019-08-18T00:00:00Z |
|
||||
| census_ants | {location="portland", scientist="mullen"} | 30 | 2019-08-18T00:00:00Z |
|
||||
| census_bees | {location="klamath", scientist="anderson"} | 28 | 2019-08-18T00:06:00Z |
|
||||
| census_ants | {location="portland", scientist="mullen"} | 32 | 2019-08-18T00:06:00Z |
|
||||
|
||||
Actually, metric name for VictoriaMetrics is also a label with static name `__name__`, and example above can be
|
||||
converted to `{__name__="census_bees", location="klamath", scientist="anderson"}`. All labels are indexed by
|
||||
VictoriaMetrics, so lookups by names or labels have the same query speed.
|
||||
|
||||
|
||||
## Write data
|
||||
|
||||
VictoriaMetrics
|
||||
supports [InfluxDB line protocol](https://docs.victoriametrics.com/#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf)
|
||||
for data ingestion. For example, to write a measurement to VictoriaMetrics we need to send an HTTP POST request with
|
||||
payload in a line protocol format:
|
||||
|
||||
```bash
|
||||
curl -d 'census,location=klamath,scientist=anderson bees=23 1566079200000' -X POST 'http://<victoriametric-addr>:8428/write'
|
||||
```
|
||||
|
||||
_hint: timestamp in the example might be out of configured retention for VictoriaMetrics. Consider increasing the
|
||||
retention period or changing the timestamp, if that is the case._
|
||||
|
||||
Please note, an arbitrary number of lines delimited by `\n` (aka newline char) can be sent in a single request.
|
||||
|
||||
To get the written data back let's export all series matching the `location="klamath"` filter:
|
||||
|
||||
```bash
|
||||
curl -G 'http://<victoriametric-addr>:8428/api/v1/export' -d 'match={location="klamath"}'
|
||||
```
|
||||
|
||||
The expected response is the following:
|
||||
|
||||
```json
|
||||
{
|
||||
"metric": {
|
||||
"__name__": "census_bees",
|
||||
"location": "klamath",
|
||||
"scientist": "anderson"
|
||||
},
|
||||
"values": [
|
||||
23
|
||||
],
|
||||
"timestamps": [
|
||||
1566079200000
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
Please note, VictoriaMetrics performed additional
|
||||
[data mapping](https://docs.victoriametrics.com/#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf)
|
||||
to the data ingested via InfluxDB line protocol.
|
||||
|
||||
Support of InfluxDB line protocol also means VictoriaMetrics is compatible with
|
||||
[Telegraf](https://github.com/influxdata/telegraf). To configure Telegraf, simply
|
||||
add `http://<victoriametric-addr>:8428` URL to Telegraf configs:
|
||||
|
||||
```
|
||||
[[outputs.influxdb]]
|
||||
urls = ["http://<victoriametrics-addr>:8428"]
|
||||
```
|
||||
|
||||
In addition to InfluxDB line protocol, VictoriaMetrics supports many other ways for
|
||||
[metrics collection](https://docs.victoriametrics.com/keyConcepts.html#write-data).
|
||||
|
||||
## Query data
|
||||
|
||||
VictoriaMetrics does not have a com\mand-line interface (CLI). Instead, it provides
|
||||
an [HTTP API](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#prometheus-querying-api-usage)
|
||||
for serving read queries. This API is used in various integrations such as
|
||||
[Grafana](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#grafana-setup). The same API is also used
|
||||
by [VMUI](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#vmui) - a graphical User Interface for
|
||||
querying and visualizing metrics:
|
||||
|
||||
{% include img.html href="migrate-from-influx-vmui.png" %}
|
||||
|
||||
See more about [how to query data in VictoriaMetrics](https://docs.victoriametrics.com/keyConcepts.html#query-data).
|
||||
|
||||
### Basic concepts
|
||||
|
||||
Let's take a closer look at querying specific with the following data sample:
|
||||
|
||||
```sql
|
||||
foo
|
||||
,instance=localhost bar=1.00 1652169600000000000
|
||||
foo,instance=localhost bar=2.00 1652169660000000000
|
||||
foo,instance=localhost bar=3.00 1652169720000000000
|
||||
foo,instance=localhost bar=5.00 1652169840000000000
|
||||
foo,instance=localhost bar=5.50 1652169960000000000
|
||||
foo,instance=localhost bar=5.50 1652170020000000000
|
||||
foo,instance=localhost bar=4.00 1652170080000000000
|
||||
foo,instance=localhost bar=3.50 1652170260000000000
|
||||
foo,instance=localhost bar=3.25 1652170320000000000
|
||||
foo,instance=localhost bar=3.00 1652170380000000000
|
||||
foo,instance=localhost bar=2.00 1652170440000000000
|
||||
foo,instance=localhost bar=1.00 1652170500000000000
|
||||
foo,instance=localhost bar=4.00 1652170560000000000
|
||||
```
|
||||
|
||||
The data sample consists data points for a measurement `foo`
|
||||
and a field `bar` with additional tag `instance=localhost`. If we would like plot this data as a time series in Grafana
|
||||
it might have the following look:
|
||||
|
||||
{% include img.html href="migrate-from-influx-data-sample-in-influx.png" %}
|
||||
|
||||
The query used for this panel is written in
|
||||
[InfluxQL](https://docs.influxdata.com/influxdb/v1.8/query_language/):
|
||||
|
||||
```sql
|
||||
SELECT last ("bar")
|
||||
FROM "foo"
|
||||
WHERE ("instance" = 'localhost')
|
||||
AND $timeFilter
|
||||
GROUP BY time (1m)
|
||||
```
|
||||
|
||||
Having this, let's import the same data sample in VictoriaMetrics and plot it in Grafana as well. To understand how the
|
||||
InfluxQL query might be translated to MetricsQL let's break it into components first:
|
||||
|
||||
* `SELECT last("bar") FROM "foo"` - all requests
|
||||
to [instant](https://docs.victoriametrics.com/keyConcepts.html#instant-query)
|
||||
or [range](https://docs.victoriametrics.com/keyConcepts.html#range-query) VictoriaMetrics APIs are reads, so no need
|
||||
to specify the `SELECT` statement. There are no `measurements` or `fields` in VictoriaMetrics, so the whole expression
|
||||
can be replaced with `foo_bar` in MetricsQL;
|
||||
* `WHERE ("instance" = 'localhost')`- [filtering by labels](https://docs.victoriametrics.com/keyConcepts.html#filtering)
|
||||
in MetricsQL requires specifying the filter in curly braces next to the metric name. So in MetricsQL filter expression
|
||||
will be translated to `{instance="localhost"}`;
|
||||
* `WHERE $timeFilter` - filtering by time is done via request params sent along with query, so in MetricsQL no need to
|
||||
specify this filter;
|
||||
* `GROUP BY time(1m)` - grouping by time is done by default
|
||||
in [range](https://docs.victoriametrics.com/keyConcepts.html#range-query) API according to specified `step` param.
|
||||
This param is also a part of params sent along with request. See how to perform additional
|
||||
[aggregations and grouping via MetricsQL](https://docs.victoriametrics.com/keyConcepts.html#aggregation-and-grouping-functions)
|
||||
.
|
||||
|
||||
In result, executing the `foo_bar{instance="localhost"}` MetricsQL expression with `step=1m` for the same set of data in
|
||||
Grafana will have the following form:
|
||||
|
||||
{% include img.html href="migrate-from-influx-data-sample-in-vm.png" %}
|
||||
|
||||
It is noticeable that visualizations from both databases are a bit different - VictoriaMetrics shows some extra points
|
||||
filling the gaps in the graph. This behavior is described in more
|
||||
detail [here](https://docs.victoriametrics.com/keyConcepts.html#range-query). In InfluxDB, we can achieve a similar
|
||||
behavior by adding `fill(previous)` to the query.
|
||||
|
||||
### Advanced usage
|
||||
|
||||
The good thing is that knowing the basics and some aggregation functions is often enough for using MetricsQL or PromQL.
|
||||
Let's consider one of the most popular Grafana
|
||||
dashboards [Node Exporter Full](https://grafana.com/grafana/dashboards/1860). It has almost 15 million downloads and
|
||||
about 230 PromQL queries in it! But a closer look at those queries shows the following:
|
||||
|
||||
* ~120 queries are just selecting a metric with label filters,
|
||||
e.g. `node_textfile_scrape_error{instance="$node",job="$job"}`;
|
||||
* ~80 queries are using [rate](https://docs.victoriametrics.com/MetricsQL.html#rate) function for selected metric,
|
||||
e.g. `rate(node_netstat_Tcp_InSegs{instance=\"$node\",job=\"$job\"})`
|
||||
* and the rest
|
||||
are [aggregation functions](https://docs.victoriametrics.com/keyConcepts.html#aggregation-and-grouping-functions)
|
||||
like [sum](https://docs.victoriametrics.com/MetricsQL.html#sum)
|
||||
or [count](https://docs.victoriametrics.com/MetricsQL.html#count).
|
||||
|
||||
To get a better understanding of how MetricsQL works, see the following resources:
|
||||
|
||||
* [MetricsQL concepts](https://docs.victoriametrics.com/keyConcepts.html#metricsql);
|
||||
* [MetricsQL functions](https://docs.victoriametrics.com/MetricsQL.html);
|
||||
* [PromQL tutorial for beginners](https://valyala.medium.com/promql-tutorial-for-beginners-9ab455142085).
|
||||
|
||||
## How to migrate current data from InfluxDB to VictoriaMetrics
|
||||
|
||||
Migrating data from other TSDBs to VictoriaMetrics is as simple as importing data via any of
|
||||
[supported formats](https://docs.victoriametrics.com/keyConcepts.html#push-model).
|
||||
|
||||
But migration from InfluxDB might get easier when using [vmctl](https://docs.victoriametrics.com/vmctl.html) -
|
||||
VictoriaMetrics command-line tool. See more about
|
||||
migrating [from InfluxDB v1.x versions](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-influxdb-1x).
|
||||
Migrating data from InfluxDB v2.x is not supported yet. But there is
|
||||
useful [3rd party solution]((https://docs.victoriametrics.com/vmctl.html#migrating-data-from-influxdb-2x)) for this.
|
||||
|
||||
Please note, that data migration is a backfilling process. So, please
|
||||
consider [backfilling tips](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#backfilling).
|
||||
|
||||
## Frequently asked questions
|
||||
|
||||
* How does VictoriaMetrics compare to InfluxDB?
|
||||
* _[Answer](https://docs.victoriametrics.com/FAQ.html#how-does-victoriametrics-compare-to-influxdb)_
|
||||
* Why don't VictoriaMetrics support Remote Read API, so I don't need to learn MetricsQL?
|
||||
* _[Answer](https://docs.victoriametrics.com/FAQ.html#why-doesnt-victoriametrics-support-the-prometheus-remote-read-api)_
|
||||
* The PromQL and MetricsQL are often mentioned together - why is that?
|
||||
* _MetricsQL - query language inspired by PromQL. MetricsQL is backward-compatible with PromQL, so Grafana
|
||||
dashboards backed by Prometheus datasource should work the same after switching from Prometheus to
|
||||
VictoriaMetrics. Both languages mostly share the same concepts with slight differences._
|
||||
* Query returns more data points than expected - why?
|
||||
* _VictoriaMetrics may return non-existing data points if `step` param is lower than the actual data resolution. See
|
||||
more about this [here](https://docs.victoriametrics.com/keyConcepts.html#range-query)._
|
||||
* How do I get the `real` last data point, not `ephemeral`?
|
||||
* _[last_over_time](https://docs.victoriametrics.com/MetricsQL.html#last_over_time) function can be used for
|
||||
limiting the lookbehind window for calculated data. For example, `last_over_time(metric[10s])` would return
|
||||
calculated samples only if the real samples are located closer than 10 seconds to the calculated timestamps
|
||||
according to
|
||||
`start`, `end` and `step` query args passed
|
||||
to [range query](https://docs.victoriametrics.com/keyConcepts.html#range-query)._
|
||||
* How do I get raw data points with MetricsQL?
|
||||
* _For getting raw data points specify the interval at which you want them in square brackets and send
|
||||
as [instant query](https://docs.victoriametrics.com/keyConcepts.html#instant-query). For
|
||||
example, `GET api/v1/query?query="my_metric[5m]"&time=<time>` will return raw samples for `my_metric` in interval
|
||||
from `<time>` to `<time>-5m`._
|
||||
* Can you have multiple aggregators in a MetricsQL query, e.g. `SELECT MAX(field), MIN(field) ...`?
|
||||
* _Yes, try the following query `( alias(max(field), "max"), alias(min(field), "min") )`._
|
||||
* How to translate Influx `percentile` function to MetricsQL?
|
||||
* _[Answer](https://stackoverflow.com/questions/66431990/translate-influx-percentile-function-to-promqlb)_
|
||||
* How to translate Influx `stddev` function to MetricsQL?
|
||||
* _[Answer](https://stackoverflow.com/questions/66433143/translate-influx-stddev-to-promql)_
|
Some files were not shown because too many files have changed in this diff Show more
Loading…
Reference in a new issue