diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md index 9ba36fcc0..33bd9f982 100644 --- a/CODE_OF_CONDUCT.md +++ b/CODE_OF_CONDUCT.md @@ -68,9 +68,9 @@ members of the project's leadership. ## Attribution This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, -available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html +available at [homepage]: https://www.contributor-covenant.org For answers to common questions about this code of conduct, see -https://www.contributor-covenant.org/faq + diff --git a/CODE_OF_CONDUCT_RU.md b/CODE_OF_CONDUCT_RU.md index 26a5ecf4c..312f2011d 100644 --- a/CODE_OF_CONDUCT_RU.md +++ b/CODE_OF_CONDUCT_RU.md @@ -79,7 +79,7 @@ **Последствия**: Предупреждение о последствиях в случае продолжающегося неуместного поведения. На определенное время не допускается взаимодействие с людьми, вовлеченными в инцидент, -включая незапрошенное взаимодействие +включая незапрошенное взаимодействие с теми, кто обеспечивает соблюдение Кодекса. Это включает в себя избегание взаимодействия в публичных пространствах, а так же во внешних каналах, таких как социальные сети. Нарушение этих правил влечет за собой временный или вечный бан. @@ -89,10 +89,10 @@ **Общественное влияние**: Серьёзное нарушение стандартов сообщества, включая продолжительное неуместное поведение. -**Последствия**: Временный запрет (бан) на любое взаимодействие +**Последствия**: Временный запрет (бан) на любое взаимодействие или публичное общение с сообществом на определенный период времени. На этот период не допускается публичное или личное взаимодействие с людьми, -вовлеченными в инцидент, включая незапрошенное взаимодействие +вовлеченными в инцидент, включая незапрошенное взаимодействие с теми, кто обеспечивает соблюдение Кодекса. Нарушение этих правил влечет за собой вечный бан. 
@@ -108,7 +108,7 @@ Данный Кодекс Поведения основан на [Кодекс Поведения участника][homepage], версии 2.0, доступной по адресу -https://www.contributor-covenant.org/version/2/0/code_of_conduct.html. +. Принципы Воздействия в Сообществе были вдохновлены [Mozilla's code of conduct enforcement ladder](https://github.com/mozilla/diversity). @@ -116,5 +116,5 @@ enforcement ladder](https://github.com/mozilla/diversity). [homepage]: https://www.contributor-covenant.org Ответы на общие вопросы о данном кодексе поведения ищите на странице FAQ: -https://www.contributor-covenant.org/faq. Переводы доступны по адресу -https://www.contributor-covenant.org/translations. +. Переводы доступны по адресу +. diff --git a/README.md b/README.md index 6d950b239..136070a37 100644 --- a/README.md +++ b/README.md @@ -21,7 +21,6 @@ Cluster version of VictoriaMetrics is available [here](https://docs.victoriametr [Contact us](mailto:info@victoriametrics.com) if you need enterprise support for VictoriaMetrics. See [features available in enterprise package](https://victoriametrics.com/products/enterprise/). Enterprise binaries can be downloaded and evaluated for free from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases). - ## Prominent features VictoriaMetrics has the following prominent features: @@ -61,7 +60,6 @@ VictoriaMetrics has the following prominent features: See also [various Articles about VictoriaMetrics](https://docs.victoriametrics.com/Articles.html). - ## Case studies and talks Case studies: @@ -92,7 +90,6 @@ Case studies: See also [articles and slides about VictoriaMetrics from our users](https://docs.victoriametrics.com/Articles.html#third-party-articles-and-slides-about-victoriametrics) - ## Operation ## How to start VictoriaMetrics @@ -112,7 +109,6 @@ VictoriaMetrics accepts [Prometheus querying API requests](#prometheus-querying- It is recommended setting up [monitoring](#monitoring) for VictoriaMetrics. 
- ### Environment variables Each flag value can be set via environment variables according to these rules: @@ -122,10 +118,8 @@ Each flag value can be set via environment variables according to these rules: * For repeating flags an alternative syntax can be used by joining the different values into one using `,` char as separator (for example `-storageNode -storageNode ` will translate to `storageNode=,`). * Environment var prefix can be set via `-envflag.prefix` flag. For instance, if `-envflag.prefix=VM_`, then env vars must be prepended with `VM_`. - ### Configuration with snap package - Snap package for VictoriaMetrics is available [here](https://snapcraft.io/victoriametrics). Command-line flags for Snap package can be set with following command: @@ -137,7 +131,6 @@ snap restart victoriametrics Do not change value for `-storageDataPath` flag, because snap package has limited access to host filesystem. - Changing scrape configuration is possible with text editor: ```text @@ -146,7 +139,6 @@ vi $SNAP_DATA/var/snap/victoriametrics/current/etc/victoriametrics-scrape-config After changes were made, trigger config re-read with the command `curl 127.0.0.1:8248/-/reload`. - ## Prometheus setup Add the following lines to Prometheus config file (it is usually located at `/etc/prometheus/prometheus.yml`) in order to send data to VictoriaMetrics: @@ -200,7 +192,6 @@ It is recommended upgrading Prometheus to [v2.12.0](https://github.com/prometheu Take a look also at [vmagent](https://docs.victoriametrics.com/vmagent.html) and [vmalert](https://docs.victoriametrics.com/vmalert.html), which can be used as faster and less resource-hungry alternative to Prometheus. 
- ## Grafana setup Create [Prometheus datasource](http://docs.grafana.org/features/datasources/prometheus/) in Grafana with the following url: @@ -213,7 +204,6 @@ Substitute `` with the hostname or IP address of VictoriaM Then build graphs and dashboards for the created datasource using [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/) or [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html). - ## How to upgrade VictoriaMetrics It is safe upgrading VictoriaMetrics to new versions unless [release notes](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) say otherwise. It is safe skipping multiple versions during the upgrade unless [release notes](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) say otherwise. It is recommended performing regular upgrades to the latest version, since it may contain important bug fixes, performance optimizations or new features. @@ -228,7 +218,6 @@ The following steps must be performed during the upgrade / downgrade procedure: Prometheus doesn't drop data during VictoriaMetrics restart. See [this article](https://grafana.com/blog/2019/03/25/whats-new-in-prometheus-2.8-wal-based-remote-write/) for details. The same applies also to [vmagent](https://docs.victoriametrics.com/vmagent.html). - ## How to apply new config to VictoriaMetrics VictoriaMetrics is configured via command-line flags, so it must be restarted when new command-line flags should be applied: @@ -239,7 +228,6 @@ VictoriaMetrics is configured via command-line flags, so it must be restarted wh Prometheus doesn't drop data during VictoriaMetrics restart. See [this article](https://grafana.com/blog/2019/03/25/whats-new-in-prometheus-2.8-wal-based-remote-write/) for details. The same applies alos to [vmagent](https://docs.victoriametrics.com/vmagent.html). 
- ## How to scrape Prometheus exporters such as [node-exporter](https://github.com/prometheus/node_exporter) VictoriaMetrics can be used as drop-in replacement for Prometheus for scraping targets configured in `prometheus.yml` config file according to [the specification](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#configuration-file). Just set `-promscrape.config` command-line flag to the path to `prometheus.yml` config - and VictoriaMetrics should start scraping the configured targets. Currently the following [scrape_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) types are supported: @@ -258,7 +246,6 @@ VictoriaMetrics can be used as drop-in replacement for Prometheus for scraping t * [digitalocean_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config) * [http_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config) - File a [feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues) if you need support for other `*_sd_config` types. The file pointed by `-promscrape.config` may contain `%{ENV_VAR}` placeholders, which are substituted by the corresponding `ENV_VAR` environment variable values. @@ -267,7 +254,6 @@ VictoriaMetrics also supports [importing data in Prometheus exposition format](# See also [vmagent](https://docs.victoriametrics.com/vmagent.html), which can be used as drop-in replacement for Prometheus. - ## How to send data from DataDog agent VictoriaMetrics accepts data from [DataDog agent](https://docs.datadoghq.com/agent/) or [DogStatsD]() via ["submit metrics" API](https://docs.datadoghq.com/api/latest/metrics/#submit-metrics) at `/datadog/api/v1/series` path. @@ -315,7 +301,6 @@ This command should return the following output if everything is OK: Extra labels may be added to all the written time series by passing `extra_label=name=value` query args. 
For example, `/datadog/api/v1/series?extra_label=foo=bar` would add `{foo="bar"}` label to all the ingested metrics. - ## How to send data from InfluxDB-compatible agents such as [Telegraf](https://www.influxdata.com/time-series-platform/telegraf/) Use `http://:8428` url instead of InfluxDB url in agents' configs. @@ -503,7 +488,6 @@ The `/api/v1/export` endpoint should return the following response: Extra labels may be added to all the imported time series by passing `extra_label=name=value` query args. For example, `/api/put?extra_label=foo=bar` would add `{foo="bar"}` label to all the ingested metrics. - ## Prometheus querying API usage VictoriaMetrics supports the following handlers from [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/): @@ -519,7 +503,6 @@ VictoriaMetrics supports the following handlers from [Prometheus querying API](h These handlers can be queried from Prometheus-compatible clients such as Grafana or curl. All the Prometheus querying API handlers can be prepended with `/prometheus` prefix. For example, both `/prometheus/api/v1/query` and `/api/v1/query` should work. - ### Prometheus querying API enhancements VictoriaMetrics accepts optional `extra_label==` query arg, which can be used for enforcing additional label filters for queries. For example, @@ -552,7 +535,6 @@ Additionally VictoriaMetrics provides the following handlers: For example, request to `/api/v1/status/top_queries?topN=5&maxLifetime=30s` would return up to 5 queries per list, which were executed during the last 30 seconds. VictoriaMetrics tracks the last `-search.queryStats.lastQueriesCount` queries with durations at least `-search.queryStats.minQueryDuration`. 
- ## Graphite API usage VictoriaMetrics supports the following Graphite APIs, which are needed for [Graphite datasource in Grafana](https://grafana.com/docs/grafana/latest/datasources/graphite/): @@ -569,7 +551,6 @@ VictoriaMetrics accepts optional query args: `extra_label==:8428/api/v1/export?match[]=`, @@ -815,7 +788,6 @@ Exported data can be imported via POST'ing it to [/api/v1/import](#how-to-import The [deduplication](#deduplication) is applied to the data exported via `/api/v1/export` by default. The deduplication isn't applied if `reduce_mem_usage=1` query arg is passed to the request. - ### How to export CSV data Send a request to `http://:8428/api/v1/export/csv?format=&match=`, @@ -841,7 +813,6 @@ The exported CSV data can be imported to VictoriaMetrics via [/api/v1/import/csv The [deduplication](#deduplication) is applied for the data exported in CSV by default. It is possible to export raw data without de-duplication by passing `reduce_mem_usage=1` query arg to `/api/v1/export/csv`. - ### How to export data in native format Send a request to `http://:8428/api/v1/export/native?match[]=`, @@ -866,7 +837,6 @@ can fail to be imported into VictoriaMetrics release Y. The [deduplication](#deduplication) isn't applied for the data exported in native format. It is expected that the de-duplication is performed during data import. - ## How to import time series data Time series data can be imported into VictoriaMetrics via any supported ingestion protocol: @@ -884,7 +854,6 @@ Time series data can be imported into VictoriaMetrics via any supported ingestio * `/api/v1/import/csv` for importing arbitrary CSV data. See [these docs](#how-to-import-csv-data) for details. * `/api/v1/import/prometheus` for importing data in Prometheus exposition format. See [these docs](#how-to-import-data-in-prometheus-exposition-format) for details. 
- ### How to import data in JSON line format Example for importing data obtained via [/api/v1/export](#how-to-export-data-in-json-line-format): @@ -914,7 +883,6 @@ Note that it could be required to flush response cache after importing historica VictoriaMetrics parses input JSON lines one-by-one. It loads the whole JSON line in memory, then parses it and then saves the parsed samples into persistent storage. This means that VictoriaMetrics can occupy big amounts of RAM when importing too long JSON lines. The solution is to split too long JSON lines into smaller lines. It is OK if samples for a single time series are split among multiple JSON lines. - ### How to import data in native format The specification of VictoriaMetrics' native format may yet change and is not formally documented yet. So currently we do not recommend that external clients attempt to pack their own metrics in native format file. @@ -934,7 +902,6 @@ For example, `/api/v1/import/native?extra_label=foo=bar` would add `"foo":"bar"` Note that it could be required to flush response cache after importing historical data. See [these docs](#backfilling) for detail. - ### How to import CSV data Arbitrary CSV data can be imported via `/api/v1/import/csv`. The CSV data is imported according to the provided `format` query arg. @@ -975,6 +942,7 @@ curl -G 'http://localhost:8428/api/v1/export' -d 'match[]={ticker!=""}' ``` The following response should be returned: + ```bash {"metric":{"__name__":"bid","market":"NASDAQ","ticker":"MSFT"},"values":[1.67],"timestamps":[1583865146520]} {"metric":{"__name__":"bid","market":"NYSE","ticker":"GOOG"},"values":[4.56],"timestamps":[1583865146495]} @@ -987,7 +955,6 @@ For example, `/api/v1/import/csv?extra_label=foo=bar` would add `"foo":"bar"` la Note that it could be required to flush response cache after importing historical data. See [these docs](#backfilling) for detail. 
- ### How to import data in Prometheus exposition format VictoriaMetrics accepts data in [Prometheus exposition format](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md#text-based-format) @@ -1029,8 +996,6 @@ Note that it could be required to flush response cache after importing historica VictoriaMetrics also may scrape Prometheus targets - see [these docs](#how-to-scrape-prometheus-exporters-such-as-node-exporter). - - ## Relabeling VictoriaMetrics supports Prometheus-compatible relabeling for all the ingested metrics if `-relabelConfig` command-line flag points @@ -1039,6 +1004,7 @@ The `-relabelConfig` also can point to http or https url. For example, `-relabel See [this article with relabeling tips and tricks](https://valyala.medium.com/how-to-use-relabeling-in-prometheus-and-victoriametrics-8b90fc22c4b2). Example contents for `-relabelConfig` file: + ```yml # Add {cluster="dev"} label. - target_label: cluster @@ -1052,7 +1018,6 @@ Example contents for `-relabelConfig` file: See [these docs](https://docs.victoriametrics.com/vmagent.html#relabeling) for more details about relabeling in VictoriaMetrics. - ## Federation VictoriaMetrics exports [Prometheus-compatible federation data](https://prometheus.io/docs/prometheus/latest/federation/) @@ -1064,7 +1029,6 @@ on the interval `[now - max_lookback ... now]` is scraped for each time series. For instance, `/federate?match[]=up&max_lookback=1h` would return last points on the `[now - 1h ... now]` interval. This may be useful for time series federation with scrape intervals exceeding `5m`. - ## Capacity planning VictoriaMetrics uses lower amounts of CPU, RAM and storage space on production workloads compared to competing solutions (Prometheus, Thanos, Cortex, TimescaleDB, InfluxDB, QuestDB, M3DB) according to [our case studies](https://docs.victoriametrics.com/CaseStudies.html). 
@@ -1087,7 +1051,6 @@ It is recommended leaving the following amounts of spare resources: * 50% of spare CPU for reducing the probability of slowdowns during temporary spikes in workload. * At least 30% of free storage space at the directory pointed by `-storageDataPath` command-line flag. See also `-storage.minFreeDiskSpaceBytes` command-line flag description [here](#list-of-command-line-flags). - ## High availability * Install multiple VictoriaMetrics instances in distinct datacenters (availability zones). @@ -1128,7 +1091,6 @@ to write data to `victoriametrics-addr-1`, while each `r2` should write data to Another option is to write data simultaneously from Prometheus HA pair to a pair of VictoriaMetrics instances with the enabled de-duplication. See [this section](#deduplication) for details. - ## Deduplication VictoriaMetrics de-duplicates data points if `-dedup.minScrapeInterval` command-line flag is set to positive duration. For example, `-dedup.minScrapeInterval=60s` would de-duplicate data points on the same time series if they fall within the same discrete 60s bucket. The earliest data point will be kept. In the case of equal timestamps, an arbitrary data point will be kept. See [this comment](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2112#issuecomment-1032587618) for more details on how downsampling works. @@ -1141,34 +1103,34 @@ The de-duplication reduces disk space usage if multiple identically configured [ write data to the same VictoriaMetrics instance. These vmagent or Prometheus instances must have identical `external_labels` section in their configs, so they write data to the same time series. - ## Storage -VictoriaMetrics stores time series data in [MergeTree](https://en.wikipedia.org/wiki/Log-structured_merge-tree)-like +VictoriaMetrics stores time series data in [MergeTree](https://en.wikipedia.org/wiki/Log-structured_merge-tree)-like data structures. 
On insert, VictoriaMetrics accumulates up to 1s of data and dumps it on disk to -`<-storageDataPath>/data/small/YYYY_MM/` subdirectory forming a `part` with the following +`<-storageDataPath>/data/small/YYYY_MM/` subdirectory forming a `part` with the following name pattern: `rowsCount_blocksCount_minTimestamp_maxTimestamp`. Each part consists of two "columns": values and timestamps. These are sorted and compressed raw time series values. Additionally, part contains index files for searching for specific series in the values and timestamps files. -`Parts` are periodically merged into the bigger parts. The resulting `part` is constructed -under `<-storageDataPath>/data/{small,big}/YYYY_MM/tmp` subdirectory. When the resulting `part` is complete, it is atomically moved from the `tmp` -to its own subdirectory, while the source parts are atomically removed. The end result is that the source +`Parts` are periodically merged into the bigger parts. The resulting `part` is constructed +under `<-storageDataPath>/data/{small,big}/YYYY_MM/tmp` subdirectory. When the resulting `part` is complete, it is atomically moved from the `tmp` +to its own subdirectory, while the source parts are atomically removed. The end result is that the source parts are substituted by a single resulting bigger `part` in the `<-storageDataPath>/data/{small,big}/YYYY_MM/` directory. -Information about merging process is available in [single-node VictoriaMetrics](https://grafana.com/dashboards/10229) -and [clustered VictoriaMetrics](https://grafana.com/grafana/dashboards/11176) Grafana dashboards. +Information about merging process is available in [single-node VictoriaMetrics](https://grafana.com/dashboards/10229) +and [clustered VictoriaMetrics](https://grafana.com/grafana/dashboards/11176) Grafana dashboards. See more details in [monitoring docs](#monitoring). 
-The `merge` process is usually named "compaction", because the resulting `part` size is usually smaller than +The `merge` process is usually named "compaction", because the resulting `part` size is usually smaller than the sum of the source `parts`. There are following benefits of doing the merge process: + * it improves query performance, since lower number of `parts` are inspected with each query; -* it reduces the number of data files, since each `part`contains fixed number of files; +* it reduces the number of data files, since each `part`contains fixed number of files; * better compression rate for the resulting part. -Newly added `parts` either appear in the storage or fail to appear. -Storage never contains partially created parts. The same applies to merge process — `parts` are either fully -merged into a new `part` or fail to merge. There are no partially merged `parts` in MergeTree. -`Part` contents in MergeTree never change. Parts are immutable. They may be only deleted after the merge +Newly added `parts` either appear in the storage or fail to appear. +Storage never contains partially created parts. The same applies to merge process — `parts` are either fully +merged into a new `part` or fail to merge. There are no partially merged `parts` in MergeTree. +`Part` contents in MergeTree never change. Parts are immutable. They may be only deleted after the merge to a bigger `part` or when the `part` contents goes outside the configured `-retentionPeriod`. See [this article](https://valyala.medium.com/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282) for more details. @@ -1182,7 +1144,7 @@ Retention is configured with the `-retentionPeriod` command-line flag, which tak Data is split in per-month partitions inside `<-storageDataPath>/data/{small,big}` folders. Data partitions outside the configured retention are deleted on the first day of the new month. 
Each partition consists of one or more data parts with the following name pattern `rowsCount_blocksCount_minTimestamp_maxTimestamp`. -Data parts outside of the configured retention are eventually deleted during +Data parts outside of the configured retention are eventually deleted during [background merge](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282). The maximum disk space usage for a given `-retentionPeriod` is going to be (`-retentionPeriod` + 1) months. @@ -1209,7 +1171,6 @@ so it could route requests from particular user to VictoriaMetrics with the desi The same scheme could be implemented for multiple tenants in [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html). See [these docs](https://docs.victoriametrics.com/guides/guide-vmcluster-multiple-retention-setup.html) for multi-retention setup details. - ## Downsampling [VictoriaMetrics Enterprise](https://victoriametrics.com/products/enterprise/) supports multi-level downsampling with `-downsampling.period` command-line flag. For example: @@ -1222,12 +1183,10 @@ Downsampling is applied independently per each time series. It can reduce disk s The downsampling can be evaluated for free by downloading and using enterprise binaries from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases). - ## Multi-tenancy Single-node VictoriaMetrics doesn't support multi-tenancy. Use [cluster version](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy) instead. - ## Scalability and cluster version Though single-node VictoriaMetrics cannot scale to multiple nodes, it is optimized for resource usage - storage size / bandwidth / IOPS, RAM, CPU. @@ -1238,7 +1197,6 @@ So try single-node VictoriaMetrics at first and then [switch to cluster version] horizontally scalable long-term remote storage for really large Prometheus deployments. 
[Contact us](mailto:info@victoriametrics.com) for enterprise support. - ## Alerting It is recommended using [vmalert](https://docs.victoriametrics.com/vmalert.html) for alerting. @@ -1249,7 +1207,6 @@ Additionally, alerting can be set up with the following tools: * With Promxy - see [the corresponding docs](https://github.com/jacksontj/promxy/blob/master/README.md#how-do-i-use-alertingrecording-rules-in-promxy). * With Grafana - see [the corresponding docs](https://grafana.com/docs/alerting/rules/). - ## Security Do not forget protecting sensitive endpoints in VictoriaMetrics when exposing it to untrusted networks such as the internet. @@ -1263,6 +1220,7 @@ Consider setting the following command-line flags: * `-forceMergeAuthKey` for protecting `/internal/force_merge` endpoint. See [force merge docs](#forced-merge). * `-search.resetCacheAuthKey` for protecting `/internal/resetRollupResultCache` endpoint. See [backfilling](#backfilling) for more details. * `-configAuthKey` for protecting `/config` endpoint, since it may contain sensitive information such as passwords. + - `-pprofAuthKey` for protecting `/debug/pprof/*` endpoints, which can be used for [profiling](#profiling). Explicitly set internal network interface for TCP and UDP ports for data ingestion with Graphite and OpenTSDB formats. @@ -1271,7 +1229,6 @@ For example, substitute `-graphiteListenAddr=:2003` with `-graphiteListenAddr=/cache` directory during graceful shutdown (e.g. when VictoriaMetrics is stopped by sending `SIGINT` signal). The caches are read on the next VictoriaMetrics startup. Sometimes it is needed to remove such caches on the next startup. This can be performed by placing `reset_cache_on_startup` file inside the `<-storageDataPath>/cache` directory before the restart of VictoriaMetrics. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1447) for details. 
- ## Cache tuning VictoriaMetrics uses various in-memory caches for faster data ingestion and query performance. The following metrics for each type of cache are exported at [`/metrics` page](#monitoring): -- `vm_cache_size_bytes` - the actual cache size -- `vm_cache_size_max_bytes` - cache size limit -- `vm_cache_requests_total` - the number of requests to the cache -- `vm_cache_misses_total` - the number of cache misses -- `vm_cache_entries` - the number of entries in the cache +* `vm_cache_size_bytes` - the actual cache size +* `vm_cache_size_max_bytes` - cache size limit +* `vm_cache_requests_total` - the number of requests to the cache +* `vm_cache_misses_total` - the number of cache misses +* `vm_cache_entries` - the number of entries in the cache Both Grafana dashboards for [single-node VictoriaMetrics](https://grafana.com/dashboards/10229) and [clustered VictoriaMetrics](https://grafana.com/grafana/dashboards/11176) @@ -1452,28 +1405,28 @@ practical scenarios and workloads. Change the defaults only if you understand th To override the default values see command-line flags with `-storage.cacheSize` prefix. See the full description of flags [here](#list-of-command-line-flags). - ## Data migration ### From VictoriaMetrics -The simplest way to migrate data from one single-node (source) to another (destination), or from one vmstorage node +The simplest way to migrate data from one single-node (source) to another (destination), or from one vmstorage node to another do the following: + 1. Stop the VictoriaMetrics (source) with `kill -INT`; -2. Copy (via [rsync](https://en.wikipedia.org/wiki/Rsync) or any other tool) the entire folder specified +2. Copy (via [rsync](https://en.wikipedia.org/wiki/Rsync) or any other tool) the entire folder specified via `-storageDataPath` from the source node to the empty folder at the destination node. -3. Once copy is done, stop the VictoriaMetrics (destination) with `kill -INT` and verify that +3. 
Once copy is done, stop the VictoriaMetrics (destination) with `kill -INT` and verify that its `-storageDataPath` points to the copied folder from p.2; 4. Start the VictoriaMetrics (destination). The copied data should be now available. Things to consider when copying data: + 1. Data formats between single-node and vmstorage node aren't compatible and can't be copied. 2. Copying data folder means complete replacement of the previous data on destination VictoriaMetrics. For more complex scenarios like single-to-cluster, cluster-to-single, re-sharding or migrating only a fraction of data - see [vmctl. Migrating data from VictoriaMetrics](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-victoriametrics). - ### From other systems Use [vmctl](https://docs.victoriametrics.com/vmctl.html) for data migration. It supports the following data migration types: @@ -1485,7 +1438,6 @@ Use [vmctl](https://docs.victoriametrics.com/vmctl.html) for data migration. It See [vmctl docs](https://docs.victoriametrics.com/vmctl.html) for more details. - ## Backfilling VictoriaMetrics accepts historical data in arbitrary order of time via [any supported ingestion method](#how-to-import-time-series-data). @@ -1503,7 +1455,6 @@ Yet another solution is to increase `-search.cacheTimestampOffset` flag value in for data with timestamps close to the current time. Single-node VictoriaMetrics automatically resets response cache when samples with timestamps older than `now - search.cacheTimestampOffset` are ingested to it. - ## Data updates VictoriaMetrics doesn't support updating already existing sample values to new ones. It stores all the ingested data points @@ -1511,7 +1462,6 @@ for the same time series with identical timestamps. While it is possible substit [removal of old time series](#how-to-delete-time-series) and then [writing new time series](#backfilling), this approach should be used only for one-off updates. 
It shouldn't be used for frequent updates because of non-zero overhead related to data removal. - ## Replication Single-node VictoriaMetrics doesn't support application-level replication. Use cluster version instead. @@ -1521,7 +1471,6 @@ Storage-level replication may be offloaded to durable persistent storage such as See also [high availability docs](#high-availability) and [backup docs](#backups). - ## Backups VictoriaMetrics supports backups via [vmbackup](https://docs.victoriametrics.com/vmbackup.html) @@ -1529,19 +1478,17 @@ and [vmrestore](https://docs.victoriametrics.com/vmrestore.html) tools. We also provide [vmbackupmanager](https://docs.victoriametrics.com/vmbackupmanager.html) tool for enterprise subscribers. Enterprise binaries can be downloaded and evaluated for free from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases). - ## Benchmarks -Note, that vendors (including VictoriaMetrics) are often biased when doing such tests. E.g. they try highlighting -the best parts of their product, while highlighting the worst parts of competing products. -So we encourage users and all independent third parties to conduct their becnhmarks for various products +Note, that vendors (including VictoriaMetrics) are often biased when doing such tests. E.g. they try highlighting +the best parts of their product, while highlighting the worst parts of competing products. +So we encourage users and all independent third parties to conduct their becnhmarks for various products they are evaluating in production and publish the results. As a reference, please see [benchmarks](https://docs.victoriametrics.com/Articles.html#benchmarks) conducted by -VictoriaMetrics team. Please also see the [helm chart](https://github.com/VictoriaMetrics/benchmark) +VictoriaMetrics team. Please also see the [helm chart](https://github.com/VictoriaMetrics/benchmark) for running ingestion benchmarks based on node_exporter metrics. 
- ## Profiling VictoriaMetrics provides handlers for collecting the following [Go profiles](https://blog.golang.org/profiling-go-programs): @@ -1570,7 +1517,6 @@ The command for collecting CPU profile waits for 30 seconds before returning. The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof). - ## Integrations * [Helm charts for single-node and cluster versions of VictoriaMetrics](https://github.com/VictoriaMetrics/helm-charts). @@ -1584,7 +1530,6 @@ The collected profiles may be analyzed with [go tool pprof](https://github.com/g * [Snap package for VictoriaMetrics](https://snapcraft.io/victoriametrics). * [vmalert-cli](https://github.com/aorfanos/vmalert-cli) - a CLI application for managing [vmalert](https://docs.victoriametrics.com/vmalert.html). - ## Third-party contributions * [Unofficial yum repository](https://copr.fedorainfracloud.org/coprs/antonpatsev/VictoriaMetrics/) ([source code](https://github.com/patsevanton/victoriametrics-rpm)) @@ -1592,12 +1537,10 @@ The collected profiles may be analyzed with [go tool pprof](https://github.com/g * [Prometheus -> VictoriaMetrics exporter #2](https://github.com/AnchorFree/tsdb-remote-write) * [Prometheus Oauth proxy](https://gitlab.com/optima_public/prometheus_oauth_proxy) - see [this article](https://medium.com/@richard.holly/powerful-saas-solution-for-detection-metrics-c67b9208d362) for details. - ## Contacts Contact us with any questions regarding VictoriaMetrics at [info@victoriametrics.com](mailto:info@victoriametrics.com). - ## Community and contributions Feel free asking any questions regarding VictoriaMetrics: @@ -1631,7 +1574,6 @@ Adhering `KISS` principle simplifies the resulting code and architecture, so it Report bugs and propose new features [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues). 
- ## VictoriaMetrics Logo [Zip](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/VM_logo.zip) contains three folders with different image orientations (main color and inverted version). @@ -1661,315 +1603,314 @@ Files included in each folder: * Do not change spacing, alignment, or relative locations of the design elements. * Do not change the proportions of any of the design elements or the design itself. You may resize as needed but must retain all proportions. - ## List of command-line flags Pass `-help` to VictoriaMetrics in order to see the list of supported command-line flags with their description: ``` -bigMergeConcurrency int - The maximum number of CPU cores to use for big merges. Default value is used if set to 0 + The maximum number of CPU cores to use for big merges. Default value is used if set to 0 -configAuthKey string - Authorization key for accessing /config page. It must be passed via authKey query arg + Authorization key for accessing /config page. It must be passed via authKey query arg -csvTrimTimestamp duration - Trim timestamps when importing csv data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) + Trim timestamps when importing csv data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 
1s) may be used for reducing disk space usage for timestamp data (default 1ms) -datadog.maxInsertRequestSize size - The maximum size in bytes of a single DataDog POST request to /api/v1/series - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 67108864) + The maximum size in bytes of a single DataDog POST request to /api/v1/series + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 67108864) -dedup.minScrapeInterval duration - Leave only the first sample in every time series per each discrete interval equal to -dedup.minScrapeInterval > 0. See https://docs.victoriametrics.com/#deduplication and https://docs.victoriametrics.com/#downsampling + Leave only the first sample in every time series per each discrete interval equal to -dedup.minScrapeInterval > 0. See https://docs.victoriametrics.com/#deduplication and https://docs.victoriametrics.com/#downsampling -deleteAuthKey string - authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series and /tags/delSeries + authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series and /tags/delSeries -denyQueriesOutsideRetention - Whether to deny queries outside of the configured -retentionPeriod. When set, then /api/v1/query_range would return '503 Service Unavailable' error for queries with 'from' value outside -retentionPeriod. This may be useful when multiple data sources with distinct retentions are hidden behind query-tee + Whether to deny queries outside of the configured -retentionPeriod. When set, then /api/v1/query_range would return '503 Service Unavailable' error for queries with 'from' value outside -retentionPeriod. This may be useful when multiple data sources with distinct retentions are hidden behind query-tee -downsampling.period array - Comma-separated downsampling periods in the format 'offset:period'. For example, '30d:10m' instructs to leave a single sample per 10 minutes for samples older than 30 days. 
See https://docs.victoriametrics.com/#downsampling for details - Supports an array of values separated by comma or specified via multiple flags. + Comma-separated downsampling periods in the format 'offset:period'. For example, '30d:10m' instructs to leave a single sample per 10 minutes for samples older than 30 days. See https://docs.victoriametrics.com/#downsampling for details + Supports an array of values separated by comma or specified via multiple flags. -dryRun - Whether to check only -promscrape.config and then exit. Unknown config entries aren't allowed in -promscrape.config by default. This can be changed with -promscrape.config.strictParse=false command-line flag + Whether to check only -promscrape.config and then exit. Unknown config entries aren't allowed in -promscrape.config by default. This can be changed with -promscrape.config.strictParse=false command-line flag -enableTCP6 - Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used + Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used -envflag.enable - Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details + Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. 
See https://docs.victoriametrics.com/#environment-variables for more details -envflag.prefix string - Prefix for environment variables if -envflag.enable is set + Prefix for environment variables if -envflag.enable is set -eula - By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf + By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf -finalMergeDelay duration - The delay before starting final merge for per-month partition after no new data is ingested into it. Final merge may require additional disk IO and CPU resources. Final merge may increase query speed and reduce disk space usage in some cases. Zero value disables final merge + The delay before starting final merge for per-month partition after no new data is ingested into it. Final merge may require additional disk IO and CPU resources. Final merge may increase query speed and reduce disk space usage in some cases. Zero value disables final merge -forceFlushAuthKey string - authKey, which must be passed in query string to /internal/force_flush pages + authKey, which must be passed in query string to /internal/force_flush pages -forceMergeAuthKey string - authKey, which must be passed in query string to /internal/force_merge pages + authKey, which must be passed in query string to /internal/force_merge pages -fs.disableMmap - Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread() + Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. 
mmap() is usually faster for reading small data chunks than pread() -graphiteListenAddr string - TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty + TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty -graphiteTrimTimestamp duration - Trim timestamps for Graphite data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) + Trim timestamps for Graphite data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) -http.connTimeout duration - Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) + Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) -http.disableResponseCompression - Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth + Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth -http.idleConnTimeout duration - Timeout for incoming idle http connections (default 1m0s) + Timeout for incoming idle http connections (default 1m0s) -http.maxGracefulShutdownDuration duration - The maximum duration for a graceful shutdown of the HTTP server. 
A highly loaded server may require increased value for a graceful shutdown (default 7s) + The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s) -http.pathPrefix string - An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus + An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus -http.shutdownDelay duration - Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers + Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers -httpAuth.password string - Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty + Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty -httpAuth.username string - Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password + Username for HTTP Basic Auth. The authentication is disabled if empty. 
See also -httpAuth.password -httpListenAddr string - TCP address to listen for http connections (default ":8428") + TCP address to listen for http connections (default ":8428") -import.maxLineLen size - The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 104857600) + The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 104857600) -influx.databaseNames array - Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb - Supports an array of values separated by comma or specified via multiple flags. + Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb + Supports an array of values separated by comma or specified via multiple flags. 
-influx.maxLineSize size - The maximum size in bytes for a single InfluxDB line during parsing - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 262144) + The maximum size in bytes for a single InfluxDB line during parsing + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 262144) -influxDBLabel string - Default label for the DB name sent over '?db={db_name}' query parameter (default "db") + Default label for the DB name sent over '?db={db_name}' query parameter (default "db") -influxListenAddr string - TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://:8428/write + TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://:8428/write -influxMeasurementFieldSeparator string - Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_") + Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_") -influxSkipMeasurement - Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator' + Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator' -influxSkipSingleField - Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if InfluxDB line contains only a single field + Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if InfluxDB line contains only a single field -influxTrimTimestamp duration - Trim timestamps for InfluxDB line protocol data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 
1s) may be used for reducing disk space usage for timestamp data (default 1ms) + Trim timestamps for InfluxDB line protocol data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) -insert.maxQueueDuration duration - The maximum duration for waiting in the queue for insert requests due to -maxConcurrentInserts (default 1m0s) + The maximum duration for waiting in the queue for insert requests due to -maxConcurrentInserts (default 1m0s) -logNewSeries - Whether to log new series. This option is for debug purposes only. It can lead to performance issues when big number of new series are ingested into VictoriaMetrics + Whether to log new series. This option is for debug purposes only. It can lead to performance issues when big number of new series are ingested into VictoriaMetrics -loggerDisableTimestamps - Whether to disable writing timestamps in logs + Whether to disable writing timestamps in logs -loggerErrorsPerSecondLimit int - Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit + Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit -loggerFormat string - Format for logs. Possible values: default, json (default "default") + Format for logs. Possible values: default, json (default "default") -loggerLevel string - Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") + Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") -loggerOutput string - Output for the logs. Supported values: stderr, stdout (default "stderr") + Output for the logs. 
Supported values: stderr, stdout (default "stderr") -loggerTimezone string - Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") + Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") -loggerWarnsPerSecondLimit int - Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit + Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit -maxConcurrentInserts int - The maximum number of concurrent inserts. Default value should work for most cases, since it minimizes the overhead for concurrent inserts. This option is tigthly coupled with -insert.maxQueueDuration (default 16) + The maximum number of concurrent inserts. Default value should work for most cases, since it minimizes the overhead for concurrent inserts. This option is tigthly coupled with -insert.maxQueueDuration (default 16) -maxInsertRequestSize size - The maximum size in bytes of a single Prometheus remote_write API request - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) + The maximum size in bytes of a single Prometheus remote_write API request + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) -maxLabelValueLen int - The maximum length of label values in the accepted time series. Longer label values are truncated. In this case the vm_too_long_label_values_total metric at /metrics page is incremented (default 16384) + The maximum length of label values in the accepted time series. Longer label values are truncated. 
In this case the vm_too_long_label_values_total metric at /metrics page is incremented (default 16384) -maxLabelsPerTimeseries int - The maximum number of labels accepted per time series. Superfluous labels are dropped. In this case the vm_metrics_with_dropped_labels_total metric at /metrics page is incremented (default 30) + The maximum number of labels accepted per time series. Superfluous labels are dropped. In this case the vm_metrics_with_dropped_labels_total metric at /metrics page is incremented (default 30) -memory.allowedBytes size - Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -memory.allowedPercent float - Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) + Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. 
Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) -metricsAuthKey string - Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings + Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings -opentsdbHTTPListenAddr string - TCP address to listen for OpentTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty + TCP address to listen for OpentTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty -opentsdbListenAddr string - TCP and UDP address to listen for OpentTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty + TCP and UDP address to listen for OpentTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty -opentsdbTrimTimestamp duration - Trim timestamps for OpenTSDB 'telnet put' data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) + Trim timestamps for OpenTSDB 'telnet put' data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) -opentsdbhttp.maxInsertRequestSize size - The maximum size of OpenTSDB HTTP put request - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) + The maximum size of OpenTSDB HTTP put request + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) -opentsdbhttpTrimTimestamp duration - Trim timestamps for OpenTSDB HTTP data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 
1s) may be used for reducing disk space usage for timestamp data (default 1ms) + Trim timestamps for OpenTSDB HTTP data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) -pprofAuthKey string - Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings + Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings -precisionBits int - The number of precision bits to store per each value. Lower precision bits improves data compression at the cost of precision loss (default 64) + The number of precision bits to store per each value. Lower precision bits improves data compression at the cost of precision loss (default 64) -promscrape.cluster.memberNum int - The number of number in the cluster of scrapers. It must be an unique value in the range 0 ... promscrape.cluster.membersCount-1 across scrapers in the cluster + The number of number in the cluster of scrapers. It must be an unique value in the range 0 ... promscrape.cluster.membersCount-1 across scrapers in the cluster -promscrape.cluster.membersCount int - The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets + The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets -promscrape.cluster.replicationFactor int - The number of members in the cluster, which scrape the same targets. 
If the replication factor is greater than 2, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1) + The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 2, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1) -promscrape.config string - Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details + Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details -promscrape.config.dryRun - Checks -promscrape.config file for errors and unsupported fields and then exits. Returns non-zero exit code on parsing errors and emits these errors to stderr. See also -promscrape.config.strictParse command-line flag. Pass -loggerLevel=ERROR if you don't need to see info messages in the output. + Checks -promscrape.config file for errors and unsupported fields and then exits. Returns non-zero exit code on parsing errors and emits these errors to stderr. See also -promscrape.config.strictParse command-line flag. Pass -loggerLevel=ERROR if you don't need to see info messages in the output. -promscrape.config.strictParse - Whether to deny unsupported fields in -promscrape.config . Set to false in order to silently skip unsupported fields (default true) + Whether to deny unsupported fields in -promscrape.config . Set to false in order to silently skip unsupported fields (default true) -promscrape.configCheckInterval duration - Interval for checking for changes in '-promscrape.config' file. 
By default the checking is disabled. Send SIGHUP signal in order to force config check for changes + Interval for checking for changes in '-promscrape.config' file. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes -promscrape.consul.waitTime duration - Wait time used by Consul service discovery. Default value is used if not set + Wait time used by Consul service discovery. Default value is used if not set -promscrape.consulSDCheckInterval duration - Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s) + Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s) -promscrape.digitaloceanSDCheckInterval duration - Interval for checking for changes in digital ocean. This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config for details (default 1m0s) + Interval for checking for changes in digital ocean. This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config for details (default 1m0s) -promscrape.disableCompression - Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. 
It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control + Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control -promscrape.disableKeepAlive - Whether to disable HTTP keep-alive connections when scraping all the targets. This may be useful when targets has no support for HTTP keep-alive connection. It is possible to set 'disable_keepalive: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control. Note that disabling HTTP keep-alive may increase load on both vmagent and scrape targets + Whether to disable HTTP keep-alive connections when scraping all the targets. This may be useful when targets has no support for HTTP keep-alive connection. It is possible to set 'disable_keepalive: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control. Note that disabling HTTP keep-alive may increase load on both vmagent and scrape targets -promscrape.discovery.concurrency int - The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) (default 100) + The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) 
(default 100) -promscrape.discovery.concurrentWaitTime duration - The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s) + The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s) -promscrape.dnsSDCheckInterval duration - Interval for checking for changes in dns. This works only if dns_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config for details (default 30s) + Interval for checking for changes in dns. This works only if dns_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config for details (default 30s) -promscrape.dockerSDCheckInterval duration - Interval for checking for changes in docker. This works only if docker_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#docker_sd_config for details (default 30s) + Interval for checking for changes in docker. This works only if docker_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#docker_sd_config for details (default 30s) -promscrape.dockerswarmSDCheckInterval duration - Interval for checking for changes in dockerswarm. This works only if dockerswarm_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config for details (default 30s) + Interval for checking for changes in dockerswarm. This works only if dockerswarm_sd_configs is configured in '-promscrape.config' file. 
See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config for details (default 30s) -promscrape.dropOriginalLabels - Whether to drop original labels for scrape targets at /targets and /api/v1/targets pages. This may be needed for reducing memory usage when original labels for big number of scrape targets occupy big amounts of memory. Note that this reduces debuggability for improper per-target relabeling configs + Whether to drop original labels for scrape targets at /targets and /api/v1/targets pages. This may be needed for reducing memory usage when original labels for big number of scrape targets occupy big amounts of memory. Note that this reduces debuggability for improper per-target relabeling configs -promscrape.ec2SDCheckInterval duration - Interval for checking for changes in ec2. This works only if ec2_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#ec2_sd_config for details (default 1m0s) + Interval for checking for changes in ec2. This works only if ec2_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#ec2_sd_config for details (default 1m0s) -promscrape.eurekaSDCheckInterval duration - Interval for checking for changes in eureka. This works only if eureka_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config for details (default 30s) + Interval for checking for changes in eureka. This works only if eureka_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config for details (default 30s) -promscrape.fileSDCheckInterval duration - Interval for checking for changes in 'file_sd_config'. 
See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config for details (default 5m0s) + Interval for checking for changes in 'file_sd_config'. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config for details (default 5m0s) -promscrape.gceSDCheckInterval duration - Interval for checking for changes in gce. This works only if gce_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#gce_sd_config for details (default 1m0s) + Interval for checking for changes in gce. This works only if gce_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#gce_sd_config for details (default 1m0s) -promscrape.httpSDCheckInterval duration - Interval for checking for changes in http endpoint service discovery. This works only if http_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config for details (default 1m0s) + Interval for checking for changes in http endpoint service discovery. This works only if http_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config for details (default 1m0s) -promscrape.kubernetes.apiServerTimeout duration - How frequently to reload the full state from Kubernetes API server (default 30m0s) + How frequently to reload the full state from Kubernetes API server (default 30m0s) -promscrape.kubernetesSDCheckInterval duration - Interval for checking for changes in Kubernetes API server. This works only if kubernetes_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config for details (default 30s) + Interval for checking for changes in Kubernetes API server. 
This works only if kubernetes_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config for details (default 30s) -promscrape.maxDroppedTargets int - The maximum number of droppedTargets to show at /api/v1/targets page. Increase this value if your setup drops more scrape targets during relabeling and you need investigating labels for all the dropped targets. Note that the increased number of tracked dropped targets may result in increased memory usage (default 1000) + The maximum number of droppedTargets to show at /api/v1/targets page. Increase this value if your setup drops more scrape targets during relabeling and you need investigating labels for all the dropped targets. Note that the increased number of tracked dropped targets may result in increased memory usage (default 1000) -promscrape.maxResponseHeadersSize size - The maximum size of http response headers from Prometheus scrape targets - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 4096) + The maximum size of http response headers from Prometheus scrape targets + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 4096) -promscrape.maxScrapeSize size - The maximum size of scrape response in bytes to process from Prometheus targets. Bigger responses are rejected - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16777216) + The maximum size of scrape response in bytes to process from Prometheus targets. Bigger responses are rejected + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16777216) -promscrape.minResponseSizeForStreamParse size - The minimum target response size for automatic switching to stream parsing mode, which can reduce memory usage. 
See https://docs.victoriametrics.com/vmagent.html#stream-parsing-mode - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 1000000) + The minimum target response size for automatic switching to stream parsing mode, which can reduce memory usage. See https://docs.victoriametrics.com/vmagent.html#stream-parsing-mode + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 1000000) -promscrape.noStaleMarkers - Whether to disable sending Prometheus stale markers for metrics when scrape target disappears. This option may reduce memory usage if stale markers aren't needed for your setup. This option also disables populating the scrape_series_added metric. See https://prometheus.io/docs/concepts/jobs_instances/#automatically-generated-labels-and-time-series + Whether to disable sending Prometheus stale markers for metrics when scrape target disappears. This option may reduce memory usage if stale markers aren't needed for your setup. This option also disables populating the scrape_series_added metric. See https://prometheus.io/docs/concepts/jobs_instances/#automatically-generated-labels-and-time-series -promscrape.openstackSDCheckInterval duration - Interval for checking for changes in openstack API server. This works only if openstack_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config for details (default 30s) + Interval for checking for changes in openstack API server. This works only if openstack_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config for details (default 30s) -promscrape.seriesLimitPerTarget int - Optional limit on the number of unique time series a single scrape target can expose. 
See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter for more info + Optional limit on the number of unique time series a single scrape target can expose. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter for more info -promscrape.streamParse - Whether to enable stream parsing for metrics obtained from scrape targets. This may be useful for reducing memory usage when millions of metrics are exposed per each scrape target. It is possible to set 'stream_parse: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control + Whether to enable stream parsing for metrics obtained from scrape targets. This may be useful for reducing memory usage when millions of metrics are exposed per each scrape target. It is possible to set 'stream_parse: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control -promscrape.suppressDuplicateScrapeTargetErrors - Whether to suppress 'duplicate scrape target' errors; see https://docs.victoriametrics.com/vmagent.html#troubleshooting for details + Whether to suppress 'duplicate scrape target' errors; see https://docs.victoriametrics.com/vmagent.html#troubleshooting for details -promscrape.suppressScrapeErrors - Whether to suppress scrape errors logging. The last error for each target is always available at '/targets' page even if scrape errors logging is suppressed + Whether to suppress scrape errors logging. The last error for each target is always available at '/targets' page even if scrape errors logging is suppressed -relabelConfig string - Optional path to a file with relabeling rules, which are applied to all the ingested metrics. The path can point either to local file or to http url. See https://docs.victoriametrics.com/#relabeling for details. The config is reloaded on SIGHUP signal + Optional path to a file with relabeling rules, which are applied to all the ingested metrics. 
The path can point either to local file or to http url. See https://docs.victoriametrics.com/#relabeling for details. The config is reloaded on SIGHUP signal -relabelDebug - Whether to log metrics before and after relabeling with -relabelConfig. If the -relabelDebug is enabled, then the metrics aren't sent to storage. This is useful for debugging the relabeling configs + Whether to log metrics before and after relabeling with -relabelConfig. If the -relabelDebug is enabled, then the metrics aren't sent to storage. This is useful for debugging the relabeling configs -retentionPeriod value - Data with timestamps outside the retentionPeriod is automatically deleted - The following optional suffixes are supported: h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 1) + Data with timestamps outside the retentionPeriod is automatically deleted + The following optional suffixes are supported: h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 1) -search.cacheTimestampOffset duration - The maximum duration since the current time for response data, which is always queried from the original raw data, without using the response cache. Increase this value if you see gaps in responses due to time synchronization issues between VictoriaMetrics and data sources. See also -search.disableAutoCacheReset (default 5m0s) + The maximum duration since the current time for response data, which is always queried from the original raw data, without using the response cache. Increase this value if you see gaps in responses due to time synchronization issues between VictoriaMetrics and data sources. 
See also -search.disableAutoCacheReset (default 5m0s) -search.disableAutoCacheReset - Whether to disable automatic response cache reset if a sample with timestamp outside -search.cacheTimestampOffset is inserted into VictoriaMetrics + Whether to disable automatic response cache reset if a sample with timestamp outside -search.cacheTimestampOffset is inserted into VictoriaMetrics -search.disableCache - Whether to disable response caching. This may be useful during data backfilling + Whether to disable response caching. This may be useful during data backfilling -search.graphiteMaxPointsPerSeries int - The maximum number of points per series Graphite render API can return (default 1000000) + The maximum number of points per series Graphite render API can return (default 1000000) -search.graphiteStorageStep duration - The interval between datapoints stored in the database. It is used at Graphite Render API handler for normalizing the interval between datapoints in case it isn't normalized. It can be overridden by sending 'storage_step' query arg to /render API or by sending the desired interval via 'Storage-Step' http header during querying /render API (default 10s) + The interval between datapoints stored in the database. It is used at Graphite Render API handler for normalizing the interval between datapoints in case it isn't normalized. It can be overridden by sending 'storage_step' query arg to /render API or by sending the desired interval via 'Storage-Step' http header during querying /render API (default 10s) -search.latencyOffset duration - The time when data points become visible in query results after the collection. Too small value can result in incomplete last points for query results (default 30s) + The time when data points become visible in query results after the collection. Too small value can result in incomplete last points for query results (default 30s) -search.logSlowQueryDuration duration - Log queries with execution time exceeding this value. 
Zero disables slow query logging (default 5s) + Log queries with execution time exceeding this value. Zero disables slow query logging (default 5s) -search.maxConcurrentRequests int - The maximum number of concurrent search requests. It shouldn't be high, since a single request can saturate all the CPU cores. See also -search.maxQueueDuration (default 8) + The maximum number of concurrent search requests. It shouldn't be high, since a single request can saturate all the CPU cores. See also -search.maxQueueDuration (default 8) -search.maxExportDuration duration - The maximum duration for /api/v1/export call (default 720h0m0s) + The maximum duration for /api/v1/export call (default 720h0m0s) -search.maxLookback duration - Synonym to -search.lookback-delta from Prometheus. The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via max_lookback arg. See also '-search.maxStalenessInterval' flag, which has the same meaning due to historical reasons + Synonym to -search.lookback-delta from Prometheus. The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via max_lookback arg. See also '-search.maxStalenessInterval' flag, which has the same meaning due to historical reasons -search.maxPointsPerTimeseries int - The maximum points per a single timeseries returned from /api/v1/query_range. This option doesn't limit the number of scanned raw samples in the database. + The maximum points per a single timeseries returned from /api/v1/query_range. This option doesn't limit the number of scanned raw samples in the database. 
The main purpose of this option is to limit the number of per-series points returned to graphing UI such as Grafana. There is no sense in setting this limit to values bigger than the horizontal resolution of the graph (default 30000) -search.maxQueryDuration duration - The maximum duration for query execution (default 30s) + The maximum duration for query execution (default 30s) -search.maxQueryLen size - The maximum search query length in bytes - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16384) + The maximum search query length in bytes + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16384) -search.maxQueueDuration duration - The maximum time the request waits for execution when -search.maxConcurrentRequests limit is reached; see also -search.maxQueryDuration (default 10s) + The maximum time the request waits for execution when -search.maxConcurrentRequests limit is reached; see also -search.maxQueryDuration (default 10s) -search.maxSamplesPerQuery int - The maximum number of raw samples a single query can process across all time series. This protects from heavy queries, which select unexpectedly high number of raw samples. See also -search.maxSamplesPerSeries (default 1000000000) + The maximum number of raw samples a single query can process across all time series. This protects from heavy queries, which select unexpectedly high number of raw samples. See also -search.maxSamplesPerSeries (default 1000000000) -search.maxSamplesPerSeries int - The maximum number of raw samples a single query can scan per each time series. This option allows limiting memory usage (default 30000000) + The maximum number of raw samples a single query can scan per each time series. This option allows limiting memory usage (default 30000000) -search.maxStalenessInterval duration - The maximum interval for staleness calculations. 
By default it is automatically calculated from the median interval between samples. This flag could be useful for tuning Prometheus data model closer to Influx-style data model. See https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness for details. See also '-search.maxLookback' flag, which has the same meaning due to historical reasons + The maximum interval for staleness calculations. By default it is automatically calculated from the median interval between samples. This flag could be useful for tuning Prometheus data model closer to Influx-style data model. See https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness for details. See also '-search.maxLookback' flag, which has the same meaning due to historical reasons -search.maxStatusRequestDuration duration - The maximum duration for /api/v1/status/* requests (default 5m0s) + The maximum duration for /api/v1/status/* requests (default 5m0s) -search.maxStepForPointsAdjustment duration - The maximum step when /api/v1/query_range handler adjusts points with timestamps closer than -search.latencyOffset to the current time. The adjustment is needed because such points may contain incomplete data (default 1m0s) + The maximum step when /api/v1/query_range handler adjusts points with timestamps closer than -search.latencyOffset to the current time. 
The adjustment is needed because such points may contain incomplete data (default 1m0s) -search.maxTagKeys int - The maximum number of tag keys returned from /api/v1/labels (default 100000) + The maximum number of tag keys returned from /api/v1/labels (default 100000) -search.maxTagValueSuffixesPerSearch int - The maximum number of tag value suffixes returned from /metrics/find (default 100000) + The maximum number of tag value suffixes returned from /metrics/find (default 100000) -search.maxTagValues int - The maximum number of tag values returned from /api/v1/label//values (default 100000) + The maximum number of tag values returned from /api/v1/label//values (default 100000) -search.maxUniqueTimeseries int - The maximum number of unique time series each search can scan. This option allows limiting memory usage (default 300000) + The maximum number of unique time series each search can scan. This option allows limiting memory usage (default 300000) -search.minStalenessInterval duration - The minimum interval for staleness calculations. This flag could be useful for removing gaps on graphs generated from time series with irregular intervals between samples. See also '-search.maxStalenessInterval' + The minimum interval for staleness calculations. This flag could be useful for removing gaps on graphs generated from time series with irregular intervals between samples. See also '-search.maxStalenessInterval' -search.noStaleMarkers - Set this flag to true if the database doesn't contain Prometheus stale markers, so there is no need in spending additional CPU time on its handling. Staleness markers may exist only in data obtained from Prometheus scrape targets + Set this flag to true if the database doesn't contain Prometheus stale markers, so there is no need in spending additional CPU time on its handling. 
Staleness markers may exist only in data obtained from Prometheus scrape targets -search.queryStats.lastQueriesCount int - Query stats for /api/v1/status/top_queries is tracked on this number of last queries. Zero value disables query stats tracking (default 20000) + Query stats for /api/v1/status/top_queries is tracked on this number of last queries. Zero value disables query stats tracking (default 20000) -search.queryStats.minQueryDuration duration - The minimum duration for queries to track in query stats at /api/v1/status/top_queries. Queries with lower duration are ignored in query stats (default 1ms) + The minimum duration for queries to track in query stats at /api/v1/status/top_queries. Queries with lower duration are ignored in query stats (default 1ms) -search.resetCacheAuthKey string - Optional authKey for resetting rollup cache via /internal/resetRollupResultCache call + Optional authKey for resetting rollup cache via /internal/resetRollupResultCache call -search.treatDotsAsIsInRegexps - Whether to treat dots as is in regexp label filters used in queries. For example, foo{bar=~"a.b.c"} will be automatically converted to foo{bar=~"a\\.b\\.c"}, i.e. all the dots in regexp filters will be automatically escaped in order to match only dot char instead of matching any char. Dots in ".+", ".*" and ".{n}" regexps aren't escaped. This option is DEPRECATED in favor of {__graphite__="a.*.c"} syntax for selecting metrics matching the given Graphite metrics filter + Whether to treat dots as is in regexp label filters used in queries. For example, foo{bar=~"a.b.c"} will be automatically converted to foo{bar=~"a\\.b\\.c"}, i.e. all the dots in regexp filters will be automatically escaped in order to match only dot char instead of matching any char. Dots in ".+", ".*" and ".{n}" regexps aren't escaped. 
This option is DEPRECATED in favor of {__graphite__="a.*.c"} syntax for selecting metrics matching the given Graphite metrics filter -selfScrapeInstance string - Value for 'instance' label, which is added to self-scraped metrics (default "self") + Value for 'instance' label, which is added to self-scraped metrics (default "self") -selfScrapeInterval duration - Interval for self-scraping own metrics at /metrics page + Interval for self-scraping own metrics at /metrics page -selfScrapeJob string - Value for 'job' label, which is added to self-scraped metrics (default "victoria-metrics") + Value for 'job' label, which is added to self-scraped metrics (default "victoria-metrics") -smallMergeConcurrency int - The maximum number of CPU cores to use for small merges. Default value is used if set to 0 + The maximum number of CPU cores to use for small merges. Default value is used if set to 0 -snapshotAuthKey string - authKey, which must be passed in query string to /snapshot* pages + authKey, which must be passed in query string to /snapshot* pages -sortLabels - Whether to sort labels for incoming samples before writing them to storage. This may be needed for reducing memory usage at storage when the order of labels in incoming samples is random. For example, if m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}. Enabled sorting for labels can slow down ingestion performance a bit + Whether to sort labels for incoming samples before writing them to storage. This may be needed for reducing memory usage at storage when the order of labels in incoming samples is random. For example, if m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}. Enabled sorting for labels can slow down ingestion performance a bit -storage.cacheSizeIndexDBDataBlocks size - Overrides max size for indexdb/dataBlocks cache. 
See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Overrides max size for indexdb/dataBlocks cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -storage.cacheSizeIndexDBIndexBlocks size - Overrides max size for indexdb/indexBlocks cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Overrides max size for indexdb/indexBlocks cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -storage.cacheSizeStorageTSID size - Overrides max size for storage/tsid cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Overrides max size for storage/tsid cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -storage.maxDailySeries int - The maximum number of unique series can be added to the storage during the last 24 hours. Excess series are logged and dropped. This can be useful for limiting series churn rate. See also -storage.maxHourlySeries + The maximum number of unique series can be added to the storage during the last 24 hours. Excess series are logged and dropped. This can be useful for limiting series churn rate. See also -storage.maxHourlySeries -storage.maxHourlySeries int - The maximum number of unique series can be added to the storage during the last hour. 
Excess series are logged and dropped. This can be useful for limiting series cardinality. See also -storage.maxDailySeries + The maximum number of unique series can be added to the storage during the last hour. Excess series are logged and dropped. This can be useful for limiting series cardinality. See also -storage.maxDailySeries -storage.minFreeDiskSpaceBytes size - The minimum free disk space at -storageDataPath after which the storage stops accepting new data - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 10000000) + The minimum free disk space at -storageDataPath after which the storage stops accepting new data + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 10000000) -storageDataPath string - Path to storage data (default "victoria-metrics-data") + Path to storage data (default "victoria-metrics-data") -tls - Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set + Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set -tlsCertFile string - Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated -tlsKeyFile string - Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS key. Used only if -tls is set. 
The provided key file is automatically re-read every second, so it can be dynamically updated -version - Show VictoriaMetrics version + Show VictoriaMetrics version ``` diff --git a/deployment/docker/README.md b/deployment/docker/README.md index d18f97771..8090f2a9d 100644 --- a/deployment/docker/README.md +++ b/deployment/docker/README.md @@ -9,6 +9,7 @@ For clustered version check [docker compose in cluster branch](https://github.co ## VictoriaMetrics VictoriaMetrics will be accessible on the following ports: + * `--graphiteListenAddr=:2003` * `--opentsdbListenAddr=:4242` * `--httpListenAddr=:8428` @@ -23,7 +24,7 @@ configuration `prometheus.yml` with listed targets for scraping. ## vmalert -vmalert evaluates alerting rules (`alerts.yml`) to track VictoriaMetrics +vmalert evaluates alerting rules (`alerts.yml`) to track VictoriaMetrics health state. It is connected with AlertManager for firing alerts, and with VictoriaMetrics for executing queries and storing alert's state. @@ -40,11 +41,13 @@ All notifications are blackholed according to `alertmanager.yml` config. To access service open following [link](http://localhost:3000). 
-Default creds: +Default credential: + * login - `admin` * password - `admin` Grafana is provisioned by default with following entities: + * VictoriaMetrics datasource * Prometheus datasource * VictoriaMetrics overview dashboard diff --git a/deployment/docker/docker-compose.yml b/deployment/docker/docker-compose.yml index b441b44f6..0889aece6 100644 --- a/deployment/docker/docker-compose.yml +++ b/deployment/docker/docker-compose.yml @@ -1,4 +1,4 @@ -version: '3.5' +version: "3.5" services: vmagent: container_name: vmagent @@ -11,8 +11,8 @@ services: - vmagentdata:/vmagentdata - ./prometheus.yml:/etc/prometheus/prometheus.yml command: - - '--promscrape.config=/etc/prometheus/prometheus.yml' - - '--remoteWrite.url=http://victoriametrics:8428/api/v1/write' + - "--promscrape.config=/etc/prometheus/prometheus.yml" + - "--remoteWrite.url=http://victoriametrics:8428/api/v1/write" networks: - vm_net restart: always @@ -29,11 +29,11 @@ services: volumes: - vmdata:/storage command: - - '--storageDataPath=/storage' - - '--graphiteListenAddr=:2003' - - '--opentsdbListenAddr=:4242' - - '--httpListenAddr=:8428' - - '--influxListenAddr=:8089' + - "--storageDataPath=/storage" + - "--graphiteListenAddr=:2003" + - "--opentsdbListenAddr=:4242" + - "--httpListenAddr=:8428" + - "--influxListenAddr=:8089" networks: - vm_net restart: always @@ -64,24 +64,24 @@ services: volumes: - ./alerts.yml:/etc/alerts/alerts.yml command: - - '--datasource.url=http://victoriametrics:8428/' - - '--remoteRead.url=http://victoriametrics:8428/' - - '--remoteWrite.url=http://victoriametrics:8428/' - - '--notifier.url=http://alertmanager:9093/' - - '--rule=/etc/alerts/*.yml' + - "--datasource.url=http://victoriametrics:8428/" + - "--remoteRead.url=http://victoriametrics:8428/" + - "--remoteWrite.url=http://victoriametrics:8428/" + - "--notifier.url=http://alertmanager:9093/" + - "--rule=/etc/alerts/*.yml" # display source of alerts in grafana - - '-external.url=http://127.0.0.1:3000' #grafana outside 
container - - '--external.alert.source=explore?orgId=1&left=["now-1h","now","VictoriaMetrics",{"expr":"{{$$expr|quotesEscape|crlfEscape|queryEscape}}"},{"mode":"Metrics"},{"ui":[true,true,true,"none"]}]' ## when copypaste the line be aware of '$$' for escaping in '$expr' networks: + - "--external.url=http://127.0.0.1:3000" #grafana outside container + - '--external.alert.source=explore?orgId=1&left=["now-1h","now","VictoriaMetrics",{"expr":"{{$$expr|quotesEscape|crlfEscape|queryEscape}}"},{"mode":"Metrics"},{"ui":[true,true,true,"none"]}]' ## when copypaste the line be aware of '$$' for escaping in '$expr' networks: - vm_net restart: always alertmanager: container_name: alertmanager - image: prom/alertmanager + image: prom/alertmanager volumes: - ./alertmanager.yml:/config/alertmanager.yml command: - - '--config.file=/config/alertmanager.yml' + - "--config.file=/config/alertmanager.yml" ports: - 9093:9093 networks: diff --git a/deployment/marketplace/digitialocean/one-click-droplet/README.md b/deployment/marketplace/digitialocean/one-click-droplet/README.md index 5d5ebbd37..2a6126474 100644 --- a/deployment/marketplace/digitialocean/one-click-droplet/README.md +++ b/deployment/marketplace/digitialocean/one-click-droplet/README.md @@ -4,9 +4,9 @@ VictoriaMetrics is a fast and scalable open source time series database and moni ## Description -VictoriaMetrics is a free [open source time series database](https://en.wikipedia.org/wiki/Time_series_database) (TSDB) and monitoring solution, designed to collect, store and process real-time metrics. +VictoriaMetrics is a free [open source time series database](https://en.wikipedia.org/wiki/Time_series_database) (TSDB) and monitoring solution, designed to collect, store and process real-time metrics. 
-It supports the [Prometheus](https://en.wikipedia.org/wiki/Prometheus_(software)) pull model and various push protocols ([Graphite](https://en.wikipedia.org/wiki/Graphite_(software)), [InfluxDB](https://en.wikipedia.org/wiki/InfluxDB), OpenTSDB) for data ingestion. It is optimized for storage with high-latency IO, low IOPS and time series with [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). +It supports the [Prometheus](https://en.wikipedia.org/wiki/Prometheus_(software)) pull model and various push protocols ([Graphite](https://en.wikipedia.org/wiki/Graphite_(software)), [InfluxDB](https://en.wikipedia.org/wiki/InfluxDB), OpenTSDB) for data ingestion. It is optimized for storage with high-latency IO, low IOPS and time series with [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate). For reading the data and evaluating alerting rules, VictoriaMetrics supports the PromQL, [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) and Graphite query languages. VictoriaMetrics Single is fully autonomous and can be used as a long-term storage for time series. @@ -16,7 +16,7 @@ For reading the data and evaluating alerting rules, VictoriaMetrics supports the ### Config -VictoriaMetrics configuration is located at `/etc/victoriametrics/single/scrape.yml` on the droplet. +VictoriaMetrics configuration is located at `/etc/victoriametrics/single/scrape.yml` on the droplet. This One Click app uses 8428, 2003, 4242 and 8089 ports to accept metrics from different protocols. It's recommended to disable ports for protocols which are not needed. [Ubuntu firewall](https://help.ubuntu.com/community/UFW) can be used to easily disable access for specific ports. ### Scraping metrics @@ -26,6 +26,7 @@ VictoriaMetrics supports metrics scraping in the same way as Prometheus does. Ch ### Sending metrics Besides scraping, VictoriaMetrics accepts write requests for various ingestion protocols. 
This One Click app supports the following protocols: + - [Datadog](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-datadog-agent), [Influx (telegraph)](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf), [JSON](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-data-in-json-line-format), [CSV](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-csv-data), [Prometheus](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-data-in-prometheus-exposition-format) on port :8428 - [Graphite (statsd)](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-graphite-compatible-agents-such-as-statsd) on port :2003 tcp/udp - [OpenTSDB](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-opentsdb-compatible-agents) on port :4242 @@ -51,4 +52,4 @@ Once the Droplet is created, you can use DigitalOcean's web console to start a s ```bash ssh root@your_droplet_public_ipv4 -``` \ No newline at end of file +``` diff --git a/docs/Articles.md b/docs/Articles.md index 9da4188f4..5eb94a845 100644 --- a/docs/Articles.md +++ b/docs/Articles.md @@ -59,7 +59,6 @@ See also [case studies](https://docs.victoriametrics.com/CaseStudies.html). 
* [VictoriaMetrics — creating the best remote storage for Prometheus](https://faun.pub/victoriametrics-creating-the-best-remote-storage-for-prometheus-5d92d66787ac) * [Anomaly Detection in VictoriaMetrics](https://victoriametrics.medium.com/anomaly-detection-in-victoriametrics-9528538786a7) - ### Benchmarks * [When size matters — benchmarking VictoriaMetrics vs Timescale and InfluxDB](https://valyala.medium.com/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4) @@ -71,7 +70,6 @@ See also [case studies](https://docs.victoriametrics.com/CaseStudies.html). * [Prometheus vs VictoriaMetrics benchmark on node-exporter metrics](https://valyala.medium.com/prometheus-vs-victoriametrics-benchmark-on-node-exporter-metrics-4ca29c75590f) * [Promscale vs VictoriaMetrics: resource usage on production workload](https://valyala.medium.com/promscale-vs-victoriametrics-resource-usage-on-production-workload-91c8e3786c03) - ### Technical articles * [How VictoriaMetrics makes instant snapshots for multi-terabyte time series data](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282) @@ -84,7 +82,6 @@ See also [case studies](https://docs.victoriametrics.com/CaseStudies.html). * [Why irate from Prometheus doesn't capture spikes](https://valyala.medium.com/why-irate-from-prometheus-doesnt-capture-spikes-45f9896d7832) * [VictoriaMetrics: PromQL compliance](https://medium.com/@romanhavronenko/victoriametrics-promql-compliance-d4318203f51e) - ### Tutorials, guides and how-to articles * [PromQL tutorial for beginners and humans](https://valyala.medium.com/promql-tutorial-for-beginners-9ab455142085) @@ -97,7 +94,6 @@ See also [case studies](https://docs.victoriametrics.com/CaseStudies.html). 
* [How to monitor Go applications with VictoriaMetrics](https://victoriametrics.medium.com/how-to-monitor-go-applications-with-victoriametrics-c04703110870) * [Prometheus storage: tech terms for humans](https://valyala.medium.com/prometheus-storage-technical-terms-for-humans-4ab4de6c3d48) - ### Other articles * [How ClickHouse inspired us to build a high performance time series database](https://www.youtube.com/watch?v=p9qjb_yoBro). See also [slides](https://docs.google.com/presentation/d/1SdFrwsyR-HMXfbzrY8xfDZH_Dg6E7E5NJ84tQozMn3w/edit?usp=sharing). diff --git a/docs/BestPractices.md b/docs/BestPractices.md index be47c5275..46deb5b2f 100644 --- a/docs/BestPractices.md +++ b/docs/BestPractices.md @@ -4,25 +4,22 @@ sort: 19 # VictoriaMetrics best practices - ## Install Recommendation It is recommended running the latest available release of VictoriaMetrics from [this page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases), since it contains all the bugfixes and enhancements. There is no need to tune VictoriaMetrics because it uses reasonable defaults for command-line flags. These flags are automatically adjusted for the available CPU and RAM resources. There is no need in Operating System tuning because VictoriaMetrics is optimized for default OS settings. The only option is to increase the limit on the [number of open files in the OS](https://medium.com/@muhammadtriwibowo/set-permanently-ulimit-n-open-files-in-ubuntu-4d61064429a), so VictoriaMetrics could accept more incoming connections and could keep open more data files. - ## Filesystem The recommended filesystem for VictoriaMetrics is [ext4](https://en.wikipedia.org/wiki/Ext4). If you plan to store more than 1TB of data on ext4 partition or plan to extend it to more than 16TB, then the following options are recommended to pass to mkfs.ext4: -``` +```sh mkfs.ext4 ... 
-O 64bit,huge_file,extent -T huge ``` VictoriaMetrics should work OK with other filesystems, including network filesystems such as [NFS](https://en.wikipedia.org/wiki/Network_File_System), [Amazon EFS](https://aws.amazon.com/efs/) and [Google Filestore](https://cloud.google.com/filestore). - ## Operation System VictoriaMetrics is production-ready for the following operating systems: @@ -35,7 +32,6 @@ Some VictoriaMetrics components ([vmagent](https://docs.victoriametrics.com/vmag VictoriaMetrics can run also on MacOS for testing and development purposes. - ## Upgrade procedure It is safe to upgrade VictoriaMetrics to new versions unless the [release notes](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) say otherwise. It is safe to skip multiple versions during the upgrade unless release notes say otherwise. It is recommended to perform regular upgrades to the latest version, since it may contain important bug fixes, performance optimizations or new features. @@ -48,12 +44,10 @@ The following steps must be performed during the upgrade / downgrade procedure: * Wait until the process stops. This can take a few seconds. * Start the upgraded VictoriaMetrics. - ## Backup Recommendations VictoriaMetrics supports backups via [vmbackup](https://docs.victoriametrics.com/vmbackup.html) and [vmrestore](https://docs.victoriametrics.com/vmrestore.html) tools. There is also [vmbackupmanager](https://docs.victoriametrics.com/vmbackupmanager.html), which simplifies backup automation. 
- ## Technical Support and Services There are the following channels for providing technical support for VictoriaMetrics: diff --git a/docs/CHANGELOG.md b/docs/CHANGELOG.md index e40a4bf9e..228310d0f 100644 --- a/docs/CHANGELOG.md +++ b/docs/CHANGELOG.md @@ -5,6 +5,7 @@ sort: 15 # CHANGELOG The following tip changes can be tested by building VictoriaMetrics components from the latest commits according to the following docs: + * [How to build single-node VictoriaMetrics](https://docs.victoriametrics.com/#how-to-build-from-sources) * [How to build cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#building-from-sources) * [How to build vmagent](https://docs.victoriametrics.com/vmagent.html#how-to-build-from-sources) @@ -40,7 +41,6 @@ See other changes introduced to vmalert [here](https://github.com/VictoriaMetric * BUGFIX: reduce the interval for checking for free disk space from 30 seconds to 1 second. This should reduce the probability of `no space left on device` panics when `-storage.minFreeDiskSpaceBytes` is set to too low values. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2305). * BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): prevent from panic at vmagent when importing a time series with big number of samples. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2335). Thanks to @bleedfish for discovering and fixing the issue. - ## [v1.74.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.74.0) Released at 03-03-2022 @@ -72,7 +72,6 @@ This rule is equivalent to less clear traditional one: * BUGFIX: properly handle [series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) containing a filter for multiple metric names plus a negative filter. For example, `{__name__=~"foo|bar",job!="baz"}` . 
Previously VictoriaMetrics could return series with `foo` or `bar` names and with `job="baz"`. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2238). * BUGFIX: [vmgateway](https://docs.victoriametrics.com/vmgateway.html): properly parse JWT tokens if they are encoded with [URL-safe base64 encoding](https://datatracker.ietf.org/doc/html/rfc4648#section-5). - ## [v1.73.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.73.1) Released at 22-02-2022 @@ -94,7 +93,6 @@ Released at 22-02-2022 * BUGFIX: update default value for `-promscrape.fileSDCheckInterval`, so it matches default duration used by Prometheus for checking for updates in `file_sd_configs`. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2187). Thanks to @corporate-gadfly for the fix. * BUGFIX: [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html): do not return partial responses from `vmselect` if at least a single `vmstorage` node was reachable and returned an app-level error. Such errors are usually related to cluster mis-configuration, so they must be returned to the caller instead of being masked by [partial responses](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#cluster-availability). Partial responses can be returned only if some of `vmstorage` nodes are unreachable during the query. This may help the following issues: [one](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1941), [two](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/678). - ## [v1.73.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.73.0) Released at 14-02-2022 @@ -136,7 +134,6 @@ Released at 14-02-2022 * BUGFIX: vmagent: properly display `zone` contents for `gce_sd_configs` section at `http://vmagent:8429/config` page. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2179). Thanks to @artifactori for the bugfix. 
* BUGFIX: vmagent: properly handle `all_tenants: true` config option at `openstack_sd_config`. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2182). - ## [v1.72.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.72.0) Released at 18-01-2022 @@ -178,12 +175,11 @@ Released at 18-01-2022 * BUGFIX: [vmui](https://docs.victoriametrics.com/#vmui): fix incorrect calculations for graph limits on y axis. This could result in incorrect graph rendering in some cases. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2037). * BUGFIX: [vmui](https://docs.victoriametrics.com/#vmui): fix handling for multi-line queries. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2039). - ## [v1.71.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.71.0) Released at 20-12-2021 -**Update notes:** deduplication logic was slightly changed on the release, which may cause extra +**Update notes:** deduplication logic was slightly changed on the release, which may cause extra [background merges](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282) for already existing data parts for installations with `-dedup.minScrapeInterval` flag value greater than 0. This process is intentionally limited by one CPU core, but still can result into increase of CPU usage until merges are finished. @@ -212,7 +208,6 @@ We recommend updating in "off-peak" time when load on the VictoriaMetrics is on * BUGFIX: de-duplicate data exported via [/api/v1/export/csv](https://docs.victoriametrics.com/#how-to-export-csv-data) by default if [deduplication](https://docs.victoriametrics.com/#deduplication) is enabled. The de-duplication can be disabled by passing `reduce_mem_usage=1` query arg to `/api/v1/export/csv`. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1837). 
* BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert.html): properly store [historical data](https://docs.victoriametrics.com/vmalert.html#rules-backfilling) to old Prometheus versions. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1943). - ## [v1.70.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.70.0) Released at 02-12-2021 @@ -242,7 +237,6 @@ Released at 02-12-2021 * BUGFIX: [vmui](https://docs.victoriametrics.com/#vmui): do not store the last query across vmui page reloads. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1694). * BUGFIX: [vmui](https://docs.victoriametrics.com/#vmui): fix `Cannot read properties of undefined` error at table view. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1797). - ## [v1.69.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.69.0) Released at 08-11-2021 @@ -265,7 +259,6 @@ Released at 08-11-2021 * BUGFIX: vmagent: properly display `proxy_url` config option at `http://vmagent:8429/config` page. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1755). * BUGFIX: fix tests for Apple M1. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1653). - ## [v1.68.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.68.0) Released at 22-10-2021 @@ -294,7 +287,6 @@ Released at 22-10-2021 * BUGFIX: vmagent: group scrape targets by the original job names at `http://vmagent:8429/targets` page like Prometheus does. Previously they were grouped by the job name after relabeling, which may result in unexpected empty target groups. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1707). * BUGFIX: [vmctl](https://docs.victoriametrics.com/vmctl.html): fix importing boolean fields from InfluxDB line protocol. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1709). 
- ## [v1.67.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.67.0) Released at 08-10-2021 @@ -312,7 +304,6 @@ Released at 08-10-2021 * BUGFIX: return proper values (zeroes) from [stddev_over_time](https://docs.victoriametrics.com/MetricsQL.html#stddev_over_time) and [stdvar_over_time](https://docs.victoriametrics.com/MetricsQL.html#stdvar_over_time) functions when the lookbehind window in square brackets contains only a single sample. Previously the sample value was incorrectly returned in this case. * BUGFIX: vminsert: fix uneven distribution of time series among storage nodes in [multi-level cluster setup](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multi-level-cluster-setup). See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1672). - ## [v1.66.2](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.66.2) Released at 23-09-2021 @@ -322,7 +313,6 @@ Released at 23-09-2021 * BUGFIX: vmalert: properly reload rule groups if only the `interval` config option is changed. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1641). * BUGFIX: properly handle `{__name__=~"prefix(suffix1|suffix2)",other_label="..."}` queries. They may return unexpected empty responses since v1.66.0. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1644). - ## [v1.66.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.66.1) Released at 22-09-2021 @@ -332,7 +322,6 @@ Released at 22-09-2021 * BUGFIX: vmselect: fix accessing [Graphite APIs](https://docs.victoriametrics.com/#graphite-api-usage). The access has been broken in v1.66.0, because `/graphite/*` path prefix accidentally clashed with `/graph*` path prefix used for VictoriaMetrics UI (aka `vmui`). * BUGFIX: fix parsing `regex: ` in relabeling rules (for example, `regex: true` or `regex: 123`). The bug has been introduced in v1.66.0. 
- ## [v1.66.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.66.0) Released at 20-09-2021 @@ -370,7 +359,6 @@ Released at 20-09-2021 * BUGFIX: fix non-repeatable results from `quantile_over_time()` function when the number of input samples exceeds 1000. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1612). * BUGFIX: vmagent: fix EC2 zone discovery when `filters` are specified in [ec2_sc_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#ec2_sd_config). See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1626). - ## [v1.65.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.65.0) Released at 01-09-2021 @@ -395,7 +383,6 @@ Released at 01-09-2021 * BUGFIX: [vmbackupmanager](https://docs.victoriametrics.com/vmbackupmanager.html): fix timeout error when snapshot takes longer than 10 seconds. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1571). * BUGFIX: properly parse OpenTSDB `put` messages with multiple spaces between message elements. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1574). Thanks to @envzhu for the fix. - ## [v1.64.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.64.1) Released at 19-08-2021 @@ -410,7 +397,6 @@ Released at 19-08-2021 * BUGFIX: upgrade base Docker image from Alpine 3.14.0 to Alpine 3.14.1 . This fixes potential security issues - see [Alpine 3.14.1 release notes](https://www.alpinelinux.org/posts/Alpine-3.14.1-released.html). * BUGFIX: disable overriding the lookbehind window `[d]` at `last_over_time(m[d])` if `d` is smaller than the interval between samples, since users don't expect implicit overriding of explicitly set `[d]` in `last_over_time(m[d])`. 
- ## [v1.64.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.64.0) Released at 15-08-2021 @@ -437,7 +423,6 @@ Released at 15-08-2021 * BUGFIX: vmui: fix layout when the query selects more than 27 time series. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1497). * BUGFIX: vmagent: restore highlighting in red for DOWN targets at `/targets` page. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1461). - ## [v1.63.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.63.0) Released at 15-07-2021 @@ -459,7 +444,6 @@ Released at 15-07-2021 * BUGFIX: vmalert: accept Prometheus-like durations in `interval` config option inside `group` section. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1444). * BUGFIX: properly update `vm_merge_need_free_disk_space` metric at `/metrics` page when there is no enough free disk space for performing optimal merges. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373). - ## [v1.62.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.62.0) Released at 25-06-2021 @@ -483,7 +467,6 @@ Released at 25-06-2021 * BUGFIX: vmselect: return the last timestamp for the max / min value from `tmax_over_time(m[d])` and `tmin_over_time(m[d])` [MetricsQL functions](https://docs.victoriametrics.com/MetricsQL.html) as most users expect. See also [this issue](https://github.com/prometheus/prometheus/issues/8966). * BUGFIX: vmselect: return the expected value for `increase_pure()` [MetricsQL function](https://docs.victoriametrics.com/MetricsQL.html) after a gap in a time series. Previously incorrect too big value could be returned after the gap from `increase_pure()`. - ## [v1.61.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.61.1) Released at 11-06-2021 @@ -491,7 +474,6 @@ Released at 11-06-2021 * BUGFIX: vmalert: fix recording rules, which were broken in v1.61.0. 
See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1369). * BUGFIX: reset the on-disk cache for mapping from the full metric name to an internal metric id (e.g. `metric_name{labels} -> internal_metric_id`) after deleting metrics via [delete API](https://docs.victoriametrics.com/#how-to-delete-time-series). This should prevent from possible inconsistent state after unclean shutdown. This [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347). - ## [v1.61.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.61.0) Released at 09-06-2021 @@ -510,7 +492,6 @@ Released at 09-06-2021 * BUGFIX: vmauth: do not panic on aborted http requests. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1353). * BUGFIX: properly generate `target` property for `*Series(foo.*.bar)` responses returned from [Graphite Render API](https://docs.victoriametrics.com/#graphite-render-api-usage). Previously the `target` contained the expanded list of series for `foo.*.bar`, e.g. `sumSeries(foo.a.bar,foo.b.bar,...foo.z.bar)`. Now VictoriaMetrics returns `sumSeries(foo.*.bar)` as a target in the same way as Graphite does. - ## [v1.60.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.60.0) Released at 24-05-2021 @@ -547,7 +528,6 @@ Released at 24-05-2021 * BUGFIX: vmalert: properly import default rules from OpenShift. See [this issue](https://github.com/VictoriaMetrics/operator/issues/243). * BUGFIX: reduce the probability of `the removal queue is full` panic when highly loaded VictoriaMetrics stores data on NFS. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313). - ## [v1.59.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.59.0) Released at 01-05-2021 @@ -565,7 +545,6 @@ Thanks to @johnseekins! * BUGFIX: vmagent: eliminate possible data race when obtaining value for the metric `vm_persistentqueue_bytes_pending`. 
The data race could result in incorrect value for this metric. * BUGFIX: vmstorage: remove empty directories on startup. Such directories can be left after unclean shutdown on NFS storage. Previously such directories could lead to crashloop until manually removed. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1142). - ## [v1.58.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.58.0) Released at 08-04-2021 @@ -593,7 +572,6 @@ Released at 08-04-2021 * BUGFIX: vmagent: properly discover `role: endpoints` and `role: endpointslices` targets in `kubernetes_sd_config`. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182). * BUGFIX: properly generate filename for `*.tar.gz` archive inside `_checksums.txt` file posted at [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases). See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1171). - ## [v1.57.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.57.1) Released at 30-03-2021 @@ -604,7 +582,6 @@ Released at 30-03-2021 * BUGFIX: vmselect: remove `-search.storageTimeout` command-line flag, since it has the same meaning as `-search.maxQueryDuration`. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711#issuecomment-808884995). * BUGFIX: vminsert: return back `type` label to per-tenant metric `vm_tenant_inserted_rows_total`. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/932). - ## [v1.57.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.57.0) Released at 29-03-2021 @@ -624,20 +601,19 @@ Released at 29-03-2021 * BUGFIX: prevent from infinite loop on `{__graphite__="..."}` filters when a metric name contains `*`, `{` or `[` chars. 
* BUGFIX: prevent from infinite loop in `/metrics/find` and `/metrics/expand` [Graphite Metrics API handlers](https://docs.victoriametrics.com/#graphite-metrics-api-usage) when they match metric names or labels with `*`, `{` or `[` chars. -* BUGFIX: do not merge duplicate time series during requests to `/api/v1/query`. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1141 -* BUGFIX: vmagent: properly handle `too old resource version` error messages from Kubernetes watch API. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1150 -* BUGFIX: vmagent: do not retry sending data blocks if remote storage returns `400 Bad Request` error. The number of dropped blocks due to such errors can be monitored with `vmagent_remotewrite_packets_dropped_total` metrics. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1149 +* BUGFIX: do not merge duplicate time series during requests to `/api/v1/query`. See <https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1141> +* BUGFIX: vmagent: properly handle `too old resource version` error messages from Kubernetes watch API. See <https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1150> +* BUGFIX: vmagent: do not retry sending data blocks if remote storage returns `400 Bad Request` error. The number of dropped blocks due to such errors can be monitored with `vmagent_remotewrite_packets_dropped_total` metrics. See <https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1149> * BUGFIX: properly calculate `summarize` and `*Series` functions in [Graphite Render API](https://docs.victoriametrics.com/#graphite-render-api-usage). - ## [v1.56.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.56.0) Released at 17-03-2021 * FEATURE: add the following functions to [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html): - - `histogram_avg(buckets)` - returns the average value for the given buckets. - - `histogram_stdvar(buckets)` - returns standard variance for the given buckets. - - `histogram_stddev(buckets)` - returns standard deviation for the given buckets. + * `histogram_avg(buckets)` - returns the average value for the given buckets.
+ * `histogram_stdvar(buckets)` - returns standard variance for the given buckets. + * `histogram_stddev(buckets)` - returns standard deviation for the given buckets. * FEATURE: export `vm_available_memory_bytes` and `vm_available_cpu_cores` metrics, which show the number of available RAM and available CPU cores for VictoriaMetrics apps. * FEATURE: export `vm_index_search_duration_seconds` histogram, which can be used for troubleshooting time series search performance. * FEATURE: vmagent: add ability to replicate scrape targets among `vmagent` instances in the cluster with `-promscrape.cluster.replicationFactor` command-line flag. See [these docs](https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets). @@ -651,23 +627,21 @@ Released at 17-03-2021 * FEATURE: listen for IPv6 UDP if `-enableTCP6` command-line flag is passed to VictoriaMetrics. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1131). * BUGFIX: vmagent: prevent from high CPU usage bug during failing scrapes with small `scrape_timeout` (less than a few seconds). -* BUGFIX: vmagent: reduce memory usage when Kubernetes service discovery is used in big number of distinct scrape config jobs by sharing Kubernetes object cache. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113 +* BUGFIX: vmagent: reduce memory usage when Kubernetes service discovery is used in big number of distinct scrape config jobs by sharing Kubernetes object cache. See <https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113> * BUGFIX: vmagent: apply `sample_limit` only after `metric_relabel_configs` are applied as Prometheus does. Previously the `sample_limit` was applied before metrics relabeling. -* BUGFIX: vmagent: properly apply `tls_config`, `basic_auth` and `bearer_token` to proxy connections if `proxy_url` option is set.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116 -* BUGFIX: vmagent: properly scrape targets via https proxy specified in `proxy_url` if `insecure_skip_verify` flag isn't set in `tls_config` section. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116 +* BUGFIX: vmagent: properly apply `tls_config`, `basic_auth` and `bearer_token` to proxy connections if `proxy_url` option is set. See <https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116> +* BUGFIX: vmagent: properly scrape targets via https proxy specified in `proxy_url` if `insecure_skip_verify` flag isn't set in `tls_config` section. See <https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116> * BUGFUX: avoid `duplicate time series` error if `prometheus_buckets()` covers a time range with distinct set of buckets. -* BUGFIX: prevent exponent overflow when processing extremely small values close to zero such as `2.964393875E-314`. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1114 +* BUGFIX: prevent exponent overflow when processing extremely small values close to zero such as `2.964393875E-314`. See <https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1114> * BUGFIX: do not include datapoints with a timestamp `t-d` when returning results from `/api/v1/query?query=m[d]&time=t` as Prometheus does. * BUGFIX: do not crash if a query contains `histogram_over_time()` function name with uppercase chars. For example, `Histogram_Over_Time(m[5m])`. - ## [v1.55.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.55.1) Released at 03-03-2021 -* BUGFIX: vmagent: fix a panic in Kubernetes service discovery when a target is filtered out with relabeling. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1107 -* BUGFIX: vmagent: fix Kubernetes service discovery for `role: ingress`. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1110 - +* BUGFIX: vmagent: fix a panic in Kubernetes service discovery when a target is filtered out with relabeling. See <https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1107> +* BUGFIX: vmagent: fix Kubernetes service discovery for `role: ingress`.
See <https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1110> ## [v1.55.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.55.0) @@ -693,26 +667,24 @@ Released at 02-03-2021 * FEATURE: vmalert: properly process query params in `-datasource.url` and `-remoteRead.url` command-line flags. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1087) for details. * BUGFIX: vmagent: properly apply `-remoteWrite.rateLimit` when `-remoteWrite.queues` is greater than 1. Previously there was a data race, which could prevent from proper rate limiting. -* BUGFIX: vmagent: properly perform graceful shutdown on `SIGINT` and `SIGTERM` signals. The graceful shutdown has been broken in `v1.54.0`. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1065 +* BUGFIX: vmagent: properly perform graceful shutdown on `SIGINT` and `SIGTERM` signals. The graceful shutdown has been broken in `v1.54.0`. See <https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1065> * BUGFIX: reduce the probability of `duplicate time series` errors when querying Kubernetes metrics. * BUGFIX: properly calculate `histogram_quantile()` over time series with only a single non-zero bucket with `{le="+Inf"}`. Previously `NaN` was returned, now the value for the last bucket before `{le="+Inf"}` is returned like Prometheus does. -* BUGFIX: vmselect: do not cache partial query results on timeout when receiving data from `vmstorage` nodes. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1085 +* BUGFIX: vmselect: do not cache partial query results on timeout when receiving data from `vmstorage` nodes. See <https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1085> * BUGFIX: properly handle `stale NFS file handle` error. -* BUGFIX: properly cache query results when `extra_label` query arg is used. Previously the cached results could clash for different `extra_label` values. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1095 -* BUGFIX: fix `http: superfluous response.WriteHeader call` issue.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1078 -* BUGFIX: fix arm64 builds due to the issue in `github.com/golang/snappy`. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1074 -* BUGFIX: fix `index out of range [1024819115206086200] with length 27` panic, which could occur when `1e-9` value is passed to VictoriaMetrics histogram. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1096 -* BUGFIX: fix parsing for Graphite line with empty tags such as `foo; 123 456`. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1100 +* BUGFIX: properly cache query results when `extra_label` query arg is used. Previously the cached results could clash for different `extra_label` values. See +* BUGFIX: fix `http: superfluous response.WriteHeader call` issue. See +* BUGFIX: fix arm64 builds due to the issue in `github.com/golang/snappy`. See +* BUGFIX: fix `index out of range [1024819115206086200] with length 27` panic, which could occur when `1e-9` value is passed to VictoriaMetrics histogram. See +* BUGFIX: fix parsing for Graphite line with empty tags such as `foo; 123 456`. See * BUGFIX: unescape only `\\`, `\n` and `\"` in label names when parsing Prometheus text exposition format as Prometheus does. Previously other escape sequences could be improperly unescaped. - ## [v1.54.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.54.1) Released at 18-02-2021 * BUGFIX: properly handle queries containing a filter on metric name plus any number of negative filters and zero non-negative filters. For example, `node_cpu_seconds_total{mode!="idle"}`. The bug was introduced in [v1.54.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.54.0). - ## [v1.54.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.54.0) Released at 18-02-2021 @@ -721,20 +693,19 @@ Released at 18-02-2021 * FEATURE: reduce execution times for `q1 q2` queries by executing `q1` and `q2` in parallel. 
* FEATURE: switch from Go1.15 to [Go1.16](https://golang.org/doc/go1.16) for building prod binaries. * FEATURE: single-node VictoriaMetrics now accepts requests to handlers with `/prometheus` and `/graphite` prefixes such as `/prometheus/api/v1/query`. This improves compatibility with [handlers from VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format). -* FEATURE: expose `process_open_fds` and `process_max_fds` metrics. These metrics can be used for alerting when `process_open_fds` reaches `process_max_fds`. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/402 and https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1037 +* FEATURE: expose `process_open_fds` and `process_max_fds` metrics. These metrics can be used for alerting when `process_open_fds` reaches `process_max_fds`. See and * FEATURE: vmalert: add `-datasource.appendTypePrefix` command-line option for querying both Prometheus and Graphite datasource in cluster version of VictoriaMetrics. See [these docs](https://docs.victoriametrics.com/vmalert.html#graphite) for details. -* FEATURE: vmauth: add ability to route requests from a single user to multiple destinations depending on the requested paths. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1064 +* FEATURE: vmauth: add ability to route requests from a single user to multiple destinations depending on the requested paths. See * FEATURE: remove dependency on external programs such as `cat`, `grep` and `cut` when detecting cpu and memory limits inside Docker or LXC container. * FEATURE: vmagent: add `__meta_kubernetes_endpoints_label_*`, `__meta_kubernetes_endpoints_labelpresent_*`, `__meta_kubernetes_endpoints_annotation_*` and `__meta_kubernetes_endpoints_annotationpresent_*` labels for `role: endpoints` in Kubernetes service discovery. These labels where added in Prometheus 2.25. 
* FEATURE: reduce the minimum supported retention period for inverted index (aka `indexdb`) from one month to one day. This should reduce disk space usage for `<-storageDataPath>/indexdb` folder if `-retentionPeriod` is set to values smaller than one month. -* FEATURE: vmselect: export per-tenant metrics `vm_vmselect_http_requests_total` and `vm_vmselect_http_requests_duration_ms_total` . Other per-tenant metrics are available as a part of [enterprise package](https://victoriametrics.com/products/enterprise/). See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/932 for details. +* FEATURE: vmselect: export per-tenant metrics `vm_vmselect_http_requests_total` and `vm_vmselect_http_requests_duration_ms_total` . Other per-tenant metrics are available as a part of [enterprise package](https://victoriametrics.com/products/enterprise/). See for details. * BUGFIX: properly convert regexp tag filters containing escaped dots to non-regexp tag filters. For example, `{foo=~"bar\.baz"}` should be converted to `{foo="bar.baz"}`. Previously it was incorrectly converted to `{foo="bar\.baz"}`, which could result in missing time series for this tag filter. -* BUGFIX: do not spam error logs when discovering Docker Swarm targets without dedicated IP. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1028 . +* BUGFIX: do not spam error logs when discovering Docker Swarm targets without dedicated IP. See . * BUGFIX: properly embed timezone data into VictoriaMetrics apps. This should fix `-loggerTimezone` usage inside Docker containers. * BUGFIX: properly build Docker images for non-amd64 architectures (arm, arm64, ppc64le, 386) on [Docker hub](https://hub.docker.com/u/victoriametrics/). Previously these images were incorrectly based on amd64 base image, so they didn't work. -* BUGFIX: vmagent: return back unsent block to the queue during graceful shutdown. Previously this block could be dropped if remote storage is unavailable during vmagent shutdown. 
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1065 . - +* BUGFIX: vmagent: return back unsent block to the queue during graceful shutdown. Previously this block could be dropped if remote storage is unavailable during vmagent shutdown. See . ## [v1.53.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.53.1) @@ -742,7 +713,6 @@ Released at 03-02-2021 * BUGFIX: vmselect: fix the bug peventing from proper searching by Graphite filter with wildcards such as `{__graphite__="foo.*.bar"}`. - ## [v1.53.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.53.0) Released at 03-02-2021 @@ -752,24 +722,23 @@ Released at 03-02-2021 * FEATURE: added `-search.maxStepForPointsAdjustment` command-line flag, which can be used for disabling adjustment for points returned by `/api/v1/query_range` handler if such points have timestamps closer than `-search.latencyOffset` to the current time. Such points may contain incomplete data, so they are substituted by the previous values for `step` query args smaller than one minute by default. * FEATURE: vmselect: added ability to use Graphite-compatible filters in MetricsQL via `{__graphite__="foo.*.bar"}` syntax. This expression is equivalent to `{__name__=~"foo[.][^.]*[.]bar"}`, but it works faster and it is easier to use when migrating from Graphite to VictoriaMetrics. This feature deprecates the usage of `-search.treatDotsAsIsInRegexps` command-line flag. * FEATURE: vmselect: added ability to set additional label filters, which must be applied during queries. Such label filters can be set via optional `extra_label` query arg, which is accepted by [querying API](https://docs.victoriametrics.com/#prometheus-querying-api-usage) handlers. For example, the request to `/api/v1/query_range?extra_label=tenant_id=123&query=` adds `{tenant_id="123"}` label filter to the given ``. It is expected that the `extra_label` query arg is automatically set by auth proxy sitting -in front of VictoriaMetrics. 
[Contact us](mailto:sales@victoriametrics.com) if you need assistance with such a proxy. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1021 . -* FEATURE: vmalert: added `-datasource.queryStep` command-line flag for passing optional `step` query arg to `/api/v1/query` endpoint. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1025 +in front of VictoriaMetrics. [Contact us](mailto:sales@victoriametrics.com) if you need assistance with such a proxy. See . +* FEATURE: vmalert: added `-datasource.queryStep` command-line flag for passing optional `step` query arg to `/api/v1/query` endpoint. See * FEATURE: vmalert: added ability to query Graphite datasource when evaluating alerting and recording rules. See [these docs](https://docs.victoriametrics.com/vmalert.html#graphite) for details. * FEATURE: vmagent: added `-remoteWrite.roundDigits` command-line option for rounding metric values to the given number of decimal digits after the point before sending the metric to the corresponding `-remoteWrite.url`. This option can be used for improving data compression on the remote storage, because values with lower number of decimal digits can be compressed better than values with bigger number of decimal digits. -* FEATURE: vmagent: added `-remoteWrite.rateLimit` command-line flag for limiting data transfer rate to `-remoteWrite.url`. This may be useful when big amounts of buffered data is sent after temporarily unavailability of the remote storage. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1035 +* FEATURE: vmagent: added `-remoteWrite.rateLimit` command-line flag for limiting data transfer rate to `-remoteWrite.url`. This may be useful when big amounts of buffered data is sent after temporarily unavailability of the remote storage. 
See * FEATURE: vmagent: export the following additional metrics, which may be useful during troubleshooting: - - `vm_promscrape_scrapes_failed_per_url_total` - - `vm_promscrape_scrapes_skipped_by_sample_limit_per_url_total` - - `vm_promscrape_discovery_requests_total` - - `vm_promscrape_discovery_retries_total` - - `vm_promscrape_scrape_retries_total` - - `vm_promscrape_service_discovery_duration_seconds` + * `vm_promscrape_scrapes_failed_per_url_total` + * `vm_promscrape_scrapes_skipped_by_sample_limit_per_url_total` + * `vm_promscrape_discovery_requests_total` + * `vm_promscrape_discovery_retries_total` + * `vm_promscrape_scrape_retries_total` + * `vm_promscrape_service_discovery_duration_seconds` * FEATURE: vmselect: initial implementation for [Graphite Render API](https://docs.victoriametrics.com/#graphite-render-api-usage). * BUGFIX: vmagent: reduce HTTP reconnection rate for scrape targets. Previously vmagent could errorneusly close HTTP keep-alive connections more frequently than needed. * BUGFIX: vmagent: retry scrape and service discovery requests when the remote server closes HTTP keep-alive connection. Previously `disable_keepalive: true` option could be used under `scrape_configs` section when working with such servers. - ## [v1.52.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.52.0) Released at 13-01-2021 @@ -777,34 +746,32 @@ Released at 13-01-2021 * FEATURE: provide a sample list of alerting rules for VictoriaMetrics components. It is available [here](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/alerts.yml). * FEATURE: disable final merge for data for the previous month at the beginning of new month, since it may result in high disk IO and CPU usage. Final merge can be enabled by setting `-finalMergeDelay` command-line flag to positive duration. 
* FEATURE: add `tfirst_over_time(m[d])` and `tlast_over_time(m[d])` functions to [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) for returning timestamps for the first and the last data point in `m` over `d` duration. -* FEATURE: add ability to pass multiple labels to `sort_by_label()` and `sort_by_label_desc()` functions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/992 . +* FEATURE: add ability to pass multiple labels to `sort_by_label()` and `sort_by_label_desc()` functions. See . * FEATURE: enforce at least TLS v1.2 when accepting HTTPS requests if `-tls`, `-tlsCertFile` and `-tlsKeyFile` command-line flags are set, because older TLS protocols such as v1.0 and v1.1 have been deprecated due to security vulnerabilities. * FEATURE: support `extra_label` query arg for all HTTP-based [data ingestion protocols](https://docs.victoriametrics.com/#how-to-import-time-series-data). This query arg can be used for specifying extra labels which should be added for the ingested data. -* FEATURE: vmbackup: increase backup chunk size from 128MB to 1GB. This should reduce the number of Object storage API calls during backups by 8x. This may also reduce costs, since object storage API calls usually have non-zero costs. See https://aws.amazon.com/s3/pricing/ and https://cloud.google.com/storage/pricing#operations-pricing . +* FEATURE: vmbackup: increase backup chunk size from 128MB to 1GB. This should reduce the number of Object storage API calls during backups by 8x. This may also reduce costs, since object storage API calls usually have non-zero costs. See and . -* BUGFIX: properly parse escaped unicode chars in MetricsQL metric names, label names and function names. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/990 +* BUGFIX: properly parse escaped unicode chars in MetricsQL metric names, label names and function names. 
See * BUGFIX: override user-provided labels with labels set in `extra_label` query args during data ingestion over HTTP-based protocols. -* BUGFIX: vmagent: prevent from `dialing to the given TCP address time out` error when scraping big number of unavailable targets. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/987 +* BUGFIX: vmagent: prevent from `dialing to the given TCP address time out` error when scraping big number of unavailable targets. See * BUGFIX: vmagent: properly show scrape duration on `/targets` page. Previously it was incorrectly shown as 0.000s. -* BUGFIX: vmagent: properly log errors when `-promscrape.streamParse` command-line flag is set. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1009 -* BUGFIX: vmagent: properly suppress errors when both `-promscrape.suppressScrapeErrors` and `-promscrape.streamParse` command-line flags are set. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1009 . -* BUGFIX: vmalert: return non-empty result in template func `query` stub to pass validation. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/989 . -* BUGFIX: upgrade base image for Docker packages from Alpine 3.12.1 to Alpine 3.12.3 in order to fix potential security issues. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1010 - +* BUGFIX: vmagent: properly log errors when `-promscrape.streamParse` command-line flag is set. See +* BUGFIX: vmagent: properly suppress errors when both `-promscrape.suppressScrapeErrors` and `-promscrape.streamParse` command-line flags are set. See . +* BUGFIX: vmalert: return non-empty result in template func `query` stub to pass validation. See . +* BUGFIX: upgrade base image for Docker packages from Alpine 3.12.1 to Alpine 3.12.3 in order to fix potential security issues. 
See ## [v1.51.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.51.0) Released at 27-12-2020 -* FEATURE: add `/api/v1/status/top_queries` handler, which returns the most frequently executed queries and queries that took the most time for execution. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/907 -* FEATURE: vmagent: add support for `proxy_url` config option in Prometheus scrape configs. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/503 +* FEATURE: add `/api/v1/status/top_queries` handler, which returns the most frequently executed queries and queries that took the most time for execution. See +* FEATURE: vmagent: add support for `proxy_url` config option in Prometheus scrape configs. See * FEATURE: remove parts with stale data as soon as they go outside the configured `-retentionPeriod`. Previously such parts may remain active for long periods of time. This should help reducing disk usage for `-retentionPeriod` smaller than one month. * FEATURE: vmalert: allow setting multiple values for `-notifier.tlsInsecureSkipVerify` command-line flag per each `-notifier.url`. -* BUGFIX: vmalert: properly escape multiline queries when passing them to Grafana. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/890 -* BUGFIX: vmagent: set missing `__meta_kubernetes_service_*` labels in `kubernetes_sd_config` for `endpoints` and `endpointslices` roles. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/982 -* BUGFIX: do not adjust `offset` value provided in MetricsQL query. Previously it could be modified in order to improve response cache hit ratio. This is unneeded, since cache hit ratio should remain good because the query time range should be already aligned to multiple of `step` values. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/976 - +* BUGFIX: vmalert: properly escape multiline queries when passing them to Grafana. 
See +* BUGFIX: vmagent: set missing `__meta_kubernetes_service_*` labels in `kubernetes_sd_config` for `endpoints` and `endpointslices` roles. See +* BUGFIX: do not adjust `offset` value provided in MetricsQL query. Previously it could be modified in order to improve response cache hit ratio. This is unneeded, since cache hit ratio should remain good because the query time range should be already aligned to multiple of `step` values. See ## [v1.50.2](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.50.2) @@ -812,10 +779,9 @@ Released at 19-12-2020 * FEATURE: do not publish duplicate Docker images with `-cluster` tag suffix for [vmagent](https://docs.victoriametrics.com/vmagent.html), [vmalert](https://docs.victoriametrics.com/vmalert.html), [vmauth](https://docs.victoriametrics.com/vmauth.html), [vmbackup](https://docs.victoriametrics.com/vmbackup.html) and [vmrestore](https://docs.victoriametrics.com/vmrestore.html), since they are identical to images without `-cluster` tag suffix. -* BUGFIX: vmalert: properly populate template variables. This has been broken in v1.50.0. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/974 +* BUGFIX: vmalert: properly populate template variables. This has been broken in v1.50.0. See * BUGFIX: properly parse negative combined duration in MetricsQL such as `-1h3m4s`. It must be parsed as `-(1h + 3m + 4s)`. Prevsiously it was parsed as `-1h + 3m + 4s`. -* BUGFIX: properly parse lines in [Prometheus exposition format](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md) and in [OpenMetrics format](https://github.com/OpenObservability/OpenMetrics/blob/master/specification/OpenMetrics.md) with whitespace after the timestamp. For example, `foo 123 456 ## some comment here`. 
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/970 - +* BUGFIX: properly parse lines in [Prometheus exposition format](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md) and in [OpenMetrics format](https://github.com/OpenObservability/OpenMetrics/blob/master/specification/OpenMetrics.md) with whitespace after the timestamp. For example, `foo 123 456 ## some comment here`. See ## [v1.50.1](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.50.1) @@ -825,13 +791,12 @@ Released at 15-12-2020 * BUGFIX: vmagent: properly delete unregistered scrape targets from `/targets` and `/api/v1/targets` pages. They weren't deleted due to the bug in `v1.50.0`. - ## [v1.50.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.50.0) Released at 15-12-2020 * FEATURE: automatically reset response cache when samples with timestamps older than `now - search.cacheTimestampOffset` are ingested to VictoriaMetrics. This makes unnecessary disabling response cache during data backfilling or resetting it after backfilling is complete as described [in these docs](https://docs.victoriametrics.com/#backfilling). This feature applies only to single-node VictoriaMetrics. It doesn't apply to cluster version of VictoriaMetrics because `vminsert` nodes don't know about `vmselect` nodes where the response cache must be reset. -* FEATURE: vmalert: add `query`, `first` and `value` functions to alert templates. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/539 +* FEATURE: vmalert: add `query`, `first` and `value` functions to alert templates. See * FEATURE: vmagent: return user-friendly HTML page when requesting `/targets` page from web browser. The page is returned in the old plaintext format when requesting via curl or similar tool. * FEATURE: allow multiple whitespace chars between measurements, fields and timestamp when parsing InfluxDB line protocol. 
Though [InfluxDB line protocol](https://docs.influxdata.com/influxdb/v1.8/write_protocols/line_protocol_tutorial/) denies multiple whitespace chars between these entities, @@ -841,25 +806,24 @@ Released at 15-12-2020 per each service discovery type. * FEATURE: vmagent: allow setting per-`-remoteWrite.url` command-line flags for `-remoteWrite.sendTimeout` and `-remoteWrite.tlsInsecureSkipVerify`. -* BUGFIX: properly handle `*` and `[...]` inside curly braces in query passed to Graphite Metrics API. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/952 +* BUGFIX: properly handle `*` and `[...]` inside curly braces in query passed to Graphite Metrics API. See * BUGFIX: vmagent: fix memory leak when big number of targets is discovered via service discovery. -* BUGFIX: vmagent: properly pass `datacenter` filter to Consul API server. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574#issuecomment-740454170 -* BUGFIX: properly handle CPU limits set on the host system or host container. The bugfix may result in lower memory usage on systems with CPU limits. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/946 -* BUGFIX: prevent from duplicate `name` tag returned from `/tags/autoComplete/tags` handler. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/942 -* BUGFIX: do not enable strict parsing for `-promscrape.config` if `-promscrape.config.dryRun` comand-line flag is set. Strict parsing can be enabled with `-promscrape.config.strictParse` command-line flag. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/944 -* BUGFIX: vminsert: properly update `vm_rpc_rerouted_rows_processed_total` metric. Previously it wasn't updated. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/955 -* BUGFIX: vmagent: properly recover when opening incorrectly stored persistent queue. 
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/964 -* BUGFIX: vmagent: properly handle scrape errors when stream parsing is enabled with `-promscrape.streamParse` command-line flag or with `stream_parse: true` per-target config option. Previously such errors weren't reported at `/targets` page. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/967 -* BUGFIX: assume the previous value is 0 when calculating `increase()` for the first point on the graph if its value doesn't exceed 100 and the delta between two first points equals to 0. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/962 - +* BUGFIX: vmagent: properly pass `datacenter` filter to Consul API server. See +* BUGFIX: properly handle CPU limits set on the host system or host container. The bugfix may result in lower memory usage on systems with CPU limits. See +* BUGFIX: prevent from duplicate `name` tag returned from `/tags/autoComplete/tags` handler. See +* BUGFIX: do not enable strict parsing for `-promscrape.config` if `-promscrape.config.dryRun` comand-line flag is set. Strict parsing can be enabled with `-promscrape.config.strictParse` command-line flag. See +* BUGFIX: vminsert: properly update `vm_rpc_rerouted_rows_processed_total` metric. Previously it wasn't updated. See +* BUGFIX: vmagent: properly recover when opening incorrectly stored persistent queue. See +* BUGFIX: vmagent: properly handle scrape errors when stream parsing is enabled with `-promscrape.streamParse` command-line flag or with `stream_parse: true` per-target config option. Previously such errors weren't reported at `/targets` page. See +* BUGFIX: assume the previous value is 0 when calculating `increase()` for the first point on the graph if its value doesn't exceed 100 and the delta between two first points equals to 0. 
See ## [v1.49.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.49.0) Released at 05-12-2020 -* FEATURE: optimize Consul service discovery speed when discovering big number of services. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574 +* FEATURE: optimize Consul service discovery speed when discovering big number of services. See * FEATURE: add `label_uppercase(q, label1, ... labelN)` and `label_lowercase(q, label1, ... labelN)` function to [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) - for uppercasing and lowercasing values for the given labels. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/936 + for uppercasing and lowercasing values for the given labels. See * FEATURE: add `count_eq_over_time(m[d], N)` and `count_ne_over_time(m[d], N)` for counting the number of samples for `m` over `d` that (equal / not equal) to `N`. * FEATURE: do not print usage info for all the command-line flags when incorrect command-line flag is passed. Previously it could be hard reading the error message about incorrect command-line flag because of too big usage info for all the flags. @@ -874,55 +838,55 @@ Released at 05-12-2020 * BUGFIX: return `nan` for `minute(m)` query when `m` equals to `nan` like Prometheus does. This applies to all the time-related functions such as `day_of_month`, `day_of_week`, `days_in_month`, `hour`, `month` and `year`. - ## [v1.48.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.48.0) Released at 26-11-2020 * FEATURE: added [Snap package for single-node VictoriaMetrics](https://snapcraft.io/victoriametrics). This simplifies installation under Ubuntu to a single command: + ```bash snap install victoriametrics ``` + * FEATURE: vmselect: add `-replicationFactor` command-line flag for reducing query duration when replication is enabled and a part of vmstorage nodes - are temporarily slow and/or temporarily unavailable. 
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711 + are temporarily slow and/or temporarily unavailable. See * FEATURE: vminsert: export `vm_rpc_vmstorage_is_reachable` metric, which can be used for monitoring reachability of vmstorage nodes from vminsert nodes. -* FEATURE: vmagent: add [Netflix Eureka](https://github.com/Netflix/eureka) service discovery (aka [eureka_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config)). See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/851 -* FEATURE: add `filters` option to `dockerswarm_sd_config` like Prometheus did in v2.23.0 - see https://github.com/prometheus/prometheus/pull/8074 +* FEATURE: vmagent: add [Netflix Eureka](https://github.com/Netflix/eureka) service discovery (aka [eureka_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config)). See +* FEATURE: add `filters` option to `dockerswarm_sd_config` like Prometheus did in v2.23.0 - see * FEATURE: expose `__meta_ec2_ipv6_addresses` label for `ec2_sd_config` like Prometheus will do in the next release. -* FEATURE: add `-loggerWarnsPerSecondLimit` command-line flag for rate limiting of WARN messages in logs. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/905 +* FEATURE: add `-loggerWarnsPerSecondLimit` command-line flag for rate limiting of WARN messages in logs. See * FEATURE: apply `loggerErrorsPerSecondLimit` and `-loggerWarnsPerSecondLimit` rate limit per caller. I.e. log messages are suppressed if the same caller logs the same message - at the rate exceeding the given limit. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/905#issuecomment-729395855 + at the rate exceeding the given limit. See * FEATURE: add remoteAddr to slow query log in order to simplify identifying the client that sends slow queries to VictoriaMetrics. Slow query logging is controlled with `-search.logSlowQueryDuration` command-line flag. 
-* FEATURE: add `/tags/delSeries` handler from Graphite Tags API. See https://docs.victoriametrics.com/#graphite-tags-api-usage +* FEATURE: add `/tags/delSeries` handler from Graphite Tags API. See * FEATURE: log metric name plus all its labels when the metric timestamp is out of the configured retention. This should simplify detecting the source of metrics with unexpected timestamps. * FEATURE: add `-dryRun` command-line flag to single-node VictoriaMetrics in order to check config file pointed by `-promscrape.config`. * BUGFIX: properly parse Prometheus metrics with [exemplars](https://github.com/OpenObservability/OpenMetrics/blob/master/OpenMetrics.md#exemplars-1) such as `foo 123 ## {bar="baz"} 1`. * BUGFIX: properly parse "infinity" values in [OpenMetrics format](https://github.com/OpenObservability/OpenMetrics/blob/master/OpenMetrics.md#abnf). - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/924 - + See ## [v1.47.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.47.0) Released at 16-11-2020 * FEATURE: vmselect: return the original error from `vmstorage` node in query response if `-search.denyPartialResponse` is set. - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/891 + See * FEATURE: vmselect: add `"isPartial":{true|false}` field in JSON output for `/api/v1/*` functions from [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/). `"isPartial":true` is set if the response contains partial data because of a part of `vmstorage` nodes were unavailable during query processing. * FEATURE: improve performance for `/api/v1/series`, `/api/v1/labels` and `/api/v1/label//values` on time ranges exceeding one day. * FEATURE: vmagent: reduce memory usage when service discovery detects big number of scrape targets and the set of discovered targets changes over time. 
- See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825 + See * FEATURE: vmagent: add `-promscrape.dropOriginalLabels` command-line option, which can be used for reducing memory usage when scraping big number of targets. - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825#issuecomment-724308361 for details. -* FEATURE: vmalert: explicitly set extra labels to alert entities. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/870 + See for details. +* FEATURE: vmalert: explicitly set extra labels to alert entities. See * FEATURE: add `-search.treatDotsAsIsInRegexps` command-line flag, which can be used for automatic escaping of dots in regexp label filters used in queries. For example, if `-search.treatDotsAsIsInRegexps` is set, then the query `foo{bar=~"aaa.bb.cc|dd.eee"}` is automatically converted to `foo{bar=~"aaa\\.bb\\.cc|dd\\.eee"}`. This may be useful for querying Graphite data. * FEATURE: consistently return text-based HTTP responses such as `plain/text` and `application/json` with `charset=utf-8`. - See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897 + See * FEATURE: update Go builder from v1.15.4 to v1.15.5. This should fix [these issues in Go](https://github.com/golang/go/issues?q=milestone%3AGo1.15.5+label%3ACherryPickApproved). * FEATURE: added `/internal/force_flush` http handler for flushing recently ingested data from in-memory buffers to persistent storage. See [troubleshooting docs](https://docs.victoriametrics.com/#troubleshooting) for more details. @@ -930,10 +894,9 @@ Released at 16-11-2020 See [these docs](https://docs.victoriametrics.com/#graphite-tags-api-usage) for details. * BUGFIX: do not return data points in the end of the selected time range for time series ending in the middle of the selected time range. 
- See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/887 and https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845 -* BUGFIX: remove spikes at the end of time series gaps for `increase()` or `delta()` functions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/894 -* BUGFIX: vminsert: properly return HTTP 503 status code when all the vmstorage nodes are unavailable. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/896 - + See and +* BUGFIX: remove spikes at the end of time series gaps for `increase()` or `delta()` functions. See +* BUGFIX: vminsert: properly return HTTP 503 status code when all the vmstorage nodes are unavailable. See ## [v1.46.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.46.0) @@ -943,73 +906,73 @@ Released at 07-11-2020 * FEATURE: reduce memory usage when query touches big number of time series. * FEATURE: vmagent: reduce memory usage when `kubernetes_sd_config` discovers big number of scrape targets (e.g. hundreds of thousands) and the majority of these targets (99%) are dropped during relabeling. Previously labels for all the dropped targets were displayed at `/api/v1/targets` page. Now only up to `-promscrape.maxDroppedTargets` such - targets are displayed. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/878 for details. + targets are displayed. See for details. * FEATURE: vmagent: reduce memory usage when scraping big number of targets with big number of temporary labels starting with `__`. - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825 + See * FEATURE: vmagent: add `/ready` HTTP endpoint, which returns 200 OK status code when all the service discovery has been initialized. - This may be useful during rolling upgrades. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/875 + This may be useful during rolling upgrades. See * BUGFIX: vmagent: eliminate data race when `-promscrape.streamParse` command-line is set. 
Previously this mode could result in scraped metrics with garbage labels. - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825#issuecomment-723198247 for details. + See for details. * BUGFIX: properly calculate `topk_*` and `bottomk_*` functions from [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) for time series with gaps. - See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/883 - + See ## [v1.45.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.45.0) Released at 02-11-2020 * FEATURE: allow setting `-retentionPeriod` smaller than one month. I.e. `-retentionPeriod=3d`, `-retentionPeriod=2w`, etc. is supported now. - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/173 -* FEATURE: optimize more cases according to https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusLabelNonOptimization . Now the following cases are optimized too: + See +* FEATURE: optimize more cases according to . Now the following cases are optimized too: * `rollup_func(foo{filters}[d]) op bar` -> `rollup_func(foo{filters}[d]) op bar{filters}` * `transform_func(foo{filters}) op bar` -> `transform_func(foo{filters}) op bar{filters}` * `num_or_scalar op foo{filters} op bar` -> `num_or_scalar op foo{filters} op bar{filters}` * FEATURE: improve time series search for queries with multiple label filters. I.e. `foo{label1="value", label2=~"regexp"}`. - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/781 + See * FEATURE: vmagent: add `stream parse` mode. This mode allows reducing memory usage when individual scrape targets expose tens of millions of metrics. For example, during scraping Prometheus in [federation](https://prometheus.io/docs/prometheus/latest/federation/) mode. See `-promscrape.streamParse` command-line option and `stream_parse: true` config option for `scrape_config` section in `-promscrape.config`. 
- See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825 and [troubleshooting docs for vmagent](https://docs.victoriametrics.com/vmagent.html#troubleshooting). + See and [troubleshooting docs for vmagent](https://docs.victoriametrics.com/vmagent.html#troubleshooting). * FEATURE: vmalert: add `-dryRun` command-line option for validating the provided config files without the need to start `vmalert` service. * FEATURE: accept optional third argument of string type at `topk_*` and `bottomk_*` functions. This is label name for additional time series to return with the sum of time series outside top/bottom K. See [MetricsQL docs](https://docs.victoriametrics.com/MetricsQL.html) for more details. * FEATURE: vmagent: expose `/api/v1/targets` page according to [the corresponding Prometheus API](https://prometheus.io/docs/prometheus/latest/querying/api/#targets). - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/643 + See * BUGFIX: vmagent: properly handle OpenStack endpoint ending with `v3.0` such as `https://ostack.example.com:5000/v3.0` - in the same way as Prometheus does. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/728#issuecomment-709914803 -* BUGFIX: drop trailing data points for time series with a single raw sample. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748 -* BUGFIX: do not drop trailing data points for instant queries to `/api/v1/query`. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845 -* BUGFIX: vmbackup: fix panic when `-origin` isn't specified. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/856 + in the same way as Prometheus does. See +* BUGFIX: drop trailing data points for time series with a single raw sample. See +* BUGFIX: do not drop trailing data points for instant queries to `/api/v1/query`. See +* BUGFIX: vmbackup: fix panic when `-origin` isn't specified. See * BUGFIX: vmalert: skip automatically added labels on alerts restore. 
Label `alertgroup` was introduced in [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/611) and automatically added to generated time series. By mistake, this new label wasn't correctly purged on restore event and affected alert's ID uniqueness. - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/870 -* BUGFIX: vmagent: fix panic at scrape error body formating. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/864 -* BUGFIX: vmagent: add leading missing slash to metrics path like Prometheus does. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/835 + See +* BUGFIX: vmagent: fix panic at scrape error body formating. See +* BUGFIX: vmagent: add leading missing slash to metrics path like Prometheus does. See * BUGFIX: vmagent: drop packet if remote storage returns 4xx status code. This make the behaviour consistent with Prometheus. - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/873 -* BUGFIX: vmagent: properly handle 301 redirects. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/869 - + See +* BUGFIX: vmagent: properly handle 301 redirects. See ## [v1.44.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.44.0) Released at 13-10-2020 -* FEATURE: automatically add missing label filters to binary operands as described at https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusLabelNonOptimization . +* FEATURE: automatically add missing label filters to binary operands as described at . This should improve performance for queries with missing label filters in binary operands. 
For example, the following query should work faster now, because it shouldn't fetch and discard time series for `node_filesystem_files_free` metric without matching labels for the left side of the expression: + ``` node_filesystem_files{ host="$host", mountpoint="/" } - node_filesystem_files_free ``` + * FEATURE: vmagent: add Docker Swarm service discovery (aka [dockerswarm_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config)). - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/656 + See * FEATURE: add ability to export data in CSV format. See [these docs](https://docs.victoriametrics.com/#how-to-export-csv-data) for details. * FEATURE: vmagent: add `-promscrape.suppressDuplicateScrapeTargetErrors` command-line flag for suppressing `duplicate scrape target` errors. - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651 and https://docs.victoriametrics.com/vmagent.html#troubleshooting . + See and . * FEATURE: vmagent: show original labels before relabeling is applied on `duplicate scrape target` errors. This should simplify debugging for incorrect relabeling. - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651 + See * FEATURE: vmagent: `/targets` page now accepts optional `show_original_labels=1` query arg for displaying original labels for each target before relabeling is applied. - This should simplify debugging for target relabeling configs. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651 + This should simplify debugging for target relabeling configs. See * FEATURE: add `-finalMergeDelay` command-line flag for configuring the delay before final merge for per-month partitions. The final merge is started after no new data is ingested into per-month partition during `-finalMergeDelay`. * FEATURE: add `vm_rows_added_to_storage_total` metric, which shows the total number of rows added to storage since app start. 
@@ -1018,41 +981,40 @@ Released at 13-10-2020 than `sum(rate(vm_rows_inserted_total))` if [replication](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#replication-and-data-safety) is enabled. * FEATURE: keep metric name after applying [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) functions, which don't change time series meaning. The list of such functions: - * `keep_last_value` - * `keep_next_value` - * `interpolate` - * `running_min` - * `running_max` - * `running_avg` - * `range_min` - * `range_max` - * `range_avg` - * `range_first` - * `range_last` - * `range_quantile` - * `smooth_exponential` - * `ceil` - * `floor` - * `round` - * `clamp_min` - * `clamp_max` - * `max_over_time` - * `min_over_time` - * `avg_over_time` - * `quantile_over_time` - * `mode_over_time` - * `geomean_over_time` - * `holt_winters` - * `predict_linear` - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/674 + * `keep_last_value` + * `keep_next_value` + * `interpolate` + * `running_min` + * `running_max` + * `running_avg` + * `range_min` + * `range_max` + * `range_avg` + * `range_first` + * `range_last` + * `range_quantile` + * `smooth_exponential` + * `ceil` + * `floor` + * `round` + * `clamp_min` + * `clamp_max` + * `max_over_time` + * `min_over_time` + * `avg_over_time` + * `quantile_over_time` + * `mode_over_time` + * `geomean_over_time` + * `holt_winters` + * `predict_linear` + See * BUGFIX: properly handle stale time series after K8S deployment. Previously such time series could be double-counted. - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748 + See * BUGFIX: return a single time series at max from `absent()` function like Prometheus does. -* BUGFIX: vmalert: accept days, weeks and years in `for: ` part of config like Prometheus does. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817 +* BUGFIX: vmalert: accept days, weeks and years in `for:` part of config like Prometheus does. 
See * BUGFIX: fix `mode_over_time(m[d])` calculations. Previously the function could return incorrect results. - ## [v1.43.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.43.0) Released at 06-10-2020 @@ -1060,20 +1022,19 @@ Released at 06-10-2020 * FEATURE: reduce CPU usage for repeated queries over sliding time window when no new time series are added to the database. Typical use cases: repeated evaluation of alerting rules in [vmalert](https://docs.victoriametrics.com/vmalert.html) or dashboard auto-refresh in Grafana. * FEATURE: vmagent: add OpenStack service discovery aka [openstack_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config). - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/728 . + See . * FEATURE: vmalert: make `-maxIdleConnections` configurable for datasource HTTP client. This option can be used for minimizing connection churn. - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/795 . + See . * FEATURE: add `-influx.maxLineSize` command-line flag for configuring the maximum size for a single InfluxDB line during parsing. - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/807 + See * BUGFIX: properly handle `inf` values during [background merge of LSM parts](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282). - Previously `Inf` values could result in `NaN` values for adjacent samples in time series. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/805 . -* BUGFIX: fill gaps on graphs for `range_*` and `running_*` functions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/806 . + Previously `Inf` values could result in `NaN` values for adjacent samples in time series. See . +* BUGFIX: fill gaps on graphs for `range_*` and `running_*` functions. See . 
* BUGFIX: make a copy of label with new name during relabeling with `action: labelmap` in the same way as Prometheus does. - Previously the original label name has been replaced. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/812 . + Previously the original label name has been replaced. See . * BUGFIX: support parsing floating-point timestamp like Graphite Carbon does. Such timestmaps are truncated to seconds. - ## [v1.42.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.42.0) Released at 30-09-2020 @@ -1085,27 +1046,26 @@ Released at 30-09-2020 connections to VictoriaMetrics or [vmagent](https://docs.victoriametrics.com/vmagent.html) in order to achieve the maximum data ingestion speed. * FEATURE: cluster: improve performance for data ingestion path from `vminsert` to `vmstorage` nodes. The maximum data ingestion performance for a single connection between `vminsert` and `vmstorage` node scales with the number of available CPU cores on `vmstorage` side. - This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/791 . + This should help with . * FEATURE: add ability to export / import data in native format via `/api/v1/export/native` and `/api/v1/import/native`. This is the most optimized approach for data migration between VictoriaMetrics instances. Both single-node and cluster instances are supported. - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/787#issuecomment-700632551 . + See . * FEATURE: add `reduce_mem_usage` query option to `/api/v1/export` in order to reduce memory usage during data export / import. See [these docs](https://docs.victoriametrics.com/#how-to-export-data-in-json-line-format) for details. * FEATURE: improve performance for `/api/v1/series` handler when it returns big number of time series. * FEATURE: add `vm_merge_need_free_disk_space` metric, which can be used for estimating the number of deferred background data merges due to the lack of free disk space. 
- See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686 . -* FEATURE: add OpenBSD support. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/785 . + See . +* FEATURE: add OpenBSD support. See . -* BUGFIX: properly apply `-search.maxStalenessInterval` command-line flag value. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/784 . -* BUGFIX: fix displaying data in Grafana tables. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/720 . +* BUGFIX: properly apply `-search.maxStalenessInterval` command-line flag value. See . +* BUGFIX: fix displaying data in Grafana tables. See . * BUGFIX: do not adjust the number of detected CPU cores found at `/sys/devices/system/cpu/online`. The adjustment was increasing the resulting GOMAXPROC by 1, which looked confusing to users. - See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685#issuecomment-698595309 . -* BUGFIX: vmagent: do not show `-remoteWrite.url` in initial logs if `-remoteWrite.showURL` isn't set. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/773 . + See . +* BUGFIX: vmagent: do not show `-remoteWrite.url` in initial logs if `-remoteWrite.showURL` isn't set. See . * BUGFIX: properly handle case when [/metrics/find](https://docs.victoriametrics.com/#graphite-metrics-api-usage) finds both a leaf and a node for the given `query=prefix.*`. In this case only the node must be returned with stripped dot in the end of id as carbonapi does. - ## Previous releases See [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases). diff --git a/docs/CaseStudies.md b/docs/CaseStudies.md index 294982002..a047b6e08 100644 --- a/docs/CaseStudies.md +++ b/docs/CaseStudies.md @@ -7,33 +7,33 @@ sort: 11 Below please find public case studies and talks from VictoriaMetrics users. 
You can also join our [community Slack channel](https://slack.victoriametrics.com/) where you can chat with VictoriaMetrics users to get additional references, reviews and case studies. -* [AbiosGaming](#abiosgaming) -* [adidas](#adidas) -* [Adsterra](#adsterra) -* [ARNES](#arnes) -* [Brandwatch](#brandwatch) -* [CERN](#cern) -* [COLOPL](#colopl) -* [Dreamteam](#dreamteam) -* [Fly.io](#flyio) -* [German Research Center for Artificial Intelligence](#german-research-center-for-artificial-intelligence) -* [Grammarly](#grammarly) -* [Groove X](#groove-x) -* [Idealo.de](#idealode) -* [MHI Vestas Offshore Wind](#mhi-vestas-offshore-wind) -* [Percona](#percona) -* [Razorpay](#razorpay) -* [Sensedia](#sensedia) -* [Smarkets](#smarkets) -* [Synthesio](#synthesio) -* [Wedos.com](#wedoscom) -* [Wix.com](#wixcom) -* [Zerodha](#zerodha) -* [zhihu](#zhihu) +- [Case studies and talks](#case-studies-and-talks) + - [AbiosGaming](#abiosgaming) + - [adidas](#adidas) + - [Adsterra](#adsterra) + - [ARNES](#arnes) + - [Brandwatch](#brandwatch) + - [CERN](#cern) + - [COLOPL](#colopl) + - [Dreamteam](#dreamteam) + - [Fly.io](#flyio) + - [German Research Center for Artificial Intelligence](#german-research-center-for-artificial-intelligence) + - [Grammarly](#grammarly) + - [Groove X](#groove-x) + - [Idealo.de](#idealode) + - [MHI Vestas Offshore Wind](#mhi-vestas-offshore-wind) + - [Percona](#percona) + - [Razorpay](#razorpay) + - [Sensedia](#sensedia) + - [Smarkets](#smarkets) + - [Synthesio](#synthesio) + - [Wedos.com](#wedoscom) + - [Wix.com](#wixcom) + - [Zerodha](#zerodha) + - [zhihu](#zhihu) You can also read [articles about VictoriaMetrics from our users](https://docs.victoriametrics.com/Articles.html#third-party-articles-and-slides-about-victoriametrics). - ## AbiosGaming [AbiosGaming](https://abiosgaming.com/) provides industry leading esports data and technology across the globe. 
@@ -52,7 +52,6 @@ You can also read [articles about VictoriaMetrics from our users](https://docs.v See [the full article](https://abiosgaming.com/press/high-cardinality-aggregations/). - ## adidas See our [slides](https://promcon.io/2019-munich/slides/remote-write-storage-wars.pdf) and [video](https://youtu.be/OsH6gPdxR4s) @@ -118,7 +117,6 @@ We have 2 single-node instances of VictoriaMetrics. The first instance collects The second instance collects and stores low-resolution metrics (300s scrape interval) for a month. We use Promxy + Alertmanager for global view and alerts evaluation. - ## ARNES [The Academic and Research Network of Slovenia](https://www.arnes.si/en/) (ARNES) is a public institute that provides network services to research, @@ -139,7 +137,7 @@ the same result with far less maintenance overhead and lower hardware requiremen After testing it a few months and with great support from the maintainers on [Slack](https://slack.victoriametrics.com/), we decided to go with it. VM's support for the ingestion of InfluxDB metrics was an additional bonus as our hardware team uses -SNMPCollector to collect metrics from network devices and switching from InfluxDB to VictoriaMetrics required just a simple change in the config file. +SNMPCollector to collect metrics from network devices and switching from InfluxDB to VictoriaMetrics required just a simple change in the config file. Numbers: @@ -169,6 +167,7 @@ The engineering department at Brandwatch has been using InfluxDB to store applic but when End-of-Life of InfluxDB version 1.x was announced we decided to re-evaluate our entire metrics collection and storage stack. The main goals for the new metrics stack were: + - improved performance - lower maintenance - support for native clustering in open source version @@ -180,6 +179,7 @@ that made them unfit for our use case. Prometheus was also considered but it's p to include in the already significant change. 
Once we found VictoriaMetrics it solved the following problems: + - it is very lightweight and we can now run virtual machines instead of dedicated hardware machines for metrics storage - very short startup time and any possible gaps in data can easily be filled in using Promxy - we could continue using Telegraf as our metrics agent and ship identical metrics to both InfluxDB and VictoriaMetrics during the migration period (migration just about to start) @@ -211,24 +211,23 @@ of the [CMS](https://home.cern/science/experiments/cms) detector system. According to [published talk](https://indico.cern.ch/event/877333/contributions/3696707/attachments/1972189/3281133/CMS_mon_RD_for_opInt.pdf) VictoriaMetrics is used for the following purposes as a part of the "CMS Monitoring cluster": -* As a long-term storage for messages ingested from the [NATS messaging system](https://nats.io/). Ingested messages are pushed directly to VictoriaMetrics via HTTP protocol -* As a long-term storage for Prometheus monitoring system (30 days retention policy. There are plans to increase it up to ½ year) -* As a data source for visualizing metrics in Grafana. +- As a long-term storage for messages ingested from the [NATS messaging system](https://nats.io/). Ingested messages are pushed directly to VictoriaMetrics via HTTP protocol +- As a long-term storage for Prometheus monitoring system (30 days retention policy. There are plans to increase it up to ½ year) +- As a data source for visualizing metrics in Grafana. R&D topic: Evaluate VictoraMetrics vs InfluxDB for large cardinality data. Please also see [The CMS monitoring infrastructure and applications](https://arxiv.org/pdf/2007.03630.pdf) publication from CERN with details about their VictoriaMetrics usage. - ## COLOPL [COLOPL](http://www.colopl.co.jp/en/) is Japanese game development company. 
It started using VictoriaMetrics after evaulating the following remote storage solutions for Prometheus: -* Cortex -* Thanos -* M3DB -* VictoriaMetrics +- Cortex +- Thanos +- M3DB +- VictoriaMetrics See [slides](https://speakerdeck.com/inletorder/monitoring-platform-with-victoria-metrics) and [video](https://www.youtube.com/watch?v=hUpHIluxw80) from `Large-scale, super-load system monitoring platform built with VictoriaMetrics` talk at [Prometheus Meetup Tokyo #3](https://prometheus.connpass.com/event/157721/). @@ -239,10 +238,10 @@ from `Large-scale, super-load system monitoring platform built with VictoriaMetr Numbers: -* Active time series: from 350K to 725K -* Total number of time series: from 100M to 320M -* Total number of datapoints: from 120 billions to 155 billions -* Retention period: 3 months +- Active time series: from 350K to 725K +- Total number of time series: from 100M to 320M +- Total number of datapoints: from 120 billions to 155 billions +- Retention period: 3 months VictoriaMetrics in production environment runs on 2 M5 EC2 instances in "HA" mode, managed by Terraform and Ansible TF module. 2 Prometheus instances are writing to both VMs, with 2 [Promxy](https://github.com/jacksontj/promxy) replicas @@ -260,7 +259,6 @@ as the load balancer for reads. See [the full post](https://fly.io/blog/measuring-fly/). - ## German Research Center for Artificial Intelligence [German Research Center for Artificial Intelligence](https://en.wikipedia.org/wiki/German_Research_Centre_for_Artificial_Intelligence) (DFKI) is one of the world's largest nonprofit contract research institutes for software technology based on artificial intelligence (AI) methods. DFKI was founded in 1988, and has facilities in the German cities of Kaiserslautern, Saarbrücken, Bremen and Berlin. 
@@ -296,7 +294,6 @@ Numbers: - CPU usage: 0.1 CPU cores - RAM usage: 2.8 GB - ## Grammarly [Grammarly](https://www.grammarly.com/) provides digital writing assistant that helps 30 million people and 30 thousand teams write more clearly and effectively every day. In building a product that scales across multiple platforms and devices, Grammarly works to empower users whenever and wherever they communicate. @@ -332,7 +329,6 @@ Numbers: - CPU usage: 12 CPU cores - RAM usage: 250 GB - ## Groove X [Groove X](https://groove-x.com/en/) designs and produces robotics solutions. Its mission is to bring out humanity’s full potential through robotics. @@ -373,7 +369,6 @@ Numbers: - Retention: 13 months - Size of all datapoints: 3.5 TB - ## MHI Vestas Offshore Wind The mission of [MHI Vestas Offshore Wind](http://www.mhivestasoffshore.com) is to co-develop offshore wind as an economically viable and sustainable energy resource to benefit future generations. @@ -388,14 +383,12 @@ Numbers with current, limited roll out: - Data size on disk: 800 GiB - Retention period: 3 years - ## Percona [Percona](https://www.percona.com/) is a leader in providing best-of-breed enterprise-class support, consulting, managed services, training and software for MySQL®, MariaDB®, MongoDB®, PostgreSQL® and other open source databases in on-premises and cloud environments. Percona migrated from Prometheus to VictoriaMetrics in the [Percona Monitoring and Management](https://www.percona.com/software/database-tools/percona-monitoring-and-management) product. 
This allowed [reducing resource usage](https://www.percona.com/blog/2020/12/23/observations-on-better-resource-usage-with-percona-monitoring-and-management-v2-12-0/) and [getting rid of complex firewall setup](https://www.percona.com/blog/2020/12/01/foiled-by-the-firewall-a-tale-of-transition-from-prometheus-to-victoriametrics/), while [improving user experience](https://www.percona.com/blog/2020/02/28/better-prometheus-rate-function-with-victoriametrics/). - ## Razorpay [Razorpay](https://razorpay.com/) aims to revolutionize money management for online businesses by providing clean, developer-friendly APIs and hassle-free integration. @@ -405,18 +398,18 @@ Percona migrated from Prometheus to VictoriaMetrics in the [Percona Monitoring a > We executed a variety of POCs on various solutions and finally arrived at the following technologies: M3DB, Thanos, Cortex and VictoriaMetrics. The clear winner was VictoriaMetrics. > The following are some of the basic observations we derived from Victoria Metrics: -> * Simple components, each horizontally scalable. -> * Clear separation between writes and reads. -> * Runs from default configurations, with no extra frills. -> * Default retention starts with 1 month -> * Storage, ingestion, and reads can be easily scaled. -> * High Compression store ~ 70% more compression. -> * Currently running in production with commodity hardware with a good mix of spot instances. -> * Successfully ran some of the worst Grafana dashboards/queries that have historically failed to run. +> +> - Simple components, each horizontally scalable. +> - Clear separation between writes and reads. +> - Runs from default configurations, with no extra frills. +> - Default retention starts with 1 month +> - Storage, ingestion, and reads can be easily scaled. +> - High Compression store ~ 70% more compression. +> - Currently running in production with commodity hardware with a good mix of spot instances. 
+> - Successfully ran some of the worst Grafana dashboards/queries that have historically failed to run. See [the full article](https://engineering.razorpay.com/scaling-to-trillions-of-metric-data-points-f569a5b654f2). - ## Sensedia [Sensedia](https://www.sensedia.com) is a leading integration solutions provider with more than 120 enterprise clients across a range of sectors. Its world-class portfolio includes: an API Management Platform, Adaptive Governance, Events Hub, Service Mesh, Cloud Connectors and Strategic Professional Services' teams. @@ -432,6 +425,7 @@ See [the full article](https://engineering.razorpay.com/scaling-to-trillions-of- [Aecio dos Santos Pires](http://aeciopires.com), Cloud Architect, Sensedia. Numbers: + - Cluster mode - Active time series: 700K - Ingestion rate: 70K datapoints per second @@ -441,7 +435,6 @@ Numbers: - Churn rate: 3 million of new time series per day - Query response time (99th percentile): 500ms - ## Smarkets [Smarkets](https://smarkets.com/) simplifies peer-to-peer trading on sporting and political events. @@ -449,15 +442,15 @@ Numbers: > We always wanted our developers to have out-of-the-box monitoring available for any application or service. Before we adopted Kubernetes this was achieved either with Prometheus metrics, or with statsd being sent over to the underlying host and then converted into Prometheus metrics. As we expanded our Kubernetes adoption and started to split clusters, we also wanted developers to be able to expose metrics directly to Prometheus by annotating services. Those metrics were then only available inside the cluster so they couldn’t be scraped globally. > We considered three different solutions to improve our architecture: -> * Prometheus + Cortex -> * Prometheus + Thanos Receive -> * Prometheus + Victoria Metrics +> +> - Prometheus + Cortex +> - Prometheus + Thanos Receive +> - Prometheus + Victoria Metrics > We selected Victoria Metrics. 
Our new architecture has been very stable since it was put into production. With the previous setup we would have had two or three cardinality explosions in a two-week period, with this new one we have none. See [the full article](https://smarketshq.com/monitoring-kubernetes-clusters-41a4b24c19e3). - ## Synthesio [Synthesio](https://www.synthesio.com/) is the leading social intelligence tool for social media monitoring and analytics. @@ -465,6 +458,7 @@ See [the full article](https://smarketshq.com/monitoring-kubernetes-clusters-41a > We fully migrated from [Metrictank](https://grafana.com/oss/metrictank/) to VictoriaMetrics Numbers: + - Single node - Active time series: 5 millions - Datapoints: 1.25 trillions @@ -480,11 +474,11 @@ Numbers: Numbers: -* The number of acitve time series: 5M. -* Ingestion rate: 170K data points per second. -* Query duration: median is ~2ms, 99th percentile is ~50ms. +- The number of acitve time series: 5M. +- Ingestion rate: 170K data points per second. +- Query duration: median is ~2ms, 99th percentile is ~50ms. -> We like that VictoriaMetrics is simple to configuree and requires zero maintenance. It works right out of the box and once it's set up you can just forget about it. +> We like that VictoriaMetrics is simple to configuree and requires zero maintenance. It works right out of the box and once it's set up you can just forget about it. ## Wix.com @@ -494,24 +488,24 @@ Numbers: Numbers: -* The number of active time series per VictoriaMetrics instance is 50 millions. -* The total number of time series per VictoriaMetrics instance is 5000 million. -* Ingestion rate per VictoriaMetrics instance is 1.1 millions data points per second. -* The total number of datapoints per VictoriaMetrics instance is 8.5 trillion. -* The average churn rate is 150 millions new time series per day. -* The average query rate is ~150 per second (mostly alert queries). -* Query duration: median is ~1ms, 99th percentile is ~1sec. 
-* Retention period: 3 months. +- The number of active time series per VictoriaMetrics instance is 50 millions. +- The total number of time series per VictoriaMetrics instance is 5000 million. +- Ingestion rate per VictoriaMetrics instance is 1.1 millions data points per second. +- The total number of datapoints per VictoriaMetrics instance is 8.5 trillion. +- The average churn rate is 150 millions new time series per day. +- The average query rate is ~150 per second (mostly alert queries). +- Query duration: median is ~1ms, 99th percentile is ~1sec. +- Retention period: 3 months. > The alternatives that we tested prior to choosing VictoriaMetrics were: Prometheus federated, Cortex, IronDB and Thanos. > The items that were critical to us central tsdb, in order of importance were as follows: -* At least 3 month worth of retention. -* Raw data, no aggregation, no sampling. -* High query speed. -* Clean fail state for HA (multi-node clusters may return partial data resulting in false alerts). -* Enough headroom/scaling capacity for future growth which is planned to be up to 100M active time series. -* Ability to split DB replicas per workload. Alert queries go to one replica and user queries go to another (speed for users, effective cache). +- At least 3 month worth of retention. +- Raw data, no aggregation, no sampling. +- High query speed. +- Clean fail state for HA (multi-node clusters may return partial data resulting in false alerts). +- Enough headroom/scaling capacity for future growth which is planned to be up to 100M active time series. +- Ability to split DB replicas per workload. Alert queries go to one replica and user queries go to another (speed for users, effective cache). > Optimizing for those points and our specific workload, VictoriaMetrics proved to be the best option. As icing on the cake we’ve got [PromQL extensions](https://docs.victoriametrics.com/MetricsQL.html) - `default 0` and `histogram` are my favorite ones. 
We really like having a lot of tsdb params easily available via config options which makes tsdb easy to tune for each specific use case. We've also found a great community in [Slack channel](https://slack.victoriametrics.com/) and responsive and helpful maintainer support. @@ -521,24 +515,23 @@ Alex Ulstein, Head of Monitoring, Wix.com [Zerodha](https://zerodha.com/) is India's largest stock broker. The monitoring team at Zerodha had the following requirements: -* Multiple K8s clusters to monitor -* Consistent monitoring infra for each cluster across the fleet -* The ability to handle billions of timeseries events at any point of time -* Easy to operate and cost effective +- Multiple K8s clusters to monitor +- Consistent monitoring infra for each cluster across the fleet +- The ability to handle billions of timeseries events at any point of time +- Easy to operate and cost effective Thanos, Cortex and VictoriaMetrics were evaluated as a long-term storage for Prometheus. VictoriaMetrics has been selected for the following reasons: -* Blazingly fast benchmarks for a single node setup. -* Single binary mode. Easy to scale vertically with far fewer operational headaches. -* Considerable [improvements on creating Histograms](https://medium.com/@valyala/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350). -* [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) gives us the ability to extend PromQL with more aggregation operators. -* The API is compatible with Prometheus and nearly all standard PromQL queries work well out of the box. -* Handles storage well, with periodic compaction which makes it easy to take snapshots. +- Blazingly fast benchmarks for a single node setup. +- Single binary mode. Easy to scale vertically with far fewer operational headaches. +- Considerable [improvements on creating Histograms](https://medium.com/@valyala/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350). 
+- [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) gives us the ability to extend PromQL with more aggregation operators. +- The API is compatible with Prometheus and nearly all standard PromQL queries work well out of the box. +- Handles storage well, with periodic compaction which makes it easy to take snapshots. Please see [Monitoring K8S with VictoriaMetrics](https://docs.google.com/presentation/d/1g7yUyVEaAp4tPuRy-MZbPXKqJ1z78_5VKuV841aQfsg/edit) slides, [video](https://youtu.be/ZJQYW-cFOms) and [Infrastructure monitoring with Prometheus at Zerodha](https://zerodha.tech/blog/infra-monitoring-at-zerodha/) blog post for more details. - ## zhihu [zhihu](https://www.zhihu.com) is the largest Chinese question-and-answer website. We use VictoriaMetrics to store and use Graphite metrics. We shared the [promate](https://github.com/zhihu/promate) solution in our [单机 20 亿指标,知乎 Graphite 极致优化!](https://qcon.infoq.cn/2020/shenzhen/presentation/2881)([slides](https://static001.geekbang.org/con/76/pdf/828698018/file/%E5%8D%95%E6%9C%BA%2020%20%E4%BA%BF%E6%8C%87%E6%A0%87%EF%BC%8C%E7%9F%A5%E4%B9%8E%20Graphite%20%E6%9E%81%E8%87%B4%E4%BC%98%E5%8C%96%EF%BC%81-%E7%86%8A%E8%B1%B9.pdf)) talk at [QCon 2020](https://qcon.infoq.cn/2020/shenzhen/). diff --git a/docs/Cluster-VictoriaMetrics.md b/docs/Cluster-VictoriaMetrics.md index ce9446e6f..8122dbedb 100644 --- a/docs/Cluster-VictoriaMetrics.md +++ b/docs/Cluster-VictoriaMetrics.md @@ -16,7 +16,6 @@ Single-node version is easier to configure and operate comparing to cluster vers Join [our Slack](https://slack.victoriametrics.com/) or [contact us](mailto:info@victoriametrics.com) with consulting and support questions. - ## Prominent features - Supports all the features of [single-node version](https://github.com/VictoriaMetrics/VictoriaMetrics). 
@@ -24,7 +23,6 @@ Join [our Slack](https://slack.victoriametrics.com/) or [contact us](mailto:info - Supports multiple independent namespaces for time series data (aka multi-tenancy). See [these docs for details](#multitenancy). - Supports replication. See [these docs for details](#replication-and-data-safety). - ## Architecture overview VictoriaMetrics cluster consists of the following services: @@ -40,27 +38,25 @@ It increases cluster availability, simplifies cluster maintenance and cluster sc - ## Multitenancy VictoriaMetrics cluster supports multiple isolated tenants (aka namespaces). Tenants are identified by `accountID` or `accountID:projectID`, which are put inside request urls. See [these docs](#url-format) for details. Some facts about tenants in VictoriaMetrics: -* Each `accountID` and `projectID` is identified by an arbitrary 32-bit integer in the range `[0 .. 2^32)`. +- Each `accountID` and `projectID` is identified by an arbitrary 32-bit integer in the range `[0 .. 2^32)`. If `projectID` is missing, then it is automatically assigned to `0`. It is expected that other information about tenants such as auth tokens, tenant names, limits, accounting, etc. is stored in a separate relational database. This database must be managed by a separate service sitting in front of VictoriaMetrics cluster such as [vmauth](https://docs.victoriametrics.com/vmauth.html) or [vmgateway](https://docs.victoriametrics.com/vmgateway.html). [Contact us](mailto:info@victoriametrics.com) if you need assistance with such service. -* Tenants are automatically created when the first data point is written into the given tenant. +- Tenants are automatically created when the first data point is written into the given tenant. -* Data for all the tenants is evenly spread among available `vmstorage` nodes. This guarantees even load among `vmstorage` nodes +- Data for all the tenants is evenly spread among available `vmstorage` nodes. 
This guarantees even load among `vmstorage` nodes when different tenants have different amounts of data and different query load. -* The database performance and resource usage doesn't depend on the number of tenants. It depends mostly on the total number of active time series in all the tenants. A time series is considered active if it received at least a single sample during the last hour or it has been touched by queries during the last hour. - -* VictoriaMetrics doesn't support querying multiple tenants in a single request. +- The database performance and resource usage doesn't depend on the number of tenants. It depends mostly on the total number of active time series in all the tenants. A time series is considered active if it received at least a single sample during the last hour or it has been touched by queries during the last hour. +- VictoriaMetrics doesn't support querying multiple tenants in a single request. ## Binaries @@ -69,16 +65,14 @@ See archives containing `cluster` word. Docker images for cluster version are available here: -- `vminsert` - https://hub.docker.com/r/victoriametrics/vminsert/tags -- `vmselect` - https://hub.docker.com/r/victoriametrics/vmselect/tags -- `vmstorage` - https://hub.docker.com/r/victoriametrics/vmstorage/tags - +- `vminsert` - <https://hub.docker.com/r/victoriametrics/vminsert/tags> +- `vmselect` - <https://hub.docker.com/r/victoriametrics/vmselect/tags> +- `vmstorage` - <https://hub.docker.com/r/victoriametrics/vmstorage/tags> ## Building from sources Source code for cluster version is available at [cluster branch](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster). - ### Production builds There is no need in installing Go on a host system since binaries are built @@ -91,6 +85,7 @@ make vminsert-prod vmselect-prod vmstorage-prod ``` Production binaries are built into statically linked binaries. They are put into `bin` folder with `-prod` suffixes: + ``` $ make vminsert-prod vmselect-prod vmstorage-prod $ ls -1 bin @@ -105,14 +100,13 @@ vmstorage-prod 2. Run `make` from [the repository root](https://github.com/VictoriaMetrics/VictoriaMetrics).
It should build `vmstorage`, `vmselect` and `vminsert` binaries and put them into the `bin` folder. - ### Building docker images Run `make package`. It will build the following docker images locally: -* `victoriametrics/vminsert:<PKG_TAG>` -* `victoriametrics/vmselect:<PKG_TAG>` -* `victoriametrics/vmstorage:<PKG_TAG>` +- `victoriametrics/vminsert:<PKG_TAG>` +- `victoriametrics/vmselect:<PKG_TAG>` +- `victoriametrics/vmstorage:<PKG_TAG>` `<PKG_TAG>` is auto-generated image tag, which depends on source code in [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics). The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package`. @@ -131,14 +125,15 @@ ROOT_IMAGE=scratch make package A minimal cluster must contain the following nodes: -* a single `vmstorage` node with `-retentionPeriod` and `-storageDataPath` flags -* a single `vminsert` node with `-storageNode=` -* a single `vmselect` node with `-storageNode=` +- a single `vmstorage` node with `-retentionPeriod` and `-storageDataPath` flags +- a single `vminsert` node with `-storageNode=` +- a single `vmselect` node with `-storageNode=` It is recommended to run at least two nodes for each service for high availability purposes. An http load balancer such as [vmauth](https://docs.victoriametrics.com/vmauth.html) or `nginx` must be put in front of `vminsert` and `vmselect` nodes. It must contain the following routing configs according to [the url format](#url-format): + - requests starting with `/insert` must be routed to port `8480` on `vminsert` nodes. - requests starting with `/select` must be routed to port `8481` on `vmselect` nodes. @@ -147,22 +142,19 @@ Ports may be altered by setting `-httpListenAddr` on the corresponding nodes. It is recommended setting up [monitoring](#monitoring) for the cluster.
The following tools can simplify cluster setup: -* [An example docker-compose config for VictoriaMetrics cluster](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/deployment/docker/docker-compose.yml) -* [Helm charts for VictoriaMetrics](https://github.com/VictoriaMetrics/helm-charts) -* [Kubernetes operator for VictoriaMetrics](https://github.com/VictoriaMetrics/operator) - +- [An example docker-compose config for VictoriaMetrics cluster](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/deployment/docker/docker-compose.yml) +- [Helm charts for VictoriaMetrics](https://github.com/VictoriaMetrics/helm-charts) +- [Kubernetes operator for VictoriaMetrics](https://github.com/VictoriaMetrics/operator) It is possible to manually set up a toy cluster on a single host. In this case every cluster component - `vminsert`, `vmselect` and `vmstorage` - must have distinct values for `-httpListenAddr` command-line flag. This flag specifies http address for accepting http requests for [monitoring](#monitoring) and [profiling](#profiling). `vmstorage` node must have distinct values for the following additional command-line flags in order to prevent resource usage clash: -* `-storageDataPath` - every `vmstorage` node must have a dedicated data storage. -* `-vminsertAddr` - every `vmstorage` node must listen for a distinct tcp address for accepting data from `vminsert` nodes. -* `-vmselectAddr` - every `vmstorage` node must listen for a distinct tcp address for accepting requests from `vmselect` nodes. - +- `-storageDataPath` - every `vmstorage` node must have a dedicated data storage. +- `-vminsertAddr` - every `vmstorage` node must listen for a distinct tcp address for accepting data from `vminsert` nodes. +- `-vmselectAddr` - every `vmstorage` node must listen for a distinct tcp address for accepting requests from `vmselect` nodes.
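
As a rough illustration of the single-host toy cluster described above, the following Python sketch only builds the command lines and does not start anything. The flags are the ones named in the text. The `toy_cluster_commands` helper, the `bin/` binary paths, the `/tmp/vm-toy` data directory, the `-retentionPeriod=1` value and the port numbering scheme are assumptions made for illustration.

```python
def toy_cluster_commands(storage_count=2, data_root="/tmp/vm-toy"):
    """Build command lines for a toy cluster on one host (nothing is started)."""
    if not 1 <= storage_count <= 10:
        # Keeps the assumed port ranges below from overlapping each other.
        raise ValueError("storage_count must be between 1 and 10")

    commands = []
    insert_targets = []  # vmstorage addresses that accept data from vminsert
    select_targets = []  # vmstorage addresses that serve vmselect requests

    for i in range(storage_count):
        # Assumed scheme: every vmstorage gets its own http, vminsert and vmselect port.
        http_port = 8482 + i
        insert_port = 8400 + 2 * i
        select_port = 8401 + 2 * i
        commands.append([
            "bin/vmstorage",
            "-retentionPeriod=1",  # placeholder value
            f"-storageDataPath={data_root}/vmstorage-{i}",  # dedicated data dir per node
            f"-httpListenAddr=:{http_port}",
            f"-vminsertAddr=:{insert_port}",
            f"-vmselectAddr=:{select_port}",
        ])
        insert_targets.append(f"127.0.0.1:{insert_port}")
        select_targets.append(f"127.0.0.1:{select_port}")

    # vminsert and vmselect receive one -storageNode flag per vmstorage node.
    commands.append(["bin/vminsert", "-httpListenAddr=:8480"]
                    + [f"-storageNode={addr}" for addr in insert_targets])
    commands.append(["bin/vmselect", "-httpListenAddr=:8481"]
                    + [f"-storageNode={addr}" for addr in select_targets])
    return commands


if __name__ == "__main__":
    for cmd in toy_cluster_commands():
        print(" ".join(cmd))
```

Printing the commands instead of running them makes it easy to check that no two components share an address before anything is launched.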
## mTLS protection By default `vminsert` and `vmselect` nodes use unencrypted connections to `vmstorage` nodes, since it is assumed that all the cluster components run in a protected environment. [Enterprise version of VictoriaMetrics](https://victoriametrics.com/products/enterprise/) provides optional support for [mTLS connections](https://en.wikipedia.org/wiki/Mutual_authentication#mTLS) between cluster components. Pass `-cluster.tls=true` command-line flag to `vminsert`, `vmselect` and `vmstorage` nodes in order to enable mTLS protection. Additionally, `vminsert` and `vmselect` must be configured with client-side certificates via `-cluster.tlsCertFile`, `-cluster.tlsKeyFile` command-line options. These certificates are verified by `vmstorage` when `vminsert` and `vmselect` dial `vmstorage`. An optional `-cluster.tlsCAFile` command-line flag can be set at `vminsert`, `vmselect` and `vmstorage` for verifying peer certificates issued with custom [certificate authority](https://en.wikipedia.org/wiki/Certificate_authority). - ### Environment variables Each flag values can be set thru environment variables by following these rules: @@ -172,11 +164,11 @@ Each flag values can be set thru environment variables by following these rules: - For repeating flags, an alternative syntax can be used by joining the different values into one using `,` as separator (for example `-storageNode -storageNode ` will translate to `storageNode=,`) - It is possible setting prefix for environment vars with `-envflag.prefix`. For instance, if `-envflag.prefix=VM_`, then env vars must be prepended with `VM_` - ## Monitoring All the cluster components expose various metrics in Prometheus-compatible format at `/metrics` page on the TCP port set in `-httpListenAddr` command-line flag. 
By default the following TCP ports are used: + - `vminsert` - 8480 - `vmselect` - 8481 - `vmstorage` - 8482 @@ -192,24 +184,22 @@ It is recommended setting up alerts in [vmalert](https://docs.victoriametrics.co `vmstorage` nodes automatically switch to readonly mode when the directory pointed by `-storageDataPath` contains less than `-storage.minFreeDiskSpaceBytes` of free space. `vminsert` nodes stop sending data to such nodes and start re-routing the data to the remaining `vmstorage` nodes. - - ## URL format -* URLs for data ingestion: `http://:8480/insert//`, where: +- URLs for data ingestion: `http://:8480/insert//`, where: - `` is an arbitrary 32-bit integer identifying namespace for data ingestion (aka tenant). It is possible to set it as `accountID:projectID`, where `projectID` is also arbitrary 32-bit integer. If `projectID` isn't set, then it equals to `0`. - `` may have the following values: - - `prometheus` and `prometheus/api/v1/write` - for inserting data with [Prometheus remote write API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write). - - `datadog/api/v1/series` - for inserting data with [DataDog submit metrics API](https://docs.datadoghq.com/api/latest/metrics/#submit-metrics). See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-datadog-agent) for details. - - `influx/write` and `influx/api/v2/write` - for inserting data with [InfluxDB line protocol](https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/). See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf) for details. - - `opentsdb/api/put` - for accepting [OpenTSDB HTTP /api/put requests](http://opentsdb.net/docs/build/html/api_http/put.html). This handler is disabled by default. It is exposed on a distinct TCP address set via `-opentsdbHTTPListenAddr` command-line flag. 
See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#sending-opentsdb-data-via-http-apiput-requests) for details. - - `prometheus/api/v1/import` - for importing data obtained via `api/v1/export` at `vmselect` (see below). - - `prometheus/api/v1/import/native` - for importing data obtained via `api/v1/export/native` on `vmselect` (see below). - - `prometheus/api/v1/import/csv` - for importing arbitrary CSV data. See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-csv-data) for details. - - `prometheus/api/v1/import/prometheus` - for importing data in [Prometheus text exposition format](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md#text-based-format) and in [OpenMetrics format](https://github.com/OpenObservability/OpenMetrics/blob/master/specification/OpenMetrics.md). See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-data-in-prometheus-exposition-format) for details. + - `prometheus` and `prometheus/api/v1/write` - for inserting data with [Prometheus remote write API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write). + - `datadog/api/v1/series` - for inserting data with [DataDog submit metrics API](https://docs.datadoghq.com/api/latest/metrics/#submit-metrics). See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-datadog-agent) for details. + - `influx/write` and `influx/api/v2/write` - for inserting data with [InfluxDB line protocol](https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/). See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf) for details. 
+ - `opentsdb/api/put` - for accepting [OpenTSDB HTTP /api/put requests](http://opentsdb.net/docs/build/html/api_http/put.html). This handler is disabled by default. It is exposed on a distinct TCP address set via `-opentsdbHTTPListenAddr` command-line flag. See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#sending-opentsdb-data-via-http-apiput-requests) for details. + - `prometheus/api/v1/import` - for importing data obtained via `api/v1/export` at `vmselect` (see below). + - `prometheus/api/v1/import/native` - for importing data obtained via `api/v1/export/native` on `vmselect` (see below). + - `prometheus/api/v1/import/csv` - for importing arbitrary CSV data. See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-csv-data) for details. + - `prometheus/api/v1/import/prometheus` - for importing data in [Prometheus text exposition format](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md#text-based-format) and in [OpenMetrics format](https://github.com/OpenObservability/OpenMetrics/blob/master/specification/OpenMetrics.md). See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-data-in-prometheus-exposition-format) for details. -* URLs for [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/): `http://:8481/select//prometheus/`, where: +- URLs for [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/): `http://:8481/select//prometheus/`, where: - `` is an arbitrary number identifying data namespace for the query (aka tenant) - `` may have the following values: - `api/v1/query` - performs [PromQL instant query](https://prometheus.io/docs/prometheus/latest/querying/api/#instant-queries). @@ -227,31 +217,31 @@ It is recommended setting up alerts in [vmalert](https://docs.victoriametrics.co which is returned in the response. 
- `api/v1/status/top_queries` - for listing the most frequently executed queries and queries taking the most duration. -* URLs for [Graphite Metrics API](https://graphite-api.readthedocs.io/en/latest/api.html#the-metrics-api): `http://:8481/select//graphite/`, where: - - `` is an arbitrary number identifying data namespace for query (aka tenant) - - `` may have the following values: - - `render` - implements Graphite Render API. See [these docs](https://graphite.readthedocs.io/en/stable/render_api.html). This functionality is available in [Enterprise package](https://victoriametrics.com/products/enterprise/). - - `metrics/find` - searches Graphite metrics. See [these docs](https://graphite-api.readthedocs.io/en/latest/api.html#metrics-find). - - `metrics/expand` - expands Graphite metrics. See [these docs](https://graphite-api.readthedocs.io/en/latest/api.html#metrics-expand). - - `metrics/index.json` - returns all the metric names. See [these docs](https://graphite-api.readthedocs.io/en/latest/api.html#metrics-index-json). - - `tags/tagSeries` - registers time series. See [these docs](https://graphite.readthedocs.io/en/stable/tags.html#adding-series-to-the-tagdb). - - `tags/tagMultiSeries` - register multiple time series. See [these docs](https://graphite.readthedocs.io/en/stable/tags.html#adding-series-to-the-tagdb). - - `tags` - returns tag names. See [these docs](https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags). - - `tags/` - returns tag values for the given ``. See [these docs](https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags). - - `tags/findSeries` - returns series matching the given `expr`. See [these docs](https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags). - - `tags/autoComplete/tags` - returns tags matching the given `tagPrefix` and/or `expr`. See [these docs](https://graphite.readthedocs.io/en/stable/tags.html#auto-complete-support). 
- - `tags/autoComplete/values` - returns tag values matching the given `valuePrefix` and/or `expr`. See [these docs](https://graphite.readthedocs.io/en/stable/tags.html#auto-complete-support). - - `tags/delSeries` - deletes series matching the given `path`. See [these docs](https://graphite.readthedocs.io/en/stable/tags.html#removing-series-from-the-tagdb). +- URLs for [Graphite Metrics API](https://graphite-api.readthedocs.io/en/latest/api.html#the-metrics-api): `http://:8481/select//graphite/`, where: + - `` is an arbitrary number identifying data namespace for query (aka tenant) + - `` may have the following values: + - `render` - implements Graphite Render API. See [these docs](https://graphite.readthedocs.io/en/stable/render_api.html). This functionality is available in [Enterprise package](https://victoriametrics.com/products/enterprise/). + - `metrics/find` - searches Graphite metrics. See [these docs](https://graphite-api.readthedocs.io/en/latest/api.html#metrics-find). + - `metrics/expand` - expands Graphite metrics. See [these docs](https://graphite-api.readthedocs.io/en/latest/api.html#metrics-expand). + - `metrics/index.json` - returns all the metric names. See [these docs](https://graphite-api.readthedocs.io/en/latest/api.html#metrics-index-json). + - `tags/tagSeries` - registers time series. See [these docs](https://graphite.readthedocs.io/en/stable/tags.html#adding-series-to-the-tagdb). + - `tags/tagMultiSeries` - register multiple time series. See [these docs](https://graphite.readthedocs.io/en/stable/tags.html#adding-series-to-the-tagdb). + - `tags` - returns tag names. See [these docs](https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags). + - `tags/` - returns tag values for the given ``. See [these docs](https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags). + - `tags/findSeries` - returns series matching the given `expr`. See [these docs](https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags). 
+ - `tags/autoComplete/tags` - returns tags matching the given `tagPrefix` and/or `expr`. See [these docs](https://graphite.readthedocs.io/en/stable/tags.html#auto-complete-support). + - `tags/autoComplete/values` - returns tag values matching the given `valuePrefix` and/or `expr`. See [these docs](https://graphite.readthedocs.io/en/stable/tags.html#auto-complete-support). + - `tags/delSeries` - deletes series matching the given `path`. See [these docs](https://graphite.readthedocs.io/en/stable/tags.html#removing-series-from-the-tagdb). -* URL with basic Web UI: `http://:8481/select//vmui/`. +- URL with basic Web UI: `http://:8481/select//vmui/`. -* URL for query stats across all tenants: `http://:8481/api/v1/status/top_queries`. It lists with the most frequently executed queries and queries taking the most duration. +- URL for query stats across all tenants: `http://:8481/api/v1/status/top_queries`. It lists with the most frequently executed queries and queries taking the most duration. -* URL for time series deletion: `http://:8481/delete//prometheus/api/v1/admin/tsdb/delete_series?match[]=`. +- URL for time series deletion: `http://:8481/delete//prometheus/api/v1/admin/tsdb/delete_series?match[]=`. Note that the `delete_series` handler should be used only in exceptional cases such as deletion of accidentally ingested incorrect time series. It shouldn't be used on a regular basis, since it carries non-zero overhead. -* `vmstorage` nodes provide the following HTTP endpoints on `8482` port: +- `vmstorage` nodes provide the following HTTP endpoints on `8482` port: - `/internal/force_merge` - initiate [forced compactions](https://docs.victoriametrics.com/#forced-merge) on the given `vmstorage` node. - `/snapshot/create` - create [instant snapshot](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282), which can be used for backups in background. 
Snapshots are created in `/snapshots` folder, where `` is the corresponding @@ -263,17 +253,16 @@ It is recommended setting up alerts in [vmalert](https://docs.victoriametrics.co Snapshots may be created independently on each `vmstorage` node. There is no need in synchronizing snapshots' creation across `vmstorage` nodes. - ## Cluster resizing and scalability Cluster performance and capacity scales with adding new nodes. -* `vminsert` and `vmselect` nodes are stateless and may be added / removed at any time. +- `vminsert` and `vmselect` nodes are stateless and may be added / removed at any time. Do not forget updating the list of these nodes on http load balancer. Adding more `vminsert` nodes scales data ingestion rate. See [this comment](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/175#issuecomment-536925841) about ingestion rate scalability. Adding more `vmselect` nodes scales select queries rate. -* `vmstorage` nodes own the ingested data, so they cannot be removed without data loss. +- `vmstorage` nodes own the ingested data, so they cannot be removed without data loss. Adding more `vmstorage` nodes scales cluster capacity. Steps to add `vmstorage` node: @@ -282,7 +271,6 @@ Steps to add `vmstorage` node: 2. Gradually restart all the `vmselect` nodes with new `-storageNode` arg containing ``. 3. Gradually restart all the `vminsert` nodes with new `-storageNode` arg containing ``. - ## Updating / reconfiguring cluster nodes All the node types - `vminsert`, `vmselect` and `vmstorage` - may be updated via graceful shutdown. @@ -294,11 +282,10 @@ the update process. See [cluster availability](#cluster-availability) section fo See also more advanced [cardinality limiter in vmagent](https://docs.victoriametrics.com/vmagent.html#cardinality-limiter). - ## Cluster availability -* HTTP load balancer must stop routing requests to unavailable `vminsert` and `vmselect` nodes. 
-* The cluster remains available if at least a single `vmstorage` node exists: +- HTTP load balancer must stop routing requests to unavailable `vminsert` and `vmselect` nodes. +- The cluster remains available if at least a single `vmstorage` node exists: - `vminsert` re-routes incoming data from unavailable `vmstorage` nodes to healthy `vmstorage` nodes - `vmselect` continues serving partial responses if at least a single `vmstorage` node is available. If consistency over availability is preferred, then either pass `-search.denyPartialResponse` command-line flag to `vmselect` or pass `deny_partial_response=1` query arg in requests to `vmselect`. @@ -307,7 +294,6 @@ See also more advanced [cardinality limiter in vmagent](https://docs.victoriamet Data replication can be used for increasing storage durability. See [these docs](#replication-and-data-safety) for details. - ## Capacity planning VictoriaMetrics uses lower amounts of CPU, RAM and storage space on production workloads compared to competing solutions (Prometheus, Thanos, Cortex, TimescaleDB, InfluxDB, QuestDB, M3DB) according to [our case studies](https://docs.victoriametrics.com/CaseStudies.html). @@ -318,19 +304,17 @@ The needed storage space for the given retention (the retention is set via `-ret It is recommended leaving the following amounts of spare resources: -* 50% of free RAM across all the node types for reducing the probability of OOM (out of memory) crashes and slowdowns during temporary spikes in workload. -* 50% of spare CPU across all the node types for reducing the probability of slowdowns during temporary spikes in workload. -* At least 30% of free storage space at the directory pointed by `-storageDataPath` command-line flag at `vmstorage` nodes. See also `-storage.minFreeDiskSpaceBytes` command-line flag [description for vmstorage](#list-of-command-line-flags-for-vmstorage). 
- +- 50% of free RAM across all the node types for reducing the probability of OOM (out of memory) crashes and slowdowns during temporary spikes in workload. +- 50% of spare CPU across all the node types for reducing the probability of slowdowns during temporary spikes in workload. +- At least 30% of free storage space at the directory pointed by `-storageDataPath` command-line flag at `vmstorage` nodes. See also `-storage.minFreeDiskSpaceBytes` command-line flag [description for vmstorage](#list-of-command-line-flags-for-vmstorage). Some capacity planning tips for VictoriaMetrics cluster: -* The [replication](#replication-and-data-safety) increases the amounts of needed resources for the cluster by up to `N` times where `N` is replication factor. -* Cluster capacity for [active time series](https://docs.victoriametrics.com/FAQ.html#what-is-an-active-time-series) can be increased by adding more `vmstorage` nodes and/or by increasing RAM and CPU resources per each `vmstorage` node. -* Query latency can be reduced by increasing the number of `vmstorage` nodes and/or by increasing RAM and CPU resources per each `vmselect` node. -* The total number of CPU cores needed for all the `vminsert` nodes can be calculated from the ingestion rate: `CPUs = ingestion_rate / 100K`. -* The `-rpc.disableCompression` command-line flag at `vminsert` nodes can increase ingestion capacity at the cost of higher network bandwidth usage between `vminsert` and `vmstorage`. - +- The [replication](#replication-and-data-safety) increases the amounts of needed resources for the cluster by up to `N` times where `N` is replication factor. +- Cluster capacity for [active time series](https://docs.victoriametrics.com/FAQ.html#what-is-an-active-time-series) can be increased by adding more `vmstorage` nodes and/or by increasing RAM and CPU resources per each `vmstorage` node. 
+- Query latency can be reduced by increasing the number of `vmstorage` nodes and/or by increasing RAM and CPU resources per each `vmselect` node.
+- The total number of CPU cores needed for all the `vminsert` nodes can be calculated from the ingestion rate: `CPUs = ingestion_rate / 100K`.
+- The `-rpc.disableCompression` command-line flag at `vminsert` nodes can increase ingestion capacity at the cost of higher network bandwidth usage between `vminsert` and `vmstorage`.

 ## High availability

@@ -345,25 +329,21 @@ into all the cluster. Then [promxy](https://github.com/jacksontj/promxy) could b

 Another solution is to use [multi-level cluster setup](#multi-level-cluster-setup).

-
 ## Multi-level cluster setup

 `vminsert` nodes can accept data from another `vminsert` nodes starting from [v1.60.0](https://docs.victoriametrics.com/CHANGELOG.html#v1600) if `-clusternativeListenAddr` command-line flag is set. For example, if `vminsert` is started with `-clusternativeListenAddr=:8400` command-line flag, then it can accept data from another `vminsert` nodes at TCP port 8400 in the same way as `vmstorage` nodes do. This allows chaining `vminsert` nodes and building multi-level cluster topologies with flexible configs. For example, the top level of `vminsert` nodes can replicate data among the second level of `vminsert` nodes located in distinct availability zones (AZ), while the second-level `vminsert` nodes can spread the data among `vmstorage` nodes located in the same AZ. Such setup guarantees cluster availability if some AZ becomes unavailable. The data from all the `vmstorage` nodes in all the AZs can be read via `vmselect` nodes, which are configured to query all the `vmstorage` nodes in all the availability zones (e.g. all the `vmstorage` addresses are passed via `-storageNode` command-line flag to `vmselect` nodes). Additionally, `-replicationFactor=k+1` must be passed to `vmselect` nodes, where `k` is the lowest number of `vmstorage` nodes in a single AZ.
 See [replication docs](#replication-and-data-safety) for more details.

 Another option is to set up [vmagent](https://docs.victoriametrics.com/vmagent.html) for replicating the data among multiple VictoriaMetrics clusters. See [these docs](https://docs.victoriametrics.com/vmagent.html#multitenancy) for details.

-
 ## Helm

 Helm chart simplifies managing cluster version of VictoriaMetrics in Kubernetes. It is available in the [helm-charts](https://github.com/VictoriaMetrics/helm-charts) repository.

-
 ## Kubernetes operator

 [K8s operator](https://github.com/VictoriaMetrics/operator) simplifies managing VictoriaMetrics components in Kubernetes.

-
 ## Replication and data safety

 By default VictoriaMetrics offloads replication to the underlying storage pointed by `-storageDataPath`.
@@ -391,7 +371,6 @@ HDD-based persistent disks should be enough for the majority of use cases.

 It is recommended using durable replicated persistent volumes in Kubernetes.

-
 ## Backups

 It is recommended performing periodical backups from [instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
@@ -412,19 +391,17 @@ Restoring from backup:
 2. Restore data from backup using [vmrestore](https://docs.victoriametrics.com/vmrestore.html) into `-storageDataPath` directory.
 3. Start `vmstorage` node.

-
 ## Downsampling

 Downsampling is available in [enterprise version of VictoriaMetrics](https://victoriametrics.com/products/enterprise/). It is configured with `-downsampling.period` command-line flag. The same flag value must be passed to both `vmstorage` and `vmselect` nodes. See [these docs](https://docs.victoriametrics.com/#downsampling) for details.

-
 ## Profiling

 All the cluster components provide the following handlers for [profiling](https://blog.golang.org/profiling-go-programs):

-* `http://vminsert:8480/debug/pprof/heap` for memory profile and `http://vminsert:8480/debug/pprof/profile` for CPU profile
-* `http://vmselect:8481/debug/pprof/heap` for memory profile and `http://vmselect:8481/debug/pprof/profile` for CPU profile
-* `http://vmstorage:8482/debug/pprof/heap` for memory profile and `http://vmstorage:8482/debug/pprof/profile` for CPU profile
+- `http://vminsert:8480/debug/pprof/heap` for memory profile and `http://vminsert:8480/debug/pprof/profile` for CPU profile
+- `http://vmselect:8481/debug/pprof/heap` for memory profile and `http://vmselect:8481/debug/pprof/profile` for CPU profile
+- `http://vmstorage:8482/debug/pprof/heap` for memory profile and `http://vmstorage:8482/debug/pprof/profile` for CPU profile

 Example command for collecting cpu profile from `vmstorage` (replace `0.0.0.0` with `vmstorage` hostname if needed):

@@ -469,18 +446,15 @@ Due to `KISS`, cluster version of VictoriaMetrics has no the following "features
 - Automatic discovering and addition of new nodes in the cluster, which may mix data between dev and prod clusters :)
 - Automatic leader election, which may result in split brain disaster on network errors.

-
 ## Reporting bugs

 Report bugs and propose new features [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues).
- ## List of command-line flags -* [List of command-line flags for vminsert](#list-of-command-line-flags-for-vminsert) -* [List of command-line flags for vmselect](#list-of-command-line-flags-for-vmselect) -* [List of command-line flags for vmstorage](#list-of-command-line-flags-for-vmstorage) - +- [List of command-line flags for vminsert](#list-of-command-line-flags-for-vminsert) +- [List of command-line flags for vmselect](#list-of-command-line-flags-for-vmselect) +- [List of command-line flags for vmstorage](#list-of-command-line-flags-for-vmstorage) ### List of command-line flags for vminsert @@ -488,135 +462,135 @@ Below is the output for `/path/to/vminsert -help`: ``` -cluster.tls - Whether to use TLS for connections to -storageNode. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection + Whether to use TLS for connections to -storageNode. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection -cluster.tlsCAFile string - Path to TLS CA file to use for verifying certificates provided by -storageNode. By default system CA is used. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection + Path to TLS CA file to use for verifying certificates provided by -storageNode. By default system CA is used. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection -cluster.tlsCertFile string - Path to client-side TLS certificate file to use when connecting to -storageNode. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection + Path to client-side TLS certificate file to use when connecting to -storageNode. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection -cluster.tlsKeyFile string - Path to client-side TLS key file to use when connecting to -storageNode. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection + Path to client-side TLS key file to use when connecting to -storageNode. 
See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection -clusternativeListenAddr string - TCP address to listen for data from other vminsert nodes in multi-level cluster setup. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multi-level-cluster-setup . Usually :8400 must be set. Doesn't work if empty + TCP address to listen for data from other vminsert nodes in multi-level cluster setup. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multi-level-cluster-setup . Usually :8400 must be set. Doesn't work if empty -csvTrimTimestamp duration - Trim timestamps when importing csv data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) + Trim timestamps when importing csv data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) -datadog.maxInsertRequestSize size - The maximum size in bytes of a single DataDog POST request to /api/v1/series - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 67108864) + The maximum size in bytes of a single DataDog POST request to /api/v1/series + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 67108864) -disableRerouting - Whether to disable re-routing when some of vmstorage nodes accept incoming data at slower speed compared to other storage nodes. Disabled re-routing limits the ingestion rate by the slowest vmstorage node. On the other side, disabled re-routing minimizes the number of active time series in the cluster during rolling restarts and during spikes in series churn rate. See also -dropSamplesOnOverload (default true) + Whether to disable re-routing when some of vmstorage nodes accept incoming data at slower speed compared to other storage nodes. 
Disabled re-routing limits the ingestion rate by the slowest vmstorage node. On the other side, disabled re-routing minimizes the number of active time series in the cluster during rolling restarts and during spikes in series churn rate. See also -dropSamplesOnOverload (default true) -dropSamplesOnOverload - Whether to drop incoming samples if the destination vmstorage node is overloaded and/or unavailable. This prioritizes cluster availability over consistency, e.g. the cluster continues accepting all the ingested samples, but some of them may be dropped if vmstorage nodes are temporarily unavailable and/or overloaded + Whether to drop incoming samples if the destination vmstorage node is overloaded and/or unavailable. This prioritizes cluster availability over consistency, e.g. the cluster continues accepting all the ingested samples, but some of them may be dropped if vmstorage nodes are temporarily unavailable and/or overloaded -enableTCP6 - Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used + Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used -envflag.enable - Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details + Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. 
See https://docs.victoriametrics.com/#environment-variables for more details -envflag.prefix string - Prefix for environment variables if -envflag.enable is set + Prefix for environment variables if -envflag.enable is set -eula - By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf + By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf -fs.disableMmap - Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread() + Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread() -graphiteListenAddr string - TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty + TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty -graphiteTrimTimestamp duration - Trim timestamps for Graphite data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) + Trim timestamps for Graphite data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) -http.connTimeout duration - Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. 
Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) + Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) -http.disableResponseCompression - Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth + Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth -http.idleConnTimeout duration - Timeout for incoming idle http connections (default 1m0s) + Timeout for incoming idle http connections (default 1m0s) -http.maxGracefulShutdownDuration duration - The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s) + The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s) -http.pathPrefix string - An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus + An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus -http.shutdownDelay duration - Optional delay before http server shutdown. 
During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers + Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers -httpListenAddr string - Address to listen for http connections (default ":8480") + Address to listen for http connections (default ":8480") -import.maxLineLen size - The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 104857600) + The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 104857600) -influx.databaseNames array - Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb - Supports an array of values separated by comma or specified via multiple flags. + Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb + Supports an array of values separated by comma or specified via multiple flags. 
-influx.maxLineSize size - The maximum size in bytes for a single InfluxDB line during parsing - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 262144) + The maximum size in bytes for a single InfluxDB line during parsing + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 262144) -influxDBLabel string - Default label for the DB name sent over '?db={db_name}' query parameter (default "db") + Default label for the DB name sent over '?db={db_name}' query parameter (default "db") -influxListenAddr string - TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://:8428/write + TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://:8428/write -influxMeasurementFieldSeparator string - Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_") + Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_") -influxSkipMeasurement - Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator' + Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator' -influxSkipSingleField - Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if InfluxDB line contains only a single field + Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if InfluxDB line contains only a single field -influxTrimTimestamp duration - Trim timestamps for InfluxDB line protocol data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 
1s) may be used for reducing disk space usage for timestamp data (default 1ms) + Trim timestamps for InfluxDB line protocol data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) -insert.maxQueueDuration duration - The maximum duration for waiting in the queue for insert requests due to -maxConcurrentInserts (default 1m0s) + The maximum duration for waiting in the queue for insert requests due to -maxConcurrentInserts (default 1m0s) -loggerDisableTimestamps - Whether to disable writing timestamps in logs + Whether to disable writing timestamps in logs -loggerErrorsPerSecondLimit int - Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit + Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit -loggerFormat string - Format for logs. Possible values: default, json (default "default") + Format for logs. Possible values: default, json (default "default") -loggerLevel string - Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") + Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") -loggerOutput string - Output for the logs. Supported values: stderr, stdout (default "stderr") + Output for the logs. Supported values: stderr, stdout (default "stderr") -loggerTimezone string - Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") + Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. 
For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") -loggerWarnsPerSecondLimit int - Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit + Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit -maxConcurrentInserts int - The maximum number of concurrent inserts. Default value should work for most cases, since it minimizes the overhead for concurrent inserts. This option is tigthly coupled with -insert.maxQueueDuration (default 16) + The maximum number of concurrent inserts. Default value should work for most cases, since it minimizes the overhead for concurrent inserts. This option is tigthly coupled with -insert.maxQueueDuration (default 16) -maxInsertRequestSize size - The maximum size in bytes of a single Prometheus remote_write API request - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) + The maximum size in bytes of a single Prometheus remote_write API request + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) -maxLabelValueLen int - The maximum length of label values in the accepted time series. Longer label values are truncated. In this case the vm_too_long_label_values_total metric at /metrics page is incremented (default 16384) + The maximum length of label values in the accepted time series. Longer label values are truncated. In this case the vm_too_long_label_values_total metric at /metrics page is incremented (default 16384) -maxLabelsPerTimeseries int - The maximum number of labels accepted per time series. Superfluous labels are dropped. 
In this case the vm_metrics_with_dropped_labels_total metric at /metrics page is incremented (default 30) + The maximum number of labels accepted per time series. Superfluous labels are dropped. In this case the vm_metrics_with_dropped_labels_total metric at /metrics page is incremented (default 30) -memory.allowedBytes size - Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -memory.allowedPercent float - Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) + Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) -opentsdbHTTPListenAddr string - TCP address to listen for OpentTSDB HTTP put requests. Usually :4242 must be set. 
Doesn't work if empty + TCP address to listen for OpentTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty -opentsdbListenAddr string - TCP and UDP address to listen for OpentTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty + TCP and UDP address to listen for OpentTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty -opentsdbTrimTimestamp duration - Trim timestamps for OpenTSDB 'telnet put' data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) + Trim timestamps for OpenTSDB 'telnet put' data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) -opentsdbhttp.maxInsertRequestSize size - The maximum size of OpenTSDB HTTP put request - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) + The maximum size of OpenTSDB HTTP put request + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) -opentsdbhttpTrimTimestamp duration - Trim timestamps for OpenTSDB HTTP data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) + Trim timestamps for OpenTSDB HTTP data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) -relabelConfig string - Optional path to a file with relabeling rules, which are applied to all the ingested metrics. The path can point either to local file or to http url. See https://docs.victoriametrics.com/#relabeling for details. 
The config is reloaded on SIGHUP signal + Optional path to a file with relabeling rules, which are applied to all the ingested metrics. The path can point either to local file or to http url. See https://docs.victoriametrics.com/#relabeling for details. The config is reloaded on SIGHUP signal -relabelDebug - Whether to log metrics before and after relabeling with -relabelConfig. If the -relabelDebug is enabled, then the metrics aren't sent to storage. This is useful for debugging the relabeling configs + Whether to log metrics before and after relabeling with -relabelConfig. If the -relabelDebug is enabled, then the metrics aren't sent to storage. This is useful for debugging the relabeling configs -replicationFactor int - Replication factor for the ingested data, i.e. how many copies to make among distinct -storageNode instances. Note that vmselect must run with -dedup.minScrapeInterval=1ms for data de-duplication when replicationFactor is greater than 1. Higher values for -dedup.minScrapeInterval at vmselect is OK (default 1) + Replication factor for the ingested data, i.e. how many copies to make among distinct -storageNode instances. Note that vmselect must run with -dedup.minScrapeInterval=1ms for data de-duplication when replicationFactor is greater than 1. Higher values for -dedup.minScrapeInterval at vmselect is OK (default 1) -rpc.disableCompression - Whether to disable compression of RPC traffic. This reduces CPU usage at the cost of higher network bandwidth usage + Whether to disable compression of RPC traffic. This reduces CPU usage at the cost of higher network bandwidth usage -sortLabels - Whether to sort labels for incoming samples before writing them to storage. This may be needed for reducing memory usage at storage when the order of labels in incoming samples is random. For example, if m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}. 
Enabled sorting for labels can slow down ingestion performance a bit + Whether to sort labels for incoming samples before writing them to storage. This may be needed for reducing memory usage at storage when the order of labels in incoming samples is random. For example, if m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}. Enabled sorting for labels can slow down ingestion performance a bit -storageNode array - Comma-separated addresses of vmstorage nodes; usage: -storageNode=vmstorage-host1,...,vmstorage-hostN - Supports an array of values separated by comma or specified via multiple flags. + Comma-separated addresses of vmstorage nodes; usage: -storageNode=vmstorage-host1,...,vmstorage-hostN + Supports an array of values separated by comma or specified via multiple flags. -tls - Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set + Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set -tlsCertFile string - Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated -tlsKeyFile string - Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS key. Used only if -tls is set. 
The provided key file is automatically re-read every second, so it can be dynamically updated -version - Show VictoriaMetrics version + Show VictoriaMetrics version ``` ### List of command-line flags for vmselect @@ -625,132 +599,132 @@ Below is the output for `/path/to/vmselect -help`: ``` -cacheDataPath string - Path to directory for cache files. Cache isn't saved if empty + Path to directory for cache files. Cache isn't saved if empty -cluster.tls - Whether to use TLS for connections to -storageNode. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection + Whether to use TLS for connections to -storageNode. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection -cluster.tlsCAFile string - Path to TLS CA file to use for verifying certificates provided by -storageNode. By default system CA is used. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection + Path to TLS CA file to use for verifying certificates provided by -storageNode. By default system CA is used. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection -cluster.tlsCertFile string - Path to client-side TLS certificate file to use when connecting to -storageNode. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection + Path to client-side TLS certificate file to use when connecting to -storageNode. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection -cluster.tlsKeyFile string - Path to client-side TLS key file to use when connecting to -storageNode. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection + Path to client-side TLS key file to use when connecting to -storageNode. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection -dedup.minScrapeInterval duration - Leave only the first sample in every time series per each discrete interval equal to -dedup.minScrapeInterval > 0. 
See https://docs.victoriametrics.com/#deduplication for details + Leave only the first sample in every time series per each discrete interval equal to -dedup.minScrapeInterval > 0. See https://docs.victoriametrics.com/#deduplication for details -downsampling.period array - Comma-separated downsampling periods in the format 'offset:period'. For example, '30d:10m' instructs to leave a single sample per 10 minutes for samples older than 30 days. See https://docs.victoriametrics.com/#downsampling for details - Supports an array of values separated by comma or specified via multiple flags. + Comma-separated downsampling periods in the format 'offset:period'. For example, '30d:10m' instructs to leave a single sample per 10 minutes for samples older than 30 days. See https://docs.victoriametrics.com/#downsampling for details + Supports an array of values separated by comma or specified via multiple flags. -enableTCP6 - Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used + Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used -envflag.enable - Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details + Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. 
See https://docs.victoriametrics.com/#environment-variables for more details -envflag.prefix string - Prefix for environment variables if -envflag.enable is set + Prefix for environment variables if -envflag.enable is set -eula - By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf + By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf -fs.disableMmap - Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread() + Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread() -graphiteTrimTimestamp duration - Trim timestamps for Graphite data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) + Trim timestamps for Graphite data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) -http.connTimeout duration - Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) + Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. 
Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) -http.disableResponseCompression - Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth + Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth -http.idleConnTimeout duration - Timeout for incoming idle http connections (default 1m0s) + Timeout for incoming idle http connections (default 1m0s) -http.maxGracefulShutdownDuration duration - The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s) + The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s) -http.pathPrefix string - An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus + An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus -http.shutdownDelay duration - Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers + Optional delay before http server shutdown. 
During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers -httpListenAddr string - Address to listen for http connections (default ":8481") + Address to listen for http connections (default ":8481") -loggerDisableTimestamps - Whether to disable writing timestamps in logs + Whether to disable writing timestamps in logs -loggerErrorsPerSecondLimit int - Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit + Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit -loggerFormat string - Format for logs. Possible values: default, json (default "default") + Format for logs. Possible values: default, json (default "default") -loggerLevel string - Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") + Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") -loggerOutput string - Output for the logs. Supported values: stderr, stdout (default "stderr") + Output for the logs. Supported values: stderr, stdout (default "stderr") -loggerTimezone string - Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") + Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") -loggerWarnsPerSecondLimit int - Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit + Per-second limit on the number of WARN messages. 
If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit -memory.allowedBytes size - Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -memory.allowedPercent float - Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) + Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) -replicationFactor int - How many copies of every time series is available on vmstorage nodes. See -replicationFactor command-line flag for vminsert nodes (default 1) + How many copies of every time series is available on vmstorage nodes. 
See -replicationFactor command-line flag for vminsert nodes (default 1) -search.cacheTimestampOffset duration - The maximum duration since the current time for response data, which is always queried from the original raw data, without using the response cache. Increase this value if you see gaps in responses due to time synchronization issues between VictoriaMetrics and data sources (default 5m0s) + The maximum duration since the current time for response data, which is always queried from the original raw data, without using the response cache. Increase this value if you see gaps in responses due to time synchronization issues between VictoriaMetrics and data sources (default 5m0s) -search.denyPartialResponse - Whether to deny partial responses if a part of -storageNode instances fail to perform queries; this trades availability over consistency; see also -search.maxQueryDuration + Whether to deny partial responses if a part of -storageNode instances fail to perform queries; this trades availability over consistency; see also -search.maxQueryDuration -search.disableCache - Whether to disable response caching. This may be useful during data backfilling + Whether to disable response caching. This may be useful during data backfilling -search.graphiteMaxPointsPerSeries int - The maximum number of points per series Graphite render API can return (default 1000000) + The maximum number of points per series Graphite render API can return (default 1000000) -search.graphiteStorageStep duration - The interval between datapoints stored in the database. It is used at Graphite Render API handler for normalizing the interval between datapoints in case it isn't normalized. It can be overridden by sending 'storage_step' query arg to /render API or by sending the desired interval via 'Storage-Step' http header during querying /render API (default 10s) + The interval between datapoints stored in the database.
It is used at Graphite Render API handler for normalizing the interval between datapoints in case it isn't normalized. It can be overridden by sending 'storage_step' query arg to /render API or by sending the desired interval via 'Storage-Step' http header during querying /render API (default 10s) -search.latencyOffset duration - The time when data points become visible in query results after the collection. Too small value can result in incomplete last points for query results (default 30s) + The time when data points become visible in query results after the collection. Too small value can result in incomplete last points for query results (default 30s) -search.logSlowQueryDuration duration - Log queries with execution time exceeding this value. Zero disables slow query logging (default 5s) + Log queries with execution time exceeding this value. Zero disables slow query logging (default 5s) -search.maxConcurrentRequests int - The maximum number of concurrent search requests. It shouldn't be high, since a single request can saturate all the CPU cores. See also -search.maxQueueDuration (default 8) + The maximum number of concurrent search requests. It shouldn't be high, since a single request can saturate all the CPU cores. See also -search.maxQueueDuration (default 8) -search.maxExportDuration duration - The maximum duration for /api/v1/export call (default 720h0m0s) + The maximum duration for /api/v1/export call (default 720h0m0s) -search.maxLookback duration - Synonym to -search.lookback-delta from Prometheus. The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via max_lookback arg. See also '-search.maxStalenessInterval' flag, which has the same meaning due to historical reasons + Synonym to -search.lookback-delta from Prometheus. The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via max_lookback arg.
See also '-search.maxStalenessInterval' flag, which has the same meaning due to historical reasons -search.maxPointsPerTimeseries int - The maximum points per a single timeseries returned from /api/v1/query_range. This option doesn't limit the number of scanned raw samples in the database. The main purpose of this option is to limit the number of per-series points returned to graphing UI such as Grafana. There is no sense in setting this limit to values bigger than the horizontal resolution of the graph (default 30000) + The maximum points per a single timeseries returned from /api/v1/query_range. This option doesn't limit the number of scanned raw samples in the database. The main purpose of this option is to limit the number of per-series points returned to graphing UI such as Grafana. There is no sense in setting this limit to values bigger than the horizontal resolution of the graph (default 30000) -search.maxQueryDuration duration - The maximum duration for query execution (default 30s) + The maximum duration for query execution (default 30s) -search.maxQueryLen size - The maximum search query length in bytes - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16384) + The maximum search query length in bytes + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16384) -search.maxQueueDuration duration - The maximum time the request waits for execution when -search.maxConcurrentRequests limit is reached; see also -search.maxQueryDuration (default 10s) + The maximum time the request waits for execution when -search.maxConcurrentRequests limit is reached; see also -search.maxQueryDuration (default 10s) -search.maxSamplesPerQuery int - The maximum number of raw samples a single query can process across all time series. This protects from heavy queries, which select unexpectedly high number of raw samples.
See also -search.maxSamplesPerSeries (default 1000000000) + The maximum number of raw samples a single query can process across all time series. This protects from heavy queries, which select unexpectedly high number of raw samples. See also -search.maxSamplesPerSeries (default 1000000000) -search.maxSamplesPerSeries int - The maximum number of raw samples a single query can scan per each time series. See also -search.maxSamplesPerQuery (default 30000000) + The maximum number of raw samples a single query can scan per each time series. See also -search.maxSamplesPerQuery (default 30000000) -search.maxStalenessInterval duration - The maximum interval for staleness calculations. By default it is automatically calculated from the median interval between samples. This flag could be useful for tuning Prometheus data model closer to Influx-style data model. See https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness for details. See also '-search.maxLookback' flag, which has the same meaning due to historical reasons + The maximum interval for staleness calculations. By default it is automatically calculated from the median interval between samples. This flag could be useful for tuning Prometheus data model closer to Influx-style data model. See https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness for details. See also '-search.maxLookback' flag, which has the same meaning due to historical reasons -search.maxStatusRequestDuration duration - The maximum duration for /api/v1/status/* requests (default 5m0s) + The maximum duration for /api/v1/status/* requests (default 5m0s) -search.maxStepForPointsAdjustment duration - The maximum step when /api/v1/query_range handler adjusts points with timestamps closer than -search.latencyOffset to the current time. 
The adjustment is needed because such points may contain incomplete data (default 1m0s) + The maximum step when /api/v1/query_range handler adjusts points with timestamps closer than -search.latencyOffset to the current time. The adjustment is needed because such points may contain incomplete data (default 1m0s) -search.minStalenessInterval duration - The minimum interval for staleness calculations. This flag could be useful for removing gaps on graphs generated from time series with irregular intervals between samples. See also '-search.maxStalenessInterval' + The minimum interval for staleness calculations. This flag could be useful for removing gaps on graphs generated from time series with irregular intervals between samples. See also '-search.maxStalenessInterval' -search.noStaleMarkers - Set this flag to true if the database doesn't contain Prometheus stale markers, so there is no need in spending additional CPU time on its handling. Staleness markers may exist only in data obtained from Prometheus scrape targets + Set this flag to true if the database doesn't contain Prometheus stale markers, so there is no need in spending additional CPU time on its handling. Staleness markers may exist only in data obtained from Prometheus scrape targets -search.queryStats.lastQueriesCount int - Query stats for /api/v1/status/top_queries is tracked on this number of last queries. Zero value disables query stats tracking (default 20000) + Query stats for /api/v1/status/top_queries is tracked on this number of last queries. Zero value disables query stats tracking (default 20000) -search.queryStats.minQueryDuration duration - The minimum duration for queries to track in query stats at /api/v1/status/top_queries. Queries with lower duration are ignored in query stats (default 1ms) + The minimum duration for queries to track in query stats at /api/v1/status/top_queries. 
Queries with lower duration are ignored in query stats (default 1ms) -search.resetCacheAuthKey string - Optional authKey for resetting rollup cache via /internal/resetRollupResultCache call + Optional authKey for resetting rollup cache via /internal/resetRollupResultCache call -search.treatDotsAsIsInRegexps - Whether to treat dots as is in regexp label filters used in queries. For example, foo{bar=~"a.b.c"} will be automatically converted to foo{bar=~"a\\.b\\.c"}, i.e. all the dots in regexp filters will be automatically escaped in order to match only dot char instead of matching any char. Dots in ".+", ".*" and ".{n}" regexps aren't escaped. This option is DEPRECATED in favor of {__graphite__="a.*.c"} syntax for selecting metrics matching the given Graphite metrics filter + Whether to treat dots as is in regexp label filters used in queries. For example, foo{bar=~"a.b.c"} will be automatically converted to foo{bar=~"a\\.b\\.c"}, i.e. all the dots in regexp filters will be automatically escaped in order to match only dot char instead of matching any char. Dots in ".+", ".*" and ".{n}" regexps aren't escaped. This option is DEPRECATED in favor of {__graphite__="a.*.c"} syntax for selecting metrics matching the given Graphite metrics filter -selectNode array - Comma-separated addresses of vmselect nodes; usage: -selectNode=vmselect-host1,...,vmselect-hostN - Supports an array of values separated by comma or specified via multiple flags. + Comma-separated addresses of vmselect nodes; usage: -selectNode=vmselect-host1,...,vmselect-hostN + Supports an array of values separated by comma or specified via multiple flags. -storageNode array - Comma-separated addresses of vmstorage nodes; usage: -storageNode=vmstorage-host1,...,vmstorage-hostN - Supports an array of values separated by comma or specified via multiple flags.
+ Comma-separated addresses of vmstorage nodes; usage: -storageNode=vmstorage-host1,...,vmstorage-hostN + Supports an array of values separated by comma or specified via multiple flags. -tls - Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set + Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set -tlsCertFile string - Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated -tlsKeyFile string - Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated -version - Show VictoriaMetrics version + Show VictoriaMetrics version ``` ### List of command-line flags for vmstorage @@ -759,149 +733,147 @@ Below is the output for `/path/to/vmstorage -help`: ``` -bigMergeConcurrency int - The maximum number of CPU cores to use for big merges. Default value is used if set to 0 + The maximum number of CPU cores to use for big merges. Default value is used if set to 0 -cluster.tls - Whether to use TLS when accepting connections from vminsert and vmselect. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection + Whether to use TLS when accepting connections from vminsert and vmselect. 
See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection -cluster.tlsCAFile string - Path to TLS CA file to use for verifying certificates provided by vminsert and vmselect. By default system CA is used. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection + Path to TLS CA file to use for verifying certificates provided by vminsert and vmselect. By default system CA is used. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection -cluster.tlsCertFile string - Path to server-side TLS certificate file to use when accepting connections from vminsert and vmselect. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection + Path to server-side TLS certificate file to use when accepting connections from vminsert and vmselect. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection -cluster.tlsKeyFile string - Path to server-side TLS key file to use when accepting connections from vminsert and vmselect. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection + Path to server-side TLS key file to use when accepting connections from vminsert and vmselect. See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#mtls-protection -dedup.minScrapeInterval duration - Leave only the first sample in every time series per each discrete interval equal to -dedup.minScrapeInterval > 0. See https://docs.victoriametrics.com/#deduplication for details + Leave only the first sample in every time series per each discrete interval equal to -dedup.minScrapeInterval > 0. See https://docs.victoriametrics.com/#deduplication for details -denyQueriesOutsideRetention - Whether to deny queries outside of the configured -retentionPeriod. When set, then /api/v1/query_range would return '503 Service Unavailable' error for queries with 'from' value outside -retentionPeriod. 
This may be useful when multiple data sources with distinct retentions are hidden behind query-tee + Whether to deny queries outside of the configured -retentionPeriod. When set, then /api/v1/query_range would return '503 Service Unavailable' error for queries with 'from' value outside -retentionPeriod. This may be useful when multiple data sources with distinct retentions are hidden behind query-tee -downsampling.period array - Comma-separated downsampling periods in the format 'offset:period'. For example, '30d:10m' instructs to leave a single sample per 10 minutes for samples older than 30 days. See https://docs.victoriametrics.com/#downsampling for details - Supports an array of values separated by comma or specified via multiple flags. + Comma-separated downsampling periods in the format 'offset:period'. For example, '30d:10m' instructs to leave a single sample per 10 minutes for samples older than 30 days. See https://docs.victoriametrics.com/#downsampling for details + Supports an array of values separated by comma or specified via multiple flags. -enableTCP6 - Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used + Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used -envflag.enable - Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details + Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. 
See https://docs.victoriametrics.com/#environment-variables for more details -envflag.prefix string - Prefix for environment variables if -envflag.enable is set + Prefix for environment variables if -envflag.enable is set -eula - By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf + By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf -finalMergeDelay duration - The delay before starting final merge for per-month partition after no new data is ingested into it. Final merge may require additional disk IO and CPU resources. Final merge may increase query speed and reduce disk space usage in some cases. Zero value disables final merge + The delay before starting final merge for per-month partition after no new data is ingested into it. Final merge may require additional disk IO and CPU resources. Final merge may increase query speed and reduce disk space usage in some cases. Zero value disables final merge -forceFlushAuthKey string - authKey, which must be passed in query string to /internal/force_flush pages + authKey, which must be passed in query string to /internal/force_flush pages -forceMergeAuthKey string - authKey, which must be passed in query string to /internal/force_merge pages + authKey, which must be passed in query string to /internal/force_merge pages -fs.disableMmap - Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread() + Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. 
mmap() is usually faster for reading small data chunks than pread() -http.connTimeout duration - Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) + Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) -http.disableResponseCompression - Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth + Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth -http.idleConnTimeout duration - Timeout for incoming idle http connections (default 1m0s) + Timeout for incoming idle http connections (default 1m0s) -http.maxGracefulShutdownDuration duration - The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s) + The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s) -http.pathPrefix string - An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus + An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. 
This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus -http.shutdownDelay duration - Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers + Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers -httpListenAddr string - Address to listen for http connections (default ":8482") + Address to listen for http connections (default ":8482") -logNewSeries - Whether to log new series. This option is for debug purposes only. It can lead to performance issues when big number of new series are ingested into VictoriaMetrics + Whether to log new series. This option is for debug purposes only. It can lead to performance issues when big number of new series are ingested into VictoriaMetrics -loggerDisableTimestamps - Whether to disable writing timestamps in logs + Whether to disable writing timestamps in logs -loggerErrorsPerSecondLimit int - Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit + Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit -loggerFormat string - Format for logs. Possible values: default, json (default "default") + Format for logs. Possible values: default, json (default "default") -loggerLevel string - Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") + Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") -loggerOutput string - Output for the logs. 
Supported values: stderr, stdout (default "stderr") + Output for the logs. Supported values: stderr, stdout (default "stderr") -loggerTimezone string - Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") + Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") -loggerWarnsPerSecondLimit int - Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit + Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit -memory.allowedBytes size - Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -memory.allowedPercent float - Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. 
Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) + Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) -precisionBits int - The number of precision bits to store per each value. Lower precision bits improves data compression at the cost of precision loss (default 64) + The number of precision bits to store per each value. Lower precision bits improves data compression at the cost of precision loss (default 64) -retentionPeriod value - Data with timestamps outside the retentionPeriod is automatically deleted - The following optional suffixes are supported: h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 1) + Data with timestamps outside the retentionPeriod is automatically deleted + The following optional suffixes are supported: h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 1) -rpc.disableCompression - Disable compression of RPC traffic. This reduces CPU usage at the cost of higher network bandwidth usage + Disable compression of RPC traffic. 
This reduces CPU usage at the cost of higher network bandwidth usage -search.maxTagKeys int - The maximum number of tag keys returned per search (default 100000) + The maximum number of tag keys returned per search (default 100000) -search.maxTagValueSuffixesPerSearch int - The maximum number of tag value suffixes returned from /metrics/find (default 100000) + The maximum number of tag value suffixes returned from /metrics/find (default 100000) -search.maxTagValues int - The maximum number of tag values returned per search (default 100000) + The maximum number of tag values returned per search (default 100000) -search.maxUniqueTimeseries int - The maximum number of unique time series a single query can process. This allows protecting against heavy queries, which select unexpectedly high number of series. See also -search.maxSamplesPerQuery and -search.maxSamplesPerSeries (default 300000) + The maximum number of unique time series a single query can process. This allows protecting against heavy queries, which select unexpectedly high number of series. See also -search.maxSamplesPerQuery and -search.maxSamplesPerSeries (default 300000) -smallMergeConcurrency int - The maximum number of CPU cores to use for small merges. Default value is used if set to 0 + The maximum number of CPU cores to use for small merges. Default value is used if set to 0 -snapshotAuthKey string - authKey, which must be passed in query string to /snapshot* pages + authKey, which must be passed in query string to /snapshot* pages -storage.cacheSizeIndexDBDataBlocks size - Overrides max size for indexdb/dataBlocks cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Overrides max size for indexdb/dataBlocks cache. 
See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -storage.cacheSizeIndexDBIndexBlocks size - Overrides max size for indexdb/indexBlocks cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Overrides max size for indexdb/indexBlocks cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -storage.cacheSizeStorageTSID size - Overrides max size for storage/tsid cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Overrides max size for storage/tsid cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -storage.maxDailySeries int - The maximum number of unique series can be added to the storage during the last 24 hours. Excess series are logged and dropped. This can be useful for limiting series churn rate. See also -storage.maxHourlySeries + The maximum number of unique series can be added to the storage during the last 24 hours. Excess series are logged and dropped. This can be useful for limiting series churn rate. See also -storage.maxHourlySeries -storage.maxHourlySeries int - The maximum number of unique series can be added to the storage during the last hour. Excess series are logged and dropped. This can be useful for limiting series cardinality. See also -storage.maxDailySeries + The maximum number of unique series can be added to the storage during the last hour. 
Excess series are logged and dropped. This can be useful for limiting series cardinality. See also -storage.maxDailySeries -storage.minFreeDiskSpaceBytes size - The minimum free disk space at -storageDataPath after which the storage stops accepting new data - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 10000000) + The minimum free disk space at -storageDataPath after which the storage stops accepting new data + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 10000000) -storageDataPath string - Path to storage data (default "vmstorage-data") + Path to storage data (default "vmstorage-data") -tls - Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set + Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set -tlsCertFile string - Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated -tlsKeyFile string - Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS key. Used only if -tls is set. 
The provided key file is automatically re-read every second, so it can be dynamically updated -version - Show VictoriaMetrics version + Show VictoriaMetrics version -vminsertAddr string - TCP address to accept connections from vminsert services (default ":8400") + TCP address to accept connections from vminsert services (default ":8400") -vmselectAddr string - TCP address to accept connections from vmselect services (default ":8401") + TCP address to accept connections from vmselect services (default ":8401") ``` - ## VictoriaMetrics Logo [Zip](VM_logo.zip) contains three folders with different image orientation (main color and inverted version). Files included in each folder: -* 2 JPEG Preview files -* 2 PNG Preview files with transparent background -* 2 EPS Adobe Illustrator EPS10 files - +- 2 JPEG Preview files +- 2 PNG Preview files with transparent background +- 2 EPS Adobe Illustrator EPS10 files ### Logo Usage Guidelines -#### Font used: +#### Font used -* Lato Black -* Lato Regular +- Lato Black +- Lato Regular -#### Color Palette: +#### Color Palette -* HEX [#110f0f](https://www.color-hex.com/color/110f0f) -* HEX [#ffffff](https://www.color-hex.com/color/ffffff) +- HEX [#110f0f](https://www.color-hex.com/color/110f0f) +- HEX [#ffffff](https://www.color-hex.com/color/ffffff) -### We kindly ask: +### We kindly ask - Please don't use any other font instead of suggested. - There should be sufficient clear space around the logo. diff --git a/docs/FAQ.md b/docs/FAQ.md index ee3d0923d..a814f26db 100644 --- a/docs/FAQ.md +++ b/docs/FAQ.md @@ -8,32 +8,26 @@ sort: 14 To provide the best monitoring solution. - ## Who uses VictoriaMetrics? See [case studies](https://docs.victoriametrics.com/CaseStudies.html). - ## Which features does VictoriaMetrics have? See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#prominent-features). - ## Are there performance comparisons with other solutions? Yes. 
See [these benchmarks](https://docs.victoriametrics.com/Articles.html#benchmarks). - ## How to start using VictoriaMetrics? See [these docs](https://docs.victoriametrics.com/Quick-Start.html). - ## Does VictoriaMetrics support replication? Yes. See [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#replication-and-data-safety) for details. - ## Can I use VictoriaMetrics instead of Prometheus? Yes in most cases. VictoriaMetrics can substitute Prometheus in the following aspects: @@ -42,43 +36,40 @@ Yes in most cases. VictoriaMetrics can substitute Prometheus in the following as * Prometheus-compatible alerting rules and recording rules can be processed with [vmalert](https://docs.victoriametrics.com/vmalert.html). * Prometheus-compatible querying in Grafana is supported by VictoriaMetrics. See [these docs](https://docs.victoriametrics.com/#grafana-setup). - ## What is the difference between vmagent and Prometheus? While both [vmagent](https://docs.victoriametrics.com/vmagent.html) and Prometheus may scrape Prometheus targets (aka `/metrics` pages) according to the provided Prometheus-compatible [scrape configs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) and send data to multiple remote storage systems, vmagent has the following additional features: -- vmagent usually requires lower amounts of CPU, RAM and disk IO compared to Prometheus when scraping an enormous number of targets (more than 1000) +* vmagent usually requires lower amounts of CPU, RAM and disk IO compared to Prometheus when scraping an enormous number of targets (more than 1000) or targets with a great number of exposed metrics. -- vmagent provides independent disk-backed buffers for each configured remote storage (see `-remoteWrite.url`). This means that slow or temporarily unavailable storage +* vmagent provides independent disk-backed buffers for each configured remote storage (see `-remoteWrite.url`). 
This means that slow or temporarily unavailable storage doesn't prevent it from sending data to healthy storage in parallel. Prometheus uses a single shared buffer for all the configured remote storage systems (see `remote_write->url`) with a hardcoded retention of 2 hours. -- vmagent may accept, relabel and filter data obtained via multiple data ingestion protocols in addition to data scraped from Prometheus targets. +* vmagent may accept, relabel and filter data obtained via multiple data ingestion protocols in addition to data scraped from Prometheus targets. That means it supports both `pull` and `push` protocols for data ingestion. See [these docs](https://docs.victoriametrics.com/vmagent.html#features) for details. -- vmagent may be used in different use cases: - - [IoT and edge monitoring](https://docs.victoriametrics.com/vmagent.html#iot-and-edge-monitoring) - - [Drop-in replacement for Prometheus](https://docs.victoriametrics.com/vmagent.html#drop-in-replacement-for-prometheus) - - [Replication and High Availability](https://docs.victoriametrics.com/vmagent.html#replication-and-high-availability) - - [Relabeling and Filtering](https://docs.victoriametrics.com/vmagent.html#relabeling-and-filtering) - - [Splitting data streams among multiple systems](https://docs.victoriametrics.com/vmagent.html#splitting-data-streams-among-multiple-systems) - - [Prometheus remote_write proxy](https://docs.victoriametrics.com/vmagent.html#prometheus-remote_write-proxy) - +* vmagent may be used in different use cases: + * [IoT and edge monitoring](https://docs.victoriametrics.com/vmagent.html#iot-and-edge-monitoring) + * [Drop-in replacement for Prometheus](https://docs.victoriametrics.com/vmagent.html#drop-in-replacement-for-prometheus) + * [Replication and High Availability](https://docs.victoriametrics.com/vmagent.html#replication-and-high-availability) + * [Relabeling and Filtering](https://docs.victoriametrics.com/vmagent.html#relabeling-and-filtering) + * [Splitting 
data streams among multiple systems](https://docs.victoriametrics.com/vmagent.html#splitting-data-streams-among-multiple-systems) + * [Prometheus remote_write proxy](https://docs.victoriametrics.com/vmagent.html#prometheus-remote_write-proxy) ## What is the difference between vmagent and Prometheus agent? Both [vmagent](https://docs.victoriametrics.com/vmagent.html) and [Prometheus agent](https://prometheus.io/blog/2021/11/16/agent/) serve the same purpose – to efficiently scrape Prometheus-compatible targets at the edge. They have the following differences: -- vmagent usually requires lower amounts of CPU, RAM and disk IO compared to the Prometheus agent. -- Prometheus agent supports only pull-based data collection (e.g. it can scrape Prometheus-compatible targets), while vmagent supports both pull and push data collection – it can accept data via many popular data ingestion protocols such as InfluxDB line protocol, Graphite protocol, OpenTSDB protocol, DataDog protocol, Prometheus protocol, CSV and JSON – see [these docs](https://docs.victoriametrics.com/vmagent.html#features). -- vmagent can easily scale horizontally to multiple instances for scraping a big number of targets – see [these docs](https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets). -- vmagent supports [improved relabeling](https://docs.victoriametrics.com/vmagent.html#relabeling). -- vmagent can limit the number of scraped metrics per target – see [these docs](https://docs.victoriametrics.com/vmagent.html#cardinality-limiter). -- vmagent supports loading scrape configs from multiple files – see [these docs](https://docs.victoriametrics.com/vmagent.html#loading-scrape-configs-from-multiple-files). -- vmagent supports data reading and data writing from/to Kafka – see [these docs](https://docs.victoriametrics.com/vmagent.html#kafka-integration).
-- vmagent can read and update scrape configs from http and https URLs, while the Prometheus agent can read them only from the local file system. - +* vmagent usually requires lower amounts of CPU, RAM and disk IO compared to the Prometheus agent. +* Prometheus agent supports only pull-based data collection (e.g. it can scrape Prometheus-compatible targets), while vmagent supports both pull and push data collection – it can accept data via many popular data ingestion protocols such as InfluxDB line protocol, Graphite protocol, OpenTSDB protocol, DataDog protocol, Prometheus protocol, CSV and JSON – see [these docs](https://docs.victoriametrics.com/vmagent.html#features). +* vmagent can easily scale horizontally to multiple instances for scraping a big number of targets – see [these docs](https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets). +* vmagent supports [improved relabeling](https://docs.victoriametrics.com/vmagent.html#relabeling). +* vmagent can limit the number of scraped metrics per target – see [these docs](https://docs.victoriametrics.com/vmagent.html#cardinality-limiter). +* vmagent supports loading scrape configs from multiple files – see [these docs](https://docs.victoriametrics.com/vmagent.html#loading-scrape-configs-from-multiple-files). +* vmagent supports data reading and data writing from/to Kafka – see [these docs](https://docs.victoriametrics.com/vmagent.html#kafka-integration). +* vmagent can read and update scrape configs from http and https URLs, while the Prometheus agent can read them only from the local file system. ## Is it safe to enable [remote write](https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage) in Prometheus? @@ -88,133 +79,127 @@ and new data is available for querying via Prometheus as usual. It is recommended to use [vmagent](https://docs.victoriametrics.com/vmagent.html) for scraping Prometheus targets and writing data to VictoriaMetrics.
- ## How does VictoriaMetrics compare to other remote storage solutions for Prometheus such as [M3 from Uber](https://eng.uber.com/m3/), [Thanos](https://github.com/thanos-io/thanos), [Cortex](https://github.com/cortexproject/cortex), etc.? VictoriaMetrics is simpler, faster, more cost-effective and it provides [MetricsQL query language](MetricsQL) based on PromQL. The simplicity is twofold: -- It is simpler to configure and operate. There is no need for configuring [sidecars](https://github.com/thanos-io/thanos/blob/master/docs/components/sidecar.md), +* It is simpler to configure and operate. There is no need for configuring [sidecars](https://github.com/thanos-io/thanos/blob/master/docs/components/sidecar.md), fighting the [gossip protocol](https://github.com/improbable-eng/thanos/blob/030bc345c12c446962225221795f4973848caab5/docs/proposals/completed/201809_gossip-removal.md) or setting up third-party systems such as [Consul](https://github.com/cortexproject/cortex/issues/157), [Cassandra](https://cortexmetrics.io/docs/chunks-storage/running-chunks-storage-with-cassandra/), [DynamoDB](https://cortexmetrics.io/docs/chunks-storage/aws-tips/) or [Memcached](https://cortexmetrics.io/docs/chunks-storage/caching/). -- VictoriaMetrics has a simpler architecture. This means fewer bugs and more useful features in the long run compared to competing TSDBs. +* VictoriaMetrics has a simpler architecture. This means fewer bugs and more useful features in the long run compared to competing TSDBs. See [comparing Thanos to VictoriaMetrics cluster](https://medium.com/@valyala/comparing-thanos-to-victoriametrics-cluster-b193bea1683) and the [Remote Write Storage Wars](https://promcon.io/2019-munich/talks/remote-write-storage-wars/) talk from [PromCon 2019](https://promcon.io/2019-munich/talks/remote-write-storage-wars/). VictoriaMetrics also [uses less RAM than Thanos components](https://github.com/thanos-io/thanos/issues/448). 
- ## What is the difference between VictoriaMetrics and [QuestDB](https://questdb.io/)? -- QuestDB needs more than 20x storage space than VictoriaMetrics. This translates to higher storage costs and slower queries over historical data, which must be read from the disk. -- QuestDB is much harder to set up and operate than VictoriaMetrics. Compare [setup instructions for QuestDB](https://questdb.io/docs/get-started/binaries) to [setup instructions for VictoriaMetrics](https://docs.victoriametrics.com/#how-to-start-victoriametrics). -- VictoriaMetrics provides the [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) query language, which is better suited for typical queries over time series data than the SQL-like query language provided by QuestDB. See [this article](https://valyala.medium.com/promql-tutorial-for-beginners-9ab455142085) for details. -- VictoriaMetrics can be queried via the [Prometheus querying API](https://docs.victoriametrics.com/#prometheus-querying-api-usage) and via [Graphite's API](https://docs.victoriametrics.com/#graphite-api-usage). -- Thanks to PromQL support, VictoriaMetrics [can be used as a drop-in replacement for Prometheus in Grafana](https://docs.victoriametrics.com/#grafana-setup), while QuestDB needs a full rewrite of existing dashboards in Grafana. -- Thanks to Prometheus' remote_write API support, VictoriaMetrics can be used as a long-term storage for Prometheus or for [vmagent](https://docs.victoriametrics.com/vmagent.html), while QuestDB has no integration with Prometheus. -- QuestDB [supports a smaller range of popular data ingestion protocols](https://questdb.io/docs/develop/insert-data) compared to VictoriaMetrics (compare to [the list of supported data ingestion protocols for VictoriaMetrics](https://docs.victoriametrics.com/#how-to-import-time-series-data)). -- [VictoriaMetrics supports backfilling (e.g. 
storing historical data) out of the box](https://docs.victoriametrics.com/#backfilling), while QuestDB provides [very limited support for backfilling](https://questdb.io/blog/2021/05/10/questdb-release-6-0-tsbs-benchmark#the-problem-with-out-of-order-data). - +* QuestDB needs more than 20x storage space than VictoriaMetrics. This translates to higher storage costs and slower queries over historical data, which must be read from the disk. +* QuestDB is much harder to set up and operate than VictoriaMetrics. Compare [setup instructions for QuestDB](https://questdb.io/docs/get-started/binaries) to [setup instructions for VictoriaMetrics](https://docs.victoriametrics.com/#how-to-start-victoriametrics). +* VictoriaMetrics provides the [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) query language, which is better suited for typical queries over time series data than the SQL-like query language provided by QuestDB. See [this article](https://valyala.medium.com/promql-tutorial-for-beginners-9ab455142085) for details. +* VictoriaMetrics can be queried via the [Prometheus querying API](https://docs.victoriametrics.com/#prometheus-querying-api-usage) and via [Graphite's API](https://docs.victoriametrics.com/#graphite-api-usage). +* Thanks to PromQL support, VictoriaMetrics [can be used as a drop-in replacement for Prometheus in Grafana](https://docs.victoriametrics.com/#grafana-setup), while QuestDB needs a full rewrite of existing dashboards in Grafana. +* Thanks to Prometheus' remote_write API support, VictoriaMetrics can be used as a long-term storage for Prometheus or for [vmagent](https://docs.victoriametrics.com/vmagent.html), while QuestDB has no integration with Prometheus. 
+* QuestDB [supports a smaller range of popular data ingestion protocols](https://questdb.io/docs/develop/insert-data) compared to VictoriaMetrics (compare to [the list of supported data ingestion protocols for VictoriaMetrics](https://docs.victoriametrics.com/#how-to-import-time-series-data)). +* [VictoriaMetrics supports backfilling (e.g. storing historical data) out of the box](https://docs.victoriametrics.com/#backfilling), while QuestDB provides [very limited support for backfilling](https://questdb.io/blog/2021/05/10/questdb-release-6-0-tsbs-benchmark#the-problem-with-out-of-order-data). ## What is the difference between VictoriaMetrics and [Cortex](https://github.com/cortexproject/cortex)? VictoriaMetrics is similar to Cortex in the following aspects: -- Both systems accept data from [vmagent](https://docs.victoriametrics.com/vmagent.html) or Prometheus +* Both systems accept data from [vmagent](https://docs.victoriametrics.com/vmagent.html) or Prometheus via the standard [remote_write API](https://prometheus.io/docs/practices/remote_write/), so there is no need for running sidecars unlike in [Thanos](https://github.com/thanos-io/thanos)' case. -- Both systems support multi-tenancy out of the box. See [the corresponding docs for VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy). -- Both systems support data replication. See [replication in Cortex](https://github.com/cortexproject/cortex/blob/fe56f1420099aa1bf1ce09316c186e05bddee879/docs/architecture.md#hashing) and [replication in VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#replication-and-data-safety). -- Both systems scale horizontally to multiple nodes. See [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#cluster-resizing-and-scalability) for details. -- Both systems support alerting and recording rules via the corresponding tools such as [vmalert](https://docs.victoriametrics.com/vmalert.html). 
-- Both systems can be queried via the [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/) and integrate perfectly with Grafana. +* Both systems support multi-tenancy out of the box. See [the corresponding docs for VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy). +* Both systems support data replication. See [replication in Cortex](https://github.com/cortexproject/cortex/blob/fe56f1420099aa1bf1ce09316c186e05bddee879/docs/architecture.md#hashing) and [replication in VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#replication-and-data-safety). +* Both systems scale horizontally to multiple nodes. See [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#cluster-resizing-and-scalability) for details. +* Both systems support alerting and recording rules via the corresponding tools such as [vmalert](https://docs.victoriametrics.com/vmalert.html). +* Both systems can be queried via the [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/) and integrate perfectly with Grafana. The main differences between Cortex and VictoriaMetrics: -- Cortex re-uses Prometheus source code, while VictoriaMetrics is written from scratch. -- Cortex heavily relies on third-party services such as Consul, Memcache, DynamoDB, BigTable, Cassandra, etc. +* Cortex re-uses Prometheus source code, while VictoriaMetrics is written from scratch. +* Cortex heavily relies on third-party services such as Consul, Memcache, DynamoDB, BigTable, Cassandra, etc. This may increase operational complexity and reduce system reliability compared to VictoriaMetrics' case, which doesn't use any external services. Compare [Cortex' Architecture](https://github.com/cortexproject/cortex/blob/master/docs/architecture.md) to [VictoriaMetrics' architecture](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#architecture-overview). 
-- VictoriaMetrics provides a [production-ready single-node solution](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html), +* VictoriaMetrics provides a [production-ready single-node solution](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html), which is much easier to set up and operate than a Cortex cluster. -- Cortex may lose up to 12 hours of recent data on ingester failure – see [the corresponding docs](https://github.com/cortexproject/cortex/blob/fe56f1420099aa1bf1ce09316c186e05bddee879/docs/architecture.md#ingesters-failure-and-data-loss). +* Cortex may lose up to 12 hours of recent data on ingester failure – see [the corresponding docs](https://github.com/cortexproject/cortex/blob/fe56f1420099aa1bf1ce09316c186e05bddee879/docs/architecture.md#ingesters-failure-and-data-loss). VictoriaMetrics may lose only a few seconds of recent data, which hasn't been synced to persistent storage yet. See [this article for details](https://medium.com/@valyala/wal-usage-looks-broken-in-modern-time-series-databases-b62a627ab704). -- Cortex is usually slower and requires more CPU and RAM than VictoriaMetrics. See [this talk from adidas at PromCon 2019](https://promcon.io/2019-munich/talks/remote-write-storage-wars/) and [other case studies](https://docs.victoriametrics.com/CaseStudies.html). -- VictoriaMetrics accepts data via multiple popular data ingestion protocols in addition to the Prometheus remote_write protocol – InfluxDB, OpenTSDB, Graphite, CSV, JSON, native binary. +* Cortex is usually slower and requires more CPU and RAM than VictoriaMetrics. See [this talk from adidas at PromCon 2019](https://promcon.io/2019-munich/talks/remote-write-storage-wars/) and [other case studies](https://docs.victoriametrics.com/CaseStudies.html). +* VictoriaMetrics accepts data via multiple popular data ingestion protocols in addition to the Prometheus remote_write protocol – InfluxDB, OpenTSDB, Graphite, CSV, JSON, native binary.
See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-time-series-data) for details. -- VictoriaMetrics provides the [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) query language, while Cortex provides the [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/) query language. -- VictoriaMetrics can be queried via [Graphite's API](https://docs.victoriametrics.com/#graphite-api-usage). - +* VictoriaMetrics provides the [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) query language, while Cortex provides the [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/) query language. +* VictoriaMetrics can be queried via [Graphite's API](https://docs.victoriametrics.com/#graphite-api-usage). ## What is the difference between VictoriaMetrics and [Thanos](https://github.com/thanos-io/thanos)? -- Thanos re-uses Prometheus source code, while VictoriaMetrics is written from scratch. -- VictoriaMetrics accepts data via the [standard remote_write API for Prometheus](https://prometheus.io/docs/practices/remote_write/), +* Thanos re-uses Prometheus source code, while VictoriaMetrics is written from scratch. +* VictoriaMetrics accepts data via the [standard remote_write API for Prometheus](https://prometheus.io/docs/practices/remote_write/), while Thanos uses a non-standard [sidecar](https://github.com/thanos-io/thanos/blob/master/docs/components/sidecar.md) which must run alongside each Prometheus instance. -- The Thanos sidecar requires disabling data compaction in Prometheus, which may hurt Prometheus performance and increase RAM usage. See [these docs](https://thanos.io/tip/components/sidecar.md/) for more details. -- Thanos stores data in object storage (Amazon S3 or Google GCS), while VictoriaMetrics stores data in block storage +* The Thanos sidecar requires disabling data compaction in Prometheus, which may hurt Prometheus performance and increase RAM usage. 
See [these docs](https://thanos.io/tip/components/sidecar.md/) for more details. +* Thanos stores data in object storage (Amazon S3 or Google GCS), while VictoriaMetrics stores data in block storage ([GCP persistent disks](https://cloud.google.com/compute/docs/disks#pdspecs), Amazon EBS or bare metal HDD). While object storage is usually less expensive, block storage provides much lower latencies and higher throughput. VictoriaMetrics works perfectly with HDD-based block storage – there is no need for using more expensive SSD or NVMe disks in most cases. -- Thanos may lose up to 2 hours of recent data, which hasn't been uploaded to object storage yet. VictoriaMetrics may lose only a few seconds of recent data, +* Thanos may lose up to 2 hours of recent data, which hasn't been uploaded to object storage yet. VictoriaMetrics may lose only a few seconds of recent data, which hasn't been synced to persistent storage yet. See [this article for details](https://medium.com/@valyala/wal-usage-looks-broken-in-modern-time-series-databases-b62a627ab704). -- VictoriaMetrics provides a [production-ready single-node solution](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html), +* VictoriaMetrics provides a [production-ready single-node solution](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html), which is much easier to set up and operate than Thanos components. -- Thanos may be harder to set up and operate compared to VictoriaMetrics, since it has more moving parts, which can be connected with less reliable networks. +* Thanos may be harder to set up and operate compared to VictoriaMetrics, since it has more moving parts, which can be connected with less reliable networks. See [this article for details](https://medium.com/faun/comparing-thanos-to-victoriametrics-cluster-b193bea1683). -- Thanos is usually slower and requires more CPU and RAM than VictoriaMetrics.
See [this talk from adidas at PromCon 2019](https://promcon.io/2019-munich/talks/remote-write-storage-wars/). -- VictoriaMetrics accepts data via multiple popular data ingestion protocols in addition to the Prometheus remote_write protocol – InfluxDB, OpenTSDB, Graphite, CSV, JSON, native binary. +* Thanos is usually slower and requires more CPU and RAM than VictoriaMetrics. See [this talk from adidas at PromCon 2019](https://promcon.io/2019-munich/talks/remote-write-storage-wars/). +* VictoriaMetrics accepts data via multiple popular data ingestion protocols in addition to the Prometheus remote_write protocol – InfluxDB, OpenTSDB, Graphite, CSV, JSON, native binary. See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-time-series-data) for details. -- VictoriaMetrics provides the [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) query language, while Thanos provides the [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/) query language. -- VictoriaMetrics can be queried via [Graphite's API](https://docs.victoriametrics.com/#graphite-api-usage). - +* VictoriaMetrics provides the [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) query language, while Thanos provides the [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/) query language. +* VictoriaMetrics can be queried via [Graphite's API](https://docs.victoriametrics.com/#graphite-api-usage). ## How does VictoriaMetrics compare to [InfluxDB](https://www.influxdata.com/time-series-platform/influxdb/)? -- VictoriaMetrics requires [10x less RAM](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893) and it [works faster](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae). -- VictoriaMetrics needs lower amounts of storage space than InfluxDB for production data. 
-- VictoriaMetrics doesn't support InfluxQL or Flux but provides a better query language – [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html). See [this tutorial](https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085) for details. -- VictoriaMetrics accepts data in multiple popular data ingestion protocols in addition to InfluxDB – Prometheus remote_write, OpenTSDB, Graphite, CSV, JSON, native binary. +* VictoriaMetrics requires [10x less RAM](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893) and it [works faster](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae). +* VictoriaMetrics needs lower amounts of storage space than InfluxDB for production data. +* VictoriaMetrics doesn't support InfluxQL or Flux but provides a better query language – [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html). See [this tutorial](https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085) for details. +* VictoriaMetrics accepts data in multiple popular data ingestion protocols in addition to InfluxDB – Prometheus remote_write, OpenTSDB, Graphite, CSV, JSON, native binary. See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-time-series-data) for details. -- VictoriaMetrics can be queried via [Graphite's API](https://docs.victoriametrics.com/#graphite-api-usage). - +* VictoriaMetrics can be queried via [Graphite's API](https://docs.victoriametrics.com/#graphite-api-usage). ## How does VictoriaMetrics compare to [TimescaleDB](https://www.timescale.com/)? -- TimescaleDB insists on using SQL as a query language. While SQL is more powerful than PromQL, this power is rarely required during typical usages of a TSDB. Real-world queries usually [look clearer and simpler when written in PromQL than in SQL](https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085). 
-- VictoriaMetrics requires [up to 70x less storage space compared to TimescaleDB](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4) for storing the same amount of time series data. The gap in storage space usage can be lowered from 70x to 3x if [compression in TimescaleDB is properly configured](https://docs.timescale.com/latest/using-timescaledb/compression) (it isn't an easy task in general :)). -- VictoriaMetrics requires up to 10x less CPU and RAM resources than TimescaleDB for processing production data. See [this article](https://abiosgaming.com/press/high-cardinality-aggregations/) for details. -- TimescaleDB is [harder to set up, configure and operate](https://docs.timescale.com/timescaledb/latest/how-to-guides/install-timescaledb/self-hosted/ubuntu/installation-apt-ubuntu/) than VictoriaMetrics (see [how to run VictoriaMetrics](https://docs.victoriametrics.com/#how-to-start-victoriametrics)). -- VictoriaMetrics accepts data in multiple popular data ingestion protocols – InfluxDB, OpenTSDB, Graphite, CSV – while TimescaleDB supports only SQL inserts. -- VictoriaMetrics can be queried via [Graphite's API](https://docs.victoriametrics.com/#graphite-api-usage). - +* TimescaleDB insists on using SQL as a query language. While SQL is more powerful than PromQL, this power is rarely required during typical usages of a TSDB. Real-world queries usually [look clearer and simpler when written in PromQL than in SQL](https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085). +* VictoriaMetrics requires [up to 70x less storage space compared to TimescaleDB](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4) for storing the same amount of time series data. 
The gap in storage space usage can be lowered from 70x to 3x if [compression in TimescaleDB is properly configured](https://docs.timescale.com/latest/using-timescaledb/compression) (it isn't an easy task in general :)). +* VictoriaMetrics requires up to 10x less CPU and RAM resources than TimescaleDB for processing production data. See [this article](https://abiosgaming.com/press/high-cardinality-aggregations/) for details. +* TimescaleDB is [harder to set up, configure and operate](https://docs.timescale.com/timescaledb/latest/how-to-guides/install-timescaledb/self-hosted/ubuntu/installation-apt-ubuntu/) than VictoriaMetrics (see [how to run VictoriaMetrics](https://docs.victoriametrics.com/#how-to-start-victoriametrics)). +* VictoriaMetrics accepts data in multiple popular data ingestion protocols – InfluxDB, OpenTSDB, Graphite, CSV – while TimescaleDB supports only SQL inserts. +* VictoriaMetrics can be queried via [Graphite's API](https://docs.victoriametrics.com/#graphite-api-usage). ## Does VictoriaMetrics use Prometheus technologies like other clustered TSDBs built on top of Prometheus such as [Thanos](https://github.com/thanos-io/thanos) or [Cortex](https://github.com/cortexproject/cortex)? No. VictoriaMetrics core is written in Go from scratch by [fasthttp](https://github.com/valyala/fasthttp)'s [author](https://github.com/valyala). The architecture is [optimized for storing and querying large amounts of time series data with high cardinality](https://medium.com/devopslinks/victoriametrics-creating-the-best-remote-storage-for-prometheus-5d92d66787ac). VictoriaMetrics storage uses [certain ideas from ClickHouse](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282). Special thanks to [Alexey Milovidov](https://github.com/alexey-milovidov). - ## What is the pricing for VictoriaMetrics? 
The following versions are open source and free: + * [Single-node version](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html). * [Cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster). We provide commercial support for both versions. [Contact us](mailto:info@victoriametrics.com) for the pricing. The following commercial versions of VictoriaMetrics are available: + * [Managed VictoriaMetrics at AWS](https://aws.amazon.com/marketplace/pp/prodview-4tbfq5icmbmyc) (aka managed Prometheus). The following commercial versions of VictoriaMetrics are planned: + * Managed VictoriaMetrics at Google Cloud. * Cloud monitoring solution based on VictoriaMetrics. [Contact us](mailto:info@victoriametrics.com) for more information on our plans. - ## Why doesn't VictoriaMetrics support the [Prometheus remote read API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#%3Cremote_read%3E)? The remote read API requires transferring all the raw data for all the requested metrics over the given time range. For instance, @@ -225,19 +210,17 @@ Prometheus' remote read API isn't intended for querying foreign data – aka `gl So just query VictoriaMetrics directly via [vmui](https://docs.victoriametrics.com/#vmui), the [Prometheus Querying API](https://docs.victoriametrics.com/#prometheus-querying-api-usage) or via [Prometheus datasource in Grafana](https://docs.victoriametrics.com/#grafana-setup). - ## Does VictoriaMetrics deduplicate data from Prometheus instances scraping the same targets (aka `HA pairs`)? Yes. See [these docs](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#deduplication) for details. - ## Where is the source code of VictoriaMetrics? 
Source code for the following versions is available in the following places: + * [Single-node version](https://github.com/VictoriaMetrics/VictoriaMetrics) * [Cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster) - ## Is VictoriaMetrics a good fit for data from IoT sensors and industrial sensors? VictoriaMetrics is able to handle data from hundreds of millions of IoT sensors and industrial sensors. @@ -245,26 +228,22 @@ It supports [high cardinality data](https://medium.com/@valyala/high-cardinality perfectly [scales up on a single node](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae) and scales horizontally to multiple nodes. - ## Where can I ask questions about VictoriaMetrics? Questions about VictoriaMetrics can be asked via the following channels: -- [Slack channel](https://slack.victoriametrics.com/) -- [Telegram channel](https://t.me/VictoriaMetrics_en) -- [Google group](https://groups.google.com/forum/#!forum/victorametrics-users) - +* [Slack channel](https://slack.victoriametrics.com/) +* [Telegram channel](https://t.me/VictoriaMetrics_en) +* [Google group](https://groups.google.com/forum/#!forum/victorametrics-users) ## Where can I file bugs and feature requests regarding VictoriaMetrics? File bugs and feature requests [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues). - ## Where can I find information about multi-tenancy? See [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy). Multitenancy is supported only by the [cluster version](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) of VictoriaMetrics. - ## How to set a memory limit for VictoriaMetrics components? 
All the VictoriaMetrics components provide command-line flags to control the size of internal buffers and caches: `-memory.allowedPercent` and `-memory.allowedBytes` (pass `-help` to any VictoriaMetrics component in order to see the description for these flags). These limits don't take into account additional memory, which may be needed for processing incoming queries. Hard limits may be enforced only by the OS via [cgroups](https://en.wikipedia.org/wiki/Cgroups), Docker (see [these docs](https://docs.docker.com/config/containers/resource_constraints)) or Kubernetes (see [these docs](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers)). @@ -276,22 +255,18 @@ Memory usage for VictoriaMetrics components can be tuned according to the follow * [Troubleshooting for vmagent](https://docs.victoriametrics.com/vmagent.html#troubleshooting) * [Troubleshooting for single-node VictoriaMetrics](https://docs.victoriametrics.com/#troubleshooting) - ## How can I run VictoriaMetrics on FreeBSD? VictoriaMetrics is included in FreeBSD ports, so just install it from there. See [this link](https://www.freebsd.org/cgi/ports.cgi?query=victoria&stype=all). - ## Does VictoriaMetrics support the Graphite query language? Yes. See [these docs](https://docs.victoriametrics.com/#graphite-api-usage). - ## What is an active time series? A time series is uniquely identified by its name plus a set of its labels. For example, `temperature{city="NY",country="US"}` and `temperature{city="SF",country="US"}` are two distinct series, since they differ by the `city` label. A time series is considered active if it receives at least a single new sample during the last hour. - ## What is high churn rate? If old time series are constantly substituted by new time series at a high rate, then such a state is called `high churn rate`. 
High churn rate has the following negative consequences: @@ -309,54 +284,44 @@ The main reason for high churn rate is a metric label with frequently changed va The solution against high churn rate is to identify and eliminate labels with frequently changed values. The [/api/v1/status/tsdb](https://docs.victoriametrics.com/#tsdb-stats) page can help determining these labels. - ## What is high cardinality? High cardinality usually means a high number of [active time series](#what-is-an-active-time-series). High cardinality may lead to high memory usage and/or to a high percentage of [slow inserts](#what-is-a-slow-insert). The source of high cardinality is usually a label with a large number of unique values, which presents a big share of the ingested time series. The solution is to identify and remove the source of high cardinality with the help of [/api/v1/status/tsdb](https://docs.victoriametrics.com/#tsdb-stats). - ## What is a slow insert? VictoriaMetrics maintains in-memory cache for mapping of [active time series](#what-is-an-active-time-series) into internal series ids. The cache size depends on the available memory for VictoriaMetrics in the host system. If the information about all the active time series doesn't fit the cache, then VictoriaMetrics needs to read and unpack the information from disk on every incoming sample for time series missing in the cache. This operation is much slower than the cache lookup, so such an insert is named a `slow insert`. A high percentage of slow inserts on the [official dashboard for VictoriaMetrics](https://docs.victoriametrics.com/#monitoring) indicates a memory shortage for the current number of [active time series](#what-is-an-active-time-series). Such a condition usually leads to a significant slowdown for data ingestion and to significantly increased disk IO and CPU usage. The solution is to add more memory or to reduce the number of [active time series](#what-is-an-active-time-series). 
The `/api/v1/status/tsdb` page can be helpful for locating the source of high number of active time seriess – see [these docs](https://docs.victoriametrics.com/#tsdb-stats). - ## How to optimize MetricsQL query? See [this article](https://valyala.medium.com/how-to-optimize-promql-and-metricsql-queries-85a1b75bf986). - ## Why isn't MetricsQL 100% compatible with PromQL? [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) provides better user experience than PromQL. It fixes a few annoying issues in PromQL. This prevents MetricsQL to be 100% compatible with PromQL. See [this article](https://medium.com/@romanhavronenko/victoriametrics-promql-compliance-d4318203f51e) for details. - ## How to migrate data from Prometheus to VictoriaMetrics? Please see [these docs](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-prometheus). - ## How to migrate data from InfluxDB to VictoriaMetrics? Please see [these docs](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-influxdb-1x). - ## How to migrate data from OpenTSDB to VictoriaMetrics? Please see [these docs](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-opentsdb). - ## How to migrate data from Graphite to VictoriaMetrics? Please use the [whisper-to-graphite](https://github.com/bzed/whisper-to-graphite) tool for reading data from Graphite and pushing them to VictoriaMetrics via [Graphite's import API](https://docs.victoriametrics.com/#how-to-send-data-from-graphite-compatible-agents-such-as-statsd). - ## Why do the same metrics have differences in VictoriaMetrics' and Prometheus' dashboards? There could be a slight difference in stored values for time series. Due to different compression algorithms, VM may reduce the precision for float values with more than 12 significant decimal digits. Please see [this article](https://valyala.medium.com/evaluating-performance-and-correctness-victoriametrics-response-e27315627e87). 
The query engine may behave differently for some functions. Please see [this article](https://medium.com/@romanhavronenko/victoriametrics-promql-compliance-d4318203f51e). - ## If downsampling and deduplication are enabled how will this work? [Deduplication](https://docs.victoriametrics.com/#deduplication) is a special case of zero-offset [downsampling](https://docs.victoriametrics.com/#downsampling). So, if both downsampling and deduplication are enabled, then deduplication is replaced by zero-offset downsampling diff --git a/docs/MetricsQL.md b/docs/MetricsQL.md index 6a1d8f1e9..9eaa373c0 100644 --- a/docs/MetricsQL.md +++ b/docs/MetricsQL.md @@ -13,6 +13,7 @@ However, there are some [intentional differences](https://medium.com/@romanhavro If you are unfamiliar with PromQL, then it is suggested reading [this tutorial for beginners](https://medium.com/@valyala/promql-tutorial-for-beginners-9ab455142085). The following functionality is implemented differently in MetricsQL compared to PromQL. This improves user experience: + * MetricsQL takes into account the previous point before the window in square brackets for range functions such as [rate](#rate) and [increase](#increase). This allows returning the exact results users expect for `increase(metric[$__interval])` queries instead of incomplete results Prometheus returns for such queries. * MetricsQL doesn't extrapolate range function results. This addresses [this issue from Prometheus](https://github.com/prometheus/prometheus/issues/3746). See technical details about VictoriaMetrics and Prometheus calculations for [rate](#rate) and [increase](#increase) [in this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1215#issuecomment-850305711). * MetricsQL returns the expected non-empty responses for [rate](#rate) with `step` values smaller than scrape interval. This addresses [this issue from Grafana](https://github.com/grafana/grafana/issues/11451). 
See also [this blog post](https://www.percona.com/blog/2020/02/28/better-prometheus-rate-function-with-victoriametrics/). @@ -30,27 +31,26 @@ MetricsQL implements [PromQL](https://medium.com/@valyala/promql-tutorial-for-be This functionality can be evaluated at [an editable Grafana dashboard](https://play-grafana.victoriametrics.com/d/4ome8yJmz/node-exporter-on-victoriametrics-demo) or at your own [VictoriaMetrics instance](https://docs.victoriametrics.com/#how-to-start-victoriametrics). -- Graphite-compatible filters can be passed via `{__graphite__="foo.*.bar"}` syntax. See [these docs](https://docs.victoriametrics.com/#selecting-graphite-metrics). VictoriaMetrics also can be used as Graphite datasource in Grafana. See [these docs](https://docs.victoriametrics.com/#graphite-api-usage) for details. See also [label_graphite_group](#label_graphite_group) function, which can be used for extracting the given groups from Graphite metric name. -- Lookbehind window in square brackets may be omitted. VictoriaMetrics automatically selects the lookbehind window depending on the current step used for building the graph (e.g. `step` query arg passed to [/api/v1/query_range](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries)). For instance, the following query is valid in VictoriaMetrics: `rate(node_network_receive_bytes_total)`. It is equivalent to `rate(node_network_receive_bytes_total[$__interval])` when used in Grafana. -- [Aggregate functions](#aggregate-functions) accept arbitrary number of args. For example, `avg(q1, q2, q3)` would return the average values for every point across time series returned by `q1`, `q2` and `q3`. -- [@ modifier](https://prometheus.io/docs/prometheus/latest/querying/basics/#modifier) can be put anywhere in the query. For example, `sum(foo) @ end()` calculates `sum(foo)` at the `end` timestamp of the selected time range `[start ... end]`. 
-- Arbitrary subexpression can be used as [@ modifier](https://prometheus.io/docs/prometheus/latest/querying/basics/#modifier). For example, `foo @ (end() - 1h)` calculates `foo` at the `end - 1 hour` timestamp on the selected time range `[start ... end]`. -- [offset](https://prometheus.io/docs/prometheus/latest/querying/basics/#offset-modifier), lookbehind window in square brackets and `step` value for [subquery](#subqueries) may refer to the current step aka `$__interval` value from Grafana with `[Ni]` syntax. For instance, `rate(metric[10i] offset 5i)` would return per-second rate over a range covering 10 previous steps with the offset of 5 steps. -- [offset](https://prometheus.io/docs/prometheus/latest/querying/basics/#offset-modifier) may be put anywere in the query. For instance, `sum(foo) offset 24h`. -- Lookbehind window in square brackets and [offset](https://prometheus.io/docs/prometheus/latest/querying/basics/#offset-modifier) may be fractional. For instance, `rate(node_network_receive_bytes_total[1.5m] offset 0.5d)`. -- The duration suffix is optional. The duration is in seconds if the suffix is missing. For example, `rate(m[300] offset 1800)` is equivalent to `rate(m[5m]) offset 30m`. -- The duration can be placed anywhere in the query. For example, `sum_over_time(m[1h]) / 1h` is equivalent to `sum_over_time(m[1h]) / 3600`. -- Trailing commas on all the lists are allowed - label filters, function args and with expressions. For instance, the following queries are valid: `m{foo="bar",}`, `f(a, b,)`, `WITH (x=y,) x`. This simplifies maintenance of multi-line queries. -- Metric names and metric labels may contain escaped chars. For instance, `foo\-bar{baz\=aa="b"}` is valid expression. It returns time series with name `foo-bar` containing label `baz=aa` with value `b`. Additionally, `\xXX` escape sequence is supported, where `XX` is hexadecimal representation of escaped char. 
-- Aggregate functions support optional `limit N` suffix in order to limit the number of output series. For example, `sum(x) by (y) limit 3` limits the number of output time series after the aggregation to 3. All the other time series are dropped. -- [histogram_quantile](#histogram_quantile) accepts optional third arg - `boundsLabel`. In this case it returns `lower` and `upper` bounds for the estimated percentile. See [this issue for details](https://github.com/prometheus/prometheus/issues/5706). -- `default` binary operator. `q1 default q2` fills gaps in `q1` with the corresponding values from `q2`. -- `if` binary operator. `q1 if q2` removes values from `q1` for missing values from `q2`. -- `ifnot` binary operator. `q1 ifnot q2` removes values from `q1` for existing values from `q2`. -- String literals may be concatenated. This is useful with `WITH` templates: `WITH (commonPrefix="long_metric_prefix_") {__name__=commonPrefix+"suffix1"} / {__name__=commonPrefix+"suffix2"}`. -- `WITH` templates. This feature simplifies writing and managing complex queries. Go to [WITH templates playground](https://play.victoriametrics.com/promql/expand-with-exprs) and try it. -- `keep_metric_names` modifier can be applied to all the [rollup functions](#rollup-functions) and [transform functions](#transform-functions). This modifier prevents from dropping metric names in function results. For example, `rate({__name__=~"foo|bar"}[5m]) keep_metric_names` leaves `foo` and `bar` metric names in the resulting time series. - +* Graphite-compatible filters can be passed via `{__graphite__="foo.*.bar"}` syntax. See [these docs](https://docs.victoriametrics.com/#selecting-graphite-metrics). VictoriaMetrics also can be used as Graphite datasource in Grafana. See [these docs](https://docs.victoriametrics.com/#graphite-api-usage) for details. See also [label_graphite_group](#label_graphite_group) function, which can be used for extracting the given groups from Graphite metric name. 
+* Lookbehind window in square brackets may be omitted. VictoriaMetrics automatically selects the lookbehind window depending on the current step used for building the graph (e.g. `step` query arg passed to [/api/v1/query_range](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries)). For instance, the following query is valid in VictoriaMetrics: `rate(node_network_receive_bytes_total)`. It is equivalent to `rate(node_network_receive_bytes_total[$__interval])` when used in Grafana. +* [Aggregate functions](#aggregate-functions) accept arbitrary number of args. For example, `avg(q1, q2, q3)` would return the average values for every point across time series returned by `q1`, `q2` and `q3`. +* [@ modifier](https://prometheus.io/docs/prometheus/latest/querying/basics/#modifier) can be put anywhere in the query. For example, `sum(foo) @ end()` calculates `sum(foo)` at the `end` timestamp of the selected time range `[start ... end]`. +* Arbitrary subexpression can be used as [@ modifier](https://prometheus.io/docs/prometheus/latest/querying/basics/#modifier). For example, `foo @ (end() - 1h)` calculates `foo` at the `end - 1 hour` timestamp on the selected time range `[start ... end]`. +* [offset](https://prometheus.io/docs/prometheus/latest/querying/basics/#offset-modifier), lookbehind window in square brackets and `step` value for [subquery](#subqueries) may refer to the current step aka `$__interval` value from Grafana with `[Ni]` syntax. For instance, `rate(metric[10i] offset 5i)` would return per-second rate over a range covering 10 previous steps with the offset of 5 steps. +* [offset](https://prometheus.io/docs/prometheus/latest/querying/basics/#offset-modifier) may be put anywhere in the query. For instance, `sum(foo) offset 24h`. +* Lookbehind window in square brackets and [offset](https://prometheus.io/docs/prometheus/latest/querying/basics/#offset-modifier) may be fractional.
For instance, `rate(node_network_receive_bytes_total[1.5m] offset 0.5d)`. +* The duration suffix is optional. The duration is in seconds if the suffix is missing. For example, `rate(m[300] offset 1800)` is equivalent to `rate(m[5m]) offset 30m`. +* The duration can be placed anywhere in the query. For example, `sum_over_time(m[1h]) / 1h` is equivalent to `sum_over_time(m[1h]) / 3600`. +* Trailing commas on all the lists are allowed - label filters, function args and with expressions. For instance, the following queries are valid: `m{foo="bar",}`, `f(a, b,)`, `WITH (x=y,) x`. This simplifies maintenance of multi-line queries. +* Metric names and metric labels may contain escaped chars. For instance, `foo\-bar{baz\=aa="b"}` is valid expression. It returns time series with name `foo-bar` containing label `baz=aa` with value `b`. Additionally, `\xXX` escape sequence is supported, where `XX` is hexadecimal representation of escaped char. +* Aggregate functions support optional `limit N` suffix in order to limit the number of output series. For example, `sum(x) by (y) limit 3` limits the number of output time series after the aggregation to 3. All the other time series are dropped. +* [histogram_quantile](#histogram_quantile) accepts optional third arg - `boundsLabel`. In this case it returns `lower` and `upper` bounds for the estimated percentile. See [this issue for details](https://github.com/prometheus/prometheus/issues/5706). +* `default` binary operator. `q1 default q2` fills gaps in `q1` with the corresponding values from `q2`. +* `if` binary operator. `q1 if q2` removes values from `q1` for missing values from `q2`. +* `ifnot` binary operator. `q1 ifnot q2` removes values from `q1` for existing values from `q2`. +* String literals may be concatenated. This is useful with `WITH` templates: `WITH (commonPrefix="long_metric_prefix_") {__name__=commonPrefix+"suffix1"} / {__name__=commonPrefix+"suffix2"}`. +* `WITH` templates. 
This feature simplifies writing and managing complex queries. Go to [WITH templates playground](https://play.victoriametrics.com/promql/expand-with-exprs) and try it. +* `keep_metric_names` modifier can be applied to all the [rollup functions](#rollup-functions) and [transform functions](#transform-functions). This modifier prevents from dropping metric names in function results. For example, `rate({__name__=~"foo|bar"}[5m]) keep_metric_names` leaves `foo` and `bar` metric names in the resulting time series. ## MetricsQL functions @@ -63,20 +63,19 @@ MetricsQL provides the following functions: * [Label manipulation functions](#label-manipulation-functions) * [Aggregate functions](#aggregate-functions) - ### Rollup functions **Rollup functions** (aka range functions or window functions) calculate rollups over **raw samples** on the given lookbehind window for the [selected time series](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors). For example, `avg_over_time(temperature[24h])` calculates the average temperature over raw samples for the last 24 hours. Additional details: - * If rollup functions are used for building graphs in Grafana, then the rollup is calculated independently per each point on the graph. For example, every point for `avg_over_time(temperature[24h])` graph shows the average temperature for the last 24 hours ending at this point. The interval between points is set as `step` query arg passed by Grafana to [/api/v1/query_range](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries). - * If the given [series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) returns multiple time series, then rollups are calculated individually per each returned series. 
- * If lookbehind window in square brackets is missing, then MetricsQL automatically sets the lookbehind window to the interval between points on the graph (aka `step` query arg at [/api/v1/query_range](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries), `$__interval` value from Grafana or `1i` duration in MetricsQL). For example, `rate(http_requests_total)` is equivalent to `rate(http_requests_total[$__interval])` in Grafana. It is also equivalent to `rate(http_requests_total[1i])`. - * Every [series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) in MetricsQL must be wrapped into a rollup function. Otherwise it is automatically wrapped into [default_rollup](#default_rollup). For example, `foo{bar="baz"}` is automatically converted to `default_rollup(foo{bar="baz"}[1i])` before performing the calculations. - * If something other than [series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) is passed to rollup function, then the inner arg is automatically converted to a [subquery](#subqueries). - * All the rollup functions accept optional `keep_metric_names` modifier. If it is set, then the function keeps metric names in results. For example, `rate({__name__=~"foo|bar}[5m]) keep_metric_names` leaves `foo` and `bar` metric names in results. + +* If rollup functions are used for building graphs in Grafana, then the rollup is calculated independently per each point on the graph. For example, every point for `avg_over_time(temperature[24h])` graph shows the average temperature for the last 24 hours ending at this point. The interval between points is set as `step` query arg passed by Grafana to [/api/v1/query_range](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries). 
+* If the given [series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) returns multiple time series, then rollups are calculated individually per each returned series. +* If lookbehind window in square brackets is missing, then MetricsQL automatically sets the lookbehind window to the interval between points on the graph (aka `step` query arg at [/api/v1/query_range](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries), `$__interval` value from Grafana or `1i` duration in MetricsQL). For example, `rate(http_requests_total)` is equivalent to `rate(http_requests_total[$__interval])` in Grafana. It is also equivalent to `rate(http_requests_total[1i])`. +* Every [series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) in MetricsQL must be wrapped into a rollup function. Otherwise it is automatically wrapped into [default_rollup](#default_rollup). For example, `foo{bar="baz"}` is automatically converted to `default_rollup(foo{bar="baz"}[1i])` before performing the calculations. +* If something other than [series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) is passed to rollup function, then the inner arg is automatically converted to a [subquery](#subqueries). +* All the rollup functions accept optional `keep_metric_names` modifier. If it is set, then the function keeps metric names in results. For example, `rate({__name__=~"foo|bar}[5m]) keep_metric_names` leaves `foo` and `bar` metric names in results. See also [implicit query conversions](#implicit-query-conversions). - #### absent_over_time `absent_over_time(series_selector[d])` returns 1 if the given lookbehind window `d` doesn't contain raw samples. Otherwise it returns an empty result. This function is supported by PromQL. See also [present_over_time](#present_over_time). 
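The rollup semantics described in this hunk (a lookbehind window evaluated independently at every `step`, with `absent_over_time` returning 1 only for an empty window) can be illustrated with a small sketch. This is not the VictoriaMetrics implementation: the helper names `window` and `query_range` are hypothetical, and the half-open window `(t - d, t]` is an assumption made for the example.

```python
from typing import Callable, List, Optional, Tuple

Sample = Tuple[int, float]  # (unix timestamp in seconds, value)


def window(samples: List[Sample], t: int, d: int) -> List[float]:
    """Values of raw samples inside the lookbehind window (t - d, t] (assumed bounds)."""
    return [v for ts, v in samples if t - d < ts <= t]


def avg_over_time(samples: List[Sample], t: int, d: int) -> Optional[float]:
    """Average of the raw samples in the window, or None for an empty window."""
    vals = window(samples, t, d)
    return sum(vals) / len(vals) if vals else None


def absent_over_time(samples: List[Sample], t: int, d: int) -> Optional[float]:
    """1 if the window holds no raw samples, otherwise an empty result (None)."""
    return None if window(samples, t, d) else 1.0


def query_range(
    rollup: Callable[[List[Sample], int, int], Optional[float]],
    samples: List[Sample],
    start: int,
    end: int,
    step: int,
    d: int,
) -> List[Tuple[int, float]]:
    """Evaluate the rollup independently at every point start, start+step, ..., end."""
    points = []
    t = start
    while t <= end:
        value = rollup(samples, t, d)
        if value is not None:  # empty results produce no point on the graph
            points.append((t, value))
        t += step
    return points
```

Each graph point is computed from its own window, which is why a missing `[d]` can fall back to the `step` value without changing how the calculation is structured.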
@@ -361,16 +360,15 @@ See also [implicit query conversions](#implicit-query-conversions). `zscore_over_time(series_selector[d])` calculates returns [z-score](https://en.wikipedia.org/wiki/Standard_score) for raw samples on the given lookbehind window `d`. It is calculated independently per each time series returned from the given [series_selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors). Metric names are stripped from the resulting rollups. Add `keep_metric_names` modifier in order to keep metric names. - ### Transform functions **Transform functions** calculate transformations over rollup results. For example, `abs(delta(temperature[24h]))` calculates the absolute value for every point of every time series returned from the rollup `delta(temperature[24h])`. Additional details: - * If transform function is applied directly to a [series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors), then the [default_rollup()](#default_rollup) function is automatically applied before calculating the transformations. For example, `abs(temperature)` is implicitly transformed to `abs(default_rollup(temperature[1i]))`. - * All the transform functions accept optional `keep_metric_names` modifier. If it is set, then the function doesn't drop metric names from the resulting time series. For example, `ln({__name__=~"foo|bar"}) keep_metric_names` leaves `foo` and `bar` metric names in results. + +* If transform function is applied directly to a [series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors), then the [default_rollup()](#default_rollup) function is automatically applied before calculating the transformations. For example, `abs(temperature)` is implicitly transformed to `abs(default_rollup(temperature[1i]))`. +* All the transform functions accept optional `keep_metric_names` modifier. 
If it is set, then the function doesn't drop metric names from the resulting time series. For example, `ln({__name__=~"foo|bar"}) keep_metric_names` leaves `foo` and `bar` metric names in results. See also [implicit query conversions](#implicit-query-conversions). - #### abs `abs(q)` calculates the absolute value for every point of every time series returned by `q`. This function is supported by PromQL. @@ -547,7 +545,6 @@ See also [implicit query conversions](#implicit-query-conversions). `rad(q)` converts [degrees to Radians](https://en.wikipedia.org/wiki/Radian#Conversions) for every point of every time series returned by `q`. Metric names are stripped from the resulting series. Add `keep_metric_names` modifier in order to keep metric names. This function is supported by PromQL. See also [deg](#deg). - #### prometheus_buckets `prometheus_buckets(buckets)` converts [VictoriaMetrics histogram buckets](https://valyala.medium.com/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350) with `vmrange` labels to Prometheus histogram buckets with `le` labels. This may be useful for building heatmaps in Grafana. See also [histogram_quantile](#histogram_quantile) and [buckets_limit](#buckets_limit). @@ -704,15 +701,14 @@ See also [implicit query conversions](#implicit-query-conversions). `year(q)` returns the year for every point of every time series returned by `q`. It is expected that `q` returns unix timestamps. Metric names are stripped from the resulting series. Add `keep_metric_names` modifier in order to keep metric names. This function is supported by PromQL. - ### Label manipulation functions **Label manipulation functions** perform manipulations with lables on the selected rollup results. 
Additional details: - * If label manipulation function is applied directly to a [series_selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors), then the [default_rollup()](#default_rollup) function is automatically applied before performing the label transformation. For example, `alias(temperature, "foo")` is implicitly transformed to `alias(default_rollup(temperature[1i]), "foo")`. + +* If label manipulation function is applied directly to a [series_selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors), then the [default_rollup()](#default_rollup) function is automatically applied before performing the label transformation. For example, `alias(temperature, "foo")` is implicitly transformed to `alias(default_rollup(temperature[1i]), "foo")`. See also [implicit query conversions](#implicit-query-conversions). - #### alias `alias(q, "name")` sets the given `name` to all the time series returned by `q`. For example, `alias(up, "foobar")` would rename `up` series to `foobar` series. @@ -777,24 +773,23 @@ sum by (__name__) ( #### label_uppercase -`label_uppercase(q, "label1", ..., "labelN")` uppercases values for the given `label*` labels in all the time series returned by `q`. +`label_uppercase(q, "label1", ..., "labelN")` uppercases values for the given `label*` labels in all the time series returned by `q`. #### label_value `label_value(q, "label")` returns number values for the given `label` for every time series returned by `q`. For example, if `label_value(foo, "bar")` is applied to `foo{bar="1.234"}`, then it will return a time series `foo{bar="1.234"}` with `1.234` value. - ### Aggregate functions **Aggregate functions** calculate aggregates over groups of rollup results. Additional details: - * By default a single group is used for aggregation. Multiple independent groups can be set up by specifying grouping labels in `by` and `without` modifiers. 
For example, `count(up) by (job)` would group rollup results by `job` label value and calculate the [count](#count) aggregate function independently per each group, while `count(up) without (instance)` would group rollup results by all the labels except `instance` before calculating [count](#count) aggregate function independently per each group. Multiple labels can be put in `by` and `without` modifiers. - * If the aggregate function is applied directly to a [series_selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors), then the [default_rollup()](#default_rollup) function is automatically applied before cacluating the aggregate. For example, `count(up)` is implicitly transformed to `count(default_rollup(up[1i]))`. - * Aggregate functions accept arbitrary number of args. For example, `avg(q1, q2, q3)` would return the average values for every point across time series returned by `q1`, `q2` and `q3`. - * Aggregate functions support optional `limit N` suffix, which can be used for limiting the number of output groups. For example, `sum(x) by (y) limit 3` limits the number of groups for the aggregation to 3. All the other groups are ignored. + +* By default a single group is used for aggregation. Multiple independent groups can be set up by specifying grouping labels in `by` and `without` modifiers. For example, `count(up) by (job)` would group rollup results by `job` label value and calculate the [count](#count) aggregate function independently per each group, while `count(up) without (instance)` would group rollup results by all the labels except `instance` before calculating [count](#count) aggregate function independently per each group. Multiple labels can be put in `by` and `without` modifiers. 
+* If the aggregate function is applied directly to a [series_selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors), then the [default_rollup()](#default_rollup) function is automatically applied before cacluating the aggregate. For example, `count(up)` is implicitly transformed to `count(default_rollup(up[1i]))`. +* Aggregate functions accept arbitrary number of args. For example, `avg(q1, q2, q3)` would return the average values for every point across time series returned by `q1`, `q2` and `q3`. +* Aggregate functions support optional `limit N` suffix, which can be used for limiting the number of output groups. For example, `sum(x) by (y) limit 3` limits the number of groups for the aggregation to 3. All the other groups are ignored. See also [implicit query conversions](#implicit-query-conversions). - #### any `any(q) by (group_labels)` returns a single series per `group_labels` out of time series returned by `q`. See also [group](#group). @@ -821,7 +816,7 @@ See also [implicit query conversions](#implicit-query-conversions). #### bottomk_median -`bottomk_median(k, q, "other_label=other_value")` returns up to `k` time series from `q with the smallest medians. If an optional `other_label=other_value` arg is set, then the sum of the remaining time series is returned with the given label. For example, `bottomk_median(3, sum(process_resident_memory_bytes) by (job), "job=other")` would return up to 3 time series with the smallest medians plus a time series with `{job="other"}` label with the sum of the remaining series if any. See also [topk_median](#topk_median). +`bottomk_median(k, q, "other_label=other_value")` returns up to `k` time series from `q with the smallest medians. If an optional`other_label=other_value` arg is set, then the sum of the remaining time series is returned with the given label. 
For example, `bottomk_median(3, sum(process_resident_memory_bytes) by (job), "job=other")` would return up to 3 time series with the smallest medians plus a time series with `{job="other"}` label with the sum of the remaining series if any. See also [topk_median](#topk_median). #### bottomk_min @@ -935,7 +930,6 @@ See also [implicit query conversions](#implicit-query-conversions). `zscore(q) by (group_labels)` returns [z-score](https://en.wikipedia.org/wiki/Standard_score) values per each `group_labels` for all the time series returned by `q`. The aggregate is calculated individually per each group of points with the same timestamp. Useful for detecting anomalies in the group of related time series. - ## Subqueries MetricsQL supports and extends PromQL subqueries. See [this article](https://valyala.medium.com/prometheus-subqueries-in-victoriametrics-9b1492b720b3) for details. Any [rollup function](#rollup-functions) for something other than [series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) form a subquery. Nested rollup functions can be implicit thanks to the [implicit query conversions](#implicit-query-conversions). For example, `delta(sum(m))` is implicitly converted to `delta(sum(default_rollup(m[1i]))[1i:1i])`, so it becomes a subquery, since it contains [default_rollup](#default_rollup) nested into [delta](#delta). @@ -945,7 +939,6 @@ VictoriaMetrics performs subqueries in the following way: * It calculates the inner rollup function using the `step` value from the outer rollup function. For example, for expression `max_over_time(rate(http_requests_total[5m])[1h:30s])` the inner function `rate(http_requests_total[5m])` is calculated with `step=30s`. The resulting data points are aligned by the `step`. 
* It calculates the outer rollup function over the results of the inner rollup function using the `step` value passed by Grafana to [/api/v1/query_range](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries). - ## Implicit query conversions VictoriaMetrics performs the following implicit conversions for incoming queries before starting the calculations: diff --git a/docs/PerTenantStatistic.md b/docs/PerTenantStatistic.md index ebc5f4f3c..4cb802695 100644 --- a/docs/PerTenantStatistic.md +++ b/docs/PerTenantStatistic.md @@ -9,30 +9,31 @@ sort: 18 cluster-per-tenant-stat VictoriaMetrics cluster for enterprise provides various metrics and statistics usage per tenant: + - `vminsert` - * `vm_tenant_inserted_rows_total` - total number of inserted rows. Find out which tenant - puts the most of the pressure on the storage. - + - `vm_tenant_inserted_rows_total` - total number of inserted rows. Find out which tenant + puts the most of the pressure on the storage. + - `vmselect` - * `vm_tenant_select_requests_duration_ms_total` - query latency. + - `vm_tenant_select_requests_duration_ms_total` - query latency. Helps to identify tenants with the heaviest queries. - * `vm_tenant_select_requests_total` - total number of requests. + - `vm_tenant_select_requests_total` - total number of requests. Discover which tenant sends the most of the queries and how it changes with time. - `vmstorage` - * `vm_tenant_active_timeseries` - number of active time series. - This metric correlates with memory usage, so can be used to find the most expensive - tenant in terms of memory. - * `vm_tenant_used_tenant_bytes` - disk space usage. Helps to track disk space usage + - `vm_tenant_active_timeseries` - number of active time series. + This metric correlates with memory usage, so can be used to find the most expensive + tenant in terms of memory. + - `vm_tenant_used_tenant_bytes` - disk space usage. Helps to track disk space usage per tenant. 
- * `vm_tenant_timeseries_created_total` - number of new time series created. Helps to track + - `vm_tenant_timeseries_created_total` - number of new time series created. Helps to track the churn rate per tenant, or identify inefficient usage of the system. -Collect the metrics by any scrape agent you like (`vmagent`, `victoriametrics`, Prometheus, etc) and put into TSDB. +Collect the metrics by any scrape agent you like (`vmagent`, `victoriametrics`, Prometheus, etc) and put into TSDB. It is ok to use existing cluster for storing such metrics, but make sure to use a different tenant for it to avoid collisions. -Or just run a separate TSDB (VM single, Promethes, etc.) to keep the data isolated from the main cluster. +Or just run a separate TSDB (VM single, Promethes, etc.) to keep the data isolated from the main cluster. -Example of the scraping configuration for statistic is the following: +Example of the scraping configuration for statistic is the following: ```yaml scrape_configs: @@ -44,18 +45,17 @@ scrape_configs: ## Visualization -Visualisation of statistics can be done in Grafana using the following +Visualisation of statistics can be done in Grafana using the following [dashboard](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster/dashboards/clusterbytenant.json). - ## Integration with vmgateway -`vmgateway` supports integration with Per Tenant Statistics data for rate limiting purposes. +`vmgateway` supports integration with Per Tenant Statistics data for rate limiting purposes. More information can be found [here](https://docs.victoriametrics.com/vmgateway.html) ## Integration with vmalert -You can generate alerts based on each tenant's resource usage and send notifications +You can generate alerts based on each tenant's resource usage and send notifications to prevent limits exhaustion. 
Here is an alert example for high churn rate by the tenant: diff --git a/docs/Quick-Start.md b/docs/Quick-Start.md index 0bc2f0c19..c6fc676a0 100644 --- a/docs/Quick-Start.md +++ b/docs/Quick-Start.md @@ -16,5 +16,6 @@ Open `http://localhost:8428` in web browser and read [these docs](https://docs.v VictoriaMetrics is also available in binaries (see [this page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)) and in source code (see [how to build VictoriaMetrics from sources](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-build-from-sources)). There are also the following versions of VictoriaMetrics available: + * [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) - horizontally scalable VictoriaMetrics, which scales to multiple nodes. * [Managed VictoriaMetrics at AWS](https://aws.amazon.com/marketplace/pp/prodview-4tbfq5icmbmyc). diff --git a/docs/README.md b/docs/README.md index 6d950b239..136070a37 100644 --- a/docs/README.md +++ b/docs/README.md @@ -21,7 +21,6 @@ Cluster version of VictoriaMetrics is available [here](https://docs.victoriametr [Contact us](mailto:info@victoriametrics.com) if you need enterprise support for VictoriaMetrics. See [features available in enterprise package](https://victoriametrics.com/products/enterprise/). Enterprise binaries can be downloaded and evaluated for free from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases). - ## Prominent features VictoriaMetrics has the following prominent features: @@ -61,7 +60,6 @@ VictoriaMetrics has the following prominent features: See also [various Articles about VictoriaMetrics](https://docs.victoriametrics.com/Articles.html). 
- ## Case studies and talks Case studies: @@ -92,7 +90,6 @@ Case studies: See also [articles and slides about VictoriaMetrics from our users](https://docs.victoriametrics.com/Articles.html#third-party-articles-and-slides-about-victoriametrics) - ## Operation ## How to start VictoriaMetrics @@ -112,7 +109,6 @@ VictoriaMetrics accepts [Prometheus querying API requests](#prometheus-querying- It is recommended setting up [monitoring](#monitoring) for VictoriaMetrics. - ### Environment variables Each flag value can be set via environment variables according to these rules: @@ -122,10 +118,8 @@ Each flag value can be set via environment variables according to these rules: * For repeating flags an alternative syntax can be used by joining the different values into one using `,` char as separator (for example `-storageNode -storageNode ` will translate to `storageNode=,`). * Environment var prefix can be set via `-envflag.prefix` flag. For instance, if `-envflag.prefix=VM_`, then env vars must be prepended with `VM_`. - ### Configuration with snap package - Snap package for VictoriaMetrics is available [here](https://snapcraft.io/victoriametrics). Command-line flags for Snap package can be set with following command: @@ -137,7 +131,6 @@ snap restart victoriametrics Do not change value for `-storageDataPath` flag, because snap package has limited access to host filesystem. - Changing scrape configuration is possible with text editor: ```text @@ -146,7 +139,6 @@ vi $SNAP_DATA/var/snap/victoriametrics/current/etc/victoriametrics-scrape-config After changes were made, trigger config re-read with the command `curl 127.0.0.1:8248/-/reload`. 
- ## Prometheus setup Add the following lines to Prometheus config file (it is usually located at `/etc/prometheus/prometheus.yml`) in order to send data to VictoriaMetrics: @@ -200,7 +192,6 @@ It is recommended upgrading Prometheus to [v2.12.0](https://github.com/prometheu Take a look also at [vmagent](https://docs.victoriametrics.com/vmagent.html) and [vmalert](https://docs.victoriametrics.com/vmalert.html), which can be used as faster and less resource-hungry alternative to Prometheus. - ## Grafana setup Create [Prometheus datasource](http://docs.grafana.org/features/datasources/prometheus/) in Grafana with the following url: @@ -213,7 +204,6 @@ Substitute `` with the hostname or IP address of VictoriaM Then build graphs and dashboards for the created datasource using [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/) or [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html). - ## How to upgrade VictoriaMetrics It is safe upgrading VictoriaMetrics to new versions unless [release notes](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) say otherwise. It is safe skipping multiple versions during the upgrade unless [release notes](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) say otherwise. It is recommended performing regular upgrades to the latest version, since it may contain important bug fixes, performance optimizations or new features. @@ -228,7 +218,6 @@ The following steps must be performed during the upgrade / downgrade procedure: Prometheus doesn't drop data during VictoriaMetrics restart. See [this article](https://grafana.com/blog/2019/03/25/whats-new-in-prometheus-2.8-wal-based-remote-write/) for details. The same applies also to [vmagent](https://docs.victoriametrics.com/vmagent.html). 
- ## How to apply new config to VictoriaMetrics VictoriaMetrics is configured via command-line flags, so it must be restarted when new command-line flags should be applied: @@ -239,7 +228,6 @@ VictoriaMetrics is configured via command-line flags, so it must be restarted wh Prometheus doesn't drop data during VictoriaMetrics restart. See [this article](https://grafana.com/blog/2019/03/25/whats-new-in-prometheus-2.8-wal-based-remote-write/) for details. The same applies alos to [vmagent](https://docs.victoriametrics.com/vmagent.html). - ## How to scrape Prometheus exporters such as [node-exporter](https://github.com/prometheus/node_exporter) VictoriaMetrics can be used as drop-in replacement for Prometheus for scraping targets configured in `prometheus.yml` config file according to [the specification](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#configuration-file). Just set `-promscrape.config` command-line flag to the path to `prometheus.yml` config - and VictoriaMetrics should start scraping the configured targets. Currently the following [scrape_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) types are supported: @@ -258,7 +246,6 @@ VictoriaMetrics can be used as drop-in replacement for Prometheus for scraping t * [digitalocean_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config) * [http_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config) - File a [feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues) if you need support for other `*_sd_config` types. The file pointed by `-promscrape.config` may contain `%{ENV_VAR}` placeholders, which are substituted by the corresponding `ENV_VAR` environment variable values. 
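As an illustration of the `%{ENV_VAR}` placeholders mentioned above, the following sketch expands them in a config text. It is only a model of the idea, not the VictoriaMetrics code: the function name `expand_env_placeholders` is hypothetical, and raising an error for an unset variable is an assumption, since the text does not say what happens in that case.

```python
import os
import re
from typing import Mapping, Optional

# Matches %{NAME}, where NAME looks like a typical environment variable name.
_PLACEHOLDER = re.compile(r"%\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env_placeholders(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace every %{NAME} in text with the value of NAME taken from env."""
    source = os.environ if env is None else env

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in source:
            # Assumption for this sketch: fail loudly instead of guessing a default.
            raise KeyError(f"environment variable {name!r} is not set")
        return source[name]

    return _PLACEHOLDER.sub(substitute, text)
```

Passing an explicit mapping instead of relying on `os.environ` keeps the helper easy to test.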
@@ -267,7 +254,6 @@ VictoriaMetrics also supports [importing data in Prometheus exposition format](# See also [vmagent](https://docs.victoriametrics.com/vmagent.html), which can be used as drop-in replacement for Prometheus. - ## How to send data from DataDog agent VictoriaMetrics accepts data from [DataDog agent](https://docs.datadoghq.com/agent/) or [DogStatsD]() via ["submit metrics" API](https://docs.datadoghq.com/api/latest/metrics/#submit-metrics) at `/datadog/api/v1/series` path. @@ -315,7 +301,6 @@ This command should return the following output if everything is OK: Extra labels may be added to all the written time series by passing `extra_label=name=value` query args. For example, `/datadog/api/v1/series?extra_label=foo=bar` would add `{foo="bar"}` label to all the ingested metrics. - ## How to send data from InfluxDB-compatible agents such as [Telegraf](https://www.influxdata.com/time-series-platform/telegraf/) Use `http://:8428` url instead of InfluxDB url in agents' configs. @@ -503,7 +488,6 @@ The `/api/v1/export` endpoint should return the following response: Extra labels may be added to all the imported time series by passing `extra_label=name=value` query args. For example, `/api/put?extra_label=foo=bar` would add `{foo="bar"}` label to all the ingested metrics. - ## Prometheus querying API usage VictoriaMetrics supports the following handlers from [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/): @@ -519,7 +503,6 @@ VictoriaMetrics supports the following handlers from [Prometheus querying API](h These handlers can be queried from Prometheus-compatible clients such as Grafana or curl. All the Prometheus querying API handlers can be prepended with `/prometheus` prefix. For example, both `/prometheus/api/v1/query` and `/api/v1/query` should work. 
- ### Prometheus querying API enhancements VictoriaMetrics accepts optional `extra_label==` query arg, which can be used for enforcing additional label filters for queries. For example, @@ -552,7 +535,6 @@ Additionally VictoriaMetrics provides the following handlers: For example, request to `/api/v1/status/top_queries?topN=5&maxLifetime=30s` would return up to 5 queries per list, which were executed during the last 30 seconds. VictoriaMetrics tracks the last `-search.queryStats.lastQueriesCount` queries with durations at least `-search.queryStats.minQueryDuration`. - ## Graphite API usage VictoriaMetrics supports the following Graphite APIs, which are needed for [Graphite datasource in Grafana](https://grafana.com/docs/grafana/latest/datasources/graphite/): @@ -569,7 +551,6 @@ VictoriaMetrics accepts optional query args: `extra_label==:8428/api/v1/export?match[]=`, @@ -815,7 +788,6 @@ Exported data can be imported via POST'ing it to [/api/v1/import](#how-to-import The [deduplication](#deduplication) is applied to the data exported via `/api/v1/export` by default. The deduplication isn't applied if `reduce_mem_usage=1` query arg is passed to the request. - ### How to export CSV data Send a request to `http://:8428/api/v1/export/csv?format=&match=`, @@ -841,7 +813,6 @@ The exported CSV data can be imported to VictoriaMetrics via [/api/v1/import/csv The [deduplication](#deduplication) is applied for the data exported in CSV by default. It is possible to export raw data without de-duplication by passing `reduce_mem_usage=1` query arg to `/api/v1/export/csv`. - ### How to export data in native format Send a request to `http://:8428/api/v1/export/native?match[]=`, @@ -866,7 +837,6 @@ can fail to be imported into VictoriaMetrics release Y. The [deduplication](#deduplication) isn't applied for the data exported in native format. It is expected that the de-duplication is performed during data import. 
- ## How to import time series data Time series data can be imported into VictoriaMetrics via any supported ingestion protocol: @@ -884,7 +854,6 @@ Time series data can be imported into VictoriaMetrics via any supported ingestio * `/api/v1/import/csv` for importing arbitrary CSV data. See [these docs](#how-to-import-csv-data) for details. * `/api/v1/import/prometheus` for importing data in Prometheus exposition format. See [these docs](#how-to-import-data-in-prometheus-exposition-format) for details. - ### How to import data in JSON line format Example for importing data obtained via [/api/v1/export](#how-to-export-data-in-json-line-format): @@ -914,7 +883,6 @@ Note that it could be required to flush response cache after importing historica VictoriaMetrics parses input JSON lines one-by-one. It loads the whole JSON line in memory, then parses it and then saves the parsed samples into persistent storage. This means that VictoriaMetrics can occupy big amounts of RAM when importing too long JSON lines. The solution is to split too long JSON lines into smaller lines. It is OK if samples for a single time series are split among multiple JSON lines. - ### How to import data in native format The specification of VictoriaMetrics' native format may yet change and is not formally documented yet. So currently we do not recommend that external clients attempt to pack their own metrics in native format file. @@ -934,7 +902,6 @@ For example, `/api/v1/import/native?extra_label=foo=bar` would add `"foo":"bar"` Note that it could be required to flush response cache after importing historical data. See [these docs](#backfilling) for detail. - ### How to import CSV data Arbitrary CSV data can be imported via `/api/v1/import/csv`. The CSV data is imported according to the provided `format` query arg. 
@@ -975,6 +942,7 @@ curl -G 'http://localhost:8428/api/v1/export' -d 'match[]={ticker!=""}' ``` The following response should be returned: + ```bash {"metric":{"__name__":"bid","market":"NASDAQ","ticker":"MSFT"},"values":[1.67],"timestamps":[1583865146520]} {"metric":{"__name__":"bid","market":"NYSE","ticker":"GOOG"},"values":[4.56],"timestamps":[1583865146495]} @@ -987,7 +955,6 @@ For example, `/api/v1/import/csv?extra_label=foo=bar` would add `"foo":"bar"` la Note that it could be required to flush response cache after importing historical data. See [these docs](#backfilling) for detail. - ### How to import data in Prometheus exposition format VictoriaMetrics accepts data in [Prometheus exposition format](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md#text-based-format) @@ -1029,8 +996,6 @@ Note that it could be required to flush response cache after importing historica VictoriaMetrics also may scrape Prometheus targets - see [these docs](#how-to-scrape-prometheus-exporters-such-as-node-exporter). - - ## Relabeling VictoriaMetrics supports Prometheus-compatible relabeling for all the ingested metrics if `-relabelConfig` command-line flag points @@ -1039,6 +1004,7 @@ The `-relabelConfig` also can point to http or https url. For example, `-relabel See [this article with relabeling tips and tricks](https://valyala.medium.com/how-to-use-relabeling-in-prometheus-and-victoriametrics-8b90fc22c4b2). Example contents for `-relabelConfig` file: + ```yml # Add {cluster="dev"} label. - target_label: cluster @@ -1052,7 +1018,6 @@ Example contents for `-relabelConfig` file: See [these docs](https://docs.victoriametrics.com/vmagent.html#relabeling) for more details about relabeling in VictoriaMetrics. - ## Federation VictoriaMetrics exports [Prometheus-compatible federation data](https://prometheus.io/docs/prometheus/latest/federation/) @@ -1064,7 +1029,6 @@ on the interval `[now - max_lookback ... 
now]` is scraped for each time series. For instance, `/federate?match[]=up&max_lookback=1h` would return last points on the `[now - 1h ... now]` interval. This may be useful for time series federation with scrape intervals exceeding `5m`. - ## Capacity planning VictoriaMetrics uses lower amounts of CPU, RAM and storage space on production workloads compared to competing solutions (Prometheus, Thanos, Cortex, TimescaleDB, InfluxDB, QuestDB, M3DB) according to [our case studies](https://docs.victoriametrics.com/CaseStudies.html). @@ -1087,7 +1051,6 @@ It is recommended leaving the following amounts of spare resources: * 50% of spare CPU for reducing the probability of slowdowns during temporary spikes in workload. * At least 30% of free storage space at the directory pointed by `-storageDataPath` command-line flag. See also `-storage.minFreeDiskSpaceBytes` command-line flag description [here](#list-of-command-line-flags). - ## High availability * Install multiple VictoriaMetrics instances in distinct datacenters (availability zones). @@ -1128,7 +1091,6 @@ to write data to `victoriametrics-addr-1`, while each `r2` should write data to Another option is to write data simultaneously from Prometheus HA pair to a pair of VictoriaMetrics instances with the enabled de-duplication. See [this section](#deduplication) for details. - ## Deduplication VictoriaMetrics de-duplicates data points if `-dedup.minScrapeInterval` command-line flag is set to positive duration. For example, `-dedup.minScrapeInterval=60s` would de-duplicate data points on the same time series if they fall within the same discrete 60s bucket. The earliest data point will be kept. In the case of equal timestamps, an arbitrary data point will be kept. See [this comment](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2112#issuecomment-1032587618) for more details on how downsampling works. 
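The de-duplication rule can be sketched as follows, taking the wording above at face value: samples of one time series are grouped into discrete buckets of `-dedup.minScrapeInterval` length and the earliest sample of each bucket is kept. The function name `deduplicate` and the millisecond timestamps are assumptions for this example, and the code is not taken from the VictoriaMetrics sources.

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, float]  # (timestamp in milliseconds, value)


def deduplicate(samples: List[Sample], interval_ms: int) -> List[Sample]:
    """Keep the earliest sample per discrete interval_ms bucket for one time series."""
    kept: Dict[int, Sample] = {}
    for ts, value in samples:
        bucket = ts // interval_ms  # index of the discrete bucket the sample falls into
        current = kept.get(bucket)
        # On equal timestamps the first sample seen wins, which is one arbitrary choice.
        if current is None or ts < current[0]:
            kept[bucket] = (ts, value)
    return [kept[bucket] for bucket in sorted(kept)]
```

With `interval_ms=60000`, samples at 0 ms, 15000 ms and 59999 ms share a bucket, while a sample at 60000 ms starts the next one.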
@@ -1141,34 +1103,34 @@ The de-duplication reduces disk space usage if multiple identically configured [ write data to the same VictoriaMetrics instance. These vmagent or Prometheus instances must have identical `external_labels` section in their configs, so they write data to the same time series. - ## Storage -VictoriaMetrics stores time series data in [MergeTree](https://en.wikipedia.org/wiki/Log-structured_merge-tree)-like +VictoriaMetrics stores time series data in [MergeTree](https://en.wikipedia.org/wiki/Log-structured_merge-tree)-like data structures. On insert, VictoriaMetrics accumulates up to 1s of data and dumps it on disk to -`<-storageDataPath>/data/small/YYYY_MM/` subdirectory forming a `part` with the following +`<-storageDataPath>/data/small/YYYY_MM/` subdirectory forming a `part` with the following name pattern: `rowsCount_blocksCount_minTimestamp_maxTimestamp`. Each part consists of two "columns": values and timestamps. These are sorted and compressed raw time series values. Additionally, part contains index files for searching for specific series in the values and timestamps files. -`Parts` are periodically merged into the bigger parts. The resulting `part` is constructed -under `<-storageDataPath>/data/{small,big}/YYYY_MM/tmp` subdirectory. When the resulting `part` is complete, it is atomically moved from the `tmp` -to its own subdirectory, while the source parts are atomically removed. The end result is that the source +`Parts` are periodically merged into the bigger parts. The resulting `part` is constructed +under `<-storageDataPath>/data/{small,big}/YYYY_MM/tmp` subdirectory. When the resulting `part` is complete, it is atomically moved from the `tmp` +to its own subdirectory, while the source parts are atomically removed. The end result is that the source parts are substituted by a single resulting bigger `part` in the `<-storageDataPath>/data/{small,big}/YYYY_MM/` directory. 
-Information about merging process is available in [single-node VictoriaMetrics](https://grafana.com/dashboards/10229) -and [clustered VictoriaMetrics](https://grafana.com/grafana/dashboards/11176) Grafana dashboards. +Information about merging process is available in [single-node VictoriaMetrics](https://grafana.com/dashboards/10229) +and [clustered VictoriaMetrics](https://grafana.com/grafana/dashboards/11176) Grafana dashboards. See more details in [monitoring docs](#monitoring). -The `merge` process is usually named "compaction", because the resulting `part` size is usually smaller than +The `merge` process is usually named "compaction", because the resulting `part` size is usually smaller than the sum of the source `parts`. There are following benefits of doing the merge process: + * it improves query performance, since lower number of `parts` are inspected with each query; -* it reduces the number of data files, since each `part`contains fixed number of files; +* it reduces the number of data files, since each `part`contains fixed number of files; * better compression rate for the resulting part. -Newly added `parts` either appear in the storage or fail to appear. -Storage never contains partially created parts. The same applies to merge process — `parts` are either fully -merged into a new `part` or fail to merge. There are no partially merged `parts` in MergeTree. -`Part` contents in MergeTree never change. Parts are immutable. They may be only deleted after the merge +Newly added `parts` either appear in the storage or fail to appear. +Storage never contains partially created parts. The same applies to merge process — `parts` are either fully +merged into a new `part` or fail to merge. There are no partially merged `parts` in MergeTree. +`Part` contents in MergeTree never change. Parts are immutable. They may be only deleted after the merge to a bigger `part` or when the `part` contents goes outside the configured `-retentionPeriod`. 
See [this article](https://valyala.medium.com/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282) for more details. @@ -1182,7 +1144,7 @@ Retention is configured with the `-retentionPeriod` command-line flag, which tak Data is split in per-month partitions inside `<-storageDataPath>/data/{small,big}` folders. Data partitions outside the configured retention are deleted on the first day of the new month. Each partition consists of one or more data parts with the following name pattern `rowsCount_blocksCount_minTimestamp_maxTimestamp`. -Data parts outside of the configured retention are eventually deleted during +Data parts outside of the configured retention are eventually deleted during [background merge](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282). The maximum disk space usage for a given `-retentionPeriod` is going to be (`-retentionPeriod` + 1) months. @@ -1209,7 +1171,6 @@ so it could route requests from particular user to VictoriaMetrics with the desi The same scheme could be implemented for multiple tenants in [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html). See [these docs](https://docs.victoriametrics.com/guides/guide-vmcluster-multiple-retention-setup.html) for multi-retention setup details. - ## Downsampling [VictoriaMetrics Enterprise](https://victoriametrics.com/products/enterprise/) supports multi-level downsampling with `-downsampling.period` command-line flag. For example: @@ -1222,12 +1183,10 @@ Downsampling is applied independently per each time series. It can reduce disk s The downsampling can be evaluated for free by downloading and using enterprise binaries from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases). - ## Multi-tenancy Single-node VictoriaMetrics doesn't support multi-tenancy. 
Use [cluster version](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy) instead. - ## Scalability and cluster version Though single-node VictoriaMetrics cannot scale to multiple nodes, it is optimized for resource usage - storage size / bandwidth / IOPS, RAM, CPU. @@ -1238,7 +1197,6 @@ So try single-node VictoriaMetrics at first and then [switch to cluster version] horizontally scalable long-term remote storage for really large Prometheus deployments. [Contact us](mailto:info@victoriametrics.com) for enterprise support. - ## Alerting It is recommended using [vmalert](https://docs.victoriametrics.com/vmalert.html) for alerting. @@ -1249,7 +1207,6 @@ Additionally, alerting can be set up with the following tools: * With Promxy - see [the corresponding docs](https://github.com/jacksontj/promxy/blob/master/README.md#how-do-i-use-alertingrecording-rules-in-promxy). * With Grafana - see [the corresponding docs](https://grafana.com/docs/alerting/rules/). - ## Security Do not forget protecting sensitive endpoints in VictoriaMetrics when exposing it to untrusted networks such as the internet. @@ -1263,6 +1220,7 @@ Consider setting the following command-line flags: * `-forceMergeAuthKey` for protecting `/internal/force_merge` endpoint. See [force merge docs](#forced-merge). * `-search.resetCacheAuthKey` for protecting `/internal/resetRollupResultCache` endpoint. See [backfilling](#backfilling) for more details. * `-configAuthKey` for protecting `/config` endpoint, since it may contain sensitive information such as passwords. + - `-pprofAuthKey` for protecting `/debug/pprof/*` endpoints, which can be used for [profiling](#profiling). Explicitly set internal network interface for TCP and UDP ports for data ingestion with Graphite and OpenTSDB formats. @@ -1271,7 +1229,6 @@ For example, substitute `-graphiteListenAddr=:2003` with `-graphiteListenAddr=/cache` directory during graceful shutdown (e.g. 
when VictoriaMetrics is stopped by sending `SIGINT` signal). The caches are read on the next VictoriaMetrics startup. Sometimes it is needed to remove such caches on the next startup. This can be performed by placing `reset_cache_on_startup` file inside the `<-storageDataPath>/cache` directory before the restart of VictoriaMetrics. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1447) for details. - ## Cache tuning VictoriaMetrics uses various in-memory caches for faster data ingestion and query performance. The following metrics for each type of cache are exported at [`/metrics` page](#monitoring): -- `vm_cache_size_bytes` - the actual cache size -- `vm_cache_size_max_bytes` - cache size limit -- `vm_cache_requests_total` - the number of requests to the cache -- `vm_cache_misses_total` - the number of cache misses -- `vm_cache_entries` - the number of entries in the cache +* `vm_cache_size_bytes` - the actual cache size +* `vm_cache_size_max_bytes` - cache size limit +* `vm_cache_requests_total` - the number of requests to the cache +* `vm_cache_misses_total` - the number of cache misses +* `vm_cache_entries` - the number of entries in the cache Both Grafana dashboards for [single-node VictoriaMetrics](https://grafana.com/dashboards/10229) and [clustered VictoriaMetrics](https://grafana.com/grafana/dashboards/11176) @@ -1452,28 +1405,28 @@ practical scenarios and workloads. Change the defaults only if you understand th To override the default values see command-line flags with `-storage.cacheSize` prefix. See the full description of flags [here](#list-of-command-line-flags). - ## Data migration ### From VictoriaMetrics -The simplest way to migrate data from one single-node (source) to another (destination), or from one vmstorage node +The simplest way to migrate data from one single-node (source) to another (destination), or from one vmstorage node to another do the following: + 1. Stop the VictoriaMetrics (source) with `kill -INT`; -2. 
Copy (via [rsync](https://en.wikipedia.org/wiki/Rsync) or any other tool) the entire folder specified +2. Copy (via [rsync](https://en.wikipedia.org/wiki/Rsync) or any other tool) the entire folder specified via `-storageDataPath` from the source node to the empty folder at the destination node. -3. Once copy is done, stop the VictoriaMetrics (destination) with `kill -INT` and verify that +3. Once copy is done, stop the VictoriaMetrics (destination) with `kill -INT` and verify that its `-storageDataPath` points to the copied folder from p.2; 4. Start the VictoriaMetrics (destination). The copied data should be now available. Things to consider when copying data: + 1. Data formats between single-node and vmstorage node aren't compatible and can't be copied. 2. Copying data folder means complete replacement of the previous data on destination VictoriaMetrics. For more complex scenarios like single-to-cluster, cluster-to-single, re-sharding or migrating only a fraction of data - see [vmctl. Migrating data from VictoriaMetrics](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-victoriametrics). - ### From other systems Use [vmctl](https://docs.victoriametrics.com/vmctl.html) for data migration. It supports the following data migration types: @@ -1485,7 +1438,6 @@ Use [vmctl](https://docs.victoriametrics.com/vmctl.html) for data migration. It See [vmctl docs](https://docs.victoriametrics.com/vmctl.html) for more details. - ## Backfilling VictoriaMetrics accepts historical data in arbitrary order of time via [any supported ingestion method](#how-to-import-time-series-data). @@ -1503,7 +1455,6 @@ Yet another solution is to increase `-search.cacheTimestampOffset` flag value in for data with timestamps close to the current time. Single-node VictoriaMetrics automatically resets response cache when samples with timestamps older than `now - search.cacheTimestampOffset` are ingested to it. 
- ## Data updates VictoriaMetrics doesn't support updating already existing sample values to new ones. It stores all the ingested data points @@ -1511,7 +1462,6 @@ for the same time series with identical timestamps. While it is possible substit [removal of old time series](#how-to-delete-time-series) and then [writing new time series](#backfilling), this approach should be used only for one-off updates. It shouldn't be used for frequent updates because of non-zero overhead related to data removal. - ## Replication Single-node VictoriaMetrics doesn't support application-level replication. Use cluster version instead. @@ -1521,7 +1471,6 @@ Storage-level replication may be offloaded to durable persistent storage such as See also [high availability docs](#high-availability) and [backup docs](#backups). - ## Backups VictoriaMetrics supports backups via [vmbackup](https://docs.victoriametrics.com/vmbackup.html) @@ -1529,19 +1478,17 @@ and [vmrestore](https://docs.victoriametrics.com/vmrestore.html) tools. We also provide [vmbackupmanager](https://docs.victoriametrics.com/vmbackupmanager.html) tool for enterprise subscribers. Enterprise binaries can be downloaded and evaluated for free from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases). - ## Benchmarks -Note, that vendors (including VictoriaMetrics) are often biased when doing such tests. E.g. they try highlighting -the best parts of their product, while highlighting the worst parts of competing products. -So we encourage users and all independent third parties to conduct their becnhmarks for various products +Note, that vendors (including VictoriaMetrics) are often biased when doing such tests. E.g. they try highlighting +the best parts of their product, while highlighting the worst parts of competing products. +So we encourage users and all independent third parties to conduct their becnhmarks for various products they are evaluating in production and publish the results. 
As a reference, please see [benchmarks](https://docs.victoriametrics.com/Articles.html#benchmarks) conducted by -VictoriaMetrics team. Please also see the [helm chart](https://github.com/VictoriaMetrics/benchmark) +VictoriaMetrics team. Please also see the [helm chart](https://github.com/VictoriaMetrics/benchmark) for running ingestion benchmarks based on node_exporter metrics. - ## Profiling VictoriaMetrics provides handlers for collecting the following [Go profiles](https://blog.golang.org/profiling-go-programs): @@ -1570,7 +1517,6 @@ The command for collecting CPU profile waits for 30 seconds before returning. The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof). - ## Integrations * [Helm charts for single-node and cluster versions of VictoriaMetrics](https://github.com/VictoriaMetrics/helm-charts). @@ -1584,7 +1530,6 @@ The collected profiles may be analyzed with [go tool pprof](https://github.com/g * [Snap package for VictoriaMetrics](https://snapcraft.io/victoriametrics). * [vmalert-cli](https://github.com/aorfanos/vmalert-cli) - a CLI application for managing [vmalert](https://docs.victoriametrics.com/vmalert.html). - ## Third-party contributions * [Unofficial yum repository](https://copr.fedorainfracloud.org/coprs/antonpatsev/VictoriaMetrics/) ([source code](https://github.com/patsevanton/victoriametrics-rpm)) @@ -1592,12 +1537,10 @@ The collected profiles may be analyzed with [go tool pprof](https://github.com/g * [Prometheus -> VictoriaMetrics exporter #2](https://github.com/AnchorFree/tsdb-remote-write) * [Prometheus Oauth proxy](https://gitlab.com/optima_public/prometheus_oauth_proxy) - see [this article](https://medium.com/@richard.holly/powerful-saas-solution-for-detection-metrics-c67b9208d362) for details. - ## Contacts Contact us with any questions regarding VictoriaMetrics at [info@victoriametrics.com](mailto:info@victoriametrics.com). 
- ## Community and contributions Feel free asking any questions regarding VictoriaMetrics: @@ -1631,7 +1574,6 @@ Adhering `KISS` principle simplifies the resulting code and architecture, so it Report bugs and propose new features [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues). - ## VictoriaMetrics Logo [Zip](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/VM_logo.zip) contains three folders with different image orientations (main color and inverted version). @@ -1661,315 +1603,314 @@ Files included in each folder: * Do not change spacing, alignment, or relative locations of the design elements. * Do not change the proportions of any of the design elements or the design itself. You may resize as needed but must retain all proportions. - ## List of command-line flags Pass `-help` to VictoriaMetrics in order to see the list of supported command-line flags with their description: ``` -bigMergeConcurrency int - The maximum number of CPU cores to use for big merges. Default value is used if set to 0 + The maximum number of CPU cores to use for big merges. Default value is used if set to 0 -configAuthKey string - Authorization key for accessing /config page. It must be passed via authKey query arg + Authorization key for accessing /config page. It must be passed via authKey query arg -csvTrimTimestamp duration - Trim timestamps when importing csv data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) + Trim timestamps when importing csv data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 
1s) may be used for reducing disk space usage for timestamp data (default 1ms) -datadog.maxInsertRequestSize size - The maximum size in bytes of a single DataDog POST request to /api/v1/series - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 67108864) + The maximum size in bytes of a single DataDog POST request to /api/v1/series + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 67108864) -dedup.minScrapeInterval duration - Leave only the first sample in every time series per each discrete interval equal to -dedup.minScrapeInterval > 0. See https://docs.victoriametrics.com/#deduplication and https://docs.victoriametrics.com/#downsampling + Leave only the first sample in every time series per each discrete interval equal to -dedup.minScrapeInterval > 0. See https://docs.victoriametrics.com/#deduplication and https://docs.victoriametrics.com/#downsampling -deleteAuthKey string - authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series and /tags/delSeries + authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series and /tags/delSeries -denyQueriesOutsideRetention - Whether to deny queries outside of the configured -retentionPeriod. When set, then /api/v1/query_range would return '503 Service Unavailable' error for queries with 'from' value outside -retentionPeriod. This may be useful when multiple data sources with distinct retentions are hidden behind query-tee + Whether to deny queries outside of the configured -retentionPeriod. When set, then /api/v1/query_range would return '503 Service Unavailable' error for queries with 'from' value outside -retentionPeriod. This may be useful when multiple data sources with distinct retentions are hidden behind query-tee -downsampling.period array - Comma-separated downsampling periods in the format 'offset:period'. For example, '30d:10m' instructs to leave a single sample per 10 minutes for samples older than 30 days. 
See https://docs.victoriametrics.com/#downsampling for details - Supports an array of values separated by comma or specified via multiple flags. + Comma-separated downsampling periods in the format 'offset:period'. For example, '30d:10m' instructs to leave a single sample per 10 minutes for samples older than 30 days. See https://docs.victoriametrics.com/#downsampling for details + Supports an array of values separated by comma or specified via multiple flags. -dryRun - Whether to check only -promscrape.config and then exit. Unknown config entries aren't allowed in -promscrape.config by default. This can be changed with -promscrape.config.strictParse=false command-line flag + Whether to check only -promscrape.config and then exit. Unknown config entries aren't allowed in -promscrape.config by default. This can be changed with -promscrape.config.strictParse=false command-line flag -enableTCP6 - Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used + Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used -envflag.enable - Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details + Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. 
See https://docs.victoriametrics.com/#environment-variables for more details -envflag.prefix string - Prefix for environment variables if -envflag.enable is set + Prefix for environment variables if -envflag.enable is set -eula - By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf + By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf -finalMergeDelay duration - The delay before starting final merge for per-month partition after no new data is ingested into it. Final merge may require additional disk IO and CPU resources. Final merge may increase query speed and reduce disk space usage in some cases. Zero value disables final merge + The delay before starting final merge for per-month partition after no new data is ingested into it. Final merge may require additional disk IO and CPU resources. Final merge may increase query speed and reduce disk space usage in some cases. Zero value disables final merge -forceFlushAuthKey string - authKey, which must be passed in query string to /internal/force_flush pages + authKey, which must be passed in query string to /internal/force_flush pages -forceMergeAuthKey string - authKey, which must be passed in query string to /internal/force_merge pages + authKey, which must be passed in query string to /internal/force_merge pages -fs.disableMmap - Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread() + Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. 
mmap() is usually faster for reading small data chunks than pread() -graphiteListenAddr string - TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty + TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty -graphiteTrimTimestamp duration - Trim timestamps for Graphite data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) + Trim timestamps for Graphite data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) -http.connTimeout duration - Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) + Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) -http.disableResponseCompression - Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth + Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth -http.idleConnTimeout duration - Timeout for incoming idle http connections (default 1m0s) + Timeout for incoming idle http connections (default 1m0s) -http.maxGracefulShutdownDuration duration - The maximum duration for a graceful shutdown of the HTTP server. 
A highly loaded server may require increased value for a graceful shutdown (default 7s) + The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s) -http.pathPrefix string - An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus + An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus -http.shutdownDelay duration - Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers + Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers -httpAuth.password string - Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty + Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty -httpAuth.username string - Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password + Username for HTTP Basic Auth. The authentication is disabled if empty. 
See also -httpAuth.password -httpListenAddr string - TCP address to listen for http connections (default ":8428") + TCP address to listen for http connections (default ":8428") -import.maxLineLen size - The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 104857600) + The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 104857600) -influx.databaseNames array - Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb - Supports an array of values separated by comma or specified via multiple flags. + Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb + Supports an array of values separated by comma or specified via multiple flags. 
-influx.maxLineSize size - The maximum size in bytes for a single InfluxDB line during parsing - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 262144) + The maximum size in bytes for a single InfluxDB line during parsing + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 262144) -influxDBLabel string - Default label for the DB name sent over '?db={db_name}' query parameter (default "db") + Default label for the DB name sent over '?db={db_name}' query parameter (default "db") -influxListenAddr string - TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://:8428/write + TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://:8428/write -influxMeasurementFieldSeparator string - Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_") + Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_") -influxSkipMeasurement - Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator' + Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator' -influxSkipSingleField - Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if InfluxDB line contains only a single field + Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if InfluxDB line contains only a single field -influxTrimTimestamp duration - Trim timestamps for InfluxDB line protocol data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 
1s) may be used for reducing disk space usage for timestamp data (default 1ms) + Trim timestamps for InfluxDB line protocol data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) -insert.maxQueueDuration duration - The maximum duration for waiting in the queue for insert requests due to -maxConcurrentInserts (default 1m0s) + The maximum duration for waiting in the queue for insert requests due to -maxConcurrentInserts (default 1m0s) -logNewSeries - Whether to log new series. This option is for debug purposes only. It can lead to performance issues when big number of new series are ingested into VictoriaMetrics + Whether to log new series. This option is for debug purposes only. It can lead to performance issues when big number of new series are ingested into VictoriaMetrics -loggerDisableTimestamps - Whether to disable writing timestamps in logs + Whether to disable writing timestamps in logs -loggerErrorsPerSecondLimit int - Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit + Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit -loggerFormat string - Format for logs. Possible values: default, json (default "default") + Format for logs. Possible values: default, json (default "default") -loggerLevel string - Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") + Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") -loggerOutput string - Output for the logs. Supported values: stderr, stdout (default "stderr") + Output for the logs. 
Supported values: stderr, stdout (default "stderr") -loggerTimezone string - Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") + Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") -loggerWarnsPerSecondLimit int - Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit + Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit -maxConcurrentInserts int - The maximum number of concurrent inserts. Default value should work for most cases, since it minimizes the overhead for concurrent inserts. This option is tigthly coupled with -insert.maxQueueDuration (default 16) + The maximum number of concurrent inserts. Default value should work for most cases, since it minimizes the overhead for concurrent inserts. This option is tigthly coupled with -insert.maxQueueDuration (default 16) -maxInsertRequestSize size - The maximum size in bytes of a single Prometheus remote_write API request - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) + The maximum size in bytes of a single Prometheus remote_write API request + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) -maxLabelValueLen int - The maximum length of label values in the accepted time series. Longer label values are truncated. In this case the vm_too_long_label_values_total metric at /metrics page is incremented (default 16384) + The maximum length of label values in the accepted time series. Longer label values are truncated. 
In this case the vm_too_long_label_values_total metric at /metrics page is incremented (default 16384) -maxLabelsPerTimeseries int - The maximum number of labels accepted per time series. Superfluous labels are dropped. In this case the vm_metrics_with_dropped_labels_total metric at /metrics page is incremented (default 30) + The maximum number of labels accepted per time series. Superfluous labels are dropped. In this case the vm_metrics_with_dropped_labels_total metric at /metrics page is incremented (default 30) -memory.allowedBytes size - Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -memory.allowedPercent float - Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) + Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. 
Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) -metricsAuthKey string - Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings + Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings -opentsdbHTTPListenAddr string - TCP address to listen for OpentTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty + TCP address to listen for OpentTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty -opentsdbListenAddr string - TCP and UDP address to listen for OpentTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty + TCP and UDP address to listen for OpentTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty -opentsdbTrimTimestamp duration - Trim timestamps for OpenTSDB 'telnet put' data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) + Trim timestamps for OpenTSDB 'telnet put' data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) -opentsdbhttp.maxInsertRequestSize size - The maximum size of OpenTSDB HTTP put request - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) + The maximum size of OpenTSDB HTTP put request + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) -opentsdbhttpTrimTimestamp duration - Trim timestamps for OpenTSDB HTTP data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 
1s) may be used for reducing disk space usage for timestamp data (default 1ms) + Trim timestamps for OpenTSDB HTTP data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) -pprofAuthKey string - Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings + Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings -precisionBits int - The number of precision bits to store per each value. Lower precision bits improves data compression at the cost of precision loss (default 64) + The number of precision bits to store per each value. Lower precision bits improves data compression at the cost of precision loss (default 64) -promscrape.cluster.memberNum int - The number of number in the cluster of scrapers. It must be an unique value in the range 0 ... promscrape.cluster.membersCount-1 across scrapers in the cluster + The number of number in the cluster of scrapers. It must be an unique value in the range 0 ... promscrape.cluster.membersCount-1 across scrapers in the cluster -promscrape.cluster.membersCount int - The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets + The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets -promscrape.cluster.replicationFactor int - The number of members in the cluster, which scrape the same targets. 
If the replication factor is greater than 2, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1) + The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 2, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1) -promscrape.config string - Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details + Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details -promscrape.config.dryRun - Checks -promscrape.config file for errors and unsupported fields and then exits. Returns non-zero exit code on parsing errors and emits these errors to stderr. See also -promscrape.config.strictParse command-line flag. Pass -loggerLevel=ERROR if you don't need to see info messages in the output. + Checks -promscrape.config file for errors and unsupported fields and then exits. Returns non-zero exit code on parsing errors and emits these errors to stderr. See also -promscrape.config.strictParse command-line flag. Pass -loggerLevel=ERROR if you don't need to see info messages in the output. -promscrape.config.strictParse - Whether to deny unsupported fields in -promscrape.config . Set to false in order to silently skip unsupported fields (default true) + Whether to deny unsupported fields in -promscrape.config . Set to false in order to silently skip unsupported fields (default true) -promscrape.configCheckInterval duration - Interval for checking for changes in '-promscrape.config' file. 
By default the checking is disabled. Send SIGHUP signal in order to force config check for changes + Interval for checking for changes in '-promscrape.config' file. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes -promscrape.consul.waitTime duration - Wait time used by Consul service discovery. Default value is used if not set + Wait time used by Consul service discovery. Default value is used if not set -promscrape.consulSDCheckInterval duration - Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s) + Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s) -promscrape.digitaloceanSDCheckInterval duration - Interval for checking for changes in digital ocean. This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config for details (default 1m0s) + Interval for checking for changes in digital ocean. This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config for details (default 1m0s) -promscrape.disableCompression - Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. 
It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control + Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control -promscrape.disableKeepAlive - Whether to disable HTTP keep-alive connections when scraping all the targets. This may be useful when targets has no support for HTTP keep-alive connection. It is possible to set 'disable_keepalive: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control. Note that disabling HTTP keep-alive may increase load on both vmagent and scrape targets + Whether to disable HTTP keep-alive connections when scraping all the targets. This may be useful when targets has no support for HTTP keep-alive connection. It is possible to set 'disable_keepalive: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control. Note that disabling HTTP keep-alive may increase load on both vmagent and scrape targets -promscrape.discovery.concurrency int - The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) (default 100) + The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) 
(default 100) -promscrape.discovery.concurrentWaitTime duration - The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s) + The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s) -promscrape.dnsSDCheckInterval duration - Interval for checking for changes in dns. This works only if dns_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config for details (default 30s) + Interval for checking for changes in dns. This works only if dns_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config for details (default 30s) -promscrape.dockerSDCheckInterval duration - Interval for checking for changes in docker. This works only if docker_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#docker_sd_config for details (default 30s) + Interval for checking for changes in docker. This works only if docker_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#docker_sd_config for details (default 30s) -promscrape.dockerswarmSDCheckInterval duration - Interval for checking for changes in dockerswarm. This works only if dockerswarm_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config for details (default 30s) + Interval for checking for changes in dockerswarm. This works only if dockerswarm_sd_configs is configured in '-promscrape.config' file. 
See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config for details (default 30s) -promscrape.dropOriginalLabels - Whether to drop original labels for scrape targets at /targets and /api/v1/targets pages. This may be needed for reducing memory usage when original labels for big number of scrape targets occupy big amounts of memory. Note that this reduces debuggability for improper per-target relabeling configs + Whether to drop original labels for scrape targets at /targets and /api/v1/targets pages. This may be needed for reducing memory usage when original labels for big number of scrape targets occupy big amounts of memory. Note that this reduces debuggability for improper per-target relabeling configs -promscrape.ec2SDCheckInterval duration - Interval for checking for changes in ec2. This works only if ec2_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#ec2_sd_config for details (default 1m0s) + Interval for checking for changes in ec2. This works only if ec2_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#ec2_sd_config for details (default 1m0s) -promscrape.eurekaSDCheckInterval duration - Interval for checking for changes in eureka. This works only if eureka_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config for details (default 30s) + Interval for checking for changes in eureka. This works only if eureka_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config for details (default 30s) -promscrape.fileSDCheckInterval duration - Interval for checking for changes in 'file_sd_config'. 
See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config for details (default 5m0s) + Interval for checking for changes in 'file_sd_config'. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config for details (default 5m0s) -promscrape.gceSDCheckInterval duration - Interval for checking for changes in gce. This works only if gce_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#gce_sd_config for details (default 1m0s) + Interval for checking for changes in gce. This works only if gce_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#gce_sd_config for details (default 1m0s) -promscrape.httpSDCheckInterval duration - Interval for checking for changes in http endpoint service discovery. This works only if http_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config for details (default 1m0s) + Interval for checking for changes in http endpoint service discovery. This works only if http_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config for details (default 1m0s) -promscrape.kubernetes.apiServerTimeout duration - How frequently to reload the full state from Kuberntes API server (default 30m0s) + How frequently to reload the full state from Kuberntes API server (default 30m0s) -promscrape.kubernetesSDCheckInterval duration - Interval for checking for changes in Kubernetes API server. This works only if kubernetes_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config for details (default 30s) + Interval for checking for changes in Kubernetes API server. 
This works only if kubernetes_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config for details (default 30s) -promscrape.maxDroppedTargets int - The maximum number of droppedTargets to show at /api/v1/targets page. Increase this value if your setup drops more scrape targets during relabeling and you need investigating labels for all the dropped targets. Note that the increased number of tracked dropped targets may result in increased memory usage (default 1000) + The maximum number of droppedTargets to show at /api/v1/targets page. Increase this value if your setup drops more scrape targets during relabeling and you need investigating labels for all the dropped targets. Note that the increased number of tracked dropped targets may result in increased memory usage (default 1000) -promscrape.maxResponseHeadersSize size - The maximum size of http response headers from Prometheus scrape targets - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 4096) + The maximum size of http response headers from Prometheus scrape targets + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 4096) -promscrape.maxScrapeSize size - The maximum size of scrape response in bytes to process from Prometheus targets. Bigger responses are rejected - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16777216) + The maximum size of scrape response in bytes to process from Prometheus targets. Bigger responses are rejected + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16777216) -promscrape.minResponseSizeForStreamParse size - The minimum target response size for automatic switching to stream parsing mode, which can reduce memory usage. 
See https://docs.victoriametrics.com/vmagent.html#stream-parsing-mode - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 1000000) + The minimum target response size for automatic switching to stream parsing mode, which can reduce memory usage. See https://docs.victoriametrics.com/vmagent.html#stream-parsing-mode + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 1000000) -promscrape.noStaleMarkers - Whether to disable sending Prometheus stale markers for metrics when scrape target disappears. This option may reduce memory usage if stale markers aren't needed for your setup. This option also disables populating the scrape_series_added metric. See https://prometheus.io/docs/concepts/jobs_instances/#automatically-generated-labels-and-time-series + Whether to disable sending Prometheus stale markers for metrics when scrape target disappears. This option may reduce memory usage if stale markers aren't needed for your setup. This option also disables populating the scrape_series_added metric. See https://prometheus.io/docs/concepts/jobs_instances/#automatically-generated-labels-and-time-series -promscrape.openstackSDCheckInterval duration - Interval for checking for changes in openstack API server. This works only if openstack_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config for details (default 30s) + Interval for checking for changes in openstack API server. This works only if openstack_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config for details (default 30s) -promscrape.seriesLimitPerTarget int - Optional limit on the number of unique time series a single scrape target can expose. 
See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter for more info + Optional limit on the number of unique time series a single scrape target can expose. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter for more info -promscrape.streamParse - Whether to enable stream parsing for metrics obtained from scrape targets. This may be useful for reducing memory usage when millions of metrics are exposed per each scrape target. It is posible to set 'stream_parse: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control + Whether to enable stream parsing for metrics obtained from scrape targets. This may be useful for reducing memory usage when millions of metrics are exposed per each scrape target. It is posible to set 'stream_parse: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control -promscrape.suppressDuplicateScrapeTargetErrors - Whether to suppress 'duplicate scrape target' errors; see https://docs.victoriametrics.com/vmagent.html#troubleshooting for details + Whether to suppress 'duplicate scrape target' errors; see https://docs.victoriametrics.com/vmagent.html#troubleshooting for details -promscrape.suppressScrapeErrors - Whether to suppress scrape errors logging. The last error for each target is always available at '/targets' page even if scrape errors logging is suppressed + Whether to suppress scrape errors logging. The last error for each target is always available at '/targets' page even if scrape errors logging is suppressed -relabelConfig string - Optional path to a file with relabeling rules, which are applied to all the ingested metrics. The path can point either to local file or to http url. See https://docs.victoriametrics.com/#relabeling for details. The config is reloaded on SIGHUP signal + Optional path to a file with relabeling rules, which are applied to all the ingested metrics. 
The path can point either to local file or to http url. See https://docs.victoriametrics.com/#relabeling for details. The config is reloaded on SIGHUP signal -relabelDebug - Whether to log metrics before and after relabeling with -relabelConfig. If the -relabelDebug is enabled, then the metrics aren't sent to storage. This is useful for debugging the relabeling configs + Whether to log metrics before and after relabeling with -relabelConfig. If the -relabelDebug is enabled, then the metrics aren't sent to storage. This is useful for debugging the relabeling configs -retentionPeriod value - Data with timestamps outside the retentionPeriod is automatically deleted - The following optional suffixes are supported: h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 1) + Data with timestamps outside the retentionPeriod is automatically deleted + The following optional suffixes are supported: h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 1) -search.cacheTimestampOffset duration - The maximum duration since the current time for response data, which is always queried from the original raw data, without using the response cache. Increase this value if you see gaps in responses due to time synchronization issues between VictoriaMetrics and data sources. See also -search.disableAutoCacheReset (default 5m0s) + The maximum duration since the current time for response data, which is always queried from the original raw data, without using the response cache. Increase this value if you see gaps in responses due to time synchronization issues between VictoriaMetrics and data sources. 
See also -search.disableAutoCacheReset (default 5m0s) -search.disableAutoCacheReset - Whether to disable automatic response cache reset if a sample with timestamp outside -search.cacheTimestampOffset is inserted into VictoriaMetrics + Whether to disable automatic response cache reset if a sample with timestamp outside -search.cacheTimestampOffset is inserted into VictoriaMetrics -search.disableCache - Whether to disable response caching. This may be useful during data backfilling + Whether to disable response caching. This may be useful during data backfilling -search.graphiteMaxPointsPerSeries int - The maximum number of points per series Graphite render API can return (default 1000000) + The maximum number of points per series Graphite render API can return (default 1000000) -search.graphiteStorageStep duration - The interval between datapoints stored in the database. It is used at Graphite Render API handler for normalizing the interval between datapoints in case it isn't normalized. It can be overriden by sending 'storage_step' query arg to /render API or by sending the desired interval via 'Storage-Step' http header during querying /render API (default 10s) + The interval between datapoints stored in the database. It is used at Graphite Render API handler for normalizing the interval between datapoints in case it isn't normalized. It can be overriden by sending 'storage_step' query arg to /render API or by sending the desired interval via 'Storage-Step' http header during querying /render API (default 10s) -search.latencyOffset duration - The time when data points become visible in query results after the collection. Too small value can result in incomplete last points for query results (default 30s) + The time when data points become visible in query results after the collection. Too small value can result in incomplete last points for query results (default 30s) -search.logSlowQueryDuration duration - Log queries with execution time exceeding this value. 
Zero disables slow query logging (default 5s) + Log queries with execution time exceeding this value. Zero disables slow query logging (default 5s) -search.maxConcurrentRequests int - The maximum number of concurrent search requests. It shouldn't be high, since a single request can saturate all the CPU cores. See also -search.maxQueueDuration (default 8) + The maximum number of concurrent search requests. It shouldn't be high, since a single request can saturate all the CPU cores. See also -search.maxQueueDuration (default 8) -search.maxExportDuration duration - The maximum duration for /api/v1/export call (default 720h0m0s) + The maximum duration for /api/v1/export call (default 720h0m0s) -search.maxLookback duration - Synonym to -search.lookback-delta from Prometheus. The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via max_lookback arg. See also '-search.maxStalenessInterval' flag, which has the same meaining due to historical reasons + Synonym to -search.lookback-delta from Prometheus. The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via max_lookback arg. See also '-search.maxStalenessInterval' flag, which has the same meaining due to historical reasons -search.maxPointsPerTimeseries int - The maximum points per a single timeseries returned from /api/v1/query_range. This option doesn't limit the number of scanned raw samples in the database. The main purpose of this option is to limit the number of per-series points returned to graphing UI such as Grafana. There is no sense in setting this limit to values bigger than the horizontal resolution of the graph (default 30000) + The maximum points per a single timeseries returned from /api/v1/query_range. This option doesn't limit the number of scanned raw samples in the database. 
The main purpose of this option is to limit the number of per-series points returned to graphing UI such as Grafana. There is no sense in setting this limit to values bigger than the horizontal resolution of the graph (default 30000) -search.maxQueryDuration duration - The maximum duration for query execution (default 30s) + The maximum duration for query execution (default 30s) -search.maxQueryLen size - The maximum search query length in bytes - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16384) + The maximum search query length in bytes + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16384) -search.maxQueueDuration duration - The maximum time the request waits for execution when -search.maxConcurrentRequests limit is reached; see also -search.maxQueryDuration (default 10s) + The maximum time the request waits for execution when -search.maxConcurrentRequests limit is reached; see also -search.maxQueryDuration (default 10s) -search.maxSamplesPerQuery int - The maximum number of raw samples a single query can process across all time series. This protects from heavy queries, which select unexpectedly high number of raw samples. See also -search.maxSamplesPerSeries (default 1000000000) + The maximum number of raw samples a single query can process across all time series. This protects from heavy queries, which select unexpectedly high number of raw samples. See also -search.maxSamplesPerSeries (default 1000000000) -search.maxSamplesPerSeries int - The maximum number of raw samples a single query can scan per each time series. This option allows limiting memory usage (default 30000000) + The maximum number of raw samples a single query can scan per each time series. This option allows limiting memory usage (default 30000000) -search.maxStalenessInterval duration - The maximum interval for staleness calculations. 
By default it is automatically calculated from the median interval between samples. This flag could be useful for tuning Prometheus data model closer to Influx-style data model. See https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness for details. See also '-search.maxLookback' flag, which has the same meaning due to historical reasons + The maximum interval for staleness calculations. By default it is automatically calculated from the median interval between samples. This flag could be useful for tuning Prometheus data model closer to Influx-style data model. See https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness for details. See also '-search.maxLookback' flag, which has the same meaning due to historical reasons -search.maxStatusRequestDuration duration - The maximum duration for /api/v1/status/* requests (default 5m0s) + The maximum duration for /api/v1/status/* requests (default 5m0s) -search.maxStepForPointsAdjustment duration - The maximum step when /api/v1/query_range handler adjusts points with timestamps closer than -search.latencyOffset to the current time. The adjustment is needed because such points may contain incomplete data (default 1m0s) + The maximum step when /api/v1/query_range handler adjusts points with timestamps closer than -search.latencyOffset to the current time. 
The adjustment is needed because such points may contain incomplete data (default 1m0s) -search.maxTagKeys int - The maximum number of tag keys returned from /api/v1/labels (default 100000) + The maximum number of tag keys returned from /api/v1/labels (default 100000) -search.maxTagValueSuffixesPerSearch int - The maximum number of tag value suffixes returned from /metrics/find (default 100000) + The maximum number of tag value suffixes returned from /metrics/find (default 100000) -search.maxTagValues int - The maximum number of tag values returned from /api/v1/label//values (default 100000) + The maximum number of tag values returned from /api/v1/label//values (default 100000) -search.maxUniqueTimeseries int - The maximum number of unique time series each search can scan. This option allows limiting memory usage (default 300000) + The maximum number of unique time series each search can scan. This option allows limiting memory usage (default 300000) -search.minStalenessInterval duration - The minimum interval for staleness calculations. This flag could be useful for removing gaps on graphs generated from time series with irregular intervals between samples. See also '-search.maxStalenessInterval' + The minimum interval for staleness calculations. This flag could be useful for removing gaps on graphs generated from time series with irregular intervals between samples. See also '-search.maxStalenessInterval' -search.noStaleMarkers - Set this flag to true if the database doesn't contain Prometheus stale markers, so there is no need in spending additional CPU time on its handling. Staleness markers may exist only in data obtained from Prometheus scrape targets + Set this flag to true if the database doesn't contain Prometheus stale markers, so there is no need in spending additional CPU time on its handling. 
Staleness markers may exist only in data obtained from Prometheus scrape targets -search.queryStats.lastQueriesCount int - Query stats for /api/v1/status/top_queries is tracked on this number of last queries. Zero value disables query stats tracking (default 20000) + Query stats for /api/v1/status/top_queries is tracked on this number of last queries. Zero value disables query stats tracking (default 20000) -search.queryStats.minQueryDuration duration - The minimum duration for queries to track in query stats at /api/v1/status/top_queries. Queries with lower duration are ignored in query stats (default 1ms) + The minimum duration for queries to track in query stats at /api/v1/status/top_queries. Queries with lower duration are ignored in query stats (default 1ms) -search.resetCacheAuthKey string - Optional authKey for resetting rollup cache via /internal/resetRollupResultCache call + Optional authKey for resetting rollup cache via /internal/resetRollupResultCache call -search.treatDotsAsIsInRegexps - Whether to treat dots as is in regexp label filters used in queries. For example, foo{bar=~"a.b.c"} will be automatically converted to foo{bar=~"a\\.b\\.c"}, i.e. all the dots in regexp filters will be automatically escaped in order to match only dot char instead of matching any char. Dots in ".+", ".*" and ".{n}" regexps aren't escaped. This option is DEPRECATED in favor of {__graphite__="a.*.c"} syntax for selecting metrics matching the given Graphite metrics filter + Whether to treat dots as is in regexp label filters used in queries. For example, foo{bar=~"a.b.c"} will be automatically converted to foo{bar=~"a\\.b\\.c"}, i.e. all the dots in regexp filters will be automatically escaped in order to match only dot char instead of matching any char. Dots in ".+", ".*" and ".{n}" regexps aren't escaped. 
This option is DEPRECATED in favor of {__graphite__="a.*.c"} syntax for selecting metrics matching the given Graphite metrics filter -selfScrapeInstance string - Value for 'instance' label, which is added to self-scraped metrics (default "self") + Value for 'instance' label, which is added to self-scraped metrics (default "self") -selfScrapeInterval duration - Interval for self-scraping own metrics at /metrics page + Interval for self-scraping own metrics at /metrics page -selfScrapeJob string - Value for 'job' label, which is added to self-scraped metrics (default "victoria-metrics") + Value for 'job' label, which is added to self-scraped metrics (default "victoria-metrics") -smallMergeConcurrency int - The maximum number of CPU cores to use for small merges. Default value is used if set to 0 + The maximum number of CPU cores to use for small merges. Default value is used if set to 0 -snapshotAuthKey string - authKey, which must be passed in query string to /snapshot* pages + authKey, which must be passed in query string to /snapshot* pages -sortLabels - Whether to sort labels for incoming samples before writing them to storage. This may be needed for reducing memory usage at storage when the order of labels in incoming samples is random. For example, if m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}. Enabled sorting for labels can slow down ingestion performance a bit + Whether to sort labels for incoming samples before writing them to storage. This may be needed for reducing memory usage at storage when the order of labels in incoming samples is random. For example, if m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}. Enabled sorting for labels can slow down ingestion performance a bit -storage.cacheSizeIndexDBDataBlocks size - Overrides max size for indexdb/dataBlocks cache. 
See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Overrides max size for indexdb/dataBlocks cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -storage.cacheSizeIndexDBIndexBlocks size - Overrides max size for indexdb/indexBlocks cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Overrides max size for indexdb/indexBlocks cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -storage.cacheSizeStorageTSID size - Overrides max size for storage/tsid cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Overrides max size for storage/tsid cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -storage.maxDailySeries int - The maximum number of unique series can be added to the storage during the last 24 hours. Excess series are logged and dropped. This can be useful for limiting series churn rate. See also -storage.maxHourlySeries + The maximum number of unique series can be added to the storage during the last 24 hours. Excess series are logged and dropped. This can be useful for limiting series churn rate. See also -storage.maxHourlySeries -storage.maxHourlySeries int - The maximum number of unique series can be added to the storage during the last hour. 
Excess series are logged and dropped. This can be useful for limiting series cardinality. See also -storage.maxDailySeries + The maximum number of unique series can be added to the storage during the last hour. Excess series are logged and dropped. This can be useful for limiting series cardinality. See also -storage.maxDailySeries -storage.minFreeDiskSpaceBytes size - The minimum free disk space at -storageDataPath after which the storage stops accepting new data - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 10000000) + The minimum free disk space at -storageDataPath after which the storage stops accepting new data + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 10000000) -storageDataPath string - Path to storage data (default "victoria-metrics-data") + Path to storage data (default "victoria-metrics-data") -tls - Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set + Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set -tlsCertFile string - Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated -tlsKeyFile string - Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS key. Used only if -tls is set. 
The provided key file is automatically re-read every second, so it can be dynamically updated -version - Show VictoriaMetrics version + Show VictoriaMetrics version ``` diff --git a/docs/Release-Guide.md b/docs/Release-Guide.md index 1b02f357b..8dce6bfe2 100644 --- a/docs/Release-Guide.md +++ b/docs/Release-Guide.md @@ -13,37 +13,38 @@ sort: 17 * `git tag -s v1.xx.y-enterprise` in `enterprise` branch * `git tag -s v1.xx.y-enterprise-cluster` in `enterprise-cluster` branch 2. Run `TAG=v1.xx.y make publish-release`. It will create `*.tar.gz` release archives with the corresponding `_checksums.txt` files inside `bin` directory and publish Docker images for the given `TAG`, `TAG-cluster`, `TAG-enterprise` and `TAG-enterprise-cluster`. -5. Push release tag to https://github.com/VictoriaMetrics/VictoriaMetrics : `git push origin v1.xx.y`. -6. Go to https://github.com/VictoriaMetrics/VictoriaMetrics/releases , create new release from the pushed tag on step 5 and upload `*.tar.gz` archive with the corresponding `_checksums.txt` from step 2. +3. Push release tag to <https://github.com/VictoriaMetrics/VictoriaMetrics> : `git push origin v1.xx.y`. +4. Go to <https://github.com/VictoriaMetrics/VictoriaMetrics/releases> , create new release from the pushed tag on step 5 and upload `*.tar.gz` archive with the corresponding `_checksums.txt` from step 2. -## Building snap package. +## Building snap package - pre-requirements: -- snapcraft binary, can be installed with commands: + pre-requirements: + +* snapcraft binary, can be installed with commands: for MacOS `brew install snapcraft` and [install mutipass](https://discourse.ubuntu.com/t/installing-multipass-on-macos/8329), for Ubuntu - `sudo snap install snapcraft --classic` -- exported snapcraft login to `~/.snap/login.json` with `snapcraft export-login login.json && mkdir -p ~/.snap && mv login.json ~/.snap/` -- already created release at github (it operates `git describe` version, so git tag must be annotated).
+* exported snapcraft login to `~/.snap/login.json` with `snapcraft export-login login.json && mkdir -p ~/.snap && mv login.json ~/.snap/` +* already created release at github (it operates `git describe` version, so git tag must be annotated). 0. checkout to the latest git tag for single-node version. 1. execute `make release-snap` - it must build and upload snap package. 2. promote release to current, if needed manually at release page [snapcraft-releases](https://snapcraft.io/victoriametrics/releases) -### Public Announcement +### Public Announcement -- Publish message in Slack at https://victoriametrics.slack.com -- Post at Twitter at https://twitter.com/MetricsVictoria -- Post in Reddit at https://www.reddit.com/r/VictoriaMetrics/ -- Post in Linkedin at https://www.linkedin.com/company/victoriametrics/ -- Publish message in Telegram at https://t.me/VictoriaMetrics_en and https://t.me/VictoriaMetrics_ru1 -- Publish message in google groups at https://groups.google.com/forum/#!forum/victorametrics-users +* Publish message in Slack at <https://victoriametrics.slack.com> +* Post at Twitter at <https://twitter.com/MetricsVictoria> +* Post in Reddit at <https://www.reddit.com/r/VictoriaMetrics/> +* Post in Linkedin at <https://www.linkedin.com/company/victoriametrics/> +* Publish message in Telegram at <https://t.me/VictoriaMetrics_en> and <https://t.me/VictoriaMetrics_ru1> +* Publish message in google groups at <https://groups.google.com/forum/#!forum/victorametrics-users> ## Helm Charts The helm chart repository [https://github.com/VictoriaMetrics/helm-charts/](https://github.com/VictoriaMetrics/helm-charts/) +### Bump the version of images -### Bump the version of images. In that case, don't need to bump the helm chart version 1. Need to update [`values.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-cluster/values.yaml), bump version for `vmselect`, `vminsert` and `vmstorage` @@ -52,20 +53,20 @@ In that case, don't need to bump the helm chart version 4. Push changes to master. `master` is a source of truth 5. Rebase `master` into `gh-pages` branch 6. Run `make package` which creates or updates zip file with the packed chart -7. Run `make merge`. It creates or updates metadata for charts in index.yaml -8.
Push the changes to `gh-pages` branch +7. Run `make merge`. It creates or updates metadata for charts in index.yaml +8. Push the changes to `gh-pages` branch + +### Updating the chart -### Updating the chart. 1. Update chart version in [`Chart.yaml`](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-cluster/Chart.yaml) 2. Update [README.md](https://github.com/VictoriaMetrics/helm-charts/blob/master/charts/victoria-metrics-cluster/README.md) file, reflect changes in the documentation. 3. Repeat the procedure from step _4_ previous section. - ## Wiki pages All changes from `docs` folder and `.md` extension automatically push to Wiki -**_Note_**: no vice versa, direct changes on Wiki will be overitten after any changes in `docs/*.md` +**_Note_**: no vice versa, direct changes on Wiki will be overitten after any changes in `docs/*.md` ## Github pages diff --git a/docs/Single-server-VictoriaMetrics.md b/docs/Single-server-VictoriaMetrics.md index f5dcf968f..b7cea94a0 100644 --- a/docs/Single-server-VictoriaMetrics.md +++ b/docs/Single-server-VictoriaMetrics.md @@ -25,7 +25,6 @@ Cluster version of VictoriaMetrics is available [here](https://docs.victoriametr [Contact us](mailto:info@victoriametrics.com) if you need enterprise support for VictoriaMetrics. See [features available in enterprise package](https://victoriametrics.com/products/enterprise/). Enterprise binaries can be downloaded and evaluated for free from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases). - ## Prominent features VictoriaMetrics has the following prominent features: @@ -65,7 +64,6 @@ VictoriaMetrics has the following prominent features: See also [various Articles about VictoriaMetrics](https://docs.victoriametrics.com/Articles.html). 
- ## Case studies and talks Case studies: @@ -96,7 +94,6 @@ Case studies: See also [articles and slides about VictoriaMetrics from our users](https://docs.victoriametrics.com/Articles.html#third-party-articles-and-slides-about-victoriametrics) - ## Operation ## How to start VictoriaMetrics @@ -116,7 +113,6 @@ VictoriaMetrics accepts [Prometheus querying API requests](#prometheus-querying- It is recommended setting up [monitoring](#monitoring) for VictoriaMetrics. - ### Environment variables Each flag value can be set via environment variables according to these rules: @@ -126,10 +122,8 @@ Each flag value can be set via environment variables according to these rules: * For repeating flags an alternative syntax can be used by joining the different values into one using `,` char as separator (for example `-storageNode -storageNode ` will translate to `storageNode=,`). * Environment var prefix can be set via `-envflag.prefix` flag. For instance, if `-envflag.prefix=VM_`, then env vars must be prepended with `VM_`. - ### Configuration with snap package - Snap package for VictoriaMetrics is available [here](https://snapcraft.io/victoriametrics). Command-line flags for Snap package can be set with following command: @@ -141,7 +135,6 @@ snap restart victoriametrics Do not change value for `-storageDataPath` flag, because snap package has limited access to host filesystem. - Changing scrape configuration is possible with text editor: ```text @@ -150,7 +143,6 @@ vi $SNAP_DATA/var/snap/victoriametrics/current/etc/victoriametrics-scrape-config After changes were made, trigger config re-read with the command `curl 127.0.0.1:8248/-/reload`. 
- ## Prometheus setup Add the following lines to Prometheus config file (it is usually located at `/etc/prometheus/prometheus.yml`) in order to send data to VictoriaMetrics: @@ -204,7 +196,6 @@ It is recommended upgrading Prometheus to [v2.12.0](https://github.com/prometheu Take a look also at [vmagent](https://docs.victoriametrics.com/vmagent.html) and [vmalert](https://docs.victoriametrics.com/vmalert.html), which can be used as faster and less resource-hungry alternative to Prometheus. - ## Grafana setup Create [Prometheus datasource](http://docs.grafana.org/features/datasources/prometheus/) in Grafana with the following url: @@ -217,7 +208,6 @@ Substitute `` with the hostname or IP address of VictoriaM Then build graphs and dashboards for the created datasource using [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/) or [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html). - ## How to upgrade VictoriaMetrics It is safe upgrading VictoriaMetrics to new versions unless [release notes](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) say otherwise. It is safe skipping multiple versions during the upgrade unless [release notes](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) say otherwise. It is recommended performing regular upgrades to the latest version, since it may contain important bug fixes, performance optimizations or new features. @@ -232,7 +222,6 @@ The following steps must be performed during the upgrade / downgrade procedure: Prometheus doesn't drop data during VictoriaMetrics restart. See [this article](https://grafana.com/blog/2019/03/25/whats-new-in-prometheus-2.8-wal-based-remote-write/) for details. The same applies also to [vmagent](https://docs.victoriametrics.com/vmagent.html). 
- ## How to apply new config to VictoriaMetrics VictoriaMetrics is configured via command-line flags, so it must be restarted when new command-line flags should be applied: @@ -243,7 +232,6 @@ VictoriaMetrics is configured via command-line flags, so it must be restarted wh Prometheus doesn't drop data during VictoriaMetrics restart. See [this article](https://grafana.com/blog/2019/03/25/whats-new-in-prometheus-2.8-wal-based-remote-write/) for details. The same applies alos to [vmagent](https://docs.victoriametrics.com/vmagent.html). - ## How to scrape Prometheus exporters such as [node-exporter](https://github.com/prometheus/node_exporter) VictoriaMetrics can be used as drop-in replacement for Prometheus for scraping targets configured in `prometheus.yml` config file according to [the specification](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#configuration-file). Just set `-promscrape.config` command-line flag to the path to `prometheus.yml` config - and VictoriaMetrics should start scraping the configured targets. Currently the following [scrape_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) types are supported: @@ -262,7 +250,6 @@ VictoriaMetrics can be used as drop-in replacement for Prometheus for scraping t * [digitalocean_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config) * [http_sd_config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config) - File a [feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues) if you need support for other `*_sd_config` types. The file pointed by `-promscrape.config` may contain `%{ENV_VAR}` placeholders, which are substituted by the corresponding `ENV_VAR` environment variable values. 
@@ -271,7 +258,6 @@ VictoriaMetrics also supports [importing data in Prometheus exposition format](# See also [vmagent](https://docs.victoriametrics.com/vmagent.html), which can be used as drop-in replacement for Prometheus. - ## How to send data from DataDog agent VictoriaMetrics accepts data from [DataDog agent](https://docs.datadoghq.com/agent/) or [DogStatsD]() via ["submit metrics" API](https://docs.datadoghq.com/api/latest/metrics/#submit-metrics) at `/datadog/api/v1/series` path. @@ -319,7 +305,6 @@ This command should return the following output if everything is OK: Extra labels may be added to all the written time series by passing `extra_label=name=value` query args. For example, `/datadog/api/v1/series?extra_label=foo=bar` would add `{foo="bar"}` label to all the ingested metrics. - ## How to send data from InfluxDB-compatible agents such as [Telegraf](https://www.influxdata.com/time-series-platform/telegraf/) Use `http://:8428` url instead of InfluxDB url in agents' configs. @@ -507,7 +492,6 @@ The `/api/v1/export` endpoint should return the following response: Extra labels may be added to all the imported time series by passing `extra_label=name=value` query args. For example, `/api/put?extra_label=foo=bar` would add `{foo="bar"}` label to all the ingested metrics. - ## Prometheus querying API usage VictoriaMetrics supports the following handlers from [Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/): @@ -523,7 +507,6 @@ VictoriaMetrics supports the following handlers from [Prometheus querying API](h These handlers can be queried from Prometheus-compatible clients such as Grafana or curl. All the Prometheus querying API handlers can be prepended with `/prometheus` prefix. For example, both `/prometheus/api/v1/query` and `/api/v1/query` should work. 
- ### Prometheus querying API enhancements VictoriaMetrics accepts optional `extra_label==` query arg, which can be used for enforcing additional label filters for queries. For example, @@ -556,7 +539,6 @@ Additionally VictoriaMetrics provides the following handlers: For example, request to `/api/v1/status/top_queries?topN=5&maxLifetime=30s` would return up to 5 queries per list, which were executed during the last 30 seconds. VictoriaMetrics tracks the last `-search.queryStats.lastQueriesCount` queries with durations at least `-search.queryStats.minQueryDuration`. - ## Graphite API usage VictoriaMetrics supports the following Graphite APIs, which are needed for [Graphite datasource in Grafana](https://grafana.com/docs/grafana/latest/datasources/graphite/): @@ -573,7 +555,6 @@ VictoriaMetrics accepts optional query args: `extra_label==:8428/api/v1/export?match[]=`, @@ -819,7 +792,6 @@ Exported data can be imported via POST'ing it to [/api/v1/import](#how-to-import The [deduplication](#deduplication) is applied to the data exported via `/api/v1/export` by default. The deduplication isn't applied if `reduce_mem_usage=1` query arg is passed to the request. - ### How to export CSV data Send a request to `http://:8428/api/v1/export/csv?format=&match=`, @@ -845,7 +817,6 @@ The exported CSV data can be imported to VictoriaMetrics via [/api/v1/import/csv The [deduplication](#deduplication) is applied for the data exported in CSV by default. It is possible to export raw data without de-duplication by passing `reduce_mem_usage=1` query arg to `/api/v1/export/csv`. - ### How to export data in native format Send a request to `http://:8428/api/v1/export/native?match[]=`, @@ -870,7 +841,6 @@ can fail to be imported into VictoriaMetrics release Y. The [deduplication](#deduplication) isn't applied for the data exported in native format. It is expected that the de-duplication is performed during data import. 
- ## How to import time series data Time series data can be imported into VictoriaMetrics via any supported ingestion protocol: @@ -888,7 +858,6 @@ Time series data can be imported into VictoriaMetrics via any supported ingestio * `/api/v1/import/csv` for importing arbitrary CSV data. See [these docs](#how-to-import-csv-data) for details. * `/api/v1/import/prometheus` for importing data in Prometheus exposition format. See [these docs](#how-to-import-data-in-prometheus-exposition-format) for details. - ### How to import data in JSON line format Example for importing data obtained via [/api/v1/export](#how-to-export-data-in-json-line-format): @@ -918,7 +887,6 @@ Note that it could be required to flush response cache after importing historica VictoriaMetrics parses input JSON lines one-by-one. It loads the whole JSON line in memory, then parses it and then saves the parsed samples into persistent storage. This means that VictoriaMetrics can occupy big amounts of RAM when importing too long JSON lines. The solution is to split too long JSON lines into smaller lines. It is OK if samples for a single time series are split among multiple JSON lines. - ### How to import data in native format The specification of VictoriaMetrics' native format may yet change and is not formally documented yet. So currently we do not recommend that external clients attempt to pack their own metrics in native format file. @@ -938,7 +906,6 @@ For example, `/api/v1/import/native?extra_label=foo=bar` would add `"foo":"bar"` Note that it could be required to flush response cache after importing historical data. See [these docs](#backfilling) for detail. - ### How to import CSV data Arbitrary CSV data can be imported via `/api/v1/import/csv`. The CSV data is imported according to the provided `format` query arg. 
@@ -979,6 +946,7 @@ curl -G 'http://localhost:8428/api/v1/export' -d 'match[]={ticker!=""}' ``` The following response should be returned: + ```bash {"metric":{"__name__":"bid","market":"NASDAQ","ticker":"MSFT"},"values":[1.67],"timestamps":[1583865146520]} {"metric":{"__name__":"bid","market":"NYSE","ticker":"GOOG"},"values":[4.56],"timestamps":[1583865146495]} @@ -991,7 +959,6 @@ For example, `/api/v1/import/csv?extra_label=foo=bar` would add `"foo":"bar"` la Note that it could be required to flush response cache after importing historical data. See [these docs](#backfilling) for detail. - ### How to import data in Prometheus exposition format VictoriaMetrics accepts data in [Prometheus exposition format](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md#text-based-format) @@ -1033,8 +1000,6 @@ Note that it could be required to flush response cache after importing historica VictoriaMetrics also may scrape Prometheus targets - see [these docs](#how-to-scrape-prometheus-exporters-such-as-node-exporter). - - ## Relabeling VictoriaMetrics supports Prometheus-compatible relabeling for all the ingested metrics if `-relabelConfig` command-line flag points @@ -1043,6 +1008,7 @@ The `-relabelConfig` also can point to http or https url. For example, `-relabel See [this article with relabeling tips and tricks](https://valyala.medium.com/how-to-use-relabeling-in-prometheus-and-victoriametrics-8b90fc22c4b2). Example contents for `-relabelConfig` file: + ```yml # Add {cluster="dev"} label. - target_label: cluster @@ -1056,7 +1022,6 @@ Example contents for `-relabelConfig` file: See [these docs](https://docs.victoriametrics.com/vmagent.html#relabeling) for more details about relabeling in VictoriaMetrics. - ## Federation VictoriaMetrics exports [Prometheus-compatible federation data](https://prometheus.io/docs/prometheus/latest/federation/) @@ -1068,7 +1033,6 @@ on the interval `[now - max_lookback ... 
now]` is scraped for each time series. For instance, `/federate?match[]=up&max_lookback=1h` would return last points on the `[now - 1h ... now]` interval. This may be useful for time series federation with scrape intervals exceeding `5m`. - ## Capacity planning VictoriaMetrics uses lower amounts of CPU, RAM and storage space on production workloads compared to competing solutions (Prometheus, Thanos, Cortex, TimescaleDB, InfluxDB, QuestDB, M3DB) according to [our case studies](https://docs.victoriametrics.com/CaseStudies.html). @@ -1091,7 +1055,6 @@ It is recommended leaving the following amounts of spare resources: * 50% of spare CPU for reducing the probability of slowdowns during temporary spikes in workload. * At least 30% of free storage space at the directory pointed by `-storageDataPath` command-line flag. See also `-storage.minFreeDiskSpaceBytes` command-line flag description [here](#list-of-command-line-flags). - ## High availability * Install multiple VictoriaMetrics instances in distinct datacenters (availability zones). @@ -1132,7 +1095,6 @@ to write data to `victoriametrics-addr-1`, while each `r2` should write data to Another option is to write data simultaneously from Prometheus HA pair to a pair of VictoriaMetrics instances with the enabled de-duplication. See [this section](#deduplication) for details. - ## Deduplication VictoriaMetrics de-duplicates data points if `-dedup.minScrapeInterval` command-line flag is set to positive duration. For example, `-dedup.minScrapeInterval=60s` would de-duplicate data points on the same time series if they fall within the same discrete 60s bucket. The earliest data point will be kept. In the case of equal timestamps, an arbitrary data point will be kept. See [this comment](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2112#issuecomment-1032587618) for more details on how downsampling works. 
@@ -1145,34 +1107,34 @@ The de-duplication reduces disk space usage if multiple identically configured [ write data to the same VictoriaMetrics instance. These vmagent or Prometheus instances must have identical `external_labels` section in their configs, so they write data to the same time series. - ## Storage -VictoriaMetrics stores time series data in [MergeTree](https://en.wikipedia.org/wiki/Log-structured_merge-tree)-like +VictoriaMetrics stores time series data in [MergeTree](https://en.wikipedia.org/wiki/Log-structured_merge-tree)-like data structures. On insert, VictoriaMetrics accumulates up to 1s of data and dumps it on disk to -`<-storageDataPath>/data/small/YYYY_MM/` subdirectory forming a `part` with the following +`<-storageDataPath>/data/small/YYYY_MM/` subdirectory forming a `part` with the following name pattern: `rowsCount_blocksCount_minTimestamp_maxTimestamp`. Each part consists of two "columns": values and timestamps. These are sorted and compressed raw time series values. Additionally, part contains index files for searching for specific series in the values and timestamps files. -`Parts` are periodically merged into the bigger parts. The resulting `part` is constructed -under `<-storageDataPath>/data/{small,big}/YYYY_MM/tmp` subdirectory. When the resulting `part` is complete, it is atomically moved from the `tmp` -to its own subdirectory, while the source parts are atomically removed. The end result is that the source +`Parts` are periodically merged into the bigger parts. The resulting `part` is constructed +under `<-storageDataPath>/data/{small,big}/YYYY_MM/tmp` subdirectory. When the resulting `part` is complete, it is atomically moved from the `tmp` +to its own subdirectory, while the source parts are atomically removed. The end result is that the source parts are substituted by a single resulting bigger `part` in the `<-storageDataPath>/data/{small,big}/YYYY_MM/` directory. 
-Information about merging process is available in [single-node VictoriaMetrics](https://grafana.com/dashboards/10229) -and [clustered VictoriaMetrics](https://grafana.com/grafana/dashboards/11176) Grafana dashboards. +Information about merging process is available in [single-node VictoriaMetrics](https://grafana.com/dashboards/10229) +and [clustered VictoriaMetrics](https://grafana.com/grafana/dashboards/11176) Grafana dashboards. See more details in [monitoring docs](#monitoring). -The `merge` process is usually named "compaction", because the resulting `part` size is usually smaller than +The `merge` process is usually named "compaction", because the resulting `part` size is usually smaller than the sum of the source `parts`. There are following benefits of doing the merge process: + * it improves query performance, since lower number of `parts` are inspected with each query; -* it reduces the number of data files, since each `part`contains fixed number of files; +* it reduces the number of data files, since each `part`contains fixed number of files; * better compression rate for the resulting part. -Newly added `parts` either appear in the storage or fail to appear. -Storage never contains partially created parts. The same applies to merge process — `parts` are either fully -merged into a new `part` or fail to merge. There are no partially merged `parts` in MergeTree. -`Part` contents in MergeTree never change. Parts are immutable. They may be only deleted after the merge +Newly added `parts` either appear in the storage or fail to appear. +Storage never contains partially created parts. The same applies to merge process — `parts` are either fully +merged into a new `part` or fail to merge. There are no partially merged `parts` in MergeTree. +`Part` contents in MergeTree never change. Parts are immutable. They may be only deleted after the merge to a bigger `part` or when the `part` contents goes outside the configured `-retentionPeriod`. 
See [this article](https://valyala.medium.com/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282) for more details. @@ -1186,7 +1148,7 @@ Retention is configured with the `-retentionPeriod` command-line flag, which tak Data is split in per-month partitions inside `<-storageDataPath>/data/{small,big}` folders. Data partitions outside the configured retention are deleted on the first day of the new month. Each partition consists of one or more data parts with the following name pattern `rowsCount_blocksCount_minTimestamp_maxTimestamp`. -Data parts outside of the configured retention are eventually deleted during +Data parts outside of the configured retention are eventually deleted during [background merge](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282). The maximum disk space usage for a given `-retentionPeriod` is going to be (`-retentionPeriod` + 1) months. @@ -1213,7 +1175,6 @@ so it could route requests from particular user to VictoriaMetrics with the desi The same scheme could be implemented for multiple tenants in [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html). See [these docs](https://docs.victoriametrics.com/guides/guide-vmcluster-multiple-retention-setup.html) for multi-retention setup details. - ## Downsampling [VictoriaMetrics Enterprise](https://victoriametrics.com/products/enterprise/) supports multi-level downsampling with `-downsampling.period` command-line flag. For example: @@ -1226,12 +1187,10 @@ Downsampling is applied independently per each time series. It can reduce disk s The downsampling can be evaluated for free by downloading and using enterprise binaries from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases). - ## Multi-tenancy Single-node VictoriaMetrics doesn't support multi-tenancy. 
Use [cluster version](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy) instead. - ## Scalability and cluster version Though single-node VictoriaMetrics cannot scale to multiple nodes, it is optimized for resource usage - storage size / bandwidth / IOPS, RAM, CPU. @@ -1242,7 +1201,6 @@ So try single-node VictoriaMetrics at first and then [switch to cluster version] horizontally scalable long-term remote storage for really large Prometheus deployments. [Contact us](mailto:info@victoriametrics.com) for enterprise support. - ## Alerting It is recommended using [vmalert](https://docs.victoriametrics.com/vmalert.html) for alerting. @@ -1253,7 +1211,6 @@ Additionally, alerting can be set up with the following tools: * With Promxy - see [the corresponding docs](https://github.com/jacksontj/promxy/blob/master/README.md#how-do-i-use-alertingrecording-rules-in-promxy). * With Grafana - see [the corresponding docs](https://grafana.com/docs/alerting/rules/). - ## Security Do not forget protecting sensitive endpoints in VictoriaMetrics when exposing it to untrusted networks such as the internet. @@ -1267,6 +1224,7 @@ Consider setting the following command-line flags: * `-forceMergeAuthKey` for protecting `/internal/force_merge` endpoint. See [force merge docs](#forced-merge). * `-search.resetCacheAuthKey` for protecting `/internal/resetRollupResultCache` endpoint. See [backfilling](#backfilling) for more details. * `-configAuthKey` for protecting `/config` endpoint, since it may contain sensitive information such as passwords. + - `-pprofAuthKey` for protecting `/debug/pprof/*` endpoints, which can be used for [profiling](#profiling). Explicitly set internal network interface for TCP and UDP ports for data ingestion with Graphite and OpenTSDB formats. @@ -1275,7 +1233,6 @@ For example, substitute `-graphiteListenAddr=:2003` with `-graphiteListenAddr=/cache` directory during graceful shutdown (e.g. 
when VictoriaMetrics is stopped by sending `SIGINT` signal). The caches are read on the next VictoriaMetrics startup. Sometimes it is needed to remove such caches on the next startup. This can be performed by placing `reset_cache_on_startup` file inside the `<-storageDataPath>/cache` directory before the restart of VictoriaMetrics. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1447) for details. - ## Cache tuning VictoriaMetrics uses various in-memory caches for faster data ingestion and query performance. The following metrics for each type of cache are exported at [`/metrics` page](#monitoring): -- `vm_cache_size_bytes` - the actual cache size -- `vm_cache_size_max_bytes` - cache size limit -- `vm_cache_requests_total` - the number of requests to the cache -- `vm_cache_misses_total` - the number of cache misses -- `vm_cache_entries` - the number of entries in the cache +* `vm_cache_size_bytes` - the actual cache size +* `vm_cache_size_max_bytes` - cache size limit +* `vm_cache_requests_total` - the number of requests to the cache +* `vm_cache_misses_total` - the number of cache misses +* `vm_cache_entries` - the number of entries in the cache Both Grafana dashboards for [single-node VictoriaMetrics](https://grafana.com/dashboards/10229) and [clustered VictoriaMetrics](https://grafana.com/grafana/dashboards/11176) @@ -1456,28 +1409,28 @@ practical scenarios and workloads. Change the defaults only if you understand th To override the default values see command-line flags with `-storage.cacheSize` prefix. See the full description of flags [here](#list-of-command-line-flags). - ## Data migration ### From VictoriaMetrics -The simplest way to migrate data from one single-node (source) to another (destination), or from one vmstorage node +The simplest way to migrate data from one single-node (source) to another (destination), or from one vmstorage node to another do the following: + 1. Stop the VictoriaMetrics (source) with `kill -INT`; -2. 
Copy (via [rsync](https://en.wikipedia.org/wiki/Rsync) or any other tool) the entire folder specified +2. Copy (via [rsync](https://en.wikipedia.org/wiki/Rsync) or any other tool) the entire folder specified via `-storageDataPath` from the source node to the empty folder at the destination node. -3. Once copy is done, stop the VictoriaMetrics (destination) with `kill -INT` and verify that +3. Once copy is done, stop the VictoriaMetrics (destination) with `kill -INT` and verify that its `-storageDataPath` points to the copied folder from p.2; 4. Start the VictoriaMetrics (destination). The copied data should be now available. Things to consider when copying data: + 1. Data formats between single-node and vmstorage node aren't compatible and can't be copied. 2. Copying data folder means complete replacement of the previous data on destination VictoriaMetrics. For more complex scenarios like single-to-cluster, cluster-to-single, re-sharding or migrating only a fraction of data - see [vmctl. Migrating data from VictoriaMetrics](https://docs.victoriametrics.com/vmctl.html#migrating-data-from-victoriametrics). - ### From other systems Use [vmctl](https://docs.victoriametrics.com/vmctl.html) for data migration. It supports the following data migration types: @@ -1489,7 +1442,6 @@ Use [vmctl](https://docs.victoriametrics.com/vmctl.html) for data migration. It See [vmctl docs](https://docs.victoriametrics.com/vmctl.html) for more details. - ## Backfilling VictoriaMetrics accepts historical data in arbitrary order of time via [any supported ingestion method](#how-to-import-time-series-data). @@ -1507,7 +1459,6 @@ Yet another solution is to increase `-search.cacheTimestampOffset` flag value in for data with timestamps close to the current time. Single-node VictoriaMetrics automatically resets response cache when samples with timestamps older than `now - search.cacheTimestampOffset` are ingested to it. 
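
When the automatic reset described above does not apply, the `/internal/resetRollupResultCache` endpoint listed in the security section can be called by hand after backfilling. The sketch below is a minimal illustration, not part of VictoriaMetrics: the helper names `reset_cache_url` and `reset_cache` are hypothetical, the address `http://localhost:8428` is only the default `-httpListenAddr` port, and passing the `-search.resetCacheAuthKey` value through the `authKey` query arg is an assumption based on how the other auth keys are described.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit
from urllib.request import urlopen


def reset_cache_url(base_url: str, auth_key: str = "") -> str:
    """Build the URL of the response cache reset endpoint (hypothetical helper)."""
    parts = urlsplit(base_url)
    # Keep any -http.pathPrefix style prefix already present in base_url.
    path = parts.path.rstrip("/") + "/internal/resetRollupResultCache"
    # Assumption: the -search.resetCacheAuthKey value is sent as the authKey query arg.
    query = urlencode({"authKey": auth_key}) if auth_key else ""
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))


def reset_cache(base_url: str, auth_key: str = "", timeout: float = 10.0) -> int:
    """Call the endpoint and return the HTTP status code."""
    with urlopen(reset_cache_url(base_url, auth_key), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Only prints the URL; call reset_cache() against a running instance to send the request.
    print(reset_cache_url("http://localhost:8428", "secret"))
```

Building the URL separately from sending the request makes it easy to check the target address before touching a production instance.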
- ## Data updates VictoriaMetrics doesn't support updating already existing sample values to new ones. It stores all the ingested data points @@ -1515,7 +1466,6 @@ for the same time series with identical timestamps. While it is possible substit [removal of old time series](#how-to-delete-time-series) and then [writing new time series](#backfilling), this approach should be used only for one-off updates. It shouldn't be used for frequent updates because of non-zero overhead related to data removal. - ## Replication Single-node VictoriaMetrics doesn't support application-level replication. Use cluster version instead. @@ -1525,7 +1475,6 @@ Storage-level replication may be offloaded to durable persistent storage such as See also [high availability docs](#high-availability) and [backup docs](#backups). - ## Backups VictoriaMetrics supports backups via [vmbackup](https://docs.victoriametrics.com/vmbackup.html) @@ -1533,19 +1482,17 @@ and [vmrestore](https://docs.victoriametrics.com/vmrestore.html) tools. We also provide [vmbackupmanager](https://docs.victoriametrics.com/vmbackupmanager.html) tool for enterprise subscribers. Enterprise binaries can be downloaded and evaluated for free from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases). - ## Benchmarks -Note, that vendors (including VictoriaMetrics) are often biased when doing such tests. E.g. they try highlighting -the best parts of their product, while highlighting the worst parts of competing products. -So we encourage users and all independent third parties to conduct their becnhmarks for various products +Note, that vendors (including VictoriaMetrics) are often biased when doing such tests. E.g. they try highlighting +the best parts of their product, while highlighting the worst parts of competing products. +So we encourage users and all independent third parties to conduct their becnhmarks for various products they are evaluating in production and publish the results. 
As a reference, please see [benchmarks](https://docs.victoriametrics.com/Articles.html#benchmarks) conducted by -VictoriaMetrics team. Please also see the [helm chart](https://github.com/VictoriaMetrics/benchmark) +VictoriaMetrics team. Please also see the [helm chart](https://github.com/VictoriaMetrics/benchmark) for running ingestion benchmarks based on node_exporter metrics. - ## Profiling VictoriaMetrics provides handlers for collecting the following [Go profiles](https://blog.golang.org/profiling-go-programs): @@ -1574,7 +1521,6 @@ The command for collecting CPU profile waits for 30 seconds before returning. The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof). - ## Integrations * [Helm charts for single-node and cluster versions of VictoriaMetrics](https://github.com/VictoriaMetrics/helm-charts). @@ -1588,7 +1534,6 @@ The collected profiles may be analyzed with [go tool pprof](https://github.com/g * [Snap package for VictoriaMetrics](https://snapcraft.io/victoriametrics). * [vmalert-cli](https://github.com/aorfanos/vmalert-cli) - a CLI application for managing [vmalert](https://docs.victoriametrics.com/vmalert.html). - ## Third-party contributions * [Unofficial yum repository](https://copr.fedorainfracloud.org/coprs/antonpatsev/VictoriaMetrics/) ([source code](https://github.com/patsevanton/victoriametrics-rpm)) @@ -1596,12 +1541,10 @@ The collected profiles may be analyzed with [go tool pprof](https://github.com/g * [Prometheus -> VictoriaMetrics exporter #2](https://github.com/AnchorFree/tsdb-remote-write) * [Prometheus Oauth proxy](https://gitlab.com/optima_public/prometheus_oauth_proxy) - see [this article](https://medium.com/@richard.holly/powerful-saas-solution-for-detection-metrics-c67b9208d362) for details. - ## Contacts Contact us with any questions regarding VictoriaMetrics at [info@victoriametrics.com](mailto:info@victoriametrics.com). 
- ## Community and contributions Feel free asking any questions regarding VictoriaMetrics: @@ -1635,7 +1578,6 @@ Adhering `KISS` principle simplifies the resulting code and architecture, so it Report bugs and propose new features [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues). - ## VictoriaMetrics Logo [Zip](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/VM_logo.zip) contains three folders with different image orientations (main color and inverted version). @@ -1665,315 +1607,314 @@ Files included in each folder: * Do not change spacing, alignment, or relative locations of the design elements. * Do not change the proportions of any of the design elements or the design itself. You may resize as needed but must retain all proportions. - ## List of command-line flags Pass `-help` to VictoriaMetrics in order to see the list of supported command-line flags with their description: ``` -bigMergeConcurrency int - The maximum number of CPU cores to use for big merges. Default value is used if set to 0 + The maximum number of CPU cores to use for big merges. Default value is used if set to 0 -configAuthKey string - Authorization key for accessing /config page. It must be passed via authKey query arg + Authorization key for accessing /config page. It must be passed via authKey query arg -csvTrimTimestamp duration - Trim timestamps when importing csv data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) + Trim timestamps when importing csv data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 
1s) may be used for reducing disk space usage for timestamp data (default 1ms) -datadog.maxInsertRequestSize size - The maximum size in bytes of a single DataDog POST request to /api/v1/series - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 67108864) + The maximum size in bytes of a single DataDog POST request to /api/v1/series + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 67108864) -dedup.minScrapeInterval duration - Leave only the first sample in every time series per each discrete interval equal to -dedup.minScrapeInterval > 0. See https://docs.victoriametrics.com/#deduplication and https://docs.victoriametrics.com/#downsampling + Leave only the first sample in every time series per each discrete interval equal to -dedup.minScrapeInterval > 0. See https://docs.victoriametrics.com/#deduplication and https://docs.victoriametrics.com/#downsampling -deleteAuthKey string - authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series and /tags/delSeries + authKey for metrics' deletion via /api/v1/admin/tsdb/delete_series and /tags/delSeries -denyQueriesOutsideRetention - Whether to deny queries outside of the configured -retentionPeriod. When set, then /api/v1/query_range would return '503 Service Unavailable' error for queries with 'from' value outside -retentionPeriod. This may be useful when multiple data sources with distinct retentions are hidden behind query-tee + Whether to deny queries outside of the configured -retentionPeriod. When set, then /api/v1/query_range would return '503 Service Unavailable' error for queries with 'from' value outside -retentionPeriod. This may be useful when multiple data sources with distinct retentions are hidden behind query-tee -downsampling.period array - Comma-separated downsampling periods in the format 'offset:period'. For example, '30d:10m' instructs to leave a single sample per 10 minutes for samples older than 30 days. 
See https://docs.victoriametrics.com/#downsampling for details - Supports an array of values separated by comma or specified via multiple flags. + Comma-separated downsampling periods in the format 'offset:period'. For example, '30d:10m' instructs to leave a single sample per 10 minutes for samples older than 30 days. See https://docs.victoriametrics.com/#downsampling for details + Supports an array of values separated by comma or specified via multiple flags. -dryRun - Whether to check only -promscrape.config and then exit. Unknown config entries aren't allowed in -promscrape.config by default. This can be changed with -promscrape.config.strictParse=false command-line flag + Whether to check only -promscrape.config and then exit. Unknown config entries aren't allowed in -promscrape.config by default. This can be changed with -promscrape.config.strictParse=false command-line flag -enableTCP6 - Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used + Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used -envflag.enable - Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details + Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. 
See https://docs.victoriametrics.com/#environment-variables for more details -envflag.prefix string - Prefix for environment variables if -envflag.enable is set + Prefix for environment variables if -envflag.enable is set -eula - By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf + By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf -finalMergeDelay duration - The delay before starting final merge for per-month partition after no new data is ingested into it. Final merge may require additional disk IO and CPU resources. Final merge may increase query speed and reduce disk space usage in some cases. Zero value disables final merge + The delay before starting final merge for per-month partition after no new data is ingested into it. Final merge may require additional disk IO and CPU resources. Final merge may increase query speed and reduce disk space usage in some cases. Zero value disables final merge -forceFlushAuthKey string - authKey, which must be passed in query string to /internal/force_flush pages + authKey, which must be passed in query string to /internal/force_flush pages -forceMergeAuthKey string - authKey, which must be passed in query string to /internal/force_merge pages + authKey, which must be passed in query string to /internal/force_merge pages -fs.disableMmap - Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread() + Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. 
mmap() is usually faster for reading small data chunks than pread() -graphiteListenAddr string - TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty + TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty -graphiteTrimTimestamp duration - Trim timestamps for Graphite data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) + Trim timestamps for Graphite data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) -http.connTimeout duration - Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) + Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) -http.disableResponseCompression - Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth + Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth -http.idleConnTimeout duration - Timeout for incoming idle http connections (default 1m0s) + Timeout for incoming idle http connections (default 1m0s) -http.maxGracefulShutdownDuration duration - The maximum duration for a graceful shutdown of the HTTP server. 
A highly loaded server may require increased value for a graceful shutdown (default 7s) + The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s) -http.pathPrefix string - An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus + An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus -http.shutdownDelay duration - Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers + Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers -httpAuth.password string - Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty + Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty -httpAuth.username string - Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password + Username for HTTP Basic Auth. The authentication is disabled if empty. 
See also -httpAuth.password -httpListenAddr string - TCP address to listen for http connections (default ":8428") + TCP address to listen for http connections (default ":8428") -import.maxLineLen size - The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 104857600) + The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 104857600) -influx.databaseNames array - Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb - Supports an array of values separated by comma or specified via multiple flags. + Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb + Supports an array of values separated by comma or specified via multiple flags. 
-influx.maxLineSize size - The maximum size in bytes for a single InfluxDB line during parsing - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 262144) + The maximum size in bytes for a single InfluxDB line during parsing + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 262144) -influxDBLabel string - Default label for the DB name sent over '?db={db_name}' query parameter (default "db") + Default label for the DB name sent over '?db={db_name}' query parameter (default "db") -influxListenAddr string - TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://:8428/write + TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://:8428/write -influxMeasurementFieldSeparator string - Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_") + Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_") -influxSkipMeasurement - Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator' + Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator' -influxSkipSingleField - Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if InfluxDB line contains only a single field + Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if InfluxDB line contains only a single field -influxTrimTimestamp duration - Trim timestamps for InfluxDB line protocol data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 
1s) may be used for reducing disk space usage for timestamp data (default 1ms) + Trim timestamps for InfluxDB line protocol data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) -insert.maxQueueDuration duration - The maximum duration for waiting in the queue for insert requests due to -maxConcurrentInserts (default 1m0s) + The maximum duration for waiting in the queue for insert requests due to -maxConcurrentInserts (default 1m0s) -logNewSeries - Whether to log new series. This option is for debug purposes only. It can lead to performance issues when big number of new series are ingested into VictoriaMetrics + Whether to log new series. This option is for debug purposes only. It can lead to performance issues when big number of new series are ingested into VictoriaMetrics -loggerDisableTimestamps - Whether to disable writing timestamps in logs + Whether to disable writing timestamps in logs -loggerErrorsPerSecondLimit int - Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit + Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit -loggerFormat string - Format for logs. Possible values: default, json (default "default") + Format for logs. Possible values: default, json (default "default") -loggerLevel string - Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") + Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") -loggerOutput string - Output for the logs. Supported values: stderr, stdout (default "stderr") + Output for the logs. 
Supported values: stderr, stdout (default "stderr") -loggerTimezone string - Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") + Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") -loggerWarnsPerSecondLimit int - Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit + Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit -maxConcurrentInserts int - The maximum number of concurrent inserts. Default value should work for most cases, since it minimizes the overhead for concurrent inserts. This option is tigthly coupled with -insert.maxQueueDuration (default 16) + The maximum number of concurrent inserts. Default value should work for most cases, since it minimizes the overhead for concurrent inserts. This option is tigthly coupled with -insert.maxQueueDuration (default 16) -maxInsertRequestSize size - The maximum size in bytes of a single Prometheus remote_write API request - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) + The maximum size in bytes of a single Prometheus remote_write API request + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) -maxLabelValueLen int - The maximum length of label values in the accepted time series. Longer label values are truncated. In this case the vm_too_long_label_values_total metric at /metrics page is incremented (default 16384) + The maximum length of label values in the accepted time series. Longer label values are truncated. 
In this case the vm_too_long_label_values_total metric at /metrics page is incremented (default 16384) -maxLabelsPerTimeseries int - The maximum number of labels accepted per time series. Superfluous labels are dropped. In this case the vm_metrics_with_dropped_labels_total metric at /metrics page is incremented (default 30) + The maximum number of labels accepted per time series. Superfluous labels are dropped. In this case the vm_metrics_with_dropped_labels_total metric at /metrics page is incremented (default 30) -memory.allowedBytes size - Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -memory.allowedPercent float - Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) + Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. 
Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) -metricsAuthKey string - Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings + Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings -opentsdbHTTPListenAddr string - TCP address to listen for OpentTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty + TCP address to listen for OpentTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty -opentsdbListenAddr string - TCP and UDP address to listen for OpentTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty + TCP and UDP address to listen for OpentTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty -opentsdbTrimTimestamp duration - Trim timestamps for OpenTSDB 'telnet put' data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) + Trim timestamps for OpenTSDB 'telnet put' data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) -opentsdbhttp.maxInsertRequestSize size - The maximum size of OpenTSDB HTTP put request - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) + The maximum size of OpenTSDB HTTP put request + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) -opentsdbhttpTrimTimestamp duration - Trim timestamps for OpenTSDB HTTP data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 
1s) may be used for reducing disk space usage for timestamp data (default 1ms) + Trim timestamps for OpenTSDB HTTP data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) -pprofAuthKey string - Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings + Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings -precisionBits int - The number of precision bits to store per each value. Lower precision bits improves data compression at the cost of precision loss (default 64) + The number of precision bits to store per each value. Lower precision bits improves data compression at the cost of precision loss (default 64) -promscrape.cluster.memberNum int - The number of number in the cluster of scrapers. It must be an unique value in the range 0 ... promscrape.cluster.membersCount-1 across scrapers in the cluster + The number of number in the cluster of scrapers. It must be an unique value in the range 0 ... promscrape.cluster.membersCount-1 across scrapers in the cluster -promscrape.cluster.membersCount int - The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets + The number of members in a cluster of scrapers. Each member must have an unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets -promscrape.cluster.replicationFactor int - The number of members in the cluster, which scrape the same targets. 
If the replication factor is greater than 2, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1) + The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 2, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1) -promscrape.config string - Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details + Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details -promscrape.config.dryRun - Checks -promscrape.config file for errors and unsupported fields and then exits. Returns non-zero exit code on parsing errors and emits these errors to stderr. See also -promscrape.config.strictParse command-line flag. Pass -loggerLevel=ERROR if you don't need to see info messages in the output. + Checks -promscrape.config file for errors and unsupported fields and then exits. Returns non-zero exit code on parsing errors and emits these errors to stderr. See also -promscrape.config.strictParse command-line flag. Pass -loggerLevel=ERROR if you don't need to see info messages in the output. -promscrape.config.strictParse - Whether to deny unsupported fields in -promscrape.config . Set to false in order to silently skip unsupported fields (default true) + Whether to deny unsupported fields in -promscrape.config . Set to false in order to silently skip unsupported fields (default true) -promscrape.configCheckInterval duration - Interval for checking for changes in '-promscrape.config' file. 
By default the checking is disabled. Send SIGHUP signal in order to force config check for changes + Interval for checking for changes in '-promscrape.config' file. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes -promscrape.consul.waitTime duration - Wait time used by Consul service discovery. Default value is used if not set + Wait time used by Consul service discovery. Default value is used if not set -promscrape.consulSDCheckInterval duration - Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s) + Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s) -promscrape.digitaloceanSDCheckInterval duration - Interval for checking for changes in digital ocean. This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config for details (default 1m0s) + Interval for checking for changes in digital ocean. This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config for details (default 1m0s) -promscrape.disableCompression - Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. 
It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine-grained control + Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine-grained control -promscrape.disableKeepAlive - Whether to disable HTTP keep-alive connections when scraping all the targets. This may be useful when targets have no support for HTTP keep-alive connections. It is possible to set 'disable_keepalive: true' individually per each 'scrape_config' section in '-promscrape.config' for fine-grained control. Note that disabling HTTP keep-alive may increase load on both vmagent and scrape targets + Whether to disable HTTP keep-alive connections when scraping all the targets. This may be useful when targets have no support for HTTP keep-alive connections. It is possible to set 'disable_keepalive: true' individually per each 'scrape_config' section in '-promscrape.config' for fine-grained control. Note that disabling HTTP keep-alive may increase load on both vmagent and scrape targets -promscrape.discovery.concurrency int - The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) (default 100) + The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.)
(default 100) -promscrape.discovery.concurrentWaitTime duration - The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s) + The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s) -promscrape.dnsSDCheckInterval duration - Interval for checking for changes in dns. This works only if dns_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config for details (default 30s) + Interval for checking for changes in dns. This works only if dns_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config for details (default 30s) -promscrape.dockerSDCheckInterval duration - Interval for checking for changes in docker. This works only if docker_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#docker_sd_config for details (default 30s) + Interval for checking for changes in docker. This works only if docker_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#docker_sd_config for details (default 30s) -promscrape.dockerswarmSDCheckInterval duration - Interval for checking for changes in dockerswarm. This works only if dockerswarm_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config for details (default 30s) + Interval for checking for changes in dockerswarm. This works only if dockerswarm_sd_configs is configured in '-promscrape.config' file. 
See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config for details (default 30s) -promscrape.dropOriginalLabels - Whether to drop original labels for scrape targets at /targets and /api/v1/targets pages. This may be needed for reducing memory usage when original labels for big number of scrape targets occupy big amounts of memory. Note that this reduces debuggability for improper per-target relabeling configs + Whether to drop original labels for scrape targets at /targets and /api/v1/targets pages. This may be needed for reducing memory usage when original labels for big number of scrape targets occupy big amounts of memory. Note that this reduces debuggability for improper per-target relabeling configs -promscrape.ec2SDCheckInterval duration - Interval for checking for changes in ec2. This works only if ec2_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#ec2_sd_config for details (default 1m0s) + Interval for checking for changes in ec2. This works only if ec2_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#ec2_sd_config for details (default 1m0s) -promscrape.eurekaSDCheckInterval duration - Interval for checking for changes in eureka. This works only if eureka_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config for details (default 30s) + Interval for checking for changes in eureka. This works only if eureka_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config for details (default 30s) -promscrape.fileSDCheckInterval duration - Interval for checking for changes in 'file_sd_config'. 
See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config for details (default 5m0s) + Interval for checking for changes in 'file_sd_config'. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config for details (default 5m0s) -promscrape.gceSDCheckInterval duration - Interval for checking for changes in gce. This works only if gce_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#gce_sd_config for details (default 1m0s) + Interval for checking for changes in gce. This works only if gce_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#gce_sd_config for details (default 1m0s) -promscrape.httpSDCheckInterval duration - Interval for checking for changes in http endpoint service discovery. This works only if http_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config for details (default 1m0s) + Interval for checking for changes in http endpoint service discovery. This works only if http_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config for details (default 1m0s) -promscrape.kubernetes.apiServerTimeout duration - How frequently to reload the full state from Kubernetes API server (default 30m0s) + How frequently to reload the full state from Kubernetes API server (default 30m0s) -promscrape.kubernetesSDCheckInterval duration - Interval for checking for changes in Kubernetes API server. This works only if kubernetes_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config for details (default 30s) + Interval for checking for changes in Kubernetes API server.
This works only if kubernetes_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config for details (default 30s) -promscrape.maxDroppedTargets int - The maximum number of droppedTargets to show at /api/v1/targets page. Increase this value if your setup drops more scrape targets during relabeling and you need investigating labels for all the dropped targets. Note that the increased number of tracked dropped targets may result in increased memory usage (default 1000) + The maximum number of droppedTargets to show at /api/v1/targets page. Increase this value if your setup drops more scrape targets during relabeling and you need investigating labels for all the dropped targets. Note that the increased number of tracked dropped targets may result in increased memory usage (default 1000) -promscrape.maxResponseHeadersSize size - The maximum size of http response headers from Prometheus scrape targets - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 4096) + The maximum size of http response headers from Prometheus scrape targets + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 4096) -promscrape.maxScrapeSize size - The maximum size of scrape response in bytes to process from Prometheus targets. Bigger responses are rejected - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16777216) + The maximum size of scrape response in bytes to process from Prometheus targets. Bigger responses are rejected + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16777216) -promscrape.minResponseSizeForStreamParse size - The minimum target response size for automatic switching to stream parsing mode, which can reduce memory usage. 
See https://docs.victoriametrics.com/vmagent.html#stream-parsing-mode - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 1000000) + The minimum target response size for automatic switching to stream parsing mode, which can reduce memory usage. See https://docs.victoriametrics.com/vmagent.html#stream-parsing-mode + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 1000000) -promscrape.noStaleMarkers - Whether to disable sending Prometheus stale markers for metrics when scrape target disappears. This option may reduce memory usage if stale markers aren't needed for your setup. This option also disables populating the scrape_series_added metric. See https://prometheus.io/docs/concepts/jobs_instances/#automatically-generated-labels-and-time-series + Whether to disable sending Prometheus stale markers for metrics when scrape target disappears. This option may reduce memory usage if stale markers aren't needed for your setup. This option also disables populating the scrape_series_added metric. See https://prometheus.io/docs/concepts/jobs_instances/#automatically-generated-labels-and-time-series -promscrape.openstackSDCheckInterval duration - Interval for checking for changes in openstack API server. This works only if openstack_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config for details (default 30s) + Interval for checking for changes in openstack API server. This works only if openstack_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config for details (default 30s) -promscrape.seriesLimitPerTarget int - Optional limit on the number of unique time series a single scrape target can expose. 
See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter for more info + Optional limit on the number of unique time series a single scrape target can expose. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter for more info -promscrape.streamParse - Whether to enable stream parsing for metrics obtained from scrape targets. This may be useful for reducing memory usage when millions of metrics are exposed per each scrape target. It is possible to set 'stream_parse: true' individually per each 'scrape_config' section in '-promscrape.config' for fine-grained control + Whether to enable stream parsing for metrics obtained from scrape targets. This may be useful for reducing memory usage when millions of metrics are exposed per each scrape target. It is possible to set 'stream_parse: true' individually per each 'scrape_config' section in '-promscrape.config' for fine-grained control -promscrape.suppressDuplicateScrapeTargetErrors - Whether to suppress 'duplicate scrape target' errors; see https://docs.victoriametrics.com/vmagent.html#troubleshooting for details + Whether to suppress 'duplicate scrape target' errors; see https://docs.victoriametrics.com/vmagent.html#troubleshooting for details -promscrape.suppressScrapeErrors - Whether to suppress scrape errors logging. The last error for each target is always available at '/targets' page even if scrape errors logging is suppressed + Whether to suppress scrape errors logging. The last error for each target is always available at '/targets' page even if scrape errors logging is suppressed -relabelConfig string - Optional path to a file with relabeling rules, which are applied to all the ingested metrics. The path can point either to local file or to http url. See https://docs.victoriametrics.com/#relabeling for details. The config is reloaded on SIGHUP signal + Optional path to a file with relabeling rules, which are applied to all the ingested metrics.
The path can point either to local file or to http url. See https://docs.victoriametrics.com/#relabeling for details. The config is reloaded on SIGHUP signal -relabelDebug - Whether to log metrics before and after relabeling with -relabelConfig. If the -relabelDebug is enabled, then the metrics aren't sent to storage. This is useful for debugging the relabeling configs + Whether to log metrics before and after relabeling with -relabelConfig. If the -relabelDebug is enabled, then the metrics aren't sent to storage. This is useful for debugging the relabeling configs -retentionPeriod value - Data with timestamps outside the retentionPeriod is automatically deleted - The following optional suffixes are supported: h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 1) + Data with timestamps outside the retentionPeriod is automatically deleted + The following optional suffixes are supported: h (hour), d (day), w (week), y (year). If suffix isn't set, then the duration is counted in months (default 1) -search.cacheTimestampOffset duration - The maximum duration since the current time for response data, which is always queried from the original raw data, without using the response cache. Increase this value if you see gaps in responses due to time synchronization issues between VictoriaMetrics and data sources. See also -search.disableAutoCacheReset (default 5m0s) + The maximum duration since the current time for response data, which is always queried from the original raw data, without using the response cache. Increase this value if you see gaps in responses due to time synchronization issues between VictoriaMetrics and data sources. 
See also -search.disableAutoCacheReset (default 5m0s) -search.disableAutoCacheReset - Whether to disable automatic response cache reset if a sample with timestamp outside -search.cacheTimestampOffset is inserted into VictoriaMetrics + Whether to disable automatic response cache reset if a sample with timestamp outside -search.cacheTimestampOffset is inserted into VictoriaMetrics -search.disableCache - Whether to disable response caching. This may be useful during data backfilling + Whether to disable response caching. This may be useful during data backfilling -search.graphiteMaxPointsPerSeries int - The maximum number of points per series Graphite render API can return (default 1000000) + The maximum number of points per series Graphite render API can return (default 1000000) -search.graphiteStorageStep duration - The interval between datapoints stored in the database. It is used at Graphite Render API handler for normalizing the interval between datapoints in case it isn't normalized. It can be overridden by sending 'storage_step' query arg to /render API or by sending the desired interval via 'Storage-Step' http header during querying /render API (default 10s) + The interval between datapoints stored in the database. It is used at Graphite Render API handler for normalizing the interval between datapoints in case it isn't normalized. It can be overridden by sending 'storage_step' query arg to /render API or by sending the desired interval via 'Storage-Step' http header during querying /render API (default 10s) -search.latencyOffset duration - The time when data points become visible in query results after the collection. Too small value can result in incomplete last points for query results (default 30s) + The time when data points become visible in query results after the collection. Too small value can result in incomplete last points for query results (default 30s) -search.logSlowQueryDuration duration - Log queries with execution time exceeding this value.
Zero disables slow query logging (default 5s) + Log queries with execution time exceeding this value. Zero disables slow query logging (default 5s) -search.maxConcurrentRequests int - The maximum number of concurrent search requests. It shouldn't be high, since a single request can saturate all the CPU cores. See also -search.maxQueueDuration (default 8) + The maximum number of concurrent search requests. It shouldn't be high, since a single request can saturate all the CPU cores. See also -search.maxQueueDuration (default 8) -search.maxExportDuration duration - The maximum duration for /api/v1/export call (default 720h0m0s) + The maximum duration for /api/v1/export call (default 720h0m0s) -search.maxLookback duration - Synonym to -search.lookback-delta from Prometheus. The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via max_lookback arg. See also '-search.maxStalenessInterval' flag, which has the same meaning due to historical reasons + Synonym to -search.lookback-delta from Prometheus. The value is dynamically detected from interval between time series datapoints if not set. It can be overridden on per-query basis via max_lookback arg. See also '-search.maxStalenessInterval' flag, which has the same meaning due to historical reasons -search.maxPointsPerTimeseries int - The maximum points per a single timeseries returned from /api/v1/query_range. This option doesn't limit the number of scanned raw samples in the database. The main purpose of this option is to limit the number of per-series points returned to graphing UI such as Grafana. There is no sense in setting this limit to values bigger than the horizontal resolution of the graph (default 30000) + The maximum points per a single timeseries returned from /api/v1/query_range. This option doesn't limit the number of scanned raw samples in the database.
The main purpose of this option is to limit the number of per-series points returned to graphing UI such as Grafana. There is no sense in setting this limit to values bigger than the horizontal resolution of the graph (default 30000) -search.maxQueryDuration duration - The maximum duration for query execution (default 30s) + The maximum duration for query execution (default 30s) -search.maxQueryLen size - The maximum search query length in bytes - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16384) + The maximum search query length in bytes + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16384) -search.maxQueueDuration duration - The maximum time the request waits for execution when -search.maxConcurrentRequests limit is reached; see also -search.maxQueryDuration (default 10s) + The maximum time the request waits for execution when -search.maxConcurrentRequests limit is reached; see also -search.maxQueryDuration (default 10s) -search.maxSamplesPerQuery int - The maximum number of raw samples a single query can process across all time series. This protects from heavy queries, which select unexpectedly high number of raw samples. See also -search.maxSamplesPerSeries (default 1000000000) + The maximum number of raw samples a single query can process across all time series. This protects from heavy queries, which select unexpectedly high number of raw samples. See also -search.maxSamplesPerSeries (default 1000000000) -search.maxSamplesPerSeries int - The maximum number of raw samples a single query can scan per each time series. This option allows limiting memory usage (default 30000000) + The maximum number of raw samples a single query can scan per each time series. This option allows limiting memory usage (default 30000000) -search.maxStalenessInterval duration - The maximum interval for staleness calculations. 
By default it is automatically calculated from the median interval between samples. This flag could be useful for tuning Prometheus data model closer to Influx-style data model. See https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness for details. See also '-search.maxLookback' flag, which has the same meaning due to historical reasons + The maximum interval for staleness calculations. By default it is automatically calculated from the median interval between samples. This flag could be useful for tuning Prometheus data model closer to Influx-style data model. See https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness for details. See also '-search.maxLookback' flag, which has the same meaning due to historical reasons -search.maxStatusRequestDuration duration - The maximum duration for /api/v1/status/* requests (default 5m0s) + The maximum duration for /api/v1/status/* requests (default 5m0s) -search.maxStepForPointsAdjustment duration - The maximum step when /api/v1/query_range handler adjusts points with timestamps closer than -search.latencyOffset to the current time. The adjustment is needed because such points may contain incomplete data (default 1m0s) + The maximum step when /api/v1/query_range handler adjusts points with timestamps closer than -search.latencyOffset to the current time. 
The adjustment is needed because such points may contain incomplete data (default 1m0s) -search.maxTagKeys int - The maximum number of tag keys returned from /api/v1/labels (default 100000) + The maximum number of tag keys returned from /api/v1/labels (default 100000) -search.maxTagValueSuffixesPerSearch int - The maximum number of tag value suffixes returned from /metrics/find (default 100000) + The maximum number of tag value suffixes returned from /metrics/find (default 100000) -search.maxTagValues int - The maximum number of tag values returned from /api/v1/label//values (default 100000) + The maximum number of tag values returned from /api/v1/label//values (default 100000) -search.maxUniqueTimeseries int - The maximum number of unique time series each search can scan. This option allows limiting memory usage (default 300000) + The maximum number of unique time series each search can scan. This option allows limiting memory usage (default 300000) -search.minStalenessInterval duration - The minimum interval for staleness calculations. This flag could be useful for removing gaps on graphs generated from time series with irregular intervals between samples. See also '-search.maxStalenessInterval' + The minimum interval for staleness calculations. This flag could be useful for removing gaps on graphs generated from time series with irregular intervals between samples. See also '-search.maxStalenessInterval' -search.noStaleMarkers - Set this flag to true if the database doesn't contain Prometheus stale markers, so there is no need in spending additional CPU time on its handling. Staleness markers may exist only in data obtained from Prometheus scrape targets + Set this flag to true if the database doesn't contain Prometheus stale markers, so there is no need in spending additional CPU time on its handling. 
Staleness markers may exist only in data obtained from Prometheus scrape targets -search.queryStats.lastQueriesCount int - Query stats for /api/v1/status/top_queries is tracked on this number of last queries. Zero value disables query stats tracking (default 20000) + Query stats for /api/v1/status/top_queries is tracked on this number of last queries. Zero value disables query stats tracking (default 20000) -search.queryStats.minQueryDuration duration - The minimum duration for queries to track in query stats at /api/v1/status/top_queries. Queries with lower duration are ignored in query stats (default 1ms) + The minimum duration for queries to track in query stats at /api/v1/status/top_queries. Queries with lower duration are ignored in query stats (default 1ms) -search.resetCacheAuthKey string - Optional authKey for resetting rollup cache via /internal/resetRollupResultCache call + Optional authKey for resetting rollup cache via /internal/resetRollupResultCache call -search.treatDotsAsIsInRegexps - Whether to treat dots as is in regexp label filters used in queries. For example, foo{bar=~"a.b.c"} will be automatically converted to foo{bar=~"a\\.b\\.c"}, i.e. all the dots in regexp filters will be automatically escaped in order to match only dot char instead of matching any char. Dots in ".+", ".*" and ".{n}" regexps aren't escaped. This option is DEPRECATED in favor of {__graphite__="a.*.c"} syntax for selecting metrics matching the given Graphite metrics filter + Whether to treat dots as is in regexp label filters used in queries. For example, foo{bar=~"a.b.c"} will be automatically converted to foo{bar=~"a\\.b\\.c"}, i.e. all the dots in regexp filters will be automatically escaped in order to match only dot char instead of matching any char. Dots in ".+", ".*" and ".{n}" regexps aren't escaped. 
This option is DEPRECATED in favor of {__graphite__="a.*.c"} syntax for selecting metrics matching the given Graphite metrics filter -selfScrapeInstance string - Value for 'instance' label, which is added to self-scraped metrics (default "self") + Value for 'instance' label, which is added to self-scraped metrics (default "self") -selfScrapeInterval duration - Interval for self-scraping own metrics at /metrics page + Interval for self-scraping own metrics at /metrics page -selfScrapeJob string - Value for 'job' label, which is added to self-scraped metrics (default "victoria-metrics") + Value for 'job' label, which is added to self-scraped metrics (default "victoria-metrics") -smallMergeConcurrency int - The maximum number of CPU cores to use for small merges. Default value is used if set to 0 + The maximum number of CPU cores to use for small merges. Default value is used if set to 0 -snapshotAuthKey string - authKey, which must be passed in query string to /snapshot* pages + authKey, which must be passed in query string to /snapshot* pages -sortLabels - Whether to sort labels for incoming samples before writing them to storage. This may be needed for reducing memory usage at storage when the order of labels in incoming samples is random. For example, if m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}. Enabled sorting for labels can slow down ingestion performance a bit + Whether to sort labels for incoming samples before writing them to storage. This may be needed for reducing memory usage at storage when the order of labels in incoming samples is random. For example, if m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}. Enabled sorting for labels can slow down ingestion performance a bit -storage.cacheSizeIndexDBDataBlocks size - Overrides max size for indexdb/dataBlocks cache. 
See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Overrides max size for indexdb/dataBlocks cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -storage.cacheSizeIndexDBIndexBlocks size - Overrides max size for indexdb/indexBlocks cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Overrides max size for indexdb/indexBlocks cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -storage.cacheSizeStorageTSID size - Overrides max size for storage/tsid cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Overrides max size for storage/tsid cache. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#cache-tuning + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -storage.maxDailySeries int - The maximum number of unique series can be added to the storage during the last 24 hours. Excess series are logged and dropped. This can be useful for limiting series churn rate. See also -storage.maxHourlySeries + The maximum number of unique series can be added to the storage during the last 24 hours. Excess series are logged and dropped. This can be useful for limiting series churn rate. See also -storage.maxHourlySeries -storage.maxHourlySeries int - The maximum number of unique series can be added to the storage during the last hour. 
Excess series are logged and dropped. This can be useful for limiting series cardinality. See also -storage.maxDailySeries + The maximum number of unique series can be added to the storage during the last hour. Excess series are logged and dropped. This can be useful for limiting series cardinality. See also -storage.maxDailySeries -storage.minFreeDiskSpaceBytes size - The minimum free disk space at -storageDataPath after which the storage stops accepting new data - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 10000000) + The minimum free disk space at -storageDataPath after which the storage stops accepting new data + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 10000000) -storageDataPath string - Path to storage data (default "victoria-metrics-data") + Path to storage data (default "victoria-metrics-data") -tls - Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set + Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set -tlsCertFile string - Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated -tlsKeyFile string - Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS key. Used only if -tls is set. 
The provided key file is automatically re-read every second, so it can be dynamically updated -version - Show VictoriaMetrics version + Show VictoriaMetrics version ``` diff --git a/docs/url-examples.md b/docs/url-examples.md index ecdb95b27..6ab0adcdf 100644 --- a/docs/url-examples.md +++ b/docs/url-examples.md @@ -4,11 +4,10 @@ sort: 20 # VictoriaMetrics API examples - ## /api/v1/admin/tsdb/delete_series **Deletes time series from VictoriaMetrics** - + Single:
@@ -28,13 +27,13 @@ curl 'http://:8481/delete/0/prometheus/api/v1/admin/tsdb/delete_series
Additional information: -* [How to delete time series](https://docs.victoriametrics.com/#how-to-delete-time-series) +* [How to delete time series](https://docs.victoriametrics.com/#how-to-delete-time-series) ## /api/v1/export/csv **Exports CSV data from VictoriaMetrics** - + Single:
@@ -43,7 +42,7 @@ curl 'http://:8428/api/v1/export/csv?format=__name__,__val ```
- + Cluster:
@@ -53,11 +52,11 @@ curl -G 'http://:8481/select/0/prometheus/api/v1/export/csv?format=__n
-Additional information: +Additional information: + * [How to export time series](https://docs.victoriametrics.com/#how-to-export-csv-data) * [URL Format](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format) - ## /api/v1/export/native **Exports data from VictoriaMetrics in native format** @@ -81,8 +80,8 @@ curl -G 'http://:8481/select/0/prometheus/api/v1/export/native?match=v More information: -* [How to export data in native format](https://docs.victoriametrics.com/#how-to-export-data-in-native-format) +* [How to export data in native format](https://docs.victoriametrics.com/#how-to-export-data-in-native-format) ## /api/v1/import @@ -107,13 +106,13 @@ curl --data-binary "@import.txt" -X POST 'http://:8480/insert/promethe Additional information: + * [How to import time series data](https://docs.victoriametrics.com/#how-to-import-time-series-data) - -## /api/v1/import/csv +## /api/v1/import/csv **Imports CSV data to VictoriaMetrics** - + Single:
@@ -134,15 +133,15 @@ curl -d "GOOG,1.23,4.56,NYSE" 'http://:8480/insert/0/prometheus/api/v1
-Additional information: +Additional information: + * [URL format](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format) * [How to import CSV data](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-import-csv-data) - ## /datadog/api/v1/series **Sends data from DataDog agent to VM** - + Single:
@@ -198,8 +197,8 @@ echo '
Additional information: -* [How to send data from datadog agent](https://docs.victoriametrics.com/#how-to-send-data-from-datadog-agent) +* [How to send data from datadog agent](https://docs.victoriametrics.com/#how-to-send-data-from-datadog-agent) ## /graphite/metrics/find @@ -213,7 +212,7 @@ curl -G 'http://localhost:8428/graphite/metrics/find?query=vm_http_request_error ``` - + Cluster:
@@ -222,13 +221,13 @@ curl -G 'http://:8481/select/0/graphite/metrics/find?query=vm_http_req ```
- + Additional information: + * [Metrics find](https://graphite-api.readthedocs.io/en/latest/api.html#metrics-find) * [How to send data from graphite compatible agents such as statsd](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-graphite-compatible-agents-such-as-statsd) * [URL Format](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format) - ## /influx/write **Writes data with InfluxDB line protocol to VictoriaMetrics** @@ -241,7 +240,7 @@ curl -d 'measurement,tag1=value1,tag2=value2 field1=123,field2=1.23' -X POST 'ht ``` - + Cluster:
@@ -250,10 +249,10 @@ curl -d 'measurement,tag1=value1,tag2=value2 field1=123,field2=1.23' -X POST 'ht ```
- -Additional information: -* [How to send data from influxdb compatible agents such as telegraf](https://docs.victoriametrics.com/#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf) +Additional information: + +* [How to send data from influxdb compatible agents such as telegraf](https://docs.victoriametrics.com/#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf) ## TCP and UDP @@ -270,7 +269,7 @@ echo "put foo.bar.baz `date +%s` 123 tag1=value1 tag2=value2" | nc -N localhost ``` - + Cluster:
@@ -279,9 +278,9 @@ echo "put foo.bar.baz `date +%s` 123 tag1=value1 tag2=value2 VictoriaMetrics_Ac ```
- + Enable HTTP server for OpenTSDB /api/put requests by setting `-opentsdbHTTPListenAddr` command-line flag. - + Single:
@@ -290,7 +289,7 @@ curl -H 'Content-Type: application/json' -d '[{"metric":"foo","value":45.34},{"m ```
- + Cluster:
@@ -300,16 +299,16 @@ curl -H 'Content-Type: application/json' -d '[{"metric":"foo","value":45.34},{"m ```
- + Additional information: + * [Api http put](http://opentsdb.net/docs/build/html/api_http/put.html) * [How to send data from opentsdb compatible agents](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-opentsdb-compatible-agents) - **How to write data with Graphite plaintext protocol to VictoriaMetrics** Enable Graphite receiver in VictoriaMetrics by setting `-graphiteListenAddr` command-line flag. - + Single:
@@ -319,7 +318,7 @@ echo "foo.bar.baz;tag1=value1;tag2=value2 123 `date +%s`" | ```
- + Cluster:
@@ -332,5 +331,6 @@ echo "foo.bar.baz;tag1=value1;tag2=value2;VictoriaMetrics_AccountID=42 123 `date Additional information: `VictoriaMetrics_AccountID=42` - tag that indicated tenant ID. + * [Request handler](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/a3eafd2e7fc75776dfc19d3c68c85589454d9dce/app/vminsert/opentsdb/request_handler.go#L47) * [How to send data from graphite compatible agents such as statsd](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-send-data-from-graphite-compatible-agents-such-as-statsd) diff --git a/docs/vmagent.md b/docs/vmagent.md index 039426a55..163c8e02e 100644 --- a/docs/vmagent.md +++ b/docs/vmagent.md @@ -10,7 +10,6 @@ or any other Prometheus-compatible storage systems that support the `remote_writ vmagent - ## Motivation While VictoriaMetrics provides an efficient solution to store and observe metrics, our users needed something fast @@ -18,7 +17,6 @@ and RAM friendly to scrape metrics from Prometheus-compatible exporters into Vic Also, we found that our user's infrastructure are like snowflakes in that no two are alike. Therefore we decided to add more flexibility to `vmagent` such as the ability to push metrics additionally to pulling them. We did our best and will continue to improve `vmagent`. - ## Features * Can be used as a drop-in replacement for Prometheus for scraping targets such as [node_exporter](https://github.com/prometheus/node_exporter). See [Quick Start](#quick-start) for details. @@ -71,7 +69,6 @@ Then send InfluxDB data to `http://vmagent-host:8429`. See [these docs](https:// Pass `-help` to `vmagent` in order to see [the full list of supported command-line flags with their descriptions](#advanced-usage). - ## Configuration update `vmagent` should be restarted in order to update config options set via command-line args. 
@@ -79,6 +76,7 @@ Pass `-help` to `vmagent` in order to see [the full list of supported command-li `vmagent` supports multiple approaches for reloading configs from updated config files such as `-promscrape.config`, `-remoteWrite.relabelConfig` and `-remoteWrite.urlRelabelConfig`: * Sending `SIGHUP` signal to `vmagent` process: + ```bash kill -SIGHUP `pidof vmagent` ``` @@ -87,10 +85,8 @@ Pass `-help` to `vmagent` in order to see [the full list of supported command-li There is also `-promscrape.configCheckInterval` command-line option, which can be used for automatic reloading configs from updated `-promscrape.config` file. - ## Use cases - ### IoT and Edge monitoring `vmagent` can run and collect metrics in IoT and industrial networks with unreliable or scheduled connections to their remote storage. @@ -101,28 +97,24 @@ The maximum buffer size can be limited with `-remoteWrite.maxDiskUsagePerURL`. `vmagent` works on various architectures from the IoT world - 32-bit arm, 64-bit arm, ppc64, 386, amd64. See [the corresponding Makefile rules](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmagent/Makefile) for details. - ### Drop-in replacement for Prometheus If you use Prometheus only for scraping metrics from various targets and forwarding those metrics to remote storage then `vmagent` can replace Prometheus. Typically, `vmagent` requires lower amounts of RAM, CPU and network bandwidth compared with Prometheus. See [these docs](#how-to-collect-metrics-in-prometheus-format) for details. - ### Replication and high availability `vmagent` replicates the collected metrics among multiple remote storage instances configured via `-remoteWrite.url` args. If a single remote storage instance temporarily is out of service, then the collected data remains available in another remote storage instance.
`vmagent` buffers the collected data in files at `-remoteWrite.tmpDataPath` until the remote storage becomes available again and then it sends the buffered data to the remote storage in order to prevent data gaps. - ### Relabeling and filtering `vmagent` can add, remove or update labels on the collected data before sending it to the remote storage. Additionally, it can remove unwanted samples via Prometheus-like relabeling before sending the collected data to remote storage. Please see [these docs](#relabeling) for details. - ### Splitting data streams among multiple systems `vmagent` supports splitting the collected data between multiple destinations with the help of `-remoteWrite.urlRelabelConfig`, @@ -130,7 +122,6 @@ which is applied independently for each configured `-remoteWrite.url` destinatio data among long-term remote storage, short-term remote storage and a real-time analytical system [built on top of Kafka](https://github.com/Telefonica/prometheus-kafka-adapter). Note that each destination can receive its own subset of the collected data due to per-destination relabeling via `-remoteWrite.urlRelabelConfig`. - ### Prometheus remote_write proxy `vmagent` can be used as a proxy for Prometheus data sent via Prometheus `remote_write` protocol. It can accept data via the `remote_write` API @@ -138,7 +129,6 @@ at the `/api/v1/write` endpoint. Then apply relabeling and filtering and proxy it The `vmagent` can be configured to encrypt the incoming `remote_write` requests with `-tls*` command-line flags. Also, Basic Auth can be enabled for the incoming `remote_write` requests with `-httpAuth.*` command-line flags. - ### remote_write for clustered version While `vmagent` can accept data in several supported protocols (OpenTSDB, Influx, Prometheus, Graphite) and scrape data from various targets, writes are always performed in Prometheus remote_write protocol.
Therefore for the [clustered version](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html), `-remoteWrite.url` the command-line flag should be configured as `://:8480/insert//prometheus/api/v1/write` according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format). There is also support for multitenant writes. See [these docs](#multitenancy). @@ -147,7 +137,6 @@ While `vmagent` can accept data in several supported protocols (OpenTSDB, Influx By default `vmagent` collects the data without tenant identifiers and routes it to the configured `-remoteWrite.url`. But it can accept multitenant data if `-remoteWrite.multitenantURL` is set. In this case it accepts multitenant data at `http://vmagent:8429/insert//...` in the same way as cluster version of VictoriaMetrics does according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format) and routes it to `<-remoteWrite.multitenantURL>/insert//prometheus/api/v1/write`. If multiple `-remoteWrite.multitenantURL` command-line options are set, then `vmagent` replicates the collected data across all the configured urls. This allows using a single `vmagent` instance in front of VictoriaMetrics clusters for processing the data from all the tenants. - ## How to collect metrics in Prometheus format Specify the path to `prometheus.yml` file via `-promscrape.config` command-line flag. `vmagent` takes into account the following @@ -215,7 +204,6 @@ entries to 60s. Run `vmagent -help` in order to see default values for the `-pro The file pointed by `-promscrape.config` may contain `%{ENV_VAR}` placeholders which are substituted by the corresponding `ENV_VAR` environment variable values. - ## Loading scrape configs from multiple files `vmagent` supports loading scrape configs from multiple files specified in the `scrape_config_files` section of `-promscrape.config` file. 
For example, the following `-promscrape.config` instructs `vmagent` loading scrape configs from all the `*.yml` files under `configs` directory, from `single_scrape_config.yml` local file and from `https://config-server/scrape_config.yml` url: @@ -240,7 +228,6 @@ Every referred file can contain arbitrary number of [supported scrape configs](# `vmagent` dynamically reloads these files on `SIGHUP` signal or on the request to `http://vmagent:8429/-/reload`. - ## Unsupported Prometheus config sections `vmagent` doesn't support the following sections in Prometheus config file passed to `-promscrape.config` command-line flag: @@ -253,7 +240,6 @@ The list of supported service discovery types is available [here](#how-to-collec Additionally `vmagent` doesn't support `refresh_interval` option at service discovery sections. This option is substituted with `-promscrape.*CheckInterval` command-line options, which are specific per each service discovery type. See [the full list of command-line flags for vmagent](#advanced-usage). - ## Adding labels to metrics Labels can be added to metrics by the following mechanisms: @@ -265,7 +251,6 @@ Labels can be added to metrics by the following mechanisms: /path/to/vmagent -remoteWrite.label=datacenter=foobar ... ``` - ## Relabeling VictoriaMetrics components (including `vmagent`) support Prometheus-compatible relabeling. 
@@ -324,7 +309,6 @@ You can read more about relabeling in the following articles: * [Extracting labels from legacy metric names](https://www.robustperception.io/extracting-labels-from-legacy-metric-names) * [relabel_configs vs metric_relabel_configs](https://www.robustperception.io/relabel_configs-vs-metric_relabel_configs) - ## Prometheus staleness markers `vmagent` sends [Prometheus staleness markers](https://www.robustperception.io/staleness-and-promql) to `-remoteWrite.url` in the following cases: @@ -336,16 +320,15 @@ You can read more about relabeling in the following articles: Prometheus staleness markers' tracking needs additional memory, since it must store the previous response body per each scrape target in order to compare it to the current response body. The memory usage may be reduced by passing `-promscrape.noStaleMarkers` command-line flag to `vmagent`. This disables staleness tracking. This also disables tracking the number of new time series per each scrape with the auto-generated `scrape_series_added` metric. See [these docs](https://prometheus.io/docs/concepts/jobs_instances/#automatically-generated-labels-and-time-series) for details. - ## Stream parsing mode By default `vmagent` reads the full response body from scrape target into memory, then parses it, applies [relabeling](#relabeling) and then pushes the resulting metrics to the configured `-remoteWrite.url`. This mode works good for the majority of cases when the scrape target exposes small number of metrics (e.g. less than 10 thousand). But this mode may take big amounts of memory when the scrape target exposes big number of metrics. In this case it is recommended enabling stream parsing mode. When this mode is enabled, then `vmagent` reads response from scrape target in chunks, then immediately processes every chunk and pushes the processed metrics to remote storage. This allows saving memory when scraping targets that expose millions of metrics. 
Stream parsing mode is automatically enabled for scrape targets returning response bodies with sizes bigger than the `-promscrape.minResponseSizeForStreamParse` command-line flag value. Additionally, the stream parsing mode can be explicitly enabled in the following places: -- Via `-promscrape.streamParse` command-line flag. In this case all the scrape targets defined in the file pointed by `-promscrape.config` are scraped in stream parsing mode. -- Via `stream_parse: true` option at `scrape_configs` section. In this case all the scrape targets defined in this section are scraped in stream parsing mode. -- Via `__stream_parse__=true` label, which can be set via [relabeling](#relabeling) at `relabel_configs` section. In this case stream parsing mode is enabled for the corresponding scrape targets. Typical use case: to set the label via [Kubernetes annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/) for targets exposing big number of metrics. +* Via `-promscrape.streamParse` command-line flag. In this case all the scrape targets defined in the file pointed by `-promscrape.config` are scraped in stream parsing mode. +* Via `stream_parse: true` option at `scrape_configs` section. In this case all the scrape targets defined in this section are scraped in stream parsing mode. +* Via `__stream_parse__=true` label, which can be set via [relabeling](#relabeling) at `relabel_configs` section. In this case stream parsing mode is enabled for the corresponding scrape targets. Typical use case: to set the label via [Kubernetes annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/) for targets exposing big number of metrics. Examples: @@ -365,7 +348,6 @@ scrape_configs: Note that `sample_limit` and `series_limit` options cannot be used in stream parsing mode because the parsed data is pushed to remote storage as soon as it is parsed. 
- ## Scraping big number of targets A single `vmagent` instance can scrape tens of thousands of scrape targets. Sometimes this isn't enough due to limitations on CPU, network, RAM, etc. @@ -393,7 +375,6 @@ start a cluster of three `vmagent` instances, where each target is scraped by tw If each target is scraped by multiple `vmagent` instances, then data deduplication must be enabled at remote storage pointed by `-remoteWrite.url`. See [these docs](https://docs.victoriametrics.com/#deduplication) for details. - ## Scraping targets via a proxy `vmagent` supports scraping targets via http, https and socks5 proxies. Proxy address must be specified in `proxy_url` option. For example, the following scrape config instructs @@ -433,9 +414,9 @@ scrape_configs: By default `vmagent` doesn't limit the number of time series each scrape target can expose. The limit can be enforced in the following places: -- Via `-promscrape.seriesLimitPerTarget` command-line option. This limit is applied individually to all the scrape targets defined in the file pointed by `-promscrape.config`. -- Via `series_limit` config option at `scrape_config` section. This limit is applied individually to all the scrape targets defined in the given `scrape_config`. -- Via `__series_limit__` label, which can be set with [relabeling](#relabeling) at `relabel_configs` section. This limit is applied to the corresponding scrape targets. Typical use case: to set the limit via [Kubernetes annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/) for targets, which may expose too high number of time series. +* Via `-promscrape.seriesLimitPerTarget` command-line option. This limit is applied individually to all the scrape targets defined in the file pointed by `-promscrape.config`. +* Via `series_limit` config option at `scrape_config` section. This limit is applied individually to all the scrape targets defined in the given `scrape_config`. 
+* Via `__series_limit__` label, which can be set with [relabeling](#relabeling) at `relabel_configs` section. This limit is applied to the corresponding scrape targets. Typical use case: to set the limit via [Kubernetes annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/) for targets, which may expose too high number of time series. All the scraped metrics are dropped for time series exceeding the given limit. The exceeded limit can be [monitored](#monitoring) via `promscrape_series_limit_rows_dropped_total` metric. @@ -455,7 +436,6 @@ The exceeded limits can be [monitored](#monitoring) with the following metrics: These limits are approximate, so `vmagent` can underflow/overflow the limit by a small percentage (usually less than 1%). - ## Monitoring `vmagent` exports various metrics in Prometheus exposition format at `http://vmagent-host:8429/metrics` page. We recommend setting up regular scraping of this page @@ -474,7 +454,6 @@ This information may be useful for debugging target relabeling. * `http://vmagent-host:8429/ready`. This handler returns http 200 status code when `vmagent` finishes it's initialization for all service_discovery configs. It may be useful to perform `vmagent` rolling update without any scrape loss. - ## Troubleshooting * We recommend you [set up the official Grafana dashboard](#monitoring) in order to monitor the state of `vmagent'. @@ -538,12 +517,14 @@ It may be useful to perform `vmagent` rolling update without any scrape loss. 
See the available options below if you prefer fixing the root cause of the error: The following relabeling rule may be added to `relabel_configs` section in order to filter out pods with unneeded ports: + ```yml - action: keep_if_equal source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_port, __meta_kubernetes_pod_container_port_number] ``` The following relabeling rule may be added to `relabel_configs` section in order to filter out init container pods: + ```yml - action: drop source_labels: [__meta_kubernetes_pod_container_init] @@ -559,7 +540,6 @@ It may be useful to perform `vmagent` rolling update without any scrape loss. The enterprise version of vmagent is available for evaluation at [releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) page in `vmutils-*-enteprise.tar.gz` archives and in [docker images](https://hub.docker.com/r/victoriametrics/vmagent/tags) with tags containing `enterprise` suffix. - ### Reading metrics from Kafka [Enterprise version](https://victoriametrics.com/products/enterprise/) of `vmagent` can read metrics in various formats from Kafka messages. These formats can be configured with `-kafka.consumer.topic.defaultFormat` or `-kafka.consumer.topic.format` command-line options. The following formats are supported: @@ -595,7 +575,6 @@ topic = "influx" data_format = "influx" ``` - #### Command-line flags for Kafka consumer These command-line flags are available only in [enterprise](https://victoriametrics.com/products/enterprise/) version of `vmagent`, which can be downloaded for evaluation from [releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) page (see `vmutils-*-enteprise.tar.gz` archives) and from [docker images](https://hub.docker.com/r/victoriametrics/vmagent/tags) with tags containing `enterprise` suffix. @@ -635,7 +614,6 @@ These command-line flags are available only in [enterprise](https://victoriametr Additional Kafka options can be passed as query params to `-remoteWrite.url`. 
For instance, `kafka://localhost:9092/?topic=prom-rw&client.id=my-favorite-id` sets `client.id` Kafka option to `my-favorite-id`. The full list of Kafka options is available [here](https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md). - #### Kafka broker authorization and authentication Two types of auth are supported: @@ -652,12 +630,10 @@ Two types of auth are supported: ./bin/vmagent -remoteWrite.url=kafka://localhost:9092/?topic=prom-rw&security.protocol=SSL -remoteWrite.tlsCAFile=/opt/ca.pem -remoteWrite.tlsCertFile=/opt/cert.pem -remoteWrite.tlsKeyFile=/opt/key.pem ``` - ## How to build from sources We recommend using [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - `vmagent` is located in the `vmutils-*` archives . - ### Development build 1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.17. @@ -699,7 +675,6 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b 2. Run `make vmagent-arm-prod` or `make vmagent-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics). It builds `vmagent-arm-prod` or `vmagent-arm64-prod` binary respectively and puts it into the `bin` folder. - ## Profiling `vmagent` provides handlers for collecting the following [Go profiles](https://blog.golang.org/profiling-go-programs): @@ -728,7 +703,6 @@ The command for collecting CPU profile waits for 30 seconds before returning. The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof). - ## Advanced usage `vmagent` can be fine-tuned with various command-line flags. Run `./vmagent -help` in order to see the full list of these flags with their descriptions and default values: @@ -741,314 +715,314 @@ vmagent collects metrics data via popular data ingestion protocols and routes th See the docs at https://docs.victoriametrics.com/vmagent.html .
-configAuthKey string - Authorization key for accessing /config page. It must be passed via authKey query arg + Authorization key for accessing /config page. It must be passed via authKey query arg -csvTrimTimestamp duration - Trim timestamps when importing csv data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) + Trim timestamps when importing csv data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) -datadog.maxInsertRequestSize size - The maximum size in bytes of a single DataDog POST request to /api/v1/series - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 67108864) + The maximum size in bytes of a single DataDog POST request to /api/v1/series + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 67108864) -dryRun - Whether to check only config files without running vmagent. The following files are checked: -promscrape.config, -remoteWrite.relabelConfig, -remoteWrite.urlRelabelConfig . Unknown config entries aren't allowed in -promscrape.config by default. This can be changed by passing -promscrape.config.strictParse=false command-line flag + Whether to check only config files without running vmagent. The following files are checked: -promscrape.config, -remoteWrite.relabelConfig, -remoteWrite.urlRelabelConfig . Unknown config entries aren't allowed in -promscrape.config by default. This can be changed by passing -promscrape.config.strictParse=false command-line flag -enableTCP6 - Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used + Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used -envflag.enable - Whether to enable reading flags from environment variables additionally to command line. 
Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details + Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details -envflag.prefix string - Prefix for environment variables if -envflag.enable is set + Prefix for environment variables if -envflag.enable is set -eula - By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf + By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf -fs.disableMmap - Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread() + Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread() -graphiteListenAddr string - TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty + TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty -graphiteTrimTimestamp duration - Trim timestamps for Graphite data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 
1m) may be used for reducing disk space usage for timestamp data (default 1s) + Trim timestamps for Graphite data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) -http.connTimeout duration - Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) + Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) -http.disableResponseCompression - Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth + Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth -http.idleConnTimeout duration - Timeout for incoming idle http connections (default 1m0s) + Timeout for incoming idle http connections (default 1m0s) -http.maxGracefulShutdownDuration duration - The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s) + The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s) -http.pathPrefix string - An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. 
See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus + An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus -http.shutdownDelay duration - Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers + Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers -httpAuth.password string - Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty + Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty -httpAuth.username string - Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password + Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password -httpListenAddr string - TCP address to listen for http connections. Set this flag to empty value in order to disable listening on any port. This mode may be useful for running multiple vmagent instances on the same server. Note that /targets and /metrics pages aren't available if -httpListenAddr='' (default ":8429") + TCP address to listen for http connections. Set this flag to empty value in order to disable listening on any port. This mode may be useful for running multiple vmagent instances on the same server. 
Note that /targets and /metrics pages aren't available if -httpListenAddr='' (default ":8429") -import.maxLineLen size - The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 104857600) + The maximum length in bytes of a single line accepted by /api/v1/import; the line length can be limited with 'max_rows_per_line' query arg passed to /api/v1/export + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 104857600) -influx.databaseNames array - Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb - Supports an array of values separated by comma or specified via multiple flags. + Comma-separated list of database names to return from /query and /influx/query API. This can be needed for accepting data from Telegraf plugins such as https://github.com/fangli/fluent-plugin-influxdb + Supports an array of values separated by comma or specified via multiple flags. -influx.maxLineSize size - The maximum size in bytes for a single InfluxDB line during parsing - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 262144) + The maximum size in bytes for a single InfluxDB line during parsing + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 262144) -influxDBLabel string - Default label for the DB name sent over '?db={db_name}' query parameter (default "db") + Default label for the DB name sent over '?db={db_name}' query parameter (default "db") -influxListenAddr string - TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. 
This flag isn't needed when ingesting data over HTTP - just send it to http://:8429/write + TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. This flag isn't needed when ingesting data over HTTP - just send it to http://:8429/write -influxMeasurementFieldSeparator string - Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_") + Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol (default "_") -influxSkipMeasurement - Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator' + Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator' -influxSkipSingleField - Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metric name if InfluxDB line contains only a single field + Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metric name if InfluxDB line contains only a single field -influxTrimTimestamp duration - Trim timestamps for InfluxDB line protocol data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) + Trim timestamps for InfluxDB line protocol data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) -insert.maxQueueDuration duration - The maximum duration for waiting in the queue for insert requests due to -maxConcurrentInserts (default 1m0s) + The maximum duration for waiting in the queue for insert requests due to -maxConcurrentInserts (default 1m0s) -kafka.consumer.topic array - Kafka topic names for data consumption. - Supports an array of values separated by comma or specified via multiple flags.
+ Kafka topic names for data consumption. + Supports an array of values separated by comma or specified via multiple flags. -kafka.consumer.topic.basicAuth.password array - Optional basic auth password for -kafka.consumer.topic. Must be used in conjunction with any supported auth methods for kafka client, specified by flag -kafka.consumer.topic.options='security.protocol=SASL_SSL;sasl.mechanisms=PLAIN' - Supports an array of values separated by comma or specified via multiple flags. + Optional basic auth password for -kafka.consumer.topic. Must be used in conjunction with any supported auth methods for kafka client, specified by flag -kafka.consumer.topic.options='security.protocol=SASL_SSL;sasl.mechanisms=PLAIN' + Supports an array of values separated by comma or specified via multiple flags. -kafka.consumer.topic.basicAuth.username array - Optional basic auth username for -kafka.consumer.topic. Must be used in conjunction with any supported auth methods for kafka client, specified by flag -kafka.consumer.topic.options='security.protocol=SASL_SSL;sasl.mechanisms=PLAIN' - Supports an array of values separated by comma or specified via multiple flags. + Optional basic auth username for -kafka.consumer.topic. Must be used in conjunction with any supported auth methods for kafka client, specified by flag -kafka.consumer.topic.options='security.protocol=SASL_SSL;sasl.mechanisms=PLAIN' + Supports an array of values separated by comma or specified via multiple flags. -kafka.consumer.topic.brokers array - List of brokers to connect for given topic, e.g. -kafka.consumer.topic.broker=host-1:9092;host-2:9092 - Supports an array of values separated by comma or specified via multiple flags. + List of brokers to connect for given topic, e.g. -kafka.consumer.topic.broker=host-1:9092;host-2:9092 + Supports an array of values separated by comma or specified via multiple flags. 
-kafka.consumer.topic.defaultFormat string - Expected data format in the topic if -kafka.consumer.topic.format is skipped. (default "promremotewrite") + Expected data format in the topic if -kafka.consumer.topic.format is skipped. (default "promremotewrite") -kafka.consumer.topic.format array - data format for corresponding kafka topic. Valid formats: influx, prometheus, promremotewrite, graphite, jsonline - Supports an array of values separated by comma or specified via multiple flags. + data format for corresponding kafka topic. Valid formats: influx, prometheus, promremotewrite, graphite, jsonline + Supports an array of values separated by comma or specified via multiple flags. -kafka.consumer.topic.groupID array - Defines group.id for topic - Supports an array of values separated by comma or specified via multiple flags. + Defines group.id for topic + Supports an array of values separated by comma or specified via multiple flags. -kafka.consumer.topic.isGzipped array - Enables gzip setting for topic messages payload. Only prometheus, jsonline and influx formats accept gzipped messages. - Supports array of values separated by comma or specified via multiple flags. + Enables gzip setting for topic messages payload. Only prometheus, jsonline and influx formats accept gzipped messages. + Supports array of values separated by comma or specified via multiple flags. -kafka.consumer.topic.options array - Optional key=value;key1=value2 settings for topic consumer. See full configuration options at https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md. - Supports an array of values separated by comma or specified via multiple flags. + Optional key=value;key1=value2 settings for topic consumer. See full configuration options at https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md. + Supports an array of values separated by comma or specified via multiple flags. 
-loggerDisableTimestamps - Whether to disable writing timestamps in logs + Whether to disable writing timestamps in logs -loggerErrorsPerSecondLimit int - Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit + Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit -loggerFormat string - Format for logs. Possible values: default, json (default "default") + Format for logs. Possible values: default, json (default "default") -loggerLevel string - Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") + Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") -loggerOutput string - Output for the logs. Supported values: stderr, stdout (default "stderr") + Output for the logs. Supported values: stderr, stdout (default "stderr") -loggerTimezone string - Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") + Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") -loggerWarnsPerSecondLimit int - Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit + Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit -maxConcurrentInserts int - The maximum number of concurrent inserts. 
Default value should work for most cases, since it minimizes the overhead for concurrent inserts. This option is tightly coupled with -insert.maxQueueDuration (default 16) + The maximum number of concurrent inserts. Default value should work for most cases, since it minimizes the overhead for concurrent inserts. This option is tightly coupled with -insert.maxQueueDuration (default 16) -maxInsertRequestSize size - The maximum size in bytes of a single Prometheus remote_write API request - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) + The maximum size in bytes of a single Prometheus remote_write API request + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) -memory.allowedBytes size - Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -memory.allowedPercent float - Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage.
Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) + Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) -metricsAuthKey string - Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings + Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings -opentsdbHTTPListenAddr string - TCP address to listen for OpenTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty + TCP address to listen for OpenTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty -opentsdbListenAddr string - TCP and UDP address to listen for OpenTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty + TCP and UDP address to listen for OpenTSDB metrics. Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. Usually :4242 must be set. Doesn't work if empty -opentsdbTrimTimestamp duration - Trim timestamps for OpenTSDB 'telnet put' data to this duration. Minimum practical duration is 1s. Higher duration (i.e. 1m) may be used for reducing disk space usage for timestamp data (default 1s) + Trim timestamps for OpenTSDB 'telnet put' data to this duration. Minimum practical duration is 1s. Higher duration (i.e.
1m) may be used for reducing disk space usage for timestamp data (default 1s) -opentsdbhttp.maxInsertRequestSize size - The maximum size of OpenTSDB HTTP put request - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) + The maximum size of OpenTSDB HTTP put request + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 33554432) -opentsdbhttpTrimTimestamp duration - Trim timestamps for OpenTSDB HTTP data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) + Trim timestamps for OpenTSDB HTTP data to this duration. Minimum practical duration is 1ms. Higher duration (i.e. 1s) may be used for reducing disk space usage for timestamp data (default 1ms) -pprofAuthKey string - Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings + Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings -promscrape.cluster.memberNum int - The number of this member in the cluster of scrapers. It must be a unique value in the range 0 ... promscrape.cluster.membersCount-1 across scrapers in the cluster + The number of this member in the cluster of scrapers. It must be a unique value in the range 0 ... promscrape.cluster.membersCount-1 across scrapers in the cluster -promscrape.cluster.membersCount int - The number of members in a cluster of scrapers. Each member must have a unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 . Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets + The number of members in a cluster of scrapers. Each member must have a unique -promscrape.cluster.memberNum in the range 0 ... promscrape.cluster.membersCount-1 .
Each member then scrapes roughly 1/N of all the targets. By default cluster scraping is disabled, i.e. a single scraper scrapes all the targets -promscrape.cluster.replicationFactor int - The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 2, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1) + The number of members in the cluster, which scrape the same targets. If the replication factor is greater than 2, then the deduplication must be enabled at remote storage side. See https://docs.victoriametrics.com/#deduplication (default 1) -promscrape.config string - Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details + Optional path to Prometheus config file with 'scrape_configs' section containing targets to scrape. The path can point to local file and to http url. See https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter for details -promscrape.config.dryRun - Checks -promscrape.config file for errors and unsupported fields and then exits. Returns non-zero exit code on parsing errors and emits these errors to stderr. See also -promscrape.config.strictParse command-line flag. Pass -loggerLevel=ERROR if you don't need to see info messages in the output. + Checks -promscrape.config file for errors and unsupported fields and then exits. Returns non-zero exit code on parsing errors and emits these errors to stderr. See also -promscrape.config.strictParse command-line flag. Pass -loggerLevel=ERROR if you don't need to see info messages in the output. -promscrape.config.strictParse - Whether to deny unsupported fields in -promscrape.config . 
Set to false in order to silently skip unsupported fields (default true) + Whether to deny unsupported fields in -promscrape.config . Set to false in order to silently skip unsupported fields (default true) -promscrape.configCheckInterval duration - Interval for checking for changes in '-promscrape.config' file. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes + Interval for checking for changes in '-promscrape.config' file. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes -promscrape.consul.waitTime duration - Wait time used by Consul service discovery. Default value is used if not set + Wait time used by Consul service discovery. Default value is used if not set -promscrape.consulSDCheckInterval duration - Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s) + Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s) -promscrape.digitaloceanSDCheckInterval duration - Interval for checking for changes in digital ocean. This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config for details (default 1m0s) + Interval for checking for changes in digital ocean. This works only if digitalocean_sd_configs is configured in '-promscrape.config' file. 
See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#digitalocean_sd_config for details (default 1m0s) -promscrape.disableCompression - Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control + Whether to disable sending 'Accept-Encoding: gzip' request headers to all the scrape targets. This may reduce CPU usage on scrape targets at the cost of higher network bandwidth utilization. It is possible to set 'disable_compression: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control -promscrape.disableKeepAlive - Whether to disable HTTP keep-alive connections when scraping all the targets. This may be useful when targets have no support for HTTP keep-alive connection. It is possible to set 'disable_keepalive: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control. Note that disabling HTTP keep-alive may increase load on both vmagent and scrape targets + Whether to disable HTTP keep-alive connections when scraping all the targets. This may be useful when targets have no support for HTTP keep-alive connection. It is possible to set 'disable_keepalive: true' individually per each 'scrape_config' section in '-promscrape.config' for fine grained control. Note that disabling HTTP keep-alive may increase load on both vmagent and scrape targets -promscrape.discovery.concurrency int - The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) (default 100) + The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.)
(default 100) -promscrape.discovery.concurrentWaitTime duration - The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s) + The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s) -promscrape.dnsSDCheckInterval duration - Interval for checking for changes in dns. This works only if dns_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config for details (default 30s) + Interval for checking for changes in dns. This works only if dns_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dns_sd_config for details (default 30s) -promscrape.dockerSDCheckInterval duration - Interval for checking for changes in docker. This works only if docker_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#docker_sd_config for details (default 30s) + Interval for checking for changes in docker. This works only if docker_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#docker_sd_config for details (default 30s) -promscrape.dockerswarmSDCheckInterval duration - Interval for checking for changes in dockerswarm. This works only if dockerswarm_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config for details (default 30s) + Interval for checking for changes in dockerswarm. This works only if dockerswarm_sd_configs is configured in '-promscrape.config' file. 
See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#dockerswarm_sd_config for details (default 30s) -promscrape.dropOriginalLabels - Whether to drop original labels for scrape targets at /targets and /api/v1/targets pages. This may be needed for reducing memory usage when original labels for big number of scrape targets occupy big amounts of memory. Note that this reduces debuggability for improper per-target relabeling configs + Whether to drop original labels for scrape targets at /targets and /api/v1/targets pages. This may be needed for reducing memory usage when original labels for big number of scrape targets occupy big amounts of memory. Note that this reduces debuggability for improper per-target relabeling configs -promscrape.ec2SDCheckInterval duration - Interval for checking for changes in ec2. This works only if ec2_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#ec2_sd_config for details (default 1m0s) + Interval for checking for changes in ec2. This works only if ec2_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#ec2_sd_config for details (default 1m0s) -promscrape.eurekaSDCheckInterval duration - Interval for checking for changes in eureka. This works only if eureka_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config for details (default 30s) + Interval for checking for changes in eureka. This works only if eureka_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#eureka_sd_config for details (default 30s) -promscrape.fileSDCheckInterval duration - Interval for checking for changes in 'file_sd_config'. 
See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config for details (default 5m0s) + Interval for checking for changes in 'file_sd_config'. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#file_sd_config for details (default 5m0s) -promscrape.gceSDCheckInterval duration - Interval for checking for changes in gce. This works only if gce_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#gce_sd_config for details (default 1m0s) + Interval for checking for changes in gce. This works only if gce_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#gce_sd_config for details (default 1m0s) -promscrape.httpSDCheckInterval duration - Interval for checking for changes in http endpoint service discovery. This works only if http_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config for details (default 1m0s) + Interval for checking for changes in http endpoint service discovery. This works only if http_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#http_sd_config for details (default 1m0s) -promscrape.kubernetes.apiServerTimeout duration - How frequently to reload the full state from Kubernetes API server (default 30m0s) + How frequently to reload the full state from Kubernetes API server (default 30m0s) -promscrape.kubernetesSDCheckInterval duration - Interval for checking for changes in Kubernetes API server. This works only if kubernetes_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config for details (default 30s) + Interval for checking for changes in Kubernetes API server.
This works only if kubernetes_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config for details (default 30s) -promscrape.maxDroppedTargets int - The maximum number of droppedTargets to show at /api/v1/targets page. Increase this value if your setup drops more scrape targets during relabeling and you need investigating labels for all the dropped targets. Note that the increased number of tracked dropped targets may result in increased memory usage (default 1000) + The maximum number of droppedTargets to show at /api/v1/targets page. Increase this value if your setup drops more scrape targets during relabeling and you need investigating labels for all the dropped targets. Note that the increased number of tracked dropped targets may result in increased memory usage (default 1000) -promscrape.maxResponseHeadersSize size - The maximum size of http response headers from Prometheus scrape targets - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 4096) + The maximum size of http response headers from Prometheus scrape targets + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 4096) -promscrape.maxScrapeSize size - The maximum size of scrape response in bytes to process from Prometheus targets. Bigger responses are rejected - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16777216) + The maximum size of scrape response in bytes to process from Prometheus targets. Bigger responses are rejected + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 16777216) -promscrape.minResponseSizeForStreamParse size - The minimum target response size for automatic switching to stream parsing mode, which can reduce memory usage. 
See https://docs.victoriametrics.com/vmagent.html#stream-parsing-mode - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 1000000) + The minimum target response size for automatic switching to stream parsing mode, which can reduce memory usage. See https://docs.victoriametrics.com/vmagent.html#stream-parsing-mode + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 1000000) -promscrape.noStaleMarkers - Whether to disable sending Prometheus stale markers for metrics when scrape target disappears. This option may reduce memory usage if stale markers aren't needed for your setup. This option also disables populating the scrape_series_added metric. See https://prometheus.io/docs/concepts/jobs_instances/#automatically-generated-labels-and-time-series + Whether to disable sending Prometheus stale markers for metrics when scrape target disappears. This option may reduce memory usage if stale markers aren't needed for your setup. This option also disables populating the scrape_series_added metric. See https://prometheus.io/docs/concepts/jobs_instances/#automatically-generated-labels-and-time-series -promscrape.openstackSDCheckInterval duration - Interval for checking for changes in openstack API server. This works only if openstack_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config for details (default 30s) + Interval for checking for changes in openstack API server. This works only if openstack_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#openstack_sd_config for details (default 30s) -promscrape.seriesLimitPerTarget int - Optional limit on the number of unique time series a single scrape target can expose. 
See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter for more info + Optional limit on the number of unique time series a single scrape target can expose. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter for more info -promscrape.streamParse - Whether to enable stream parsing for metrics obtained from scrape targets. This may be useful for reducing memory usage when millions of metrics are exposed per each scrape target. It is possible to set 'stream_parse: true' individually per each 'scrape_config' section in '-promscrape.config' for fine-grained control + Whether to enable stream parsing for metrics obtained from scrape targets. This may be useful for reducing memory usage when millions of metrics are exposed per each scrape target. It is possible to set 'stream_parse: true' individually per each 'scrape_config' section in '-promscrape.config' for fine-grained control -promscrape.suppressDuplicateScrapeTargetErrors - Whether to suppress 'duplicate scrape target' errors; see https://docs.victoriametrics.com/vmagent.html#troubleshooting for details + Whether to suppress 'duplicate scrape target' errors; see https://docs.victoriametrics.com/vmagent.html#troubleshooting for details -promscrape.suppressScrapeErrors - Whether to suppress scrape errors logging. The last error for each target is always available at '/targets' page even if scrape errors logging is suppressed + Whether to suppress scrape errors logging. The last error for each target is always available at '/targets' page even if scrape errors logging is suppressed -remoteWrite.basicAuth.password array - Optional basic auth password to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url - Supports an array of values separated by comma or specified via multiple flags. + Optional basic auth password to use for -remoteWrite.url.
If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.basicAuth.passwordFile array - Optional path to basic auth password to use for -remoteWrite.url. The file is re-read every second. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url - Supports an array of values separated by comma or specified via multiple flags. + Optional path to basic auth password to use for -remoteWrite.url. The file is re-read every second. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.basicAuth.username array - Optional basic auth username to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url - Supports an array of values separated by comma or specified via multiple flags. + Optional basic auth username to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.bearerToken array - Optional bearer auth token to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url - Supports an array of values separated by comma or specified via multiple flags. + Optional bearer auth token to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.bearerTokenFile array - Optional path to bearer token file to use for -remoteWrite.url. The token is re-read from the file every second. 
If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url - Supports an array of values separated by comma or specified via multiple flags. + Optional path to bearer token file to use for -remoteWrite.url. The token is re-read from the file every second. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.flushInterval duration - Interval for flushing the data to remote storage. This option takes effect only when less than 10K data points per second are pushed to -remoteWrite.url (default 1s) + Interval for flushing the data to remote storage. This option takes effect only when less than 10K data points per second are pushed to -remoteWrite.url (default 1s) -remoteWrite.label array - Optional label in the form 'name=value' to add to all the metrics before sending them to -remoteWrite.url. Pass multiple -remoteWrite.label flags in order to add multiple labels to metrics before sending them to remote storage - Supports an array of values separated by comma or specified via multiple flags. + Optional label in the form 'name=value' to add to all the metrics before sending them to -remoteWrite.url. Pass multiple -remoteWrite.label flags in order to add multiple labels to metrics before sending them to remote storage + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.maxBlockSize size - The maximum block size to send to remote storage. Bigger blocks may improve performance at the cost of the increased memory usage. See also -remoteWrite.maxRowsPerBlock - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 8388608) + The maximum block size to send to remote storage. Bigger blocks may improve performance at the cost of the increased memory usage. 
See also -remoteWrite.maxRowsPerBlock + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 8388608) -remoteWrite.maxDailySeries int - The maximum number of unique series vmagent can send to remote storage systems during the last 24 hours. Excess series are logged and dropped. This can be useful for limiting series churn rate. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter + The maximum number of unique series vmagent can send to remote storage systems during the last 24 hours. Excess series are logged and dropped. This can be useful for limiting series churn rate. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter -remoteWrite.maxDiskUsagePerURL size - The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath for each -remoteWrite.url. When buffer size reaches the configured maximum, then old data is dropped when adding new data to the buffer. Buffered data is stored in ~500MB chunks, so the minimum practical value for this flag is 500000000. Disk usage is unlimited if the value is set to 0 - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath for each -remoteWrite.url. When buffer size reaches the configured maximum, then old data is dropped when adding new data to the buffer. Buffered data is stored in ~500MB chunks, so the minimum practical value for this flag is 500000000. Disk usage is unlimited if the value is set to 0 + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -remoteWrite.maxHourlySeries int - The maximum number of unique series vmagent can send to remote storage systems during the last hour. Excess series are logged and dropped. This can be useful for limiting series cardinality. 
See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter + The maximum number of unique series vmagent can send to remote storage systems during the last hour. Excess series are logged and dropped. This can be useful for limiting series cardinality. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter -remoteWrite.maxRowsPerBlock int - The maximum number of samples to send in each block to remote storage. Higher number may improve performance at the cost of the increased memory usage. See also -remoteWrite.maxBlockSize (default 10000) + The maximum number of samples to send in each block to remote storage. Higher number may improve performance at the cost of the increased memory usage. See also -remoteWrite.maxBlockSize (default 10000) -remoteWrite.multitenantURL array - Base path for multitenant remote storage URL to write data to. See https://docs.victoriametrics.com/vmagent.html#multitenancy for details. Example url: http://:8480 . Pass multiple -remoteWrite.multitenantURL flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.url - Supports an array of values separated by comma or specified via multiple flags. + Base path for multitenant remote storage URL to write data to. See https://docs.victoriametrics.com/vmagent.html#multitenancy for details. Example url: http://:8480 . Pass multiple -remoteWrite.multitenantURL flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.url + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.oauth2.clientID array - Optional OAuth2 clientID to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url - Supports an array of values separated by comma or specified via multiple flags. + Optional OAuth2 clientID to use for -remoteWrite.url. 
If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.oauth2.clientSecret array - Optional OAuth2 clientSecret to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url - Supports an array of values separated by comma or specified via multiple flags. + Optional OAuth2 clientSecret to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.oauth2.clientSecretFile array - Optional OAuth2 clientSecretFile to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url - Supports an array of values separated by comma or specified via multiple flags. + Optional OAuth2 clientSecretFile to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.oauth2.scopes array - Optional OAuth2 scopes to use for -remoteWrite.url. Scopes must be delimited by ';'. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url - Supports an array of values separated by comma or specified via multiple flags. + Optional OAuth2 scopes to use for -remoteWrite.url. Scopes must be delimited by ';'. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.oauth2.tokenUrl array - Optional OAuth2 tokenURL to use for -remoteWrite.url. 
If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url - Supports an array of values separated by comma or specified via multiple flags. + Optional OAuth2 tokenURL to use for -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.proxyURL array - Optional proxy URL for writing data to -remoteWrite.url. Supported proxies: http, https, socks5. Example: -remoteWrite.proxyURL=socks5://proxy:1234 - Supports an array of values separated by comma or specified via multiple flags. + Optional proxy URL for writing data to -remoteWrite.url. Supported proxies: http, https, socks5. Example: -remoteWrite.proxyURL=socks5://proxy:1234 + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.queues int - The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues isn't enough for sending high volume of collected data to remote storage. Default value is 2 * numberOfAvailableCPUs (default 8) + The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues isn't enough for sending high volume of collected data to remote storage. Default value is 2 * numberOfAvailableCPUs (default 8) -remoteWrite.rateLimit array - Optional rate limit in bytes per second for data sent to -remoteWrite.url. By default the rate limit is disabled. It can be useful for limiting load on remote storage when big amounts of buffered data is sent after temporary unavailability of the remote storage - Supports array of values separated by comma or specified via multiple flags. + Optional rate limit in bytes per second for data sent to -remoteWrite.url. By default the rate limit is disabled. 
It can be useful for limiting load on remote storage when big amounts of buffered data is sent after temporary unavailability of the remote storage + Supports array of values separated by comma or specified via multiple flags. -remoteWrite.relabelConfig string - Optional path to file with relabel_config entries. The path can point either to local file or to http url. These entries are applied to all the metrics before sending them to -remoteWrite.url. See https://docs.victoriametrics.com/vmagent.html#relabeling for details + Optional path to file with relabel_config entries. The path can point either to local file or to http url. These entries are applied to all the metrics before sending them to -remoteWrite.url. See https://docs.victoriametrics.com/vmagent.html#relabeling for details -remoteWrite.relabelDebug - Whether to log metrics before and after relabeling with -remoteWrite.relabelConfig. If the -remoteWrite.relabelDebug is enabled, then the metrics aren't sent to remote storage. This is useful for debugging the relabeling configs + Whether to log metrics before and after relabeling with -remoteWrite.relabelConfig. If the -remoteWrite.relabelDebug is enabled, then the metrics aren't sent to remote storage. This is useful for debugging the relabeling configs -remoteWrite.roundDigits array - Round metric values to this number of decimal digits after the point before writing them to remote storage. Examples: -remoteWrite.roundDigits=2 would round 1.236 to 1.24, while -remoteWrite.roundDigits=-1 would round 126.78 to 130. By default digits rounding is disabled. Set it to 100 for disabling it for a particular remote storage. This option may be used for improving data compression for the stored metrics - Supports array of values separated by comma or specified via multiple flags. + Round metric values to this number of decimal digits after the point before writing them to remote storage. 
Examples: -remoteWrite.roundDigits=2 would round 1.236 to 1.24, while -remoteWrite.roundDigits=-1 would round 126.78 to 130. By default digits rounding is disabled. Set it to 100 for disabling it for a particular remote storage. This option may be used for improving data compression for the stored metrics + Supports array of values separated by comma or specified via multiple flags. -remoteWrite.sendTimeout array - Timeout for sending a single block of data to -remoteWrite.url - Supports array of values separated by comma or specified via multiple flags. + Timeout for sending a single block of data to -remoteWrite.url + Supports array of values separated by comma or specified via multiple flags. -remoteWrite.showURL - Whether to show -remoteWrite.url in the exported metrics. It is hidden by default, since it can contain sensitive info such as auth key + Whether to show -remoteWrite.url in the exported metrics. It is hidden by default, since it can contain sensitive info such as auth key -remoteWrite.significantFigures array - The number of significant figures to leave in metric values before writing them to remote storage. See https://en.wikipedia.org/wiki/Significant_figures . Zero value saves all the significant figures. This option may be used for improving data compression for the stored metrics. See also -remoteWrite.roundDigits - Supports array of values separated by comma or specified via multiple flags. + The number of significant figures to leave in metric values before writing them to remote storage. See https://en.wikipedia.org/wiki/Significant_figures . Zero value saves all the significant figures. This option may be used for improving data compression for the stored metrics. See also -remoteWrite.roundDigits + Supports array of values separated by comma or specified via multiple flags. -remoteWrite.tlsCAFile array - Optional path to TLS CA file to use for verifying connections to -remoteWrite.url. By default system CA is used. 
If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url - Supports an array of values separated by comma or specified via multiple flags. + Optional path to TLS CA file to use for verifying connections to -remoteWrite.url. By default system CA is used. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.tlsCertFile array - Optional path to client-side TLS certificate file to use when connecting to -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url - Supports an array of values separated by comma or specified via multiple flags. + Optional path to client-side TLS certificate file to use when connecting to -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.tlsInsecureSkipVerify array - Whether to skip tls verification when connecting to -remoteWrite.url - Supports array of values separated by comma or specified via multiple flags. + Whether to skip tls verification when connecting to -remoteWrite.url + Supports array of values separated by comma or specified via multiple flags. -remoteWrite.tlsKeyFile array - Optional path to client-side TLS certificate key to use when connecting to -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url - Supports an array of values separated by comma or specified via multiple flags. + Optional path to client-side TLS certificate key to use when connecting to -remoteWrite.url. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url + Supports an array of values separated by comma or specified via multiple flags. 
-remoteWrite.tlsServerName array - Optional TLS server name to use for connections to -remoteWrite.url. By default the server name from -remoteWrite.url is used. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url - Supports an array of values separated by comma or specified via multiple flags. + Optional TLS server name to use for connections to -remoteWrite.url. By default the server name from -remoteWrite.url is used. If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.tmpDataPath string - Path to directory where temporary data for remote write component is stored. See also -remoteWrite.maxDiskUsagePerURL (default "vmagent-remotewrite-data") + Path to directory where temporary data for remote write component is stored. See also -remoteWrite.maxDiskUsagePerURL (default "vmagent-remotewrite-data") -remoteWrite.url array - Remote storage URL to write data to. It must support Prometheus remote_write API. It is recommended using VictoriaMetrics as remote storage. Example url: http://:8428/api/v1/write . Pass multiple -remoteWrite.url flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.multitenantURL - Supports an array of values separated by comma or specified via multiple flags. + Remote storage URL to write data to. It must support Prometheus remote_write API. It is recommended using VictoriaMetrics as remote storage. Example url: http://:8428/api/v1/write . Pass multiple -remoteWrite.url flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.multitenantURL + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.urlRelabelConfig array - Optional path to relabel config for the corresponding -remoteWrite.url. 
The path can point either to local file or to http url - Supports an array of values separated by comma or specified via multiple flags. + Optional path to relabel config for the corresponding -remoteWrite.url. The path can point either to local file or to http url + Supports an array of values separated by comma or specified via multiple flags. -remoteWrite.urlRelabelDebug array - Whether to log metrics before and after relabeling with -remoteWrite.urlRelabelConfig. If the -remoteWrite.urlRelabelDebug is enabled, then the metrics aren't sent to the corresponding -remoteWrite.url. This is useful for debugging the relabeling configs - Supports array of values separated by comma or specified via multiple flags. + Whether to log metrics before and after relabeling with -remoteWrite.urlRelabelConfig. If the -remoteWrite.urlRelabelDebug is enabled, then the metrics aren't sent to the corresponding -remoteWrite.url. This is useful for debugging the relabeling configs + Supports array of values separated by comma or specified via multiple flags. -sortLabels - Whether to sort labels for incoming samples before writing them to all the configured remote storage systems. This may be needed for reducing memory usage at remote storage when the order of labels in incoming samples is random. For example, m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}. Enabled sorting for labels can slow down ingestion performance a bit + Whether to sort labels for incoming samples before writing them to all the configured remote storage systems. This may be needed for reducing memory usage at remote storage when the order of labels in incoming samples is random. For example, m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}. Enabled sorting for labels can slow down ingestion performance a bit -tls - Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set + Whether to enable TLS (aka HTTPS) for incoming requests.
-tlsCertFile and -tlsKeyFile must be set if -tls is set -tlsCertFile string - Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated -tlsKeyFile string - Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated -version - Show VictoriaMetrics version + Show VictoriaMetrics version ``` diff --git a/docs/vmalert.md b/docs/vmalert.md index cb04a7f84..0bcd76343 100644 --- a/docs/vmalert.md +++ b/docs/vmalert.md @@ -14,6 +14,7 @@ Vmalert is heavily inspired by [Prometheus](https://prometheus.io/docs/alerting/ implementation and aims to be compatible with its syntax. ## Features + * Integration with [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics) TSDB; * VictoriaMetrics [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) support and expressions validation; @@ -26,8 +27,9 @@ implementation and aims to be compatible with its syntax. * Lightweight without extra dependencies. ## Limitations + * `vmalert` execute queries against remote datasource which has reliability risks because of the network. 
-It is recommended to configure alerts thresholds and rules expressions with the understanding that network +It is recommended to configure alerts thresholds and rules expressions with the understanding that network requests may fail; * by default, rules execution is sequential within one group, but persistence of execution results to remote storage is asynchronous. Hence, users shouldn't rely on chaining of recording rules when the result of a previous @@ -36,25 +38,29 @@ recording rule is reused in the next one; ## QuickStart To build `vmalert` from sources: -``` + +```bash git clone https://github.com/VictoriaMetrics/VictoriaMetrics cd VictoriaMetrics make vmalert ``` + The built binary will be placed in the `VictoriaMetrics/bin` folder. To start using `vmalert` you will need the following things: + * list of rules - PromQL/MetricsQL expressions to execute; * datasource address - reachable MetricsQL endpoint to run queries against; * notifier address [optional] - reachable [Alert Manager](https://github.com/prometheus/alertmanager) instance for processing, -aggregating alerts, and sending notifications. Please note, notifier address also supports Consul Service Discovery via +aggregating alerts, and sending notifications. Please note, notifier address also supports Consul Service Discovery via [config file](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/config.go). * remote write address [optional] - [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations) compatible storage to persist rules and alerts state info; * remote read address [optional] - MetricsQL compatible datasource to restore alerts state from. Then configure `vmalert` accordingly: -``` + +```bash ./bin/vmalert -rule=alert.rules \ # Path to the file with rules configuration.
Supports wildcard -datasource.url=http://localhost:8428 \ # PromQL compatible datasource -notifier.url=http://localhost:9093 \ # AlertManager URL (required if alerting rules are used) @@ -81,6 +87,7 @@ and [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerti similar to Prometheus rules and configured using YAML. Configuration examples may be found in [testdata](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/config/testdata) folder. Every `rule` belongs to a `group` and every configuration file may contain arbitrary number of groups: + ```yaml groups: [ - ] @@ -89,6 +96,7 @@ groups: ### Groups Each group has the following attributes: + ```yaml # The name of the group. Must be unique within a file. name: @@ -140,6 +148,7 @@ or [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) expression. Vmal expression and then act according to the Rule type. There are two types of Rules: + * [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) - Alerting rules allow defining alert conditions via `expr` field and to send notifications to [Alertmanager](https://github.com/prometheus/alertmanager) if execution result is not empty. @@ -154,6 +163,7 @@ within one group. #### Alerting rules The syntax for alerting rule is the following: + ```yaml # The name of the alert. Must be a valid metric name. alert: @@ -186,6 +196,7 @@ listed [here](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app #### Recording rules The syntax for recording rules is following: + ```yaml # The name of the time series to output to. Must be a valid metric name. record: @@ -202,11 +213,11 @@ labels: For recording rules to work `-remoteWrite.url` must be specified. - ### Alerts state on restarts `vmalert` has no local storage, so alerts state is stored in the process memory. Hence, after restart of `vmalert` the process alerts state will be lost. 
To avoid this situation, `vmalert` should be configured via the following flags: + * `-remoteWrite.url` - URL to VictoriaMetrics (Single) or vminsert (Cluster). `vmalert` will persist alerts state into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol. These are regular time series and may be queried from VM just as any other time series. @@ -218,7 +229,6 @@ Both flags are required for proper state restoration. Restore process may fail i in configured `-remoteRead.url`, weren't updated in the last `1h` (controlled by `-remoteRead.lookback`) or received state doesn't match current `vmalert` rules configuration. - ### Multitenancy The following approaches exist for alerting and recording rules across @@ -263,10 +273,11 @@ tags at [Docker Hub](https://hub.docker.com/r/victoriametrics/vmalert/tags). ### Topology examples -The following sections are showing how `vmalert` may be used and configured -for different scenarios. +The following sections are showing how `vmalert` may be used and configured +for different scenarios. + +Please note, not all flags in examples are required: -Please note, not all flags in examples are required: * `-remoteWrite.url` and `-remoteRead.url` are optional and are needed only if you have recording rules or want to store [alerts state](#alerts-state-on-restarts) on `vmalert` restarts; * `-notifier.url` is optional and is needed only if you have alerting rules. @@ -277,6 +288,7 @@ The simplest configuration where one single-node VM server is used for rules execution, storing recording rules results and alerts state. `vmalert` configuration flags: + ``` ./bin/vmalert -rule=rules.yml \ # Path to the file with rules configuration. Supports wildcard -datasource.url=http://victoriametrics:8428 \ # VM-single addr for executing rules expressions @@ -287,16 +299,16 @@ rules execution, storing recording rules results and alerts state.
vmalert single - #### Cluster VictoriaMetrics In [cluster mode](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) VictoriaMetrics has separate components for writing and reading path: `vminsert` and `vmselect` components respectively. `vmselect` is used for executing rules expressions and `vminsert` is used to persist recording rules results and alerts state. -Cluster mode could have multiple `vminsert` and `vmselect` components. +Cluster mode could have multiple `vminsert` and `vmselect` components. `vmalert` configuration flags: + ``` ./bin/vmalert -rule=rules.yml \ # Path to the file with rules configuration. Supports wildcard -datasource.url=http://vmselect:8481/select/0/prometheus # vmselect addr for executing rules expressions @@ -319,6 +331,7 @@ the same destinations, and send alert notifications to multiple configured Alertmanagers. `vmalert` configuration flags: + ``` ./bin/vmalert -rule=rules.yml \ # Path to the file with rules configuration. Supports wildcard -datasource.url=http://victoriametrics:8428 \ # VM-single addr for executing rules expressions @@ -339,9 +352,8 @@ all `vmalert`s are having the same config. Don't forget to configure [cluster mode](https://prometheus.io/docs/alerting/latest/alertmanager/) for Alertmanagers for better reliability. -This example uses single-node VM server for the sake of simplicity. -Check how to replace it with [cluster VictoriaMetrics](#cluster-victoriametrics) if needed. - +This example uses single-node VM server for the sake of simplicity. +Check how to replace it with [cluster VictoriaMetrics](#cluster-victoriametrics) if needed. #### Downsampling and aggregation via vmalert @@ -353,6 +365,7 @@ recording rules to process raw data from "hot" cluster (by applying additional t or reducing resolution) and push results to "cold" cluster. `vmalert` configuration flags: + ``` ./bin/vmalert -rule=downsampling-rules.yml \ # Path to the file with rules configuration. 
Supports wildcard -datasource.url=http://raw-cluster-vmselect:8481/select/0/prometheus # vmselect addr for executing recording rules expressions @@ -367,19 +380,18 @@ Flags `-remoteRead.url` and `-notifier.url` are omitted since we assume only rec See also [downsampling docs](https://docs.victoriametrics.com/#downsampling). - ### Web `vmalert` runs a web-server (`-httpListenAddr`) for serving metrics and alerts endpoints: + * `http://` - UI; * `http:///api/v1/groups` - list of all loaded groups and rules; * `http:///api/v1/alerts` - list of all active alerts; -* `http:///api/v1///status" ` - get alert status by ID. +* `http:///api/v1///status"` - get alert status by ID. Used as alert source in AlertManager. * `http:///metrics` - application metrics. * `http:///-/reload` - hot configuration reload. - ## Graphite vmalert sends requests to `<-datasource.url>/render?format=json` during evaluation of alerting and recording rules @@ -399,6 +411,7 @@ data source for backfilling. In `replay` mode vmalert works as a cli-tool and exits immediately after work is done. To run vmalert in `replay` mode: + ``` ./bin/vmalert -rule=path/to/your.rules \ # path to files with rules you usually use with vmalert -datasource.url=http://localhost:8428 \ # PromQL/MetricsQL compatible datasource @@ -408,6 +421,7 @@ To run vmalert in `replay` mode: ``` The output of the command will look like the following: + ``` Replay mode: from: 2021-05-11 07:21:43 +0000 UTC # set by -replay.timeFrom @@ -451,9 +465,11 @@ The result of recording rules `replay` should match with results of normal rules The result of alerting rules `replay` is time series reflecting [alert's state](#alerts-state-on-restarts). To see if `replayed` alert has fired in the past use the following PromQL/MetricsQL expression: + ``` ALERTS{alertname="your_alertname", alertstate="firing"} ``` + Execute the query against storage which was used for `-remoteWrite.url` during the `replay`.
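As a rough sketch of running the expression above over the replayed time range, the snippet below builds a request URL for the Prometheus querying API path `/api/v1/query_range`. The helper name `alerts_range_url`, the base address, the end timestamp and the step are illustrative assumptions, not values from the replay run; substitute the storage that `-remoteWrite.url` pointed to and the window set by the replay flags.

```python
from urllib.parse import urlencode


def alerts_range_url(base_url, alertname, start, end, step="1m"):
    """Build a range-query URL for the firing ALERTS series of one alert.

    Hypothetical helper for illustration; it only formats the URL and
    does not send any request.
    """
    # Same selector as the expression above, with the alert name filled in.
    expr = f'ALERTS{{alertname="{alertname}", alertstate="firing"}}'
    # query, start, end and step are the standard query_range parameters.
    params = {"query": expr, "start": start, "end": end, "step": step}
    return f"{base_url.rstrip('/')}/api/v1/query_range?{urlencode(params)}"


url = alerts_range_url(
    "http://localhost:8428",       # assumed storage address
    "your_alertname",
    start="2021-05-11T07:21:43Z",  # start of the replayed window
    end="2021-05-11T08:21:43Z",    # assumed end of the window
)
print(url)
```

Any series in the response means the alert was in the firing state at some point inside the requested window; the URL can be fetched with any HTTP client.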
### Additional configuration @@ -477,7 +493,6 @@ See full description for these flags in `./vmalert --help`. * Graphite engine isn't supported yet; * `query` template function is disabled for performance reasons (might be changed in future); - ## Monitoring `vmalert` exports various metrics in Prometheus exposition format at `http://vmalert-host:8880/metrics` page. @@ -488,7 +503,6 @@ Use the official [Grafana dashboard](https://grafana.com/grafana/dashboards/1495 If you have suggestions for improvements or have found a bug - please open an issue on github or add a review to the dashboard. - ## Configuration ### Flags @@ -497,305 +511,308 @@ Pass `-help` to `vmalert` in order to see the full list of supported command-line flags with their descriptions. The shortlist of configuration flags is the following: + ``` -clusterMode - If clusterMode is enabled, then vmalert automatically adds the tenant specified in config groups to -datasource.url, -remoteWrite.url and -remoteRead.url. See https://docs.victoriametrics.com/vmalert.html#multitenancy + If clusterMode is enabled, then vmalert automatically adds the tenant specified in config groups to -datasource.url, -remoteWrite.url and -remoteRead.url. See https://docs.victoriametrics.com/vmalert.html#multitenancy -configCheckInterval duration - Interval for checking for changes in '-rule' or '-notifier.config' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes. + Interval for checking for changes in '-rule' or '-notifier.config' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes. -datasource.appendTypePrefix - Whether to add type prefix to -datasource.url based on the query type. Set to true if sending different query types to the vmselect URL. + Whether to add type prefix to -datasource.url based on the query type. Set to true if sending different query types to the vmselect URL. 
-datasource.basicAuth.password string - Optional basic auth password for -datasource.url + Optional basic auth password for -datasource.url -datasource.basicAuth.passwordFile string - Optional path to basic auth password to use for -datasource.url + Optional path to basic auth password to use for -datasource.url -datasource.basicAuth.username string - Optional basic auth username for -datasource.url + Optional basic auth username for -datasource.url -datasource.bearerToken string - Optional bearer auth token to use for -datasource.url. + Optional bearer auth token to use for -datasource.url. -datasource.bearerTokenFile string - Optional path to bearer token file to use for -datasource.url. + Optional path to bearer token file to use for -datasource.url. -datasource.lookback duration - Lookback defines how far into the past to look when evaluating queries. For example, if the datasource.lookback=5m then param "time" with value now()-5m will be added to every query. + Lookback defines how far into the past to look when evaluating queries. For example, if the datasource.lookback=5m then param "time" with value now()-5m will be added to every query. -datasource.maxIdleConnections int - Defines the number of idle (keep-alive connections) to each configured datasource. Consider setting this value equal to the value: groups_total * group.concurrency. Too low a value may result in a high number of sockets in TIME_WAIT state. (default 100) + Defines the number of idle (keep-alive connections) to each configured datasource. Consider setting this value equal to the value: groups_total * group.concurrency. Too low a value may result in a high number of sockets in TIME_WAIT state. (default 100) -datasource.oauth2.clientID string - Optional OAuth2 clientID to use for -datasource.url. + Optional OAuth2 clientID to use for -datasource.url. -datasource.oauth2.clientSecret string - Optional OAuth2 clientSecret to use for -datasource.url. 
+ Optional OAuth2 clientSecret to use for -datasource.url. -datasource.oauth2.clientSecretFile string - Optional OAuth2 clientSecretFile to use for -datasource.url. + Optional OAuth2 clientSecretFile to use for -datasource.url. -datasource.oauth2.scopes string - Optional OAuth2 scopes to use for -datasource.url. Scopes must be delimited by ';' + Optional OAuth2 scopes to use for -datasource.url. Scopes must be delimited by ';' -datasource.oauth2.tokenUrl string - Optional OAuth2 tokenURL to use for -datasource.url. + Optional OAuth2 tokenURL to use for -datasource.url. -datasource.queryStep duration - queryStep defines how far a value can fallback to when evaluating queries. For example, if datasource.queryStep=15s then param "step" with value "15s" will be added to every query.If queryStep isn't specified, rule's evaluationInterval will be used instead. + queryStep defines how far a value can fallback to when evaluating queries. For example, if datasource.queryStep=15s then param "step" with value "15s" will be added to every query.If queryStep isn't specified, rule's evaluationInterval will be used instead. -datasource.queryTimeAlignment - Whether to align "time" parameter with evaluation interval.Alignment supposed to produce deterministic results despite of number of vmalert replicas or time they were started. See more details here https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1257 (default true) + Whether to align "time" parameter with evaluation interval.Alignment supposed to produce deterministic results despite of number of vmalert replicas or time they were started. See more details here https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1257 (default true) -datasource.roundDigits int - Adds "round_digits" GET param to datasource requests. In VM "round_digits" limits the number of digits after the decimal point in response values. + Adds "round_digits" GET param to datasource requests. 
In VM "round_digits" limits the number of digits after the decimal point in response values. -datasource.tlsCAFile string - Optional path to TLS CA file to use for verifying connections to -datasource.url. By default, system CA is used + Optional path to TLS CA file to use for verifying connections to -datasource.url. By default, system CA is used -datasource.tlsCertFile string - Optional path to client-side TLS certificate file to use when connecting to -datasource.url + Optional path to client-side TLS certificate file to use when connecting to -datasource.url -datasource.tlsInsecureSkipVerify - Whether to skip tls verification when connecting to -datasource.url + Whether to skip tls verification when connecting to -datasource.url -datasource.tlsKeyFile string - Optional path to client-side TLS certificate key to use when connecting to -datasource.url + Optional path to client-side TLS certificate key to use when connecting to -datasource.url -datasource.tlsServerName string - Optional TLS server name to use for connections to -datasource.url. By default, the server name from -datasource.url is used + Optional TLS server name to use for connections to -datasource.url. By default, the server name from -datasource.url is used -datasource.url string - VictoriaMetrics or vmselect url. Required parameter. E.g. http://127.0.0.1:8428 + VictoriaMetrics or vmselect url. Required parameter. E.g. http://127.0.0.1:8428 -defaultTenant.graphite string - Default tenant for Graphite alerting groups. See https://docs.victoriametrics.com/vmalert.html#multitenancy + Default tenant for Graphite alerting groups. See https://docs.victoriametrics.com/vmalert.html#multitenancy -defaultTenant.prometheus string - Default tenant for Prometheus alerting groups. See https://docs.victoriametrics.com/vmalert.html#multitenancy + Default tenant for Prometheus alerting groups. 
See https://docs.victoriametrics.com/vmalert.html#multitenancy -disableAlertgroupLabel - Whether to disable adding group's Name as label to generated alerts and time series. + Whether to disable adding group's Name as label to generated alerts and time series. -dryRun -rule - Whether to check only config files without running vmalert. The rules file are validated. The -rule flag must be specified. + Whether to check only config files without running vmalert. The rules file are validated. The -rule flag must be specified. -enableTCP6 - Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used + Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used -envflag.enable - Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details + Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. 
See https://docs.victoriametrics.com/#environment-variables for more details -envflag.prefix string - Prefix for environment variables if -envflag.enable is set + Prefix for environment variables if -envflag.enable is set -eula - By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf + By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf -evaluationInterval duration - How often to evaluate the rules (default 1m0s) + How often to evaluate the rules (default 1m0s) -external.alert.source string - External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service. - eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|crlfEscape|queryEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.If empty '/api/v1/:groupID/alertID/status' is used + External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service. + eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|crlfEscape|queryEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.If empty '/api/v1/:groupID/alertID/status' is used -external.label array - Optional label in the form 'Name=value' to add to all generated recording rules and alerts. Pass multiple -label flags in order to add multiple label sets. - Supports an array of values separated by comma or specified via multiple flags. + Optional label in the form 'Name=value' to add to all generated recording rules and alerts. Pass multiple -label flags in order to add multiple label sets. 
+ Supports an array of values separated by comma or specified via multiple flags. -external.url string - External URL is used as alert's source for sent alerts to the notifier + External URL is used as alert's source for sent alerts to the notifier -fs.disableMmap - Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread() + Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread() -http.connTimeout duration - Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) + Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) -http.disableResponseCompression - Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth + Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth -http.idleConnTimeout duration - Timeout for incoming idle http connections (default 1m0s) + Timeout for incoming idle http connections (default 1m0s) -http.maxGracefulShutdownDuration duration - The maximum duration for a graceful shutdown of the HTTP server. 
A highly loaded server may require increased value for a graceful shutdown (default 7s) + The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s) -http.pathPrefix string - An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus + An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus -http.shutdownDelay duration - Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers + Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers -httpAuth.password string - Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty + Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty -httpAuth.username string - Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password + Username for HTTP Basic Auth. The authentication is disabled if empty. 
See also -httpAuth.password -httpListenAddr string - Address to listen for http connections (default ":8880") + Address to listen for http connections (default ":8880") -loggerDisableTimestamps - Whether to disable writing timestamps in logs + Whether to disable writing timestamps in logs -loggerErrorsPerSecondLimit int - Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit + Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit -loggerFormat string - Format for logs. Possible values: default, json (default "default") + Format for logs. Possible values: default, json (default "default") -loggerLevel string - Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") + Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") -loggerOutput string - Output for the logs. Supported values: stderr, stdout (default "stderr") + Output for the logs. Supported values: stderr, stdout (default "stderr") -loggerTimezone string - Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") + Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") -loggerWarnsPerSecondLimit int - Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit + Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. 
Zero values disable the rate limit -memory.allowedBytes size - Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -memory.allowedPercent float - Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) + Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) -metricsAuthKey string - Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings + Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings -notifier.basicAuth.password array - Optional basic auth password for -notifier.url - Supports an array of values separated by comma or specified via multiple flags. 
+ Optional basic auth password for -notifier.url + Supports an array of values separated by comma or specified via multiple flags. -notifier.basicAuth.passwordFile array - Optional path to basic auth password file for -notifier.url - Supports an array of values separated by comma or specified via multiple flags. + Optional path to basic auth password file for -notifier.url + Supports an array of values separated by comma or specified via multiple flags. -notifier.basicAuth.username array - Optional basic auth username for -notifier.url - Supports an array of values separated by comma or specified via multiple flags. + Optional basic auth username for -notifier.url + Supports an array of values separated by comma or specified via multiple flags. -notifier.bearerToken array - Optional bearer token for -notifier.url - Supports an array of values separated by comma or specified via multiple flags. + Optional bearer token for -notifier.url + Supports an array of values separated by comma or specified via multiple flags. -notifier.bearerTokenFile array - Optional path to bearer token file for -notifier.url - Supports an array of values separated by comma or specified via multiple flags. + Optional path to bearer token file for -notifier.url + Supports an array of values separated by comma or specified via multiple flags. -notifier.config string - Path to configuration file for notifiers + Path to configuration file for notifiers -notifier.oauth2.clientID array - Optional OAuth2 clientID to use for -notifier.url. If multiple args are set, then they are applied independently for the corresponding -notifier.url - Supports an array of values separated by comma or specified via multiple flags. + Optional OAuth2 clientID to use for -notifier.url. If multiple args are set, then they are applied independently for the corresponding -notifier.url + Supports an array of values separated by comma or specified via multiple flags. 
-notifier.oauth2.clientSecret array - Optional OAuth2 clientSecret to use for -notifier.url. If multiple args are set, then they are applied independently for the corresponding -notifier.url - Supports an array of values separated by comma or specified via multiple flags. + Optional OAuth2 clientSecret to use for -notifier.url. If multiple args are set, then they are applied independently for the corresponding -notifier.url + Supports an array of values separated by comma or specified via multiple flags. -notifier.oauth2.clientSecretFile array - Optional OAuth2 clientSecretFile to use for -notifier.url. If multiple args are set, then they are applied independently for the corresponding -notifier.url - Supports an array of values separated by comma or specified via multiple flags. + Optional OAuth2 clientSecretFile to use for -notifier.url. If multiple args are set, then they are applied independently for the corresponding -notifier.url + Supports an array of values separated by comma or specified via multiple flags. -notifier.oauth2.scopes array - Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'. If multiple args are set, then they are applied independently for the corresponding -notifier.url - Supports an array of values separated by comma or specified via multiple flags. + Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'. If multiple args are set, then they are applied independently for the corresponding -notifier.url + Supports an array of values separated by comma or specified via multiple flags. -notifier.oauth2.tokenUrl array - Optional OAuth2 tokenURL to use for -notifier.url. If multiple args are set, then they are applied independently for the corresponding -notifier.url - Supports an array of values separated by comma or specified via multiple flags. + Optional OAuth2 tokenURL to use for -notifier.url. 
If multiple args are set, then they are applied independently for the corresponding -notifier.url + Supports an array of values separated by comma or specified via multiple flags. -notifier.suppressDuplicateTargetErrors - Whether to suppress 'duplicate target' errors during discovery + Whether to suppress 'duplicate target' errors during discovery -notifier.tlsCAFile array - Optional path to TLS CA file to use for verifying connections to -notifier.url. By default system CA is used - Supports an array of values separated by comma or specified via multiple flags. + Optional path to TLS CA file to use for verifying connections to -notifier.url. By default system CA is used + Supports an array of values separated by comma or specified via multiple flags. -notifier.tlsCertFile array - Optional path to client-side TLS certificate file to use when connecting to -notifier.url - Supports an array of values separated by comma or specified via multiple flags. + Optional path to client-side TLS certificate file to use when connecting to -notifier.url + Supports an array of values separated by comma or specified via multiple flags. -notifier.tlsInsecureSkipVerify array - Whether to skip tls verification when connecting to -notifier.url - Supports array of values separated by comma or specified via multiple flags. + Whether to skip tls verification when connecting to -notifier.url + Supports array of values separated by comma or specified via multiple flags. -notifier.tlsKeyFile array - Optional path to client-side TLS certificate key to use when connecting to -notifier.url - Supports an array of values separated by comma or specified via multiple flags. + Optional path to client-side TLS certificate key to use when connecting to -notifier.url + Supports an array of values separated by comma or specified via multiple flags. -notifier.tlsServerName array - Optional TLS server name to use for connections to -notifier.url. 
By default the server name from -notifier.url is used - Supports an array of values separated by comma or specified via multiple flags. + Optional TLS server name to use for connections to -notifier.url. By default the server name from -notifier.url is used + Supports an array of values separated by comma or specified via multiple flags. -notifier.url array - Prometheus alertmanager URL, e.g. http://127.0.0.1:9093 - Supports an array of values separated by comma or specified via multiple flags. + Prometheus alertmanager URL, e.g. http://127.0.0.1:9093 + Supports an array of values separated by comma or specified via multiple flags. -pprofAuthKey string - Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings + Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings -promscrape.consul.waitTime duration - Wait time used by Consul service discovery. Default value is used if not set + Wait time used by Consul service discovery. Default value is used if not set -promscrape.consulSDCheckInterval duration - Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s) + Interval for checking for changes in Consul. This works only if consul_sd_configs is configured in '-promscrape.config' file. See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config for details (default 30s) -promscrape.discovery.concurrency int - The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) (default 100) + The maximum number of concurrent requests to Prometheus autodiscovery API (Consul, Kubernetes, etc.) 
(default 100) -promscrape.discovery.concurrentWaitTime duration - The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s) + The maximum duration for waiting to perform API requests if more than -promscrape.discovery.concurrency requests are simultaneously performed (default 1m0s) -remoteRead.basicAuth.password string - Optional basic auth password for -remoteRead.url + Optional basic auth password for -remoteRead.url -remoteRead.basicAuth.passwordFile string - Optional path to basic auth password to use for -remoteRead.url + Optional path to basic auth password to use for -remoteRead.url -remoteRead.basicAuth.username string - Optional basic auth username for -remoteRead.url + Optional basic auth username for -remoteRead.url -remoteRead.bearerToken string - Optional bearer auth token to use for -remoteRead.url. + Optional bearer auth token to use for -remoteRead.url. -remoteRead.bearerTokenFile string - Optional path to bearer token file to use for -remoteRead.url. + Optional path to bearer token file to use for -remoteRead.url. -remoteRead.disablePathAppend - Whether to disable automatic appending of '/api/v1/query' path to the configured -remoteRead.url. + Whether to disable automatic appending of '/api/v1/query' path to the configured -remoteRead.url. -remoteRead.ignoreRestoreErrors - Whether to ignore errors from remote storage when restoring alerts state on startup. (default true) + Whether to ignore errors from remote storage when restoring alerts state on startup. (default true) -remoteRead.lookback duration - Lookback defines how far to look into past for alerts timeseries. For example, if lookback=1h then range from now() to now()-1h will be scanned. (default 1h0m0s) + Lookback defines how far to look into past for alerts timeseries. For example, if lookback=1h then range from now() to now()-1h will be scanned. 
(default 1h0m0s) -remoteRead.oauth2.clientID string - Optional OAuth2 clientID to use for -remoteRead.url. + Optional OAuth2 clientID to use for -remoteRead.url. -remoteRead.oauth2.clientSecret string - Optional OAuth2 clientSecret to use for -remoteRead.url. + Optional OAuth2 clientSecret to use for -remoteRead.url. -remoteRead.oauth2.clientSecretFile string - Optional OAuth2 clientSecretFile to use for -remoteRead.url. + Optional OAuth2 clientSecretFile to use for -remoteRead.url. -remoteRead.oauth2.scopes string - Optional OAuth2 scopes to use for -remoteRead.url. Scopes must be delimited by ';'. + Optional OAuth2 scopes to use for -remoteRead.url. Scopes must be delimited by ';'. -remoteRead.oauth2.tokenUrl string - Optional OAuth2 tokenURL to use for -remoteRead.url. + Optional OAuth2 tokenURL to use for -remoteRead.url. -remoteRead.tlsCAFile string - Optional path to TLS CA file to use for verifying connections to -remoteRead.url. By default system CA is used + Optional path to TLS CA file to use for verifying connections to -remoteRead.url. By default system CA is used -remoteRead.tlsCertFile string - Optional path to client-side TLS certificate file to use when connecting to -remoteRead.url + Optional path to client-side TLS certificate file to use when connecting to -remoteRead.url -remoteRead.tlsInsecureSkipVerify - Whether to skip tls verification when connecting to -remoteRead.url + Whether to skip tls verification when connecting to -remoteRead.url -remoteRead.tlsKeyFile string - Optional path to client-side TLS certificate key to use when connecting to -remoteRead.url + Optional path to client-side TLS certificate key to use when connecting to -remoteRead.url -remoteRead.tlsServerName string - Optional TLS server name to use for connections to -remoteRead.url. By default the server name from -remoteRead.url is used + Optional TLS server name to use for connections to -remoteRead.url. 
By default the server name from -remoteRead.url is used -remoteRead.url vmalert - Optional URL to VictoriaMetrics or vmselect that will be used to restore alerts state. This configuration makes sense only if vmalert was configured with `remoteWrite.url` before and has been successfully persisted its state. E.g. http://127.0.0.1:8428. See also -remoteRead.disablePathAppend + Optional URL to VictoriaMetrics or vmselect that will be used to restore alerts state. This configuration makes sense only if vmalert was configured with `remoteWrite.url` before and has been successfully persisted its state. E.g. http://127.0.0.1:8428. See also -remoteRead.disablePathAppend -remoteWrite.basicAuth.password string - Optional basic auth password for -remoteWrite.url + Optional basic auth password for -remoteWrite.url -remoteWrite.basicAuth.passwordFile string - Optional path to basic auth password to use for -remoteWrite.url + Optional path to basic auth password to use for -remoteWrite.url -remoteWrite.basicAuth.username string - Optional basic auth username for -remoteWrite.url + Optional basic auth username for -remoteWrite.url -remoteWrite.bearerToken string - Optional bearer auth token to use for -remoteWrite.url. + Optional bearer auth token to use for -remoteWrite.url. -remoteWrite.bearerTokenFile string - Optional path to bearer token file to use for -remoteWrite.url. + Optional path to bearer token file to use for -remoteWrite.url. -remoteWrite.concurrency int - Defines number of writers for concurrent writing into remote querier (default 1) + Defines number of writers for concurrent writing into remote querier (default 1) -remoteWrite.disablePathAppend - Whether to disable automatic appending of '/api/v1/write' path to the configured -remoteWrite.url. + Whether to disable automatic appending of '/api/v1/write' path to the configured -remoteWrite.url. 
-remoteWrite.flushInterval duration - Defines interval of flushes to remote write endpoint (default 5s) + Defines interval of flushes to remote write endpoint (default 5s) -remoteWrite.maxBatchSize int - Defines defines max number of timeseries to be flushed at once (default 1000) + Defines defines max number of timeseries to be flushed at once (default 1000) -remoteWrite.maxQueueSize int - Defines the max number of pending datapoints to remote write endpoint (default 100000) + Defines the max number of pending datapoints to remote write endpoint (default 100000) -remoteWrite.oauth2.clientID string - Optional OAuth2 clientID to use for -remoteWrite.url. + Optional OAuth2 clientID to use for -remoteWrite.url. -remoteWrite.oauth2.clientSecret string - Optional OAuth2 clientSecret to use for -remoteWrite.url. + Optional OAuth2 clientSecret to use for -remoteWrite.url. -remoteWrite.oauth2.clientSecretFile string - Optional OAuth2 clientSecretFile to use for -remoteWrite.url. + Optional OAuth2 clientSecretFile to use for -remoteWrite.url. -remoteWrite.oauth2.scopes string - Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'. + Optional OAuth2 scopes to use for -notifier.url. Scopes must be delimited by ';'. -remoteWrite.oauth2.tokenUrl string - Optional OAuth2 tokenURL to use for -notifier.url. + Optional OAuth2 tokenURL to use for -notifier.url. -remoteWrite.tlsCAFile string - Optional path to TLS CA file to use for verifying connections to -remoteWrite.url. By default system CA is used + Optional path to TLS CA file to use for verifying connections to -remoteWrite.url. 
By default system CA is used -remoteWrite.tlsCertFile string - Optional path to client-side TLS certificate file to use when connecting to -remoteWrite.url + Optional path to client-side TLS certificate file to use when connecting to -remoteWrite.url -remoteWrite.tlsInsecureSkipVerify - Whether to skip tls verification when connecting to -remoteWrite.url + Whether to skip tls verification when connecting to -remoteWrite.url -remoteWrite.tlsKeyFile string - Optional path to client-side TLS certificate key to use when connecting to -remoteWrite.url + Optional path to client-side TLS certificate key to use when connecting to -remoteWrite.url -remoteWrite.tlsServerName string - Optional TLS server name to use for connections to -remoteWrite.url. By default the server name from -remoteWrite.url is used + Optional TLS server name to use for connections to -remoteWrite.url. By default the server name from -remoteWrite.url is used -remoteWrite.url string - Optional URL to VictoriaMetrics or vminsert where to persist alerts state and recording rules results in form of timeseries. For example, if -remoteWrite.url=http://127.0.0.1:8428 is specified, then the alerts state will be written to http://127.0.0.1:8428/api/v1/write . See also -remoteWrite.disablePathAppend + Optional URL to VictoriaMetrics or vminsert where to persist alerts state and recording rules results in form of timeseries. For example, if -remoteWrite.url=http://127.0.0.1:8428 is specified, then the alerts state will be written to http://127.0.0.1:8428/api/v1/write . See also -remoteWrite.disablePathAppend -replay.maxDatapointsPerQuery int - Max number of data points expected in one request. The higher the value, the less requests will be made during replay. (default 1000) + Max number of data points expected in one request. The higher the value, the less requests will be made during replay. 
(default 1000) -replay.ruleRetryAttempts int - Defines how many retries to make before giving up on rule if request for it returns an error. (default 5) + Defines how many retries to make before giving up on rule if request for it returns an error. (default 5) -replay.rulesDelay duration - Delay between rules evaluation within the group. Could be important if there are chained rules inside of the groupand processing need to wait for previous rule results to be persisted by remote storage before evaluating the next rule.Keep it equal or bigger than -remoteWrite.flushInterval. (default 1s) + Delay between rules evaluation within the group. Could be important if there are chained rules inside of the groupand processing need to wait for previous rule results to be persisted by remote storage before evaluating the next rule.Keep it equal or bigger than -remoteWrite.flushInterval. (default 1s) -replay.timeFrom string - The time filter in RFC3339 format to select time series with timestamp equal or higher than provided value. E.g. '2020-01-01T20:07:00Z' + The time filter in RFC3339 format to select time series with timestamp equal or higher than provided value. E.g. '2020-01-01T20:07:00Z' -replay.timeTo string - The time filter in RFC3339 format to select timeseries with timestamp equal or lower than provided value. E.g. '2020-01-01T20:07:00Z' + The time filter in RFC3339 format to select timeseries with timestamp equal or lower than provided value. E.g. '2020-01-01T20:07:00Z' -rule array - Path to the file with alert rules. - Supports patterns. Flag can be specified multiple times. - Examples: - -rule="/path/to/file". Path to a single file with alerting rules - -rule="dir/*.yaml" -rule="/*.yaml". Relative path to all .yaml files in "dir" folder, - absolute path to all .yaml files in root. - Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars. - Supports an array of values separated by comma or specified via multiple flags. 
+ Path to the file with alert rules. + Supports patterns. Flag can be specified multiple times. + Examples: + -rule="/path/to/file". Path to a single file with alerting rules + -rule="dir/*.yaml" -rule="/*.yaml". Relative path to all .yaml files in "dir" folder, + absolute path to all .yaml files in root. + Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars. + Supports an array of values separated by comma or specified via multiple flags. -rule.configCheckInterval duration - Interval for checking for changes in '-rule' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes. DEPRECATED - see '-configCheckInterval' instead + Interval for checking for changes in '-rule' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes. DEPRECATED - see '-configCheckInterval' instead -rule.maxResolveDuration duration - Limits the maximum duration for automatic alert expiration, which is by default equal to 3 evaluation intervals of the parent group. + Limits the maximum duration for automatic alert expiration, which is by default equal to 3 evaluation intervals of the parent group. -rule.resendDelay duration - Minimum amount of time to wait before resending an alert to notifier + Minimum amount of time to wait before resending an alert to notifier -rule.validateExpressions - Whether to validate rules expressions via MetricsQL engine (default true) + Whether to validate rules expressions via MetricsQL engine (default true) -rule.validateTemplates - Whether to validate annotation and label templates (default true) + Whether to validate annotation and label templates (default true) -tls - Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set + Whether to enable TLS (aka HTTPS) for incoming requests. 
-tlsCertFile and -tlsKeyFile must be set if -tls is set -tlsCertFile string - Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated -tlsKeyFile string - Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated -version - Show VictoriaMetrics version + Show VictoriaMetrics version ``` ### Hot config reload + `vmalert` supports "hot" config reload via the following methods: + * send SIGHUP signal to `vmalert` process; * send GET request to `/-/reload` endpoint; * configure `-configCheckInterval` flag for periodic reload @@ -808,6 +825,7 @@ just add them in address: `-datasource.url=http://localhost:8428?nocache=1`. To set additional URL params for specific [group of rules](#Groups) modify the `params` group: + ```yaml groups: - name: TestGroup @@ -815,6 +833,7 @@ groups: denyPartialResponse: ["true"] extra_label: ["env=dev"] ``` + Please note, `params` are used only for executing rules expressions (requests to `datasource.url`). If there would be a conflict between URL params set in `datasource.url` flag and params in group definition the latter will have higher priority. @@ -822,15 +841,17 @@ the latter will have higher priority. 
### Notifier configuration file Notifier also supports configuration via file specified with flag `notifier.config`: + ``` ./bin/vmalert -rule=app/vmalert/config/testdata/rules.good.rules \ - -datasource.url=http://localhost:8428 \ - -notifier.config=app/vmalert/notifier/testdata/consul.good.yaml + -datasource.url=http://localhost:8428 \ + -notifier.config=app/vmalert/notifier/testdata/consul.good.yaml ``` -The configuration file allows to configure static notifiers or discover notifiers via +The configuration file allows to configure static notifiers or discover notifiers via [Consul](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config). For example: + ``` static_configs: - targets: @@ -847,6 +868,7 @@ The list of configured or discovered Notifiers can be explored via [UI](#Web). The configuration file [specification](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/config.go) is the following: + ``` # Per-target Notifier timeout when pushing alerts. [ timeout: | default = 10s ] @@ -901,7 +923,6 @@ relabel_configs: The configuration file can be [hot-reloaded](#hot-config-reload). - ## Contributing `vmalert` is mostly designed and built by VictoriaMetrics community. @@ -912,8 +933,8 @@ software. Please keep simplicity as the main priority. It is recommended using [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) -- `vmalert` is located in `vmutils-*` archives there. +* `vmalert` is located in `vmutils-*` archives there. ### Development build @@ -927,7 +948,6 @@ It is recommended using 2. Run `make vmalert-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics). It builds `vmalert-prod` binary and puts it into the `bin` folder. - ### ARM build ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://blog.cloudflare.com/arm-takes-wing/). 
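The hot config reload methods mentioned earlier in this diff for `vmalert` (a SIGHUP signal or a GET request to `/-/reload`) could be triggered along these lines. This is a minimal sketch, not part of the documented tooling: the helper function names are hypothetical, and the default address `http://localhost:8880` is an assumption that should be replaced with whatever `-httpListenAddr` the `vmalert` instance actually uses.

```bash
#!/usr/bin/env bash
# Sketch of two hot-reload triggers for vmalert.
# VMALERT_ADDR is an assumption; point it at the real -httpListenAddr.
VMALERT_ADDR="${VMALERT_ADDR:-http://localhost:8880}"

# Build the reload endpoint URL, dropping a trailing slash from the address.
reload_url() {
  printf '%s/-/reload\n' "${VMALERT_ADDR%/}"
}

# Trigger the reload with a GET request to /-/reload.
reload_via_http() {
  curl -fsS "$(reload_url)"
}

# Trigger the reload by sending SIGHUP to the vmalert process with the given PID.
reload_via_sighup() {
  kill -HUP "$1"
}
```

Calling `reload_via_http` after editing a rules file, or `reload_via_sighup <pid>` when the HTTP port is not reachable, should have the same effect; the periodic `-configCheckInterval` flag remains the option that needs no external trigger.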
diff --git a/docs/vmauth.md b/docs/vmauth.md index e2c8ba182..74ccb7011 100644 --- a/docs/vmauth.md +++ b/docs/vmauth.md @@ -14,7 +14,7 @@ The `-auth.config` can point to either local file or to http url. Just download `vmutils-*` archive from [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases), unpack it and pass the following flag to `vmauth` binary in order to start authorizing and routing requests: -``` +```bash /path/to/vmauth -auth.config=/path/to/auth/config.yml ``` @@ -133,13 +133,13 @@ It is expected that all the backend services protected by `vmauth` are located i Do not transfer Basic Auth headers in plaintext over untrusted networks. Enable https. This can be done by passing the following `-tls*` command-line flags to `vmauth`: -``` +```bash -tls - Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set + Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set -tlsCertFile string - Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs, since RSA certs are slow + Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs, since RSA certs are slow -tlsKeyFile string - Path to file with TLS key. Used only if -tls is set + Path to file with TLS key. Used only if -tls is set ``` Alternatively, [https termination proxy](https://en.wikipedia.org/wiki/TLS_termination_proxy) may be put in front of `vmauth`. @@ -221,7 +221,7 @@ The collected profiles may be analyzed with [go tool pprof](https://github.com/g Pass `-help` command-line arg to `vmauth` in order to see all the configuration options: -``` +```bash ./vmauth -help vmauth authenticates and authorizes incoming requests and proxies them to VictoriaMetrics. 
@@ -229,70 +229,70 @@ vmauth authenticates and authorizes incoming requests and proxies them to Victor See the docs at https://docs.victoriametrics.com/vmauth.html . -auth.config string - Path to auth config. It can point either to local file or to http url. See https://docs.victoriametrics.com/vmauth.html for details on the format of this auth config + Path to auth config. It can point either to local file or to http url. See https://docs.victoriametrics.com/vmauth.html for details on the format of this auth config -enableTCP6 - Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used + Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used -envflag.enable - Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details + Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details -envflag.prefix string - Prefix for environment variables if -envflag.enable is set + Prefix for environment variables if -envflag.enable is set -eula - By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf + By specifying this flag, you confirm that you have an enterprise license and accept the EULA https://victoriametrics.com/assets/VM_EULA.pdf -fs.disableMmap - Whether to use pread() instead of mmap() for reading data files. 
By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread() + Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread() -http.connTimeout duration - Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) + Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) -http.disableResponseCompression - Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth + Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth -http.idleConnTimeout duration - Timeout for incoming idle http connections (default 1m0s) + Timeout for incoming idle http connections (default 1m0s) -http.maxGracefulShutdownDuration duration - The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s) + The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s) -http.pathPrefix string - An optional prefix to add to all the paths handled by http server. 
For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus + An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus -http.shutdownDelay duration - Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers + Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers -httpAuth.password string - Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty + Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty -httpAuth.username string - Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password + Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password -httpListenAddr string - TCP address to listen for http connections (default ":8427") + TCP address to listen for http connections (default ":8427") -logInvalidAuthTokens - Whether to log requests with invalid auth tokens. Such requests are always counted at vmauth_http_request_errors_total{reason="invalid_auth_token"} metric, which is exposed at /metrics page + Whether to log requests with invalid auth tokens. 
Such requests are always counted at vmauth_http_request_errors_total{reason="invalid_auth_token"} metric, which is exposed at /metrics page -loggerDisableTimestamps - Whether to disable writing timestamps in logs + Whether to disable writing timestamps in logs -loggerErrorsPerSecondLimit int - Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit + Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit -loggerFormat string - Format for logs. Possible values: default, json (default "default") + Format for logs. Possible values: default, json (default "default") -loggerLevel string - Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") + Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") -loggerOutput string - Output for the logs. Supported values: stderr, stdout (default "stderr") + Output for the logs. Supported values: stderr, stdout (default "stderr") -loggerTimezone string - Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") + Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") -loggerWarnsPerSecondLimit int - Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit + Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. 
Zero values disable the rate limit -maxIdleConnsPerBackend int - The maximum number of idle connections vmauth can open per each backend host (default 100) + The maximum number of idle connections vmauth can open per each backend host (default 100) -memory.allowedBytes size - Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -memory.allowedPercent float - Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) + Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) -metricsAuthKey string - Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings + Auth key for /metrics. It must be passed via authKey query arg. 
It overrides httpAuth.* settings -pprofAuthKey string - Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings + Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings -reloadAuthKey string - Auth key for /-/reload http endpoint. It must be passed as authKey=... + Auth key for /-/reload http endpoint. It must be passed as authKey=... -tls - Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set + Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set -tlsCertFile string - Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower. The provided certificate file is automatically re-read every second, so it can be dynamically updated -tlsKeyFile string - Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated + Path to file with TLS key. Used only if -tls is set. The provided key file is automatically re-read every second, so it can be dynamically updated -version - Show VictoriaMetrics version + Show VictoriaMetrics version ``` diff --git a/docs/vmbackup.md b/docs/vmbackup.md index de919f2d8..f5e1b1d70 100644 --- a/docs/vmbackup.md +++ b/docs/vmbackup.md @@ -26,14 +26,13 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time- See also [vmbackupmanager](https://docs.victoriametrics.com/vmbackupmanager.html) tool built on top of `vmbackup`. This tool simplifies creation of hourly, daily, weekly and monthly backups. 
- ## Use cases ### Regular backups Regular backup can be performed with the following command: -``` +```bash vmbackup -storageDataPath= -snapshotName= -dst=gs:/// ``` @@ -43,36 +42,33 @@ vmbackup -storageDataPath= -snapshotName=` is an already existing name for [GCS bucket](https://cloud.google.com/storage/docs/creating-buckets). * `` is the destination path where new backup will be placed. - ### Regular backups with server-side copy from existing backup If the destination GCS bucket already contains the previous backup at `-origin` path, then new backup can be sped up with the following command: -``` +```bash vmbackup -storageDataPath= -snapshotName= -dst=gs:/// -origin=gs:/// ``` It saves time and network bandwidth costs by performing server-side copy for the shared data from the `-origin` to `-dst`. - ### Incremental backups Incremental backups are performed if `-dst` points to an already existing backup. In this case only new data is uploaded to remote storage. It saves time and network bandwidth costs when working with big backups: -``` +```bash vmbackup -storageDataPath= -snapshotName= -dst=gs:/// ``` - ### Smart backups Smart backups mean storing full daily backups into `YYYYMMDD` folders and creating incremental hourly backup into `latest` folder: * Run the following command every hour: -``` +```bash vmbackup -snapshotName= -dst=gs:///latest ``` @@ -81,13 +77,12 @@ The command will upload only changed data to `gs:///latest`. * Run the following command once a day: -``` +```bash vmbackup -snapshotName= -dst=gs:/// -origin=gs:///latest ``` Where `` is the snapshot for the last day ``. - This approach saves network bandwidth costs on hourly backups (since they are incremental) and allows recovering data from either the last hour (`latest` backup) or from any day (`YYYYMMDD` backups). Note that hourly backup shouldn't run when creating daily backup.
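The smart backup schedule from the hunk above can be sketched as a small shell helper. This is an illustration under stated assumptions, not the project's own tooling: the bucket name `my-bucket`, the snapshot names and the function names are hypothetical placeholders, and the script only prints the `vmbackup` commands built from the flags shown in the diff rather than running them.

```bash
#!/usr/bin/env bash
# Sketch of the "smart backups" schedule: hourly incremental backups go to
# the `latest` folder, daily backups go to a YYYYMMDD folder and use
# `latest` as -origin for server-side copy.
# BUCKET and the snapshot names are hypothetical placeholders.
BUCKET="${BUCKET:-my-bucket}"

# Print the hourly command for the given snapshot name.
hourly_backup_cmd() {
  local snapshot="$1"
  printf 'vmbackup -snapshotName=%s -dst=gs://%s/latest\n' "$snapshot" "$BUCKET"
}

# Print the daily command for the given snapshot name and day (YYYYMMDD).
daily_backup_cmd() {
  local snapshot="$1" day="$2"
  printf 'vmbackup -snapshotName=%s -dst=gs://%s/%s -origin=gs://%s/latest\n' \
    "$snapshot" "$BUCKET" "$day" "$BUCKET"
}

hourly_backup_cmd "snapshot-hourly"
daily_backup_cmd "snapshot-daily" "$(date +%Y%m%d)"
```

Printing the commands instead of executing them keeps the sketch safe to try; a scheduler such as cron would run the hourly and daily variants at different times, since the hourly backup shouldn't run while the daily one is being created.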
@@ -95,7 +90,6 @@ Do not forget to remove old snapshots and backups when they are no longer needed See also [vmbackupmanager tool](https://docs.victoriametrics.com/vmbackupmanager.html) for automating smart backups. - ## How does it work? The backup algorithm is the following: @@ -112,16 +106,15 @@ Such splitting minimizes the amounts of data to re-transfer after temporary erro `vmbackup` relies on [instant snapshot](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282) properties: -- All the files in the snapshot are immutable. -- Old files are periodically merged into new files. -- Smaller files have higher probability to be merged. -- Consecutive snapshots share many identical files. +* All the files in the snapshot are immutable. +* Old files are periodically merged into new files. +* Smaller files have higher probability to be merged. +* Consecutive snapshots share many identical files. These properties allow performing fast and cheap incremental backups and server-side copying from `-origin` paths. See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883) for more details. `vmbackup` can work improperly or slowly when these properties are violated. - ## Troubleshooting * If the backup is slow, then try setting higher value for `-concurrency` flag. This will increase the number of concurrent workers that upload data to backup storage. @@ -130,15 +123,14 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time- * Backups created from [single-node VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html) cannot be restored at [cluster VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) and vice versa. - ## Advanced usage - * Obtaining credentials from a file. 
Add flag `-credsFilePath=/etc/credentials` with the following content: for s3 (aws, minio or other s3 compatible storages): + ```bash [default] aws_access_key_id=theaccesskey @@ -146,6 +138,7 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time- ``` for gce cloud storage: + ```json { "type": "service_account", @@ -163,7 +156,8 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time- * Usage with s3 custom url endpoint. It is possible to use `vmbackup` with s3 compatible storages like minio, cloudian, etc. You have to add a custom url endpoint via flag: -``` + +```bash # for minio -customS3Endpoint=http://localhost:9000 @@ -173,102 +167,100 @@ See [this article](https://medium.com/@valyala/speeding-up-backups-for-big-time- * Run `vmbackup -help` in order to see all the available options: -``` +```bash -concurrency int - The number of concurrent workers. Higher concurrency may reduce backup duration (default 10) + The number of concurrent workers. Higher concurrency may reduce backup duration (default 10) -configFilePath string - Path to file with S3 configs. Configs are loaded from default location if not set. - See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html + Path to file with S3 configs. Configs are loaded from default location if not set. + See https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html -configProfile string - Profile name for S3 configs. If no set, the value of the environment variable will be loaded (AWS_PROFILE or AWS_DEFAULT_PROFILE), or if both not set, DefaultSharedConfigProfile is used + Profile name for S3 configs. If no set, the value of the environment variable will be loaded (AWS_PROFILE or AWS_DEFAULT_PROFILE), or if both not set, DefaultSharedConfigProfile is used -credsFilePath string - Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set. 
- See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html + Path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set. + See https://cloud.google.com/iam/docs/creating-managing-service-account-keys and https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html -customS3Endpoint string - Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set + Custom S3 endpoint for use with S3-compatible storages (e.g. MinIO). S3 is used if not set -dst string - Where to put the backup on the remote storage. Example: gs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir - -dst can point to the previous backup. In this case incremental backup is performed, i.e. only changed data is uploaded + Where to put the backup on the remote storage. Example: gs://bucket/path/to/backup/dir, s3://bucket/path/to/backup/dir or fs:///path/to/local/backup/dir + -dst can point to the previous backup. In this case incremental backup is performed, i.e. only changed data is uploaded -enableTCP6 - Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used + Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used -envflag.enable - Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details + Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. 
See https://docs.victoriametrics.com/#environment-variables for more details -envflag.prefix string - Prefix for environment variables if -envflag.enable is set + Prefix for environment variables if -envflag.enable is set -fs.disableMmap - Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread() + Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread() -http.connTimeout duration - Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) + Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s) -http.disableResponseCompression - Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth + Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth -http.idleConnTimeout duration - Timeout for incoming idle http connections (default 1m0s) + Timeout for incoming idle http connections (default 1m0s) -http.maxGracefulShutdownDuration duration - The maximum duration for a graceful shutdown of the HTTP server. 
A highly loaded server may require increased value for a graceful shutdown (default 7s) + The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s) -http.pathPrefix string - An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus + An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus -http.shutdownDelay duration - Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers + Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers -httpAuth.password string - Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty + Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty -httpAuth.username string - Username for HTTP Basic Auth. The authentication is disabled if empty. See also -httpAuth.password + Username for HTTP Basic Auth. The authentication is disabled if empty. 
See also -httpAuth.password -httpListenAddr string - TCP address for exporting metrics at /metrics page (default ":8420") + TCP address for exporting metrics at /metrics page (default ":8420") -loggerDisableTimestamps - Whether to disable writing timestamps in logs + Whether to disable writing timestamps in logs -loggerErrorsPerSecondLimit int - Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit + Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit -loggerFormat string - Format for logs. Possible values: default, json (default "default") + Format for logs. Possible values: default, json (default "default") -loggerLevel string - Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") + Minimum level of errors to log. Possible values: INFO, WARN, ERROR, FATAL, PANIC (default "INFO") -loggerOutput string - Output for the logs. Supported values: stderr, stdout (default "stderr") + Output for the logs. Supported values: stderr, stdout (default "stderr") -loggerTimezone string - Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") + Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC") -loggerWarnsPerSecondLimit int - Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit + Per-second limit on the number of WARN messages. 
If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit -maxBytesPerSecond size - The maximum upload speed. There is no limit if it is set to 0 - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + The maximum upload speed. There is no limit if it is set to 0 + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -memory.allowedBytes size - Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage - Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) + Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage + Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0) -memory.allowedPercent float - Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) + Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. 
Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60) -metricsAuthKey string - Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings + Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings -origin string - Optional origin directory on the remote storage with old backup for server-side copying when performing full backup. This speeds up full backups + Optional origin directory on the remote storage with old backup for server-side copying when performing full backup. This speeds up full backups -pprofAuthKey string - Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings + Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings -s3ForcePathStyle - Prefixing endpoint with bucket name when set false, true by default. (default true) + Prefixing endpoint with bucket name when set false, true by default. (default true) -snapshot.createURL string - VictoriaMetrics create snapshot url. When this is given a snapshot will automatically be created during backup. Example: http://victoriametrics:8428/snapshot/create . There is no need in setting -snapshotName if -snapshot.createURL is set + VictoriaMetrics create snapshot url. When this is given a snapshot will automatically be created during backup. Example: http://victoriametrics:8428/snapshot/create . There is no need in setting -snapshotName if -snapshot.createURL is set -snapshot.deleteURL string - VictoriaMetrics delete snapshot url. Optional. Will be generated from -snapshot.createURL if not provided. All created snapshots will be automatically deleted. Example: http://victoriametrics:8428/snapshot/delete + VictoriaMetrics delete snapshot url. Optional. Will be generated from -snapshot.createURL if not provided. All created snapshots will be automatically deleted. 
Example: http://victoriametrics:8428/snapshot/delete -snapshotName string - Name for the snapshot to backup. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots. There is no need in setting -snapshotName if -snapshot.createURL is set + Name for the snapshot to backup. See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots. There is no need in setting -snapshotName if -snapshot.createURL is set -storageDataPath string - Path to VictoriaMetrics data. Must match -storageDataPath from VictoriaMetrics or vmstorage (default "victoria-metrics-data") + Path to VictoriaMetrics data. Must match -storageDataPath from VictoriaMetrics or vmstorage (default "victoria-metrics-data") -tls - Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set + Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set -tlsCertFile string - Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower + Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower -tlsKeyFile string - Path to file with TLS key. Used only if -tls is set + Path to file with TLS key. Used only if -tls is set -version - Show VictoriaMetrics version + Show VictoriaMetrics version ``` - ## How to build from sources It is recommended using [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - see `vmutils-*` archives there. - ### Development build 1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.17. 
diff --git a/docs/vmbackupmanager.md b/docs/vmbackupmanager.md index da1c159ef..68692a85b 100644 --- a/docs/vmbackupmanager.md +++ b/docs/vmbackupmanager.md @@ -13,11 +13,10 @@ The required flags for running the service are as follows: * -eula - should be true and means that you have the legal right to run a backup manager. That can either be a signed contract or an email with confirmation to run the service in a trial period * -storageDataPath - path to VictoriaMetrics or vmstorage data path to make backup from -* -snapshot.createURL - VictoriaMetrics creates snapshot URL which will automatically be created during backup. Example: http://victoriametrics:8428/snapshot/create +* -snapshot.createURL - VictoriaMetrics creates snapshot URL which will automatically be created during backup. Example: <http://victoriametrics:8428/snapshot/create> * -dst - backup destination at s3, gcs or local filesystem * -credsFilePath - path to file with GCS or S3 credentials. Credentials are loaded from default locations if not set. See [https://cloud.google.com/iam/docs/creating-managing-service-account-keys](https://cloud.google.com/iam/docs/creating-managing-service-account-keys) and [https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html](https://docs.aws.amazon.com/general/latest/gr/aws-security-credentials.html) - Backup schedule is controlled by the following flags: * -disableHourly - disable hourly run. Default false @@ -27,7 +26,6 @@ Backup schedule is controlled by the following flags: By default, all flags are turned on and Backup Manager backups data every hour for every interval (hourly, daily, weekly and monthly). - The backup manager creates the following directory hierarchy at **-dst**: * /latest/ - contains the latest backup @@ -36,7 +34,6 @@ The backup manager creates the following directory hierarchy at **-dst**: * /weekly/ - contains weekly backups. Each backup is named as *YYYY-WW* * /monthly/ - contains monthly backups.
Each backup is named as *YYYY-MM* - To get the full list of supported flags please run the following command: ```console @@ -52,7 +49,6 @@ There are two flags which could help with performance tuning: * -maxBytesPerSecond - the maximum upload speed. There is no limit if it is set to 0 * -concurrency - The number of concurrent workers. Higher concurrency may improve upload speed (default 10) - ## Example of Usage GCS and cluster version. You need to have a credentials file in json format with following structure @@ -100,11 +96,11 @@ info VictoriaMetrics/lib/storage/storage.go:319 deleted snapshot "/vmstora The result on the GCS bucket -- The root folder +* The root folder ![root](vmbackupmanager_root_folder.png) -- The latest folder +* The latest folder ![latest](vmbackupmanager_latest_folder.png) @@ -123,7 +119,6 @@ Let’s assume we have a backup manager collecting daily backups for the past 10 ![daily](vmbackupmanager_rp_daily_1.png) - We enable backup retention policy for backup manager by using following configuration: ```console diff --git a/docs/vmctl.md b/docs/vmctl.md index b1df62e8a..a0e581dbe 100644 --- a/docs/vmctl.md +++ b/docs/vmctl.md @@ -7,18 +7,20 @@ sort: 8 VictoriaMetrics command-line tool Features: + - [x] Prometheus: migrate data from Prometheus to VictoriaMetrics using snapshot API - [x] Thanos: migrate data from Thanos to VictoriaMetrics - [ ] ~~Prometheus: migrate data from Prometheus to VictoriaMetrics by query~~(discarded) - [x] InfluxDB: migrate data from InfluxDB to VictoriaMetrics - [x] OpenTSDB: migrate data from OpenTSDB to VictoriaMetrics -- [ ] Storage Management: data re-balancing between nodes +- [ ] Storage Management: data re-balancing between nodes -vmctl acts as a proxy between data source ([Prometheus](#migrating-data-from-prometheus), +vmctl acts as a proxy between data source ([Prometheus](#migrating-data-from-prometheus), [InfluxDB](#migrating-data-from-influxdb-1x), [VictoriaMetrics](##migrating-data-from-victoriametrics), 
etc.) -and destination - VictoriaMetrics single or cluster version. To see the full list of supported modes +and destination - VictoriaMetrics single or cluster version. To see the full list of supported modes run the following command: -``` + +```bash ./vmctl --help NAME: vmctl - VictoriaMetrics command-line tool @@ -35,6 +37,7 @@ COMMANDS: Each mode has its own unique set of flags specific (e.g. prefixed with `influx` for influx mode) to the data source and common list of flags for destination (prefixed with `vm` for VictoriaMetrics): + ``` ./vmctl influx --help OPTIONS: @@ -50,10 +53,11 @@ Please note, that vmctl performs initial readiness check for the given address b ``` When doing a migration user needs to specify flags for source (where and how to fetch data) and for -destination (where to migrate data). Every mode has additional details and nuances, please see +destination (where to migrate data). Every mode has additional details and nuances, please see them below in corresponding sections. For the destination flags see the full description by running the following command: + ``` ./vmctl influx --help | grep vm- ``` @@ -63,14 +67,13 @@ has additional sections with description below. Details about tweaking and adjus are explained in [Tuning](#tuning) section. Please note, that if you're going to import data into VictoriaMetrics cluster do not -forget to specify the `--vm-account-id` flag. See more details for cluster version +forget to specify the `--vm-account-id` flag. See more details for cluster version [here](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster). ## Articles -* [How to migrate data from Prometheus](https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-d44a6728f043) -* [How to migrate data from Prometheus. 
Filtering and modifying time series](https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-filtering-and-modifying-time-series-6d40cea4bf21) - +- [How to migrate data from Prometheus](https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-d44a6728f043) +- [How to migrate data from Prometheus. Filtering and modifying time series](https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-filtering-and-modifying-time-series-6d40cea4bf21) ## Migrating data from OpenTSDB @@ -83,16 +86,21 @@ See `./vmctl opentsdb --help` for details and full list of flags. OpenTSDB migration works like so: 1. Find metrics based on selected filters (or the default filter set ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']) - * e.g. `curl -Ss "http://opentsdb:4242/api/suggest?type=metrics&q=sys"` + +- e.g. `curl -Ss "http://opentsdb:4242/api/suggest?type=metrics&q=sys"` + 2. Find series associated with each returned metric - * e.g. `curl -Ss "http://opentsdb:4242/api/search/lookup?m=system.load5&limit=1000000"` + +- e.g. `curl -Ss "http://opentsdb:4242/api/search/lookup?m=system.load5&limit=1000000"` + 3. Download data for each series in chunks defined in the CLI switches - * e.g. `-retention=sum-1m-avg:1h:90d` == - * `curl -Ss "http://opentsdb:4242/api/query?start=1h-ago&end=now&m=sum:1m-avg-none:system.load5\{host=host1\}"` - * `curl -Ss "http://opentsdb:4242/api/query?start=2h-ago&end=1h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"` - * `curl -Ss "http://opentsdb:4242/api/query?start=3h-ago&end=2h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"` - * ... - * `curl -Ss "http://opentsdb:4242/api/query?start=2160h-ago&end=2159h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"` + +- e.g. 
`-retention=sum-1m-avg:1h:90d` == + - `curl -Ss "http://opentsdb:4242/api/query?start=1h-ago&end=now&m=sum:1m-avg-none:system.load5\{host=host1\}"` + - `curl -Ss "http://opentsdb:4242/api/query?start=2h-ago&end=1h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"` + - `curl -Ss "http://opentsdb:4242/api/query?start=3h-ago&end=2h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"` + - ... + - `curl -Ss "http://opentsdb:4242/api/query?start=2160h-ago&end=2159h-ago&m=sum:1m-avg-none:system.load5\{host=host1\}"` This means that we must stream data from OpenTSDB to VictoriaMetrics in chunks. This is where concurrency for OpenTSDB comes in. We can query multiple chunks at once, but we shouldn't perform too many chunks at a time to avoid overloading the OpenTSDB cluster. @@ -111,6 +119,7 @@ Found 9 metrics to import. Continue? [Y/n] Starting with a relatively simple retention string (`sum-1m-avg:1h:30d`), let's describe how this is converted into actual queries. There are two essential parts of a retention string: + 1. [aggregation](#aggregation) 2. [windows/time ranges](#windows) @@ -119,8 +128,9 @@ There are two essential parts of a retention string: Retention strings essentially define the two levels of aggregation for our collected series. `sum-1m-avg` would become: -* First order: `sum` -* Second order: `1m-avg-none` + +- First order: `sum` +- Second order: `1m-avg-none` ##### First Order Aggregations @@ -141,6 +151,7 @@ We do not allow for defining the "null value" portion of the rollup window (e.g. #### Windows There are two important windows we define in a retention string: + 1. the "chunk" range of each query 2. The time range we will be querying on with that "chunk" @@ -186,8 +197,8 @@ See `./vmctl influx --help` for details and full list of flags. To use migration tool please specify the InfluxDB address `--influx-addr`, the database `--influx-database` and VictoriaMetrics address `--vm-addr`. 
Flag `--vm-addr` for single-node VM is usually equal to `--httpListenAddr`, and for cluster version -is equal to `--httpListenAddr` flag of vminsert component. Please note, that vmctl performs initial readiness check for the given address -by checking `/health` endpoint. For cluster version it is additionally required to specify the `--vm-account-id` flag. +is equal to `--httpListenAddr` flag of vminsert component. Please note, that vmctl performs initial readiness check for the given address +by checking `/health` endpoint. For cluster version it is additionally required to specify the `--vm-account-id` flag. See more details for cluster version [here](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster). As soon as required flags are provided and all endpoints are accessible, `vmctl` will start the InfluxDB scheme exploration. @@ -195,8 +206,9 @@ Basically, it just fetches all fields and timeseries from the provided database Then `vmctl` sends fetch requests for each timeseries to InfluxDB one by one and pass results to VM importer. VM importer then accumulates received samples in batches and sends import requests to VM. -The importing process example for local installation of InfluxDB(`http://localhost:8086`) +The importing process example for local installation of InfluxDB(`http://localhost:8086`) and single-node VictoriaMetrics(`http://localhost:8428`): + ``` ./vmctl influx --influx-database benchmark InfluxDB import mode @@ -216,25 +228,27 @@ Found 40000 timeseries to import. Continue? [Y/n] y bytes/s: 5.4 MB; import requests: 40001; 2020/01/18 21:19:00 Total time: 31m48.467044016s -``` +``` ### Data mapping Vmctl maps InfluxDB data the same way as VictoriaMetrics does by using the following rules: -* `influx-database` arg is mapped into `db` label value unless `db` tag exists in the InfluxDB line. -* Field names are mapped to time series names prefixed with {measurement}{separator} value, -where {separator} equals to _ by default. 
+- `influx-database` arg is mapped into `db` label value unless `db` tag exists in the InfluxDB line. +- Field names are mapped to time series names prefixed with {measurement}{separator} value, +where {separator} equals to _ by default. It can be changed with `--influx-measurement-field-separator` command-line flag. -* Field values are mapped to time series values. -* Tags are mapped to Prometheus labels format as-is. +- Field values are mapped to time series values. +- Tags are mapped to Prometheus labels format as-is. For example, the following InfluxDB line: + ``` foo,tag1=value1,tag2=value2 field1=12,field2=40 ``` is converted into the following Prometheus format data points: + ``` foo_field1{tag1="value1", tag2="value2"} 12 foo_field2{tag1="value1", tag2="value2"} 40 @@ -242,7 +256,7 @@ foo_field2{tag1="value1", tag2="value2"} 40 ### Configuration -The configuration flags should contain self-explanatory descriptions. +The configuration flags should contain self-explanatory descriptions. ### Filtering @@ -250,6 +264,7 @@ The filtering consists of two parts: timeseries and time. The first step of application is to select all available timeseries for given database and retention. User may specify additional filtering condition via `--influx-filter-series` flag. For example: + ``` ./vmctl influx --influx-database benchmark \ --influx-filter-series "on benchmark from cpu where hostname='host_1703'" @@ -260,13 +275,15 @@ InfluxDB import mode 2020/01/26 14:23:29 fetching series: command: "show series on benchmark from cpu where hostname='host_1703'"; database: "benchmark"; retention: "autogen" Found 10 timeseries to import. Continue? [Y/n] ``` + The timeseries select query would be following: `fetching series: command: "show series on benchmark from cpu where hostname='host_1703'"; database: "benchmark"; retention: "autogen"` - + The second step of filtering is a time filter and it applies when fetching the datapoints from Influx. 
Time filtering may be configured with two flags: -* --influx-filter-time-start -* --influx-filter-time-end + +- --influx-filter-time-start +- --influx-filter-time-end Here's an example of importing timeseries for one day only: `./vmctl influx --influx-database benchmark --influx-filter-series "where hostname='host_1703'" --influx-filter-time-start "2020-01-01T10:07:00Z" --influx-filter-time-end "2020-01-01T15:07:00Z"` @@ -275,36 +292,36 @@ Please see more about time filtering [here](https://docs.influxdata.com/influxdb ## Migrating data from InfluxDB (2.x) Migrating data from InfluxDB v2.x is not supported yet ([#32](https://github.com/VictoriaMetrics/vmctl/issues/32)). -You may find useful a 3rd party solution for this - https://github.com/jonppe/influx_to_victoriametrics. - +You may find useful a 3rd party solution for this - <https://github.com/jonppe/influx_to_victoriametrics>. ## Migrating data from Prometheus `vmctl` supports the `prometheus` mode for migrating data from Prometheus to VictoriaMetrics time-series database. -Migration is based on reading Prometheus snapshot, which is basically a hard-link to Prometheus data files. +Migration is based on reading Prometheus snapshot, which is basically a hard-link to Prometheus data files. See `./vmctl prometheus --help` for details and full list of flags. Also see Prometheus related articles [here](#articles). To use migration tool please specify the file path to Prometheus snapshot `--prom-snapshot` (see how to make a snapshot [here](https://www.robustperception.io/taking-snapshots-of-prometheus-data)) and VictoriaMetrics address `--vm-addr`. Please note, that `vmctl` *do not make a snapshot from Prometheus*, it uses an already prepared snapshot. More about Prometheus snapshots may be found [here](https://www.robustperception.io/taking-snapshots-of-prometheus-data) and [here](https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-d44a6728f043).
Flag `--vm-addr` for single-node VM is usually equal to `--httpListenAddr`, and for cluster version -is equal to `--httpListenAddr` flag of vminsert component. Please note, that vmctl performs initial readiness check for the given address -by checking `/health` endpoint. For cluster version it is additionally required to specify the `--vm-account-id` flag. +is equal to `--httpListenAddr` flag of vminsert component. Please note, that vmctl performs initial readiness check for the given address +by checking `/health` endpoint. For cluster version it is additionally required to specify the `--vm-account-id` flag. See more details for cluster version [here](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster). As soon as required flags are provided and all endpoints are accessible, `vmctl` will start the Prometheus snapshot exploration. Basically, it just fetches all available blocks in provided snapshot and read the metadata. It also does initial filtering by time if flags `--prom-filter-time-start` or `--prom-filter-time-end` were set. The exploration procedure prints some stats from read blocks. Please note that stats are not taking into account timeseries or samples filtering. This will be done during importing process. - + The importing process takes the snapshot blocks revealed from Explore procedure and processes them one by one accumulating timeseries and samples. Please note, that `vmctl` relies on responses from InfluxDB on this stage, -so ensure that Explore queries are executed without errors or limits. Please see this +so ensure that Explore queries are executed without errors or limits. Please see this [issue](https://github.com/VictoriaMetrics/vmctl/issues/30) for details. The data processed in chunks and then sent to VM. 
-The importing process example for local installation of Prometheus +The importing process example for local installation of Prometheus and single-node VictoriaMetrics(`http://localhost:8428`): + ``` ./vmctl prometheus --prom-snapshot=/path/to/snapshot \ --vm-concurrency=1 \ @@ -331,7 +348,7 @@ Found 14 blocks to import. Continue? [Y/n] y import requests: 323; import requests retries: 0; 2020/02/23 15:50:03 Total time: 51.077451066s -``` +``` ### Data mapping @@ -340,7 +357,7 @@ So no data changes will be applied. ### Configuration -The configuration flags should contain self-explanatory descriptions. +The configuration flags should contain self-explanatory descriptions. ### Filtering @@ -351,6 +368,7 @@ in in RFC3339 format. This filter applied twice: to drop blocks out of range and overlapping time range. Example of applying time filter: + ``` ./vmctl prometheus --prom-snapshot=/path/to/snapshot \ --prom-filter-time-start=2020-02-07T00:07:01Z \ @@ -370,12 +388,13 @@ Please notice, that total amount of blocks in provided snapshot is 14, but only time range. So other 12 blocks were marked as `skipped`. The amount of samples and series is not taken into account, since this is heavy operation and will be done during import process. +Filtering by timeseries is configured with following flags: -Filtering by timeseries is configured with following flags: -* `--prom-filter-label` - the label name, e.g. `__name__` or `instance`; -* `--prom-filter-label-value` - the regular expression to filter the label value. By default matches all `.*` +- `--prom-filter-label` - the label name, e.g. `__name__` or `instance`; +- `--prom-filter-label-value` - the regular expression to filter the label value. By default matches all `.*` For example: + ``` ./vmctl prometheus --prom-snapshot=/path/to/snapshot \ --prom-filter-label="__name__" \ @@ -409,38 +428,44 @@ Found 2 blocks to import. Continue? 
[Y/n] y Thanos uses the same storage engine as Prometheus and the data layout on-disk should be the same. That means `vmctl` in mode `prometheus` may be used for Thanos historical data migration as well. -These instructions may vary based on the details of your Thanos configuration. -Please read carefully and verify as you go. We assume you're using Thanos Sidecar on your Prometheus pods, +These instructions may vary based on the details of your Thanos configuration. +Please read carefully and verify as you go. We assume you're using Thanos Sidecar on your Prometheus pods, and that you have a separate Thanos Store installation. ### Current data -1. For now, keep your Thanos Sidecar and Thanos-related Prometheus configuration, but add this to also stream +1. For now, keep your Thanos Sidecar and Thanos-related Prometheus configuration, but add this to also stream metrics to VictoriaMetrics: + ``` remote_write: - url: http://victoria-metrics:8428/api/v1/write ``` -2. Make sure VM is running, of course. Now check the logs to make sure that Prometheus is sending and VM is receiving. + +2. Make sure VM is running, of course. Now check the logs to make sure that Prometheus is sending and VM is receiving. In Prometheus, make sure there are no errors. 
On the VM side, you should see messages like this: + ``` - 2020-04-27T18:38:46.474Z info VictoriaMetrics/lib/storage/partition.go:207 creating a partition "2020_04" with smallPartsPath="/victoria-metrics-data/data/small/2020_04", bigPartsPath="/victoria-metrics-data/data/big/2020_04" - 2020-04-27T18:38:46.506Z info VictoriaMetrics/lib/storage/partition.go:222 partition "2020_04" has been created + 2020-04-27T18:38:46.474Z info VictoriaMetrics/lib/storage/partition.go:207 creating a partition "2020_04" with smallPartsPath="/victoria-metrics-data/data/small/2020_04", bigPartsPath="/victoria-metrics-data/data/big/2020_04" + 2020-04-27T18:38:46.506Z info VictoriaMetrics/lib/storage/partition.go:222 partition "2020_04" has been created ``` + 3. Now just wait. Within two hours, Prometheus should finish its current data file and hand it off to Thanos Store for long term storage. ### Historical data -Let's assume your data is stored on S3 served by minio. You first need to copy that out to a local filesystem, +Let's assume your data is stored on S3 served by minio. You first need to copy that out to a local filesystem, then import it into VM using `vmctl` in `prometheus` mode. + 1. Copy data from minio. 1. Run the `minio/mc` Docker container. 1. `mc config host add minio http://minio:9000 accessKey secretKey`, substituting appropriate values for the last 3 items. 1. `mc cp -r minio/prometheus thanos-data` 1. Import using `vmctl`. 1. Follow the [instructions](#how-to-build) to compile `vmctl` on your machine. - 1. Use [prometheus](#migrating-data-from-prometheus) mode to import data: + 1. Use [prometheus](#migrating-data-from-prometheus) mode to import data: + ``` vmctl prometheus --prom-snapshot thanos-data --vm-addr http://victoria-metrics:8428 ``` @@ -457,8 +482,8 @@ or higher. See `./vmctl vm-native --help` for details and full list of flags. 
-In this mode `vmctl` acts as a proxy between two VM instances, where time series filtering is done by "source" (`src`) -and processing is done by "destination" (`dst`). Because of that, `vmctl` doesn't actually know how much data will be +In this mode `vmctl` acts as a proxy between two VM instances, where time series filtering is done by "source" (`src`) +and processing is done by "destination" (`dst`). Because of that, `vmctl` doesn't actually know how much data will be processed and can't show the progress bar. It will show the current processing speed and total number of processed bytes: ``` @@ -472,14 +497,15 @@ Initing export pipe from "http://localhost:8528" with filters: Initing import process to "http://localhost:8428": Total: 336.75 KiB ↖ Speed: 454.46 KiB p/s 2020/10/13 17:04:59 Total time: 952.143376ms -``` +``` Importing tips: -1. Migrating all the metrics from one VM to another may collide with existing application metrics -(prefixed with `vm_`) at destination and lead to confusion when using -[official Grafana dashboards](https://grafana.com/orgs/victoriametrics/dashboards). + +1. Migrating all the metrics from one VM to another may collide with existing application metrics +(prefixed with `vm_`) at destination and lead to confusion when using +[official Grafana dashboards](https://grafana.com/orgs/victoriametrics/dashboards). To avoid such situation try to filter out VM process metrics via `--vm-native-filter-match` flag. -2. Migration is a backfilling process, so it is recommended to read +2. Migration is a backfilling process, so it is recommended to read [Backfilling tips](https://github.com/VictoriaMetrics/VictoriaMetrics#backfilling) section. 3. `vmctl` doesn't provide relabeling or other types of labels management in this mode. Instead, use [relabeling in VictoriaMetrics](https://github.com/VictoriaMetrics/vmctl/issues/4#issuecomment-683424375). @@ -495,7 +521,7 @@ timeseries. Please set it wisely to avoid InfluxDB overwhelming. 
The flag `--influx-chunk-size` controls the max amount of datapoints to return in single chunk from fetch requests. Please see more details [here](https://docs.influxdata.com/influxdb/v1.7/guides/querying_data/#chunking). -The chunk size is used to control InfluxDB memory usage, so it won't OOM on processing large timeseries with +The chunk size is used to control InfluxDB memory usage, so it won't OOM on processing large timeseries with billions of datapoints. ### Prometheus mode @@ -511,17 +537,18 @@ Please note that each import request can load up to a single vCPU core on Victor to allocated CPU resources of your VictoriMetrics installation. The flag `--vm-batch-size` controls max amount of samples collected before sending the import request. -For example, if `--influx-chunk-size=500` and `--vm-batch-size=2000` then importer will process not more -than 4 chunks before sending the request. +For example, if `--influx-chunk-size=500` and `--vm-batch-size=2000` then importer will process not more +than 4 chunks before sending the request. ### Importer stats -After successful import `vmctl` prints some statistics for details. +After successful import `vmctl` prints some statistics for details. The important numbers to watch are following: - - `idle duration` - shows time that importer spent while waiting for data from InfluxDB/Prometheus + +- `idle duration` - shows time that importer spent while waiting for data from InfluxDB/Prometheus to fill up `--vm-batch-size` batch size. Value shows total duration across all workers configured via `--vm-concurrency`. High value may be a sign of too slow InfluxDB/Prometheus fetches or too -high `--vm-concurrency` value. Try to improve it by increasing `---concurrency` value or +high `--vm-concurrency` value. Try to improve it by increasing `---concurrency` value or decreasing `--vm-concurrency` value. - `import requests` - shows how many import requests were issued to VM server. 
The import request is issued once the batch size(`--vm-batch-size`) is full and ready to be sent. @@ -533,6 +560,7 @@ a sign of network issues or VM being overloaded. See the logs during import for By default `vmctl` waits confirmation from user before starting the import. If this is unwanted behavior and no user interaction required - pass `-s` flag to enable "silence" mode: + ``` -s Whether to run in silent mode. If set to true no confirmation prompts will appear. (default: false) ``` @@ -541,18 +569,18 @@ behavior and no user interaction required - pass `-s` flag to enable "silence" m `vmctl` allows to limit the number of [significant figures](https://en.wikipedia.org/wiki/Significant_figures) before importing. For example, the average value for response size is `102.342305` bytes and it has 9 significant figures. -If you ask a human to pronounce this value then with high probability value will be rounded to first 4 or 5 figures -because the rest aren't really that important to mention. In most cases, such a high precision is too much. -Moreover, such values may be just a result of [floating point arithmetic](https://en.wikipedia.org/wiki/Floating-point_arithmetic), -create a [false precision](https://en.wikipedia.org/wiki/False_precision) and result into bad compression ratio -according to [information theory](https://en.wikipedia.org/wiki/Information_theory). +If you ask a human to pronounce this value then with high probability value will be rounded to first 4 or 5 figures +because the rest aren't really that important to mention. In most cases, such a high precision is too much. +Moreover, such values may be just a result of [floating point arithmetic](https://en.wikipedia.org/wiki/Floating-point_arithmetic), +create a [false precision](https://en.wikipedia.org/wiki/False_precision) and result into bad compression ratio +according to [information theory](https://en.wikipedia.org/wiki/Information_theory). 
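To make the idea of limiting significant figures concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the rounding routine `vmctl` uses: the helper name `round_significant` is hypothetical, and treating a non-positive figure count as "rounding disabled" is an assumption chosen for this example.

```python
import math


def round_significant(value: float, figures: int) -> float:
    """Round value to the given number of significant figures.

    Hypothetical helper for illustration; a non-positive `figures`
    leaves the value unchanged (an assumption of this sketch).
    """
    if figures <= 0 or value == 0 or not math.isfinite(value):
        return value
    # Position of the leading digit: 102.342305 -> 2, 0.00123 -> -3.
    magnitude = math.floor(math.log10(abs(value)))
    # Keep `figures` digits counted from the leading one.
    return round(value, figures - 1 - magnitude)


if __name__ == "__main__":
    print(round_significant(102.342305, 5))
    print(round_significant(102.342305, 0))
```

Keeping fewer distinct digits per value tends to make neighbouring samples more similar, which is the property the paragraph above associates with a better compression ratio.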
`vmctl` provides the following flags for improving data compression: -* `--vm-round-digits` flag for rounding processed values to the given number of decimal digits after the point. +- `--vm-round-digits` flag for rounding processed values to the given number of decimal digits after the point. For example, `--vm-round-digits=2` would round `1.2345` to `1.23`. By default the rounding is disabled. -* `--vm-significant-figures` flag for limiting the number of significant figures in processed values. It takes no effect if set +- `--vm-significant-figures` flag for limiting the number of significant figures in processed values. It takes no effect if set to 0 (by default), but set `--vm-significant-figures=5` and `102.342305` will be rounded to `102.34`. The most common case for using these flags is to improve data compression for time series storing aggregation @@ -560,7 +588,7 @@ results such as `average`, `rate`, etc. ### Adding extra labels - `vmctl` allows to add extra labels to all imported series. It can be achived with flag `--vm-extra-label label=value`. + `vmctl` allows to add extra labels to all imported series. It can be achived with flag `--vm-extra-label label=value`. If multiple labels needs to be added, set flag for each label, for example, `--vm-extra-label label1=value1 --vm-extra-label label2=value2`. If timeseries already have label, that must be added with `--vm-extra-label` flag, flag has priority and will override label value from timeseries. @@ -569,15 +597,13 @@ results such as `average`, `rate`, etc. Limiting the rate of data transfer could help to reduce pressure on disk or on destination database. The rate limit may be set in bytes-per-second via `--vm-rate-limit` flag. -Please note, you can also use [vmagent](https://docs.victoriametrics.com/vmagent.html) +Please note, you can also use [vmagent](https://docs.victoriametrics.com/vmagent.html) as a proxy between `vmctl` and destination with `-remoteWrite.rateLimit` flag enabled. 
- ## How to build It is recommended using [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) - `vmctl` is located in `vmutils-*` archives there. - ### Development build 1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.17. diff --git a/docs/vmgateway.md b/docs/vmgateway.md index 6ab4ec2ef..a80e96a2c 100644 --- a/docs/vmgateway.md +++ b/docs/vmgateway.md @@ -6,7 +6,6 @@ sort: 9 ***vmgateway is a part of [enterprise package](https://victoriametrics.com/products/enterprise/). It is available for download and evaluation at [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)*** - vmgateway `vmgateway` is a proxy for the VictoriaMetrics Time Series Database (TSDB). It provides the following features: @@ -20,7 +19,6 @@ sort: 9 `vmgateway` is included in our [enterprise packages](https://victoriametrics.com/products/enterprise/). - ## Access Control vmgateway-ac @@ -28,6 +26,7 @@ sort: 9 `vmgateway` supports jwt based authentication. With jwt payload can be configured to give access to specific tenants and labels as well as to read/write. jwt token must be in following format: + ```json { "exp": 1617304574, @@ -45,13 +44,15 @@ jwt token must be in following format: } } ``` + Where: -- `exp` - required, expire time in unix_timestamp. If the token expires then `vmgateway` rejects the request. -- `vm_access` - required, dict with claim info, minimum form: `{"vm_access": {"tenand_id": {}}` -- `tenant_id` - optional, for cluster mode, routes requests to the corresponding tenant. -- `extra_labels` - optional, key-value pairs for label filters added to the ingested or selected metrics. Multiple filters are added with `and` operation. If defined, `extra_label` from original request removed. -- `extra_filters` - optional, [series selectors](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) added to the select query requests. 
Multiple selectors are added with `or` operation. If defined, `extra_filter` from original request removed. -- `mode` - optional, access mode for api - read, write, or full. Supported values: 0 - full (default value), 1 - read, 2 - write. + +* `exp` - required, expire time in unix_timestamp. If the token expires then `vmgateway` rejects the request. +* `vm_access` - required, dict with claim info, minimum form: `{"vm_access": {"tenand_id": {}}` +* `tenant_id` - optional, for cluster mode, routes requests to the corresponding tenant. +* `extra_labels` - optional, key-value pairs for label filters added to the ingested or selected metrics. Multiple filters are added with `and` operation. If defined, `extra_label` from original request removed. +* `extra_filters` - optional, [series selectors](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) added to the select query requests. Multiple selectors are added with `or` operation. If defined, `extra_filter` from original request removed. +* `mode` - optional, access mode for api - read, write, or full. Supported values: 0 - full (default value), 1 - read, 2 - write. ## QuickStart @@ -70,18 +71,19 @@ Start vmgateway ``` Retrieve data from the database + ```bash curl 'http://localhost:8431/api/v1/series/count' -H 'Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ2bV9hY2Nlc3MiOnsidGVuYW50X2lkIjp7fSwicm9sZSI6MX0sImV4cCI6MTkzOTM0NjIxMH0.5WUxEfdcV9hKo4CtQdtuZYOGpGXWwaqM9VuVivMMrVg' ``` A request with an incorrect token or without any token will be rejected: + ```bash curl 'http://localhost:8431/api/v1/series/count' curl 'http://localhost:8431/api/v1/series/count' -H 'Authorization: Bearer incorrect-token' ``` - ## Rate Limiter vmgateway-rl @@ -92,14 +94,16 @@ Limits incoming requests by given, pre-configured limits. It supports read and w The metrics that you want to rate limit must be scraped from the cluster. 
List of supported limit types: -- `queries` - count of api requests made at tenant to read the api, such as `/api/v1/query`, `/api/v1/series` and others. -- `active_series` - count of current active series at any given tenant. -- `new_series` - count of created series; aka churn rate -- `rows_inserted` - count of inserted rows per tenant. + +* `queries` - count of api requests made at tenant to read the api, such as `/api/v1/query`, `/api/v1/series` and others. +* `active_series` - count of current active series at any given tenant. +* `new_series` - count of created series; aka churn rate +* `rows_inserted` - count of inserted rows per tenant. List of supported time windows: -- `minute` -- `hour` + +* `minute` +* `hour` Limits can be specified per tenant or at a global level if you omit `project_id` and `account_id`. @@ -123,6 +127,7 @@ limits: ## QuickStart cluster version of VictoriaMetrics is required for rate limiting. + ```bash # start datasource for cluster metrics @@ -173,6 +178,7 @@ curl 'http://localhost:8431/api/v1/labels' -H 'Authorization: Bearer eyJhbGciOiJ ## Configuration The shortlist of configuration flags include the following: + ```console -clusterMode enable this for the cluster version @@ -280,12 +286,11 @@ The shortlist of configuration flags include the following: ## TroubleShooting * Access control: - * incorrect `jwt` format, try https://jwt.io/#debugger-io with our tokens + * incorrect `jwt` format, try with our tokens * expired token, check `exp` field. * Rate Limiting: * `scrape_interval` at datasource, reduce it to apply limits faster. - ## Limitations * Access Control: diff --git a/docs/vmrestore.md b/docs/vmrestore.md index d52e53c72..b70a290b7 100644 --- a/docs/vmrestore.md +++ b/docs/vmrestore.md @@ -10,12 +10,11 @@ VictoriaMetrics `v1.29.0` and newer versions must be used for working with the r Restore process can be interrupted at any time. 
It is automatically resumed from the interruption point when restarting `vmrestore` with the same args. - ## Usage VictoriaMetrics must be stopped during the restore process. -``` +```bash vmrestore -src=gs:/// -storageDataPath= ``` @@ -28,13 +27,11 @@ vmrestore -src=gs:/// -storageDataPath= +snap link: #### develop @@ -11,19 +10,18 @@ Install snapcraft or docker build snap package with command - ```text + ```bash make build-snap ``` It produces snap package with current git version - `victoriametrics_v1.46.0+git1.1bebd021a-dirty_all.snap`. You can install it with command: `snap install victoriametrics_v1.46.0+git1.1bebd021a-dirty_all.snap --dangerous` - -#### usage +#### usage installation and configuration: -```text +```bash # install snap install victoriametrics # logs @@ -35,13 +33,16 @@ snap logs victoriametrics Configuration management: Prometheus scrape config can be edited with your favorite editor, its located at -```text + +```bash vi /var/snap/victoriametrics/current/etc/victoriametrics-scrape-config.yaml ``` + after changes, you can trigger config reread with `curl localhost:8248/-/reload`. Configuration tuning is possible with editing extra_flags: -```text + +```bash echo 'FLAGS="-selfScrapeInterval=10s -search.logSlowQueryDuration=20s"' > /var/snap/victoriametrics/current/extra_flags snap restart victoriametrics ```