mirror of
https://github.com/VictoriaMetrics/VictoriaMetrics.git
synced 2024-11-21 14:44:00 +00:00
lib/storage: add ability to use downsampling for the given series filter (#733)
* lib/storage: add ability to use downsampling for the given series filter Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs: add information about downsampling filters Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs: fix MetricsQL filter Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/downsampling: treat missing downsampling filter as a bug Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/part_header: verify correctness of downsampling filters when opening partition Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/downsampling: save only appliable rules in part metadata Filter and save only rules which are appliable to partition based on MinTimestamp of stored data. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/downsampling: update log messages for final dedup Properly specify a reason of re-running deduplication for partition. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage: consistently use MaxTimestamp to determine deduplication/downsampling rules Using MinTimestamp leads to applying downsampling to parts which are only partially covered by downsampling rule. For example, partition covers range [1000-2000]. At t=2100 and rule offset 500 data with t=2100-500 => 1600 must be downsampled. The range check against MinTimestamp evaluates to true even though partition contains range which must not be downsampled - [1600:2000]. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * Follow-up - Apply the first matching downsampling period if multiple filters match the given time series. This allows fine-tuning the downsampling config for the specific needs. - Take into account downsampling filters during search queries. - Reduce the difference between community and enterprise branches. This should simplify further maintenance of these branches. - Properly parse series filters with colons inside them. - Document the feature at docs/CHANGELOG.md. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4960 --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
This commit is contained in:
parent
131f357098
commit
af3922b1df
7 changed files with 164 additions and 64 deletions
67
README.md
67
README.md
|
@ -1975,45 +1975,72 @@ to historical data.
|
|||
|
||||
See [how to configure multiple retentions in VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#retention-filters).
|
||||
|
||||
See also [downsampling](#downsampling).
|
||||
|
||||
Retention filters can be evaluated for free by downloading and using enterprise binaries from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest).
|
||||
See how to request a free trial license [here](https://victoriametrics.com/products/enterprise/trial/).
|
||||
|
||||
## Downsampling
|
||||
|
||||
[VictoriaMetrics Enterprise](https://docs.victoriametrics.com/enterprise.html) supports multi-level downsampling with `-downsampling.period` command-line flag. For example:
|
||||
[VictoriaMetrics Enterprise](https://docs.victoriametrics.com/enterprise.html) supports multi-level downsampling via `-downsampling.period=offset:interval` command-line flag.
|
||||
This command-line flag instructs leaving the last sample per each `interval` for [time series](https://docs.victoriametrics.com/keyconcepts/#time-series)
|
||||
[samples](https://docs.victoriametrics.com/keyconcepts/#raw-samples) older than the `offset`. For example, `-downsampling.period=30d:5m` instructs leaving the last sample
|
||||
per each 5-minute interval for samples older than 30 days, while the rest of samples aren't downsampled.
|
||||
|
||||
* `-downsampling.period=30d:5m` instructs VictoriaMetrics to [deduplicate](#deduplication) samples older than 30 days with 5 minutes interval.
|
||||
The `-downsampling.period` command-line flag can be specified multiple times in order to apply different downsampling levels for different time ranges (aka multi-level downsampling).
|
||||
For example, `-downsampling.period=30d:5m,180d:1h` instructs leaving the last sample per each 5-minute interval for samples older than 30 days,
|
||||
while leaving the last sample per each 1-hour interval for samples older than 180 days.
|
||||
|
||||
* `-downsampling.period=30d:5m,180d:1h` instructs VictoriaMetrics to deduplicate samples older than 30 days with 5 minutes interval and to deduplicate samples older than 180 days with 1 hour interval.
|
||||
VictoriaMetrics supports configuring independent downsampling per different sets of [time series](https://docs.victoriametrics.com/keyconcepts/#time-series)
|
||||
via `-downsampling.period=filter:offset:interval` syntax. In this case the given `offset:interval` downsampling is applied only to time series matching the given `filter`.
|
||||
The `filter` can contain arbitrary [series filter](https://docs.victoriametrics.com/keyConcepts.html#filtering).
|
||||
For example, `-downsampling.period='{__name__=~"(node|process)_.*"}:1d:1m` instructs VictoriaMetrics to deduplicate samples older than one day with one minute interval
|
||||
only for [time series](https://docs.victoriametrics.com/keyconcepts/#time-series) with names starting with `node_` or `process_` prefixes.
|
||||
The de-duplication for other time series can be configured independently via additional `-downsampling.period` command-line flags.
|
||||
|
||||
If the time series doesn't match any `filter`, then it isn't downsampled. If the time series matches multiple filters, then the downsampling
|
||||
for the first matching `filter` is applied. For example, `-downsampling.period='{env="prod"}:1d:30s,{__name__=~"node_.*"}:1d:5m'` de-duplicates
|
||||
samples older than one day with 30 seconds interval across all the time series with `env="prod"` [label](https://docs.victoriametrics.com/keyconcepts/#labels),
|
||||
even if their names start with `node_` prefix. All the other time series with names starting with `node_` prefix are de-duplicated with 5 minutes interval.
|
||||
|
||||
If downsampling shouldn't be applied to some time series matching the given `filter`, then pass `-downsampling.period=filter:0s:0s` command-line flag to VictoriaMetrics.
|
||||
For example, if series with `env="prod"` label shouldn't be downsampled, then pass `-downsampling.period='{env="prod"}:0s:0s'` command-line flag in front of other `-downsampling.period` flags.
|
||||
|
||||
Downsampling is applied independently per each time series and leaves a single [raw sample](https://docs.victoriametrics.com/keyConcepts.html#raw-samples)
|
||||
with the biggest [timestamp](https://en.wikipedia.org/wiki/Unix_time) on the configured interval, in the same way as [deduplication](#deduplication) does.
|
||||
It works the best for [counters](https://docs.victoriametrics.com/keyConcepts.html#counter) and [histograms](https://docs.victoriametrics.com/keyConcepts.html#histogram),
|
||||
as their values are always increasing. But downsampling [gauges](https://docs.victoriametrics.com/keyConcepts.html#gauge)
|
||||
and [summaries](https://docs.victoriametrics.com/keyConcepts.html#summary)
|
||||
would mean losing the changes within the downsampling interval. Please note, you can use [recording rules](https://docs.victoriametrics.com/vmalert.html#rules)
|
||||
or [steaming aggregation](https://docs.victoriametrics.com/stream-aggregation.html)
|
||||
as their values are always increasing. Downsampling [gauges](https://docs.victoriametrics.com/keyConcepts.html#gauge)
|
||||
and [summaries](https://docs.victoriametrics.com/keyConcepts.html#summary) lose some changes within the downsampling interval,
|
||||
since only the last sample on the given interval is left and the rest of samples are dropped.
|
||||
|
||||
You can use [recording rules](https://docs.victoriametrics.com/vmalert.html#rules) or [steaming aggregation](https://docs.victoriametrics.com/stream-aggregation.html)
|
||||
to apply custom aggregation functions, like min/max/avg etc., in order to make gauges more resilient to downsampling.
|
||||
|
||||
Downsampling can reduce disk space usage and improve query performance if it is applied to time series with big number
|
||||
of samples per each series. The downsampling doesn't improve query performance if the database contains big number
|
||||
of time series with small number of samples per each series (aka [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate)),
|
||||
since downsampling doesn't reduce the number of time series. In this case the majority of query time is spent on searching for the matching time series
|
||||
instead of processing the found samples.
|
||||
of samples per each series. The downsampling doesn't improve query performance and doesn't reduce disk space if the database contains big number
|
||||
of time series with small number of samples per each series, since downsampling doesn't reduce the number of time series.
|
||||
So there is little sense in applying downsampling to time series with [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate).
|
||||
In this case the majority of query time is spent on searching for the matching time series instead of processing the found samples.
|
||||
It is possible to use [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html) in [vmagent](https://docs.victoriametrics.com/vmagent.html)
|
||||
or recording rules in [vmalert](https://docs.victoriametrics.com/vmalert.html) in order to
|
||||
or [recording rules in vmalert](https://docs.victoriametrics.com/vmalert.html#rules) in order to
|
||||
[reduce the number of time series](https://docs.victoriametrics.com/vmalert.html#downsampling-and-aggregation-via-vmalert).
|
||||
|
||||
Downsampling happens during [background merges](https://docs.victoriametrics.com/#storage)
|
||||
and can't be performed if there is not enough of free disk space or if vmstorage
|
||||
is in [read-only mode](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#readonly-mode).
|
||||
Downsampling is performed during [background merges](https://docs.victoriametrics.com/#storage).
|
||||
It cannot be performed if there is not enough of free disk space or if vmstorage is in [read-only mode](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#readonly-mode).
|
||||
|
||||
Please, note that intervals of `-downsampling.period` must be multiples of each other.
|
||||
In case [deduplication](https://docs.victoriametrics.com/#deduplication) is enabled value of `-dedup.minScrapeInterval` must also be multiple of `-downsampling.period` intervals.
|
||||
This is required to ensure consistency of deduplication and downsampling results.
|
||||
Please, note that intervals of `-downsampling.period` must be multiples of each other.
|
||||
In case [deduplication](https://docs.victoriametrics.com/#deduplication) is enabled, value of `-dedup.minScrapeInterval` command-line flag must also
|
||||
be multiple of `-downsampling.period` intervals. This is required to ensure consistency of deduplication and downsampling results.
|
||||
|
||||
The downsampling can be evaluated for free by downloading and using enterprise binaries from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest).
|
||||
See how to request a free trial license [here](https://victoriametrics.com/products/enterprise/trial/).
|
||||
It is safe updating `-downsampling.period` during VictoriaMetrics restarts - the updated downsampling configuration will be
|
||||
applied eventually to historical data during [background merges](https://docs.victoriametrics.com/#storage).
|
||||
|
||||
See [how to configure downsampling in VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#downsampling).
|
||||
|
||||
See also [retention filters](#retention-filters).
|
||||
|
||||
The downsampling filters can be evaluated for free by downloading and using enterprise binaries from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest).
|
||||
See [how to request a free trial license](https://victoriametrics.com/products/enterprise/trial/).
|
||||
|
||||
## Multi-tenancy
|
||||
|
||||
|
|
|
@ -11,6 +11,9 @@ import (
|
|||
"time"
|
||||
"unsafe"
|
||||
|
||||
"github.com/VictoriaMetrics/metrics"
|
||||
"github.com/VictoriaMetrics/metricsql"
|
||||
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/searchutils"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
|
||||
|
@ -18,8 +21,6 @@ import (
|
|||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/querytracer"
|
||||
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
|
||||
"github.com/VictoriaMetrics/metrics"
|
||||
"github.com/VictoriaMetrics/metricsql"
|
||||
)
|
||||
|
||||
var (
|
||||
|
|
|
@ -36,6 +36,7 @@ See also [LTS releases](https://docs.victoriametrics.com/lts-releases/).
|
|||
|
||||
* SECURITY: upgrade Go builder from Go1.21.7 to Go1.22.1. See [the list of issues addressed in Go1.22.1](https://github.com/golang/go/issues?q=milestone%3AGo1.22.1+label%3ACherryPickApproved).
|
||||
|
||||
* FEATURE: [downsampling](https://docs.victoriametrics.com/#downsampling): add ability to configure distinct downsampling per distinct sets of time series and/or [tenants](https://docs.victoriametrics.com/cluster-victoriametrics/#multitenancy). See [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4960) and [these docs](https://docs.victoriametrics.com/#downsampling).
|
||||
* FEATURE: [vmauth](https://docs.victoriametrics.com/vmauth/): allow discovering ip addresses for backend instances hidden behind a shared hostname, via `discover_backend_ips: true` option. This allows evenly spreading load among backend instances. See [these docs](https://docs.victoriametrics.com/vmauth/#discovering-backend-ips) and [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5707).
|
||||
* FEATURE: [vmauth](https://docs.victoriametrics.com/vmauth/): allow routing incoming requests based on HTTP [query args](https://en.wikipedia.org/wiki/Query_string) via `src_query_args` option at `url_map`. See [these docs](https://docs.victoriametrics.com/vmauth/#generic-http-proxy-for-different-backends) and [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5878).
|
||||
* FEATURE: [vmauth](https://docs.victoriametrics.com/vmauth/): allow routing incoming requests based on HTTP request headers via `src_headers` option at `url_map`. See [these docs](https://docs.victoriametrics.com/vmauth/#generic-http-proxy-for-different-backends).
|
||||
|
@ -62,7 +63,6 @@ See also [LTS releases](https://docs.victoriametrics.com/lts-releases/).
|
|||
* FEATURE: [graphite](https://docs.victoriametrics.com/#graphite-render-api-usage): add support for `aggregateSeriesLists`, `diffSeriesLists`, `multiplySeriesLists` and `sumSeriesLists` functions. Thanks to @rbizos for [the pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5809).
|
||||
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): added command line argument that enables OpenTelementry metric names and labels sanitization.
|
||||
|
||||
|
||||
* BUGFIX: prevent from automatic deletion of newly registered time series when it is queried immediately after the addition. The probability of this bug has been increased significantly after [v1.99.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.99.0) because of optimizations related to registering new time series. See [this](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5948) and [this](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5959) issue.
|
||||
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): properly set `Host` header in requests to scrape targets if it is specified via [`headers` option](https://docs.victoriametrics.com/sd_configs/#http-api-client-options). Thanks to @fholzer for [the bugreport](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5969) and [the fix](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5970).
|
||||
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): properly set `Host` header in requests to scrape targets when [`server_name` option at `tls_config`](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config) is set. Previously the `Host` header was set incorrectly to the target hostname in this case.
|
||||
|
|
|
@ -898,6 +898,7 @@ For example, the following config sets retention to 5 days for time series with
|
|||
```
|
||||
|
||||
See also [these docs](https://docs.victoriametrics.com/#retention-filters) for additional details on retention filters.
|
||||
See also [downsampling](#downsampling).
|
||||
|
||||
Enterprise binaries can be downloaded and evaluated for free from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest).
|
||||
See how to request a free trial license [here](https://victoriametrics.com/products/enterprise/trial/).
|
||||
|
@ -907,6 +908,21 @@ See how to request a free trial license [here](https://victoriametrics.com/produ
|
|||
Downsampling is available in [enterprise version of VictoriaMetrics](https://docs.victoriametrics.com/enterprise.html).
|
||||
It is configured with `-downsampling.period` command-line flag according to [these docs](https://docs.victoriametrics.com/#downsampling).
|
||||
|
||||
It is possible to downsample series, which belong to a particular [tenant](#multitenancy) by using [filters](https://docs.victoriametrics.com/keyConcepts.html#filtering)
|
||||
on `vm_account_id` or `vm_project_id` pseudo-labels in `-downsampling.period` command-line flag. For example, the following config leaves the last sample per each minute for samples
|
||||
older than one hour only for [tenants](#multitenancy) with accountID equal to 12 and 42, while series for other tenants aren't downsampled:
|
||||
|
||||
```
|
||||
-downsampling.period='{vm_account_id=~"12|42"}:1h:1m'
|
||||
```
|
||||
|
||||
It is OK to mix filters on real labels with filters on `vm_account_id` and `vm_project_id` pseudo-labels.
|
||||
For example, the following config instructs leaving the last sample per hour after 30 days for time series with `env="dev"` label from [tenant](#multitenancy) `accountID=5`:
|
||||
|
||||
```
|
||||
-downsampling.period='{vm_account_id="5",env="dev"}:30d:1h'
|
||||
```
|
||||
|
||||
The same flag value must be passed to both `vmstorage` and `vmselect` nodes. Configuring `vmselect` node with `-downsampling.period`
|
||||
command-line flag makes query results more consistent, because `vmselect` uses the maximum configured downsampling interval
|
||||
on the requested time range if this time range covers multiple downsampling levels.
|
||||
|
@ -915,6 +931,8 @@ downsamples all the [raw samples](https://docs.victoriametrics.com/keyConcepts.h
|
|||
using 5 minute interval. If `-downsampling.period` command-line flag isn't set at `vmselect`,
|
||||
then query results can be less consistent because of mixing raw and downsampled data.
|
||||
|
||||
See also [retention filters](#retention-filters).
|
||||
|
||||
Enterprise binaries can be downloaded and evaluated for free from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest).
|
||||
See how to request a free trial license [here](https://victoriametrics.com/products/enterprise/trial/).
|
||||
|
||||
|
|
|
@ -1978,45 +1978,72 @@ to historical data.
|
|||
|
||||
See [how to configure multiple retentions in VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#retention-filters).
|
||||
|
||||
See also [downsampling](#downsampling).
|
||||
|
||||
Retention filters can be evaluated for free by downloading and using enterprise binaries from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest).
|
||||
See how to request a free trial license [here](https://victoriametrics.com/products/enterprise/trial/).
|
||||
|
||||
## Downsampling
|
||||
|
||||
[VictoriaMetrics Enterprise](https://docs.victoriametrics.com/enterprise.html) supports multi-level downsampling with `-downsampling.period` command-line flag. For example:
|
||||
[VictoriaMetrics Enterprise](https://docs.victoriametrics.com/enterprise.html) supports multi-level downsampling via `-downsampling.period=offset:interval` command-line flag.
|
||||
This command-line flag instructs leaving the last sample per each `interval` for [time series](https://docs.victoriametrics.com/keyconcepts/#time-series)
|
||||
[samples](https://docs.victoriametrics.com/keyconcepts/#raw-samples) older than the `offset`. For example, `-downsampling.period=30d:5m` instructs leaving the last sample
|
||||
per each 5-minute interval for samples older than 30 days, while the rest of samples aren't downsampled.
|
||||
|
||||
* `-downsampling.period=30d:5m` instructs VictoriaMetrics to [deduplicate](#deduplication) samples older than 30 days with 5 minutes interval.
|
||||
The `-downsampling.period` command-line flag can be specified multiple times in order to apply different downsampling levels for different time ranges (aka multi-level downsampling).
|
||||
For example, `-downsampling.period=30d:5m,180d:1h` instructs leaving the last sample per each 5-minute interval for samples older than 30 days,
|
||||
while leaving the last sample per each 1-hour interval for samples older than 180 days.
|
||||
|
||||
* `-downsampling.period=30d:5m,180d:1h` instructs VictoriaMetrics to deduplicate samples older than 30 days with 5 minutes interval and to deduplicate samples older than 180 days with 1 hour interval.
|
||||
VictoriaMetrics supports configuring independent downsampling per different sets of [time series](https://docs.victoriametrics.com/keyconcepts/#time-series)
|
||||
via `-downsampling.period=filter:offset:interval` syntax. In this case the given `offset:interval` downsampling is applied only to time series matching the given `filter`.
|
||||
The `filter` can contain arbitrary [series filter](https://docs.victoriametrics.com/keyConcepts.html#filtering).
|
||||
For example, `-downsampling.period='{__name__=~"(node|process)_.*"}:1d:1m` instructs VictoriaMetrics to deduplicate samples older than one day with one minute interval
|
||||
only for [time series](https://docs.victoriametrics.com/keyconcepts/#time-series) with names starting with `node_` or `process_` prefixes.
|
||||
The de-duplication for other time series can be configured independently via additional `-downsampling.period` command-line flags.
|
||||
|
||||
If the time series doesn't match any `filter`, then it isn't downsampled. If the time series matches multiple filters, then the downsampling
|
||||
for the first matching `filter` is applied. For example, `-downsampling.period='{env="prod"}:1d:30s,{__name__=~"node_.*"}:1d:5m'` de-duplicates
|
||||
samples older than one day with 30 seconds interval across all the time series with `env="prod"` [label](https://docs.victoriametrics.com/keyconcepts/#labels),
|
||||
even if their names start with `node_` prefix. All the other time series with names starting with `node_` prefix are de-duplicated with 5 minutes interval.
|
||||
|
||||
If downsampling shouldn't be applied to some time series matching the given `filter`, then pass `-downsampling.period=filter:0s:0s` command-line flag to VictoriaMetrics.
|
||||
For example, if series with `env="prod"` label shouldn't be downsampled, then pass `-downsampling.period='{env="prod"}:0s:0s'` command-line flag in front of other `-downsampling.period` flags.
|
||||
|
||||
Downsampling is applied independently per each time series and leaves a single [raw sample](https://docs.victoriametrics.com/keyConcepts.html#raw-samples)
|
||||
with the biggest [timestamp](https://en.wikipedia.org/wiki/Unix_time) on the configured interval, in the same way as [deduplication](#deduplication) does.
|
||||
It works the best for [counters](https://docs.victoriametrics.com/keyConcepts.html#counter) and [histograms](https://docs.victoriametrics.com/keyConcepts.html#histogram),
|
||||
as their values are always increasing. But downsampling [gauges](https://docs.victoriametrics.com/keyConcepts.html#gauge)
|
||||
and [summaries](https://docs.victoriametrics.com/keyConcepts.html#summary)
|
||||
would mean losing the changes within the downsampling interval. Please note, you can use [recording rules](https://docs.victoriametrics.com/vmalert.html#rules)
|
||||
or [steaming aggregation](https://docs.victoriametrics.com/stream-aggregation.html)
|
||||
as their values are always increasing. Downsampling [gauges](https://docs.victoriametrics.com/keyConcepts.html#gauge)
|
||||
and [summaries](https://docs.victoriametrics.com/keyConcepts.html#summary) lose some changes within the downsampling interval,
|
||||
since only the last sample on the given interval is left and the rest of samples are dropped.
|
||||
|
||||
You can use [recording rules](https://docs.victoriametrics.com/vmalert.html#rules) or [steaming aggregation](https://docs.victoriametrics.com/stream-aggregation.html)
|
||||
to apply custom aggregation functions, like min/max/avg etc., in order to make gauges more resilient to downsampling.
|
||||
|
||||
Downsampling can reduce disk space usage and improve query performance if it is applied to time series with big number
|
||||
of samples per each series. The downsampling doesn't improve query performance if the database contains big number
|
||||
of time series with small number of samples per each series (aka [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate)),
|
||||
since downsampling doesn't reduce the number of time series. In this case the majority of query time is spent on searching for the matching time series
|
||||
instead of processing the found samples.
|
||||
of samples per each series. The downsampling doesn't improve query performance and doesn't reduce disk space if the database contains big number
|
||||
of time series with small number of samples per each series, since downsampling doesn't reduce the number of time series.
|
||||
So there is little sense in applying downsampling to time series with [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate).
|
||||
In this case the majority of query time is spent on searching for the matching time series instead of processing the found samples.
|
||||
It is possible to use [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html) in [vmagent](https://docs.victoriametrics.com/vmagent.html)
|
||||
or recording rules in [vmalert](https://docs.victoriametrics.com/vmalert.html) in order to
|
||||
or [recording rules in vmalert](https://docs.victoriametrics.com/vmalert.html#rules) in order to
|
||||
[reduce the number of time series](https://docs.victoriametrics.com/vmalert.html#downsampling-and-aggregation-via-vmalert).
|
||||
|
||||
Downsampling happens during [background merges](https://docs.victoriametrics.com/#storage)
|
||||
and can't be performed if there is not enough of free disk space or if vmstorage
|
||||
is in [read-only mode](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#readonly-mode).
|
||||
Downsampling is performed during [background merges](https://docs.victoriametrics.com/#storage).
|
||||
It cannot be performed if there is not enough of free disk space or if vmstorage is in [read-only mode](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#readonly-mode).
|
||||
|
||||
Please, note that intervals of `-downsampling.period` must be multiples of each other.
|
||||
In case [deduplication](https://docs.victoriametrics.com/#deduplication) is enabled value of `-dedup.minScrapeInterval` must also be multiple of `-downsampling.period` intervals.
|
||||
This is required to ensure consistency of deduplication and downsampling results.
|
||||
Please, note that intervals of `-downsampling.period` must be multiples of each other.
|
||||
In case [deduplication](https://docs.victoriametrics.com/#deduplication) is enabled, value of `-dedup.minScrapeInterval` command-line flag must also
|
||||
be multiple of `-downsampling.period` intervals. This is required to ensure consistency of deduplication and downsampling results.
|
||||
|
||||
The downsampling can be evaluated for free by downloading and using enterprise binaries from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest).
|
||||
See how to request a free trial license [here](https://victoriametrics.com/products/enterprise/trial/).
|
||||
It is safe updating `-downsampling.period` during VictoriaMetrics restarts - the updated downsampling configuration will be
|
||||
applied eventually to historical data during [background merges](https://docs.victoriametrics.com/#storage).
|
||||
|
||||
See [how to configure downsampling in VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#downsampling).
|
||||
|
||||
See also [retention filters](#retention-filters).
|
||||
|
||||
The downsampling filters can be evaluated for free by downloading and using enterprise binaries from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest).
|
||||
See [how to request a free trial license](https://victoriametrics.com/products/enterprise/trial/).
|
||||
|
||||
## Multi-tenancy
|
||||
|
||||
|
|
|
@ -1989,45 +1989,72 @@ to historical data.
|
|||
|
||||
See [how to configure multiple retentions in VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#retention-filters).
|
||||
|
||||
See also [downsampling](#downsampling).
|
||||
|
||||
Retention filters can be evaluated for free by downloading and using enterprise binaries from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest).
|
||||
See how to request a free trial license [here](https://victoriametrics.com/products/enterprise/trial/).
|
||||
|
||||
## Downsampling
|
||||
|
||||
[VictoriaMetrics Enterprise](https://docs.victoriametrics.com/enterprise.html) supports multi-level downsampling with `-downsampling.period` command-line flag. For example:
|
||||
[VictoriaMetrics Enterprise](https://docs.victoriametrics.com/enterprise.html) supports multi-level downsampling via `-downsampling.period=offset:interval` command-line flag.
|
||||
This command-line flag instructs leaving the last sample per each `interval` for [time series](https://docs.victoriametrics.com/keyconcepts/#time-series)
|
||||
[samples](https://docs.victoriametrics.com/keyconcepts/#raw-samples) older than the `offset`. For example, `-downsampling.period=30d:5m` instructs leaving the last sample
|
||||
per each 5-minute interval for samples older than 30 days, while the rest of samples aren't downsampled.
|
||||
|
||||
* `-downsampling.period=30d:5m` instructs VictoriaMetrics to [deduplicate](#deduplication) samples older than 30 days with 5 minutes interval.
|
||||
The `-downsampling.period` command-line flag can be specified multiple times in order to apply different downsampling levels for different time ranges (aka multi-level downsampling).
|
||||
For example, `-downsampling.period=30d:5m,180d:1h` instructs leaving the last sample per each 5-minute interval for samples older than 30 days,
|
||||
while leaving the last sample per each 1-hour interval for samples older than 180 days.
|
||||
|
||||
* `-downsampling.period=30d:5m,180d:1h` instructs VictoriaMetrics to deduplicate samples older than 30 days with 5 minutes interval and to deduplicate samples older than 180 days with 1 hour interval.
|
||||
VictoriaMetrics supports configuring independent downsampling per different sets of [time series](https://docs.victoriametrics.com/keyconcepts/#time-series)
|
||||
via `-downsampling.period=filter:offset:interval` syntax. In this case the given `offset:interval` downsampling is applied only to time series matching the given `filter`.
|
||||
The `filter` can contain arbitrary [series filter](https://docs.victoriametrics.com/keyConcepts.html#filtering).
|
||||
For example, `-downsampling.period='{__name__=~"(node|process)_.*"}:1d:1m` instructs VictoriaMetrics to deduplicate samples older than one day with one minute interval
|
||||
only for [time series](https://docs.victoriametrics.com/keyconcepts/#time-series) with names starting with `node_` or `process_` prefixes.
|
||||
The de-duplication for other time series can be configured independently via additional `-downsampling.period` command-line flags.
|
||||
|
||||
If the time series doesn't match any `filter`, then it isn't downsampled. If the time series matches multiple filters, then the downsampling
|
||||
for the first matching `filter` is applied. For example, `-downsampling.period='{env="prod"}:1d:30s,{__name__=~"node_.*"}:1d:5m'` de-duplicates
|
||||
samples older than one day with 30 seconds interval across all the time series with `env="prod"` [label](https://docs.victoriametrics.com/keyconcepts/#labels),
|
||||
even if their names start with `node_` prefix. All the other time series with names starting with `node_` prefix are de-duplicated with 5 minutes interval.
|
||||
|
||||
If downsampling shouldn't be applied to some time series matching the given `filter`, then pass `-downsampling.period=filter:0s:0s` command-line flag to VictoriaMetrics.
|
||||
For example, if series with `env="prod"` label shouldn't be downsampled, then pass `-downsampling.period='{env="prod"}:0s:0s'` command-line flag in front of other `-downsampling.period` flags.
|
||||
|
||||
Downsampling is applied independently per each time series and leaves a single [raw sample](https://docs.victoriametrics.com/keyConcepts.html#raw-samples)
|
||||
with the biggest [timestamp](https://en.wikipedia.org/wiki/Unix_time) on the configured interval, in the same way as [deduplication](#deduplication) does.
|
||||
It works the best for [counters](https://docs.victoriametrics.com/keyConcepts.html#counter) and [histograms](https://docs.victoriametrics.com/keyConcepts.html#histogram),
|
||||
as their values are always increasing. But downsampling [gauges](https://docs.victoriametrics.com/keyConcepts.html#gauge)
|
||||
and [summaries](https://docs.victoriametrics.com/keyConcepts.html#summary)
|
||||
would mean losing the changes within the downsampling interval. Please note, you can use [recording rules](https://docs.victoriametrics.com/vmalert.html#rules)
|
||||
or [steaming aggregation](https://docs.victoriametrics.com/stream-aggregation.html)
|
||||
as their values are always increasing. Downsampling [gauges](https://docs.victoriametrics.com/keyConcepts.html#gauge)
|
||||
and [summaries](https://docs.victoriametrics.com/keyConcepts.html#summary) lose some changes within the downsampling interval,
|
||||
since only the last sample on the given interval is left and the rest of samples are dropped.
|
||||
|
||||
You can use [recording rules](https://docs.victoriametrics.com/vmalert.html#rules) or [steaming aggregation](https://docs.victoriametrics.com/stream-aggregation.html)
|
||||
to apply custom aggregation functions, like min/max/avg etc., in order to make gauges more resilient to downsampling.
|
||||
|
||||
Downsampling can reduce disk space usage and improve query performance if it is applied to time series with big number
|
||||
of samples per each series. The downsampling doesn't improve query performance if the database contains big number
|
||||
of time series with small number of samples per each series (aka [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate)),
|
||||
since downsampling doesn't reduce the number of time series. In this case the majority of query time is spent on searching for the matching time series
|
||||
instead of processing the found samples.
|
||||
of samples per each series. The downsampling doesn't improve query performance and doesn't reduce disk space if the database contains big number
|
||||
of time series with small number of samples per each series, since downsampling doesn't reduce the number of time series.
|
||||
So there is little sense in applying downsampling to time series with [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate).
|
||||
In this case the majority of query time is spent on searching for the matching time series instead of processing the found samples.
|
||||
It is possible to use [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html) in [vmagent](https://docs.victoriametrics.com/vmagent.html)
|
||||
or recording rules in [vmalert](https://docs.victoriametrics.com/vmalert.html) in order to
|
||||
or [recording rules in vmalert](https://docs.victoriametrics.com/vmalert.html#rules) in order to
|
||||
[reduce the number of time series](https://docs.victoriametrics.com/vmalert.html#downsampling-and-aggregation-via-vmalert).
|
||||
|
||||
Downsampling happens during [background merges](https://docs.victoriametrics.com/#storage)
|
||||
and can't be performed if there is not enough of free disk space or if vmstorage
|
||||
is in [read-only mode](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#readonly-mode).
|
||||
Downsampling is performed during [background merges](https://docs.victoriametrics.com/#storage).
|
||||
It cannot be performed if there is not enough of free disk space or if vmstorage is in [read-only mode](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#readonly-mode).
|
||||
|
||||
Please, note that intervals of `-downsampling.period` must be multiples of each other.
|
||||
In case [deduplication](https://docs.victoriametrics.com/#deduplication) is enabled value of `-dedup.minScrapeInterval` must also be multiple of `-downsampling.period` intervals.
|
||||
This is required to ensure consistency of deduplication and downsampling results.
|
||||
Please, note that intervals of `-downsampling.period` must be multiples of each other.
|
||||
In case [deduplication](https://docs.victoriametrics.com/#deduplication) is enabled, value of `-dedup.minScrapeInterval` command-line flag must also
|
||||
be multiple of `-downsampling.period` intervals. This is required to ensure consistency of deduplication and downsampling results.
|
||||
|
||||
The downsampling can be evaluated for free by downloading and using enterprise binaries from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest).
|
||||
See how to request a free trial license [here](https://victoriametrics.com/products/enterprise/trial/).
|
||||
It is safe updating `-downsampling.period` during VictoriaMetrics restarts - the updated downsampling configuration will be
|
||||
applied eventually to historical data during [background merges](https://docs.victoriametrics.com/#storage).
|
||||
|
||||
See [how to configure downsampling in VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#downsampling).
|
||||
|
||||
See also [retention filters](#retention-filters).
|
||||
|
||||
The downsampling filters can be evaluated for free by downloading and using enterprise binaries from [the releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest).
|
||||
See [how to request a free trial license](https://victoriametrics.com/products/enterprise/trial/).
|
||||
|
||||
## Multi-tenancy
|
||||
|
||||
|
|
|
@ -137,7 +137,7 @@ func (ph *partHeader) MustReadMetadata(partPath string) {
|
|||
metadataPath := filepath.Join(partPath, metadataFilename)
|
||||
if !fs.IsPathExist(metadataPath) {
|
||||
// This is a part created before v1.90.0.
|
||||
// Fall back to reading the metadata from the partPath itsel
|
||||
// Fall back to reading the metadata from the partPath itself.
|
||||
if err := ph.ParseFromPath(partPath); err != nil {
|
||||
logger.Panicf("FATAL: cannot parse metadata from %q: %s", partPath, err)
|
||||
}
|
||||
|
|
Loading…
Reference in a new issue