VictoriaMetrics/docs/stream-aggregation/configuration
2024-08-10 08:12:38 +03:00

Stream aggregation can be configured via the following command-line flags:

  • -streamAggr.config at single-node VictoriaMetrics and at vmagent.
  • -remoteWrite.streamAggr.config at vmagent only. This flag can be specified individually per each -remoteWrite.url and aggregation will happen independently for each of them. This allows writing different aggregates to different remote storage destinations.

These flags must point to a file containing stream aggregation config. The file may contain %{ENV_VAR} placeholders which are substituted by the corresponding ENV_VAR environment variable values.
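As an illustration of the placeholder syntax, a config file might look like the sketch below. The DEPLOY_ENV variable name, the metric name and the env label are assumptions made for this example only.

```yaml
# Hypothetical file passed via -streamAggr.config or -remoteWrite.streamAggr.config.
# %{DEPLOY_ENV} is substituted with the value of the DEPLOY_ENV environment
# variable (an assumed name), so the same file can be reused across environments.
- match: 'http_requests_total{env="%{DEPLOY_ENV}"}'
  interval: 1m
  outputs: [total]
```

With DEPLOY_ENV=prod in the process environment, the selector above would read http_requests_total{env="prod"} after substitution.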

By default, the following data is written to the storage when stream aggregation is enabled:

  • the aggregated samples;
  • the raw input samples that didn't match any match option in the provided config.

This behaviour can be changed via the following command-line flags:

  • -streamAggr.keepInput at single-node VictoriaMetrics and vmagent. At vmagent, the -remoteWrite.streamAggr.keepInput flag can be specified individually for each -remoteWrite.url. If one of these flags is set, then all the input samples are written to the storage alongside the aggregated samples.
  • -streamAggr.dropInput at single-node VictoriaMetrics and vmagent. At vmagent, the -remoteWrite.streamAggr.dropInput flag can be specified individually for each -remoteWrite.url. If one of these flags is set, then all the input samples are dropped and only the aggregated samples are written to the storage.
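The effect of these flags can be traced on a small config. The sketch below uses an assumed metric name, and the comments restate the rules above for that single entry rather than adding new behaviour.

```yaml
# Assumed single-entry config, used only to show which samples reach the storage.
- match: 'http_requests_total'
  interval: 1m
  outputs: [total]

# With neither flag set (default):
#   written:     aggregated samples for http_requests_total, plus the raw samples
#                of every series that does not match the selector above
#   not written: raw http_requests_total samples
# With -streamAggr.keepInput set:
#   written:     aggregated samples plus all the raw input samples
# With -streamAggr.dropInput set:
#   written:     aggregated samples only
```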

Configuration File Reference

Overview

The stream aggregation config file contains YAML-formatted configs and is passed via the -streamAggr.config command-line flag at single-node VictoriaMetrics and vmagent. At vmagent, the -remoteWrite.streamAggr.config command-line flag can be specified individually for each -remoteWrite.url, or just once, in which case the same aggregation rules apply to every -remoteWrite.url.

Example configuration

- match: 'http_request_duration_seconds_bucket{env=~"prod|staging"}'
  interval: 1m
  by: [vmrange]
  outputs: [total]

Each aggregation config supports the following options:

  • name (string: "none") - name of the given streaming aggregation config. If it is set, then it is used as the name label in the metrics exposed for the given aggregation config at the /metrics page. See the monitoring documentation for details.
  • match (list<string> or string: []) - an optional filter for incoming samples to aggregate. It can contain an arbitrary Prometheus series selector according to filtering concepts. If match isn't set, then all the incoming samples are aggregated. match can also contain a list of series selectors. In that case, incoming samples are aggregated if they match at least one of the selectors.
  • interval (string: "", required) - interval for the aggregation. The aggregated stats are sent to remote storage once per interval.
  • dedup_interval (string: "") - interval for de-duplication of input samples before the aggregation. Samples are de-duplicated on a per-series basis. See timeseries and deduplication. The deduplication is performed after input_relabel_configs relabeling is applied. By default, the deduplication is disabled unless the -remoteWrite.streamAggr.dedupInterval or -streamAggr.dedupInterval command-line flags are set.
  • staleness_interval (string:2*interval) - interval for resetting the per-series state if no new samples are received during this interval for the following outputs:
  • no_align_flush_to_interval (bool: false) - disables aligning of flush times for the aggregated data to multiples of interval. By default, flush times for the aggregated data are aligned to multiples of interval. For example:
    • if interval: 1m is set, then flushes happen at the end of every minute;
    • if interval: 1h is set, then flushes happen at the end of every hour.
  • flush_on_shutdown (bool: false) - enables flushing of aggregated data to the storage for the first and the last intervals when vmagent starts, restarts or reloads its configuration. By default, incomplete aggregated data isn't flushed to the storage, since it is usually confusing.
  • without (list<string>: []) - list of labels that must be removed from the output aggregation. See aggregation by labels.
  • by (list<string>: []) - list of labels that must be preserved in the output aggregation. See aggregation by labels.
  • outputs (list<string>:[], required) - list of aggregations to perform on the input data. See aggregation outputs.
  • keep_metric_names (bool: false) - keeps the original metric names for the aggregated samples. This option can be set only if the outputs list contains a single output. By default, a special suffix is added to the original metric names in the aggregated samples. See output metric names.
  • ignore_old_samples (bool: false) - ignores input samples with old timestamps outside the current aggregation interval. See ignoring old samples. See also the -remoteWrite.streamAggr.ignoreOldSamples and -streamAggr.ignoreOldSamples command-line flags.
  • ignore_first_intervals (int: 0) - ignores the first N aggregation intervals after process start. See ignore first intervals on start. See also the -remoteWrite.streamAggr.ignoreFirstIntervals and -streamAggr.ignoreFirstIntervals command-line flags.
  • drop_input_labels (list<string>: []) - list of labels to drop from input samples. The labels are dropped before input_relabel_configs are applied, which means they are also dropped before deduplication and stream aggregation.
  • input_relabel_configs (array<relabel_config>: []) - relabeling rules, which are applied to the incoming samples after they pass the match filter and before they are aggregated. See relabeling.
  • output_relabel_configs (array<relabel_config>: []) - relabeling rules, which are applied to the aggregated output metrics.
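To show how several of these options combine, here is a hedged sketch of a single aggregation config. The name value, the pod and aggregated labels, and the chosen intervals are assumptions for illustration. The relabeling rules use the Prometheus-style relabel_config format.

```yaml
- name: request-latency            # assumed name; used as the name label at /metrics
  match:                           # a sample is aggregated if it matches any selector
    - 'http_request_duration_seconds_bucket{env="prod"}'
    - 'http_request_duration_seconds_bucket{env="staging"}'
  interval: 1m                     # aggregated stats are sent once per minute
  dedup_interval: 30s              # de-duplicate input samples per series before aggregating
  by: [vmrange, env]               # only these labels are preserved in the output
  outputs: [total]
  keep_metric_names: true          # allowed because outputs contains a single output
  ignore_old_samples: true         # skip samples with timestamps outside the current interval
  ignore_first_intervals: 1        # skip the first aggregation interval after process start
  input_relabel_configs:           # applied after the match filter, before aggregation
    - action: labeldrop
      regex: pod                   # hypothetical label dropped from the input
  output_relabel_configs:          # applied to the aggregated output metrics
    - target_label: aggregated     # hypothetical marker label
      replacement: "true"
```

Options left out of the sketch keep the defaults listed above.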

The file can contain multiple aggregation configs. The aggregation is performed independently per each specified config entry.
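For instance, a file with two independent entries could look like this minimal sketch. The metric names, labels and intervals are assumed for the example.

```yaml
# First entry: per-minute totals with the instance label removed.
- match: 'http_requests_total'
  interval: 1m
  without: [instance]
  outputs: [total]

# Second entry: five-minute average and maximum per job.
# It is evaluated independently of the entry above.
- match: 'process_resident_memory_bytes'
  interval: 5m
  by: [job]
  outputs: [avg, max]
```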

Configuration update

vmagent and single-node VictoriaMetrics support the following approaches for hot reloading stream aggregation configs from -remoteWrite.streamAggr.config and -streamAggr.config:

  • By sending the SIGHUP signal to the vmagent or victoria-metrics process:

    kill -SIGHUP `pidof vmagent`
    
  • By sending an HTTP request to the /-/reload endpoint (e.g. http://vmagent:8429/-/reload or http://victoria-metrics:8428/-/reload).