docs/stream-aggregation.md: clarify "Routing" chapter a bit after f153f54d11

This commit is contained in:
Aliaksandr Valialkin 2024-07-15 21:42:48 +02:00
parent db557b86ee
commit fc637555cd
No known key found for this signature in database
GPG key ID: 52C003EE2BCDB9EB

View file

@ -19,8 +19,8 @@ The aggregation is applied to all the metrics received via any [supported data i
and/or scraped from [Prometheus-compatible targets](https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter) and/or scraped from [Prometheus-compatible targets](https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter)
after applying all the configured [relabeling stages](https://docs.victoriametrics.com/vmagent/#relabeling). after applying all the configured [relabeling stages](https://docs.victoriametrics.com/vmagent/#relabeling).
_By default, stream aggregation ignores timestamps associated with the input [samples](https://docs.victoriametrics.com/keyconcepts/#raw-samples). **By default, stream aggregation ignores timestamps associated with the input [samples](https://docs.victoriametrics.com/keyconcepts/#raw-samples).
It expects that the ingested samples have timestamps close to the current time. See [how to ignore old samples](#ignoring-old-samples)._ It expects that the ingested samples have timestamps close to the current time. See [how to ignore old samples](#ignoring-old-samples).**
## Configuration ## Configuration
@ -28,9 +28,9 @@ Stream aggregation can be configured via the following command-line flags:
- `-streamAggr.config` at [single-node VictoriaMetrics](https://docs.victoriametrics.com/single-server-victoriametrics/) - `-streamAggr.config` at [single-node VictoriaMetrics](https://docs.victoriametrics.com/single-server-victoriametrics/)
and at [vmagent](https://docs.victoriametrics.com/vmagent/). and at [vmagent](https://docs.victoriametrics.com/vmagent/).
- `-remoteWrite.streamAggr.config` at [vmagent](https://docs.victoriametrics.com/vmagent/) only. - `-remoteWrite.streamAggr.config` at [vmagent](https://docs.victoriametrics.com/vmagent/) only. This flag can be specified individually
This flag can be specified individually per each `-remoteWrite.url` and aggregation will happen independently for each of them. per each `-remoteWrite.url`, so the aggregation happens independently per each remote storage destination.
This allows writing different aggregates to different remote storage destinations. This allows writing different aggregates to different remote storage systems.
These flags must point to a file containing [stream aggregation config](#stream-aggregation-config). These flags must point to a file containing [stream aggregation config](#stream-aggregation-config).
The file may contain `%{ENV_VAR}` placeholders which are substituted by the corresponding `ENV_VAR` environment variable values. The file may contain `%{ENV_VAR}` placeholders which are substituted by the corresponding `ENV_VAR` environment variable values.
@ -60,26 +60,24 @@ The processed data is then stored in local storage and **can't be forwarded furt
[vmagent](https://docs.victoriametrics.com/vmagent/) supports relabeling, deduplication and stream aggregation for all [vmagent](https://docs.victoriametrics.com/vmagent/) supports relabeling, deduplication and stream aggregation for all
the received data, scraped or pushed. Then, the collected data will be forwarded to specified `-remoteWrite.url` destinations. the received data, scraped or pushed. Then, the collected data will be forwarded to specified `-remoteWrite.url` destinations.
The data processing order is the following: The data processing order is the following:
1. All the received data is [relabeled](https://docs.victoriametrics.com/vmagent/#relabeling) according to
specified `-remoteWrite.relabelConfig`;
1. All the received data is [deduplicated](https://docs.victoriametrics.com/stream-aggregation/#deduplication)
according to specified `-streamAggr.dedupInterval`;
1. All the received data is aggregated according to specified `-streamAggr.config`;
1. The resulting data from p1 and p2 is then replicated to each `-remoteWrite.url`;
1. Data sent to each `-remoteWrite.url` can be additionally relabeled according to the
corresponding `-remoteWrite.urlRelabelConfig` (set individually per URL);
1. Data sent to each `-remoteWrite.url` can be additionally deduplicated according to the
corresponding `-remoteWrite.streamAggr.dedupInterval` (set individually per URL);
1. Data sent to each `-remoteWrite.url` can be additionally aggregated according to the
corresponding `-remoteWrite.streamAggr.config` (set individually per URL). Please note, it is not recommended
to use `-streamAggr.config` and `-remoteWrite.streamAggr.config` together, unless you understand the complications.
Typical scenarios for data routing with vmagent: 1. all the received data is relabeled according to the specified [`-remoteWrite.relabelConfig`](https://docs.victoriametrics.com/vmagent/#relabeling) (if it is set)
1. **Aggregate incoming data and replicate to N destinations**. For this one should configure `-streamAggr.config` 1. all the received data is deduplicated according to specified [`-streamAggr.dedupInterval`](https://docs.victoriametrics.com/stream-aggregation/#deduplication)
(if it is set to duration bigger than 0)
1. all the received data is aggregated according to specified [`-streamAggr.config`](https://docs.victoriametrics.com/stream-aggregation/#configuration) (if it is set)
1. the resulting data is then replicated to each `-remoteWrite.url`
1. data sent to each `-remoteWrite.url` can be additionally relabeled according to the corresponding `-remoteWrite.urlRelabelConfig` (set individually per URL)
1. data sent to each `-remoteWrite.url` can be additionally deduplicated according to the corresponding `-remoteWrite.streamAggr.dedupInterval` (set individually per URL)
1. data sent to each `-remoteWrite.url` can be additionally aggregated according to the corresponding `-remoteWrite.streamAggr.config` (set individually per URL)
It isn't recommended using `-streamAggr.config` and `-remoteWrite.streamAggr.config` simultaneously, unless you understand the complications.
Typical scenarios for data routing with `vmagent`:
1. **Aggregate incoming data and replicate to N destinations**. Specify [`-streamAggr.config`](https://docs.victoriametrics.com/stream-aggregation/#configuration) command-line flag
to aggregate the incoming data before replicating it to all the configured `-remoteWrite.url` destinations. to aggregate the incoming data before replicating it to all the configured `-remoteWrite.url` destinations.
2. **Individually aggregate incoming data for each destination**. For this on should configure `-remoteWrite.streamAggr.config` 2. **Individually aggregate incoming data for each destination**. Specify [`-remoteWrite.streamAggr.config`](https://docs.victoriametrics.com/stream-aggregation/#configuration)
for each `-remoteWrite.url` destination. [Relabeling](https://docs.victoriametrics.com/vmagent/#relabeling) command-line flag for each `-remoteWrite.url` destination. [Relabeling](https://docs.victoriametrics.com/vmagent/#relabeling) via `-remoteWrite.urlRelabelConfig`
via `-remoteWrite.urlRelabelConfig` can be used for routing only selected metrics to each `-remoteWrite.url` destination. can be used for routing only the selected metrics to each `-remoteWrite.url` destination.
## Deduplication ## Deduplication