---
sort: 1
weight: 1
title: Use cases
menu:
  docs:
    identifier: stream-aggregation-use-cases
    parent: 'stream-aggregation'
    weight: 1
aliases:
  - /stream-aggregation/use-cases/
  - /stream-aggregation/use-cases/index.html
---

## Statsd alternative

Stream aggregation can be used as a [statsd](https://github.com/statsd/statsd) alternative in the following cases:

* [Counting input samples](#counting-input-samples)
* [Summing input metrics](#summing-input-metrics)
* [Quantiles over input metrics](#quantiles-over-input-metrics)
* [Histograms over input metrics](#histograms-over-input-metrics)
* [Aggregating histograms](#aggregating-histograms)

Currently, streaming aggregation is available only for [supported data ingestion protocols](https://docs.victoriametrics.com/#how-to-import-time-series-data)
and is not available for the [Statsd metrics format](https://github.com/statsd/statsd/blob/master/docs/metric_types.md).

## Recording rules alternative

Sometimes [alerting queries](https://docs.victoriametrics.com/vmalert#alerting-rules) may require non-trivial amounts of CPU, RAM,
disk IO and network bandwidth on the metrics storage side. For example, if the `http_request_duration_seconds` histogram is generated by thousands
of application instances, then the alerting query `histogram_quantile(0.99, sum(increase(http_request_duration_seconds_bucket[5m])) without (instance)) > 0.5`
can become slow, since it needs to scan too many unique [time series](https://docs.victoriametrics.com/keyconcepts#time-series)
with the `http_request_duration_seconds_bucket` name. This alerting query can be accelerated by pre-calculating
`sum(increase(http_request_duration_seconds_bucket[5m])) without (instance)` via a [recording rule](https://docs.victoriametrics.com/vmalert#recording-rules).
But this recording rule may also take too long to execute. In this case the slow recording rule can be substituted
with the following [stream aggregation config](./configuration/#configuration-file-reference):

```yaml
- match: 'http_request_duration_seconds_bucket'
  interval: 5m
  without: [instance]
  outputs: [total]
```

This stream aggregation generates `http_request_duration_seconds_bucket:5m_without_instance_total` output series according to [output metric naming](./key-concepts.md#output-metric-names).
Then these series can be used in [alerting rules](https://docs.victoriametrics.com/vmalert#alerting-rules):

```metricsql
histogram_quantile(0.99, last_over_time(http_request_duration_seconds_bucket:5m_without_instance_total[5m])) > 0.5
```

This query is executed much faster than the original query, because it needs to scan far fewer time series.

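For illustration, such an expression could be plugged into a [vmalert](https://docs.victoriametrics.com/vmalert/) rules file. The sketch below assumes the aggregation config above is already in place; the group name, alert name, threshold and annotations are placeholders:

```yaml
groups:
  # Hypothetical group and alert names - adjust to your own conventions.
  - name: http-latency
    rules:
      - alert: HighRequestLatency
        # Fire when the estimated 99th percentile of request duration
        # (taken from the stream-aggregated series) stays above 0.5s.
        expr: histogram_quantile(0.99, last_over_time(http_request_duration_seconds_bucket:5m_without_instance_total[5m])) > 0.5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "99th percentile of HTTP request duration exceeds 0.5s"
```
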
See [the list of aggregate outputs](./configuration/outputs), which can be specified in the `outputs` field.
See also [aggregating by labels](./key-concepts.md#aggregating-by-labels).

It is recommended to set the `interval` field to a value at least several times higher than the metrics collection interval.

## Reducing the number of stored samples

If per-[series](https://docs.victoriametrics.com/keyconcepts#time-series) samples are ingested at a high frequency,
then this may result in high disk space usage, since too much data must be stored to disk. This also may result
in slow queries, since too much data must be processed during queries.

This can be fixed with stream aggregation by increasing the interval between per-series samples stored in the database.

For example, the following [stream aggregation config](./configuration/#configuration-file-reference) reduces the frequency of input samples
to one sample per 5 minutes for each input time series (this operation is also known as downsampling):

```yaml
# Aggregate metrics ending with _total with `total` output.
# See {{% ref "./configuration/outputs" %}}
- match: '{__name__=~".+_total"}'
  interval: 5m
  outputs: [total]

# Downsample other metrics with `count_samples`, `sum_samples`, `min` and `max` outputs
# See {{% ref "./configuration/outputs" %}}
- match: '{__name__!~".+_total"}'
  interval: 5m
  outputs: [count_samples, sum_samples, min, max]
```

The aggregated output metrics have the following names according to [output metric naming](./key-concepts.md#output-metric-names):

```text
# For input metrics ending with _total
some_metric_total:5m_total

# For input metrics not ending with _total
some_metric:5m_count_samples
some_metric:5m_sum_samples
some_metric:5m_min
some_metric:5m_max
```

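The downsampled outputs can be recombined at query time. For example, a minimal sketch, assuming the config above, of restoring the average value of raw `some_metric` samples within each 5-minute window:

```metricsql
# Per-window sum of samples divided by per-window sample count.
some_metric:5m_sum_samples / some_metric:5m_count_samples
```
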
See [the list of aggregate outputs](./configuration/outputs), which can be specified in the `outputs` field.
See also [aggregating histograms](#aggregating-histograms) and [aggregating by labels](./key-concepts.md#aggregating-by-labels).

## Reducing the number of stored series

Sometimes applications may generate too many [time series](https://docs.victoriametrics.com/keyconcepts#time-series).
For example, the `http_requests_total` metric may have a `path` or `user` label with too many unique values.
In this case the following stream aggregation config can be used for reducing the number of series stored in VictoriaMetrics:

```yaml
- match: 'http_requests_total'
  interval: 30s
  without: [path, user]
  outputs: [total]
```

This config specifies, in the `without` list, the labels that must be removed from the aggregate output.
See [these docs](./key-concepts.md#aggregating-by-labels) for more details.

The aggregated output metric has the following name according to [output metric naming](./key-concepts.md#output-metric-names):

```text
http_requests_total:30s_without_path_user_total
```

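Since the `total` output produces a counter, the aggregated series can be queried in the same way as the original `http_requests_total`. For example, a sketch of the per-second request rate without the dropped `path` and `user` labels:

```metricsql
rate(http_requests_total:30s_without_path_user_total[5m])
```
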
See [the list of aggregate outputs](./configuration/outputs), which can be specified in the `outputs` field.
See also [aggregating histograms](#aggregating-histograms).

## Counting input samples

If the monitored application generates event-based metrics, then it may be useful to count the number of such metrics
at the stream aggregation level.

For example, if an advertising server generates `hits{some="labels"} 1` and `clicks{some="labels"} 1` metrics
for each incoming hit and click, then the following [stream aggregation config](./configuration/#configuration-file-reference)
can be used for counting these metrics per 30-second interval:

```yaml
- match: '{__name__=~"hits|clicks"}'
  interval: 30s
  outputs: [count_samples]
```

This config generates the following output metrics for `hits` and `clicks` input metrics
according to [output metric naming](./key-concepts.md#output-metric-names):

```text
hits:30s_count_samples count1
clicks:30s_count_samples count2
```

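Each output series receives one sample per 30-second interval with the number of input samples seen during that interval, so longer windows can be obtained by summing these per-interval counts at query time. A sketch, assuming the config above, of the number of hits over the last hour:

```metricsql
sum_over_time(hits:30s_count_samples[1h])
```
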
See [the list of aggregate outputs](./configuration/outputs), which can be specified in the `outputs` field.
See also [aggregating by labels](./key-concepts.md#aggregating-by-labels).

## Summing input metrics

If the monitored application counts some events and then sends the calculated number of events to VictoriaMetrics
at irregular intervals or at too high a frequency, then stream aggregation can be used for summing such events
and writing the aggregate sums to the storage at regular intervals.

For example, if an advertising server generates `hits{some="labels"} N` and `clicks{some="labels"} M` metrics
at irregular intervals, then the following [stream aggregation config](./configuration/#configuration-file-reference)
can be used for summing these metrics per minute:

```yaml
- match: '{__name__=~"hits|clicks"}'
  interval: 1m
  outputs: [sum_samples]
```

This config generates the following output metrics according to [output metric naming](./key-concepts.md#output-metric-names):

```text
hits:1m_sum_samples sum1
clicks:1m_sum_samples sum2
```

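The per-minute sums can likewise be combined at query time. For example, a sketch, assuming the config above, of a click-through rate over the last hour (both outputs are per-interval sums, so they are rolled up with `sum_over_time`):

```metricsql
sum_over_time(clicks:1m_sum_samples[1h]) / sum_over_time(hits:1m_sum_samples[1h])
```
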
See [the list of aggregate outputs](./configuration/outputs), which can be specified in the `outputs` field.
See also [aggregating by labels](./key-concepts.md#aggregating-by-labels).

## Quantiles over input metrics

If the monitored application generates measurement metrics per request, then it may be useful to calculate
a pre-defined set of [percentiles](https://en.wikipedia.org/wiki/Percentile) over these measurements.

For example, if the monitored application generates `request_duration_seconds N` and `response_size_bytes M` metrics
for each incoming request, then the following [stream aggregation config](./configuration/#configuration-file-reference)
can be used for calculating the 50th and 99th percentiles for these metrics every 30 seconds:

```yaml
- match:
  - request_duration_seconds
  - response_size_bytes
  interval: 30s
  outputs: ["quantiles(0.50, 0.99)"]
```

This config generates the following output metrics according to [output metric naming](./key-concepts.md#output-metric-names):

```text
request_duration_seconds:30s_quantiles{quantile="0.50"} value1
request_duration_seconds:30s_quantiles{quantile="0.99"} value2

response_size_bytes:30s_quantiles{quantile="0.50"} value1
response_size_bytes:30s_quantiles{quantile="0.99"} value2
```

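The resulting series can be selected by their `quantile` label like any other series. For example, a sketch of the highest observed 99th percentile of request duration across the 30-second windows of the last hour:

```metricsql
max_over_time(request_duration_seconds:30s_quantiles{quantile="0.99"}[1h])
```
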
See [the list of aggregate outputs](./configuration/outputs), which can be specified in the `outputs` field.
See also [histograms over input metrics](#histograms-over-input-metrics) and [aggregating by labels](./key-concepts.md#aggregating-by-labels).

## Histograms over input metrics

If the monitored application generates measurement metrics per request, then it may be useful to calculate
a [histogram](https://docs.victoriametrics.com/keyconcepts#histogram) over these metrics.

For example, if the monitored application generates `request_duration_seconds N` and `response_size_bytes M` metrics
for each incoming request, then the following [stream aggregation config](./configuration/#configuration-file-reference)
can be used for calculating [VictoriaMetrics histogram buckets](https://valyala.medium.com/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350)
for these metrics every 60 seconds:

```yaml
- match:
  - request_duration_seconds
  - response_size_bytes
  interval: 60s
  outputs: [histogram_bucket]
```

This config generates the following output metrics according to [output metric naming](./key-concepts.md#output-metric-names):

```text
request_duration_seconds:60s_histogram_bucket{vmrange="start1...end1"} count1
request_duration_seconds:60s_histogram_bucket{vmrange="start2...end2"} count2
...
request_duration_seconds:60s_histogram_bucket{vmrange="startN...endN"} countN

response_size_bytes:60s_histogram_bucket{vmrange="start1...end1"} count1
response_size_bytes:60s_histogram_bucket{vmrange="start2...end2"} count2
...
response_size_bytes:60s_histogram_bucket{vmrange="startN...endN"} countN
```

The resulting histogram buckets can be queried with [MetricsQL](https://docs.victoriametrics.com/metricsql/) in the following ways:

1. Estimated 50th and 99th [percentiles](https://en.wikipedia.org/wiki/Percentile) of the request duration over the last hour:

   ```metricsql
   histogram_quantiles("quantile", 0.50, 0.99, sum(increase(request_duration_seconds:60s_histogram_bucket[1h])) by (vmrange))
   ```

   This query uses the [histogram_quantiles](https://docs.victoriametrics.com/metricsql/#histogram_quantiles) function.

1. An estimated [standard deviation](https://en.wikipedia.org/wiki/Standard_deviation) of the request duration over the last hour:

   ```metricsql
   histogram_stddev(sum(increase(request_duration_seconds:60s_histogram_bucket[1h])) by (vmrange))
   ```

   This query uses the [histogram_stddev](https://docs.victoriametrics.com/metricsql/#histogram_stddev) function.

1. An estimated share of requests with a duration smaller than `0.5s` over the last hour:

   ```metricsql
   histogram_share(0.5, sum(increase(request_duration_seconds:60s_histogram_bucket[1h])) by (vmrange))
   ```

   This query uses the [histogram_share](https://docs.victoriametrics.com/metricsql/#histogram_share) function.

See [the list of aggregate outputs](./configuration/outputs), which can be specified in the `outputs` field.
See also [quantiles over input metrics](#quantiles-over-input-metrics) and [aggregating by labels](./key-concepts.md#aggregating-by-labels).

## Aggregating histograms

A [histogram](https://docs.victoriametrics.com/keyconcepts#histogram) is a set of [counter](https://docs.victoriametrics.com/keyconcepts#counter)
metrics with different `vmrange` or `le` labels. Since these metrics are counters, the applicable aggregation output is
[total](./configuration/outputs/#total):

```yaml
- match: 'http_request_duration_seconds_bucket'
  interval: 1m
  without: [instance]
  outputs: [total]
```

This config generates the following output metrics according to [output metric naming](./key-concepts.md#output-metric-names):

```text
http_request_duration_seconds_bucket:1m_without_instance_total{le="0.1"} value1
http_request_duration_seconds_bucket:1m_without_instance_total{le="0.2"} value2
http_request_duration_seconds_bucket:1m_without_instance_total{le="0.4"} value3
http_request_duration_seconds_bucket:1m_without_instance_total{le="1"} value4
http_request_duration_seconds_bucket:1m_without_instance_total{le="3"} value5
http_request_duration_seconds_bucket:1m_without_instance_total{le="+Inf"} value6
```

The resulting metrics can be passed to the [histogram_quantile](https://docs.victoriametrics.com/metricsql#histogram_quantile)
function:

```metricsql
histogram_quantile(0.9, sum(rate(http_request_duration_seconds_bucket:1m_without_instance_total[5m])) by (le))
```

Please note that histograms can be aggregated only if their `le` labels are configured identically.
[VictoriaMetrics histogram buckets](https://valyala.medium.com/improving-histogram-usability-for-prometheus-and-grafana-bc7e5df0e350)
have no such requirement.

See [the list of aggregate outputs](./configuration/outputs), which can be specified in the `outputs` field.
See also [histograms over input metrics](#histograms-over-input-metrics) and [quantiles over input metrics](#quantiles-over-input-metrics).