Commit graph

13 commits

Author SHA1 Message Date
Aliaksandr Valialkin
d10932bd99
lib/streamaggr: benchmark only flush routines in BenchmarkDedupAggrFlushSerial and BenchmarkAggregatorsFlushSerial 2024-03-04 19:13:50 +02:00
Aliaksandr Valialkin
9e00d8ad60
lib/streamaggr: use multiple job label values in BenchmarkAggregatorsPush instead of single value
This should make the benchmark closer to production cases
2024-03-04 19:13:48 +02:00
Aliaksandr Valialkin
9773ad200e
lib/streamaggr: use multiple job labels in BenchmarkAggregatorsPush 2024-03-04 19:13:48 +02:00
Aliaksandr Valialkin
2ffef39bb3
lib/streamaggr: properly drop samples on the first incomplete interval
Previously samples were dropped on the first incomplete interval and the next complete interval.
Also make sure that the de-duplication is performed just before flushing the aggregate state.
This should help the case then dedup_interval = interval.
2024-03-04 17:01:40 +02:00
Aliaksandr Valialkin
48a425898a
lib/streamaggr: enable time alignment for aggregate flushed to multiples of interval
For example, if `interval: 1m`, then data flush occurs at the end of every minute,
while `interval: 1h` leads to data flush at the end of every hour.

Add `no_align_flush_to_interval` option, which can be used for disabling the alignment.
2024-03-04 06:23:35 +02:00
Aliaksandr Valialkin
5205972b83
lib/streamaggr: add a benchmark for measuring the performance of aggregator.flush 2024-03-04 01:22:40 +02:00
Aliaksandr Valialkin
0d5d46f9db
lib/streamaggr: huge pile of changes
- Reduce memory usage by up to 5x when de-duplicating samples across big number of time series.
- Reduce memory usage by up to 5x when aggregating across big number of output time series.
- Add lib/promutils.LabelsCompressor, which is going to be used by other VictoriaMetrics components
  for reducing memory usage for marshaled []prompbmarshal.Label.
- Add `dedup_interval` option at aggregation config, which allows setting individual
  deduplication intervals per each aggregation.
- Add `keep_metric_names` option at aggregation config, which allows keeping the original
  metric names in the output samples.
- Add `unique_samples` output, which counts the number of unique sample values.
- Add `increase_prometheus` and `total_prometheus` outputs, which ignore the first sample
  per each newly encountered time series.
- Use 64-bit hashes instead of marshaled labels as map keys when calculating `count_series` output.
  This makes obsolete https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5579
- Expose various metrics, which may help debugging stream aggregation:
  - vm_streamaggr_dedup_state_size_bytes - the size of data structures responsible for deduplication
  - vm_streamaggr_dedup_state_items_count - the number of items in the deduplication data structures
  - vm_streamaggr_labels_compressor_size_bytes - the size of labels compressor data structures
  - vm_streamaggr_labels_compressor_items_count - the number of entries in the labels compressor
  - vm_streamaggr_flush_duration_seconds - a histogram, which shows the duration of stream aggregation flushes
  - vm_streamaggr_dedup_flush_duration_seconds - a histogram, which shows the duration of deduplication flushes
  - vm_streamaggr_flush_timeouts_total - counter for timed out stream aggregation flushes,
    which took longer than the configured interval
  - vm_streamaggr_dedup_flush_timeouts_total - counter for timed out deduplication flushes,
    which took longer than the configured dedup_interval
- Actualize docs/stream-aggregation.md

The memory usage reduction increases CPU usage during stream aggregation by up to 30%.

This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5850
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5898
2024-03-02 03:15:43 +02:00
Aliaksandr Valialkin
8187244153
lib/streamaggr: make the BenchmarkAggregatorsPushByJobAvg closer to production case with long list of labels per sample 2024-02-29 02:41:48 +02:00
Aliaksandr Valialkin
e6e5b97e1e
lib/streamaggr: expand %{ENV} placeholders in stream aggregation configs 2024-01-24 12:31:42 +02:00
Aliaksandr Valialkin
c049778ad1
lib/streamaggr: follow-up for 736197179e
- Use a byte slice instead of a map for tracking indexes for matching series.
  This improves performance, since access by slice index is faster than access by map key.
- Re-use the byte slice for tracking indexes for matching series.
  This removes unnecessary memory allocations and improves stream aggregation performance a bit.
- Add an ability to return to the previous behvaiour by specifying -remoteWrite.streamAggr.dropInput command-line flag.
  In this case all the input samples are dropped when stream aggregation is enabled.
- Backport the new stream aggregation behaviour from vmagent to single-node VictoriaMetrics when -streamAggr.config
  option is set.
- Improve docs regarding this change at docs/CHANGELOG.md
- Document the new behavior at docs/stream-aggregation.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4243
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4575
2023-07-24 17:06:09 -07:00
Zakhar Bessarab
470afac5ff
{lib/streamaggr,vmagent/remotewrite}: breaking change for keepInput flag (#4575)
* {lib/streamaggr,vmagent/remotewrite}: breaking change for keepInput flag

Changes default behaviour of keepInput flag to write series which did not match any aggregators to the remote write.
See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4243

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* Update app/vmagent/remotewrite/remotewrite.go

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-07-24 16:34:38 -07:00
Aliaksandr Valialkin
5defa99a2e
lib/streamaggr: add ability to de-duplicate input samples before aggregation 2023-01-25 09:22:03 -08:00
Aliaksandr Valialkin
3369371636
app/{vmagent,vminsert}: add support for streaming aggregation
See https://docs.victoriametrics.com/stream-aggregation.html

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3460
2023-01-03 22:22:07 -08:00