VictoriaMetrics/lib
Aliaksandr Valialkin 0d5d46f9db
lib/streamaggr: huge pile of changes
- Reduce memory usage by up to 5x when de-duplicating samples across big number of time series.
- Reduce memory usage by up to 5x when aggregating across big number of output time series.
- Add lib/promutils.LabelsCompressor, which is going to be used by other VictoriaMetrics components
  for reducing memory usage for marshaled []prompbmarshal.Label.
- Add `dedup_interval` option at aggregation config, which allows setting individual
  deduplication intervals per each aggregation.
- Add `keep_metric_names` option at aggregation config, which allows keeping the original
  metric names in the output samples.
- Add `unique_samples` output, which counts the number of unique sample values.
- Add `increase_prometheus` and `total_prometheus` outputs, which ignore the first sample
  per each newly encountered time series.
- Use 64-bit hashes instead of marshaled labels as map keys when calculating `count_series` output.
  This makes obsolete https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5579
- Expose various metrics, which may help debugging stream aggregation:
  - vm_streamaggr_dedup_state_size_bytes - the size of data structures responsible for deduplication
  - vm_streamaggr_dedup_state_items_count - the number of items in the deduplication data structures
  - vm_streamaggr_labels_compressor_size_bytes - the size of labels compressor data structures
  - vm_streamaggr_labels_compressor_items_count - the number of entries in the labels compressor
  - vm_streamaggr_flush_duration_seconds - a histogram, which shows the duration of stream aggregation flushes
  - vm_streamaggr_dedup_flush_duration_seconds - a histogram, which shows the duration of deduplication flushes
  - vm_streamaggr_flush_timeouts_total - counter for timed out stream aggregation flushes,
    which took longer than the configured interval
  - vm_streamaggr_dedup_flush_timeouts_total - counter for timed out deduplication flushes,
    which took longer than the configured dedup_interval
- Actualize docs/stream-aggregation.md

The memory usage reduction increases CPU usage during stream aggregation by up to 30%.

This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5850
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5898
2024-03-02 03:15:43 +02:00
..
appmetrics all: add -metrics.exposeMetadata command-line flag, which can be used for adding TYPE and HELP metadata for metrics exposed at /metrics page 2023-12-19 03:26:02 +02:00
auth lib/auth: add NewTokenPossibleMultitenant() for parsing auth token, which can be multitenant 2023-08-30 14:13:51 +02:00
awsapi lib/awsapi: properly assume role with webIdentity token (#5495) 2023-12-20 19:07:04 +02:00
backup lib: consistently use atomic.* types instead of atomic.* functions 2024-02-24 02:10:04 +02:00
blockcache lib: consistently use atomic.* types instead of atomic.* functions 2024-02-24 02:10:04 +02:00
bloomfilter lib: consistently use atomic.* types instead of atomic.* functions 2024-02-24 02:10:04 +02:00
bufferedwriter app/vmselect: move common http functionality from app/vmselect/searchutils to lib/httputils 2023-07-06 17:22:23 -07:00
buildinfo all: open-sourcing single-node version 2019-05-23 00:18:06 +03:00
bytesutil lib/bytesutil: use unsafe.String instead of unsafe conversion of slice header to string header 2024-02-29 17:28:04 +02:00
cgroup lib/cgroup: remove SetGOGC() function 2024-02-05 12:13:08 +02:00
consts app/vminsert: reduce the max packet size, which vminsert can send to vmstorage 2022-04-05 15:39:58 +03:00
decimal lib: consistently use atomic.* types instead of atomic.* functions 2024-02-24 02:10:04 +02:00
encoding lib: consistently use atomic.* types instead of atomic.* functions 2024-02-24 02:10:04 +02:00
envflag lib/envflag: do not allow unsupported form for boolean command-line flags in the form -boolFlag value 2023-08-17 13:37:05 +02:00
envtemplate allowed using dashes and dots in environment variables names (#4009) 2023-03-24 17:57:19 -07:00
fastnum lib/fastnum: use unsafe.Slice() instead of deprecated reflect.SliceHeader 2024-02-29 17:17:24 +02:00
fasttime lib: consistently use atomic.* types instead of atomic.* functions 2024-02-24 02:10:04 +02:00
filestream lib/filestream: do not measure read / write duration from / to in-memory buffers 2024-01-23 14:53:35 +02:00
flagutil lib: consistently use atomic.* types instead of atomic.* functions 2024-02-24 02:10:04 +02:00
formatutil app/vmbackupmanager: add metrics for better observability (#488) 2022-12-20 14:18:43 -08:00
fs lib/fs: fix GOOS=windows build after f8baf29b6e 2024-03-01 01:46:44 +02:00
handshake lib/handshake: substitute time.Now() with fastttime.UnixTimestamp(), since profiling shows time.Now() is slow 2024-01-23 18:39:28 +02:00
htmlcomponents lib/htmlcomponents: use relative links for the top page and for favicon.ico 2023-11-13 20:28:17 +01:00
httpserver lib: consistently use atomic.* types instead of atomic.* functions 2024-02-24 02:10:04 +02:00
httputils lib/promutils: hide the math.Round() logic inside ParseTimeMsec() function 2024-02-23 01:21:42 +02:00
influxutils lib/flagutil: rename Array to ArrayString 2022-10-01 18:28:19 +03:00
ingestserver lib/ingestserver: properly log the number of closed connections 2023-11-14 21:53:10 +01:00
leveledbytebufferpool all: make fmt via the upcoming Go1.19 2022-07-11 19:23:25 +03:00
logger lib/logger: increase default -loggerMaxArgLen command-line flag value from 500 to 1000 2023-11-14 19:55:55 +01:00
logjson app/vlinsert/jsonline: code prettifying 2023-07-06 21:35:55 -07:00
logstorage lib/logstorage: avoid panic when parsing regex with stream filter (#5897) 2024-02-29 15:32:25 +02:00
lrucache lib: consistently use atomic.* types instead of atomic.* functions 2024-02-24 02:10:04 +02:00
memory all: cleanup: remove // +build ... lines, since they are no longer needed after Go1.17, and the minimum supported Go version for VictoriaMetrics source code is Go1.20 2023-11-13 19:15:42 +01:00
mergeset lib/mergeset: use unsafe.Slice and unsafe.String instead of deprecated reflect.SliceHeader with unsafe conversion from slice header to string header 2024-02-29 17:29:40 +02:00
metricsql all: make fmt via the upcoming Go1.19 2022-07-11 19:23:25 +03:00
netutil lib: consistently use atomic.* types instead of atomic.* functions 2024-02-24 02:10:04 +02:00
persistentqueue app/vmagent: follow-up for 090cb2c9de 2023-11-25 12:13:39 +02:00
procutil all: cleanup: remove // +build ... lines, since they are no longer needed after Go1.17, and the minimum supported Go version for VictoriaMetrics source code is Go1.20 2023-11-13 19:15:42 +01:00
promauth lib/promauth: follow-up for fca3b14b7b 2024-01-31 19:47:53 +02:00
prompb lib/prompbmarshal: switch to github.com/VictoriaMetrics/easyproto 2024-01-16 20:48:30 +02:00
prompbmarshal lib/prompbmarshal: code cleanup after 8aaa828ba3 2024-02-01 21:41:10 +02:00
promrelabel lib/promrelabel: use clear() function inside CleanLabels() 2024-03-01 21:34:47 +02:00
promscrape lib: consistently use atomic.* types instead of atomic.* functions 2024-02-24 02:10:04 +02:00
promutils lib/streamaggr: huge pile of changes 2024-03-02 03:15:43 +02:00
protoparser lib/protoparser/opentelemetry/firehose: verify that the full response is parsed properly in ProcessRequestBody 2024-03-01 00:39:47 +02:00
proxy lib/promscrape: use the standard net/http.Client instead of fasthttp.Client for scraping targets in non-streaming mode 2024-01-30 18:39:55 +02:00
pushmetrics lib/pushmetrics: wait until the background goroutines, which push metrics, are stopped at pushmetrics.Stop() 2024-01-16 21:18:22 +02:00
querytracer lib/querytracer: add missing blank comment line after 3121d76bee 2023-11-15 16:11:50 +01:00
regexutil all: upgrade Go builder from Go1.21.7 to Go1.22.0 2024-02-12 22:14:00 +02:00
snapshot lib: consistently use atomic.* types instead of atomic.* functions 2024-02-24 02:10:04 +02:00
storage lib/storage: use unsafe.Slice instead of deprecated reflect.SliceHeader 2024-02-29 17:24:44 +02:00
streamaggr lib/streamaggr: huge pile of changes 2024-03-02 03:15:43 +02:00
stringsutil lib/stringsutil: add tests for LimitStringLen() function 2023-11-13 10:33:07 +01:00
syncwg all: open-sourcing single-node version 2019-05-23 00:18:06 +03:00
tenantmetrics lib/encoding/zstd: switch back from atomic.Pointer to atomic.Value for map[...]... 2023-07-20 21:54:51 -07:00
timerpool lib/timerpool: use timer pool in concurrency limiters 2019-05-28 17:30:10 +03:00
timeutil all: add up to 10% random jitter to the interval between periodic tasks performed by various components 2024-01-22 18:39:16 +02:00
uint64set lib/uint64set: go fmt after c0a9b87f46 2024-02-15 14:52:53 +02:00
vmselectapi lib: consistently use atomic.* types instead of atomic.* functions 2024-02-24 02:10:04 +02:00
workingsetcache lib: consistently use atomic.* types instead of atomic.* functions 2024-02-24 02:10:04 +02:00
writeconcurrencylimiter lib/writeconcurrencylimiter: initialize concurrencyLimitCh before exporting vm_concurrent_insert_capacity and vm_concurrent_insert_current metrics 2023-02-07 11:08:39 -08:00