VictoriaMetrics/lib
Aliaksandr Valialkin 7a8b92b590
lib/{mergeset,storage}: make background merge more responsive and scalable
- Maintain a separate worker pool per each part type (in-memory, file, big and small).
  Previously a shared pool was used for merging all the part types.
  A single merge worker could merge parts with mixed types at once. For example,
  it could merge simultaneously an in-memory part plus a big file part.
  Such a merge could take hours for big file part. During the duration of this merge
  the in-memory part was pinned in memory and couldn't be persisted to disk
  under the configured -inmemoryDataFlushInterval .

  Another common issue, which could happen when parts with mixed types are merged,
  is uncontrolled growth of in-memory parts or small parts when all the merge workers
  were busy with merging big files. Such growth could lead to significant performance
  degradataion for queries, since every query needs to check ever growing list of parts.
  This could also slow down the registration of new time series, since VictoriaMetrics
  searches for the internal series_id in the indexdb for every new time series.

  The third issue is graceful shutdown duration, which could be very long when a background
  merge is running on in-memory parts plus big file parts. This merge couldn't be interrupted,
  since it merges in-memory parts.

  A separate pool of merge workers per every part type elegantly resolves both issues:
  - In-memory parts are merged to file-based parts in a timely manner, since the maximum
    size of in-memory parts is limited.
  - Long-running merges for big parts do not block merges for in-memory parts and small parts.
  - Graceful shutdown duration is now limited by the time needed for flushing in-memory parts to files.
    Merging for file parts is instantly canceled on graceful shutdown now.

- Deprecate -smallMergeConcurrency command-line flag, since the new background merge algorithm
  should automatically self-tune according to the number of available CPU cores.

- Deprecate -finalMergeDelay command-line flag, since it wasn't working correctly.
  It is better to run forced merge when needed - https://docs.victoriametrics.com/#forced-merge

- Tune the number of shards for pending rows and items before the data goes to in-memory parts
  and becomes visible for search. This improves the maximum data ingestion rate and the maximum rate
  for registration of new time series. This should reduce the duration of data ingestion slowdown
  in VictoriaMetrics cluster on e.g. re-routing events, when some of vmstorage nodes become temporarily
  unavailable.

- Prevent from possible "sync: WaitGroup misuse" panic on graceful shutdown.

This is a follow-up for fa566c68a6 .
Thanks @misutoth to for the inspiration at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5212

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5190
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3790
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3641
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
2024-01-26 22:19:52 +01:00
..
appmetrics all: add -metrics.exposeMetadata command-line flag, which can be used for adding TYPE and HELP metadata for metrics exposed at /metrics page 2023-12-19 03:26:02 +02:00
auth lib/auth: add NewTokenPossibleMultitenant() for parsing auth token, which can be multitenant 2023-08-30 14:13:51 +02:00
awsapi lib/awsapi: properly assume role with webIdentity token (#5495) 2023-12-20 19:07:04 +02:00
backup vendor: run make vendor-update 2023-12-11 10:48:47 +02:00
blockcache all: add up to 10% random jitter to the interval between periodic tasks performed by various components 2024-01-22 18:39:16 +02:00
bloomfilter all: call atomic.Load* in front of atomic.CompareAndSwap* at places where the atomic.CompareAndSwap* returns false most of the time 2024-01-22 01:13:41 +02:00
bufferedwriter app/vmselect: move common http functionality from app/vmselect/searchutils to lib/httputils 2023-07-06 17:22:23 -07:00
buildinfo all: open-sourcing single-node version 2019-05-23 00:18:06 +03:00
bytesutil lib/logger: add -loggerMaxArgLen command-line flag for fine-tuning the maximum length of logged args 2023-11-13 09:43:49 +01:00
cgroup lib/cgroup: add SetGOGC() function 2023-07-06 17:24:31 -07:00
consts app/vminsert: reduce the max packet size, which vminsert can send to vmstorage 2022-04-05 15:39:58 +03:00
decimal lib/decimal: use consistent randomizer in tests 2023-01-23 19:24:05 -08:00
encoding lib/encoding: remove uneeded re-slicing of byte slice before passing it to binary.BigEndian.Uint* 2024-01-23 22:50:11 +02:00
envflag lib/envflag: do not allow unsupported form for boolean command-line flags in the form -boolFlag value 2023-08-17 13:37:05 +02:00
envtemplate allowed using dashes and dots in environment variables names (#4009) 2023-03-24 17:57:19 -07:00
fastnum Makefile: add build and test rules with enabled race detector. These rules have -race suffix 2020-03-05 12:05:16 +02:00
fasttime lib: extract common code for returning fast unix timestamp into lib/fasttime 2020-05-14 23:06:50 +03:00
filestream lib/filestream: do not measure read / write duration from / to in-memory buffers 2024-01-23 14:53:35 +02:00
flagutil all: allow dynamically reading *AuthKey flag values from files and urls 2024-01-22 01:23:23 +02:00
formatutil app/vmbackupmanager: add metrics for better observability (#488) 2022-12-20 14:18:43 -08:00
fs all: allow dynamically reading *AuthKey flag values from files and urls 2024-01-22 01:23:23 +02:00
handshake lib/handshake: substitute time.Now() with fastttime.UnixTimestamp(), since profiling shows time.Now() is slow 2024-01-23 18:39:28 +02:00
htmlcomponents lib/htmlcomponents: use relative links for the top page and for favicon.ico 2023-11-13 20:28:17 +01:00
httpserver all: allow dynamically reading *AuthKey flag values from files and urls 2024-01-22 01:23:23 +02:00
httputils lib/httputils: handle step=undefined query arg as an empty value 2024-01-17 00:13:04 +02:00
influxutils lib/flagutil: rename Array to ArrayString 2022-10-01 18:28:19 +03:00
ingestserver lib/ingestserver: properly log the number of closed connections 2023-11-14 21:53:10 +01:00
leveledbytebufferpool all: make fmt via the upcoming Go1.19 2022-07-11 19:23:25 +03:00
logger lib/logger: increase default -loggerMaxArgLen command-line flag value from 500 to 1000 2023-11-14 19:55:55 +01:00
logjson app/vlinsert/jsonline: code prettifying 2023-07-06 21:35:55 -07:00
logstorage lib/logstorage: make sure that WaitGroup.Add isnt called after stopCh is closed and WaitGroup.Wait is called 2024-01-26 21:18:07 +01:00
lrucache all: add up to 10% random jitter to the interval between periodic tasks performed by various components 2024-01-22 18:39:16 +02:00
memory all: cleanup: remove // +build ... lines, since they are no longer needed after Go1.17, and the minimum supported Go version for VictoriaMetrics source code is Go1.20 2023-11-13 19:15:42 +01:00
mergeset lib/{mergeset,storage}: make background merge more responsive and scalable 2024-01-26 22:19:52 +01:00
metricsql all: make fmt via the upcoming Go1.19 2022-07-11 19:23:25 +03:00
netutil app/vmselect: add support for vmstorage groups with independent -replicationFactor per group 2023-12-13 00:14:34 +02:00
persistentqueue app/vmagent: follow-up for 090cb2c9de 2023-11-25 12:13:39 +02:00
procutil all: cleanup: remove // +build ... lines, since they are no longer needed after Go1.17, and the minimum supported Go version for VictoriaMetrics source code is Go1.20 2023-11-13 19:15:42 +01:00
promauth all: allow dynamically reading *AuthKey flag values from files and urls 2024-01-22 01:23:23 +02:00
prompb lib/prompbmarshal: switch to github.com/VictoriaMetrics/easyproto 2024-01-16 20:48:30 +02:00
prompbmarshal lib/prompbmarshal: move WriteRequest proto definition to the correct place 2024-01-16 21:57:03 +02:00
promrelabel all: allow dynamically reading *AuthKey flag values from files and urls 2024-01-22 01:23:23 +02:00
promscrape lib/promscrape/discovery/kubernetes: typo fix in the comment for ContainerStateTerminated struct 2024-01-24 15:10:47 +02:00
promutils all: consistently clear prompbmarshal.Label by assigning an empty struct instead of zeroing Name and Value individually 2024-01-22 01:11:59 +02:00
protoparser all: consistently clear prompbmarshal.Label by assigning an empty struct instead of zeroing Name and Value individually 2024-01-22 01:11:59 +02:00
proxy lib/promauth: follow-up for e16d3f5639 2023-10-26 09:55:47 +02:00
pushmetrics lib/pushmetrics: wait until the background goroutines, which push metrics, are stopped at pushmetrics.Stop() 2024-01-16 21:18:22 +02:00
querytracer lib/querytracer: add missing blank comment line after 3121d76bee 2023-11-15 16:11:50 +01:00
regexutil lib/regexutil: properly handle alternate regexps surrounded by .+ or .* 2023-11-13 18:25:57 +01:00
snapshot Makefile: update golangci-lint version from v1.54.2 to v1.55.1 2023-11-02 21:42:35 +01:00
storage lib/{mergeset,storage}: make background merge more responsive and scalable 2024-01-26 22:19:52 +01:00
streamaggr lib/streamaggr: expand %{ENV} placeholders in stream aggregation configs 2024-01-24 12:31:42 +02:00
stringsutil lib/stringsutil: add tests for LimitStringLen() function 2023-11-13 10:33:07 +01:00
syncwg all: open-sourcing single-node version 2019-05-23 00:18:06 +03:00
tenantmetrics lib/encoding/zstd: switch back from atomic.Pointer to atomic.Value for map[...]... 2023-07-20 21:54:51 -07:00
timerpool lib/timerpool: use timer pool in concurrency limiters 2019-05-28 17:30:10 +03:00
timeutil all: add up to 10% random jitter to the interval between periodic tasks performed by various components 2024-01-22 18:39:16 +02:00
uint64set lib/uint64: remove accidentally added test 2024-01-09 13:32:22 +01:00
vmselectapi vmcluster: re-routing enhancement (#5293) 2023-11-14 01:00:42 +01:00
workingsetcache all: add up to 10% random jitter to the interval between periodic tasks performed by various components 2024-01-22 18:39:16 +02:00
writeconcurrencylimiter lib/writeconcurrencylimiter: initialize concurrencyLimitCh before exporting vm_concurrent_insert_capacity and vm_concurrent_insert_current metrics 2023-02-07 11:08:39 -08:00