github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-01 14:47:38 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	7a8b92b590	lib/{mergeset,storage}: make background merge more responsive and scalable - Maintain a separate worker pool per each part type (in-memory, file, big and small). Previously a shared pool was used for merging all the part types. A single merge worker could merge parts with mixed types at once. For example, it could merge simultaneously an in-memory part plus a big file part. Such a merge could take hours for big file part. During the duration of this merge the in-memory part was pinned in memory and couldn't be persisted to disk under the configured -inmemoryDataFlushInterval . Another common issue, which could happen when parts with mixed types are merged, is uncontrolled growth of in-memory parts or small parts when all the merge workers were busy with merging big files. Such growth could lead to significant performance degradataion for queries, since every query needs to check ever growing list of parts. This could also slow down the registration of new time series, since VictoriaMetrics searches for the internal series_id in the indexdb for every new time series. The third issue is graceful shutdown duration, which could be very long when a background merge is running on in-memory parts plus big file parts. This merge couldn't be interrupted, since it merges in-memory parts. A separate pool of merge workers per every part type elegantly resolves both issues: - In-memory parts are merged to file-based parts in a timely manner, since the maximum size of in-memory parts is limited. - Long-running merges for big parts do not block merges for in-memory parts and small parts. - Graceful shutdown duration is now limited by the time needed for flushing in-memory parts to files. Merging for file parts is instantly canceled on graceful shutdown now. - Deprecate -smallMergeConcurrency command-line flag, since the new background merge algorithm should automatically self-tune according to the number of available CPU cores. - Deprecate -finalMergeDelay command-line flag, since it wasn't working correctly. It is better to run forced merge when needed - https://docs.victoriametrics.com/#forced-merge - Tune the number of shards for pending rows and items before the data goes to in-memory parts and becomes visible for search. This improves the maximum data ingestion rate and the maximum rate for registration of new time series. This should reduce the duration of data ingestion slowdown in VictoriaMetrics cluster on e.g. re-routing events, when some of vmstorage nodes become temporarily unavailable. - Prevent from possible "sync: WaitGroup misuse" panic on graceful shutdown. This is a follow-up for `fa566c68a6` . Thanks @misutoth to for the inspiration at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5212 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5190 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3790 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3641 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291	2024-01-26 22:19:52 +01:00
Aliaksandr Valialkin	0715f1efcd	lib/storage: rename AssistedMerges to AssistedMergesCount in order to make these field names less misleading These fields are counters, not gauges, so adding Count suffix to them makes easier to understand this while reading the code	2024-01-25 10:21:13 +02:00
Aliaksandr Valialkin	3199558da9	lib/{storage,mergeset}: reduce the maxium compression level for the stored data This reduces CPU usage a bit, while doesn't increase resulting file sizes according to synthetic tests.	2024-01-23 17:47:40 +02:00
Aliaksandr Valialkin	68d76b1436	lib/storage: compress metricIDs, which match the given filters, before storing them in tagFiltersToMetricIDsCache This allows reducing the indexdb/tagFiltersToMetricIDs cache size by 8 on average. The cache size can be checked via vm_cache_size_bytes{type="indexdb/tagFiltersToMetricIDs"} metric exposed at /metrics page.	2024-01-23 16:13:25 +02:00
Aliaksandr Valialkin	9b3217db61	lib/storage: do not sort metricIDs passed to Storage.prefetchMetricNames, since the caller is responsible for the sorting	2024-01-23 16:13:19 +02:00
Zakhar Bessarab	60ef978ffc	lib/storage: print tenant ID in log when discarding or truncating labels (#5658 ) Previously, it was not possible to determine which tenant sends metrics with excessive amount of labels of label values. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-23 02:27:59 +02:00
Aliaksandr Valialkin	d52fd73f18	all: add up to 10% random jitter to the interval between periodic tasks performed by various components This should smooth CPU and RAM usage spikes related to these periodic tasks, by reducing the probability that multiple concurrent periodic tasks are performed at the same time.	2024-01-22 18:39:16 +02:00
Aliaksandr Valialkin	64e615e6cc	lib/storage: reduce the contention on dateMetricIDCache mutex when new time series are registered at high rate The dateMetricIDCache puts recently registered (date, metricID) entries into mutable cache protected by the mutex. The dateMetricIDCache.Has() checks for the entry in the mutable cache when it isn't found in the immutable cache. Access to the mutable cache is protected by the mutex. This means this access is slow on systems with many CPU cores. The mutabe cache was merged into immutable cache every 10 seconds in order to avoid slow access to mutable cache. This means that ingestion of new time series to VictoriaMetrics could result in significant slowdown for up to 10 seconds because of bottleneck at the mutex. Fix this by merging the mutable cache into immutable cache after len(cacheItems) / 2 cache hits under the mutex, e.g. when the entry is found in the mutable cache. This should automatically adjust intervals between merges depending on the addition rate for new time series (aka churn rate): - The interval will be much smaller than 10 seconds under high churn rate. This should reduce the mutex contention for mutable cache. - The interval will be bigger than 10 seconds under low churn rate. This should reduce the uneeded work on merging of mutable cache into immutable cache.	2024-01-22 18:14:30 +02:00
Aliaksandr Valialkin	d4a1a28543	app/vmselect: handle negative time range start in a generic manner inside NewSearchQuery() This is a follow-up for `cf03e11d89` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5553 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5630	2024-01-22 01:39:27 +02:00
Aliaksandr Valialkin	5f5fcab217	all: call atomic.Load* in front of atomic.CompareAndSwap* at places where the atomic.CompareAndSwap* returns false most of the time This allows avoiding slow inter-CPU synchornization induced by atomic.CompareAndSwap*	2024-01-22 01:13:41 +02:00
Aliaksandr Valialkin	2f94bef59c	lib/storage/partition.go: remove misleading comment, which falsely states that inmemoryParts isn't visible to search Thanks to @satjd for raising attention to this comment at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5410	2024-01-22 01:11:36 +02:00
Aliaksandr Valialkin	41d6c8a7dd	lib/storage: do not prefetch metric names for small number of metricIDs This eliminates prefetchedMetricIDsLock lock contention for queries, which return less than 500 time series. This is a follow-up for `9d886a2eb0`	2024-01-17 13:50:01 +02:00
Aliaksandr Valialkin	f673039e86	lib/storage: follow-up for `4b8088e377` - Clarify the bugfix description at docs/CHANGELOG.md - Simplify the code by accessing prefetchedMetricIDs struct under the lock instead of using lockless access to immutable struct. This shouldn't worsen code scalability too much on busy systems with many CPU cores, since the code executed under the lock is quite small and fast. This allows removing cloning of prefetchedMetricIDs struct every time new metric names are pre-fetched. This should reduce load on Go GC, since the cloning of uin64set.Set struct allocates many new objects.	2024-01-16 22:38:57 +02:00
Roman Khavronenko	d562d772a8	lib/storage: properly check for `storage/prefetchedMetricIDs` cache expiration deadline (#5607 ) Before, this cache was limited only by size. Cache invalidation by time happens with jitter to prevent thundering herd problem. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-16 21:08:59 +02:00
Aliaksandr Valialkin	7d40506744	lib/prompb: change type of Label.Name and Label.Value from []byte to string This makes it more consistent with lib/prompbmarshal.Label	2024-01-16 20:41:37 +02:00
Artem Navoiev	e1005209ba	docs: mention staleNaN handling during deduplication See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5587 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-16 20:11:45 +02:00
hagen1778	ae6152be5f	lib/storage: fix typo Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-21 12:22:49 +02:00
hagen1778	91e365acb6	lib/storage: fix typo Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-21 12:10:34 +02:00
Aliaksandr Valialkin	d9ecc3f6d7	lib/logger: add `-loggerMaxArgLen` command-line flag for fine-tuning the maximum length of logged args	2023-11-13 09:43:49 +01:00
Aliaksandr Valialkin	c22b63af04	lib/storage: follow-up for `29cebd82fb` Use atomic.CompareAndSwapUint32() instead of atomic.LoadUint32() followed by atomic.StoreUint32(). This makes the code more clear. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159	2023-10-31 19:03:50 +01:00
Roman Khavronenko	733b73ffed	lib/storage: log warning about RO mode only on state change (#5191 ) Before, vmstorage would log the same message each second producing excessive amount of logs. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159 Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `29cebd82fb`)	2023-10-30 11:29:49 +01:00
Aliaksandr Valialkin	36a1fdca6c	all: consistently use %w instead of %s in when error is passed to fmt.Errorf() This allows consistently using errors.Is() for verifying whether the given error wraps some other known error.	2023-10-26 09:44:40 +02:00
Aliaksandr Valialkin	190beb79d3	lib/storage: fix test TestStorageSeriesAreNotCreatedOnStaleMarkers	2023-10-26 09:22:15 +02:00
Roman Khavronenko	cd2247b24a	app/vmselect: limit the number of parallel workers by 32 (#5195 ) * app/vmselect: limit the number of parallel workers by 32 The change should improve performance and memory usage during query processing on machines with big number of CPU cores. The number of parallel workers for query processing is controlled via `-search.maxWorkersPerQuery` command-line flag. By default, the number of workers is limited by the number of available CPU cores, but not more than 32. The limit can be increased via `-search.maxWorkersPerQuery`. Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip - The `-search.maxWorkersPerQuery` command-line flag doesn't limit resource usage, so move it from the `resource usage limits` to `troubleshooting` chapter at docs/Single-server-VictoriaMetrics.md - Make more clear the description for the `-search.maxWorkersPerQuery` command-line flag - Add the description of `-search.maxWorkersPerQuery` to docs/Cluster-VictoriaMetrics.md - Limit the maximum value, which can be passed to `-search.maxWorkersPerQuery`, to GOMAXPROCS, because bigger values may worsen query performance and increase CPU usage - Improve the the description of the change at docs/CHANGELOG.md. Mark it as FEATURE instead of BUGFIX, since it is closer to a feature than to a bugfix. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5087 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-10-26 09:15:27 +02:00
hagen1778	afab547821	lib/storage: follow-up after `188cfe3a85` `188cfe3a85` See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-26 09:04:49 +02:00
Ilya Trefilov	1fd3385965	lib/storage: do not create tsid if metric contains stale marker(#5069 ) (#5174 ) https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5069	2023-10-26 09:03:19 +02:00
Haleygo	93589ecccf	fix ingesting stale point, follow up `fe8cc573d1` (#5179 )	2023-10-16 14:00:39 +02:00
hagen1778	709a2bad66	docs: remove extra `/` in the end of the link Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-16 13:24:30 +02:00
Aliaksandr Valialkin	f55d114785	lib/{mergeset,storage}: consistently reset isInMerge field in parts passed to mergeParts() before returning from the function While at it consistently check that the isInMerge field is set in all the parts passed to mergeParts()	2023-10-02 20:34:52 +02:00
Aliaksandr Valialkin	8b1d6b995e	lib/{mergeset,storage}: perform at most one assisted merge per each call to addRows/addItems This should reduce tail latency during data ingestion. This shouldn't slow down data ingestion in the worst case, since assisted merges are spread among distinct addRows/addItems calls after this change.	2023-10-02 20:33:51 +02:00
Aliaksandr Valialkin	9ae92ff2ee	lib/storage: remove unused atomicSetBool function after `717c53af27`	2023-09-25 17:37:45 +02:00
Aliaksandr Valialkin	60fe63df07	lib/storage: make it clear that the number of big merge workers always equals to 4 See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4915#issuecomment-1733922830	2023-09-25 17:17:40 +02:00
Aliaksandr Valialkin	a421db5977	lib/storage: stop exposing vm_merge_need_free_disk_space metric This metric confuses users and has no any useful information. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686#issuecomment-1733844128	2023-09-25 17:00:14 +02:00
Aliaksandr Valialkin	281eb0c377	lib/storage: log fatal error inside searchMetricName() instead of propagating it to the caller This simplifies the code a bit at searchMetricName() and searchMetricNameWithCache() call sites This is a result of investigating https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4972	2023-09-22 11:37:55 +02:00
Zakhar Bessarab	47d9e82b52	lib/storage/partition: add check to ensure parts exist on disk (#5017 ) * lib/storage/partition: add check to ensure parts exist on disk If part exists in parts.json but is missing on disk there will be a misleading error similar to "unexpected number of substrings in the part name". This change forces verification of part existence and throws a correct error in case it is missing on disk. Such issue can be result of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5005 or disk corruption. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/partition: use filepath.Join instead of string concatenation Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/partition: add action points for error message Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * all: add a check for missing part in lib/mergeset and lib/logstorage --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-09-19 11:18:21 +02:00
faceair	609c76eec9	lib/storage: remove ForceMergeAllParts internal loop (#4999 ) Signed-off-by: faceair <git@faceair.me>	2023-09-18 16:35:37 +02:00
Aliaksandr Valialkin	24cec79763	lib/storage: handle fatal errors inside indexSearch.getTSIDByMetricID() instead of returning them to the caller This simplifies the code a bit at caller side	2023-09-15 12:01:06 +02:00
Dima Lazerka	b97041c252	flagutil: Make .Msecs private (#4906 ) * Introduce flagutil.Duration To avoid conversion bugs * Fix tests * Clarify documentation re. month=31 days * Add fasttime.UnixTime() to obtain time.Time The goal is to refactor out the last usage of `.Msecs`. * Use fasttime for time.Now() * wip - Remove fasttime.UnixTime(), since it doesn't improve code readability and maintainability - Run `make docs-sync` for syncing changes from README.md to docs/ folder - Make lib/flagutil.Duration.Msec private - Rename msecsPerMonth const to msecsPer31Days in order to be consistent with retention31Days --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-09-03 10:37:57 +02:00
Aliaksandr Valialkin	d8afd7fe98	Makefile: update golangci-lint from v1.51.2 to v1.54.2 See https://github.com/golangci/golangci-lint/releases/tag/v1.54.2	2023-09-01 10:25:49 +02:00
Dima Lazerka	1d60c236b1	Add flagutil.Duration to avoid conversion bugs (#4835 ) * Introduce flagutil.Duration To avoid conversion bugs * Fix tests * Comment why not .Seconds()	2023-09-01 09:30:30 +02:00
Nikolay	1a943bb16a	lib/storage: properly caclucate nextRotationTimestamp (#4874 ) cause of typo unix millis was used instead of unix for current timestamp calculation https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4873 (cherry picked from commit `c5aac34b68`)	2023-08-23 13:29:32 +02:00
Aliaksandr Valialkin	bde876f7c9	all: refer to https://docs.victoriametrics.com/#resource-usage-limits in the error message about -search.max* limit Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4827	2023-08-14 02:02:12 -07:00
Aliaksandr Valialkin	89ccf19b70	lib/storage: update nextRotationTimestamp relative to the timestamp of the indexdb rotation Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4563	2023-07-28 19:48:42 -07:00
Nikolay	30b32583f4	lib/storage: pre-create timeseries before indexDB rotation (#4652 ) * lib/storage: pre-create timeseries before indexDB rotation during an hour before indexDB rotation start creating records at the next indexDB it must improve performance during switch for the next indexDB and remove ingestion issues. Since there is no need for creation new index records for timeseries already ingested into current indexDB https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4563 * lib/storage: further work on indexdb rotation optimization - Document the change at docs/CHAGNELOG.md - Move back various caches from indexDB to Storage. This makes the change less intrusive. The dateMetricIDCache now takes into account indexDB generation, so it stores (date, metricID) entries for both the current and the next indexDB. - Consolidate the code responsible for idbNext pre-filling into prefillNextIndexDB() function. This improves code readability and maintainability a bit. - Rewrite and simplify the code responsible for calculating the next retention timestamp. Add various tests for corner cases of this code. - Remove indexdb pre-filling from RegisterMetricNames() function, since this function is rarely called. It is OK to add indexdb entries on demand in this function. This simplifies the code. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 * docs/CHANGELOG.md: refer to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4563 --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-07-22 15:23:14 -07:00
Aliaksandr Valialkin	992c300ce9	all: replace atomic.Value with atomic.Pointer[T] This eliminates the need in .(*T) casting for results obtained from Load() Leave atomic.Value for map, since atomic.Pointer[map[...]...] makes double pointer to map, because map is already a pointer type.	2023-07-19 17:48:26 -07:00
Aliaksandr Valialkin	5ace0701d3	app/vmselect/promql: add the ability to copy all the labels from `one` side of group_left()/group_right() operation This is performed by specifying `` inside group_left()/group_right(). Also allow specifying prefix for the copied labels via `group_left(...) prefix "..."` and `group_right(...) prefix "..."` syntax. For example, the following query adds all the namespace-related labels to pod info, and prefixes all the copied label names with "ns_" prefix: kube_pod_info on(namespace) group_left(*) prefix "ns_" kube_namespace_labels This resolves the following StackOverflow questions: - https://stackoverflow.com/questions/76661818/how-to-add-namespace-labels-to-pod-labels-in-prometheus - https://stackoverflow.com/questions/76653997/how-can-i-make-a-new-copy-of-kube-namespace-labels-metric-with-a-different-name	2023-07-17 16:58:30 -07:00
Aliaksandr Valialkin	3d23fd9853	lib/storage: move series registration in caches from createAllIndexesForMetricName into a separate function - putSeriesToCache This makes the code more clear and easier to read This is a follow-up for `7094fa38bc`	2023-07-13 23:17:14 -07:00
Aliaksandr Valialkin	203a436066	lib/storage: optimize BenchmarkIndexDBGetTSIDs() - Sort MetricName tags only once before the benchmark loop. - Obtain indexSearch per each benchmark loop in order to give a chance for background merge for the recently created parts	2023-07-13 21:49:54 -07:00
Aliaksandr Valialkin	fbddb4ad32	lib/storage: typo fix after `e1cf962bad` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401	2023-07-13 21:29:02 -07:00
Aliaksandr Valialkin	7d359d17d1	lib/storage: properly free up resources from newTestStorage() by calling stopTestStorage()	2023-07-13 17:13:34 -07:00
Aliaksandr Valialkin	e1cf962bad	lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for `1f28b46ae9`	2023-07-13 17:03:50 -07:00
Aliaksandr Valialkin	1bce67df06	lib/storage: fix possible test failure in TestStorageAddRowsConcurrent The number of parts in the snapshot partition may be zero if concurrent goroutine just started creating new partition, but didn't put data into it yet when the current goroutine made a snapshot.	2023-07-13 15:03:51 -07:00
Aliaksandr Valialkin	eea088d87f	docs/CHANGELOG.md: clarify description for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336 bugfix This is a follow-up for `5eb5df96e2`	2023-07-06 22:42:02 -07:00
Aliaksandr Valialkin	de574e7128	lib/storage: do not create flock.lock files at partition directories, since it is created at the Storage level	2023-07-06 17:26:37 -07:00
Nikolay	dd7ebd6779	lib/storage: creates parts.json on start-up if it not exists. (#4450 ) * lib/storage: creates parts.json on start-up if it not exists. It fixes migrations from versions below v1.90.0. Previously parts.json was created only after successful merge. But if merge was interruped for some reason (OOM or shutdown), parts.json wasn't created and partitions left after interruped merge weren't properly deleted. Since VM cannot check if it must be removed or not. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336 * Apply suggestions from code review Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> * Update lib/storage/partition.go Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> --------- Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-07-06 17:10:26 -07:00
Roman Khavronenko	09c05608f2	lib/storage: add comment for how `mustBeDeleted` field should be used (#4454 ) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-07-06 17:02:44 -07:00
Aliaksandr Valialkin	0ebfb91aba	lib/storage: revert the migration from global to per-day index for (MetricName -> TSID) This reverts the following commits: - `e0e16a2d36` - `2ce02a7fe6` The reason for revert: the updated logic breaks assumptions made when fixing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 . For example, if a time series stop receiving new samples during the first day after the indexdb rotation, there are chances that the time series won't be registered in the new indexdb. This is OK until the next indexdb rotation, since the time series is registered in the previous indexdb, so it can be found during queries. But the time series will become invisible for search after the next indexdb rotation, while its data is still there. There is also incompletely solved issue with the increased CPU and disk IO resource usage just after the indexdb rotation. There was an attempt to fix it, but it didn't fix it in full, while introducing the issue mentioned above. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 TODO: to find out the solution, which simultaneously solves the following issues: - increased memory usage for setups high churn rate and long retention (e.g. what the reverted commit does) - increased CPU and disk IO usage during indexdb rotation ( https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 ) - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698	2023-05-18 11:28:54 -07:00
Aliaksandr Valialkin	67beb8c856	lib/storage: follow-up after `2ce02a7fe6` - Document the change at docs/CHANGELOG.md - Clarify comments for non-trivial code touched by the commit - Improve the logic behind maybeCreateIndexes(): - Correctly create per-day indexes if the indexdb rotation is performed during the first hour or the last hour of the day by UTC. Previously there was a possibility of missing index entries on that day. - Increase the duration for creating new indexes in the current indexdb for up to 22 hours after indexdb rotation. This should reduce the increased resource usage after indexdb rotation. It is safe to postpone index creation for the current day until the last hour of the current day after indexdb rotation by UTC, since the corresponding (date, ...) entries exist in the previous indexdb. - Search for TSID by (date, MetricName) in both the current and the previous indexdb. Previously the search was performed only in the current indexdb. This could lead to excess creation of per-day indexes for the current day just after indexdb rotation. - Search for (date, metricID) entries in both the current and the previous indexdb. Previously the search was performed only in the current indexdb. This could lead to excess creation of per-day indexes for the current day just after indexdb rotation.	2023-05-16 23:31:59 -07:00
Roman Khavronenko	c3b1d9ee21	lib/storage: introduce per-day MetricName=>TSID index (#4252 ) The new index substitutes global MetricName=>TSID index used for locating TSIDs on ingestion path. For installations with high ingestion and churn rate, global MetricName=>TSID index can grow enormously making index lookups too expensive. This also results into bigger than expected cache growth for indexdb blocks. New per-day index supposed to be much smaller and more efficient. This should improve ingestion speed and reliability during re-routings in cluster. The negative outcome could be occupied disk size, since per-day index is more expensive comparing to global index. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-05-16 23:18:11 -07:00
Aliaksandr Valialkin	bc98ea9a8d	lib/storage: reduce the unimportant logging during Storage start / stop This should improve the visibility of potentially important logs	2023-05-16 15:32:35 -07:00
Aliaksandr Valialkin	f09745f613	lib/{mergeset,storage}: make it clear that DebugFlush() doesn't store all the recently ingested data to disk DebugFlush() makes sure that the recently ingested data becomes visible to search. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4272	2023-05-16 11:55:58 -07:00
Zakhar Bessarab	a2fc912c43	lib/storage: follow-up after `a50d63c376` (#4289 ) * lib/storage: follow-up after `a50d63c376` - ensure retentionMsecs is rounded to day - remove localTimeOffset in test as localOffset is ignored when using `UnixMilli` Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage: restore retention timezone offset effect on retention deadline Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-05-16 10:13:20 -07:00
Nikolay	8c9dc837b9	lib/storage: properly update link for entry at dateMetricID cache (#4258 ) previously during sync for mutable and immutable cache parts, link for hotEntry with current date may be not properly updated it corrupts cache for backfilling metrics and increased cpu load	2023-05-09 21:39:41 -07:00
Zakhar Bessarab	348693ff84	lib/storage: fix indexdb rotation infinite loop (#4249 ) When using `retentionTimezoneOffset` and having local timezone being more than 4 hours different from UTC indexdb retention calculation could return negative value. This caused indexdb rotation to get in loop. Fix calculation of offset to use `retentionTimezoneOffset` value properly and add test to cover all legit timezone configs. See: - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4207 - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4206 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2023-05-09 21:23:01 -07:00
Aliaksandr Valialkin	8b15f93426	lib/{mergeset,storage}: make mustReadPartNames() code more clear	2023-04-14 23:17:08 -07:00
Aliaksandr Valialkin	d739511f5b	lib/storage: replace OpenStorage() with MustOpenStorage() Callers of OpenStorage() log the returned error and exit. The error logging and exit can be performed inside MustOpenStorage() alongside with printing the stack trace for better debuggability. This simplifies the code at caller side.	2023-04-14 23:04:42 -07:00
Aliaksandr Valialkin	f26e480a77	lib/storage: fix a bug, which prevents from reading pre-v1.90.0 parts The bug has been introduced in `c0b852d50d`	2023-04-14 22:33:29 -07:00
Aliaksandr Valialkin	cf4701db65	lib/fs: add MustReadDir() function Use fs.MustReadDir() instead of os.ReadDir() across the code in order to reduce the code verbosity. The fs.MustReadDir() logs the error with the directory name and the call stack on error before exit. This information should be enough for debugging the cause of the error.	2023-04-14 22:11:40 -07:00
Aliaksandr Valialkin	0a11c46cd2	lib/storage: validate rows in partition.AddRows() only during tests	2023-04-14 20:53:05 -07:00
Aliaksandr Valialkin	292b6a851f	all: consistently use fs.MustClose() for closing lock files	2023-04-14 20:16:11 -07:00
Aliaksandr Valialkin	a7678350ad	lib/fs: convert CreateFlockFile to MustCreateFlockFile Callers of CreateFlockFile log the returned err and exit. It is better to log the error inside the MustCreateFlockFile together with the path to the specified directory and the call stack. This simplifies the code at the callers' side while leaving the debuggability at the same level.	2023-04-14 19:51:52 -07:00
Aliaksandr Valialkin	e2de5bf763	lib/{storage,mergeset}: convert InitFromFilePart to MustInitFromFilePart Callers of InitFromFilePart log the error and exit. It is better to log the error with the path to the part and the call stack directly inside the MustInitFromFilePart() function. This simplifies the code at callers' side while leaving the same level of debuggability.	2023-04-14 15:47:20 -07:00
Aliaksandr Valialkin	df99965564	lib/filestream: change Create() to MustCreate() Callers of this function log the returned error and exit. It is better logging the error together with the path to the filename and call stack directly inside the function. This simplifies the code at callers' side without reducing the level of debuggability	2023-04-14 15:14:24 -07:00
Aliaksandr Valialkin	0bbb281c3d	lib/filestream: transform Open() -> MustOpen() Callers of this function log the returned error and exit. Let's log the error with the path to the filename and call stack inside the function. This simplifies the code at callers' side without reducing the level of debuggability.	2023-04-14 15:04:54 -07:00
Aliaksandr Valialkin	b80d93d4b2	lib/fs: substitute ReadFullData with MustReadData Callers of ReadFullData() log the error and then exit. So let's log the error with the path to the filename and the call stack inside MustReadData(). This simplifies the code at callers' side, while leaving the debuggability at the same level.	2023-04-14 14:40:58 -07:00
Aliaksandr Valialkin	36559dfec2	lib/fs: improve error logging inside MustWriteData Log the path to file on errors inside MustWriteData(). This improves debuggability of errors, which may occur inside MustWriteData().	2023-04-14 14:33:45 -07:00
Aliaksandr Valialkin	67df75484f	lib/{mergeset,storage}: remove isInMerge flag from parts only when they werent removed yet from the list of active parts This prevents from possible panic during access to pw.p when it is set to nil at partWrapper.decRef() called inside swapSrcWithDstParts()	2023-04-14 00:16:18 -07:00
Aliaksandr Valialkin	7fb2b14ca0	docs/CHANGELOG.md: run at least 4 background mergers on systems with less than 4 CPU cores This reduces the probability of sudden spike in the number of small parts when all the background mergers are busy with big merges.	2023-04-13 23:37:05 -07:00
Aliaksandr Valialkin	8846ce5f1d	lib/{mergeset,storage}: make sure that getFlushToDiskDeadline() takes into account only in-memory parts	2023-04-13 23:17:24 -07:00
Aliaksandr Valialkin	f75b1b7a53	lib/fs: add Must prefix to CopyDirectory and CopyFile functions Callers of these functions log the returned error and then exit. Let's log the error with the call stack inside the function itself. This simplifies the code at callers' side, while leaving the same level of debuggability in case of errors.	2023-04-13 23:04:37 -07:00
Aliaksandr Valialkin	75b74aa837	lib/fs: rename SymlinkRelative to MustSymlinkRelative Callers of this function log the returned error and then exit. Let's log the error with the call stack inside the function itself. This simplifies the code at callers' side, while leaving the same level of debuggability in case of errors.	2023-04-13 22:53:11 -07:00
Aliaksandr Valialkin	624b86d065	lib/fs: rename HardLinkFiles to MustHardLinkFiles Callers of this function log the returned error and then exit. Let's log the error with the call stack inside the function itself. This simplifies the code at callers' side, while leaving the same level of debuggability in case of errors.	2023-04-13 22:49:38 -07:00
Aliaksandr Valialkin	c4638553a3	lib/fs: rename WriteFileAtomically to MustWriteAtomic Callers of this function log the returned error and exit. So let's just log the error with the given filepath and the call stack inside the function itself and then exit. This simplifies the code at callers' place while leaves the same level of debuggability in case of errors.	2023-04-13 22:43:30 -07:00
Aliaksandr Valialkin	aac3dccfd1	lib/fs: replace MkdirAllIfNotExist->MustMkdirIfNotExist and MkdirAllFailIfExist->MustMkdirFailIfExist Callers of these functions log the returned error and then exit. The returned error already contains the path to directory, which was failed to be created. So let's just log the error together with the call stack inside these functions. This leaves the debuggability of the returned error at the same level while allows simplifying the code at callers' side. While at it, properly use MustMkdirFailIfExist instead of MustMkdirIfNotExist inside inmemoryPart.MustStoreToDisk(). It is expected that the inmemoryPart.MustStoreToDick() must fail if there is already a directory under the given path.	2023-04-13 22:22:08 -07:00
Aliaksandr Valialkin	b4c330ea2b	lib/fs: rename MustWriteFileAndSync to MustWriteSync in order to improve readability a bit This is a follow-up for `2a8395be05`	2023-04-13 22:20:31 -07:00
Aliaksandr Valialkin	cdee2cfc5c	lib/{mergeset,storage}: remove unused `path` field from blockStreamWriter This is a follow-up after `42bba64aa7`	2023-04-13 22:20:02 -07:00
Aliaksandr Valialkin	1cda542c48	lib/fs: replace WriteFileAndSync with MustWriteAndSync When WriteFileAndSync fails, then the caller eventually logs the error message and exits. The error message returned by WriteFileAndSync already contains the path to the file, which couldn't be created. This information alongside the call stack is enough for debugging the issue. So just use log.Panicf("FATAL: ...") inside MustWriteAndSync(). This simplifies error handling at caller side a bit.	2023-04-13 22:17:34 -07:00
Aliaksandr Valialkin	eb7df27e20	lib/{mergeset,storage}: properly fsync part directory listing after writing in-memory part to disk This is a follow-up after `42bba64aa7` Previously the part directory listing was fsync'ed implicitly inside partHeader.WriteMetadata() by calling fs.WriteFileAtomically(). Now it must be fsync'ed explicitly. There is no need in fsync'ing the parent directory, since it is fsync'ed by the caller when updating parts.json file.	2023-04-13 21:21:46 -07:00
Aliaksandr Valialkin	13d2350e6a	lib/{mergeset,storage}: explicitly fsync the created part directory listing Previously the created part directory listing was fsynced implicitly when storing metadata.json file in it. Also remove superflouous fsync for part directory listing, which was called at blockStreamWriter.MustClose(). After that the metadata.json file is created, so an additional fsync for the directory contents is needed.	2023-04-13 21:07:33 -07:00
Aliaksandr Valialkin	cf53ce83a0	app/vmstorage: deprecate -bigMergeConcurrency command-line flag Improperly configured -bigMergeConcurrency command-line flag usually leads to uncontrolled growth of unmerged parts, which, in turn, increases CPU usage and query durations. So it is better deprecating this flag. In rare cases -smallMergeConcurrency command-line flag can be used instead for controlling the concurrency of background merges.	2023-04-13 20:42:22 -07:00
Haleygo	7ee32ed06a	fix sort pendingDateMetricsIDs (#4102 )	2023-04-10 10:16:36 -07:00
Aliaksandr Valialkin	52734c71fc	lib/storage: use shorter code after `03bde173b7`	2023-04-02 21:35:34 -07:00
faceair	03bde173b7	lib/storage: fix reuse pendingMetricRow (#4049 )	2023-04-02 21:28:43 -07:00
faceair	a4b4bda166	lib/storage: remove unused code (#4050 )	2023-04-02 21:23:24 -07:00
Roman Khavronenko	5f95f9d453	lib/storage: check for free disk space before opening tables (#4035 ) * lib/storage: check for free disk space before opening tables We check for free disk space before call to `openTable`, so `Storage` can be set to ReadOnly before mergeWorkers start. Before the change, there was a chance that merges will start even if Storage has to start in ReadOnly mode because of `-storage.minFreeDiskSpaceBytes` limit. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4023 Signed-off-by: hagen1778 <roman@victoriametrics.com> * lib/storage: chore Signed-off-by: hagen1778 <roman@victoriametrics.com> * Update lib/storage/storage.go --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-03-31 23:50:56 -07:00
Aliaksandr Valialkin	f6c36d5dfd	lib/storage: consistently use OS-independent separator in file paths This is needed for Windows support, which uses `\` instead of `/` as file separator Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-25 14:34:36 -07:00
Aliaksandr Valialkin	1d9a461c23	all: follow-up after `34634ec357` - Use windows.FlushFileBuffers() instead of windows.Fsync() at streamTracker.adviseDontNeed() for consistency with implementations for other architectures. - Use filepath.Base() instead of filepath.Split(), since the dir part isn't used. This simplifies the code a bit. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-25 12:00:48 -07:00
Nikolay	d231cefe25	lib/fs: adds memory map for windows (#3988 ) This is a follow-up for `43b24164ef` * lib/fs: adds memory map for windows it should improve performance for file reading * lib/storage: replace '/' with os specific separator it must fix an errors for windows * lib/fs: mention windows fsync support * lib/filestream: adds fdatasync for windows writes Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-25 12:00:44 -07:00
Dmytro Kozlov	693a3de0a6	lib/storage: fix collect downsampling metrics (#489 ) * lib/storage: fix downsampling * lib/storage: update logic * lib/storage: fix comments, removed unneeded check	2023-03-19 23:30:00 -07:00
Aliaksandr Valialkin	fc3d826d7f	all: add Windows build for VictoriaMetrics This commit changes background merge algorithm, so it becomes compatible with Windows file semantics. The previous algorithm for background merge: 1. Merge source parts into a destination part inside tmp directory. 2. Create a file in txn directory with instructions on how to atomically swap source parts with the destination part. 3. Perform instructions from the file. 4. Delete the file with instructions. This algorithm guarantees that either source parts or destination part is visible in the partition after unclean shutdown at any step above, since the remaining files with instructions is replayed on the next restart, after that the remaining contents of the tmp directory is deleted. Unfortunately this algorithm doesn't work under Windows because it disallows removing and moving files, which are in use. So the new algorithm for background merge has been implemented: 1. Merge source parts into a destination part inside the partition directory itself. E.g. now the partition directory may contain both complete and incomplete parts. 2. Atomically update the parts.json file with the new list of parts after the merge, e.g. remove the source parts from the list and add the destination part to the list before storing it to parts.json file. 3. Remove the source parts from disk when they are no longer used. This algorithm guarantees that either source parts or destination part is visible in the partition after unclean shutdown at any step above, since incomplete partitions from step 1 or old source parts from step 3 are removed on the next startup by inspecting parts.json file. This algorithm should work under Windows, since it doesn't remove or move files in use. This algorithm has also the following benefits: - It should work better for NFS. - It fits object storage semantics. The new algorithm changes data storage format, so it is impossible to downgrade to the previous versions of VictoriaMetrics after upgrading to this algorithm. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3236 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3821 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-19 23:28:26 -07:00

1 2 3 4 5 ...

766 commits