github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-01 14:47:38 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	13d2350e6a	lib/{mergeset,storage}: explicitly fsync the created part directory listing Previously the created part directory listing was fsynced implicitly when storing metadata.json file in it. Also remove superfluous fsync for part directory listing, which was called at blockStreamWriter.MustClose(). After that the metadata.json file is created, so an additional fsync for the directory contents is needed.	2023-04-13 21:07:33 -07:00
Aliaksandr Valialkin	cf53ce83a0	app/vmstorage: deprecate -bigMergeConcurrency command-line flag Improperly configured -bigMergeConcurrency command-line flag usually leads to uncontrolled growth of unmerged parts, which, in turn, increases CPU usage and query durations. So it is better deprecating this flag. In rare cases -smallMergeConcurrency command-line flag can be used instead for controlling the concurrency of background merges.	2023-04-13 20:42:22 -07:00
Haleygo	7ee32ed06a	fix sort pendingDateMetricsIDs (#4102)	2023-04-10 10:16:36 -07:00
Aliaksandr Valialkin	52734c71fc	lib/storage: use shorter code after `03bde173b7`	2023-04-02 21:35:34 -07:00
faceair	03bde173b7	lib/storage: fix reuse pendingMetricRow (#4049)	2023-04-02 21:28:43 -07:00
faceair	a4b4bda166	lib/storage: remove unused code (#4050)	2023-04-02 21:23:24 -07:00
Roman Khavronenko	5f95f9d453	lib/storage: check for free disk space before opening tables (#4035) * lib/storage: check for free disk space before opening tables We check for free disk space before the call to `openTable`, so `Storage` can be set to ReadOnly before mergeWorkers start. Before the change, there was a chance that merges would start even if Storage had to start in ReadOnly mode because of `-storage.minFreeDiskSpaceBytes` limit. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4023 Signed-off-by: hagen1778 <roman@victoriametrics.com> * lib/storage: chore Signed-off-by: hagen1778 <roman@victoriametrics.com> * Update lib/storage/storage.go --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-03-31 23:50:56 -07:00
Aliaksandr Valialkin	f6c36d5dfd	lib/storage: consistently use OS-independent separator in file paths This is needed for Windows support, which uses `\` instead of `/` as file separator Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-25 14:34:36 -07:00
Aliaksandr Valialkin	1d9a461c23	all: follow-up after `34634ec357` - Use windows.FlushFileBuffers() instead of windows.Fsync() at streamTracker.adviseDontNeed() for consistency with implementations for other architectures. - Use filepath.Base() instead of filepath.Split(), since the dir part isn't used. This simplifies the code a bit. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-25 12:00:48 -07:00
Nikolay	d231cefe25	lib/fs: adds memory map for windows (#3988) This is a follow-up for `43b24164ef` * lib/fs: adds memory map for windows it should improve performance for file reading * lib/storage: replace '/' with os specific separator it must fix errors for windows * lib/fs: mention windows fsync support * lib/filestream: adds fdatasync for windows writes Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-25 12:00:44 -07:00
Dmytro Kozlov	693a3de0a6	lib/storage: fix collect downsampling metrics (#489) * lib/storage: fix downsampling * lib/storage: update logic * lib/storage: fix comments, removed unneeded check	2023-03-19 23:30:00 -07:00
Aliaksandr Valialkin	fc3d826d7f	all: add Windows build for VictoriaMetrics This commit changes background merge algorithm, so it becomes compatible with Windows file semantics. The previous algorithm for background merge: 1. Merge source parts into a destination part inside tmp directory. 2. Create a file in txn directory with instructions on how to atomically swap source parts with the destination part. 3. Perform instructions from the file. 4. Delete the file with instructions. This algorithm guarantees that either source parts or destination part is visible in the partition after unclean shutdown at any step above, since the remaining file with instructions is replayed on the next restart, and then the remaining contents of the tmp directory are deleted. Unfortunately this algorithm doesn't work under Windows because it disallows removing and moving files, which are in use. So the new algorithm for background merge has been implemented: 1. Merge source parts into a destination part inside the partition directory itself. I.e. now the partition directory may contain both complete and incomplete parts. 2. Atomically update the parts.json file with the new list of parts after the merge, i.e. remove the source parts from the list and add the destination part to the list before storing it to parts.json file. 3. Remove the source parts from disk when they are no longer used. This algorithm guarantees that either source parts or destination part is visible in the partition after unclean shutdown at any step above, since incomplete parts from step 1 or old source parts from step 3 are removed on the next startup by inspecting parts.json file. This algorithm should work under Windows, since it doesn't remove or move files in use. This algorithm also has the following benefits: - It should work better for NFS. - It fits object storage semantics. The new algorithm changes data storage format, so it is impossible to downgrade to the previous versions of VictoriaMetrics after upgrading to this algorithm. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3236 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3821 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-19 23:28:26 -07:00
Aliaksandr Valialkin	d2f85816ea	lib/{mergeset,storage}: prevent from long wait time when creating a snapshot under high data ingestion rate Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3873	2023-03-19 00:19:02 -07:00
Aliaksandr Valialkin	8aeee8bcca	lib/{fs,mergeset,storage}: substitute os.Open()+os.File.Readdir() with os.ReadDir() This simplifies code a bit	2023-03-17 21:03:52 -07:00
Zakhar Bessarab	d1d108fe77	lib/storage: log original labels set when label value is truncated (#3952) lib/storage: log original labels set when label value is truncated	2023-03-14 16:11:02 -07:00
Nikolay	3caf898a83	lib/storage: correctly handle io.EOF error for pre-fetched metrics (#3946) io.EOF shouldn't be returned from this function. It breaks all search API logic and may result in empty query results.	2023-03-12 00:19:58 -08:00
Nikolay	361e1b1165	lib{mergset,storage}: prevent possible race condition with logging st… (#3900) (#3917) lib{mergset,storage}: prevent possible race condition with logging stats for merges Previously partwrapper could be released by a background process, so the reference to the part could be invalid during logging stats. This leads to panic at vmstorage https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3897	2023-03-06 11:11:08 +01:00
Aliaksandr Valialkin	1ad0d22e80	lib/storage: follow-up for `39cdc546dd` - Use flag.Duration instead of flagutil.Duration for -snapshotCreateTimeout, since the flagutil.Duration is intended mostly for big durations, e.g. days, months and years, while the -snapshotCreateTimeout is usually smaller than one hour. - Add links to https://docs.victoriametrics.com/#how-to-work-with-snapshots in docs/CHANGELOG.md, so readers could easily find the corresponding docs when reading the changelog. - Properly remove all the created directories on unsuccessful attempt to create snapshot in Storage.CreateSnapshot(). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551	2023-02-27 13:11:10 -08:00
Zakhar Bessarab	26682e369e	lib/storage: enhancements for snapshots process (#3873) * lib/{fs,mergeset,storage}: skip `.must-remove.` dirs when creating snapshot (#3858) * lib/{mergeset,storage}: add timeout configuration for snapshots creation, remove incomplete snapshots from storage * docs: fix formatting * app/vmstorage: add metrics to track status of snapshots * app/vmstorage: use `vm_http_requests_total` metric for snapshot endpoints metrics, rename new flag to make name more clear Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vmstorage: update flag name in docs Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vmstorage: reflect new metrics names change in docs Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-02-27 13:11:06 -08:00
Zakhar Bessarab	75b8733e0b	lib/{fs,mergeset,storage}: skip `.must-remove.` dirs when creating snapshot (#3858) (#3867)	2023-02-24 12:43:43 -08:00
Oleksandr Redko	0e1c395609	app,lib: fix typos in comments (#3804)	2023-02-13 09:32:35 -08:00
Aliaksandr Valialkin	9053745a6f	lib/{mergeset,storage}: allow at least 3 concurrent flushes during background merges on systems with 1 or 2 CPU cores This should prevent from data ingestion slowdown and query performance degradation on systems with small number of CPU cores (1 or 2), when big merge is performed. This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3790 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337	2023-02-11 12:09:13 -08:00
Nikolay	554876cc38	lib/storage: fixes finalDedup for backfilled data (#3737) Previously, backfilling historical data could trigger a forced merge for the previous month every hour, which consumes CPU and disk IO and decreases cluster performance. This commit fixes it by applying deduplication to InMemoryParts	2023-02-01 09:57:02 -08:00
Nikolay	4af05065d1	lib/storage: properly release parts inMerge lock (#3711) If storage doesn't have enough disk space, finalDedupWatcher holds inMerge lock for all parts and never releases it until storage restart	2023-01-26 08:57:36 -08:00
Aliaksandr Valialkin	903b2e710c	lib/storage: use deterministic random generator in tests Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3683	2023-01-23 20:12:32 -08:00
Aliaksandr Valialkin	c5e858461c	lib/{storage,mergeset}: wake up background merges as soon as there is a potential work for them Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647	2023-01-18 01:10:43 -08:00
Aliaksandr Valialkin	70b5a6fb28	lib/{storage,mergeset}: do not run assisted merges when flushing pending samples to parts Assisted merges are intended to be performed by goroutines, which accept the incoming samples, in order to limit the data ingestion rate. The worker, which converts pending samples to parts, shouldn't be penalized by assisted merges, since this may result in increased number of pending rows as seen at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647#issuecomment-1385039142 when the assisted merge takes too much time.	2023-01-18 00:25:33 -08:00
Aliaksandr Valialkin	0c90b49e4b	lib/storage: use better naming for a function returning new []rawRows - newRawRowsBlock() -> newRawRows()	2023-01-18 00:01:21 -08:00
Aliaksandr Valialkin	103dfd0525	lib/{mergeset,storage}: do not slow down concurrently executed queries during assisted merges Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3641 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291	2023-01-16 14:45:40 -08:00
Aliaksandr Valialkin	12e2bcdf81	app/vmselect/promql: avoid memory allocations and copying from source timeseries to the returned result at timeseriesToResult()	2023-01-09 22:39:15 -08:00
Aliaksandr Valialkin	b7a4650ab0	all: use metricsql.CompileRegexp instead of regexp.Compile for compiling regexps used in graphite queries This should speed up repeated queries, since metricsql.CompileRegexp returns regexps from the cache on subsequent calls for the same input regexp.	2023-01-09 21:45:34 -08:00
Aliaksandr Valialkin	eb9a542c1f	lib/storage: simplify the fix from `488940502c` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3566	2023-01-07 01:11:35 -08:00
Dmytro Kozlov	f739e44802	lib/storage: fix returning camelcase label names (#3608) * lib/storage: fix returning camelcase label names * doc: add change log * Update docs/CHANGELOG.md * Update docs/CHANGELOG.md Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-01-07 01:11:10 -08:00
Aliaksandr Valialkin	b275983403	lib/writeconcurrencylimiter: improve the logic behind -maxConcurrentInserts limit Previously the -maxConcurrentInserts was limiting the number of established client connections, which write data to VictoriaMetrics. Some of these connections could be idle. Such connections do not consume big amounts of CPU and RAM, so there is little sense in limiting the number of such connections. So now the -maxConcurrentInserts command-line option limits the number of concurrently executed insert requests, not including idle connections. It is recommended to remove the -maxConcurrentInserts command-line option, since the default value for this option should work well in most cases.	2023-01-06 22:07:16 -08:00
Roman Khavronenko	57277ed6bc	vmstorage: add more context to the flock acquiring msg (#3584) See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3578 Signed-off-by: hagen1778 <roman@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-01-05 18:32:53 -08:00
Aliaksandr Valialkin	8dc04a86f6	lib/{storage,mergeset}: tune the threshold for assisted merge The https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425#issuecomment-1359117221 reveals that CPU usage for incoming queries may significantly increase when the number of in-memory parts becomes too big. This commit reduces the maximum number of in-memory parts before starting the assisted merge during data ingestion. This should reduce CPU usage for incoming queries, since they need to inspect lower number of in-memory parts. This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425	2022-12-28 14:42:45 -08:00
Aliaksandr Valialkin	1ff62629f4	lib/storage: clear the err if it is set to io.EOF when searching for the TSID by metricID This error is expected when recently added indexdb data isn't available for search yet or wasn't flushed to disk after unclean shutdown of VictoriaMetrics. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3515	2022-12-20 14:05:53 -08:00
Aliaksandr Valialkin	2184de3bf2	lib/storage: do not check for the result returned by db.doExtDB() where this isn't necessary This simplifies the code a bit	2022-12-19 13:23:30 -08:00
Aliaksandr Valialkin	11bd290201	lib/storage: search for TSIDs for the given metricIDs in the previous indexdb if they aren't found in the current indexdb The issue triggers after the indexdb rotation for time series, which stop receiving new samples. This results in missing data for such time series in query responses. This commit should address the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3502 The issue has been introduced in `2dd93449d8`	2022-12-19 11:56:49 -08:00
Aliaksandr Valialkin	8c08d625ee	lib/storage: optimize partSearch.searchBHS() for common case when the TSID for the current block header is bigger or equal to the current tsid This should help improving performance at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425	2022-12-19 10:31:39 -08:00
Aliaksandr Valialkin	512c73cef9	lib/storage: properly set buf capacity inside marshalMetricID Previously it was always set to 0. In theory this could result into incorrect marshaling of metricIDs. The issue has been introduced in `5e4dfe50c6`	2022-12-19 10:31:38 -08:00
Aliaksandr Valialkin	fbeebe4869	lib/storage: skip missing tsids in the current block header by using binary search This improves performance by up to 10x when big number of the requested TSIDs are missing in the searched parts. This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425	2022-12-14 22:07:55 -08:00
Aliaksandr Valialkin	1a88fe5b1f	lib/flagutil/bytes.go: properly handle values bigger than 2GiB on 32-bit architectures This fixes handling of values bigger than 2GiB for the following command-line flags: - -storage.minFreeDiskSpaceBytes - -remoteWrite.maxDiskUsagePerURL	2022-12-14 19:29:57 -08:00
Aliaksandr Valialkin	ea7940e5a7	lib/mergeset: reduce the parts threshold before starting assisted merges This should improve query speed in general case. This is a follow-up for `d1af6046c7`	2022-12-13 09:14:08 -08:00
Aliaksandr Valialkin	2a190f6451	lib/{mergeset,storage}: do not block small merges by pending big merges - assist with small merges instead Blocked small merges may result into big number of small parts, which, in turn, may result in increased CPU and memory usage during queries, since queries need to inspect all the existing small parts. The issue has been introduced in `8189770c50` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337	2022-12-12 17:01:33 -08:00
Aliaksandr Valialkin	e56d5e1918	lib/storage: follow-up after `7c0ae3a86a` - Update docs at https://docs.victoriametrics.com/#deduplication - Optimize the deduplication loop a bit Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333	2022-12-08 18:18:49 -08:00
Roman Khavronenko	909cd04c55	lib/storage: keep sample with the biggest value on timestamp conflict (#3421 ) The change leaves raw sample with the biggest value for identical timestamps per each `-dedup.minScrapeInterval` discrete interval when the deduplication is enabled. ``` benchstat old.txt new.txt name old time/op new time/op delta DeduplicateSamples/minScrapeInterval=1s-10 817ns ± 2% 832ns ± 3% ~ (p=0.052 n=10+10) DeduplicateSamples/minScrapeInterval=2s-10 1.56µs ± 1% 2.12µs ± 0% +35.19% (p=0.000 n=9+7) DeduplicateSamples/minScrapeInterval=5s-10 1.32µs ± 3% 1.65µs ± 2% +25.57% (p=0.000 n=10+10) DeduplicateSamples/minScrapeInterval=10s-10 1.13µs ± 2% 1.50µs ± 1% +32.85% (p=0.000 n=10+10) name old speed new speed delta DeduplicateSamples/minScrapeInterval=1s-10 10.0GB/s ± 2% 9.9GB/s ± 3% ~ (p=0.052 n=10+10) DeduplicateSamples/minScrapeInterval=2s-10 5.24GB/s ± 1% 3.87GB/s ± 0% -26.03% (p=0.000 n=9+7) DeduplicateSamples/minScrapeInterval=5s-10 6.22GB/s ± 3% 4.96GB/s ± 2% -20.37% (p=0.000 n=10+10) DeduplicateSamples/minScrapeInterval=10s-10 7.28GB/s ± 2% 5.48GB/s ± 1% -24.74% (p=0.000 n=10+10) ``` https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333 Signed-off-by: hagen1778 <roman@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-12-08 18:18:36 -08:00
Aliaksandr Valialkin	0a9992a9c6	lib/{storage,mergeset}: log the duration for flushing in-memory parts on graceful shutdown	2022-12-05 21:55:21 -08:00
Aliaksandr Valialkin	7d5c64eb7a	all: add `-inmemoryDataFlushInterval` command-line flag for controlling the frequency of saving in-memory data to disk The main purpose of this command-line flag is to increase the lifetime of low-end flash storage with the limited number of write operations it can perform. Such flash storage is usually installed on Raspberry PI or similar appliances. For example, `-inmemoryDataFlushInterval=1h` reduces the frequency of disk write operations to up to once per hour if the ingested one-hour worth of data fits the limit for in-memory data. The in-memory data is searchable in the same way as the data stored on disk. VictoriaMetrics automatically flushes the in-memory data to disk on graceful shutdown via SIGINT signal. The in-memory data is lost on unclean shutdown (hardware power loss, OOM crash, SIGKILL). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337	2022-12-05 15:28:09 -08:00
Aliaksandr Valialkin	9ac1174493	lib/{mergeset,storage}: start background workers via startBackgroundWorkers() function	2022-12-04 00:01:14 -08:00
Aliaksandr Valialkin	a13d21513e	lib/mergeset: panic when too long item is passed to Table.AddItems()	2022-12-03 23:37:20 -08:00
Aliaksandr Valialkin	dccd70ce10	lib/storage: remove duplicate logging for filepath on errors	2022-12-03 23:15:28 -08:00
Aliaksandr Valialkin	813e8402f6	lib/storage: pass a single arg - rowsPerBlock - to getCompressLevel() function instead of two args	2022-12-03 23:10:26 -08:00
Aliaksandr Valialkin	bb93494eac	lib/{storage,mergeset}: use a single sync.WaitGroup for all background workers This simplifies the code	2022-12-03 23:03:32 -08:00
Aliaksandr Valialkin	106332cd9f	lib/storage: properly pass retentionMsecs to OpenStorage() at TestIndexDBRepopulateAfterRotation	2022-12-03 23:03:30 -08:00
Aliaksandr Valialkin	ea55c16422	lib/{mergeset,storage}: pass compressLevel to blockStreamWriter.InitFromInmemoryPart This allows packing in-memory blocks with different compression levels depending on its contents. This may save memory usage.	2022-12-03 22:47:06 -08:00
Aliaksandr Valialkin	7ffa66d249	lib/{mergeset,storage}: take into account byte slice capacity when returning the size of in-memory part This results in more correct reporting of memory usage for in-memory parts	2022-12-03 22:31:34 -08:00
Aliaksandr Valialkin	10a17bfa16	lib/{storage,mergeset}: consistency rename: `flushRaw{Rows,Items}` -> `flushPending{Rows,Items}`	2022-12-03 22:18:05 -08:00
Aliaksandr Valialkin	233301a549	lib/storage: optimization: do not scan block for rows outside retention if it is covered by the retention	2022-12-03 22:14:20 -08:00
Aliaksandr Valialkin	fd9d0a550b	lib/storage: remove logging redundant path values in a single error message	2022-12-03 22:14:19 -08:00
Aliaksandr Valialkin	8e9822bc7f	lib/storage: speed up search for data block for the given tsids Use binary search instead of linear scan for looking up the needed data block inside index block. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425	2022-12-03 20:59:59 -08:00
Aliaksandr Valialkin	d8303845ef	lib/storage: fix TestUpdateCurrHourMetricIDs test when it runs on the first hour of the day by UTC	2022-12-02 17:23:59 -08:00
Aliaksandr Valialkin	d8d4d21d7a	lib/{mergeset,storage}: re-use the code for removing isInMerge flag at parts Move the common code into releasePartsToMerge() method and consistently use it throughout the code.	2022-12-02 17:07:52 -08:00
匠心零度	d4808d5b84	lib/storage: remove extra error check (#3396)	2022-11-28 17:07:11 +01:00
Zakhar Bessarab	e407e7243a	{app/vmstorage,app/vmselect}: add API to get list of existing tenants (#3348) * {app/vmstorage,app/vmselect}: add API to get list of existing tenants * {app/vmstorage,app/vmselect}: add API to get list of existing tenants * app/vmselect: fix error message * {app/vmstorage,app/vmselect}: fix error messages * app/vmselect: change log level for error handling * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-11-25 10:32:45 -08:00
Aliaksandr Valialkin	d3035b1ca1	lib/storage: follow-up for `790768f20b` - Document the bugfix at docs/CHANGELOG.md - Simplify the bugfix a bit	2022-11-07 14:18:06 +02:00
Aliaksandr Valialkin	be78950011	lib/storage: typo fix after 32d48f8dfbb03174858c00bdfe6d9d22431dc8d8	2022-11-07 13:58:13 +02:00
Aliaksandr Valialkin	99e6a937a5	lib/storage: remove unused isFull field from hourMetricIDs struct	2022-11-07 13:15:59 +02:00
Aliaksandr Valialkin	ecb71a7221	lib/fs: add canOverwrite arg to WriteFileAtomically when it is allowed to overwrite the file atomically if it already exists	2022-10-26 01:08:35 +03:00
Aliaksandr Valialkin	a6d4711ac6	lib/storage: add support for retention filters (aka multiple retentions for distinct sets of time series) Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/143 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/289	2022-10-24 16:41:59 +03:00
Aliaksandr Valialkin	51f2e473f5	lib/storage: skip blocks outside the configured retention during search Blocks outside the configured retention are eventually deleted during background merge. But such blocks may reside in the storage for long time until background merge. Previously VictoriaMetrics could spend additional CPU time on processing such blocks during search queries. Now these blocks are skipped.	2022-10-24 02:56:13 +03:00
Aliaksandr Valialkin	2fc82b846e	lib/storage: do not pass retentionMsecs and isReadOnly args explicitly - access them via Storage arg This makes code easier to read. This is a follow-up after `d2d30581a0`	2022-10-24 01:32:56 +03:00
Aliaksandr Valialkin	d51f9b9284	lib/storage: small code cleanups	2022-10-24 01:17:58 +03:00
Aliaksandr Valialkin	5ace1587e6	lib/storage: re-use newTestStorage() instead of manually initializing Storage mock This is a follow-up for `d2d30581a0`	2022-10-23 16:24:42 +03:00
Aliaksandr Valialkin	57ea7a3ee8	lib/storage: pass Storage to table and partition instead of getDeletedMetricIDs callback This improves code readability a bit.	2022-10-23 16:11:02 +03:00
Aliaksandr Valialkin	63419d8e7c	lib/storage: small refactoring: move retentionDeadline to blockStreamMerger This allows defining per-block retention in the future by updating the getRetentionDeadline function	2022-10-23 16:11:01 +03:00
Aliaksandr Valialkin	31071347ca	lib/storage: use a single reference to the currently merged block - bsm.Block during the block merge loop	2022-10-23 14:09:14 +03:00
Aliaksandr Valialkin	5d0a91afd5	lib/storage: properly pass uint64 constant to fmt.Errorf on 32-bit platforms	2022-10-23 12:48:43 +03:00
Aliaksandr Valialkin	2dd93449d8	lib/storage: substitute searchTSIDs functions with more lightweight searchMetricIDs function The searchTSIDs function was searching for metricIDs matching the given tag filters and then was locating the corresponding TSID entries for the found metricIDs. The TSID entries aren't needed when searching for time series names (aka MetricName), so this commit removes the unneeded TSID search from the implementation of /api/v1/series API. This improves performance of /api/v1/series calls. This commit also improves performance a bit for /api/v1/query and /api/v1/query_range calls, since now these calls cache small metricIDs instead of big TSID entries in the indexdb/tagFilters cache (now this cache is named indexdb/tagFiltersToMetricIDs) without the need to compress the saved entries in order to save cache space. This commit also removes concurrency limiter during searching for matching time series, which was introduced in `8f16388428`, since the concurrency for all the read queries is already limited with -search.maxConcurrentRequests command-line flag. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648	2022-10-23 12:43:44 +03:00
Aliaksandr Valialkin	fe5611d6e1	lib/storage: free up memory occupied by Storage.pendingHourEntries after a temporary spike in its memory usage This reduces vmstorage memory usage by up to 20% in production workload	2022-10-21 14:59:14 +03:00
Aliaksandr Valialkin	32b6ce691b	lib/storage: move common code to newRawRowsBlock() function	2022-10-21 14:46:06 +03:00
Aliaksandr Valialkin	2f8861ed9c	lib/storage: simplify code a bit after `3f5959c053`	2022-10-21 14:39:44 +03:00
Aliaksandr Valialkin	1fb2be0cae	lib/{mergeset,storage}: simplify the code a bit after `ae55ad8749`	2022-10-21 14:33:15 +03:00
Aliaksandr Valialkin	af648279ce	lib/storage: validate timestamps in the block only if they use encoding, which needs validation This reduces CPU usage when there is no sense in validating timestamps. This is a follow-up for `5fa9525498` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2998 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3011	2022-10-21 00:54:37 +03:00
Aliaksandr Valialkin	edf3b7be47	lib/storage: try generating initial parts from inmemory rows with identical sizes under high ingestion rate This should improve background merge rate under high load a bit	2022-10-20 23:27:44 +03:00
Aliaksandr Valialkin	6855de311c	lib/{mergeset,storage}: avoid `unaligned 64-bit atomic operation` panic on 32-bit platforms The panic has been introduced in `68f3a02589` While at it, add padding to shard structs in order to avoid false sharing on modern CPUs This should improve scalability on systems with many CPU cores	2022-10-20 16:24:46 +03:00
Aliaksandr Valialkin	6f69a88a5a	lib/storage: double the number of rawRows shards on multi-core systems This should increase data ingestion scalability on multi-core systems at the cost of slightly higher memory usage	2022-10-17 18:19:28 +03:00
Aliaksandr Valialkin	68f3a02589	lib/{storage,mergeset}: do not hold per-shard lock in fast path when adding per-shard items to the flush list	2022-10-17 18:01:55 +03:00
Aliaksandr Valialkin	b96fe2e265	lib/storage: optimize matching speed for non-trivial regexp filters Wrap re.Match into bytesutil.FastStringMatcher. This increases performance for `{foo=~"complex_regex_here"}` filters by up to 4x.	2022-10-01 12:07:18 +03:00
Aliaksandr Valialkin	fe52378f45	lib/storage: substitute remaining calls to fs.MustRemoveAll with fs.MustRemoveDirAtomic	2022-09-13 15:49:25 +03:00
Aliaksandr Valialkin	6c9729d694	lib/storage: atomically remove parts inside partitions Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038	2022-09-13 15:28:41 +03:00
Aliaksandr Valialkin	daa42e4f79	lib/storage: atomically remove partitions, which went outside the configured retention Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038	2022-09-13 13:37:59 +03:00
Aliaksandr Valialkin	0a342f04b2	lib/storage: properly remove cache directory contents if `reset_cache_on_startup` file is located there Previously the cache directory was removed. This could result in error when the cache directory is mounted to a separate filesystem.	2022-09-13 13:32:05 +03:00
Aliaksandr Valialkin	ff7188b6a5	lib/storage: atomically remove snapshot directories Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038	2022-09-13 13:25:48 +03:00
Aliaksandr Valialkin	051e722112	lib/storage: verify that timestamps in block are in the range specified by blockHeader.{Min,Max}Timestamp when unpacking the block This should reduce chances of unnoticed on-disk data corruption. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2998 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3011 This change modifies the format for data exported via /api/v1/export/native - now this data contains MaxTimestamp and PrecisionBits fields from blockHeader. This is OK, since the native export format is undocumented.	2022-09-06 13:07:49 +03:00
Aliaksandr Valialkin	c49751adf8	lib/regexutil: add Simplify() function for simplifying the regular expression	2022-08-26 11:57:43 +03:00
Aliaksandr Valialkin	d60654eb0a	lib/promrelabel: optimize `action: {labeldrop,labelkeep,keep,drop}` with `regex` containing alternate values For example, the following relabeling rule must work much faster now: - action: labeldrop regex: "foo|bar|baz"	2022-08-24 17:55:54 +03:00
Aliaksandr Valialkin	891eb608df	lib/storage: increase the maximum possible `or` values extracted from regexp from 20 to 100 This should improve time series search speed for regexp filters with big number of `or` values.	2022-08-24 17:16:29 +03:00
Aliaksandr Valialkin	1b14cf18b6	lib/storage: ignore `start text` and `end text` anchors in getOrValues(regexp) function This is OK, since the anchors are implicitly applied to the whole regexp. This optimization should improve the speed for regexp series filters with explicit $ and ^ anchors. For example, `{label="^(foo|bar)$"}`	2022-08-24 17:16:28 +03:00
Aliaksandr Valialkin	7b9ba456ff	app/vmstorage: expose `vm_{hourly,daily}_series_limit_{max,current}_series` metrics if `-storage.max{Hourly,Daily}Series` limits are set These metrics allow alerting when the number of unique series approach the limit. For example, the following query alerts when the number of series reaches 90% of the configured limit: vm_hourly_series_limit_current_series / vm_hourly_series_limit_max_series > 0.9	2022-08-24 13:41:57 +03:00

678 commits