github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	152ca00fb8	docs/CHANGELOG.md: clarify description for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336 bugfix This is a follow-up for `5eb5df96e2`	2023-07-06 17:09:03 -07:00
Aliaksandr Valialkin	aeac39cfd1	lib/storage: do not create flock.lock files at partition directories, since it is created at the Storage level	2023-06-19 22:48:37 -07:00
Nikolay	5eb5df96e2	lib/storage: creates parts.json on start-up if it does not exist. (#4450) * lib/storage: creates parts.json on start-up if it does not exist. It fixes migrations from versions below v1.90.0. Previously parts.json was created only after successful merge. But if merge was interrupted for some reason (OOM or shutdown), parts.json wasn't created and partitions left after interrupted merge weren't properly deleted, since VM cannot check whether they must be removed or not. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336 * Apply suggestions from code review Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> * Update lib/storage/partition.go Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> --------- Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2023-06-15 11:19:22 +02:00
Roman Khavronenko	f50f35a8e0	lib/storage: add comment for how `mustBeDeleted` field should be used (#4454) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-06-15 11:17:45 +02:00
Aliaksandr Valialkin	1f28b46ae9	lib/storage: revert the migration from global to per-day index for (MetricName -> TSID) This reverts the following commits: - `e0e16a2d36` - `2ce02a7fe6` The reason for revert: the updated logic breaks assumptions made when fixing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 . For example, if a time series stops receiving new samples during the first day after the indexdb rotation, there are chances that the time series won't be registered in the new indexdb. This is OK until the next indexdb rotation, since the time series is registered in the previous indexdb, so it can be found during queries. But the time series will become invisible for search after the next indexdb rotation, while its data is still there. There is also an incompletely solved issue with the increased CPU and disk IO resource usage just after the indexdb rotation. There was an attempt to fix it, but it didn't fix it in full, while introducing the issue mentioned above. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 TODO: to find out the solution, which simultaneously solves the following issues: - increased memory usage for setups with high churn rate and long retention (e.g. what the reverted commit does) - increased CPU and disk IO usage during indexdb rotation ( https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 ) - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 Possible solution - to create the new indexdb one hour before the indexdb rotation and to gradually pre-populate it with the needed index data during the last hour before indexdb rotation. Then the new indexdb will contain all the needed data just after the rotation, so it won't trigger increased CPU and disk IO.	2023-05-18 11:30:49 -07:00
Haleygo	1531d757ea	fix lint check	2023-05-17 13:51:36 +02:00
Aliaksandr Valialkin	e0e16a2d36	lib/storage: follow-up after `2ce02a7fe6` - Document the change at docs/CHANGELOG.md - Clarify comments for non-trivial code touched by the commit - Improve the logic behind maybeCreateIndexes(): - Correctly create per-day indexes if the indexdb rotation is performed during the first hour or the last hour of the day by UTC. Previously there was a possibility of missing index entries on that day. - Increase the duration for creating new indexes in the current indexdb for up to 22 hours after indexdb rotation. This should reduce the increased resource usage after indexdb rotation. It is safe to postpone index creation for the current day until the last hour of the current day after indexdb rotation by UTC, since the corresponding (date, ...) entries exist in the previous indexdb. - Search for TSID by (date, MetricName) in both the current and the previous indexdb. Previously the search was performed only in the current indexdb. This could lead to excess creation of per-day indexes for the current day just after indexdb rotation. - Search for (date, metricID) entries in both the current and the previous indexdb. Previously the search was performed only in the current indexdb. This could lead to excess creation of per-day indexes for the current day just after indexdb rotation.	2023-05-16 23:19:27 -07:00
Roman Khavronenko	2ce02a7fe6	lib/storage: introduce per-day MetricName=>TSID index (#4252) The new index substitutes global MetricName=>TSID index used for locating TSIDs on ingestion path. For installations with high ingestion and churn rate, global MetricName=>TSID index can grow enormously making index lookups too expensive. This also results in bigger than expected cache growth for indexdb blocks. The new per-day index is supposed to be much smaller and more efficient. This should improve ingestion speed and reliability during re-routings in cluster. The negative outcome could be higher disk space usage, since per-day index is more expensive compared to global index. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-05-16 15:46:42 -07:00
Aliaksandr Valialkin	278278af95	lib/storage: reduce the unimportant logging during Storage start / stop This should improve the visibility of potentially important logs	2023-05-16 15:14:21 -07:00
Aliaksandr Valialkin	09b403d38a	lib/{mergeset,storage}: make it clear that DebugFlush() doesn't store all the recently ingested data to disk DebugFlush() makes sure that the recently ingested data becomes visible to search. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4272	2023-05-16 11:50:17 -07:00
Zakhar Bessarab	242050ba94	lib/storage: follow-up after `a50d63c376` (#4289) * lib/storage: follow-up after `a50d63c376` - ensure retentionMsecs is rounded to day - remove localTimeOffset in test as localOffset is ignored when using `UnixMilli` Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage: restore retention timezone offset effect on retention deadline Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-05-16 17:14:08 +02:00
Nikolay	8f4de6fa47	lib/storage: properly update link for entry at dateMetricID cache (#4258) Previously, during sync of the mutable and immutable cache parts, the link for the hotEntry with the current date might not be properly updated. This corrupted the cache for backfilling metrics and increased CPU load.	2023-05-05 21:45:47 -07:00
Zakhar Bessarab	aca256735c	lib/storage: fix indexdb rotation infinite loop (#4249) When using `retentionTimezoneOffset` with a local timezone more than 4 hours different from UTC, the indexdb retention calculation could return a negative value. This caused indexdb rotation to get stuck in a loop. Fix calculation of offset to use `retentionTimezoneOffset` value properly and add test to cover all legit timezone configs. See: - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4207 - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4206 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2023-05-04 17:16:48 +02:00
Aliaksandr Valialkin	2a4c48c59d	lib/{mergeset,storage}: make mustReadPartNames() code more clear	2023-04-14 23:16:59 -07:00
Aliaksandr Valialkin	52006149b2	lib/storage: replace OpenStorage() with MustOpenStorage() Callers of OpenStorage() log the returned error and exit. The error logging and exit can be performed inside MustOpenStorage() alongside with printing the stack trace for better debuggability. This simplifies the code at caller side.	2023-04-14 23:02:40 -07:00
Aliaksandr Valialkin	2a2036160d	lib/storage: fix a bug, which prevents from reading pre-v1.90.0 parts The bug has been introduced in `c0b852d50d`	2023-04-14 22:33:08 -07:00
Aliaksandr Valialkin	3727251910	lib/fs: add MustReadDir() function Use fs.MustReadDir() instead of os.ReadDir() across the code in order to reduce the code verbosity. The fs.MustReadDir() logs the error with the directory name and the call stack on error before exit. This information should be enough for debugging the cause of the error.	2023-04-14 22:10:46 -07:00
Aliaksandr Valialkin	60d92894c5	lib/storage: validate rows in partition.AddRows() only during tests	2023-04-14 20:52:36 -07:00
Aliaksandr Valialkin	df619bdff0	all: consistently use fs.MustClose() for closing lock files	2023-04-14 20:14:21 -07:00
Aliaksandr Valialkin	2a3b19e1d2	lib/fs: convert CreateFlockFile to MustCreateFlockFile Callers of CreateFlockFile log the returned err and exit. It is better to log the error inside the MustCreateFlockFile together with the path to the specified directory and the call stack. This simplifies the code at the callers' side while leaving the debuggability at the same level.	2023-04-14 19:50:01 -07:00
Aliaksandr Valialkin	c0b852d50d	lib/{storage,mergeset}: convert InitFromFilePart to MustInitFromFilePart Callers of InitFromFilePart log the error and exit. It is better to log the error with the path to the part and the call stack directly inside the MustInitFromFilePart() function. This simplifies the code at callers' side while leaving the same level of debuggability.	2023-04-14 15:46:12 -07:00
Aliaksandr Valialkin	9183a439c7	lib/filestream: change Create() to MustCreate() Callers of this function log the returned error and exit. It is better logging the error together with the path to the filename and call stack directly inside the function. This simplifies the code at callers' side without reducing the level of debuggability	2023-04-14 15:12:48 -07:00
Aliaksandr Valialkin	5eb163a08a	lib/filestream: transform Open() -> MustOpen() Callers of this function log the returned error and exit. Let's log the error with the path to the filename and call stack inside the function. This simplifies the code at callers' side without reducing the level of debuggability.	2023-04-14 15:03:42 -07:00
Aliaksandr Valialkin	f341b7b3f8	lib/fs: substitute ReadFullData with MustReadData Callers of ReadFullData() log the error and then exit. So let's log the error with the path to the filename and the call stack inside MustReadData(). This simplifies the code at callers' side, while leaving the debuggability at the same level.	2023-04-14 14:39:29 -07:00
Aliaksandr Valialkin	bd6de6406a	lib/fs: improve error logging inside MustWriteData Log the path to file on errors inside MustWriteData(). This improves debuggability of errors, which may occur inside MustWriteData().	2023-04-14 14:32:45 -07:00
Aliaksandr Valialkin	e0595af2bf	lib/{mergeset,storage}: remove isInMerge flag from parts only when they weren't removed yet from the list of active parts This prevents a possible panic during access to pw.p when it is set to nil at partWrapper.decRef() called inside swapSrcWithDstParts()	2023-04-14 00:08:11 -07:00
Aliaksandr Valialkin	9f8209d593	docs/CHANGELOG.md: run at least 4 background mergers on systems with less than 4 CPU cores This reduces the probability of sudden spike in the number of small parts when all the background mergers are busy with big merges.	2023-04-13 23:43:17 -07:00
Aliaksandr Valialkin	550d5c7ea4	lib/{mergeset,storage}: make sure that getFlushToDiskDeadline() takes into account only in-memory parts	2023-04-13 23:43:17 -07:00
Aliaksandr Valialkin	809fbaeaac	lib/fs: add Must prefix to CopyDirectory and CopyFile functions Callers of these functions log the returned error and then exit. Let's log the error with the call stack inside the function itself. This simplifies the code at callers' side, while leaving the same level of debuggability in case of errors.	2023-04-13 23:02:59 -07:00
Aliaksandr Valialkin	780abc3b3b	lib/fs: rename SymlinkRelative to MustSymlinkRelative Callers of this function log the returned error and then exit. Let's log the error with the call stack inside the function itself. This simplifies the code at callers' side, while leaving the same level of debuggability in case of errors.	2023-04-13 22:52:55 -07:00
Aliaksandr Valialkin	5f487ed996	lib/fs: rename HardLinkFiles to MustHardLinkFiles Callers of this function log the returned error and then exit. Let's log the error with the call stack inside the function itself. This simplifies the code at callers' side, while leaving the same level of debuggability in case of errors.	2023-04-13 22:48:07 -07:00
Aliaksandr Valialkin	30425ca81a	lib/fs: rename WriteFileAtomically to MustWriteAtomic Callers of this function log the returned error and exit. So let's just log the error with the given filepath and the call stack inside the function itself and then exit. This simplifies the code at callers' place while leaves the same level of debuggability in case of errors.	2023-04-13 22:41:15 -07:00
Aliaksandr Valialkin	036a7b7365	lib/fs: replace MkdirAllIfNotExist->MustMkdirIfNotExist and MkdirAllFailIfExist->MustMkdirFailIfExist Callers of these functions log the returned error and then exit. The returned error already contains the path to the directory, which failed to be created. So let's just log the error together with the call stack inside these functions. This leaves the debuggability of the returned error at the same level while allowing the code at callers' side to be simplified. While at it, properly use MustMkdirFailIfExist instead of MustMkdirIfNotExist inside inmemoryPart.MustStoreToDisk(). It is expected that inmemoryPart.MustStoreToDisk() must fail if there is already a directory under the given path.	2023-04-13 22:11:59 -07:00
Aliaksandr Valialkin	344209e5e6	lib/fs: rename MustWriteFileAndSync to MustWriteSync in order to improve readability a bit This is a follow-up for `2a8395be05`	2023-04-13 21:43:32 -07:00
Aliaksandr Valialkin	b15c5961ab	lib/{mergeset,storage}: remove unused `path` field from blockStreamWriter This is a follow-up after `42bba64aa7`	2023-04-13 21:39:59 -07:00
Aliaksandr Valialkin	2a8395be05	lib/fs: replace WriteFileAndSync with MustWriteAndSync When WriteFileAndSync fails, then the caller eventually logs the error message and exits. The error message returned by WriteFileAndSync already contains the path to the file, which couldn't be created. This information alongside the call stack is enough for debugging the issue. So just use log.Panicf("FATAL: ...") inside MustWriteAndSync(). This simplifies error handling at caller side a bit.	2023-04-13 21:33:19 -07:00
Aliaksandr Valialkin	25f089de9d	lib/{mergeset,storage}: properly fsync part directory listing after writing in-memory part to disk This is a follow-up after `42bba64aa7` Previously the part directory listing was fsync'ed implicitly inside partHeader.WriteMetadata() by calling fs.WriteFileAtomically(). Now it must be fsync'ed explicitly. There is no need in fsync'ing the parent directory, since it is fsync'ed by the caller when updating parts.json file.	2023-04-13 21:19:04 -07:00
Aliaksandr Valialkin	42bba64aa7	lib/{mergeset,storage}: explicitly fsync the created part directory listing Previously the created part directory listing was fsynced implicitly when storing metadata.json file in it. Also remove superfluous fsync for part directory listing, which was called at blockStreamWriter.MustClose(). After that the metadata.json file is created, so an additional fsync for the directory contents is needed.	2023-04-13 21:03:08 -07:00
Aliaksandr Valialkin	e1211a1187	app/vmstorage: deprecate -bigMergeConcurrency command-line flag Improperly configured -bigMergeConcurrency command-line flag usually leads to uncontrolled growth of unmerged parts, which, in turn, increases CPU usage and query durations. So it is better deprecating this flag. In rare cases -smallMergeConcurrency command-line flag can be used instead for controlling the concurrency of background merges.	2023-04-13 20:40:24 -07:00
Haleygo	0ad6010c91	fix sort pendingDateMetricsIDs (#4102)	2023-04-10 10:23:12 -07:00
Aliaksandr Valialkin	19b189e9b7	lib/storage: use shorter code after `03bde173b7`	2023-04-02 21:35:52 -07:00
faceair	38fc55976e	lib/storage: fix reuse pendingMetricRow (#4049)	2023-04-02 21:35:50 -07:00
faceair	f3af8331ec	lib/storage: remove unused code (#4050)	2023-04-02 21:24:42 -07:00
Roman Khavronenko	27b958ba8b	lib/storage: check for free disk space before opening tables (#4035) * lib/storage: check for free disk space before opening tables We check for free disk space before call to `openTable`, so `Storage` can be set to ReadOnly before mergeWorkers start. Before the change, there was a chance that merges will start even if Storage has to start in ReadOnly mode because of `-storage.minFreeDiskSpaceBytes` limit. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4023 Signed-off-by: hagen1778 <roman@victoriametrics.com> * lib/storage: chore Signed-off-by: hagen1778 <roman@victoriametrics.com> * Update lib/storage/storage.go --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-03-31 23:50:27 -07:00
Aliaksandr Valialkin	c8f2febaa1	lib/storage: consistently use OS-independent separator in file paths This is needed for Windows support, which uses `\` instead of `/` as file separator Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-25 14:33:58 -07:00
Aliaksandr Valialkin	b14d96618c	all: follow-up after `34634ec357` - Use windows.FlushFileBuffers() instead of windows.Fsync() at streamTracker.adviseDontNeed() for consistency with implementations for other architectures. - Use filepath.Base() instead of filepath.Split(), since the dir part isn't used. This simplifies the code a bit. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-25 11:57:39 -07:00
Nikolay	34634ec357	lib/fs: adds memory map for windows (#3988) This is a follow-up for `43b24164ef` * lib/fs: adds memory map for windows it should improve performance for file reading * lib/storage: replace '/' with os specific separator it must fix errors for windows * lib/fs: mention windows fsync support * lib/filestream: adds fdatasync for windows writes Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-25 11:43:19 -07:00
Dmytro Kozlov	5c92022cc6	lib/storage: fix collect downsampling metrics (#489) * lib/storage: fix downsampling * lib/storage: update logic * lib/storage: fix comments, removed unneeded check	2023-03-19 23:34:46 -07:00
Aliaksandr Valialkin	43b24164ef	all: add Windows build for VictoriaMetrics This commit changes background merge algorithm, so it becomes compatible with Windows file semantics. The previous algorithm for background merge: 1. Merge source parts into a destination part inside tmp directory. 2. Create a file in txn directory with instructions on how to atomically swap source parts with the destination part. 3. Perform instructions from the file. 4. Delete the file with instructions. This algorithm guarantees that either source parts or destination part is visible in the partition after unclean shutdown at any step above, since the remaining files with instructions are replayed on the next restart, after that the remaining contents of the tmp directory are deleted. Unfortunately this algorithm doesn't work under Windows because it disallows removing and moving files, which are in use. So the new algorithm for background merge has been implemented: 1. Merge source parts into a destination part inside the partition directory itself. E.g. now the partition directory may contain both complete and incomplete parts. 2. Atomically update the parts.json file with the new list of parts after the merge, e.g. remove the source parts from the list and add the destination part to the list before storing it to parts.json file. 3. Remove the source parts from disk when they are no longer used. This algorithm guarantees that either source parts or destination part is visible in the partition after unclean shutdown at any step above, since incomplete partitions from step 1 or old source parts from step 3 are removed on the next startup by inspecting parts.json file. This algorithm should work under Windows, since it doesn't remove or move files in use. This algorithm has also the following benefits: - It should work better for NFS. - It fits object storage semantics. The new algorithm changes data storage format, so it is impossible to downgrade to the previous versions of VictoriaMetrics after upgrading to this algorithm. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3236 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3821 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2023-03-19 01:36:51 -07:00
Aliaksandr Valialkin	6460475e3b	lib/{mergeset,storage}: prevent from long wait time when creating a snapshot under high data ingestion rate Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3873	2023-03-19 00:15:30 -07:00


663 commits