Commit graph

252 commits

Author SHA1 Message Date
Aliaksandr Valialkin
61562cdee9
lib/storage: keep (date, metricID) entries only for the last two dates
Entries for the previous dates is usually not used, so there is little sense in keeping them in memory.

This should reduce the size of storage/date_metricID cache, which can be monitored
via vm_cache_entries{type="storage/date_metricID"} metric.
2024-01-29 18:44:27 +01:00
Aliaksandr Valialkin
f5559c038c
lib/storage: do not check the limit for -search.maxUniqueTimeseries when performing /api/v1/labels and /api/v1/label/.../values requests
This limit has little sense for these APIs, since:

- Thses APIs frequently result in scanning of all the time series on the given time range.
  For example, if extra_filters={datacenter="some_dc"} .

- Users expect these APIs shouldn't hit the -search.maxUniqueTimeseries limit,
  which is intended for limiting resource usage at /api/v1/query and /api/v1/query_range requests.

Also limit the concurrency for /api/v1/labels, /api/v1/label/.../values
and /api/v1/series requests in order to limit the maximum memory usage and CPU usage for these API.
This limit shouldn't affect typical use cases for these APIs:

- Grafana dashboard load when dashboard labels should be loaded
- Auto-suggestion list load when editing the query in Grafana or vmui

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5055
2024-01-29 16:44:46 +01:00
Aliaksandr Valialkin
7a8b92b590
lib/{mergeset,storage}: make background merge more responsive and scalable
- Maintain a separate worker pool per each part type (in-memory, file, big and small).
  Previously a shared pool was used for merging all the part types.
  A single merge worker could merge parts with mixed types at once. For example,
  it could merge simultaneously an in-memory part plus a big file part.
  Such a merge could take hours for big file part. During the duration of this merge
  the in-memory part was pinned in memory and couldn't be persisted to disk
  under the configured -inmemoryDataFlushInterval .

  Another common issue, which could happen when parts with mixed types are merged,
  is uncontrolled growth of in-memory parts or small parts when all the merge workers
  were busy with merging big files. Such growth could lead to significant performance
  degradataion for queries, since every query needs to check ever growing list of parts.
  This could also slow down the registration of new time series, since VictoriaMetrics
  searches for the internal series_id in the indexdb for every new time series.

  The third issue is graceful shutdown duration, which could be very long when a background
  merge is running on in-memory parts plus big file parts. This merge couldn't be interrupted,
  since it merges in-memory parts.

  A separate pool of merge workers per every part type elegantly resolves both issues:
  - In-memory parts are merged to file-based parts in a timely manner, since the maximum
    size of in-memory parts is limited.
  - Long-running merges for big parts do not block merges for in-memory parts and small parts.
  - Graceful shutdown duration is now limited by the time needed for flushing in-memory parts to files.
    Merging for file parts is instantly canceled on graceful shutdown now.

- Deprecate -smallMergeConcurrency command-line flag, since the new background merge algorithm
  should automatically self-tune according to the number of available CPU cores.

- Deprecate -finalMergeDelay command-line flag, since it wasn't working correctly.
  It is better to run forced merge when needed - https://docs.victoriametrics.com/#forced-merge

- Tune the number of shards for pending rows and items before the data goes to in-memory parts
  and becomes visible for search. This improves the maximum data ingestion rate and the maximum rate
  for registration of new time series. This should reduce the duration of data ingestion slowdown
  in VictoriaMetrics cluster on e.g. re-routing events, when some of vmstorage nodes become temporarily
  unavailable.

- Prevent from possible "sync: WaitGroup misuse" panic on graceful shutdown.

This is a follow-up for fa566c68a6 .
Thanks @misutoth to for the inspiration at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5212

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5190
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3790
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3641
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
2024-01-26 22:19:52 +01:00
Aliaksandr Valialkin
9b3217db61
lib/storage: do not sort metricIDs passed to Storage.prefetchMetricNames, since the caller is responsible for the sorting 2024-01-23 16:13:19 +02:00
Aliaksandr Valialkin
d52fd73f18
all: add up to 10% random jitter to the interval between periodic tasks performed by various components
This should smooth CPU and RAM usage spikes related to these periodic tasks,
by reducing the probability that multiple concurrent periodic tasks are performed at the same time.
2024-01-22 18:39:16 +02:00
Aliaksandr Valialkin
64e615e6cc
lib/storage: reduce the contention on dateMetricIDCache mutex when new time series are registered at high rate
The dateMetricIDCache puts recently registered (date, metricID) entries into mutable cache protected by the mutex.
The dateMetricIDCache.Has() checks for the entry in the mutable cache when it isn't found in the immutable cache.
Access to the mutable cache is protected by the mutex. This means this access is slow on systems with many CPU cores.
The mutabe cache was merged into immutable cache every 10 seconds in order to avoid slow access to mutable cache.
This means that ingestion of new time series to VictoriaMetrics could result in significant slowdown for up to 10 seconds
because of bottleneck at the mutex.

Fix this by merging the mutable cache into immutable cache after len(cacheItems) / 2
cache hits under the mutex, e.g. when the entry is found in the mutable cache.
This should automatically adjust intervals between merges depending on the addition rate
for new time series (aka churn rate):

- The interval will be much smaller than 10 seconds under high churn rate.
  This should reduce the mutex contention for mutable cache.
- The interval will be bigger than 10 seconds under low churn rate.
  This should reduce the uneeded work on merging of mutable cache into immutable cache.
2024-01-22 18:14:30 +02:00
Aliaksandr Valialkin
5f5fcab217
all: call atomic.Load* in front of atomic.CompareAndSwap* at places where the atomic.CompareAndSwap* returns false most of the time
This allows avoiding slow inter-CPU synchornization induced by atomic.CompareAndSwap*
2024-01-22 01:13:41 +02:00
Aliaksandr Valialkin
41d6c8a7dd
lib/storage: do not prefetch metric names for small number of metricIDs
This eliminates prefetchedMetricIDsLock lock contention for queries, which return less than 500 time series.

This is a follow-up for 9d886a2eb0
2024-01-17 13:50:01 +02:00
Aliaksandr Valialkin
f673039e86
lib/storage: follow-up for 4b8088e377
- Clarify the bugfix description at docs/CHANGELOG.md
- Simplify the code by accessing prefetchedMetricIDs struct under the lock
  instead of using lockless access to immutable struct.
  This shouldn't worsen code scalability too much on busy systems with many CPU cores,
  since the code executed under the lock is quite small and fast.
  This allows removing cloning of prefetchedMetricIDs struct every time
  new metric names are pre-fetched. This should reduce load on Go GC,
  since the cloning of uin64set.Set struct allocates many new objects.
2024-01-16 22:38:57 +02:00
Roman Khavronenko
d562d772a8
lib/storage: properly check for storage/prefetchedMetricIDs cache expiration deadline (#5607)
Before, this cache was limited only by size.
Cache invalidation by time happens with jitter to prevent thundering herd problem.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-01-16 21:08:59 +02:00
Aliaksandr Valialkin
c22b63af04
lib/storage: follow-up for 29cebd82fb
Use atomic.CompareAndSwapUint32() instead of atomic.LoadUint32() followed by atomic.StoreUint32().
This makes the code more clear.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159
2023-10-31 19:03:50 +01:00
Roman Khavronenko
733b73ffed
lib/storage: log warning about RO mode only on state change (#5191)
Before, vmstorage would log the same message each second producing excessive
amount of logs.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 29cebd82fb)
2023-10-30 11:29:49 +01:00
hagen1778
afab547821
lib/storage: follow-up after 188cfe3a85
188cfe3a85

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-26 09:04:49 +02:00
Ilya Trefilov
1fd3385965
lib/storage: do not create tsid if metric contains stale marker(#5069) (#5174)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5069
2023-10-26 09:03:19 +02:00
Haleygo
93589ecccf
fix ingesting stale point, follow up fe8cc573d1 (#5179) 2023-10-16 14:00:39 +02:00
hagen1778
709a2bad66
docs: remove extra / in the end of the link
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-16 13:24:30 +02:00
Aliaksandr Valialkin
281eb0c377
lib/storage: log fatal error inside searchMetricName() instead of propagating it to the caller
This simplifies the code a bit at searchMetricName() and searchMetricNameWithCache() call sites

This is a result of investigating https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4972
2023-09-22 11:37:55 +02:00
Aliaksandr Valialkin
24cec79763
lib/storage: handle fatal errors inside indexSearch.getTSIDByMetricID() instead of returning them to the caller
This simplifies the code a bit at caller side
2023-09-15 12:01:06 +02:00
Dima Lazerka
b97041c252
flagutil: Make .Msecs private (#4906)
* Introduce flagutil.Duration

To avoid conversion bugs

* Fix tests

* Clarify documentation re. month=31 days

* Add fasttime.UnixTime() to obtain time.Time

The goal is to refactor out the last usage of `.Msecs`.

* Use fasttime for time.Now()

* wip

- Remove fasttime.UnixTime(), since it doesn't improve code readability and maintainability
- Run `make docs-sync` for syncing changes from README.md to docs/ folder
- Make lib/flagutil.Duration.Msec private
- Rename msecsPerMonth const to msecsPer31Days in order to be consistent with retention31Days

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-09-03 10:37:57 +02:00
Dima Lazerka
1d60c236b1
Add flagutil.Duration to avoid conversion bugs (#4835)
* Introduce flagutil.Duration

To avoid conversion bugs

* Fix tests

* Comment why not .Seconds()
2023-09-01 09:30:30 +02:00
Nikolay
1a943bb16a
lib/storage: properly caclucate nextRotationTimestamp (#4874)
cause of typo unix millis was used instead of unix for current timestamp
calculation
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4873

(cherry picked from commit c5aac34b68)
2023-08-23 13:29:32 +02:00
Aliaksandr Valialkin
89ccf19b70
lib/storage: update nextRotationTimestamp relative to the timestamp of the indexdb rotation
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4563
2023-07-28 19:48:42 -07:00
Nikolay
30b32583f4
lib/storage: pre-create timeseries before indexDB rotation (#4652)
* lib/storage: pre-create timeseries before indexDB rotation
during an hour before indexDB rotation start creating records at the next indexDB
it must improve performance during switch for the next indexDB and remove ingestion issues.
Since there is no need for creation new index records for timeseries already ingested into current indexDB
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4563

* lib/storage: further work on indexdb rotation optimization

- Document the change at docs/CHAGNELOG.md
- Move back various caches from indexDB to Storage. This makes the change less intrusive.
  The dateMetricIDCache now takes into account indexDB generation, so it stores (date, metricID)
  entries for both the current and the next indexDB.
- Consolidate the code responsible for idbNext pre-filling into prefillNextIndexDB() function.
  This improves code readability and maintainability a bit.
- Rewrite and simplify the code responsible for calculating the next retention timestamp.
  Add various tests for corner cases of this code.
- Remove indexdb pre-filling from RegisterMetricNames() function, since this function is rarely called.
  It is OK to add indexdb entries on demand in this function. This simplifies the code.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401

* docs/CHANGELOG.md: refer to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4563

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-07-22 15:23:14 -07:00
Aliaksandr Valialkin
992c300ce9
all: replace atomic.Value with atomic.Pointer[T]
This eliminates the need in .(*T) casting for results obtained from Load()

Leave atomic.Value for map, since atomic.Pointer[map[...]...] makes double pointer to map,
because map is already a pointer type.
2023-07-19 17:48:26 -07:00
Aliaksandr Valialkin
3d23fd9853
lib/storage: move series registration in caches from createAllIndexesForMetricName into a separate function - putSeriesToCache
This makes the code more clear and easier to read

This is a follow-up for 7094fa38bc
2023-07-13 23:17:14 -07:00
Aliaksandr Valialkin
7d359d17d1
lib/storage: properly free up resources from newTestStorage() by calling stopTestStorage() 2023-07-13 17:13:34 -07:00
Aliaksandr Valialkin
e1cf962bad
lib/storage: switch from global to per-day index for MetricName -> TSID mapping
Previously all the newly ingested time series were registered in global `MetricName -> TSID` index.
This index was used during data ingestion for locating the TSID (internal series id)
for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names).

The `MetricName -> TSID` index is stored on disk in order to make sure that the data
isn't lost on VictoriaMetrics restart or unclean shutdown.

The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding
data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache,
and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics
uses in-memory cache for speeding up the lookup for active time series.
This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested
active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk.

VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases:

- If `storage/tsid` cache capacity isn't enough for active time series.
  Then just increase available memory for VictoriaMetrics or reduce the number of active time series
  ingested into VictoriaMetrics.

- If new time series is ingested into VictoriaMetrics. In this case it cannot find
  the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index,
  since it doesn't know that the index has no the corresponding entry too.
  This is a typical event under high churn rate, when old time series are constantly substituted
  with new time series.

Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index,
are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics.

Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName`
for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod.
This index can become very large under high churn rate and long retention. VictoriaMetrics
caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups.
The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing
recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series.

This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly
reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics
consults only the index for the current day when new time series is ingested into it.

The downside of this change is increased indexdb size on disk for workloads without high churn rate,
e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store
identical `MetricName -> TSID` entries for static time series for every day.

This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation,
since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 .

At the same time the change fixes the issue, which could result in lost access to time series,
which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698

The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed
in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685

This is a follow-up for 1f28b46ae9
2023-07-13 17:03:50 -07:00
Aliaksandr Valialkin
0ebfb91aba
lib/storage: revert the migration from global to per-day index for (MetricName -> TSID)
This reverts the following commits:
- e0e16a2d36
- 2ce02a7fe6

The reason for revert: the updated logic breaks assumptions made
when fixing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 .
For example, if a time series stop receiving new samples during the first
day after the indexdb rotation, there are chances that the time series
won't be registered in the new indexdb. This is OK until the next indexdb
rotation, since the time series is registered in the previous indexdb,
so it can be found during queries. But the time series will become invisible
for search after the next indexdb rotation, while its data is still there.

There is also incompletely solved issue with the increased CPU and disk IO resource
usage just after the indexdb rotation. There was an attempt to fix it, but it didn't fix
it in full, while introducing the issue mentioned above. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401

TODO: to find out the solution, which simultaneously solves the following issues:
- increased memory usage for setups high churn rate and long retention (e.g. what the reverted commit does)
- increased CPU and disk IO usage during indexdb rotation ( https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 )
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698
2023-05-18 11:28:54 -07:00
Aliaksandr Valialkin
67beb8c856
lib/storage: follow-up after 2ce02a7fe6
- Document the change at docs/CHANGELOG.md
- Clarify comments for non-trivial code touched by the commit
- Improve the logic behind maybeCreateIndexes():
  - Correctly create per-day indexes if the indexdb rotation is performed during
    the first hour or the last hour of the day by UTC.
    Previously there was a possibility of missing index entries on that day.
  - Increase the duration for creating new indexes in the current indexdb for up to 22 hours
    after indexdb rotation. This should reduce the increased resource usage
    after indexdb rotation.
    It is safe to postpone index creation for the current day until the last hour
    of the current day after indexdb rotation by UTC, since the corresponding (date, ...)
    entries exist in the previous indexdb.
- Search for TSID by (date, MetricName) in both the current and the previous indexdb.
  Previously the search was performed only in the current indexdb. This could lead
  to excess creation of per-day indexes for the current day just after indexdb rotation.
- Search for (date, metricID) entries in both the current and the previous indexdb.
  Previously the search was performed only in the current indexdb. This could lead
  to excess creation of per-day indexes for the current day just after indexdb rotation.
2023-05-16 23:31:59 -07:00
Roman Khavronenko
c3b1d9ee21
lib/storage: introduce per-day MetricName=>TSID index (#4252)
The new index substitutes global MetricName=>TSID index
used for locating TSIDs on ingestion path.
For installations with high ingestion and churn rate, global
MetricName=>TSID index can grow enormously making
index lookups too expensive. This also results into bigger
than expected cache growth for indexdb blocks.

New per-day index supposed to be much smaller and more efficient.
This should improve ingestion speed and reliability during
re-routings in cluster.

The negative outcome could be occupied disk size, since
per-day index is more expensive comparing to global index.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-16 23:18:11 -07:00
Aliaksandr Valialkin
bc98ea9a8d
lib/storage: reduce the unimportant logging during Storage start / stop
This should improve the visibility of potentially important logs
2023-05-16 15:32:35 -07:00
Aliaksandr Valialkin
f09745f613
lib/{mergeset,storage}: make it clear that DebugFlush() doesn't store all the recently ingested data to disk
DebugFlush() makes sure that the recently ingested data becomes visible to search.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4272
2023-05-16 11:55:58 -07:00
Zakhar Bessarab
a2fc912c43
lib/storage: follow-up after a50d63c376 (#4289)
* lib/storage: follow-up after a50d63c376

- ensure retentionMsecs is rounded to day
- remove localTimeOffset in test as localOffset is ignored when using `UnixMilli`

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/storage: restore retention timezone offset effect on retention deadline

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-05-16 10:13:20 -07:00
Nikolay
8c9dc837b9
lib/storage: properly update link for entry at dateMetricID cache (#4258)
previously during sync for mutable and immutable cache parts, link for hotEntry with current date may be not properly updated
it corrupts cache for backfilling metrics and increased cpu load
2023-05-09 21:39:41 -07:00
Zakhar Bessarab
348693ff84
lib/storage: fix indexdb rotation infinite loop (#4249)
When using `retentionTimezoneOffset` and having local timezone being more than 4 hours different from UTC indexdb retention calculation could return negative value. This caused indexdb rotation to get in loop.
Fix calculation of offset to use `retentionTimezoneOffset` value properly and add test to cover all legit timezone configs.
See:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4207
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4206

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-05-09 21:23:01 -07:00
Aliaksandr Valialkin
d739511f5b
lib/storage: replace OpenStorage() with MustOpenStorage()
Callers of OpenStorage() log the returned error and exit.
The error logging and exit can be performed inside MustOpenStorage()
alongside with printing the stack trace for better debuggability.
This simplifies the code at caller side.
2023-04-14 23:04:42 -07:00
Aliaksandr Valialkin
cf4701db65
lib/fs: add MustReadDir() function
Use fs.MustReadDir() instead of os.ReadDir() across the code in order to reduce the code verbosity.
The fs.MustReadDir() logs the error with the directory name and the call stack on error
before exit. This information should be enough for debugging the cause of the error.
2023-04-14 22:11:40 -07:00
Aliaksandr Valialkin
292b6a851f
all: consistently use fs.MustClose() for closing lock files 2023-04-14 20:16:11 -07:00
Aliaksandr Valialkin
a7678350ad
lib/fs: convert CreateFlockFile to MustCreateFlockFile
Callers of CreateFlockFile log the returned err and exit.
It is better to log the error inside the MustCreateFlockFile together with the path
to the specified directory and the call stack. This simplifies
the code at the callers' side while leaving the debuggability at the same level.
2023-04-14 19:51:52 -07:00
Aliaksandr Valialkin
f75b1b7a53
lib/fs: add Must prefix to CopyDirectory and CopyFile functions
Callers of these functions log the returned error and then exit.
Let's log the error with the call stack inside the function itself.
This simplifies the code at callers' side, while leaving the same
level of debuggability in case of errors.
2023-04-13 23:04:37 -07:00
Aliaksandr Valialkin
75b74aa837
lib/fs: rename SymlinkRelative to MustSymlinkRelative
Callers of this function log the returned error and then exit.
Let's log the error with the call stack inside the function itself.
This simplifies the code at callers' side, while leaving the same
level of debuggability in case of errors.
2023-04-13 22:53:11 -07:00
Aliaksandr Valialkin
c4638553a3
lib/fs: rename WriteFileAtomically to MustWriteAtomic
Callers of this function log the returned error and exit.
So let's just log the error with the given filepath and the call stack
inside the function itself and then exit. This simplifies the code
at callers' place while leaves the same level of debuggability in case of errors.
2023-04-13 22:43:30 -07:00
Aliaksandr Valialkin
aac3dccfd1
lib/fs: replace MkdirAllIfNotExist->MustMkdirIfNotExist and MkdirAllFailIfExist->MustMkdirFailIfExist
Callers of these functions log the returned error and then exit. The returned error already contains the path
to directory, which was failed to be created. So let's just log the error together with the call stack
inside these functions. This leaves the debuggability of the returned error at the same level
while allows simplifying the code at callers' side.

While at it, properly use MustMkdirFailIfExist instead of MustMkdirIfNotExist inside inmemoryPart.MustStoreToDisk().
It is expected that the inmemoryPart.MustStoreToDick() must fail if there is already a directory under the given path.
2023-04-13 22:22:08 -07:00
Haleygo
7ee32ed06a
fix sort pendingDateMetricsIDs (#4102) 2023-04-10 10:16:36 -07:00
Aliaksandr Valialkin
52734c71fc
lib/storage: use shorter code after 03bde173b7 2023-04-02 21:35:34 -07:00
faceair
03bde173b7
lib/storage: fix reuse pendingMetricRow (#4049) 2023-04-02 21:28:43 -07:00
Roman Khavronenko
5f95f9d453
lib/storage: check for free disk space before opening tables (#4035)
* lib/storage: check for free disk space before opening tables

We check for free disk space before call to `openTable`,
so `Storage` can be set to ReadOnly before mergeWorkers start.

Before the change, there was a chance that merges will start
even if Storage has to start in ReadOnly mode because of
`-storage.minFreeDiskSpaceBytes` limit.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4023
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/storage: chore

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update lib/storage/storage.go

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-03-31 23:50:56 -07:00
Aliaksandr Valialkin
f6c36d5dfd
lib/storage: consistently use OS-independent separator in file paths
This is needed for Windows support, which uses `\` instead of `/` as file separator

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2023-03-25 14:34:36 -07:00
Aliaksandr Valialkin
fc3d826d7f
all: add Windows build for VictoriaMetrics
This commit changes background merge algorithm, so it becomes compatible with Windows file semantics.

The previous algorithm for background merge:

1. Merge source parts into a destination part inside tmp directory.
2. Create a file in txn directory with instructions on how to atomically
   swap source parts with the destination part.
3. Perform instructions from the file.
4. Delete the file with instructions.

This algorithm guarantees that either source parts or destination part
is visible in the partition after unclean shutdown at any step above,
since the remaining files with instructions is replayed on the next restart,
after that the remaining contents of the tmp directory is deleted.

Unfortunately this algorithm doesn't work under Windows because
it disallows removing and moving files, which are in use.

So the new algorithm for background merge has been implemented:

1. Merge source parts into a destination part inside the partition directory itself.
   E.g. now the partition directory may contain both complete and incomplete parts.
2. Atomically update the parts.json file with the new list of parts after the merge,
   e.g. remove the source parts from the list and add the destination part to the list
   before storing it to parts.json file.
3. Remove the source parts from disk when they are no longer used.

This algorithm guarantees that either source parts or destination part
is visible in the partition after unclean shutdown at any step above,
since incomplete partitions from step 1 or old source parts from step 3 are removed
on the next startup by inspecting parts.json file.

This algorithm should work under Windows, since it doesn't remove or move files in use.
This algorithm has also the following benefits:

- It should work better for NFS.
- It fits object storage semantics.

The new algorithm changes data storage format, so it is impossible to downgrade
to the previous versions of VictoriaMetrics after upgrading to this algorithm.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3236
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3821
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2023-03-19 23:28:26 -07:00
Aliaksandr Valialkin
8aeee8bcca
lib/{fs,mergeset,storage}: substitute os.Open()+os.File.Readdir() with os.ReadDir()
This simplifies code a bit
2023-03-17 21:03:52 -07:00