Aliaksandr Valialkin
0d41d933e9
lib/mergeset: reduce the parts threshold before starting assisted merges
...
This should improve query speed in the general case.
This is a follow-up for d1af6046c7
2022-12-13 09:13:49 -08:00
Aliaksandr Valialkin
d1af6046c7
lib/{mergeset,storage}: do not block small merges by pending big merges - assist with small merges instead
...
Blocked small merges may result in a large number of small parts, which, in turn,
may increase CPU and memory usage during queries, since queries need to inspect
all the existing small parts.
The issue has been introduced in 8189770c50
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2022-12-12 17:00:50 -08:00
Aliaksandr Valialkin
5b9e6b9d24
lib/storage: follow-up after 7c0ae3a86a
...
- Update docs at https://docs.victoriametrics.com/#deduplication
- Optimize the deduplication loop a bit
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333
2022-12-08 18:16:57 -08:00
Roman Khavronenko
7c0ae3a86a
lib/storage: keep sample with the biggest value on timestamp conflict ( #3421 )
...
The change keeps the raw sample with the biggest value among samples with identical
timestamps in each `-dedup.minScrapeInterval` discrete interval
when deduplication is enabled.
```
benchstat old.txt new.txt
name                                         old time/op    new time/op    delta
DeduplicateSamples/minScrapeInterval=1s-10      817ns ± 2%     832ns ± 3%     ~     (p=0.052 n=10+10)
DeduplicateSamples/minScrapeInterval=2s-10     1.56µs ± 1%    2.12µs ± 0%  +35.19%  (p=0.000 n=9+7)
DeduplicateSamples/minScrapeInterval=5s-10     1.32µs ± 3%    1.65µs ± 2%  +25.57%  (p=0.000 n=10+10)
DeduplicateSamples/minScrapeInterval=10s-10    1.13µs ± 2%    1.50µs ± 1%  +32.85%  (p=0.000 n=10+10)

name                                         old speed      new speed      delta
DeduplicateSamples/minScrapeInterval=1s-10   10.0GB/s ± 2%   9.9GB/s ± 3%     ~     (p=0.052 n=10+10)
DeduplicateSamples/minScrapeInterval=2s-10   5.24GB/s ± 1%  3.87GB/s ± 0%  -26.03%  (p=0.000 n=9+7)
DeduplicateSamples/minScrapeInterval=5s-10   6.22GB/s ± 3%  4.96GB/s ± 2%  -20.37%  (p=0.000 n=10+10)
DeduplicateSamples/minScrapeInterval=10s-10  7.28GB/s ± 2%  5.48GB/s ± 1%  -24.74%  (p=0.000 n=10+10)
```
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-08 18:06:11 -08:00
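The rule described in 7c0ae3a86a can be illustrated with a minimal Go sketch. It is not the code from lib/storage: the `sample` type, the `dedupSamples` and `bucket` names, and the way intervals are aligned (`ts / intervalMs` for non-negative millisecond timestamps) are assumptions made for illustration only. The sketch keeps the latest sample per discrete interval and, when several samples share that latest timestamp, keeps the one with the biggest value.

```go
package main

// sample is a hypothetical (timestamp, value) pair; timestamps are in milliseconds.
type sample struct {
	ts  int64
	val float64
}

// bucket returns the index of the discrete interval the timestamp falls into.
// Assumption: timestamps are non-negative and intervals are [k*d, (k+1)*d).
func bucket(ts, intervalMs int64) int64 {
	return ts / intervalMs
}

// dedupSamples keeps a single sample per discrete interval of intervalMs.
// It assumes src is sorted by timestamp. Within an interval the sample with
// the latest timestamp wins; if several samples share that timestamp,
// the one with the biggest value is kept.
func dedupSamples(src []sample, intervalMs int64) []sample {
	if intervalMs <= 0 || len(src) < 2 {
		return src
	}
	dst := make([]sample, 0, len(src))
	for _, s := range src {
		n := len(dst)
		if n > 0 && bucket(dst[n-1].ts, intervalMs) == bucket(s.ts, intervalMs) {
			last := &dst[n-1]
			if s.ts > last.ts || (s.ts == last.ts && s.val > last.val) {
				*last = s
			}
			continue
		}
		dst = append(dst, s)
	}
	return dst
}
```

The extra value comparison on equal timestamps is the kind of additional work that would be consistent with the slowdown shown in the benchstat output above for intervals larger than 1s.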
Aliaksandr Valialkin
d99d222f0a
lib/{storage,mergeset}: log the duration for flushing in-memory parts on graceful shutdown
2022-12-05 21:30:48 -08:00
Aliaksandr Valialkin
8189770c50
all: add -inmemoryDataFlushInterval command-line flag for controlling the frequency of saving in-memory data to disk
...
The main purpose of this command-line flag is to extend the lifetime of low-end flash storage,
which can perform only a limited number of write operations. Such flash storage is usually
installed on Raspberry Pi or similar appliances.
For example, `-inmemoryDataFlushInterval=1h` reduces the frequency of disk write operations
to once per hour, provided that one hour's worth of ingested data fits the limit for in-memory data.
The in-memory data is searchable in the same way as the data stored on disk.
VictoriaMetrics automatically flushes the in-memory data to disk on graceful shutdown via SIGINT signal.
The in-memory data is lost on unclean shutdown (hardware power loss, OOM crash, SIGKILL).
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2022-12-05 15:16:14 -08:00
Aliaksandr Valialkin
544ea89f91
lib/{mergeset,storage}: start background workers via startBackgroundWorkers() function
2022-12-04 00:01:04 -08:00
Aliaksandr Valialkin
33dda2809b
lib/mergeset: panic when too long item is passed to Table.AddItems()
2022-12-03 23:32:16 -08:00
Aliaksandr Valialkin
932c1f90ae
lib/storage: remove duplicate logging for filepath on errors
2022-12-03 23:15:22 -08:00
Aliaksandr Valialkin
044a304adb
lib/storage: pass a single arg - rowsPerBlock - to getCompressLevel() function instead of two args
2022-12-03 23:10:16 -08:00
Aliaksandr Valialkin
cb44976716
lib/{storage,mergeset}: use a single sync.WaitGroup for all background workers
...
This simplifies the code
2022-12-03 23:03:08 -08:00
Aliaksandr Valialkin
28e6d9e1ff
lib/storage: properly pass retentionMsecs to OpenStorage() at TestIndexDBRepopulateAfterRotation
2022-12-03 23:02:10 -08:00
Aliaksandr Valialkin
343c69fc15
lib/{mergeset,storage}: pass compressLevel to blockStreamWriter.InitFromInmemoryPart
...
This allows packing in-memory blocks with different compression levels
depending on their contents. This may reduce memory usage.
2022-12-03 22:46:48 -08:00
Aliaksandr Valialkin
f3e3a3daeb
lib/{mergeset,storage}: take into account byte slice capacity when returning the size of in-memory part
...
This results in more accurate reporting of memory usage for in-memory parts
2022-12-03 22:30:36 -08:00
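The idea behind f3e3a3daeb can be shown with a tiny Go sketch. The `inmemoryBuf` type and its methods are hypothetical and are not taken from lib/mergeset or lib/storage; the point is only that a byte slice occupies `cap` bytes of memory even when `len` is much smaller.

```go
package main

// inmemoryBuf is a hypothetical stand-in for a buffer held by an in-memory part.
type inmemoryBuf struct {
	data []byte
}

// sizeByLen reports only the bytes currently in use.
func (b *inmemoryBuf) sizeByLen() int {
	return len(b.data)
}

// sizeByCap reports the bytes actually allocated for the slice,
// which is what matters for memory usage accounting.
func (b *inmemoryBuf) sizeByCap() int {
	return cap(b.data)
}
```

For a buffer created with `make([]byte, 10, 64)`, `sizeByLen` returns 10 while `sizeByCap` returns 64.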
Aliaksandr Valialkin
45299efe22
lib/{storage,mergeset}: consistency rename: `flushRaw{Rows,Items}` -> `flushPending{Rows,Items}`
2022-12-03 22:17:46 -08:00
Aliaksandr Valialkin
5ca58cc4fb
lib/storage: optimization: do not scan block for rows outside retention if it is covered by the retention
2022-12-03 22:14:12 -08:00
Aliaksandr Valialkin
152ac564ab
lib/storage: remove logging redundant path values in a single error message
2022-12-03 22:13:13 -08:00
Aliaksandr Valialkin
05c65bd83f
lib/storage: speed up search for data block for the given tsids
...
Use binary search instead of linear scan for looking up the needed
data block inside index block.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
2022-12-03 20:58:32 -08:00
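The switch from a linear scan to binary search in 05c65bd83f can be sketched in Go as follows. This is an illustration under assumptions, not the lib/storage code: `tsid` is reduced to a single integer, `blockHeader` and `findFirstBlock` are hypothetical names, and the headers inside an index block are assumed to be sorted by `tsid`.

```go
package main

import "sort"

// tsid is a simplified, hypothetical stand-in for a time series ID.
type tsid uint64

// blockHeader is a hypothetical block header.
// Headers inside an index block are assumed to be sorted by id.
type blockHeader struct {
	id tsid
}

// findFirstBlock returns the index of the first header with id >= target,
// or len(hdrs) if there is no such header. It runs in O(log n)
// instead of the O(n) needed by a linear scan.
func findFirstBlock(hdrs []blockHeader, target tsid) int {
	return sort.Search(len(hdrs), func(i int) bool {
		return hdrs[i].id >= target
	})
}
```

The caller still has to check whether the header at the returned index actually matches the wanted ID, since `sort.Search` returns the insertion point when there is no exact match.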
Aliaksandr Valialkin
299285b147
lib/storage: fix TestUpdateCurrHourMetricIDs test when it runs during the first hour of the day in UTC
2022-12-02 18:52:37 -08:00
Aliaksandr Valialkin
e9636b4c69
lib/{mergeset,storage}: re-use the code for removing isInMerge flag at parts
...
Move the common code into releasePartsToMerge() method and consistently use it throughout the code.
2022-12-02 18:52:37 -08:00
匠心零度
fa0ce10275
lib/storage: remove extra error check ( #3396 )
2022-11-28 16:43:31 -08:00
Aliaksandr Valialkin
daa70e6560
lib/storage: follow-up for 790768f20b
...
- Document the bugfix at docs/CHANGELOG.md
- Simplify the bugfix a bit
2022-11-07 14:04:08 +02:00
Aliaksandr Valialkin
f9dc3da9e2
lib/storage: typo fix after 32d48f8dfbb03174858c00bdfe6d9d22431dc8d8
2022-11-07 13:58:27 +02:00
Aliaksandr Valialkin
dd88c628aa
lib/storage: remove unused isFull field from hourMetricIDs struct
2022-11-07 13:58:26 +02:00
Łukasz Marszał
790768f20b
Fix issue-3309 - currHourMetricIDs shouldn't contain metrics from prev hour ( #3320 )
...
* fix issue-3309 currHourMetricIDs shouldn't contain metrics from prev hour
* Update storage.go
2022-11-07 13:55:37 +02:00
Aliaksandr Valialkin
c4265322f4
lib/fs: add canOverwrite arg to WriteFileAtomically, which allows atomically overwriting the file if it already exists
2022-10-26 01:07:34 +03:00
Aliaksandr Valialkin
8e998aa1a1
lib/storage: add support for retention filters (aka multiple retentions for distinct sets of time series)
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/143
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/289
2022-10-24 16:40:20 +03:00
Aliaksandr Valialkin
dba218a8ce
lib/storage: skip blocks outside the configured retention during search
...
Blocks outside the configured retention are eventually deleted during background merge.
But such blocks may reside in the storage for a long time until the background merge runs.
Previously VictoriaMetrics could spend additional CPU time on processing such blocks
during search queries. Now these blocks are skipped.
2022-10-24 02:52:44 +03:00
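The check described in dba218a8ce can be sketched in Go as a simple filter. The `block` type, the `filterByRetention` name, and the millisecond-based arguments are assumptions for illustration; the real search code in lib/storage is organized differently.

```go
package main

// block is a hypothetical block descriptor with its time range in milliseconds.
type block struct {
	minTs, maxTs int64
}

// filterByRetention returns the blocks that still hold data inside the retention.
// A block is skipped when even its newest sample is older than the retention deadline.
func filterByRetention(blocks []block, nowMs, retentionMs int64) []block {
	deadline := nowMs - retentionMs
	var kept []block
	for _, b := range blocks {
		if b.maxTs < deadline {
			// The whole block is outside the retention, so there is nothing to search in it.
			continue
		}
		kept = append(kept, b)
	}
	return kept
}
```

A block whose `minTs` is already at or past the deadline lies fully inside the retention, so a per-row retention check can be skipped for it as well.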
Aliaksandr Valialkin
e2f0b76ebf
lib/storage: do not pass retentionMsecs and isReadOnly args explicitly - access them via Storage arg
...
This makes code easier to read.
This is a follow-up after d2d30581a0
2022-10-24 01:31:04 +03:00
Aliaksandr Valialkin
89a1108b1a
lib/storage: small code cleanups
2022-10-24 01:17:47 +03:00
Aliaksandr Valialkin
05512fdd74
lib/storage: re-use newTestStorage() instead of manually initializing Storage mock
...
This is a follow-up for d2d30581a0
2022-10-23 16:24:00 +03:00
Aliaksandr Valialkin
d2d30581a0
lib/storage: pass Storage to table and partition instead of getDeletedMetricIDs callback
...
This improves code readability a bit.
2022-10-23 16:10:04 +03:00
Aliaksandr Valialkin
54f35c175c
lib/storage: small refactoring: move retentionDeadline to blockStreamMerger
...
This allows defining per-block retention in the future by updating the getRetentionDeadline function
2022-10-23 16:10:02 +03:00
Aliaksandr Valialkin
187e294a53
lib/storage: use a single reference to the currently merged block - bsm.Block during the block merge loop
2022-10-23 14:08:57 +03:00
Aliaksandr Valialkin
d0a9ca1bc2
lib/storage: properly pass uint64 constant to fmt.Errorf on 32-bit platforms
2022-10-23 12:48:00 +03:00
Aliaksandr Valialkin
5e4dfe50c6
lib/storage: substitute searchTSIDs functions with more lightweight searchMetricIDs function
...
The searchTSIDs function searched for metricIDs matching the given tag filters
and then located the corresponding TSID entries for the found metricIDs.
The TSID entries aren't needed when searching for time series names (aka MetricName),
so this commit removes the unneeded TSID search from the implementation of /api/v1/series API.
This improves performance of /api/v1/series calls.
This commit also improves performance a bit for /api/v1/query and /api/v1/query_range calls,
since now these calls cache small metricIDs instead of big TSID entries
in the indexdb/tagFilters cache (now this cache is named indexdb/tagFiltersToMetricIDs)
without the need to compress the saved entries in order to save cache space.
This commit also removes the concurrency limiter for searching for matching time series,
which was introduced in 8f16388428, since the concurrency
for all the read queries is already limited with the -search.maxConcurrentRequests command-line flag.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2022-10-23 12:23:47 +03:00
Aliaksandr Valialkin
4128ad71e2
lib/storage: move common code to newRawRowsBlock() function
2022-10-21 14:46:55 +03:00
Aliaksandr Valialkin
b5674164c6
lib/storage: simplify code a bit after 3f5959c053
2022-10-21 14:39:27 +03:00
Aliaksandr Valialkin
fd7c86ae25
lib/{mergeset,storage}: simplify the code a bit after ae55ad8749
2022-10-21 14:33:03 +03:00
Aliaksandr Valialkin
99d67ac8ad
lib/storage: validate timestamps in the block only if they use an encoding that needs validation
...
This reduces CPU usage when validating timestamps is unnecessary.
This is a follow-up for 5fa9525498
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2998
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3011
2022-10-21 00:52:32 +03:00
Aliaksandr Valialkin
3f5959c053
lib/storage: try generating initial parts from inmemory rows with identical sizes under high ingestion rate
...
This should improve background merge rate under high load a bit
2022-10-20 23:28:24 +03:00
Aliaksandr Valialkin
150e99d403
lib/{mergeset,storage}: avoid unaligned 64-bit atomic operation panic on 32-bit platforms
...
The panic has been introduced in 68f3a02589.
While at it, add padding to shard structs in order to avoid false sharing on modern CPUs.
This should improve scalability on systems with many CPU cores.
2022-10-20 16:25:43 +03:00
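Both points from 150e99d403 can be illustrated with a short Go sketch. The type and function names are hypothetical, and the 64-byte cache line size is an assumption (some CPUs use 128 bytes). Placing the 64-bit counter first in the struct keeps it 64-bit aligned on 32-bit platforms, and the trailing padding rounds each shard up to a whole cache line so that neighbouring shards in a slice do not share one.

```go
package main

import (
	"sync/atomic"
	"unsafe"
)

// cacheLineSize is an assumption: 64 bytes is typical on x86-64.
const cacheLineSize = 64

// shardCounters holds the data without padding.
// n is the first field, so it is 64-bit aligned whenever the struct itself is.
type shardCounters struct {
	n uint64
}

// paddedShard rounds shardCounters up to a multiple of the cache line size.
// Every element of a []paddedShard then starts at a 64-bit aligned offset
// and does not share a cache line with its neighbours.
type paddedShard struct {
	shardCounters
	_ [cacheLineSize - unsafe.Sizeof(shardCounters{})%cacheLineSize]byte
}

// inc atomically increments the shard counter and returns the new value.
func (s *paddedShard) inc() uint64 {
	return atomic.AddUint64(&s.n, 1)
}

// paddedShardSize reports the size of a single padded shard in bytes.
func paddedShardSize() uintptr {
	return unsafe.Sizeof(paddedShard{})
}
```

The cost of this layout is extra memory per shard, which is usually negligible compared to the contention it avoids when many CPU cores update adjacent shards.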
Aliaksandr Valialkin
fb50730ba7
lib/storage: double the number of rawRows shards on multi-core systems
...
This should increase data ingestion scalability on multi-core systems at the cost of slightly higher memory usage
2022-10-17 18:19:51 +03:00
Aliaksandr Valialkin
ae55ad8749
lib/{storage,mergeset}: do not hold per-shard lock in fast path when adding per-shard items to the flush list
2022-10-17 18:01:26 +03:00
Aliaksandr Valialkin
db16759c68
lib/storage: optimize matching speed for non-trivial regexp filters
...
Wrap re.Match into bytesutil.FastStringMatcher.
This increases performance for `{foo=~"complex_regex_here"}` filters
by up to 4x.
2022-10-01 12:06:06 +03:00
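The speedup in db16759c68 comes from not re-running a complex regexp on label values that were already checked. The Go sketch below shows that general idea only; it is not the implementation of bytesutil.FastStringMatcher. The `cachingMatcher` type and its constructors are hypothetical, and the cache here is unbounded, which a production version would have to address.

```go
package main

import (
	"regexp"
	"sync"
)

// cachingMatcher is a hypothetical simplification: it remembers the result
// of an expensive match function for every string it has seen.
type cachingMatcher struct {
	m     sync.Map // string -> bool
	match func(s string) bool
}

// newCachingMatcher wraps the given match function with a result cache.
func newCachingMatcher(match func(s string) bool) *cachingMatcher {
	return &cachingMatcher{match: match}
}

// newRegexpMatcher wraps a compiled regexp with a result cache.
func newRegexpMatcher(expr string) *cachingMatcher {
	re := regexp.MustCompile(expr)
	return newCachingMatcher(re.MatchString)
}

// Match returns the cached result for s if present;
// otherwise it calls the wrapped function and stores the result.
func (cm *cachingMatcher) Match(s string) bool {
	if v, ok := cm.m.Load(s); ok {
		return v.(bool)
	}
	ok := cm.match(s)
	cm.m.Store(s, ok)
	return ok
}
```

This pays off when the same label values are matched repeatedly, which is the common case for filters such as `{foo=~"complex_regex_here"}`.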
Aliaksandr Valialkin
042a532f70
lib/storage: substitute remaining calls to fs.MustRemoveAll with fs.MustRemoveDirAtomic
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:38 +03:00
Aliaksandr Valialkin
68e32b0764
lib/storage: atomically remove parts inside partitions
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:38 +03:00
Aliaksandr Valialkin
340ada871d
lib/storage: atomically remove partitions, which went outside the configured retention
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:37 +03:00
Aliaksandr Valialkin
978dcb4574
lib/storage: properly remove cache directory contents if reset_cache_on_startup file is located there
...
Previously the cache directory itself was removed. This could result in an error when the cache directory
is mounted on a separate filesystem.
2022-09-13 16:17:36 +03:00
Aliaksandr Valialkin
5f28ca1f42
lib/storage: atomically remove snapshot directories
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:36 +03:00