Aliaksandr Valialkin
ea4c828bae
lib/storage: remove duplicate MetricIDs in tag->metricIDs
items before writing them into inverted index
2019-09-25 17:55:13 +03:00
Aliaksandr Valialkin
aebc45ad26
lib/{mergeset,storage}: do not cache inverted index blocks containing tag->metricIDs
items
...
This should reduce the amounts of used RAM during queries with filters over big number of time series.
2019-09-25 14:02:15 +03:00
Aliaksandr Valialkin
b986516fbe
lib/storage: create and use lib/uint64set
instead of map[uint64]struct{}
...
This should improve inverted index search performance for filters matching big number of time series,
since `lib/uint64set.Set` is faster than `map[uint64]struct{}` for both `Add` and `Has` calls.
See the corresponding benchmarks in `lib/uint64set`.
2019-09-24 21:17:55 +03:00
Aliaksandr Valialkin
ef2296e420
lib/storage: typo fix: return dstData instead of data from mergeTagToMetricIDsRows
2019-09-24 19:32:34 +03:00
Aliaksandr Valialkin
a6086cde78
lib/storage: limit the number of metricIDs in tag->metricIDs row
...
This reduces the overhead on index and metaindex in lib/mergeset
2019-09-24 00:49:51 +03:00
Aliaksandr Valialkin
c9063ece66
lib/storage: share tsids across all the partSearch instances
...
This should reduce memory usage when big number of time series matches the given query.
2019-09-23 22:35:15 +03:00
Aliaksandr Valialkin
4e26ad869b
lib/{storage,mergeset}: verify PrepareBlock callback results
...
Do not touch the first and the last item passed to PrepareBlock
in order to preserve sort order of mergeset blocks.
2019-09-23 20:43:13 +03:00
Aliaksandr Valialkin
0adebae1f8
lib/storage: generate the first tag->metricIDs item in a mergeset block with a single metricID
...
The first item from each mergeset block goes into index (lib/mergeset.blockHeader),
so it must be short in order to reduce index size.
2019-09-22 19:21:33 +03:00
Aliaksandr Valialkin
0686ac52c3
lib/{storage,mergeset}: merge tag->metricID
rows into tag->metricIDs
rows for common tag
values
...
This should improve lookup performance if the same `label=value` pair exists
in big number of time series.
This should also reduce memory usage for mergeset data cache, since `tag->metricIDs` rows
occupy less space than the original `tag->metricID` rows.
2019-09-20 22:06:41 +03:00
Aliaksandr Valialkin
a544f49c2b
lib/storage: optimize selecting all the metricIDs by scanning MetricID->TSID entries instead of tag->MetricID entries
...
The number of MetricID->TSID entries is smaller than the number of tag->MetricID entries
and MetricID->TSID entries are usually shorter than tag->MetricID entries.
This should improve performance when selecting all the metricIDs.
2019-09-20 11:54:10 +03:00
Aliaksandr Valialkin
a84fe76677
lib/storage: use sort.Sort instead of sort.slice in getSortedMetricIDs
2019-09-19 20:07:22 +03:00
Aliaksandr Valialkin
3a697a935a
lib/storage: skip duplicate call to intersectMetricIDsWithTagFilter on zero successful intersects
2019-09-19 17:49:56 +03:00
Aliaksandr Valialkin
3d83f5d334
lib/storage: mark tag filter returning errFallbackToMetricNameMatch as useless
...
This will save CPU on subsequent calls for this filter
2019-09-18 19:10:32 +03:00
Aliaksandr Valialkin
8d35718dc6
lib/storage: properly construct keys for uselessTagFiltersCache and register useless negative tag filters there
2019-09-17 23:20:27 +03:00
Aliaksandr Valialkin
bad53e4207
lib/mergeset: dynamically calculate the maximum number of items per part, which can be cached in OS page cache
2019-09-11 14:53:45 +03:00
Aliaksandr Valialkin
9eb5de334f
lib/storage: typo fix
2019-09-04 19:58:01 +03:00
Aliaksandr Valialkin
16dd145586
lib/storage: remove duplicate tag keys on MetricName.Marshal
call
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/172
2019-09-04 18:13:45 +03:00
Aliaksandr Valialkin
e1d76ec1f3
lib/storage: invalidate tagFilters -> TSIDS
cache when newly added index data becomes visible to search
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/163
2019-08-29 15:08:35 +03:00
Aliaksandr Valialkin
9196c085a7
all: port to FreeBSD on GOARCH=amd64
2019-08-28 01:19:23 +03:00
Aliaksandr Valialkin
2655220c58
lib/storage: go fmt
2019-08-27 14:29:51 +03:00
Aliaksandr Valialkin
bf915fc0db
lib/storage: report proper maxMetrics limit when more than -search.maxUniqueTimeseries series match the given filters
2019-08-27 14:21:42 +03:00
Aliaksandr Valialkin
2fc157ff7a
lib/storage: properly handle (?i)
in the tag filter regexp
...
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/161
2019-08-26 00:44:45 +03:00
Aliaksandr Valialkin
0dc0006f34
lib/storage: calculate the maximum number of rows per small part from -memory.allowedPercent
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/159
This simplifies error detection additionally to the `vm_rows_ignored_total` counters.
2019-08-25 15:31:47 +03:00
Aliaksandr Valialkin
4b688fffee
lib/storage: calculate the maximum number of rows per small part from -memory.allowedPercent
...
This should improve query speed over recent data on machines with big amounts of RAM
2019-08-25 14:41:12 +03:00
Aliaksandr Valialkin
1402a6b981
lib/storage: properly limit the number of output rows in small and big parts storage
...
Previously small parts storage didn't take into account the available disk space for big parts.
2019-08-25 14:41:12 +03:00
Aliaksandr Valialkin
3308279c4e
lib/storage: remove outdated comment on maxRowsPerSmallPart
...
The commend became outdated after the commit ed6ac1a5df027f0dfc22448e3b27c26b6f77c67a,
which stops merging of small parts on graceful shutdown instead of waiting
for their completion.
2019-08-25 13:47:32 +03:00
Aliaksandr Valialkin
8c03a8c4b4
app/vminsert: allow setting the maximum number of labels per time series via -maxLabelsPerTimeseries
2019-08-23 08:45:26 +03:00
Aliaksandr Valialkin
380cae23a0
lib/storage: add benchmarks for regexp filter match / mismatch
...
These benchmarks allow estimate the performance of regexp filters in promql
2019-08-22 16:36:42 +03:00
Aliaksandr Valialkin
4f738c8a15
lib/storage: try slower path for searching the tag filter with the minimum number of matching time series before giving up with increase -search.maxUniqueTimeseries
error
2019-08-19 16:04:21 +03:00
Aliaksandr Valialkin
c23b66a1ad
lib/storage: pre-allocate memory for blockHeader slice in unmarshalBlockHeaders
...
This reduces memory usage and memory fragmentation when working with big number of time series
2019-08-19 12:46:33 +03:00
Aliaksandr Valialkin
5b41122292
lib/storage: properly cache tagFilters -> TSIDs entries from historical index
2019-08-14 02:29:58 +03:00
Aliaksandr Valialkin
964c296f96
lib/storage: compress contents of cache for tagFilters -> TSIDs
...
This should increase cache capacity
2019-08-14 02:29:52 +03:00
Aliaksandr Valialkin
09fc6e22e5
all: use workingsetcache instead of fastcache
...
This should reduce the amount of RAM required for processing time series
with non-zero churn rate.
The previous cache behavior can be restored with `-cache.oldBehavior` command-line flag.
2019-08-13 21:39:34 +03:00
Aliaksandr Valialkin
ec1b185991
lib/storage: remove broken BenchmarkIndexDBSearchTSIDs
2019-08-13 20:22:08 +03:00
Aliaksandr Valialkin
0967683ae9
lib: move common code for creating flock.lock file into fs.CreateFlockFile
2019-08-13 01:45:46 +03:00
Aliaksandr Valialkin
5d8d110010
lib/fs: atomically create file with the given contents on WriteFileAtomically
...
This should prevent from `transaction` and `metadata.json` files corruption
on unclean shutdown such as OOM, `kill -9`, power loss, etc.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/148
2019-08-12 15:02:55 +03:00
Aliaksandr Valialkin
0b488f1e37
lib/storage: do not change timestamps to constant rate if values are constant or have constant delta
...
This breaks the original timestamps, which results in issues like
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/120 and
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/141 .
2019-08-06 15:40:07 +03:00
Aliaksandr Valialkin
b8bb74ffc6
app/vmstorage: add vm_concurrent_addrows_*
metrics for tracking concurrency for Storage.AddRows calls
...
Track also the number of dropped rows due to the exceeded timeout
on concurrency limit for Storage.AddRows. This number is tracked in `vm_concurrent_addrows_dropped_rows_total`
2019-08-06 15:08:33 +03:00
Aliaksandr Valialkin
8822079b77
lib/storage: properly reset partSearch.fetchData
in partSearch.reset
2019-08-05 09:56:06 +03:00
Aliaksandr Valialkin
47e4b50112
app/vmselect: optimize /api/v1/series
by skipping storage data
...
Fetch and process only time series metainfo.
2019-08-04 23:01:28 +03:00
Aliaksandr Valialkin
c14fd6c43f
lib/storage: typo fixes after a77e88db7d
2019-07-30 15:38:52 +03:00
Aliaksandr Valialkin
a77e88db7d
lib/storage: fix matching against tag filter with empty name
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/137
2019-07-30 15:15:09 +03:00
Aliaksandr Valialkin
f586e1f83c
lib/storage: add metrics for calculating skipped rows outside the retention
...
The metrics are:
- vm_too_big_timestamp_rows_total
- vm_too_small_timestamp_rows_total
2019-07-26 14:11:01 +03:00
Aliaksandr Valialkin
e75d5f47c4
lib/storage: remove unused function isTooBigTimeRangeForDateMetricIDs
2019-07-12 02:28:23 +03:00
Aliaksandr Valialkin
fc90ebf43c
lib/storage: do not reduce maxMetrics
on time ranges exceeding maxDaysForDateMetricIDs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/95
2019-07-12 02:20:34 +03:00
Aliaksandr Valialkin
2bd1a01d1a
lib/storage: do not pollute inverted index with data for samples outside the retention period
2019-07-11 17:04:56 +03:00
Aliaksandr Valialkin
d031e04023
lib/storage: use fast path for orSuffix when searching for metricIDs against plain tag value
2019-07-11 14:48:37 +03:00
Aliaksandr Valialkin
43ea4ce428
lib/storage: remember and skip individual tag filters matching too many metrics
...
This saves CPU time by skipping useless matching for individual tag filters.
2019-07-11 14:48:30 +03:00
Aliaksandr Valialkin
1fe6d784d8
all: consistency renaming: bytesSize -> sizeBytes
2019-07-10 00:47:36 +03:00
Aliaksandr Valialkin
56c154f45b
all: add vm_data_size_bytes
metrics for easy monitoring of on-disk data size and on-disk inverted index size
2019-07-04 19:42:30 +03:00