Commit graph

76 commits

Author SHA1 Message Date
Aliaksandr Valialkin
67a2bcb98a lib/{storage,mergeset}: verify PrepareBlock callback results
Do not touch the first and the last item passed to PrepareBlock
in order to preserve sort order of mergeset blocks.
2019-09-23 20:46:33 +03:00
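The check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from lib/mergeset: the `prepareBlockFunc` type and the `checkedPrepareBlock` helper are hypothetical names, and items are assumed to be plain byte slices sorted in ascending order.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"sort"
)

// prepareBlockFunc is a hypothetical stand-in for a PrepareBlock callback:
// it may rewrite the items of a block and returns the resulting items.
type prepareBlockFunc func(items [][]byte) [][]byte

// checkedPrepareBlock calls cb and verifies that the first and the last item
// are left untouched and that the returned items are still sorted.
func checkedPrepareBlock(cb prepareBlockFunc, items [][]byte) ([][]byte, error) {
	if len(items) == 0 {
		return items, nil
	}
	// Copy the boundary items up front, since cb may modify items in place.
	first := append([]byte(nil), items[0]...)
	last := append([]byte(nil), items[len(items)-1]...)

	out := cb(items)
	if len(out) == 0 {
		return nil, errors.New("callback returned no items")
	}
	if !bytes.Equal(out[0], first) {
		return nil, errors.New("callback changed the first item")
	}
	if !bytes.Equal(out[len(out)-1], last) {
		return nil, errors.New("callback changed the last item")
	}
	sorted := sort.SliceIsSorted(out, func(i, j int) bool {
		return bytes.Compare(out[i], out[j]) < 0
	})
	if !sorted {
		return nil, errors.New("callback returned unsorted items")
	}
	return out, nil
}

func main() {
	items := [][]byte{[]byte("a"), []byte("b"), []byte("c")}
	// A well-behaved callback: drops the middle item, keeps the boundaries.
	dropMiddle := func(in [][]byte) [][]byte {
		return [][]byte{in[0], in[len(in)-1]}
	}
	out, err := checkedPrepareBlock(dropMiddle, items)
	fmt.Println(len(out), err)
}
```

Keeping the boundary items fixed means neighbouring blocks stay ordered relative to each other, whatever the callback does to the items in between.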
Aliaksandr Valialkin
d2ed8cb0b2 lib/storage: generate the first tag->metricIDs item in a mergeset block with a single metricID
The first item from each mergeset block goes into index (lib/mergeset.blockHeader),
so it must be short in order to reduce index size.
2019-09-22 19:37:50 +03:00
Aliaksandr Valialkin
7d13c31566 lib/{storage,mergeset}: merge tag->metricID rows into tag->metricIDs rows for common tag values
This should improve lookup performance if the same `label=value` pair exists
in a big number of time series.
This should also reduce memory usage for mergeset data cache, since `tag->metricIDs` rows
occupy less space than the original `tag->metricID` rows.
2019-09-20 22:06:23 +03:00
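The idea of collapsing many `tag->metricID` rows into one `tag->metricIDs` row can be illustrated with a simplified sketch. The `tagRow` struct and `mergeTagRows` function below are hypothetical; the real rows are binary-encoded mergeset items, which this example does not attempt to reproduce. Rows are assumed to arrive sorted by tag.

```go
package main

import "fmt"

// tagRow is a hypothetical, decoded form of an index row:
// one `label=value` pair and the metricIDs that carry it.
type tagRow struct {
	Tag       string
	MetricIDs []uint64
}

// mergeTagRows merges adjacent rows with the same tag into a single row.
// It assumes rows are sorted by Tag, so equal tags are always adjacent.
func mergeTagRows(rows []tagRow) []tagRow {
	var out []tagRow
	for _, r := range rows {
		n := len(out)
		if n > 0 && out[n-1].Tag == r.Tag {
			// Same tag as the previous row: extend its metricIDs list.
			out[n-1].MetricIDs = append(out[n-1].MetricIDs, r.MetricIDs...)
			continue
		}
		// New tag: start a new row with its own copy of the metricIDs.
		out = append(out, tagRow{
			Tag:       r.Tag,
			MetricIDs: append([]uint64(nil), r.MetricIDs...),
		})
	}
	return out
}

func main() {
	rows := []tagRow{
		{Tag: "job=api", MetricIDs: []uint64{1}},
		{Tag: "job=api", MetricIDs: []uint64{2}},
		{Tag: "job=db", MetricIDs: []uint64{3}},
	}
	for _, r := range mergeTagRows(rows) {
		fmt.Println(r.Tag, r.MetricIDs)
	}
}
```

In this sketch the tag is stored once per merged row instead of once per metricID, which is where the space saving comes from.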
Aliaksandr Valialkin
7e0c6d4ca6 lib/storage: optimize selecting all the metricIDs by scanning MetricID->TSID entries instead of tag->MetricID entries
The number of MetricID->TSID entries is smaller than the number of tag->MetricID entries
and MetricID->TSID entries are usually shorter than tag->MetricID entries.
This should improve performance when selecting all the metricIDs.
2019-09-20 11:57:57 +03:00
Aliaksandr Valialkin
89234f395d lib/storage: use sort.Sort instead of sort.Slice in getSortedMetricIDs 2019-09-19 20:08:13 +03:00
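The commit message does not state the motivation, but a likely one is that sorting through a concrete `sort.Interface` implementation avoids the closure and the reflection-based swap helper that the `sort` package's slice-sorting function relies on. A minimal sketch, with `uint64Sorter` and `sortMetricIDs` as hypothetical names:

```go
package main

import (
	"fmt"
	"sort"
)

// uint64Sorter implements sort.Interface for a slice of uint64 values.
type uint64Sorter []uint64

func (s uint64Sorter) Len() int           { return len(s) }
func (s uint64Sorter) Less(i, j int) bool { return s[i] < s[j] }
func (s uint64Sorter) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }

// sortMetricIDs sorts ids in place in ascending order.
func sortMetricIDs(ids []uint64) {
	sort.Sort(uint64Sorter(ids))
}

func main() {
	ids := []uint64{42, 7, 19}
	sortMetricIDs(ids)
	fmt.Println(ids)
}
```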
Aliaksandr Valialkin
6e586fa09c lib/storage: skip duplicate call to intersectMetricIDsWithTagFilter on zero successful intersects 2019-09-19 17:51:10 +03:00
Aliaksandr Valialkin
c05885fb5f lib/storage: mark tag filter returning errFallbackToMetricNameMatch as useless
This will save CPU on subsequent calls for this filter
2019-09-18 19:11:44 +03:00
Aliaksandr Valialkin
db71c940ea lib/storage: properly construct keys for uselessTagFiltersCache and register useless negative tag filters there 2019-09-17 23:18:37 +03:00
Aliaksandr Valialkin
568ff61dcf lib/mergeset: dynamically calculate the maximum number of items per part, which can be cached in OS page cache 2019-09-09 11:42:45 +03:00
Aliaksandr Valialkin
2c2bd897dd lib/storage: remove duplicate tag keys on MetricName.Marshal call
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/172
2019-09-04 18:13:51 +03:00
Aliaksandr Valialkin
0b0153ba3d lib/storage: invalidate tagFilters -> TSIDS cache when newly added index data becomes visible to search
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/163
2019-08-29 15:08:44 +03:00
Aliaksandr Valialkin
604a4312f9 all: port to FreeBSD on GOARCH=amd64 2019-08-28 01:46:09 +03:00
Aliaksandr Valialkin
da07a6fb38 lib/storage: go fmt 2019-08-27 14:28:24 +03:00
Aliaksandr Valialkin
a63b69e9e2 lib/storage: report proper maxMetrics limit when more than -search.maxUniqueTimeseries series match the given filters 2019-08-27 14:21:31 +03:00
Aliaksandr Valialkin
82e813bad3 lib/storage: properly handle (?i) in the tag filter regexp
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/161
2019-08-26 00:44:56 +03:00
Aliaksandr Valialkin
e2eac858b5 lib/storage: calculate the maximum number of rows per small part from -memory.allowedPercent
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/159

This simplifies error detection in addition to the `vm_rows_ignored_total` counters.
2019-08-25 15:29:09 +03:00
Aliaksandr Valialkin
0a8dd9cc9a lib/storage: calculate the maximum number of rows per small part from -memory.allowedPercent
This should improve query speed over recent data on machines with big amounts of RAM
2019-08-25 14:41:32 +03:00
Aliaksandr Valialkin
bc576fb386 lib/storage: properly limit the number of output rows in small and big parts storage
Previously small parts storage didn't take into account the available disk space for big parts.
2019-08-25 14:41:32 +03:00
Aliaksandr Valialkin
947decb3dd lib/storage: remove outdated comment on maxRowsPerSmallPart
The comment became outdated after the commit ed6ac1a5df027f0dfc22448e3b27c26b6f77c67a,
which stops merging of small parts on graceful shutdown instead of waiting
for their completion.
2019-08-25 13:46:10 +03:00
Aliaksandr Valialkin
e734076f0f app/vminsert: allow setting the maximum number of labels per time series via -maxLabelsPerTimeseries 2019-08-23 08:47:18 +03:00
Aliaksandr Valialkin
4ed63d033a lib/storage: add benchmarks for regexp filter match / mismatch
These benchmarks allow estimating the performance of regexp filters in PromQL
2019-08-22 16:37:19 +03:00
Aliaksandr Valialkin
6ec6a8d7c1 lib/storage: try slower path for searching the tag filter with the minimum number of matching time series before giving up with increase -search.maxUniqueTimeseries error 2019-08-19 16:07:05 +03:00
Aliaksandr Valialkin
c59f5c4865 lib/storage: pre-allocate memory for blockHeader slice in unmarshalBlockHeaders
This reduces memory usage and memory fragmentation when working with a big number of time series
2019-08-19 12:46:45 +03:00
Aliaksandr Valialkin
99eed2ca14 lib/storage: properly cache tagFilters -> TSIDs entries from historical index 2019-08-14 02:32:25 +03:00
Aliaksandr Valialkin
f1d81b9405 lib/storage: compress contents of cache for tagFilters -> TSIDs
This should increase cache capacity
2019-08-14 02:32:22 +03:00
Aliaksandr Valialkin
8c2158af24 all: use workingsetcache instead of fastcache
This should reduce the amount of RAM required for processing time series
with non-zero churn rate.

The previous cache behavior can be restored with `-cache.oldBehavior` command-line flag.
2019-08-13 21:40:28 +03:00
Aliaksandr Valialkin
5a7ab0d90b lib/storage: remove broken BenchmarkIndexDBSearchTSIDs 2019-08-13 20:21:23 +03:00
Aliaksandr Valialkin
39f3f3a517 lib: move common code for creating flock.lock file into fs.CreateFlockFile 2019-08-13 01:46:20 +03:00
Aliaksandr Valialkin
73f866d874 lib/fs: atomically create file with the given contents on WriteFileAtomically
This should prevent corruption of `transaction` and `metadata.json` files
on unclean shutdown such as OOM, `kill -9`, power loss, etc.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/148
2019-08-12 15:02:04 +03:00
Aliaksandr Valialkin
4fb635b0c9 lib/storage: do not change timestamps to constant rate if values are constant or have constant delta
This breaks the original timestamps, which results in issues like
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/120 and
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/141 .
2019-08-06 15:40:17 +03:00
Aliaksandr Valialkin
f56c1298ad app/vmstorage: add vm_concurrent_addrows_* metrics for tracking concurrency for Storage.AddRows calls
Also track the number of rows dropped because the timeout on the concurrency limit
for Storage.AddRows was exceeded. This number is tracked in `vm_concurrent_addrows_dropped_rows_total`
2019-08-06 15:08:43 +03:00
Aliaksandr Valialkin
a3ecf3c1f7 lib/storage: properly reset partSearch.fetchData in partSearch.reset 2019-08-05 09:55:50 +03:00
Aliaksandr Valialkin
880b1d80b1 app/vmselect: optimize /api/v1/series by skipping storage data
Fetch and process only time series metainfo.
2019-08-04 23:00:46 +03:00
Aliaksandr Valialkin
b7c4b0c6d2 lib/storage: fix matching against tag filter with empty name
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/137
2019-07-30 15:15:21 +03:00
Aliaksandr Valialkin
c6bec48927 lib/storage: add metrics for calculating skipped rows outside the retention
The metrics are:

    - vm_too_big_timestamp_rows_total
    - vm_too_small_timestamp_rows_total
2019-07-26 14:11:56 +03:00
Aliaksandr Valialkin
73a47d2a53 lib/storage: remove unused function isTooBigTimeRangeForDateMetricIDs 2019-07-12 02:28:40 +03:00
Aliaksandr Valialkin
97f9397687 lib/storage: do not reduce maxMetrics on time ranges exceeding maxDaysForDateMetricIDs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/95
2019-07-12 02:21:52 +03:00
Aliaksandr Valialkin
4ca66344ee lib/storage: do not pollute inverted index with data for samples outside the retention period 2019-07-11 17:11:33 +03:00
Aliaksandr Valialkin
0522efb2d6 lib/storage: add missing tagFilter.Marshal func 2019-07-11 15:01:01 +03:00
Aliaksandr Valialkin
12b1d67b41 lib/storage: use fast path for orSuffix when searching for metricIDs against plain tag value 2019-07-11 14:48:51 +03:00
Aliaksandr Valialkin
bf2e1b0ac1 lib/storage: remember and skip individual tag filters matching too many metrics
This saves CPU time by skipping useless matching for individual tag filters.
2019-07-11 14:48:47 +03:00
Aliaksandr Valialkin
ba8195c58e all: consistency renaming: bytesSize -> sizeBytes 2019-07-10 00:47:42 +03:00
Aliaksandr Valialkin
41f512af1c all: add vm_data_size_bytes metrics for easy monitoring of on-disk data size and on-disk inverted index size 2019-07-04 19:43:04 +03:00
Aliaksandr Valialkin
ffc1bb00f6 lib/storage: skip non-matching metricIDs in sortedFilter
This should improve performance for big sortedFilter lists.
2019-06-29 13:49:40 +03:00
Aliaksandr Valialkin
416d27ef11 lib/storage: optimize time series search by regexp filter
This should improve search speed on label filters like `{foo=~"bar.+baz"}`
2019-06-27 16:18:00 +03:00
Aliaksandr Valialkin
ee23a143b9 lib/storage: make sure non-nil args are passed to openIndexDB 2019-06-25 20:10:08 +03:00
Aliaksandr Valialkin
8b0a63722f lib/storage: reduce too big maxMetrics in getTagFilterWithMinMetricIDsCountAdaptive
This should improve inverted index search performance for a big number of unique time series
when a big -search.maxUniqueTimeseries value is set.
2019-06-25 19:57:31 +03:00
Aliaksandr Valialkin
0263cb0adc lib/storage: free up memory from caches owned by indexDB when it is deleted 2019-06-25 14:41:16 +03:00
Aliaksandr Valialkin
362e187011 lib/storage: use unversioned keys for tag cache in extDB
Data in ExtDB cannot be changed, so it is OK to use unversioned keys for tag cache.
This should improve performance for index lookups over a big number of time series.
2019-06-25 13:15:42 +03:00
Aliaksandr Valialkin
51e2f3b48f lib/storage: skip searching in extDB if it doesn't contain items for the given time range
This should improve inverted index search performance for a big number
of unique time series when the search is performed only on recent data.
2019-06-25 12:57:56 +03:00