Aliaksandr Valialkin
b986516fbe
lib/storage: create and use lib/uint64set instead of map[uint64]struct{}
...
This should improve inverted index search performance for filters matching a big number of time series,
since `lib/uint64set.Set` is faster than `map[uint64]struct{}` for both `Add` and `Has` calls.
See the corresponding benchmarks in `lib/uint64set`.
2019-09-24 21:17:55 +03:00
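The commit above does not describe how `lib/uint64set.Set` is built, so the following is only a minimal sketch of one way a dedicated `uint64` set can avoid per-key hashing: values are grouped into fixed-size bitmap buckets keyed by their high bits. The `Set` type, `bucket` type and `bucketBits` constant here are hypothetical and are not the `lib/uint64set` implementation.

```go
package main

import "fmt"

// bucketBits is the number of low bits of a value stored inside one bucket.
const bucketBits = 16

// bucket is a bitmap covering 1<<bucketBits consecutive values.
type bucket [(1 << bucketBits) / 64]uint64

// Set is a hypothetical sparse bitmap set of uint64 values.
// It is NOT the lib/uint64set implementation, only an illustration.
type Set struct {
	buckets map[uint64]*bucket
	n       int
}

// Add inserts x into the set.
func (s *Set) Add(x uint64) {
	hi, lo := x>>bucketBits, x&(1<<bucketBits-1)
	b := s.buckets[hi]
	if b == nil {
		if s.buckets == nil {
			s.buckets = make(map[uint64]*bucket)
		}
		b = &bucket{}
		s.buckets[hi] = b
	}
	word, mask := lo/64, uint64(1)<<(lo%64)
	if b[word]&mask == 0 {
		b[word] |= mask
		s.n++
	}
}

// Has reports whether x is in the set.
func (s *Set) Has(x uint64) bool {
	b := s.buckets[x>>bucketBits]
	if b == nil {
		return false
	}
	lo := x & (1<<bucketBits - 1)
	return b[lo/64]&(uint64(1)<<(lo%64)) != 0
}

// Len returns the number of distinct values added so far.
func (s *Set) Len() int { return s.n }

func main() {
	var s Set
	s.Add(1)
	s.Add(70000)
	s.Add(1) // duplicate, does not change Len
	fmt.Println(s.Has(1), s.Has(2), s.Len()) // true false 2
}
```

Metric IDs that are close to each other land in the same bucket, so a lookup costs one map access plus a bit test. Whether this layout beats `map[uint64]struct{}` depends on how the values are distributed; the benchmarks mentioned in the commit are the authoritative comparison.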
Aliaksandr Valialkin
ef2296e420
lib/storage: typo fix: return dstData instead of data from mergeTagToMetricIDsRows
2019-09-24 19:32:34 +03:00
Aliaksandr Valialkin
a6086cde78
lib/storage: limit the number of metricIDs in tag->metricIDs row
...
This reduces the overhead on index and metaindex in lib/mergeset
2019-09-24 00:49:51 +03:00
Aliaksandr Valialkin
c9063ece66
lib/storage: share tsids across all the partSearch instances
...
This should reduce memory usage when a big number of time series match the given query.
2019-09-23 22:35:15 +03:00
Aliaksandr Valialkin
4e26ad869b
lib/{storage,mergeset}: verify PrepareBlock callback results
...
Do not touch the first and the last item passed to PrepareBlock
in order to preserve sort order of mergeset blocks.
2019-09-23 20:43:13 +03:00
Aliaksandr Valialkin
0adebae1f8
lib/storage: generate the first tag->metricIDs item in a mergeset block with a single metricID
...
The first item from each mergeset block goes into index (lib/mergeset.blockHeader),
so it must be short in order to reduce index size.
2019-09-22 19:21:33 +03:00
Aliaksandr Valialkin
0686ac52c3
lib/{storage,mergeset}: merge tag->metricID rows into tag->metricIDs rows for common tag values
...
This should improve lookup performance if the same `label=value` pair exists
in a big number of time series.
This should also reduce memory usage for mergeset data cache, since `tag->metricIDs` rows
occupy less space than the original `tag->metricID` rows.
2019-09-20 22:06:41 +03:00
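As a rough illustration of the merge described in the commit above, the sketch below collapses adjacent `tag->metricID` rows that share a tag into a single `tag->metricIDs` row. The `row` and `mergedRow` types and the `mergeRows` function are hypothetical; the real rows are byte-encoded mergeset items, and the input is assumed to be sorted by tag already.

```go
package main

import "fmt"

// row is a hypothetical decoded tag->metricID entry.
type row struct {
	tag      string
	metricID uint64
}

// mergedRow is a hypothetical decoded tag->metricIDs entry.
type mergedRow struct {
	tag       string
	metricIDs []uint64
}

// mergeRows merges adjacent rows with identical tags.
// It assumes rows are sorted by tag, as items in a sorted index would be.
func mergeRows(rows []row) []mergedRow {
	var out []mergedRow
	for _, r := range rows {
		if n := len(out); n > 0 && out[n-1].tag == r.tag {
			out[n-1].metricIDs = append(out[n-1].metricIDs, r.metricID)
			continue
		}
		out = append(out, mergedRow{tag: r.tag, metricIDs: []uint64{r.metricID}})
	}
	return out
}

func main() {
	rows := []row{{"job=api", 1}, {"job=api", 2}, {"job=db", 3}}
	for _, m := range mergeRows(rows) {
		fmt.Printf("%s -> %v\n", m.tag, m.metricIDs)
	}
}
```

The tag is stored once per merged row instead of once per metric ID, which is where the space saving mentioned in the commit would come from.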
Aliaksandr Valialkin
a544f49c2b
lib/storage: optimize selecting all the metricIDs by scanning MetricID->TSID entries instead of tag->MetricID entries
...
The number of MetricID->TSID entries is smaller than the number of tag->MetricID entries
and MetricID->TSID entries are usually shorter than tag->MetricID entries.
This should improve performance when selecting all the metricIDs.
2019-09-20 11:54:10 +03:00
Aliaksandr Valialkin
a84fe76677
lib/storage: use sort.Sort instead of sort.Slice in getSortedMetricIDs
2019-09-19 20:07:22 +03:00
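The commit above gives no rationale, but a common reason for such a change is that `sort.Slice` swaps elements through reflection and calls a closure for comparisons, while `sort.Sort` works on a concrete type implementing `sort.Interface`. A minimal sketch, with the hypothetical names `uint64Sorter` and `sortMetricIDs`:

```go
package main

import (
	"fmt"
	"sort"
)

// uint64Sorter implements sort.Interface for a slice of uint64 values.
type uint64Sorter []uint64

func (s uint64Sorter) Len() int           { return len(s) }
func (s uint64Sorter) Less(i, j int) bool { return s[i] < s[j] }
func (s uint64Sorter) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }

// sortMetricIDs sorts ids in place in ascending order.
func sortMetricIDs(ids []uint64) {
	sort.Sort(uint64Sorter(ids))
}

func main() {
	ids := []uint64{42, 7, 19}
	sortMetricIDs(ids)
	fmt.Println(ids) // [7 19 42]
}
```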
Aliaksandr Valialkin
3a697a935a
lib/storage: skip duplicate call to intersectMetricIDsWithTagFilter on zero successful intersects
2019-09-19 17:49:56 +03:00
Aliaksandr Valialkin
3d83f5d334
lib/storage: mark tag filter returning errFallbackToMetricNameMatch as useless
...
This will save CPU on subsequent calls for this filter
2019-09-18 19:10:32 +03:00
Aliaksandr Valialkin
8d35718dc6
lib/storage: properly construct keys for uselessTagFiltersCache and register useless negative tag filters there
2019-09-17 23:20:27 +03:00
Aliaksandr Valialkin
bad53e4207
lib/mergeset: dynamically calculate the maximum number of items per part, which can be cached in OS page cache
2019-09-11 14:53:45 +03:00
Aliaksandr Valialkin
9eb5de334f
lib/storage: typo fix
2019-09-04 19:58:01 +03:00
Aliaksandr Valialkin
16dd145586
lib/storage: remove duplicate tag keys on MetricName.Marshal call
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/172
2019-09-04 18:13:45 +03:00
Aliaksandr Valialkin
e1d76ec1f3
lib/storage: invalidate tagFilters -> TSIDs cache when newly added index data becomes visible to search
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/163
2019-08-29 15:08:35 +03:00
Aliaksandr Valialkin
9196c085a7
all: port to FreeBSD on GOARCH=amd64
2019-08-28 01:19:23 +03:00
Aliaksandr Valialkin
2655220c58
lib/storage: go fmt
2019-08-27 14:29:51 +03:00
Aliaksandr Valialkin
bf915fc0db
lib/storage: report proper maxMetrics limit when more than -search.maxUniqueTimeseries series match the given filters
2019-08-27 14:21:42 +03:00
Aliaksandr Valialkin
2fc157ff7a
lib/storage: properly handle (?i) in the tag filter regexp
...
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/161
2019-08-26 00:44:45 +03:00
Aliaksandr Valialkin
0dc0006f34
lib/storage: calculate the maximum number of rows per small part from -memory.allowedPercent
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/159
This simplifies error detection additionally to the `vm_rows_ignored_total` counters.
2019-08-25 15:31:47 +03:00
Aliaksandr Valialkin
4b688fffee
lib/storage: calculate the maximum number of rows per small part from -memory.allowedPercent
...
This should improve query speed over recent data on machines with big amounts of RAM
2019-08-25 14:41:12 +03:00
Aliaksandr Valialkin
1402a6b981
lib/storage: properly limit the number of output rows in small and big parts storage
...
Previously small parts storage didn't take into account the available disk space for big parts.
2019-08-25 14:41:12 +03:00
Aliaksandr Valialkin
3308279c4e
lib/storage: remove outdated comment on maxRowsPerSmallPart
...
The comment became outdated after the commit ed6ac1a5df027f0dfc22448e3b27c26b6f77c67a,
which stops merging of small parts on graceful shutdown instead of waiting
for their completion.
2019-08-25 13:47:32 +03:00
Aliaksandr Valialkin
8c03a8c4b4
app/vminsert: allow setting the maximum number of labels per time series via -maxLabelsPerTimeseries
2019-08-23 08:45:26 +03:00
Aliaksandr Valialkin
380cae23a0
lib/storage: add benchmarks for regexp filter match / mismatch
...
These benchmarks allow estimating the performance of regexp filters in promql
2019-08-22 16:36:42 +03:00
Aliaksandr Valialkin
4f738c8a15
lib/storage: try slower path for searching the tag filter with the minimum number of matching time series before giving up with increase -search.maxUniqueTimeseries error
2019-08-19 16:04:21 +03:00
Aliaksandr Valialkin
c23b66a1ad
lib/storage: pre-allocate memory for blockHeader slice in unmarshalBlockHeaders
...
This reduces memory usage and memory fragmentation when working with a big number of time series
2019-08-19 12:46:33 +03:00
Aliaksandr Valialkin
5b41122292
lib/storage: properly cache tagFilters -> TSIDs entries from historical index
2019-08-14 02:29:58 +03:00
Aliaksandr Valialkin
964c296f96
lib/storage: compress contents of cache for tagFilters -> TSIDs
...
This should increase cache capacity
2019-08-14 02:29:52 +03:00
Aliaksandr Valialkin
09fc6e22e5
all: use workingsetcache instead of fastcache
...
This should reduce the amount of RAM required for processing time series
with non-zero churn rate.
The previous cache behavior can be restored with `-cache.oldBehavior` command-line flag.
2019-08-13 21:39:34 +03:00
Aliaksandr Valialkin
ec1b185991
lib/storage: remove broken BenchmarkIndexDBSearchTSIDs
2019-08-13 20:22:08 +03:00
Aliaksandr Valialkin
0967683ae9
lib: move common code for creating flock.lock file into fs.CreateFlockFile
2019-08-13 01:45:46 +03:00
Aliaksandr Valialkin
5d8d110010
lib/fs: atomically create file with the given contents on WriteFileAtomically
...
This should prevent corruption of `transaction` and `metadata.json` files
on unclean shutdown such as OOM, `kill -9`, power loss, etc.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/148
2019-08-12 15:02:55 +03:00
Aliaksandr Valialkin
0b488f1e37
lib/storage: do not change timestamps to constant rate if values are constant or have constant delta
...
This breaks the original timestamps, which results in issues like
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/120 and
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/141 .
2019-08-06 15:40:07 +03:00
Aliaksandr Valialkin
b8bb74ffc6
app/vmstorage: add vm_concurrent_addrows_* metrics for tracking concurrency for Storage.AddRows calls
...
Track also the number of dropped rows due to the exceeded timeout
on concurrency limit for Storage.AddRows. This number is tracked in `vm_concurrent_addrows_dropped_rows_total`
2019-08-06 15:08:33 +03:00
Aliaksandr Valialkin
8822079b77
lib/storage: properly reset partSearch.fetchData in partSearch.reset
2019-08-05 09:56:06 +03:00
Aliaksandr Valialkin
47e4b50112
app/vmselect: optimize /api/v1/series by skipping storage data
...
Fetch and process only time series metainfo.
2019-08-04 23:01:28 +03:00
Aliaksandr Valialkin
c14fd6c43f
lib/storage: typo fixes after a77e88db7d
2019-07-30 15:38:52 +03:00
Aliaksandr Valialkin
a77e88db7d
lib/storage: fix matching against tag filter with empty name
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/137
2019-07-30 15:15:09 +03:00
Aliaksandr Valialkin
f586e1f83c
lib/storage: add metrics for calculating skipped rows outside the retention
...
The metrics are:
- vm_too_big_timestamp_rows_total
- vm_too_small_timestamp_rows_total
2019-07-26 14:11:01 +03:00
Aliaksandr Valialkin
e75d5f47c4
lib/storage: remove unused function isTooBigTimeRangeForDateMetricIDs
2019-07-12 02:28:23 +03:00
Aliaksandr Valialkin
fc90ebf43c
lib/storage: do not reduce maxMetrics on time ranges exceeding maxDaysForDateMetricIDs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/95
2019-07-12 02:20:34 +03:00
Aliaksandr Valialkin
2bd1a01d1a
lib/storage: do not pollute inverted index with data for samples outside the retention period
2019-07-11 17:04:56 +03:00
Aliaksandr Valialkin
d031e04023
lib/storage: use fast path for orSuffix when searching for metricIDs against plain tag value
2019-07-11 14:48:37 +03:00
Aliaksandr Valialkin
43ea4ce428
lib/storage: remember and skip individual tag filters matching too many metrics
...
This saves CPU time by skipping useless matching for individual tag filters.
2019-07-11 14:48:30 +03:00
Aliaksandr Valialkin
1fe6d784d8
all: consistency renaming: bytesSize -> sizeBytes
2019-07-10 00:47:36 +03:00
Aliaksandr Valialkin
56c154f45b
all: add vm_data_size_bytes metrics for easy monitoring of on-disk data size and on-disk inverted index size
2019-07-04 19:42:30 +03:00
Aliaksandr Valialkin
2ecb117082
lib/storage: skip non-matching metricIDs in sortedFilter
...
This should improve performance for big sortedFilter lists.
2019-06-29 13:48:32 +03:00
Aliaksandr Valialkin
c1be1e4342
lib/storage: optimize time series search by regexp filter
...
This should improve search speed on label filters like `{foo=~"bar.+baz"}`
2019-06-27 16:17:43 +03:00
Aliaksandr Valialkin
683bf2a11f
lib/storage: make sure non-nil args are passed to openIndexDB
2019-06-25 20:10:04 +03:00
Aliaksandr Valialkin
eb2283a029
lib/storage: reduce too big maxMetrics in getTagFilterWithMinMetricIDsCountAdaptive
...
This should improve inverted index search performance for a big number of unique time series
when a big -search.maxUniqueTimeseries is set.
2019-06-25 19:55:27 +03:00
Aliaksandr Valialkin
e8377011ab
lib/storage: free up memory from caches owned by indexDB when it is deleted
2019-06-25 14:42:44 +03:00
Aliaksandr Valialkin
33ea2120c3
lib/storage: use unversioned keys for tag cache in extDB
...
Data in ExtDB cannot be changed, so it is OK to use unversioned keys for tag cache.
This should improve performance for index lookups over a big number of time series.
2019-06-25 13:08:58 +03:00
Aliaksandr Valialkin
cf63669303
lib/storage: skip searching in extDB if it doesn't contain items for the given time range
...
This should improve inverted index search performance for a big number
of unique time series when the search is performed only on recent data.
2019-06-25 13:00:37 +03:00
Aliaksandr Valialkin
af2ceaaa0b
lib/storage: mention source parts on merge error
...
This should make it easier to determine the broken source part.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/76
2019-06-24 14:08:43 +03:00
Aliaksandr Valialkin
80db24386e
lib/storage: typo fixes found by golangci-lint; updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/69
2019-06-20 14:37:55 +03:00
Aliaksandr Valialkin
a78b3dba7f
app/vmstorage: add vm_cache_entries{type="storage/hour_metric_ids"} metric for tracking active time series count
2019-06-19 18:36:47 +03:00
Aliaksandr Valialkin
d2c801029b
lib/storage: persist metric ids for the current and the previous hour on graceful shutdown
...
This should improve performance after restart when the db contains a lot of time series
with high time series churn (e.g. metrics from Kubernetes with many pods and frequent deployments)
2019-06-14 07:55:14 +03:00
Aliaksandr Valialkin
419197ba08
lib/fs: consolidate *RemoveAll* funcs into a single MustRemoveAll func
...
The func syncs parent dir in order to persist directory removal
in the event of power loss
2019-06-12 01:53:46 +03:00
Aliaksandr Valialkin
935bfd7a18
lib/fs: consistency renaming SyncPath -> MustSyncPath, since it doesn't return error
2019-06-11 23:13:49 +03:00
Aliaksandr Valialkin
20fc0e0e54
lib/{storage,mergeset}: sync filenames inside part when finalizing the part
...
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/63
2019-06-11 21:51:13 +03:00
Aliaksandr Valialkin
ac7b186f13
all: try hard removing directory with contents
...
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/61
2019-06-11 01:57:59 +03:00
Aliaksandr Valialkin
cbe692f0e2
app/vmselect: add /api/v1/labels/count handler for quick detection of labels with the maximum number of distinct values
2019-06-10 19:55:38 +03:00
Aliaksandr Valialkin
7b6623558f
lib/storage: skip adaptive searching for tag filter matching the minimum number of metrics if the identical previous search didn't find such filter
...
This should improve speed for searching metrics among high number of time series
with high churn rate like in big Kubernetes clusters with frequent deployments.
2019-06-10 14:07:39 +03:00
Aliaksandr Valialkin
a1351bbaee
lib/storage: factor out getTagFilterWithMinMetricIDsCountAdaptive from updateMetricIDsForTagFilters
2019-06-10 13:26:44 +03:00
Aliaksandr Valialkin
b4d707d9bb
lib/storage: give clearer names to more functions
2019-06-10 13:01:23 +03:00
Aliaksandr Valialkin
bee7298f81
lib/storage: give more clear names to functions
2019-06-10 12:50:44 +03:00
Aliaksandr Valialkin
dbd217b8f0
lib/storage: test GetSeriesCount
2019-06-10 12:43:34 +03:00
Aliaksandr Valialkin
4d936b1524
lib/storage: make getSeriesCount func indexSearch method
2019-06-10 12:29:11 +03:00
Aliaksandr Valialkin
d37924900b
lib/storage: optimize time series lookup for recent hours when the db contains many millions of time series with high churn rate (aka frequent deployments in Kubernetes)
2019-06-09 19:13:56 +03:00
Aliaksandr Valialkin
28f6c36ab4
lib/storage: tune updating a map with today's metric ids
...
- Increase update interval from 1s to 10s. This should reduce CPU usage
for large amounts of metric ids with constant churn.
- Reduce pendingTodayMetricIDsLock lock duration during the update.
2019-06-02 21:58:16 +03:00
Aliaksandr Valialkin
4794f894a4
lib/storage: speed up checking metricID existence in the list for the current date
2019-06-02 18:34:08 +03:00
Aliaksandr Valialkin
e307a4d92c
lib/timerpool: use timer pool in concurrency limiters
...
This should reduce the number of memory allocations in highly loaded system
2019-05-28 17:20:10 +03:00
Aliaksandr Valialkin
54fb8b21f9
all: fix misspellings
2019-05-25 21:51:11 +03:00
Aliaksandr Valialkin
d6523ffe90
Makefile: add -s flag to go fmt in make fmt command
2019-05-25 21:43:35 +03:00
Aliaksandr Valialkin
1836c415e6
all: open-sourcing single-node version
2019-05-23 00:18:06 +03:00