Aliaksandr Valialkin
87b39222be
Revert "lib/fs: do not postpone directory removal on NFS error"
...
This reverts commit 21aeb02b46649ac9906cb37733f7b155a77a0db9.
2019-11-12 16:29:50 +02:00
Aliaksandr Valialkin
c48e39eea9
lib/storage: add tests for dateMetricIDCache
2019-11-11 13:21:05 +02:00
Aliaksandr Valialkin
5f52eb7653
lib/fs: do not postpone directory removal on NFS error
...
Continue trying to remove NFS directory on temporary errors for up to a minute.
The previous async removal process breaks in the following case during VictoriaMetrics start
- VictoriaMetrics opens index, finds incomplete merge transactions and starts replaying them.
- The transaction instructs removing old directories for parts, which were already merged into bigger part.
- VictoriaMetrics removes these directories, but their removal is delayed due to NFS errors.
- VictoriaMetrics scans partition directory after all the incomplete merge transactions are finished
and finds directories, which should be removed, but weren't still removed due to NFS errors.
- VictoriaMetrics panics when it finds unexpected empty directory.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-11-10 13:27:16 +02:00
Aliaksandr Valialkin
9ea2bd822e
lib/storage: implement per-day inverted index
2019-11-10 00:20:32 +02:00
Aliaksandr Valialkin
dea2f3efed
lib/storage: use specialized cache for (date, metricID) entries
...
This improves ingestion performance.
2019-11-09 23:09:18 +02:00
Aliaksandr Valialkin
46e67bb78c
lib/storage: export vm_new_timeseries_created_total
metric for determining time series churn rate
2019-11-08 19:58:21 +02:00
Aliaksandr Valialkin
0063c857f5
lib/storage: add inmemory inverted index for the last hour
...
It should improve performance for `last N hours` dashboards with update intervals smaller than 1 hour.
2019-11-08 19:37:46 +02:00
Aliaksandr Valialkin
1c777e0245
lib/storage: substitute error message about unsorted items in the index block after metricIDs merge with counter
...
The origin of the error has been detected and documented in the code,
so it is enough to export a counter for such errors at `vm_index_blocks_with_metric_ids_incorrect_order_total`,
so it could be monitored and alerted on high error rates.
Export also the counter for processed index blocks with metricIDs - `vm_index_blocks_with_metric_ids_processed_total`,
so its' rate could be compared to `rate(vm_index_blocks_with_metric_ids_incorrect_order_total)`.
2019-11-06 14:32:41 +02:00
Aliaksandr Valialkin
6ab9c98a1e
app/vmstorage: add -bigMergeConcurrency
and -smallMergeConcurrency
flags for tuning the maximum number of CPU cores used during merges
2019-10-31 16:17:29 +02:00
Aliaksandr Valialkin
b101064f8b
all: report the number of bytes read on io.ReadFull error
...
This should simplify error investigation similar to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/175
2019-09-11 14:50:24 +03:00
Aliaksandr Valialkin
2c654258ef
lib/fs: add MustStopDirRemover for waiting until pending directories are removed on graceful shutdown
...
This patch is mainly required for laggy NFS. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-09-05 11:17:17 +03:00
Aliaksandr Valialkin
5893a9f9a3
app/vmstorage: increase default values for search.maxTagKeys, search.maxTagValues and search.maxUniqueTimeseries
2019-08-27 14:28:26 +03:00
Aliaksandr Valialkin
f56c1298ad
app/vmstorage: add vm_concurrent_addrows_*
metrics for tracking concurrency for Storage.AddRows calls
...
Track also the number of dropped rows due to the exceeded timeout
on concurrency limit for Storage.AddRows. This number is tracked in `vm_concurrent_addrows_dropped_rows_total`
2019-08-06 15:08:43 +03:00
Aliaksandr Valialkin
880b1d80b1
app/vmselect: optimize /api/v1/series
by skipping storage data
...
Fetch and process only time series metainfo.
2019-08-04 23:00:46 +03:00
Aliaksandr Valialkin
8253790157
app/vmstorage: consistency renaming for ignored rows
metrics
...
vm_too_big_timestamp_rows_total -> vm_rows_ignored_total{reason="big_timestamp"}
vm_too_small_timestamp_rows_total -> vm_rows_ignored_total{reason="small_timestamp"}
2019-07-26 20:02:24 +03:00
Aliaksandr Valialkin
c6bec48927
lib/storage: add metrics for calculating skipped rows outside the retention
...
The metrics are:
- vm_too_big_timestamp_rows_total
- vm_too_small_timestamp_rows_total
2019-07-26 14:11:56 +03:00
Aliaksandr Valialkin
54f035d4ce
all: small updates after PR #114
2019-07-24 17:43:43 +03:00
Aliaksandr Valialkin
ba8195c58e
all: consistency renaming: bytesSize -> sizeBytes
2019-07-10 00:47:42 +03:00
Aliaksandr Valialkin
41f512af1c
all: add vm_data_size_bytes
metrics for easy monitoring of on-disk data size and on-disk inverted index size
2019-07-04 19:43:04 +03:00
Aliaksandr Valialkin
a0c22a6830
app/vmstorage: add vm_cache_entries{type="storage/hour_metric_ids"}
metric for tracking active time series count
2019-06-19 18:37:38 +03:00
Aliaksandr Valialkin
945894e049
app/vmselect: properly handle empty label (aka __name__) in LabelEntries handler
2019-06-10 19:55:02 +03:00
Aliaksandr Valialkin
75a0acf72d
app/vmselect: add /api/v1/labels/count
handler for quick detection of labels with the maximum number of distinct values
2019-06-10 19:54:55 +03:00
Aliaksandr Valialkin
547bcdce63
app/vmstorage: enable compression of responses to vmselect by default
...
This should save vmstorage => vmselect network bandwidth in common case
when recently added data is queried.
2019-06-10 14:54:59 +03:00
Aliaksandr Valialkin
d54f5fec0b
lib/storage: skip adaptive searching for tag filter matching the minimum number of metrics if the identical previous search didn't found such filter
...
This should improve speed for searching metrics among high number of time series
with high churn rate like in big Kubernetes clusters with frequent deployments.
2019-06-10 14:07:47 +03:00
Aliaksandr Valialkin
4c3913290a
app/vmstorage: add missing _total
suffixes to newly added metrics
2019-06-09 22:11:41 +03:00
Aliaksandr Valialkin
d882afa905
lib/storage: optimize time series lookup for recent hours when the db contains many millions of time series with high churn rate (aka frequent deployments in Kubernetes)
2019-06-09 19:14:04 +03:00
Aliaksandr Valialkin
364f4ec3bb
all: remove -p XXXX:XXXX
from docker run
options, since it is unnesessary if --net=host
is set
2019-05-24 12:53:12 +03:00
Aliaksandr Valialkin
24578b4bb1
all: open-sourcing cluster version
2019-05-23 00:25:38 +03:00
Aliaksandr Valialkin
1836c415e6
all: open-sourcing single-node version
2019-05-23 00:18:06 +03:00