Commit graph

212 commits

Author SHA1 Message Date
Aliaksandr Valialkin
3f99f39e9b app/vmselect/prometheus: reduce default value for -search.latencyOffset from 60s to 30s
30 seconds should be enough for almost all the cases
2019-11-25 16:33:42 +02:00
Aliaksandr Valialkin
e91cb34c0e app/vmselect/promql: allow nested parens 2019-11-25 16:13:41 +02:00
Aliaksandr Valialkin
826dfd63a5 vendor: update github.com/VictoriaMetrics/metrics from v1.9.0 to v1.9.1 2019-11-25 15:23:01 +02:00
Aliaksandr Valialkin
0401969d78 app/vmselect/promql: re-use metrics.Histogram when calculating histogram function for each point on the graph
This should reduce the amounts memory allocations
2019-11-25 14:24:21 +02:00
Aliaksandr Valialkin
da98703748 app/vmselect/promql: optimize binary search over big number of samples during rollup calculations 2019-11-25 14:01:46 +02:00
Aliaksandr Valialkin
c28876172f app/vmselect/promql: adjust tests after the upgrade of github.com/VictoriaMetrics/metrics from v1.8.3 to v1.9.0 2019-11-25 13:43:57 +02:00
Aliaksandr Valialkin
50ae1879c6 app/vmselect/promql: add histogram aggregate function, which is useful for building heatmaps from multiple time series 2019-11-24 00:04:25 +02:00
Aliaksandr Valialkin
8582b50360 app/vmselect/promql: do not take into account buckets with negative counters in prometheus_buckets 2019-11-23 14:19:25 +02:00
Aliaksandr Valialkin
19dfe52254 app/vmselect/promql: properly handle histogram_quantile(0, ...) with zero buckets 2019-11-23 14:02:35 +02:00
Aliaksandr Valialkin
4bb88843cf app/vmselect: add vm_per_query_{rows,series}_processed_count histograms 2019-11-23 13:23:26 +02:00
Aliaksandr Valialkin
7753c8c0a1 app/vmselect/promql: transparently apply prometheus_buckets in histogram_quantile 2019-11-23 11:48:51 +02:00
Aliaksandr Valialkin
c4287b3c86 app/vmselect/promql: add prometheus_buckets function for converting the upcoming histogram buckets from github.com/VictoriaMetrics/metrics to Prometheus-compatible buckets 2019-11-23 00:20:20 +02:00
Aliaksandr Valialkin
1f3fd2c910 app/vmselect: adjust end arg instead of adjusting start arg if start > end
`start` arg has higher chances to be set properly comparing to `end` arg,
so it is expected that the `end` arg could be adjusted if it was set incorrectly.
2019-11-22 16:12:19 +02:00
Aliaksandr Valialkin
7a4635f853 all: remove the remaining mentions of cluster version 2019-11-21 23:18:22 +02:00
Aliaksandr Valialkin
443189fb0a app/{vmbackup,vmrestore}: add -maxBytesPerSecond command-line flag for limiting the used network bandwidth during backup / restore 2019-11-19 20:31:52 +02:00
Aliaksandr Valialkin
0094bc4fc9 app/vmselect/prometheus: properly adjust too big time time on /api/v1/query
Too big `time` must be adjusted to `now()-queryOffset`.
2019-11-19 00:42:00 +02:00
Aliaksandr Valialkin
3f1637fae8 app/vmselect/promql: properly calculate integrate(q[d]) 2019-11-13 21:10:41 +02:00
Aliaksandr Valialkin
c56b9ed03b app/victoria-metrics: add build rules for GOARCH=ppc64le
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/235
2019-11-13 20:24:33 +02:00
Aliaksandr Valialkin
3fd32e331a app/vmselect/promql: use universal approach for determining maxByteSliceLen on 32-bit and 64-bit archs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/235
2019-11-13 20:24:26 +02:00
Aliaksandr Valialkin
119dfd01bb lib/storage: add vm_cache_size_bytes{type="storage/hour_metric_ids"} metric 2019-11-13 20:24:21 +02:00
Aliaksandr Valialkin
86a1cd700b lib/storage: remove inmemory index for recent hour, since it uses too much memory
Production workload shows that the index requires ~4Kb of RAM per active time series.
This is too much for high number of active time series, so let's delete this index.

Now the queries should fall back to the index for the current day instead of the index
for the recent hour. The query performance for the current day index should be good enough
given the 100M rows/sec scan speed per CPU core.
2019-11-13 17:58:07 +02:00
Aliaksandr Valialkin
c57eb0ff83 lib/storage: add -disableRecentHourIndex flag for disabling inmemory index for recent hour
This may be useful for saving RAM on high number of time series aka high cardinality
2019-11-13 15:02:51 +02:00
Aliaksandr Valialkin
ca259864e2 lib/storage: return back inmemory inverted index for recent hour
Issues fixed:
- Slow startup times. Now the index is loaded from cache during start.
- High memory usage related to superflouos index copies every 10 seconds.
2019-11-13 13:11:04 +02:00
Aliaksandr Valialkin
01bb3c06c7 lib/storage: remove inmemory inverted index for recent hours
Production load with >10M active time series showed it could
slow down VictoriaMetrics startup times and could eat
all the memory leading to OOM.

Remove inmemory inverted index for recent hours until thorough
testing on production data shows it works OK.
2019-11-13 10:45:53 +02:00
Aliaksandr Valialkin
6c2303764e Revert "lib/fs: do not postpone directory removal on NFS error"
This reverts commit 4c02e496f7.

Reason for revert: the commit breaks on NFS - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/234
2019-11-12 16:18:09 +02:00
Aliaksandr Valialkin
661dd190bb Refer to https://medium.com/@valyala/speeding-up-backups-for-big-time-series-databases-533c1a927883 from multiple places in README.md 2019-11-12 13:02:39 +02:00
Oleg Kovalov
b4f44befa3 fix misspelled words (#229) 2019-11-12 00:16:42 +02:00
Aliaksandr Valialkin
8e8f98f712 lib/storage: add tests for dateMetricIDCache 2019-11-11 13:21:57 +02:00
Aliaksandr Valialkin
56d7cc8a0d app/victoria-metrics: remove deprecated fs.MustStopDirRemover from main_test.go 2019-11-10 13:37:13 +02:00
Aliaksandr Valialkin
4c02e496f7 lib/fs: do not postpone directory removal on NFS error
Continue trying to remove NFS directory on temporary errors for up to a minute.

The previous async removal process breaks in the following case during VictoriaMetrics start

- VictoriaMetrics opens index, finds incomplete merge transactions and starts replaying them.
- The transaction instructs removing old directories for parts, which were already merged into bigger part.
- VictoriaMetrics removes these directories, but their removal is delayed due to NFS errors.
- VictoriaMetrics scans partition directory after all the incomplete merge transactions are finished
  and finds directories, which should be removed, but weren't still removed due to NFS errors.
- VictoriaMetrics panics when it finds unexpected empty directory.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-11-10 13:24:51 +02:00
Aliaksandr Valialkin
5c3fa59181 app/vmrestore: the upcoming release would be 1.29.0 2019-11-10 00:20:41 +02:00
Aliaksandr Valialkin
ee7765b10d lib/storage: implement per-day inverted index 2019-11-10 00:02:46 +02:00
Aliaksandr Valialkin
5810ba57c2 lib/storage: use specialized cache for (date, metricID) entries
This improves ingestion performance.
2019-11-09 23:06:11 +02:00
Aliaksandr Valialkin
6ad7fe8eeb lib/storage: export vm_new_timeseries_created_total metric for determining time series churn rate 2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin
63b05c0b9f app/vmselect/promql: adjust memory limits calculations for incremental aggregate functions
Incremental aggregate functions don't keep all the selected time series in memory -
they keep only up to GOMAXPROCS time series for incremental aggregations.

Take into account that the number of time series in RAM can be higher if they are split
into many groups with `by (...)` or `without (...)` modifiers.

This should reduce the number of `not enough memory for processing ... data points` false
positive errors.
2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin
d888b21657 lib/storage: add inmemory inverted index for the last hour
It should improve performance for `last N hours` dashboards with update intervals smaller than 1 hour.
2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin
1e46961d68 app/{vmbackup,vmrestore}: add vmbackup and vmrestore tools for creating backups on s3 or gcs from instant snapshots
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/203
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/38
2019-11-08 21:21:07 +02:00
Aliaksandr Valialkin
e472f0b23b lib/storage: substitute error message about unsorted items in the index block after metricIDs merge with counter
The origin of the error has been detected and documented in the code,
so it is enough to export a counter for such errors at `vm_index_blocks_with_metric_ids_incorrect_order_total`,
so it could be monitored and alerted on high error rates.

Export also the counter for processed index blocks with metricIDs - `vm_index_blocks_with_metric_ids_processed_total`,
so its' rate could be compared to `rate(vm_index_blocks_with_metric_ids_incorrect_order_total)`.
2019-11-06 14:28:11 +02:00
Aliaksandr Valialkin
2f58f37f07 app/vmselect/promql: add lag(q[d]) function, which returns the lag between the current timestamp and the timstamp for the last data point in q 2019-11-01 12:21:33 +02:00
Aliaksandr Valialkin
d18ea0c95b app/vmstorage: add -bigMergeConcurrency and -smallMergeConcurrency flags for tuning the maximum number of CPU cores used during merges 2019-10-31 16:19:13 +02:00
Aliaksandr Valialkin
b0295dbf2e app/vmselect: add -search.latencyOffset flag for tuning the time after data collection when data points become visible in query results
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/218
2019-10-28 12:31:07 +02:00
Aliaksandr Valialkin
e83fe938c8 all: make fmt 2019-10-17 20:04:34 +03:00
Aliaksandr Valialkin
97ce4e03a5 all: add support for GOARCH=386 and fix all the issues related to 32-bit architectures such as GOARCH=arm
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
2019-10-17 18:23:23 +03:00
Aliaksandr Valialkin
f752479cb8 app/victoria-metrics/test: add missing docs to public funcs PopulateTimeTplString and PopulateTimeTpl 2019-10-17 00:50:46 +03:00
Aliaksandr Valialkin
61e956e175 app/victoria-metrics: add a test for max_lookback=<duration> query arg
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/209
2019-10-15 21:31:48 +03:00
Aliaksandr Valialkin
c66a691593 app/vmselect/prometheus: add -search.maxLookback command-line flag for overriding dynamic calculations for max lookback interval
This flag is similar to `-search.lookback-delta` if set. The max lookback interval is determined dynamically
from interval between datapoints for each input time series if the flag isn't set.

The interval can be overriden on per-query basis by passing `max_lookback=<duration>` query arg to `/api/v1/query` and `/api/v1/query_range`.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/209
2019-10-15 21:31:48 +03:00
Aliaksandr Valialkin
cc21b31502 app/victoria-metrics/test: add a test for PopulateTimeTplString 2019-10-15 21:31:48 +03:00
Aliaksandr Valialkin
956fdd89d3 app/vmselect/promql: take into account the previous point when calculating max_over_time and min_over_time
This lines up with `first_over_time` function used in `rollup_candlestick`, so `rollup=low` always returns
the minimum value.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/204
2019-10-08 12:30:05 +03:00
Artem Navoiev
1e2c511747 Add regression test for query apo
Part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/187
cover:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/153
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/150
2019-10-07 22:18:04 +03:00
Artem Navoiev
a116f5e7c1 Add regression test for query apo (#194)
Part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/187
cover:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/184
2019-09-30 11:25:54 +03:00