github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2025-03-21 15:45:01 +00:00

Author	SHA1	Message	Date
Guillem Jover	1d8b7faf71	spelling and grammar fixes via codespell (#8497) ### Describe Your Changes Fix many spelling errors and some grammar, including misspellings in filenames. The change also fixes a typo in metric `vm_mmaped_files` to `vm_mmapped_files`. While this is a breaking change, this metric isn't used in alerts or dashboards. So it seems to have low impact on users. The change also deprecates `cspell` as it is much heavier and less usable. --------- Co-authored-by: Andrii Chubatiuk <achubatiuk@victoriametrics.com> Co-authored-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com> (cherry picked from commit `76d205feae`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2025-03-17 16:38:11 +01:00
Roman Khavronenko	27f9eaa852	app/vmselect/promql: optimize binary operator `or` for common cases (#8489 ) The optimization touches 2 things: 1. Reduces amount of allocations when comparing canonical metric names between left and right parts of expressions. 2. Adds fast path for cases when right part of expression returns scalar: `series_selector or on() vector(1)`, which is a typical expression. ``` benchcmp old.txt new.txt benchcmp is deprecated in favor of benchstat: https://pkg.go.dev/golang.org/x/perf/cmd/benchstat benchmark old ns/op new ns/op delta BenchmarkBinaryOpOr/tss:1_or_tss:1-14 291 272 -6.56% BenchmarkBinaryOpOr/tss:1_or_tss:1000-14 44590 28592 -35.88% BenchmarkBinaryOpOr/tss:1000_or_tss:1-14 103124 39563 -61.64% BenchmarkBinaryOpOr/tss:1000_or_tss:1000-14 20386150 1859335 -90.88% BenchmarkBinaryOpOr/tss:1000_or_on()_vector(0)-14 91382 36805 -59.72% ``` https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8382 ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `dc1f7ef0d0`)	2025-03-14 12:30:08 +01:00
Zakhar Bessarab	dea3eb20cb	app/vmselect/promql: fix panic when using @ with a series which is not present at the start of the query (#8445) ### Describe Your Changes Previously, "selector @ another_selector" assumed that the "another_selector" metric exists from the "start" used in the query. If the query was evaluated in the following case (timestamps): - start - 2, end - 10 - "another_selector" 5,6,7,8,9,10 - "selector" The resulting "at" timestamp would be taken from NaN (as `int64(NaN * 1000)`), causing a panic or invalid behavior later. Note that the type cast of `NaN` to int64 is also platform-dependent, so `int64(math.NaN() * 1000)` can produce `0` or max int64 on different platforms and versions of Go. This commit changes this and checks for the first non-NaN value. This makes it easier to use for users as series are not always aligned and returning an error in this case would disallow using this for some time ranges. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8444 ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `7dfaef9088`)	2025-03-06 16:42:51 +01:00
Zhu Jiekun	774004867b	bugfix: negative rate result when lookbehind window longer than search.maxLookback (#8378) ### Describe Your Changes #8342: fix negative rate result when the lookbehind window is longer than `-search.maxLookback` or `-search.maxStalenessInterval` and the data contains a gap. This issue was introduced in [v1.110.0](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8072). ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).	2025-02-27 22:55:32 +01:00
Roman Khavronenko	13cd76347d	app/vmselect/promql: fix discrepancies when using `or` binary operator The change covers various corner cases when using `or` binary operator. See corresponding issues and pull request here to see the cases: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7770 Related issues: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7759 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7640 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `72837919ae`)	2025-02-01 22:31:55 +01:00
Roman Khavronenko	9261da53a0	app/vmselect/promql: respect staleness in `removeCounterResets` (#8073) See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8072 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2025-01-24 07:52:51 +01:00
Roman Khavronenko	3dfcbf3229	app/vmselect/promql: limit staleness detection for increase/increase_pure/delta (#8052) `doInternal` has an adaptive staleness detection mechanism. It is calculated using the timestamp distance between samples in the selected list of samples. It is dynamic because VM can store signals from many sources with different sample resolutions. And while it works for most cases, there are edge cases for rollup functions that compare values between windows: increase, increase_pure, delta. Edge case 1. There was a gap between series because of a missed scrape or two. In this case staleness will trigger and increase-like functions will assume the value they need to compare with is 0. As a result, this could produce spikes for flappy data - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/894 This problem was solved by introducing a `realPrevValue` field - `1f19c167a4`. It stores the closest real sample value on the selected interval and is used for comparison when samples start after the gap. Edge case 2. `realPrevValue` doesn't respect the staleness interval. As a result, even if the gap between samples is huge (hours), the increase-like functions will not consider it as a new series that started from 0. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8002. Covering both edge cases is tricky, because `realPrevValue` has to respect and not respect the staleness interval at the same time. In other words, it should be able to ignore periodic missing gaps, but reset if the gap is too big. Since a "too big gap" can't be figured out empirically, I suggest using `-search.maxStalenessInterval` for this purpose. If `-search.maxStalenessInterval` is set to 0 (default), then `realPrevValue` ignores the staleness interval. If `-search.maxStalenessInterval` is > 0, then `realPrevValue` respects it as a staleness interval. ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `7d2a6764e7`)	2025-01-16 17:07:38 +01:00
Roman Khavronenko	9de0b8a165	make: bump golangci-lint to v1.63.4 ( New version has additional checks and reduced resource consumption, so it doesn't timeout for our internal repos. To make linter happy, I addressed "redefinition of the built-in function" lint error. ---- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2025-01-13 07:23:21 +01:00
Roman Khavronenko	0614eb97a5	app/vmselect/promql: account for staleness when populating realPrevValue (#8002) When vmselect processes a rollup function, it fetches all the raw samples on the requested `start-end` interval of the query. It then loops through the raw samples, picks the range of samples based on the provided `step` interval and invokes a rollup function for each of the picked ranges of samples. During this processing, vmselect always populates the `realPrevValue` field with the closest previous raw sample value before the picked range of samples. This `realPrevValue` is used by rollup functions like increase_pure or delta to decide whether the counter change happened or not. For example, we get the counter value == 1. If we've seen this counter before and its value was also 1 - then no change happened. If we didn't see it before, then this counter should have started with value=0 and we need to account for `1-0=1` change. All this is required to deal with situations when scrapes are missing or `step` is too small. However, vmselect doesn't check how "old" the `realPrevValue` is. In other words, it doesn't respect the staleness interval when picking it. As a result, depending on the `start` and `end` params, vmselect can use a `realPrevValue` which is a couple of hours old and is unlikely to be a temporary scrape failure. As a result, some increases can be incorrectly ignored by vmselect. This change makes sure that vmselect doesn't populate `realPrevValue` with samples that are older than the staleness interval. ### Checklist The following checks are mandatory: - [ x ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). ------------------- To reproduce, create a dataset with one metric `foo` which has samples with value=1 on an interval of a couple of hours with a resolution of 15s, and a gap of an hour in the middle: <img width="769" alt="image" src="https://github.com/user-attachments/assets/a39b2740-b741-45f8-ad18-093b7c57c3b3" /> Then run the `increase(foo[1m])` expression on this time range (disable cache): <img width="1472" alt="image" src="https://github.com/user-attachments/assets/463cece1-f359-4c75-a96c-60092a31cab2" /> As a result, there will be one increase at the beginning of the series, and no increase after the gap. Then change the time range so it starts in the middle of the gap: <img width="1505" alt="image" src="https://github.com/user-attachments/assets/f4a460c3-9fd1-4ec7-ab47-15e716ec1019" /> Now, there is an increase>0 because the `realPrevValue` wasn't populated. This is wrong, because it hides the increase of the series. With the fix, the original increase query on full time range should show 2 increases: <img width="1492" alt="image" src="https://github.com/user-attachments/assets/aa9d8a6b-7b22-41f6-9eb9-83b3113a6982" /> Signed-off-by: hagen1778 <roman@victoriametrics.com>	2025-01-13 07:23:17 +01:00
Zakhar Bessarab	8f8b29355d	app/vmselect/promql: set tenant information for numbers Since `44b071296d`, the `evalNumber` function no longer updates MetricName tenancy information. This leads to a mismatch in metric names between the query result and the evaluated number for all tenants other than 0:0. For example, the query `count(up) or 0` will return different results for tenants 0:0 and 1:1 (assuming up is present for both tenants): - tenant 0:0 - will only contain result of `count(up)` - tenant 1:1 - will return both `count(up)` and `0` since metric names will not be matched This restores setting of tenancy information for the metric name for single-tenant queries. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7987 --- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2025-01-10 09:57:58 +01:00
YuDong Tang	ec73b22d24	app/select: add command-line flag -search.maxBinaryOpPushdownLabelValues ### Describe Your Changes Binary operations like `exprFirst op exprSecond` in VictoriaMetrics are performed in the following way: 1. Execute exprFirst. 2. Extract common label filters from the result of step 1. 3. Apply these common label filters to `exprSecond` and execute it, in order to retrieve fewer time series from vmstorage nodes. In step 2, only labels with fewer than `100` (hard-coded) values could be used as a common label filter (e.g. `{common_lb=~"v1|v2|...|v100"}`). In our scenarios, a label (take the `instance` label as an example) could have thousands of candidate values. Although this puts more pressure on vmstorage nodes, it is still beneficial if labels with more than 100 values can be used as filters in `exprSecond`, given enough vmstorage resources. After adjusting the value from `100` to `10000`, our query round-trip time dropped significantly from 5s to 2s. This pull request changes the hard-coded value into a configurable flag.	2025-01-03 13:19:44 +01:00
f41gh7	795af212d5	app/vmselect/promql: improve performance of parseCache on systems with many CPU cores Parse cache is a pretty simple implementation of cache. It's just a standard map with mutex. Map with mutex overall has poor performance, plus when the cache overflow occurs, the whole cache locks until 1k elements have been deleted (now it's 10% of 10000 max elements in the cache). To avoid this bottleneck and improve performance of cache on systems with many CPU cores but keep it rather simple, we can implement cache with per bucket locks like it's done in fastcache. The logic and API remain the same. So now each bucket will have a map with approximately 78 elements (with 128 buckets), and overflow will occur now for each bucket, and only 7 elements need to be deleted. Because exec_test.go has about 10k lines of code, it's better to move the cache into a separate file to add tests and benchmarks for it, because now it does not have them. ``` goos: windows goarch: amd64 pkg: github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/promql cpu: 11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz Current cache implementation performance on 8 cores: BenchmarkCachePutNoOverFlow-8 1932 618372 ns/op 253 B/op 0 allocs/op BenchmarkCacheGetNoOverflow-8 6547 211527 ns/op 0 B/op 0 allocs/op BenchmarkCachePutGetNoOverflow-8 1873 621718 ns/op 261 B/op 0 allocs/op BenchmarkCachePutOverflow-8 2262 464328 ns/op 32 B/op 0 allocs/op BenchmarkCachePutGetOverflow-8 1764 655866 ns/op 38 B/op 0 allocs/op New cache implementation performance on 8 cores: BenchmarkCachePutNoOverFlow-8 10408 111412 ns/op 0 B/op 0 allocs/op BenchmarkCacheGetNoOverflow-8 22407 52809 ns/op 0 B/op 0 allocs/op BenchmarkCachePutGetNoOverflow-8 6583 168088 ns/op 0 B/op 0 allocs/op BenchmarkCachePutOverflow-8 9822 117212 ns/op 2 B/op 0 allocs/op BenchmarkCachePutGetOverflow-8 6481 175952 ns/op 3 B/op 0 allocs/op Current cache implementation performance on 16 cores: BenchmarkCachePutNoOverFlow-16 2331 475307 ns/op 218 
B/op 0 allocs/op BenchmarkCacheGetNoOverflow-16 6069 196905 ns/op 0 B/op 0 allocs/op BenchmarkCachePutGetNoOverflow-16 1870 644236 ns/op 262 B/op 0 allocs/op BenchmarkCachePutOverflow-16 2296 509279 ns/op 34 B/op 0 allocs/op BenchmarkCachePutGetOverflow-16 1726 671510 ns/op 45 B/op 0 allocs/op New cache implementation performance on 16 cores: BenchmarkCachePutNoOverFlow-16 13549 82413 ns/op 0 B/op 0 allocs/op BenchmarkCacheGetNoOverflow-16 30274 38997 ns/op 0 B/op 0 allocs/op BenchmarkCachePutGetNoOverflow-16 8512 126239 ns/op 0 B/op 0 allocs/op BenchmarkCachePutOverflow-16 13884 88124 ns/op 1 B/op 0 allocs/op BenchmarkCachePutGetOverflow-16 7903 131299 ns/op 3 B/op 0 allocs/op ``` From the benchmarks above, we can see that the new implementation is ~5 times faster than the old one. --------- Co-authored-by: f41gh7 <nik@victoriametrics.com>	2025-01-02 17:47:54 +01:00
f41gh7	5558841cc1	Fix inconsistent treatment of millisecond-precision time for instant queries (#7767) This PR fixes #5796. See the points 6 and 7 in `Steps to reproduce`: > Now let's set time to only 5ms past the timestamp of the first point, since even 199ms worked for the second point. Surprise, the point isn't returned 💥: > > ```curl -s $VMQURL -d 'query=series1' -d 'time=1707123456705' -d 'step=1ms' | grep 10 # nothing!``` > > But, 4ms works: 🤨🤔 > > ```curl -s $VMQURL -d 'query=series1' -d 'time=1707123456704' -d 'step=1ms' | grep 10 # found``` This happens because the actual step becomes 5ms due to jitter being applied. The fix is to not apply jitter if the scrape interval was not detected (the case when vmstorage returns only one result). In this case the scrape interval is set to `5m+step`. An integration test has been added to check the steps to reproduce and then to confirm that the fix works. Note that the cluster tests are currently disabled because the fix is not in the cluster branch yet. The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>	2024-12-18 22:40:44 +01:00
Andrei Baidarov	439d1b932e	app/vmselect: fix panic/incorrect tenant in key This is a follow-up after `3120dc2` - Consistently use key for rollupCache in multitenant mode cache keys use different authTokens. Previously it could lead to panic in rare cases when cache state was inconsistent. - Do not share `err` variable across goroutines for `processBlock` function. It could lead to data races. Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7549 --------- Signed-off-by: Andrei Baidarov <abaidarov@yandex.ru> Co-authored-by: f41gh7 <nik@victoriametrics.com>	2024-11-25 11:47:24 +01:00
Nikolay	8357c22cc8	app/vmselect: properly return binary pow function result (#7619) Previously, for `^` aka pow function calls, VictoriaMetrics returned `1` if the left arg was NaN. For example, the query `(hour()==2)^1` returned 1 for the NaN produced by `hour() == 2`. This added non-existent data points to the time series. This commit ports the bugfix from the `metricsql` package and adds a test for it. Now, VictoriaMetrics correctly returns `NaN` for such cases. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7359 Signed-off-by: f41gh7 <nik@victoriametrics.com> (cherry picked from commit `bb399518db`)	2024-11-21 15:23:49 +01:00
Andrei Baidarov	3120dc2054	app/vmselect: fixes possible panics for multitenant queries This commit fixes panic for multitenant requests and empty storage node responses for tenants api. It also optimizes `populateSqTenantTokensIfNeeded` function calls, by making it only once for query request. Previously it was incorrectly called multiple times per each storage node request. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7549 --------- Signed-off-by: f41gh7 <nik@victoriametrics.com> Co-authored-by: f41gh7 <nik@victoriametrics.com>	2024-11-15 16:12:30 +01:00
Andrii Chubatiuk	93bc205e05	promql: exclude limit_offset from default by metric name sorting (#7402) ### Describe Your Changes I don't like this solution, but it works. Other possible solutions are described in the issue. Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7068 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `a88f896b43`)	2024-11-06 15:27:29 +01:00
Roman Khavronenko	4114301955	lib/flagutil: rename Duration to RetentionDuration (#7284) The purpose of this change is to reduce confusion between using `flag.Duration` and `flagutil.Duration`. The reason is that `flagutil.Duration` was mistakenly used for cases that required `m` support. See `ab0d31a7b0` The change in name should clearly indicate the purpose of this data type. The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-10-17 11:18:45 -03:00
Roman Khavronenko	f825a9de80	app/vmselect/promql: fix seriesFetched update logic (#7181) ### Describe Your Changes evalInstantRollup could overreport the number of fetched series if the `offset` checks result in a retry. This change updates fetched series only if these checks were successful. It also adds a comment to another potential place of over-reporting series fetched. It doesn't fix it, because it would require spending extra resources on such a check, while a discrepancy in seriesFetched doesn't affect calculations in any way. Probably related to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7170 ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `ebd393d8b3`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-10-07 14:47:22 +02:00
Zakhar Bessarab	44b071296d	vmselect: add support of multi-tenant queries (#6346 ) ### Describe Your Changes Added an ability to query data across multiple tenants. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1434 Currently, the following endpoints work with multi-tenancy: - /prometheus/api/v1/query - /prometheus/api/v1/query_range - /prometheus/api/v1/series - /prometheus/api/v1/labels - /prometheus/api/v1/label/<label_name>/values - /prometheus/api/v1/status/active_queries - /prometheus/api/v1/status/top_queries - /prometheus/api/v1/status/tsdb - /prometheus/api/v1/export - /prometheus/api/v1/export/csv - /vmui A note regarding VMUI: endpoints such as `active_queries` and `top_queries` have been updated to indicate whether query was a single-tenant or multi-tenant, but UI needs to be updated to display this info. cc: @Loori-R --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Signed-off-by: f41gh7 <nik@victoriametrics.com> Co-authored-by: f41gh7 <nik@victoriametrics.com>	2024-10-01 16:37:18 +02:00
Aliaksandr Valialkin	2a17cddf3d	app/vmselect/promql: consistently replace `NaN` data points with non-`NaN` values for `range_first` and `range_last` functions It is expected that range_first and range_last functions return non-nan const value across all the points if the original series contains at least a single non-NaN value. Previously this rule was violated for NaN data points in the original series. This could confuse users. While at it, add tests for series with NaN values across all the range_* and running_* functions, in order to maintain consistent handling of NaN values across these functions.	2024-09-23 15:00:05 +02:00
Aliaksandr Valialkin	4e00e4428e	app/vmselect/promql: properly calculate `c1 and c2` and `c1 or c2` by upgrading github.com/VictoriaMetrics/metricsql to v0.79.0 The fix is in the https://github.com/VictoriaMetrics/metricsql/pull/34 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6637 (cherry picked from commit `b82e2cabc5`)	2024-09-19 15:48:06 +02:00
Aliaksandr Valialkin	01c8e12370	app/vlselect: add /select/logsql/stats_query endpoint, which is going to be used by vmalert Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6942 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6706	2024-09-06 23:00:58 +02:00
Zakhar Bessarab	84b8ea7337	app/vmselect/promql: fix calculation of histogram buckets This issue was introduced in `6a4bd5049b` See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6714 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2024-08-15 10:13:54 +02:00
Aliaksandr Valialkin	f8aa445945	all: consistently use stringsutil.JSONString() for formatting JSON strings with fmt.* functions instead of using "%q" formatter The %q formatter may result in incorrectly formatted JSON string if the original string contains special chars such as \x1b . They must be encoded as \u001b , otherwise the resulting JSON string cannot be parsed by JSON parsers. This is a follow-up for `c0caa69939` See https://github.com/VictoriaMetrics/victorialogs-datasource/issues/24	2024-07-17 14:01:37 +02:00
Aliaksandr Valialkin	7c97cef95c	app: consistently use t.Fatal* instead of t.Error* (except of app/vmalert and app/vmctl - these packages will be processed in a separate commit) Consistently using t.Fatal* simplifies the test code and makes it less fragile, since it is common error to forget to make proper cleanup after t.Error* call. Also t.Error* calls do not provide any practical benefits when some tests fail. They just clutter test output with additional noise information, which do not help in fixing failing tests most of the time. This is a follow-up for `a9525da8a4`	2024-07-11 16:01:25 +02:00
Zakhar Bessarab	401ae72587	app/vmselect/promql: propagate lower bucket values when fixing a histogram (#6547) ### Describe Your Changes In most cases histograms are exposed in a sorted manner with lower buckets first. This means that during scraping, buckets with lower bounds have a higher chance of being updated earlier than upper ones. Previously, values were propagated from upper to lower bounds, which means that in most cases that would produce results higher than expected once all buckets are updated. Propagating from the upper bound effectively limits the highest value of the histogram to the value of the previous scrape. Once the data becomes consistent in the subsequent evaluation, this causes spikes in the result. Changing propagation to be from lower to higher buckets reduces value spikes in most cases due to the nature of the original inconsistency. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4580 An example histogram with previous (red) and updated (blue) versions: ![1719565540](https://github.com/VictoriaMetrics/VictoriaMetrics/assets/1367798/605c5e60-6abe-45b5-89b2-d470b60127b8) This also makes the logic of filling nan values with lower bucket values: [1 2 3 nan nan nan] => [1 2 3 3 3 3] obsolete. Since buckets are now fixed from lower ones to upper, this happens in the main loop, so there is no need for a second one. --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `6a4bd5049b`)	2024-07-10 15:17:08 +02:00
Aliaksandr Valialkin	d6415b2572	all: consistently use 'any' instead of 'interface{}' 'any' type is supported starting from Go1.18. Let's consistently use it instead of 'interface{}' type across the code base, since `any` is easier to read than 'interface{}'.	2024-07-10 00:23:26 +02:00
Aliaksandr Valialkin	bb7406e9c0	app/vmselect/promql: follow-up for `dd0d2c77c8` and `6149adbe10` Use metricsql.IsLikelyInvalid() function for determining whether the given query is likely invalid, i.e. there is a high chance the query is incorrectly written, so it will return unexpected results. The query is invalid most of the time if it passes something other than a series selector into a rollup function. For example: - rate(sum(foo)) - rate(foo + bar) - rate(foo > bar) Important note: the query is considered valid if it misses the lookbehind window in square brackets inside a rollup function, e.g. rate(foo), since this is a very convenient MetricsQL extension to PromQL, and this query returns the expected results most of the time. Other unsafe query types can be added in the future into metricsql.IsLikelyInvalid(). TODO: probably, the -search.disableImplicitConversion command-line flag must be set by default in the future releases of VictoriaMetrics. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4338 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6180 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6450	2024-07-03 00:46:56 +02:00
Roman Khavronenko	df7e300071	app/vmselect/promql: check for ranged vectors in aggr funcs if implicit conversions are disabled (#6450) Check for ranged vector arguments in aggregate expressions when `-search.disableImplicitConversion` or `-search.logImplicitConversion` are enabled. For example, `sum(up[5m])` will fail to execute if these flags are set. ### Checklist The following checks are mandatory: - [*] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `6149adbe10`)	2024-06-17 14:25:43 +02:00
Aliaksandr Valialkin	87338633b1	lib/slicesutil: add helper functions for setting slice length and extending its capacity The added helper functions - SetLength() and ExtendCapacity() - replace error-prone code with simple function calls.	2024-05-12 11:33:49 +02:00
Aliaksandr Valialkin	6b81441ed0	app/vmselect: use strings.EqualFold instead of strings.ToLower where appropriate strings.EqualFold doesn't allocate memory, contrary to strings.ToLower, if the input string contains uppercase chars.	2024-05-12 10:21:24 +02:00
Aliaksandr Valialkin	536d87cd51	app/vmselect/promql: properly estimate the needed amounts of memory for executing aggregate function over rollup function in incremental mode Incremental aggregation processes only GOMAXPROCS time series at a time, so its memory usage doesn't depend on the number of input time series. The issue has been introduced in `5138eaeea0` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3203	2024-05-12 10:14:27 +02:00
Hui Wang	7fdea4b31c	app/vmselect: implement cmd-line flags `-search.disableImplicitConversion` and `-search.logImplicitConversion` (#6180) Addresses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4338 Support disabling or logging [implicit conversions](https://docs.victoriametrics.com/metricsql/#implicit-query-conversions) for subqueries with cmd-line flags `-search.disableImplicitConversion` and `-search.logImplicitConversion` Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `dd0d2c77c8`)	2024-04-25 13:08:05 +02:00
Aliaksandr Valialkin	fba3c10ed1	app/vmselect/promql: add support for matching against multiple numeric constants via `q == (c1,...,cN)` and `q != (c1,...,cN)` syntax	2024-04-19 17:57:09 +02:00
Aliaksandr Valialkin	64938732e3	all: replace old https://docs.victoriametrics.com/MetricsQL.html url with the new one - https://docs.victoriametrics.com/metricsql/	2024-04-18 02:15:33 +02:00
Aliaksandr Valialkin	00f59d6ddf	all: fix golangci-lint(revive) warnings after `0c0ed61ce7` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6001	2024-04-03 03:00:45 +03:00
Aliaksandr Valialkin	fed04e7f25	app/vmselect/promql: use unsafe.Slice instead of deprecated reflect.SliceHeader	2024-02-29 17:51:16 +02:00
Hui Wang	25e454be4c	metricsql: fix label_join() when `dst_label` is equal to one of the `… (#5886 ) * metricsql: fix label_join() when `dst_label` is equal to one of the `src_label` * Update app/vmselect/promql/transform.go * Update docs/CHANGELOG.md --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-29 16:02:48 +02:00
Aliaksandr Valialkin	b3dbbc22b9	app/vmselect/promql: properly handle args in count_values_over_time() function Previously they were swapped - the first arg should be the label name and the second arg should be label filters. This is a follow-up for e389b7b959e8144fdff5075bf7a5a39b2b0c6dd3 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5847	2024-02-25 01:48:37 +02:00
Aliaksandr Valialkin	63d635a5e4	app: consistently use atomic.* types instead of atomic.* functions See `ea9e2b19a5`	2024-02-24 03:06:14 +02:00
Aliaksandr Valialkin	477fdc21aa	app/vmselect/promql: add `count_values_over_time()` MetricsQL function See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5847	2024-02-23 01:05:31 +02:00
Aliaksandr Valialkin	65fb54ab8f	app/vmselect/promql: move needSilenceIntervalForRollupFunc from eval.go to rollup.go This should improve maintainability of the code related to rollup functions, since it is located in rollup.go While at it, properly return empty results from holt_winters(), rate_over_sum(), sum2_over_time(), geomean_over_time() and distinct_over_time() when there are no real samples on the selected lookbehind window. Previously the previous sample value was mistakenly returned from these functions.	2024-02-23 01:05:11 +02:00
Aliaksandr Valialkin	5f2905d120	app/vmselect: add sum_eq_over_time, sum_gt_over_time and sum_le_over_time functions to MetricsQL See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4641	2024-02-13 23:40:30 +02:00
Aliaksandr Valialkin	202d8e2c40	docs: update -help output after `61d9df4c36` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/834	2024-02-08 14:50:56 +02:00
Aliaksandr Valialkin	b18e608016	app/vmselect: add ability to reset rollup result cache on startup by passing -search.resetRollupResultCacheOnStartup command-line flag Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/834	2024-02-08 14:42:15 +02:00
Aliaksandr Valialkin	def5573f92	app/vmselect/promql: properly handle precision errors in rollup functions. changes(), increases_over_time() and resets() shouldn't take into account value changes, which may occur because of precision errors. The maximum guaranteed precision for raw samples stored in VictoriaMetrics is 12 decimal digits. So do not count relative changes for values if they are smaller than 1e-12 compared to the value. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/767	2024-02-08 02:33:17 +02:00
Aliaksandr Valialkin	2033fe4caf	app/vmselect/promql: really keep metric names when keep_metric_names modifier is applied to binary operator Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5556	2024-01-31 02:33:06 +02:00
Roman Khavronenko	02e609b141	app/vmselect: set proper timestamp for cached instant responses (#5723) The change updates `getSumInstantValues` to prefer the timestamp from the most recent results. Before, the timestamp from cached series was used. The old behavior had a negative impact on recording rules as they were getting responses with timestamps shifted into the past. Subsequent recording or alerting rules fetching results of these recording rules could get no result due to the staleness interval. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5659 Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-30 22:20:16 +02:00
Aliaksandr Valialkin	e20cbfcbc3	app/vmselect/promql: remove superfluous memory allocations in aggrPrepareSeries() While at it, also remove unneeded map lookup	2024-01-23 02:29:14 +02:00

594 commits