Commit graph

48 commits

Author SHA1 Message Date
Andrei Baidarov
3120dc2054
app/vmselect: fixes possible panics for multitenant queries
This commit fixes panic for multitenant requests and empty storage node responses for tenants api.

 It also optimizes `populateSqTenantTokensIfNeeded` function calls, by making it only once for query request. Previously it was incorrectly called multiple times per each storage node request.

Related issue: 
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7549
---------
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: f41gh7 <nik@victoriametrics.com>
2024-11-15 16:12:30 +01:00
Zakhar Bessarab
44b071296d
vmselect: add support of multi-tenant queries (#6346)
### Describe Your Changes

Added an ability to query data across multiple tenants. See:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1434

Currently, the following endpoints work with multi-tenancy:
- /prometheus/api/v1/query
- /prometheus/api/v1/query_range
- /prometheus/api/v1/series
- /prometheus/api/v1/labels
- /prometheus/api/v1/label/<label_name>/values
- /prometheus/api/v1/status/active_queries
- /prometheus/api/v1/status/top_queries
- /prometheus/api/v1/status/tsdb
- /prometheus/api/v1/export
- /prometheus/api/v1/export/csv
- /vmui


A note regarding VMUI: endpoints such as `active_queries` and
`top_queries` have been updated to indicate whether query was a
single-tenant or multi-tenant, but UI needs to be updated to display
this info.
cc: @Loori-R 

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: f41gh7 <nik@victoriametrics.com>
2024-10-01 16:37:18 +02:00
Aliaksandr Valialkin
87338633b1
lib/slicesutil: add helper functions for setting slice length and extending its capacity
The added helper functions - SetLength() and ExtendCapacity() - replace error-prone code with simple function calls.
2024-05-12 11:33:49 +02:00
Aliaksandr Valialkin
63d635a5e4
app: consistently use atomic.* types instead of atomic.* functions
See ea9e2b19a5
2024-02-24 03:06:14 +02:00
Aliaksandr Valialkin
202d8e2c40
docs: update -help output after 61d9df4c36
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/834
2024-02-08 14:50:56 +02:00
Aliaksandr Valialkin
b18e608016
app/vmselect: add ability to reset rollup result cache on startup by passing -search.resetRollupResultCacheOnStartup command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/834
2024-02-08 14:42:15 +02:00
Aliaksandr Valialkin
7ca8ebef20
app/vmselect/promql: properly handle duplicate series when merging cached results with the results obtained from the database
evalRollupFuncNoCache() may return time series with identical labels (aka duplicate series)
when performing queries satisfying all the following conditions:

- It must select time series with multiple metric names. For example, {__name__=~"foo|bar"}
- The series selector must be wrapped into rollup function, which drops metric names. For example, rate({__name__=~"foo|bar"})
- The rollup function must be wrapped into aggregate function, which has no streaming optimization.
  For example, quantile(0.9, rate({__name__=~"foo|bar"})

In this case VictoriaMetrics shouldn't return `cannot merge series: duplicate series found` error.
Instead, it should fall back to query execution with disabled cache.

Also properly store the merged results. Previously they were incorrectly stored because of a typo
introduced in the commit 41a0fdaf39

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5332
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5337
2023-11-16 16:16:17 +01:00
Aliaksandr Valialkin
d9ecc3f6d7
lib/logger: add -loggerMaxArgLen command-line flag for fine-tuning the maximum length of logged args 2023-11-13 09:43:49 +01:00
Aliaksandr Valialkin
7fc5178a4b
app/vmselect/promql: add missing trace message in rollupResultCache.GetSeries() 2023-11-02 09:17:13 +01:00
Aliaksandr Valialkin
c5e3b11762
app/vmselect/promql: apply SLO-like optimization to all the count_*_over_time() functions
This is a follow-up for 41a0fdaf39
2023-11-01 09:58:50 +01:00
Aliaksandr Valialkin
9661918bb4
app/vmselect/promql: optimize repeated SLI-like instant queries with lookbehind windows >= 1d
Repeated instant queries with long lookbehind windows, which contain one of the following rollup functions,
are optimized via partial result caching:

- sum_over_time()
- count_over_time()
- avg_over_time()
- increase()
- rate()

The basic idea of optimization is to calculate

  rf(m[d] @ t)

as

  rf(m[offset] @ t) + rf(m[d] @ (t-offset)) - rf(m[offset] @ (t-d))

where rf(m[d] @ (t-offset)) is cached query result, which was calculated previously

The offset may be in the range of up to 1 hour.
2023-10-31 20:08:38 +01:00
Nikolay
4a50e9400c
app/vmselect: reduce lock contention for heavy aggregation requests (#5119)
reduce lock contention for heavy aggregation requests
previously lock contetion may happen on machine with big number of CPU due to enabled string interning. sync.Map was a choke point for all aggregation requests.
Now instead of interning, new string is created. It may increase CPU and memory usage for some cases.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5087
2023-10-10 13:44:02 +02:00
Aliaksandr Valialkin
c4638553a3
lib/fs: rename WriteFileAtomically to MustWriteAtomic
Callers of this function log the returned error and exit.
So let's just log the error with the given filepath and the call stack
inside the function itself and then exit. This simplifies the code
at callers' place while leaves the same level of debuggability in case of errors.
2023-04-13 22:43:30 -07:00
Aliaksandr Valialkin
3b4a3583bc
app/vmselect/promql: prevent from cannot unmarshal timeseries from rollupResultCache panic after the upgrade to v1.89.0 2023-03-12 19:09:11 -07:00
Aliaksandr Valialkin
7956b0d974
app/vmselect/promql: pre-allocate memory for values to be merged in mergeTimeseries()
This should reduce the number of memory re-allocations
2023-01-09 22:52:19 -08:00
Aliaksandr Valialkin
895d5d9d22
app/vmselect/promql: consistently intern series names obtained from marshalMetricNameSorted
This reduces memory allocations when the returned series names are used as map keys later
2023-01-09 22:46:30 -08:00
Aliaksandr Valialkin
12e2bcdf81
app/vmselect/promql: avoid memory allocations and copying from source timeseries to the returned result at timeseriesToResult() 2023-01-09 22:39:15 -08:00
Aliaksandr Valialkin
ecb71a7221
lib/fs: add canOverwrite arg to WriteFileAtomically when it is allowed to overwrite the file atomically if it already exists 2022-10-26 01:08:35 +03:00
Aliaksandr Valialkin
c92aef39b5
app/vmselect/promql: expose missing metric vm_cache_size_max_bytes{type="promql/rollupResult"} 2022-10-23 12:14:07 +03:00
Aliaksandr Valialkin
06f6de6d47
all: use os.{Read|Write}File instead of ioutil.{Read|Write}File
The ioutil.{Read|Write}File is deprecated since Go1.16 -
see https://tip.golang.org/doc/go1.16#ioutil

VictoriaMetrics needs at least Go1.18, so it is safe to remove ioutil usage
from source code.

This is a follow-up for 02ca2342ab
2022-08-21 23:55:20 +03:00
Aliaksandr Valialkin
4fb0f15322
all: readability improvements for query traces
- show dates in human-readable format, e.g. 2022-05-07, instead of a numeric value
- limit the maximum length of queries and filters shown in trace messages
2022-06-30 18:19:43 +03:00
Aliaksandr Valialkin
6386f117c8
all: show timeRange in traces in human-readable format instead of timestamps in milliseconds 2022-06-27 13:42:57 +03:00
Aliaksandr Valialkin
a9ea3fee38
lib/querytracer: make it easier to use by passing trace context message to New and NewChild
The context message can be extended by calling Donef.
If there is no need to extend the message, then just call Done.
2022-06-08 21:16:12 +03:00
Aliaksandr Valialkin
afced37c0b
all: add initial support for query tracing
See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#query-tracing

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403
2022-06-01 02:31:44 +03:00
Aliaksandr Valialkin
3605e1743b
app/vmselect/promql: allow calling InitRollupResultCache+StopRollupResultCache multiple times during tests 2022-04-11 12:32:23 +03:00
Aliaksandr Valialkin
244c23ea2c
lib/workingsetcache: reduce the default cache rotation period from hour to 20 minutes
This should reduce memory usage under high time series churn rate
2022-02-23 13:42:27 +02:00
Aliaksandr Valialkin
54d4a1c959
app/vmselect: accept optional extra_filters[] query args for all the supported Prometheus querying APIs
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1863
2021-12-06 17:33:49 +02:00
Aliaksandr Valialkin
56b08390f6 app/vmselect/promql: allow to use 2x more memory for query processing in cluster mode compared to single-node mode
`vmselect` has no `vmstorage`-related caches. So it can use more memory for query processing compared to single-node VictoriaMetrics.
2021-05-12 14:43:49 +03:00
Aliaksandr Valialkin
92531a38c4 app/vmselect/promql: increment key prefix for faster reset for rollup result cache 2021-03-22 11:59:39 +02:00
Nikolay
673b10dd7f adds enforced tag filters into cache key (#1095) 2021-02-27 00:23:38 +02:00
Aliaksandr Valialkin
0c00fe70cf app/vmselect: do not adjust start and end query args passed to /api/v1/query_range when -search.disableCache command-line flag is set
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/563
2020-07-30 23:14:56 +03:00
Aliaksandr Valialkin
d962568e93 all: use %w instead of %s for wrapping errors in fmt.Errorf
This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode .
See https://blog.golang.org/go1.13-errors for details.
2020-06-30 23:33:46 +03:00
Aliaksandr Valialkin
3845420a8f lib: extract common code for returning fast unix timestamp into lib/fasttime 2020-05-14 23:06:50 +03:00
Aliaksandr Valialkin
9ed4951ec8 lib/metricsql: move it to a separate repository - github.com/VictoriaMetrics/metrics 2020-04-28 15:30:06 +03:00
Aliaksandr Valialkin
f22aefdb16 app/vmselect/promql: log when rollupResult cache is cleared 2020-02-21 20:06:53 +02:00
Aliaksandr Valialkin
d5c2a0ce64 app/vmselect: add -search.cacheTimestampOffset command-line flag
This flag can be used for removing gaps on graphs if the difference between the current time
and the timestamps from the ingested data exceeds 5 minutes.

This is the case when the time between data sources and VictoriaMetrics is improperly synchronized.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/312
2020-02-21 14:02:15 +02:00
Aliaksandr Valialkin
ea53a21b02 all: consistently log durations in seconds with millisecond precision
This should improve logs readability
2020-01-22 18:35:24 +02:00
Aliaksandr Valialkin
9a1f6848ca app/vmselect/promql: fix results caching for multi-arg rollup functions such as quantile_over_time
Previosly only a single arg was taken into account, so caching didn't work properly for multi-arg rollup funcs.
2020-01-03 20:44:54 +02:00
Aliaksandr Valialkin
453d71d082 Rename lib/promql to lib/metricsql and apply small fixes 2019-12-25 22:09:09 +02:00
Mike Poindexter
009d1559db Split Extended PromQL parsing to a separate library 2019-12-25 22:09:07 +02:00
Aliaksandr Valialkin
90a4b00b10 app/vmselect/promql: fix panic on -search.disableCache
Reset the cache if it is disabled instead of stopping, since it is stopped on graceful shutdown.
2019-08-21 17:12:01 +03:00
Aliaksandr Valialkin
b8bbe92de1 app/vmselect/promql: store compressed results in the cache
This should increase rollup results cache capacity.
2019-08-14 02:32:16 +03:00
Aliaksandr Valialkin
8c2158af24 all: use workingsetcache instead of fastcache
This should reduce the amount of RAM required for processing time series
with non-zero churn rate.

The previous cache behavior can be restored with `-cache.oldBehavior` command-line flag.
2019-08-13 21:40:28 +03:00
Aliaksandr Valialkin
cbab86fd9d app/vmselect/promql: reduce RAM usage for aggregates over big number of time series
Calculate incremental aggregates for `aggr(metric_selector)` function instead of
keeping all the time series matching the given `metric_selector` in memory.
2019-07-10 13:03:36 +03:00
Aliaksandr Valialkin
ba8195c58e all: consistency renaming: bytesSize -> sizeBytes 2019-07-10 00:47:42 +03:00
Aliaksandr Valialkin
2ff0d595b0 app/vmselect/promql: add -search.disableCache flag for disabling response caching
This may be useful for data back-filling, when the response caching
could interfere badly with newly added data points with timestamps
in the past.
2019-06-04 17:30:41 +03:00
Aliaksandr Valialkin
24578b4bb1 all: open-sourcing cluster version 2019-05-23 00:25:38 +03:00
Aliaksandr Valialkin
1836c415e6 all: open-sourcing single-node version 2019-05-23 00:18:06 +03:00