Commit graph

110 commits

Author SHA1 Message Date
Aliaksandr Valialkin
41a0fdaf39
app/vmselect/promql: optimize repeated SLI-like instant queries with lookbehind windows >= 1d
Repeated instant queries with long lookbehind windows, which contain one of the following rollup functions,
are optimized via partial result caching:

- sum_over_time()
- count_over_time()
- avg_over_time()
- increase()
- rate()

The basic idea of optimization is to calculate

  rf(m[d] @ t)

as

  rf(m[offset] @ t) + rf(m[d] @ (t-offset)) - rf(m[offset] @ (t-d))

where rf(m[d] @ (t-offset)) is cached query result, which was calculated previously

The offset may be in the range of up to 1 hour.
2023-10-31 19:25:23 +01:00
Aliaksandr Valialkin
51aab7bb17
app/vmselect/promql: wrap too long line after a950873fff 2023-10-31 18:59:10 +01:00
Roman Khavronenko
a950873fff
app/vmselect: expose vm_memory_intensive_queries_total counter metric (#5208)
The new metric gets increased each time `-search.logQueryMemoryUsage` memory limit
is exceeded by a query. This metric should help to identify expensive and heavy queries
without inspecting the logs.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-31 13:31:09 +01:00
Nikolay
1f91f22b5f
app/vmselect: reduce lock contention for heavy aggregation requests (#5119)
reduce lock contention for heavy aggregation requests
previously lock contetion may happen on machine with big number of CPU due to enabled string interning. sync.Map was a choke point for all aggregation requests.
Now instead of interning, new string is created. It may increase CPU and memory usage for some cases.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5087
2023-10-10 13:45:20 +02:00
Aliaksandr Valialkin
072d891ed9
app/vmselect: prevent from panic when lookbehind window inside rollup function is parsed into negative value
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4795
2023-08-12 04:47:53 -07:00
Aliaksandr Valialkin
4cb024d8a3
all: add support for or filters in series selectors
This commit adds ability to select series matching distinct filters via a single series selector.
For example, the following selector selects series with either {env="prod",job="a"}
or {env="dev",job="b"} labels:

  {env="prod",job="a" or env="dev",job="b"}

The `or` filter is supported in all the VictoriaMetrics tools now.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3997
Uses https://github.com/VictoriaMetrics/metricsql/pull/14
2023-07-16 00:06:33 -07:00
Haleygo
20e7db47ee
vmselect: fix result in Prometheus query when time is small (#4578)
vmselect: fix result in Prometheus query when time is small

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-07-07 11:48:05 +02:00
Aliaksandr Valialkin
622000797a
app/vmselect: follow-up for 10ab086366
- Expose stats.seriesFetched at `/api/v1/query_range` responses too
  for the sake of consistency.

- Initialize QueryStats when it is needed and pass it to EvalConfig then.
  This guarantees that the QueryStats is properly collected when the query
  contains some subqueries.
2023-03-27 15:22:00 -07:00
Roman Khavronenko
4021aa11b5
app/vmselect: export seriesFetched stat for /query responses (#3925)
The change adds a new field `seriesFetched` to EvalConfig object.
Since EvalConfig object can be copied inside `Exec`,
`seriesFetched` is a pointer which can be updated by all copied
objects.

The reason for having stats is that other components, like vmalert,
could benefit from this information.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-03-27 15:18:25 -07:00
Aliaksandr Valialkin
2b851e69d2
app/vmselect/promql: typo fix after e7f46a0aab 2023-03-24 23:46:30 -07:00
Aliaksandr Valialkin
e7f46a0aab
app/vmselect/promql: follow-up for 7205c79c5a
- Allocate and initialize seriesByWorkerID slice in a single go instead
  of initializing every item in the list separately.
  This should reduce CPU usage a bit.
- Properly set anti-false sharing padding at timeseriesWithPadding structure
- Document the change at docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966
2023-03-24 23:34:37 -07:00
Zakhar Bessarab
7205c79c5a
app/vmselect/promql: use lock-less approach to gather results of parallel processing for evalRollup* funcs (#4004)
* vmselect/promql: refactor `evalRollupNoIncrementalAggregate` to use lock-less approach for parallel workers computation

Locking there is causing issues when running on highly multi-core system as it introduces lock contention during results merge.

New implementation uses lock less approach to store results per workerID and merges final result in the end, this is expected to significantly reduce lock contention and CPU usage for systems with high number of cores.

Related: #3966
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* vmselect/promql: add pooling for `timeseriesWithPadding` to reduce allocations

Related: #3966
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* vmselect/promql: refactor `evalRollupFuncWithSubquery` to avoid using locks

Uses same approach as `evalRollupNoIncrementalAggregate` to remove locking between workers and reduce lock contention.

Related: #3966
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-03-24 23:07:12 -07:00
Aliaksandr Valialkin
e480b9881e
app/vmselect/promql: pass workerID to the callback inside doParallel()
This opens the possibility to remove tssLock from evalRollupFuncWithSubquery()
in the follow-up commit from @zekker6 in order to speed up the code
for systems with many CPU cores.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3966
2023-03-20 20:54:57 -07:00
Aliaksandr Valialkin
c87c7d1e29
app/vmselect/promql: measure the time required for calculating the aggregate function from the prepared source time series 2023-02-23 20:05:14 -08:00
Aliaksandr Valialkin
b285207aa7
app/vmselect: add -search.logQueryMemoryUsage command-line flag for logging queries, which take big amounts of memory
Thanks to @michal-kralik for initial attempts for this feature:

- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3651
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3715

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3553
2023-02-23 18:47:08 -08:00
Oleksandr Redko
9fff48c3e3
app,lib: fix typos in comments (#3804) 2023-02-13 13:27:13 +01:00
Aliaksandr Valialkin
27afe7bc38
app/vmselect/promql: reduce the number of memory allocations inside getCommonLabelFilters()
This should improve performance a bit for `q1 op q2` queries
2023-01-15 13:03:23 -08:00
Aliaksandr Valialkin
7067e8206c
app/vmselect/promql: reduce memory allocations at getCommonLabelFilters() function
Intern tag keys and values there
2023-01-12 01:27:41 -08:00
Aliaksandr Valialkin
31fc29599f
app/vmselect/promql: move the eval function args in parallel query trace outside the loop 2023-01-10 22:23:30 -08:00
Aliaksandr Valialkin
b796a0dc3f
app/vmselect/promql: optimize e1 op e2 when e1 returns an empty result
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3349
2022-11-21 16:09:10 +02:00
Aliaksandr Valialkin
b8da90b893
app/vmselect/promql: properly handle zero and negative values for -search.maxMemoryPerQuery
This is a follow-up for 04a05f161c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3203
2022-10-12 09:25:17 +03:00
Aliaksandr Valialkin
04a05f161c
app/vmselect: return back the logic for limits the amounts of memory occupied by concurrently executed queries if -search.maxMemoryPerQuery isn't set
This is needed for preserving backwards compatibility with the previous releases of VictoriaMetrics.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3203
2022-10-10 21:45:13 +03:00
Aliaksandr Valialkin
5138eaeea0
app/vmselect: allow limiting per-query memory usage via -search.maxMemoryPerQuery command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3203
2022-10-08 01:08:05 +03:00
Dmytro Kozlov
4415c71a2b
vmselect/{promql, prometheus}: show flag names which user can update in error message (#3049)
* vmselect/{promql, prometheus}: show flag names which user can update in error message

* vmselect/{promql, prometheus}: fix typo
2022-09-06 13:25:59 +03:00
Aliaksandr Valialkin
4076277cf0
app/vmselect/promql: evaluate union() args in parallel in order to increase query performance
Note that the parallel execution of `union()` args may take more memory and CPU time
than the sequential execution if args contain heavy queries, which may load all the available CPU,
disk and memory resources and vmselect and vmstorage levels.
2022-09-02 19:46:27 +03:00
Aliaksandr Valialkin
ad11b8d83d
app/vmselect/promql: follow-up after 2d71b4859c
- Use getScalar() function for obtaining the expected scalar from phi arg
- Reduce the error message returned to the user when incorrect phi is passed to histogram_quantiles
- Improve the description of this bugfix in the docs/CHANGELOG.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3026
2022-08-27 01:35:49 +03:00
Dmytro Kozlov
463ea6897b
vmselect/promql: enable search.maxPointsSubqueryPerTimeseries for sub-queries (#2963)
* vmselect/promql: enable search.maxPointsPerTimeSeriesSubquery for sub-queries

* vmselect/promql: cleanup

* vmselect/promql: rename config flag

* vmselect/promql: add tests

* vmselect/promql: use test object instead of log

* vmselect/promql: fix posible panic is subquery has more points. add description

* vmselect/promql: update tests descriptions

* vmselect/promql: update doInternal validation

* vmselect/promql: fix linter

* vmselect/promql: fix linter

* vmselect/promql: update documentation and release notes

* wip

- Properly apply -search.maxPointsSubqueryPerTimeseries limit to subqueries.
  Previously the -search.maxPointsPerTimeseries limit was unexpectedly applied to subqueries
  if it was smaller than the -search.maxPointsSubqueryPerTimeseries .
- Clarify docs for -search.maxPointsSubqueryPerTimeseries command-line flag .
- Document -search.maxPointsPerTimeseries and -search.maxPointsSubqueryPerTimeseries flags at https://docs.victoriametrics.com/#resource-usage-limits .
- Update docs/CHANGELOG.md .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2922

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-08-24 15:25:18 +03:00
Aliaksandr Valialkin
4ac79d29ad
app/vmselect: follow-up after 63e0f16062
* Explicitly store a pointer to UserReadableError in the error interface.
  Previously Go automatically converted the value to a pointer before storing in the error interface.

* Add Unwrap() method to UserReadableError, so it can be used transparently with the other code,
  which calls errors.Is() and errors.As().

* Document the change in docs/CHANGELOG.md
2022-08-15 13:50:16 +03:00
Roman Khavronenko
63e0f16062
vmselect: introduce UserReadableError type of error (#2894)
When read query fails, VM returns rich error message with
all the details. While these details might be useful
for debugging specific cases, they're usually too verbose
for users.
Introducing a new error type `UserReadableError` is supposed
to allow to return to user only the most important parts
of the error trace. This supposed to improve error readability
in web interfaces such as VMUI or Grafana.

The full error trace is still logged with the full context
and can be found in vmselect logs.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-15 13:38:47 +03:00
Roman Khavronenko
9ccf695d57
vmselect: return correct error for second part of expression (#2893)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-20 16:44:28 +02:00
Aliaksandr Valialkin
cc7d499bbd
app/vmselect/promql: execute q1 and q2 from q1 op q2 in parallel if labels pushdown cannot be applied
This should improve query performance if VictoriaMetrics has enough resources for processing `q1` and `q2` in parallel.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2886
2022-07-19 14:27:48 +03:00
Aliaksandr Valialkin
c2197ad139
app/vmselect/promql: validate function name before evaluating its arguments
This avoids unneeded evaluation of args for unknown functions
2022-07-12 19:48:26 +03:00
Aliaksandr Valialkin
3e2dd85f7d
all: readability improvements for query traces
- show dates in human-readable format, e.g. 2022-05-07, instead of a numeric value
- limit the maximum length of queries and filters shown in trace messages
2022-06-30 18:20:33 +03:00
Aliaksandr Valialkin
a14188dd8e
app/vmselect: expose additional histograms at /metrics page, which may help get more insights for the query workload
This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2792
2022-06-28 20:18:13 +03:00
Aliaksandr Valialkin
a43f2d0bc5
app/vmselect/promql: show the number of scanned samples in the query trace 2022-06-28 19:26:17 +03:00
Aliaksandr Valialkin
e578549b8a
app/vmselect: optimize /api/v1/series a bit for time ranges smaller than one day 2022-06-28 13:02:47 +03:00
Aliaksandr Valialkin
a963b2a0aa
all: show timeRange in traces in human-readable format instead of timestamps in milliseconds 2022-06-27 13:45:51 +03:00
Aliaksandr Valialkin
12ac255dae
lib/querytracer: make it easier to use by passing trace context message to New and NewChild
The context message can be extended by calling Donef.
If there is no need to extend the message, then just call Done.
2022-06-08 21:06:52 +03:00
Aliaksandr Valialkin
41958ed5dd
all: add initial support for query tracing
See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#query-tracing

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403
2022-06-01 02:29:23 +03:00
Aliaksandr Valialkin
6e364e19ef
app/vmselect: add fine-grained limits for the number of returned/scanned time series for various APIs 2022-03-26 11:29:49 +02:00
Aliaksandr Valialkin
0d47c23a03
app/vmselect/promql: reduce the maximum number of label values, which can be propagated from one side of the binary operation to another side of the binary operation from 10K to 1K
There are user reports that 10K unique values in a single label filter may lead to performance and memory usage issues
2022-02-24 04:05:18 +02:00
Aliaksandr Valialkin
b1f94f7f0e
app/vmselect/promql: return at most one time series from absent_over_time() in the same way as Prometheus does
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2130
2022-02-12 15:45:09 +02:00
Aliaksandr Valialkin
96b7de6736
app/vmselect/promql: clarify comments on why the right side of if and and operators are executed at first 2022-02-03 00:26:14 +02:00
Aliaksandr Valialkin
4b850c2a59
app/vmselect/promql: do not push down filters, which enumerate more than 10k unique values
Such filters may slow down time series search, so just skip them.

This is a follow-up for e7f1ceeb84

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1827
2022-02-02 23:40:02 +02:00
Aliaksandr Valialkin
2016a2c899
app/vmselect/promql: properly handle foo or bar queries
Such queries may miss `bar` results after the commit e7f1ceeb84
because common label filters from `foo` could be mistakenly applied to `bar`.
2022-02-01 17:40:51 +02:00
Aliaksandr Valialkin
a8d22e1223
app/vmselect/promql: check for binary operation in case-insensitive manner when deciding which side of the operation to perform the first
PromQL and MetricsQL operators are case-insensitive
2022-01-31 22:06:15 +02:00
Aliaksandr Valialkin
e7f1ceeb84
app/vmselect/promql: optimize queries, which join on _info metrics.
Automatically add common filters from one side of binary operation
to the other side before sending the query to storage subsystem.

See https://grafana.com/blog/2021/08/04/how-to-use-promql-joins-for-more-effective-queries-of-prometheus-metrics-at-scale/
and https://www.robustperception.io/exposing-the-software-version-to-prometheus
2022-01-31 19:32:36 +02:00
Aliaksandr Valialkin
dc7b63a793
app/vmselect/promql: properly keep metric names when optimized path is used for aggregate function calculations
For example, `sum(rate(...) keep_metric_names) by (__name__)` didn't leave the original metric name because of this issue.

This is a follup-up for 1bdc71d917

Udates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/949
2022-01-17 15:30:30 +02:00
Aliaksandr Valialkin
1bdc71d917
app/vmselect/promql: implement keep_metric_names modifier for transform and rollup functions
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/949
2022-01-14 04:14:59 +02:00
Aliaksandr Valialkin
f41846d002
app/vmselect/promql: add stale_samples_over_time() function 2022-01-14 01:48:04 +02:00