app/select: add command-line flag -search.maxBinaryOpPushdownLabelValues

### Describe Your Changes

Binary operations like `exprFirst op exprSecond` in VictoriaMetrics are
performed in the following way:
1. Execute `exprFirst`.
2. Extract **common label filters** from the result of step 1.
3. Apply these common label filters to `exprSecond` and execute it, in
order to retrieve fewer time series from vmstorage nodes.
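
The three steps above can be sketched in Go roughly as follows. This is a simplified illustration, not the vmselect implementation: the `labels` type and the `commonLabelValues` and `selectWithFilters` helpers are hypothetical names, storage is simulated with an in-memory slice, and the rule that a label must be present on every series from `exprFirst` is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// labels is a hypothetical stand-in for the label set of one time series.
type labels map[string]string

// commonLabelValues returns, for each label present on every series in tss,
// the sorted list of distinct values seen for that label (step 2).
func commonLabelValues(tss []labels) map[string][]string {
	seen := make(map[string]map[string]bool)
	hits := make(map[string]int)
	for _, ts := range tss {
		for name, value := range ts {
			if seen[name] == nil {
				seen[name] = make(map[string]bool)
			}
			seen[name][value] = true
			hits[name]++
		}
	}
	common := make(map[string][]string)
	for name, values := range seen {
		if hits[name] != len(tss) {
			continue // assumption: a label missing on some series is not a common filter
		}
		for v := range values {
			common[name] = append(common[name], v)
		}
		sort.Strings(common[name])
	}
	return common
}

// selectWithFilters simulates step 3: only series whose labels match
// every pushed-down filter are "fetched" from the in-memory storage.
func selectWithFilters(storage []labels, filters map[string][]string) []labels {
	var out []labels
	for _, ts := range storage {
		ok := true
		for name, allowed := range filters {
			i := sort.SearchStrings(allowed, ts[name])
			if i == len(allowed) || allowed[i] != ts[name] {
				ok = false
				break
			}
		}
		if ok {
			out = append(out, ts)
		}
	}
	return out
}

func main() {
	// Step 1: pretend this is the result of executing exprFirst.
	first := []labels{
		{"instance": "a", "job": "node"},
		{"instance": "b", "job": "node"},
	}
	// Series that exprSecond would match without any pushed-down filter.
	storage := []labels{
		{"instance": "a", "job": "node"},
		{"instance": "c", "job": "node"},
		{"instance": "b", "job": "node"},
	}
	filters := commonLabelValues(first)
	fmt.Println(filters)
	fmt.Println(len(selectWithFilters(storage, filters)))
}
```

In this toy data set the series with `instance="c"` is never fetched for `exprSecond`, which is the saving the pushdown is meant to provide.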

In step 2, only labels with at most `100` (hard-coded) values can be
used as a **common label filter** (e.g. `{common_lb=~"v1|v2|...|v100"}`).
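
As a rough illustration of how such a limit shapes the pushed-down filter, the sketch below renders a regexp selector fragment only when the number of values stays within the limit. The `pushdownFilter` function and its `maxValues` parameter are hypothetical names for this example; the real logic lives in `getCommonLabelFilters` and may differ in details.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// pushdownFilter renders a selector fragment such as common_lb=~"v1|v2"
// for the given label. It reports false when the label has more than
// maxValues distinct values, in which case no filter is pushed down.
func pushdownFilter(label string, values []string, maxValues int) (string, bool) {
	if len(values) > maxValues {
		return "", false
	}
	quoted := make([]string, len(values))
	for i, v := range values {
		// Escape regexp metacharacters so each value matches literally.
		quoted[i] = regexp.QuoteMeta(v)
	}
	return fmt.Sprintf("%s=~%q", label, strings.Join(quoted, "|")), true
}

func main() {
	values := []string{"v1", "v2", "v3"}

	// With a generous limit the filter is produced.
	f, ok := pushdownFilter("common_lb", values, 100)
	fmt.Println(f, ok)

	// With a limit below the number of values the label is skipped.
	f, ok = pushdownFilter("common_lb", values, 2)
	fmt.Println(f, ok)
}
```

Raising the limit trades a longer regexp in the filter for fewer series fetched by `exprSecond`.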

In our scenarios, a label such as `instance` can have thousands of
candidate values. Although this puts more pressure on vmstorage nodes,
it is still beneficial to use labels with more than 100 values as
filters for `exprSecond` when enough vmstorage resources are available.
After raising the value from `100` to `10000`, our query round-trip
time dropped significantly, from 5s to 2s.

This pull request changes the hard-coded value into a configurable flag.
YuDong Tang 2025-01-03 20:19:44 +08:00 committed by GitHub
parent 51a2cc17c6
commit ec73b22d24
3 changed files with 7 additions and 1 deletions
app/vmselect/promql
docs


@@ -48,6 +48,8 @@ var (
"so there is no need in spending additional CPU time on its handling. Staleness markers may exist only in data obtained from Prometheus scrape targets")
minWindowForInstantRollupOptimization = flag.Duration("search.minWindowForInstantRollupOptimization", time.Hour*3, "Enable cache-based optimization for repeated queries "+
"to /api/v1/query (aka instant queries), which contain rollup functions with lookbehind window exceeding the given value")
+ maxBinaryOpPushdownLabelValues = flag.Int("search.maxBinaryOpPushdownLabelValues", 100, "The maximum number of values for a label in the first expression that can be extracted as a common label filter and pushed down to the second expression in a binary operation. "+
+ "A larger value makes the pushed-down filter more complex but fewer time series will be returned. This flag is useful when selective label contains numerous values, for example `instance`, and storage resources are abundant.")
)
// The minimum number of points per timeseries for enabling time rounding.
@@ -601,7 +603,7 @@ func getCommonLabelFilters(tss []*timeseries) []metricsql.LabelFilter {
}
continue
}
- if len(vc.values) > 100 {
+ if len(vc.values) > *maxBinaryOpPushdownLabelValues {
// Too many unique values found for the given tag.
// Do not make a filter on such values, since it may slow down
// search for matching time series.


@@ -1576,6 +1576,9 @@ Below is the output for `/path/to/vmselect -help`:
Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
-search.logSlowQueryDuration duration
Log queries with execution time exceeding this value. Zero disables slow query logging. See also -search.logQueryMemoryUsage (default 5s)
+ -search.maxBinaryOpPushdownLabelValues int
+ The maximum number of values for a label in the first expression that can be extracted as a common label filter and pushed down to the second expression in a binary operation.
+ A larger value makes the pushed-down filter more complex but fewer time series will be returned. This flag is useful when selective label contains numerous values, for example `instance`, and storage resources are abundant. (default 100)
-search.maxConcurrentRequests int
The maximum number of concurrent search requests. It shouldn't be high, since a single request can saturate all the CPU cores, while many concurrently executed requests may require high amounts of memory. See also -search.maxQueueDuration and -search.maxMemoryPerQuery (default 16)
-search.maxDeleteSeries int


@@ -24,6 +24,7 @@ See also [LTS releases](https://docs.victoriametrics.com/lts-releases/).
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent/) and [Single-node VictoriaMetrics](https://docs.victoriametrics.com/): added `min` and `max` metrics for Datadog Sketches API metrics, changed `_` metric name separator to `.` if metrics are not sanitized for consistency.
* FEATURE: [Single-node VictoriaMetrics](https://docs.victoriametrics.com/): support `-maxIngestionRate` cmd-line flag to ratelimit samples/sec ingested. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7377) for details.
* FEATURE: [vmsingle](https://docs.victoriametrics.com/single-server-victoriametrics/) and `vmselect` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/cluster-victoriametrics/): improve query performance on systems with high number of CPU cores. See [this PR](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7416) for details.
+ * FEATURE: [vmsingle](https://docs.victoriametrics.com/single-server-victoriametrics/) and `vmselect` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/cluster-victoriametrics/): add command-line flag `-search.maxBinaryOpPushdownLabelValues` to allow using labels with more candidate values as push down filter in binary operation. See [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7243). Thanks to @tydhot for implementation.
* BUGFIX: [dashboards](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/dashboards): consistently use `vmagent_remotewrite_pending_data_bytes` on vmagent dashboard to represent persistent queue size.
* BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert/): fix the auto-generated metrics `ALERTS` and `ALERTS_FOR_STATE` for alerting rules. Previously, metrics might have incorrect labels and affect the restore process. See this [issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7796).