app/select: add command-line flag -search.maxBinaryOpPushdownLabelValues

### Describe Your Changes

Binary operations like `exprFirst op exprSecond` in VictoriaMetrics are performed in the following way:

1. Execute `exprFirst`.
2. Extract **common label filters** from the result of step 1.
3. Apply these common label filters to `exprSecond` and execute it, in order to retrieve fewer time series from vmstorage nodes.

In step 2, only labels with at most `100` (hard-coded) values can be used as a **common label filter** (e.g. `{common_lb=~"v1|v2|...|v100"}`).

In our scenarios, a label such as `instance` can have thousands of candidate values. Although this puts more pressure on vmstorage nodes, it is still beneficial to use labels with more than 100 values as filters in `exprSecond`, provided there are enough vmstorage resources. After adjusting the value from `100` to `10000`, our query round-trip time dropped significantly, from 5s to 2s.

This pull request changes the hard-coded value into a configurable flag.
2025-03-21 15:45:01 +00:00 · 2025-01-03 20:19:44 +08:00 · 2025-01-03 20:19:44 +08:00 · ec73b22d24
commit ec73b22d24
parent 51a2cc17c6
3 changed files with 7 additions and 1 deletions
--- a/app/vmselect/promql/eval.go
+++ b/app/vmselect/promql/eval.go
@@ -48,6 +48,8 @@ var (
 		"so there is no need in spending additional CPU time on its handling. Staleness markers may exist only in data obtained from Prometheus scrape targets")
 	minWindowForInstantRollupOptimization = flag.Duration("search.minWindowForInstantRollupOptimization", time.Hour*3, "Enable cache-based optimization for repeated queries "+
 		"to /api/v1/query (aka instant queries), which contain rollup functions with lookbehind window exceeding the given value")
+	maxBinaryOpPushdownLabelValues = flag.Int("search.maxBinaryOpPushdownLabelValues", 100, "The maximum number of values for a label in the first expression that can be extracted as a common label filter and pushed down to the second expression in a binary operation. "+
+		"A larger value makes the pushed-down filter more complex but fewer time series will be returned. This flag is useful when selective label contains numerous values, for example `instance`, and storage resources are abundant.")
 )

 // The minimum number of points per timeseries for enabling time rounding.
@@ -601,7 +603,7 @@ func getCommonLabelFilters(tss []*timeseries) []metricsql.LabelFilter {
 				}
 				continue
 			}
-			if len(vc.values) > 100 {
+			if len(vc.values) > *maxBinaryOpPushdownLabelValues {
 				// Too many unique values found for the given tag.
 				// Do not make a filter on such values, since it may slow down
 				// search for matching time series.
--- a/docs/Cluster-VictoriaMetrics.md
+++ b/docs/Cluster-VictoriaMetrics.md
@@ -1576,6 +1576,9 @@ Below is the output for `/path/to/vmselect -help`:
     Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 0)
  -search.logSlowQueryDuration duration
     Log queries with execution time exceeding this value. Zero disables slow query logging. See also -search.logQueryMemoryUsage (default 5s)
+  -search.maxBinaryOpPushdownLabelValues int
+     The maximum number of values for a label in the first expression that can be extracted as a common label filter and pushed down to the second expression in a binary operation. 
+     A larger value makes the pushed-down filter more complex but fewer time series will be returned. This flag is useful when selective label contains numerous values, for example `instance`, and storage resources are abundant. (default 100)
  -search.maxConcurrentRequests int
     The maximum number of concurrent search requests. It shouldn't be high, since a single request can saturate all the CPU cores, while many concurrently executed requests may require high amounts of memory. See also -search.maxQueueDuration and -search.maxMemoryPerQuery (default 16)
  -search.maxDeleteSeries int
--- a/docs/changelog/CHANGELOG.md
+++ b/docs/changelog/CHANGELOG.md
@@ -24,6 +24,7 @@ See also [LTS releases](https://docs.victoriametrics.com/lts-releases/).
 * FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent/) and [Single-node VictoriaMetrics](https://docs.victoriametrics.com/): added `min` and `max` metrics for Datadog Sketches API metrics, changed `_` metric name separator to `.` if metrics are not sanitized for consistency.
 * FEATURE: [Single-node VictoriaMetrics](https://docs.victoriametrics.com/): support `-maxIngestionRate` cmd-line flag to ratelimit samples/sec ingested. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7377) for details.
 * FEATURE: [vmsingle](https://docs.victoriametrics.com/single-server-victoriametrics/) and `vmselect` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/cluster-victoriametrics/): improve query performance on systems with high number of CPU cores. See [this PR](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7416) for details.
+* FEATURE: [vmsingle](https://docs.victoriametrics.com/single-server-victoriametrics/) and `vmselect` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/cluster-victoriametrics/): add command-line flag `-search.maxBinaryOpPushdownLabelValues` to allow using labels with more candidate values as pushdown filters in binary operations. See [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7243). Thanks to @tydhot for the implementation.

 * BUGFIX: [dashboards](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/dashboards): consistently use `vmagent_remotewrite_pending_data_bytes` on vmagent dashboard to represent persistent queue size.
 * BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert/): fix the auto-generated metrics `ALERTS` and `ALERTS_FOR_STATE` for alerting rules. Previously, metrics might have incorrect labels and affect the restore process. See this [issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7796).