app/vmselect/promql: do not push down filters, which enumerate more than 10k unique values

Such filters may slow down time series search, so just skip them.

This is a follow-up for e7f1ceeb84

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1827
This commit is contained in:
Aliaksandr Valialkin 2022-02-02 23:37:35 +02:00
parent 4ef32df4fa
commit 4b850c2a59
No known key found for this signature in database
GPG key ID: A72BEC6CD3D0DED1
2 changed files with 7 additions and 2 deletions

View file

@ -382,12 +382,18 @@ func getCommonLabelFilters(tss []*timeseries) []metricsql.LabelFilter {
continue
}
values = getUniqueValues(values)
if len(values) > 10000 {
// Skip the filter on the given tag, since it needs to enumerate too many unique values.
// This may slow down the search for matching time series.
continue
}
lf := metricsql.LabelFilter{
Label: key,
}
if len(values) == 1 {
lf.Value = values[0]
} else {
sort.Strings(values)
lf.Value = joinRegexpValues(values)
lf.IsRegexp = true
}
@ -408,7 +414,6 @@ func getUniqueValues(a []string) []string {
m[s] = struct{}{}
}
}
sort.Strings(results)
return results
}

View file

@ -11,7 +11,7 @@ sort: 15
* Binary operations with `on()`, `without()`, `group_left()` and `group_right()` modifiers. For example, `foo{a="b"} on (a) + bar` is now optimized to `foo{a="b"} on (a) + bar{a="b"}`
* Multi-level binary operations. For example, `foo{a="b"} + bar{x="y"} + baz{z="q"}` is now optimized to `foo{a="b",x="y",z="q"} + bar{a="b",x="y",z="q"} + baz{a="b",x="y",z="q"}`
* Aggregate functions. For example, `sum(foo{a="b"}) by (c) + bar{c="d"}` is now optimized to `sum(foo{a="b",c="d"}) by (c) + bar{c="d"}`
* FEATURE [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html): optimize joining with `*_info` labels. For example: `kube_pod_created{namespace="prod"} * on (uid) group_left(node) kube_pod_info` now automatically adds the needed filters on `uid` label to `kube_pod_info` before selecting series for the right side of `*` operation. This may save CPU, RAM and disk IO resources. See [this article](https://www.robustperception.io/exposing-the-software-version-to-prometheus) for details on `*_info` labels.
* FEATURE [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html): optimize joining with `*_info` labels. For example: `kube_pod_created{namespace="prod"} * on (uid) group_left(node) kube_pod_info` now automatically adds the needed filters on `uid` label to `kube_pod_info` before selecting series for the right side of `*` operation. This may save CPU, RAM and disk IO resources. See [this article](https://www.robustperception.io/exposing-the-software-version-to-prometheus) for details on `*_info` labels. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1827).
* FEATURE: all: expose `process_cpu_cores_available` metric, which shows the number of CPU cores available to the app. The number can be fractional if the corresponding cgroup limit is set to a fractional value. This metric is useful for alerting on CPU saturation. For example, the following query alerts when the app uses more than 90% of CPU during the last 5 minutes: `rate(process_cpu_seconds_total[5m]) / process_cpu_cores_available > 0.9` . See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2107).
* FEATURE: [vmalert](https://docs.victoriametrics.com/vmalert.html): add ability to configure notifiers (e.g. alertmanager) via a file in the way similar to Prometheus. See [these docs](https://docs.victoriametrics.com/vmalert.html#notifier-configuration-file), [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2127).
* FEATURE: [vmalert](https://docs.victoriametrics.com/vmalert.html): add support for Consul service discovery for notifiers. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1947).