docs: refer to cardinality explorer

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2233
This commit is contained in:
Aliaksandr Valialkin 2022-06-08 19:17:33 +03:00
parent f2754c3e90
commit 8d98b8ae10
No known key found for this signature in database
GPG key ID: A72BEC6CD3D0DED1
3 changed files with 45 additions and 7 deletions

View file

@ -282,15 +282,15 @@ The main reason for high churn rate is a metric label with frequently changed va
* A label derived from the current time such as `timestamp`, `minute` or `hour`.
* A `hash` or `uuid` label, which changes frequently.
The solution against high churn rate is to identify and eliminate labels with frequently changed values. The [/api/v1/status/tsdb](https://docs.victoriametrics.com/#tsdb-stats) page can help determining these labels.
The solution against high churn rate is to identify and eliminate labels with frequently changed values. [Cardinality explorer](https://docs.victoriametrics.com/#cardinality-explorer) can help determining these labels.
## What is high cardinality?
High cardinality usually means a high number of [active time series](#what-is-an-active-time-series). High cardinality may lead to high memory usage and/or to a high percentage of [slow inserts](#what-is-a-slow-insert). The source of high cardinality is usually a label with a large number of unique values, which presents a big share of the ingested time series. The solution is to identify and remove the source of high cardinality with the help of [/api/v1/status/tsdb](https://docs.victoriametrics.com/#tsdb-stats).
High cardinality usually means a high number of [active time series](#what-is-an-active-time-series). High cardinality may lead to high memory usage and/or to a high percentage of [slow inserts](#what-is-a-slow-insert). The source of high cardinality is usually a label with a large number of unique values, which presents a big share of the ingested time series. The solution is to identify and remove the source of high cardinality with the help of [cardinality explorer](https://docs.victoriametrics.com/#cardinality-explorer).
## What is a slow insert?
VictoriaMetrics maintains in-memory cache for mapping of [active time series](#what-is-an-active-time-series) into internal series ids. The cache size depends on the available memory for VictoriaMetrics in the host system. If the information about all the active time series doesn't fit the cache, then VictoriaMetrics needs to read and unpack the information from disk on every incoming sample for time series missing in the cache. This operation is much slower than the cache lookup, so such an insert is named a `slow insert`. A high percentage of slow inserts on the [official dashboard for VictoriaMetrics](https://docs.victoriametrics.com/#monitoring) indicates a memory shortage for the current number of [active time series](#what-is-an-active-time-series). Such a condition usually leads to a significant slowdown for data ingestion and to significantly increased disk IO and CPU usage. The solution is to add more memory or to reduce the number of [active time series](#what-is-an-active-time-series). The `/api/v1/status/tsdb` page can be helpful for locating the source of high number of active time seriess see [these docs](https://docs.victoriametrics.com/#tsdb-stats).
VictoriaMetrics maintains in-memory cache for mapping of [active time series](#what-is-an-active-time-series) into internal series ids. The cache size depends on the available memory for VictoriaMetrics in the host system. If the information about all the active time series doesn't fit the cache, then VictoriaMetrics needs to read and unpack the information from disk on every incoming sample for time series missing in the cache. This operation is much slower than the cache lookup, so such an insert is named a `slow insert`. A high percentage of slow inserts on the [official dashboard for VictoriaMetrics](https://docs.victoriametrics.com/#monitoring) indicates a memory shortage for the current number of [active time series](#what-is-an-active-time-series). Such a condition usually leads to a significant slowdown for data ingestion and to significantly increased disk IO and CPU usage. The solution is to add more memory or to reduce the number of [active time series](#what-is-an-active-time-series). [Cardinality explorer](https://docs.victoriametrics.com/#cardinality-explorer) can be helpful for locating the source of high number of active time series.
## How to optimize MetricsQL query?

View file

@ -243,7 +243,9 @@ Prometheus doesn't drop data during VictoriaMetrics restart. See [this article](
## vmui
VictoriaMetrics provides UI for query troubleshooting and exploration. The UI is available at `http://victoriametrics:8428/vmui`.
The UI allows exploring query results via graphs and tables. Graphs support scrolling and zooming:
The UI allows exploring query results via graphs and tables. It also provides support for [cardinality explorer](#cardinality-explorer).
Graphs in vmui support scrolling and zooming:
* Drag the graph to the left / right in order to move the displayed time range into the past / future.
* Hold `Ctrl` (or `Cmd` on MacOS) and scroll up / down in order to zoom in / out the graph.
@ -261,6 +263,21 @@ VMUI allows investigating correlations between two queries on the same graph. Ju
See the [example VMUI at VictoriaMetrics playground](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/?g0.expr=100%20*%20sum(rate(process_cpu_seconds_total))%20by%20(job)&g0.range_input=1d).
## Cardinality explorer
VictoriaMetrics provides an ability to explore time series cardinality at `cardinality` tab in [vmui](#vmui) in the following ways:
- To identify metric names with the highest number of series.
- To identify label=name pairs with the highest number of series.
- To identify labels with the highest number of unique values.
By default cardinality explorer analyzes time series for the current date. It provides the ability to select different day at the top right corner.
By default all the time series for the selected date are analyzed. It is possible to narrow down the analysis to series
matching the specified [series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors).
Cardinality explorer is built on top of [/api/v1/status/tsdb](#tsdb-stats).
## How to apply new config to VictoriaMetrics
VictoriaMetrics is configured via command-line flags, so it must be restarted when new command-line flags should be applied:
@ -1385,6 +1402,8 @@ VictoriaMetrics returns TSDB stats at `/api/v1/status/tsdb` page in the way simi
* `match[]=SELECTOR` where `SELECTOR` is an arbitrary [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) for series to take into account during stats calculation. By default all the series are taken into account.
* `extra_label=LABEL=VALUE`. See [these docs](#prometheus-querying-api-enhancements) for more details.
VictoriaMetrics provides an UI on top of `/api/v1/status/tsdb` - see [cardinality explorer docs](#cardinality-explorer).
## Query tracing
VictoriaMetrics supports query tracing, which can be used for determining bottlenecks during query processing.
@ -1522,7 +1541,7 @@ See also more advanced [cardinality limiter in vmagent](https://docs.victoriamet
It may be needed in order to suppress default gap filling algorithm used by VictoriaMetrics - by default it assumes
each time series is continuous instead of discrete, so it fills gaps between real samples with regular intervals.
* Metrics and labels leading to [high cardinality](https://docs.victoriametrics.com/FAQ.html#what-is-high-cardinality) or [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate) can be determined at `/api/v1/status/tsdb` page. See [these docs](#tsdb-stats) for details.
* Metrics and labels leading to [high cardinality](https://docs.victoriametrics.com/FAQ.html#what-is-high-cardinality) or [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate) can be determined via [cardinality explorer](#cardinality-explorer) and via [/api/v1/status/tsdb](#tsdb-stats) endpoint.
* New time series can be logged if `-logNewSeries` command-line flag is passed to VictoriaMetrics.

View file

@ -247,7 +247,9 @@ Prometheus doesn't drop data during VictoriaMetrics restart. See [this article](
## vmui
VictoriaMetrics provides UI for query troubleshooting and exploration. The UI is available at `http://victoriametrics:8428/vmui`.
The UI allows exploring query results via graphs and tables. Graphs support scrolling and zooming:
The UI allows exploring query results via graphs and tables. It also provides support for [cardinality explorer](#cardinality-explorer).
Graphs in vmui support scrolling and zooming:
* Drag the graph to the left / right in order to move the displayed time range into the past / future.
* Hold `Ctrl` (or `Cmd` on MacOS) and scroll up / down in order to zoom in / out the graph.
@ -265,6 +267,21 @@ VMUI allows investigating correlations between two queries on the same graph. Ju
See the [example VMUI at VictoriaMetrics playground](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/prometheus/graph/?g0.expr=100%20*%20sum(rate(process_cpu_seconds_total))%20by%20(job)&g0.range_input=1d).
## Cardinality explorer
VictoriaMetrics provides an ability to explore time series cardinality at `cardinality` tab in [vmui](#vmui) in the following ways:
- To identify metric names with the highest number of series.
- To identify label=name pairs with the highest number of series.
- To identify labels with the highest number of unique values.
By default cardinality explorer analyzes time series for the current date. It provides the ability to select different day at the top right corner.
By default all the time series for the selected date are analyzed. It is possible to narrow down the analysis to series
matching the specified [series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors).
Cardinality explorer is built on top of [/api/v1/status/tsdb](#tsdb-stats).
## How to apply new config to VictoriaMetrics
VictoriaMetrics is configured via command-line flags, so it must be restarted when new command-line flags should be applied:
@ -1389,6 +1406,8 @@ VictoriaMetrics returns TSDB stats at `/api/v1/status/tsdb` page in the way simi
* `match[]=SELECTOR` where `SELECTOR` is an arbitrary [time series selector](https://prometheus.io/docs/prometheus/latest/querying/basics/#time-series-selectors) for series to take into account during stats calculation. By default all the series are taken into account.
* `extra_label=LABEL=VALUE`. See [these docs](#prometheus-querying-api-enhancements) for more details.
VictoriaMetrics provides an UI on top of `/api/v1/status/tsdb` - see [cardinality explorer docs](#cardinality-explorer).
## Query tracing
VictoriaMetrics supports query tracing, which can be used for determining bottlenecks during query processing.
@ -1526,7 +1545,7 @@ See also more advanced [cardinality limiter in vmagent](https://docs.victoriamet
It may be needed in order to suppress default gap filling algorithm used by VictoriaMetrics - by default it assumes
each time series is continuous instead of discrete, so it fills gaps between real samples with regular intervals.
* Metrics and labels leading to [high cardinality](https://docs.victoriametrics.com/FAQ.html#what-is-high-cardinality) or [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate) can be determined at `/api/v1/status/tsdb` page. See [these docs](#tsdb-stats) for details.
* Metrics and labels leading to [high cardinality](https://docs.victoriametrics.com/FAQ.html#what-is-high-cardinality) or [high churn rate](https://docs.victoriametrics.com/FAQ.html#what-is-high-churn-rate) can be determined via [cardinality explorer](#cardinality-explorer) and via [/api/v1/status/tsdb](#tsdb-stats) endpoint.
* New time series can be logged if `-logNewSeries` command-line flag is passed to VictoriaMetrics.