docs/Troubleshooting.md: document an additional case, which could result in slow inserts

If `-cacheExpireDuration` is lower than the interval between ingested samples for the same time series,
then vm_slow_row_inserts_total` metric is increased.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183
This commit is contained in:
Aliaksandr Valialkin 2023-03-20 13:28:33 -07:00
parent 3283f0dae4
commit 91533531f5
No known key found for this signature in database
GPG key ID: A72BEC6CD3D0DED1
5 changed files with 11 additions and 4 deletions

View file

@ -2746,7 +2746,7 @@
"type": "prometheus", "type": "prometheus",
"uid": "$ds" "uid": "$ds"
}, },
"description": "The percentage of slow inserts comparing to total insertion rate during the last 5 minutes. \n\nThe less value is better. If percentage remains high (>10%) during extended periods of time, then it is likely more RAM is needed for optimal handling of the current number of active time series. \n\nIn general, VictoriaMetrics requires ~1KB or RAM per active time series, so it should be easy calculating the required amounts of RAM for the current workload according to capacity planning docs. But the resulting number may be far from the real number because the required amounts of memory depends on may other factors such as the number of labels per time series and the length of label values.", "description": "The percentage of slow inserts comparing to total insertion rate during the last 5 minutes. \n\nThe less value is better. If percentage remains high (>10%) during extended periods of time, then it is likely more RAM is needed for optimal handling of the current number of active time series. \n\nIn general, VictoriaMetrics requires ~1KB or RAM per active time series, so it should be easy calculating the required amounts of RAM for the current workload according to capacity planning docs. But the resulting number may be far from the real number because the required amounts of memory depends on many other factors such as the number of labels per time series and the length of label values. See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183",
"fieldConfig": { "fieldConfig": {
"defaults": { "defaults": {
"color": { "color": {

View file

@ -2803,7 +2803,7 @@
"type": "prometheus", "type": "prometheus",
"uid": "$ds" "uid": "$ds"
}, },
"description": "The percentage of slow inserts comparing to total insertion rate during the last 5 minutes. \n\nThe less value is better. If percentage remains high (>10%) during extended periods of time, then it is likely more RAM is needed for optimal handling of the current number of active time series. \n\nIn general, VictoriaMetrics requires ~1KB or RAM per active time series, so it should be easy calculating the required amounts of RAM for the current workload according to capacity planning docs. But the resulting number may be far from the real number because the required amounts of memory depends on may other factors such as the number of labels per time series and the length of label values.", "description": "The percentage of slow inserts comparing to total insertion rate during the last 5 minutes. \n\nThe less value is better. If percentage remains high (>10%) during extended periods of time, then it is likely more RAM is needed for optimal handling of the current number of active time series. \n\nIn general, VictoriaMetrics requires ~1KB or RAM per active time series, so it should be easy calculating the required amounts of RAM for the current workload according to capacity planning docs. But the resulting number may be far from the real number because the required amounts of memory depends on many other factors such as the number of labels per time series and the length of label values. See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183",
"fieldConfig": { "fieldConfig": {
"defaults": { "defaults": {
"color": { "color": {

View file

@ -152,7 +152,8 @@ groups:
dashboard: "http://localhost:3000/d/oS7Bi_0Wz?viewPanel=108" dashboard: "http://localhost:3000/d/oS7Bi_0Wz?viewPanel=108"
summary: "Percentage of slow inserts is more than 5% for the last 15m" summary: "Percentage of slow inserts is more than 5% for the last 15m"
description: "High rate of slow inserts may be a sign of resource exhaustion description: "High rate of slow inserts may be a sign of resource exhaustion
for the current load. It is likely more RAM is needed for optimal handling of the current number of active time series." for the current load. It is likely more RAM is needed for optimal handling of the current number of active time series.
See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183"
- alert: ProcessNearFDLimits - alert: ProcessNearFDLimits
expr: (process_max_fds - process_open_fds) < 100 expr: (process_max_fds - process_open_fds) < 100

View file

@ -132,7 +132,8 @@ groups:
dashboard: "http://localhost:3000/d/wNf0q_kZk?viewPanel=68&var-instance={{ $labels.instance }}" dashboard: "http://localhost:3000/d/wNf0q_kZk?viewPanel=68&var-instance={{ $labels.instance }}"
summary: "Percentage of slow inserts is more than 5% on \"{{ $labels.instance }}\" for the last 15m" summary: "Percentage of slow inserts is more than 5% on \"{{ $labels.instance }}\" for the last 15m"
description: "High rate of slow inserts on \"{{ $labels.instance }}\" may be a sign of resource exhaustion description: "High rate of slow inserts on \"{{ $labels.instance }}\" may be a sign of resource exhaustion
for the current load. It is likely more RAM is needed for optimal handling of the current number of active time series." for the current load. It is likely more RAM is needed for optimal handling of the current number of active time series.
See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183"
- alert: LabelsLimitExceededOnIngestion - alert: LabelsLimitExceededOnIngestion
expr: sum(increase(vm_metrics_with_dropped_labels_total[5m])) by (instance) > 0 expr: sum(increase(vm_metrics_with_dropped_labels_total[5m])) by (instance) > 0

View file

@ -186,6 +186,11 @@ There are the following most commons reasons for slow data ingestion in Victoria
Issues like this are very hard to catch via [official Grafana dashboard for cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#monitoring) Issues like this are very hard to catch via [official Grafana dashboard for cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#monitoring)
and proper diagnosis would require checking resource usage on the instances where VictoriaMetrics runs. and proper diagnosis would require checking resource usage on the instances where VictoriaMetrics runs.
6. If you see `TooHighSlowInsertsRate` [alert](https://docs.victoriametrics.com/#monitoring) when single-node VictoriaMetrics or `vmstorage` has enough
free CPU and RAM, then increase `-cacheExpireDuration` command-line flag at single-node VictoriaMetrics or at `vmstorage` to the value,
which exceeds the interval between ingested samples for the same time series (aka `scrape_interval`).
See [this comment](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183) for more details.
## Slow queries ## Slow queries
Some queries may take more time and resources (CPU, RAM, network bandwidth) than others. Some queries may take more time and resources (CPU, RAM, network bandwidth) than others.