docs/Troubleshooting.md: document an additional case, which could result in slow inserts

If `-cacheExpireDuration` is lower than the interval between ingested samples for the same time series, then the `vm_slow_row_inserts_total` metric increases. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183
parent 3283f0dae4, commit 91533531f5

5 changed files with 11 additions and 4 deletions
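Before the diffs, a minimal sketch of the remedy the commit message describes, as a docker-compose fragment. The service layout and the 1h/2h values are illustrative assumptions, not taken from this commit:

```yaml
# Hypothetical single-node deployment where each time series receives a
# sample once per hour. If -cacheExpireDuration stays below that interval,
# per-series cache entries expire between consecutive samples and every
# insert is counted in vm_slow_row_inserts_total.
services:
  victoriametrics:
    image: victoriametrics/victoria-metrics
    command:
      # Raise the cache expiry above the 1h ingestion interval.
      - "-cacheExpireDuration=2h"
```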
Grafana dashboard JSON, panel description (the same change lands in two dashboards; this is the first):

@@ -2746,7 +2746,7 @@
         "type": "prometheus",
         "uid": "$ds"
       },
-      "description": "The percentage of slow inserts comparing to total insertion rate during the last 5 minutes. \n\nThe less value is better. If percentage remains high (>10%) during extended periods of time, then it is likely more RAM is needed for optimal handling of the current number of active time series. \n\nIn general, VictoriaMetrics requires ~1KB or RAM per active time series, so it should be easy calculating the required amounts of RAM for the current workload according to capacity planning docs. But the resulting number may be far from the real number because the required amounts of memory depends on may other factors such as the number of labels per time series and the length of label values.",
+      "description": "The percentage of slow inserts comparing to total insertion rate during the last 5 minutes. \n\nThe less value is better. If percentage remains high (>10%) during extended periods of time, then it is likely more RAM is needed for optimal handling of the current number of active time series. \n\nIn general, VictoriaMetrics requires ~1KB or RAM per active time series, so it should be easy calculating the required amounts of RAM for the current workload according to capacity planning docs. But the resulting number may be far from the real number because the required amounts of memory depends on many other factors such as the number of labels per time series and the length of label values. See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183",
       "fieldConfig": {
         "defaults": {
           "color": {
The same description change in the second dashboard:

@@ -2803,7 +2803,7 @@
         "type": "prometheus",
         "uid": "$ds"
       },
-      "description": "The percentage of slow inserts comparing to total insertion rate during the last 5 minutes. \n\nThe less value is better. If percentage remains high (>10%) during extended periods of time, then it is likely more RAM is needed for optimal handling of the current number of active time series. \n\nIn general, VictoriaMetrics requires ~1KB or RAM per active time series, so it should be easy calculating the required amounts of RAM for the current workload according to capacity planning docs. But the resulting number may be far from the real number because the required amounts of memory depends on may other factors such as the number of labels per time series and the length of label values.",
+      "description": "The percentage of slow inserts comparing to total insertion rate during the last 5 minutes. \n\nThe less value is better. If percentage remains high (>10%) during extended periods of time, then it is likely more RAM is needed for optimal handling of the current number of active time series. \n\nIn general, VictoriaMetrics requires ~1KB or RAM per active time series, so it should be easy calculating the required amounts of RAM for the current workload according to capacity planning docs. But the resulting number may be far from the real number because the required amounts of memory depends on many other factors such as the number of labels per time series and the length of label values. See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183",
       "fieldConfig": {
         "defaults": {
           "color": {
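The two hunks above change only the panel description; the panel's query is not part of the diff. A sketch of the percentage it describes, written as a Prometheus-style recording rule — the rule name is invented, and `vm_rows_inserted_total` is assumed as the denominator metric rather than taken from this commit:

```yaml
groups:
  - name: slow-inserts-sketch
    rules:
      # Share of row inserts that took the slow path over the last 5 minutes,
      # as a percentage, mirroring the panel description above.
      - record: vm:slow_row_inserts:percent_5m
        expr: |
          100 * sum(rate(vm_slow_row_inserts_total[5m]))
              / sum(rate(vm_rows_inserted_total[5m]))
```

Per the description, values that stay above 10% suggest sizing RAM from the active series count: e.g. 10 million active series × ~1KB ≈ ~10GB as a first estimate, before accounting for label count and label value length.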
Alerting rules, first of two rule files:

@@ -152,7 +152,8 @@ groups:
           dashboard: "http://localhost:3000/d/oS7Bi_0Wz?viewPanel=108"
           summary: "Percentage of slow inserts is more than 5% for the last 15m"
           description: "High rate of slow inserts may be a sign of resource exhaustion
-            for the current load. It is likely more RAM is needed for optimal handling of the current number of active time series."
+            for the current load. It is likely more RAM is needed for optimal handling of the current number of active time series.
+            See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183"

       - alert: ProcessNearFDLimits
         expr: (process_max_fds - process_open_fds) < 100
The same annotation change in the second rule file:

@@ -132,7 +132,8 @@ groups:
           dashboard: "http://localhost:3000/d/wNf0q_kZk?viewPanel=68&var-instance={{ $labels.instance }}"
           summary: "Percentage of slow inserts is more than 5% on \"{{ $labels.instance }}\" for the last 15m"
           description: "High rate of slow inserts on \"{{ $labels.instance }}\" may be a sign of resource exhaustion
-            for the current load. It is likely more RAM is needed for optimal handling of the current number of active time series."
+            for the current load. It is likely more RAM is needed for optimal handling of the current number of active time series.
+            See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183"

       - alert: LabelsLimitExceededOnIngestion
         expr: sum(increase(vm_metrics_with_dropped_labels_total[5m])) by (instance) > 0
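These two hunks touch only the alert annotations; the alert's `expr` and `for` clauses sit outside the shown context. A hedged sketch of a rule consistent with the 5% threshold and 15m window from the summary — the expression is an assumption, not copied from the repository:

```yaml
- alert: TooHighSlowInsertsRate
  # Threshold and window taken from the summary above; the exact shipped
  # expression may differ.
  expr: |
    sum(rate(vm_slow_row_inserts_total[5m]))
      / sum(rate(vm_rows_inserted_total[5m])) > 0.05
  for: 15m
  annotations:
    summary: "Percentage of slow inserts is more than 5% for the last 15m"
```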
docs/Troubleshooting.md:

@@ -186,6 +186,11 @@ There are the following most commons reasons for slow data ingestion in VictoriaMetrics:
    Issues like this are very hard to catch via [official Grafana dashboard for cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#monitoring)
    and proper diagnosis would require checking resource usage on the instances where VictoriaMetrics runs.

+6. If you see `TooHighSlowInsertsRate` [alert](https://docs.victoriametrics.com/#monitoring) when single-node VictoriaMetrics or `vmstorage` has enough
+   free CPU and RAM, then increase `-cacheExpireDuration` command-line flag at single-node VictoriaMetrics or at `vmstorage` to the value,
+   which exceeds the interval between ingested samples for the same time series (aka `scrape_interval`).
+   See [this comment](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3976#issuecomment-1476883183) for more details.
+
 ## Slow queries

 Some queries may take more time and resources (CPU, RAM, network bandwidth) than others.
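Before raising `-cacheExpireDuration`, it can help to confirm that slow inserts coincide with per-series cache churn rather than genuine memory pressure. A sketch of such a check; the `vm_cache_misses_total` metric and its `type="storage/tsid"` label are assumptions based on the metrics VictoriaMetrics components typically expose, not something this diff shows:

```yaml
groups:
  - name: cache-expiry-check
    rules:
      # If this miss rate rises and falls together with
      # vm_slow_row_inserts_total while CPU and RAM stay comfortable, the
      # per-series cache is likely expiring between samples — the case
      # item 6 above addresses.
      - record: vm:tsid_cache_misses:rate_5m
        expr: rate(vm_cache_misses_total{type="storage/tsid"}[5m])
```

If the miss rate stays flat while slow inserts grow, the cause is more likely one of the other reasons in the list above, such as RAM pressure from a high number of active time series.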