diff --git a/docs/Cluster-VictoriaMetrics.md b/docs/Cluster-VictoriaMetrics.md index 058110be81..42c698f24c 100644 --- a/docs/Cluster-VictoriaMetrics.md +++ b/docs/Cluster-VictoriaMetrics.md @@ -300,8 +300,47 @@ All the node types - `vminsert`, `vmselect` and `vmstorage` - may be updated via Send `SIGINT` signal to the corresponding process, wait until it finishes and then start new version with new configs. -Cluster should remain in working state if at least a single node of each type remains available during -the update process. See [cluster availability](#cluster-availability) section for details. +There are the following cluster update / upgrade approaches exist: + +* `No downtime` strategy. Gracefully restart every node in the cluster one-by-one with the updated config / upgraded binary. + + It is recommended restarting the nodes in the following order: + + 1. Restart `vmstorage` nodes. + 2. Restart `vminsert` nodes. + 3. Restart `vmselect` nodes. + + This strategy allows upgrading the cluster without downtime if the following conditions are met: + + - The cluster has at least a pair of nodes of each type - `vminsert`, `vmselect` and `vmstorage`, + so it can continue accept new data and serve incoming requests when a single node is temporary unavailable + during its restart. See [cluster availability docs](#cluster-availability) for details. + - The cluster has enough compute resources (CPU, RAM, network bandwidth, disk IO) for processing + the current workload when a single node of any type (`vminsert`, `vmselect` or `vmstorage`) + is temporarily unavailable during its restart. + - The updated config / upgraded binary is compatible with the remaining components in the cluster. + See the [CHANGELOG](https://docs.victoriametrics.com/CHANGELOG.html) for compatibility notes between different releases. + + If at least a single condition isn't met, then the rolling restart may result in cluster unavailability + during the config update / version upgrade. In this case the following strategy is recommended. + +* `Minimum downtime` strategy: + + 1. Gracefully stop all the `vminsert` and `vmselect` nodes in parallel. + 2. Gracefully restart all the `vmstorage` nodes in parallel. + 3. Start all the `vminsert` and `vmselect` nodes in parallel. + + The cluster is unavailable for data ingestion and querying when performing the steps above. + The downtime is minimized by restarting cluster nodes in parallel at every step above. + The `minimum downtime` strategy has the following benefits comparing to `no downtime` startegy: + + - It allows performing config update / version upgrade with minimum disruption + when the previous config / version is incompatible with the new config / version. + - It allows perorming config update / version upgrade with minimum disruption + when the cluster has no enough compute resources (CPU, RAM, disk IO, network bandwidth) + for rolling upgrade. + - It allows minimizing the duration of config update / version ugprade for clusters with big number of nodes + of for clusters with big `vmstorage` nodes, which may take long time for graceful restart. ## Cluster availability diff --git a/docs/FAQ.md b/docs/FAQ.md index b9ebcaf142..8c6b85bdcb 100644 --- a/docs/FAQ.md +++ b/docs/FAQ.md @@ -335,10 +335,4 @@ The query engine may behave differently for some functions. Please see [this art Single-node VictoriaMetrics cannot be restarted / upgraded or downgraded without downtime, since it needs to be gracefully shut down and then started again. See [how to upgrade VictoriaMetrics](https://docs.victoriametrics.com/#how-to-upgrade-victoriametrics). -[Cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) can be restarted / upgraded / downgraded without downtime if the following conditions are met: - -* If every component of the cluster - `vminsert`, `vmselect` and `vmstorage` - has at least 2 instances. -* If the cluster has enough compute resources (CPU, RAM, disk IO, network bandwidth) to perform rolling restart of all its components. -* If the version used for upgrade / downgrade is compatible with the currently running version. The [CHANGELOG](https://docs.victoriametrics.com/CHANGELOG.html) contains compatibility notes for the published releases. - -See [updating / reconfiguring cluster nodes](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#updating--reconfiguring-cluster-nodes) for details on cluster upgrade. +[Cluster version of VictoriaMetrics](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html) can be restarted / upgraded / downgraded without downtime according to [these instructions](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#updating--reconfiguring-cluster-nodes).