mirror of
https://github.com/VictoriaMetrics/VictoriaMetrics.git
synced 2024-11-21 14:44:00 +00:00
docs/{vmbackup,vmbackupmanager}.md: clarify why storing backups to S3 Glacier can be time-consuming and expensive
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5614
This is a follow-up for e14e3d9c8c
parent
b0287867fe
commit
c830064c2f
2 changed files with 23 additions and 11 deletions
@@ -62,6 +62,10 @@ with the following command:
 ```

 It saves time and network bandwidth costs by performing server-side copy for the shared data from the `-origin` to `-dst`.
+Typical object storage just creates new names for already existing objects when performing server-side copy,
+so this operation should be fast and inexpensive. Unfortunately, there are object storage systems such as [S3 Glacier](https://aws.amazon.com/s3/storage-classes/glacier/),
+which make full copies for the copied objects during server-side copy. This may significantly slow down server-side copy
+and make it very expensive.

 ### Incremental backups

@@ -82,20 +86,24 @@ Smart backups mean storing full daily backups into `YYYYMMDD` folders and creati
 ./vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshot.createURL=http://localhost:8428/snapshot/create -dst=gs://<bucket>/latest
 ```

-Where `<latest-snapshot>` is the latest [snapshot](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots).
-The command will upload only changed data to `gs://<bucket>/latest`.
+This command creates an [instant snapshot](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#how-to-work-with-snapshots)
+and uploads it to `gs://<bucket>/latest`. It uploads only the changed data (aka incremental backup). This saves network bandwidth costs and time
+when backing up large amounts of data.

 * Run the following command once a day:

 ```console
-./vmbackup -storageDataPath=</path/to/victoria-metrics-data> -snapshot.createURL=http://localhost:8428/snapshot/create -dst=gs://<bucket>/<YYYYMMDD> -origin=gs://<bucket>/latest
+./vmbackup -storageDataPath=</path/to/victoria-metrics-data> -origin=gs://<bucket>/latest -dst=gs://<bucket>/<YYYYMMDD>
 ```

-Where `<daily-snapshot>` is the snapshot for the last day `<YYYYMMDD>`.
+This command creates server-side copy of the backup from `gs://<bucket>/latest` to `gs://<bucket>/<YYYYMMDD>`, where `<YYYYMMDD>` is the current
+date like `20240125`. Server-side copy of the backup should be fast on most object storage systems, since it just creates new names for already
+existing objects. The server-side copy can be slow on some object storage systems such as [S3 Glacier](https://aws.amazon.com/s3/storage-classes/glacier/),
+since they may perform full object copy instead of creating new names for already existing objects. This may be slow and expensive.
+
+The `smart backups` approach described above saves network bandwidth costs on hourly backups (since they are incremental)
+and allows recovering data from either the last hour (the `latest` backup) or from any day (`YYYYMMDD` backups).

-This approach saves network bandwidth costs on hourly backups (since they are incremental) and allows recovering data from either the last hour (`latest` backup)
-or from any day (`YYYYMMDD` backups). Because of this feature, it is not recommended to store `latest` data folder
-in storages with expensive reads or additional archiving features (like [S3 Glacier](https://aws.amazon.com/s3/storage-classes/glacier/)).
 Note that hourly backup shouldn't run when creating daily backup.

 Do not forget to remove old backups when they are no longer needed in order to save storage costs.

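As a minimal sketch of the daily step in the smart backups scheme changed above, the following helper builds the daily server-side copy command with the current date filled in for `<YYYYMMDD>`. The function name `daily_backup_cmd`, the bucket name `my-bucket` and the default data path are hypothetical placeholders, not part of `vmbackup`. Only the flags shown in the docs above are used, and the command is printed rather than executed.

```shell
#!/bin/sh
# daily_backup_cmd is a hypothetical helper, not part of vmbackup.
# It prints the daily server-side copy command from the docs above for the
# given bucket, day (defaults to today in YYYYMMDD form) and data path.
daily_backup_cmd() {
  bucket="$1"
  day="${2:-$(date +%Y%m%d)}"
  data_path="${3:-/path/to/victoria-metrics-data}"
  printf './vmbackup -storageDataPath=%s -origin=gs://%s/latest -dst=gs://%s/%s\n' \
    "$data_path" "$bucket" "$bucket" "$day"
}

# Placeholder bucket name and a fixed date for a reproducible example.
daily_backup_cmd my-bucket 20240125
```

Printing the command first makes it easy to review the resolved `-origin` and `-dst` paths before wiring the call into a scheduler. Both paths stay in the same bucket, which server-side copy requires.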
@@ -115,7 +123,9 @@ from `gs://bucket/foo` to `gs://bucket/bar`:
 The `-origin` and `-dst` must point to the same object storage bucket or to the same filesystem.

 The server-side backup copy is usually performed at much faster speed compared to the usual backup, since backup data isn't transferred
-between the remote storage and locally running `vmbackup` tool.
+between the remote storage and locally running `vmbackup` tool. Object storage systems usually just make new names for already existing
+objects during server-side copy. Unfortunately there are systems such as [S3 Glacier](https://aws.amazon.com/s3/storage-classes/glacier/),
+which perform full object copy during server-side copying. This may be slow and expensive.

 If the `-dst` already contains some data, then its contents are synced with the `-origin` data. This allows making incremental server-side copies of backups.

@@ -124,9 +124,11 @@ The result on the GCS bucket

 <img alt="latest folder" src="vmbackupmanager_latest_folder.webp">

-Please note, `latest` data folder is used for [smart backups](https://docs.victoriametrics.com/vmbackup.html#smart-backups).
-It is not recommended to store `latest` data folder in storages with expensive reads or additional archiving features
-(like [S3 Glacier](https://aws.amazon.com/s3/storage-classes/glacier/)).
+`vmbackupmanager` uses [smart backups](https://docs.victoriametrics.com/vmbackup.html#smart-backups) technique in order
+to speed up backups and save both data transfer costs and data copying costs. This includes server-side copy of already existing
+objects. Typical object storage systems implement server-side copy by creating new names for already existing objects.
+This is very fast and efficient. Unfortunately there are systems such as [S3 Glacier](https://aws.amazon.com/s3/storage-classes/glacier/),
+which perform full object copy during server-side copying. This may be slow and expensive.

 Please, see [vmbackup docs](https://docs.victoriametrics.com/vmbackup.html#advanced-usage) for more examples of authentication with different
 storage types.