Commit graph

92 commits

Author SHA1 Message Date
f41gh7
3a2ca7601d
{lib/backup,app/}: gracefully cancel currently running operation during graceful shutdown
* {lib/backup,app/}: gracefully cancel currently running operation during graceful shutdown

Make backup/restore process interruptable by passing global context from the operation caller.

This is needed in order to reduce shutdown delays in case backup/restore cancellation is requested.

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8554
2025-05-06 12:01:12 +02:00
Aliaksandr Valialkin
afd3cfe982
all: use new canonical urls to vmbackup docs: https://docs.victoriametrics.com/victoriametrics/vmbackup/
This avoids a redirect from the old link https://docs.victoriametrics.com/vmbackup/ to https://docs.victoriametrics.com/victoriametrics/vmbackup/ ,
and fixes `backwards` navigation for these links across VictoriaMetrics docs.

This is a follow-up for f152021521
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8595#issuecomment-2831598274
2025-04-30 16:51:57 +02:00
Zakhar Bessarab
e44c0d2a3d
lib/backup/s3remote: enable HTTP/2 for S3 connections
### Describe Your Changes

HTTP/2 support is used by some S3-compatible storage providers, so
disabling it default leads to unexpected errors when trying to connect
to S3 endpoint.
For example, using MinIO as S3 storage backend: `net/http: HTTP/1.x
transport connection broken: malformed HTTP response`.

HTTP/2 was enabled by default previously, but while fixing inconsistency
e5f4826 commit disabled this by default.
cc: @valyala 

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2025-04-23 09:46:23 +04:00
Nikolay
ebe15e0c7b
lib/backup/s3: properly set ProfileName
Previously, if ProfileName is set to empty value (as default). AWS s3
lib ignored any profile config defined with `-configProfilePath`.

This commit correctly configure client options and set profile name only
if it's set to non-empty value.

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8668
2025-04-08 17:45:22 +02:00
Aliaksandr Valialkin
f83e780a55
lib/httputil: automatically initialize data transfer metrics for the created HTTP transports via NewTransport() 2025-03-27 15:22:15 +01:00
Aliaksandr Valialkin
88e82614bf
lib/httputil: add NewTransport() function for creating pre-initialized net/http.Transport 2025-03-26 20:16:39 +01:00
Zakhar Bessarab
2ee91f6c5a
lib/backup/s3remote: add retries for "IncompleteBody" errors
These errors could be caused by intermittent network issues, especially
in case of using proxies when accessing S3 storage. Previously, such
error would abort backup/restore process and require manual intervention
to ensure backups consistency.

This commit adds automatic retries to handle this to improve backups
reliability and resilience to network issues.
2025-03-20 15:36:50 +01:00
Guillem Jover
1d8b7faf71
spelling and grammar fixes via codespell ()
### Describe Your Changes

Fix many spelling errors and some grammar, including misspellings in
filenames.

The change also fixes a typo in metric `vm_mmaped_files` to `vm_mmapped_files`.
While this is a breaking change, this metric isn't used in alerts or dashboards.
So it seems to have low impact on users.

The change also deprecates `cspell` as it is much heavier and less usable.
---------

Co-authored-by: Andrii Chubatiuk <achubatiuk@victoriametrics.com>
Co-authored-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>

(cherry picked from commit 76d205feae)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-03-17 16:38:11 +01:00
Zakhar Bessarab
54315fbad6
lib/backup/s3remote: add retryer configuration ()
### Describe Your Changes

This helps to improve reliability of performing backups in environments
with unreliable connection and tolerate temporary errors at S3 provider
side.

See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6732

Default retry timeout is up to 3 minutes to make this consistent with
the same configuration for GCS:
a05317f61f/lib/backup/gcsremote/gcs.go (L70-L76)

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
(cherry picked from commit cb00b4b00f)
2024-08-07 16:59:23 +02:00
Juraj Bubniak
daa8c4970d
lib/backup/s3remote: fix typos ()
Fixes a few typos in errors in lib/backup/s3remote package.

(cherry picked from commit 11c0b05e8a)
2024-07-29 14:30:21 +02:00
Aliaksandr Valialkin
9b529c2742
lib/backup/azremote: follow-up for 5fd3aef549
- Mention that credentials can be configured via env variables at both vmbackup and vmrestore docs.

- Make clear that the AZURE_STORAGE_DOMAIN env var is optional at https://docs.victoriametrics.com/vmbackup/#providing-credentials-via-env-variables

- Use string literals as is for env variable names instead of indirecting them via string constants.
  This makes easier to read and understand the code. These environment variable names aren't going to change
  in the future, so there is no sense in hiding them under string constants with some other names.

- Refer to https://docs.victoriametrics.com/vmbackup/#providing-credentials-via-env-variables in error messages
  when auth creds are improperly configured. This should simplify figuring out how to fix the error.

- Simplify the code a bit at FS.newClient(), so it is easier to follow it now.
  While at it, remove the check when superflouos environment variables are set, since it is too fragile
  and it looks like it doesn't help properly configuring vmbackup / vmrestore.

- Remove envLookuper indirection - just use 'func(name string) (string, bool)' type inline.
  This simplifies code reading and understanding.

- Split TestFSInit() into TestFSInit_Failure() and TestFSInit_Success(). This simplifies the test code,
  so it should be easier to maintain in the future.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6518
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5984
2024-07-17 17:55:39 +02:00
Aliaksandr Valialkin
43fc1183b9
app/vmalert: switch from table-driven tests to f-tests
This makes test code more clear and reduces the number of code lines by 500.
This also simplifies debugging tests. See https://itnext.io/f-tests-as-a-replacement-for-table-driven-tests-in-go-8814a8b19e9e

While at it, consistently use t.Fatal* instead of t.Error* across tests, since t.Error*
requires more boilerplate code, which can result in additional bugs inside tests.
While t.Error* allows writing logging errors for the same, this doesn't simplify fixing
broken tests most of the time.

This is a follow-up for a9525da8a4
2024-07-12 22:45:50 +02:00
hagen1778
7370f84b97
lib/bakcup/azremote: follow-up after 5fd3aef549
Simplify tests by converting them to f-tests.

5fd3aef549
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 03e4c5c19c)
2024-07-10 15:17:06 +02:00
justinrush
e65e55e2dd
lib/backup: add support for Azure Managed Identity ()
### Describe Your Changes

These changes support using Azure Managed Identity for the `vmbackup`
utility. It adds two new environment variables:

* `AZURE_USE_DEFAULT_CREDENTIAL`: Instructs the `vmbackup` utility to
build a connection using the [Azure Default
Credential](https://pkg.go.dev/github.com/Azure/azure-sdk-for-go/sdk/azidentity@v1.5.2#NewDefaultAzureCredential)
mode. This causes the Azure SDK to check for a variety of environment
variables to try and make a connection. By default, it tries to use
managed identity if that is set up.

This will close
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5984

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

### Testing

However you normally test the `vmbackup` utility using Azure Blob should
continue to work without any changes. The set up for that is environment
specific and not listed out here.

Once regression testing has been done you can set up [Azure Managed
Identity](https://learn.microsoft.com/en-us/entra/identity/managed-identities-azure-resources/overview)
so your resource (AKS, VM, etc), can use that credential method. Once it
is set up, update your environment variables according to the updated
documentation.

I added unit tests to the `FS.Init` function, then made my changes, then
updated the unit tests to capture the new branches.

I tested this in our environment, but with SAS token auth and managed
identity and it works as expected.

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Justin Rush <jarush@epic.com>
Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 5fd3aef549)
2024-07-10 12:26:21 +02:00
Andrii Chubatiuk
abc233a902
lib/backup/s3remote: fixed credsFilePath flag ()
properly use credsFilePath flag value

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6353

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit e678a9aa51)
2024-06-14 14:14:58 +02:00
Roman Khavronenko
f3e893f699
lib/backup: add -s3TLSInsecureSkipVerify command-line flag ()
* The new flag can be used for for skipping TLS certificates
verification when connecting to S3 endpoint. Affects vmbackup,
vmrestore, vmbackupmanager.

* replace deprecated `EndpointResolver` with `BaseEndpoint`

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1056

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit ac836bcf6c)
2024-05-22 16:40:06 +02:00
Aliaksandr Valialkin
6fe21520ce
all: replace old https://docs.victoriametrics.com/vmbackup.html url with the new one - https://docs.victoriametrics.com/vmbackup/ 2024-04-18 01:58:00 +02:00
Aliaksandr Valialkin
d845edc24b
lib: consistently use atomic.* types instead of atomic.* functions
See ea9e2b19a5
2024-02-24 02:10:04 +02:00
Aliaksandr Valialkin
61519f6c22
lib/backup/actions: expose vm_backups_downloaded_bytes_total metric in order to be consistent with vm_backups_uploaded_bytes_total metric 2024-02-24 01:14:57 +02:00
Aliaksandr Valialkin
510e3d9cda
lib/backup/actions: update vm_backups_uploaded_bytes_total metric along the file upload instead of after the file upload
This solves two issues:

1. The vm_backups_uploaded_bytes_total metric will grow more smoothly
2. This prevents from int overflow at metrics.Counter.Add() when uploading files bigger than 2GiB
2024-02-24 01:08:34 +02:00
Aliaksandr Valialkin
0ac1c533dc
lib/backup/actions: consistently use atomic.* types instead of atomic.* functions
See ea9e2b19a5
2024-02-24 01:02:37 +02:00
Aliaksandr Valialkin
ae12ac69ba
lib/snapshot: move Time, Validate and NewName into lib/snapshot/snapshotutil package
This allows removing importing unneeded command-line flags into binaries, which import lib/storage,
which, in turn, was importing lib/snapshot in order to use Time, Validate and NewName functions.

This is a follow-up for 83e55456e2

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5738
2024-02-09 04:19:30 +02:00
Aliaksandr Valialkin
15a7542ef8
vendor: run make vendor-update 2023-12-11 10:48:47 +02:00
Aliaksandr Valialkin
2f4dc2aff1
lib/backup: consistently use path.Join() when constructing paths for s3, gs and azblob
E.g. replace `fs.Dir + filePath` with `path.Join(fs.Dir, filePath)`

The fs.Dir is guaranteed to end with slash - see Init() functions.
The filePath may start with slash. If it starts with slash, then `fs.Dir + filePath` constructs
an incorrect path with double slashes.
path.Join() properly substitutes duplicate slashes with a single slash in this case.

While at it, also substitute incorrect usage of filepath.Join() with path.Join()
for constructing paths to object storage systems, which expect forward slashes in paths.
filepath.Join() substittues forward slashes with backslashes on Windows, so this may break
creating or managing backups from Windows.

This is a follow-up for 0399367be602b577baf6a872ca81bf0f99ba401b
Updates https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/719
2023-12-04 17:25:41 +02:00
Alexander Marshalov
1b4e7fcdb3
fixed error when creating a full backup using the -origin flag ()
* fixed error when creating a full backup using the `-origin` flag ()

* Update docs/CHANGELOG.md

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-10-16 14:01:16 +02:00
hagen1778
709a2bad66
docs: remove extra / in the end of the link
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-10-16 13:24:30 +02:00
Zakhar Bessarab
8b42a1733c
lib/backup: add -deleteAllObjectVersions command-line flag ()
New flag enforces removal of all versions of the object in remote object storage.

See:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5121
- https://docs.victoriametrics.com/vmbackup.html#permanent-deletion-of-objects-in-s3-compatible-storages

(cherry picked from commit 2fc7e9f47e)
2023-10-10 14:14:21 +02:00
Zakhar Bessarab
e216592378
lib/backup: fix issue with inconsistent copying of appliedRetention.txt ()
* lib/backup: fix issue with inconsistent copying of appliedRetention.txt

appliedRetention.txt can be modified in place, so it should be always copied just the same as parts.json

Updates: https://github.com/victoriaMetrics/victoriaMetrics/issues/5005
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* docs: add changelog entry for appliedRetention.txt copying fix

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-09-21 11:26:13 +02:00
Aliaksandr Valialkin
d0c103ad05
lib/backup: properly copy parts.json files inside indexdb directory additional to data directory
This is a follow-up for 264ffe3fa1

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5005
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5006
2023-09-19 00:38:31 +02:00
Aliaksandr Valialkin
c0b8143daf
lib/backup/common: consistently use canonical path with / directory separators at Part.Path
Previously Part.Path could contain `\` directory separators on Windows OS,
which could result in incorrect filepaths generation when making backups at object storage.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4704

This is a follow-up for f2df8ad480
2023-09-19 00:36:32 +02:00
Zakhar Bessarab
aa583f0b9a
lib/backup: force copying of parts.json ()
* lib/backup: force copying of parts.json

Copying of parts.json is required because `part.key()` comparison can create same key value for files with different contents. This will result in inconsistent backup being created or restored.

See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5005
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/backup: ensure parts.json is only copied once

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-09-18 16:18:03 +02:00
Aliaksandr Valialkin
1361239393
app/vmbackup: add ability to make server-side copying of existing backups 2023-08-13 17:26:26 -07:00
Zakhar Bessarab
9efc077214
vmbackupmanager: fixes for windows compatibility ()
* app/vmbackupmanager/storage: fix path join for windows

See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4704

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/backup: fixes for windows support

- close dir before running os.RemoveAll. Windows FS does not allow to delete directory before all handles will be closed.

- add path "normalization" for local FS to use the same format of paths for both *unix and Windows

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4704

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-08-11 05:43:28 -07:00
Alexander Marshalov
eb611c3dc3
fix removing storage data dir before restoring from backup ()
* fix removing storage data dir before restoring from backup

Signed-off-by: Alexander Marshalov <_@marshalov.org>

* fix review comment

Signed-off-by: Alexander Marshalov <_@marshalov.org>

* fix review comment

Signed-off-by: Alexander Marshalov <_@marshalov.org>

* fixes after merge with `enterprise-single-node` branch

Signed-off-by: Alexander Marshalov <_@marshalov.org>

---------

Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-07-06 22:32:12 -07:00
Aliaksandr Valialkin
eda26a8352
lib/backup/actions: remove misleading comment about the default value for Concurrency field 2023-07-06 22:31:40 -07:00
Alexander Marshalov
677c8a5465
show backup progress percentage in vmbackup log during backup uploading and restoring progress percentage in vmrestore log during backup downloading () ()
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-07-06 21:56:54 -07:00
Alexander Marshalov
b3f8bb5b50
vmbackupmanager bugfixes: ()
- error on running with empty -dst dir and without -runOnStart
- error on restoring with backup, created before v1.90.0
2023-07-05 22:08:04 -07:00
Alexander Marshalov
ad35081066
backup metadata are written in separate file ()
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-05-16 11:24:44 -07:00
Alexander Marshalov
f796c5dd9e
fixed error with double slash in vmbackupmanager ()
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-05-11 13:38:40 -07:00
justcompile
44e929fbc6
squash commits () 2023-05-08 23:18:08 -07:00
Nikolay
7bfa1d7d9e
lib/backup: fixes path generation for windows ()
replaces custom fsync function with standard Fsync methods for files.
fixes pattern matching for parts and properly generate backup path for local fs.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2023-05-08 23:16:26 -07:00
Aliaksandr Valialkin
a7678350ad
lib/fs: convert CreateFlockFile to MustCreateFlockFile
Callers of CreateFlockFile log the returned err and exit.
It is better to log the error inside the MustCreateFlockFile together with the path
to the specified directory and the call stack. This simplifies
the code at the callers' side while leaving the debuggability at the same level.
2023-04-14 19:51:52 -07:00
Aliaksandr Valialkin
aac3dccfd1
lib/fs: replace MkdirAllIfNotExist->MustMkdirIfNotExist and MkdirAllFailIfExist->MustMkdirFailIfExist
Callers of these functions log the returned error and then exit. The returned error already contains the path
to directory, which was failed to be created. So let's just log the error together with the call stack
inside these functions. This leaves the debuggability of the returned error at the same level
while allows simplifying the code at callers' side.

While at it, properly use MustMkdirFailIfExist instead of MustMkdirIfNotExist inside inmemoryPart.MustStoreToDisk().
It is expected that the inmemoryPart.MustStoreToDick() must fail if there is already a directory under the given path.
2023-04-13 22:22:08 -07:00
Aliaksandr Valialkin
e95e401e4d
app/vmbackupmanager: sync with enterprise-single-node branch after 41a54c775891c87e3d5ed59ff0769c869dd2fe71 2023-04-13 19:38:28 -07:00
Zakhar Bessarab
217eea6e15
lib/backup/actions: store metadata(creation and completion time) in backup files ()
This makes it easier to understand exact point in time which is included in this backup.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-04-13 19:20:34 -07:00
Aliaksandr Valialkin
f6c36d5dfd
lib/storage: consistently use OS-independent separator in file paths
This is needed for Windows support, which uses `\` instead of `/` as file separator

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2023-03-25 14:34:36 -07:00
Oleksandr Redko
0e1c395609
app,lib: fix typos in comments () 2023-02-13 09:32:35 -08:00
Aliaksandr Valialkin
146b3bd088
lib/backup/azremote: fix after upgrading github.com/Azure/azure-sdk-for-go/sdk/storage/azblob from v0.6.1 to v1.0.0 2023-02-08 09:19:10 -08:00
Zakhar Bessarab
decf46d72b
app/vmbackupmanager: add metrics for better observability ()
* app/vmbackupmanager: add metrics for better observability, include more information to `/api/v1/backups` API call response

* app/vmbackupmanager: drop old metrics before creating new ones

* app/vmbackupmanager: use `_total` postfix for counter metrics

* app/vmbackupmanager: remove `_total` postfix for gauge-like metrics

* app/vmbackupmanager: add `_last_run_failed` metrics for backups and retention

* app/vmbackupmanager: address review feedback

* app/vmbackupmanager: fix metric name

* app/vmbackupmanager: address review feedback, remove background updates of metrics, add restoring state of `_last_run_failed` metric from remote storage

* app/vmbackupmanager: improve performance for backup size calculation

* app/vmbackupmanager: refactor backup and retention runs to deduplicate each run logic

* {app/vmbackupmanager,lib/formatutil}: move HumanizeBytes into lib package

* app/vmbackupmanager: fix creating new metrics instead of reusing existing ones

* lit/formatutil: add comment to make linter happy

* app/vmbackupmanager: address review feedback
2022-12-20 14:18:43 -08:00
Zakhar Bessarab
1e58eabde6
lib/backup/azremote: fix copying for parts larger than 256M by using async copy ()
* lib/backup/azremote: fix copying for parts larger than 256M by using async copy

* lib/backup/azremote: add description of an error for log message
2022-12-13 09:36:36 -08:00