Commit graph

9 commits

Author SHA1 Message Date
Aliaksandr Valialkin
bbebdf9ba1 lib/{storage,mergeset}: remove empty directories on startup. Such directories can be left after unclean shutdown on NFS storage
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1142
2021-04-22 13:02:44 +03:00
Aliaksandr Valialkin
4fb049bcba lib/fs: reduce the frequency of failed to remove directory ... due to NFS lock log warnings
Log `failed to remove directory ... due to NFS lock` warning only if the directory cannot be removed in one second.
2021-03-18 13:24:46 +02:00
Aliaksandr Valialkin
8683ea85e6 lib/fs: properly handle stale NFS file handle error during file deletion
This error can appear when -storageDataPath points to NFS volume and the given file has been already removed.
2021-02-26 23:25:14 +02:00
Aliaksandr Valialkin
9d32fb1d9e lib/fs: use WARN instead of ERROR log level for the message when NFS diretory removal temporarily fails
this is expected condition, so it is better to use WARN log level for it
2020-08-09 12:07:30 +03:00
Aliaksandr Valialkin
639b26b40c lib/fs: export vm_nfs_pending_dirs_to_remove metric for monitoring the number of pending directories that couldn't be removed due to NFS lock 2020-08-06 15:31:34 +03:00
Aliaksandr Valialkin
638a5cbb16 lib/{mergeset,storage}: remove transaction files only after the mentioned dirs are really removed
This should fix the issue on NFS when incompletely removed dirs may be left
after unclean shutdown (OOM, kill -9, hard reset, etc.), while the corresponding transaction
files are already removed.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-12-02 21:36:31 +02:00
Aliaksandr Valialkin
6c2303764e Revert "lib/fs: do not postpone directory removal on NFS error"
This reverts commit 4c02e496f7.

Reason for revert: the commit breaks on NFS - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/234
2019-11-12 16:18:09 +02:00
Aliaksandr Valialkin
4c02e496f7 lib/fs: do not postpone directory removal on NFS error
Continue trying to remove NFS directory on temporary errors for up to a minute.

The previous async removal process breaks in the following case during VictoriaMetrics start

- VictoriaMetrics opens index, finds incomplete merge transactions and starts replaying them.
- The transaction instructs removing old directories for parts, which were already merged into bigger part.
- VictoriaMetrics removes these directories, but their removal is delayed due to NFS errors.
- VictoriaMetrics scans partition directory after all the incomplete merge transactions are finished
  and finds directories, which should be removed, but weren't still removed due to NFS errors.
- VictoriaMetrics panics when it finds unexpected empty directory.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-11-10 13:24:51 +02:00
Aliaksandr Valialkin
88f8670ede lib/fs: add MustStopDirRemover for waiting until pending directories are removed on graceful shutdown
This patch is mainly required for laggy NFS. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-09-05 11:13:17 +03:00