Previously the blocked directories were removed sequentially by a single goroutine.
This can be not enough for highly loaded VictoriaMetrics that accepts millions of sample per second,
when big number of LSM parts are created and removed at high rate.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313
This should fix the issue on NFS when incompletely removed dirs may be left
after unclean shutdown (OOM, kill -9, hard reset, etc.), while the corresponding transaction
files are already removed.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
Continue trying to remove NFS directory on temporary errors for up to a minute.
The previous async removal process breaks in the following case during VictoriaMetrics start
- VictoriaMetrics opens index, finds incomplete merge transactions and starts replaying them.
- The transaction instructs removing old directories for parts, which were already merged into bigger part.
- VictoriaMetrics removes these directories, but their removal is delayed due to NFS errors.
- VictoriaMetrics scans partition directory after all the incomplete merge transactions are finished
and finds directories, which should be removed, but weren't still removed due to NFS errors.
- VictoriaMetrics panics when it finds unexpected empty directory.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162