add command-line flag `-search.inmemoryBufSizeBytes` for configuring size of in-memory buffers used by vmselect during processing of vmstorage responses. A new summary metric `vm_tmp_blocks_inmemory_file_size_bytes` is exposed to show the size of the buffer during requests processing.
The new setting can be used by experienced users to adjust memory usage by vmselect when processing
many small read requests. Instead of allocating 4MB buffers each time, vmselect can be instructed to lower
the buffer size via `-search.inmemoryBufSizeBytes`. To make the decision whether this flag needs to be adjusted
users can consult with `vm_tmp_blocks_inmemory_file_size_bytes` which shows the actual size of buffers used
during query processing.
----------
The detailed information of this PR can be found in
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6851
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
---------
Co-authored-by: hagen1778 <roman@victoriametrics.com>
This reverts commit cb23685681.
Reason for revert: the "fix" may hide programming bugs related to incorrect creation of folders
before their use. This may complicate detecting and fixing such bugs in the future.
There are the following fixes for the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5985 :
- To configure the OS to do not drop data from the system-wide temporary directory (aka /tmp).
- To run VictoriaMetrics with -cacheDataPath command-line flag, which points to the directory,
which cannot be removed automatically by the OS.
The case when the user accidentally deletes the directory with some files created by VictoriaMetrics
shouldn't be considered as expected, so VictoriaMetrics shouldn't try resolving this case automatically.
It is much better from operation and debuggability PoV is to crash with the clear `directory doesn't exist` error
in this case.
vmselect uses a cache folder in file system for two purposes:
1. Storing rollup cache results on shutdown;
2. Storing temporary search results from vmstorage during query executions.
It could happen that cache folder is deleted accidentally by user, or by OS
during cleanup routines. This would cause vmselect to:
1. panic on /metrics call, because `MustGetFreeSpace` will fail;
2. return query error user, as it won't be able to store temporary search results.
The changes in this commit are the following:
1. Make `MustGetFreeSpace` to try re-creating the cache folder if it is missing;
2. Make vmselect to try re-creating the cache folder if it can't persist tmp search
results.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5985
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
(cherry picked from commit cb23685681)
Remove temporary file before closing it in order to signal the OS that it shouldn't
store the file contents from page cache to disk when the file is closed.
Gracefully handle the case when the file cannot be removed before being closed -
in this case remove the file after closing it. This allows working on Windows.
Also remove superflouos opening of temporary file for reading - re-use already opened file handle for writing.
This is a follow-up for 9b1e002287
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4020
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
Callers of these functions log the returned error and then exit. The returned error already contains the path
to directory, which was failed to be created. So let's just log the error together with the call stack
inside these functions. This leaves the debuggability of the returned error at the same level
while allows simplifying the code at callers' side.
While at it, properly use MustMkdirFailIfExist instead of MustMkdirIfNotExist inside inmemoryPart.MustStoreToDisk().
It is expected that the inmemoryPart.MustStoreToDick() must fail if there is already a directory under the given path.
The io/ioutil package is deprecated since Go1.16 - see https://tip.golang.org/doc/go1.16#ioutil
VictoriaMetrics requires at least Go1.18, so it is time to remove the io/ioutil from source code
This is a follow-up for 02ca2342ab
This should improve vmselect performance scalability on systems with many CPU cores.
The following tasks were done:
- Use separate temporary files for storing the data read from each vmstorage node.
This may result in the following potential issues:
- Up to N times higher memory usage for performing each query where N is the number
of vmstorage nodes known to vmselect.
This issue shouldn't increase chances of out of memory errors in most cases,
since per-query memory overhead is quite low comparing to the overall vmselect memory usage.
- Up to N times higher number of open temporary files where N is the number
of vmstorage nodes known to vmselect.
This issue should be fixed by increasing the limit on the number of open files.
- Use separate counters per each vmstorage node for various stats calculation
when reading the data from vmstorage nodes.
Previously bytesutil.Resize() was copying the original byte slice contents to a newly allocated slice.
This wasted CPU cycles and memory bandwidth in some places, where the original slice contents wasn't needed
after slize resizing. Switch such places to bytesutil.ResizeNoCopy().
Rename the original bytesutil.Resize() function to bytesutil.ResizeWithCopy() for the sake of improved readability.
Additionally, allocate new slice with `make()` instead of `append()`. This guarantees that the capacity of the allocated slice
exactly matches the requested size. The `append()` could return a slice with bigger capacity as an optimization for further `append()` calls.
This could result in excess memory usage when the returned byte slice was cached (for instance, in lib/blockcache).
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
All the callers for fs.OpenReaderAt expect that the file will be opened.
So it is better to log fatal error inside fs.MustOpenReaderAt instead of leaving this to the caller.
* feat(vmselect): add tmp block dir size metrics `vm_tmp_blocks_files_size_total`
* refactor(vmselect): use free space instead of used space in tmp block file metrics
* fix: add `bytes` suffix to tmp dir free space metric
The maximum inmemory file size now depends on `-memory.allowedPercent`.
This should improve performance and reduce the number of filesystem calls
on machines with big amounts of RAM when performing heavy queries
over big number of samples and time series.