Commit graph

43 commits

Author SHA1 Message Date
Guillem Jover
1d8b7faf71
spelling and grammar fixes via codespell ()
### Describe Your Changes

Fix many spelling errors and some grammar, including misspellings in
filenames.

The change also fixes a typo in metric `vm_mmaped_files` to `vm_mmapped_files`.
While this is a breaking change, this metric isn't used in alerts or dashboards.
So it seems to have low impact on users.

The change also deprecates `cspell` as it is much heavier and less usable.
---------

Co-authored-by: Andrii Chubatiuk <achubatiuk@victoriametrics.com>
Co-authored-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>

(cherry picked from commit 76d205feae)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-03-17 16:38:11 +01:00
Aliaksandr Valialkin
6f9d70ae89
lib/{mergeset,storage,logstorage}: use chunked buffer instead of bytesutil.ByteBuffer as a storage for in-memory parts
This commit adds lib/chunkedbuffer.Buffer - an in-memory chunked buffer
optimized for random access via MustReadAt() function.
It is better than bytesutil.ByteBuffer for storing large volumes of data,
since it stores the data in chunks of a fixed size (4KiB at the moment)
instead of using a contiguous memory region. This has the following benefits over bytesutil.ByteBuffer:

- reduced memory fragmentation
- reduced memory re-allocations when new data is written to the buffer
- reduced memory usage, since the allocated chunks can be re-used
  by other Buffer instances after Buffer.Reset() call

Performance tests show up to 2x memory reduction for VictoriaLogs
when ingesting logs with big number of fields (aka wide events) under high speed.
2025-03-15 21:20:04 +01:00
Roman Khavronenko
53904f8816
lib/bytesutil: don't drop ByteBuffer.B when its capacity is bigger th… ()
…an 64KB at Reset

This commit reverts
b58e2ab214
as it has negative impacts when ByteBuffer is used for workloads that
always exceed 64KiB size. This significantly slows down affected
components because:
* buffers aren't beign reused;
* growing new buffers to >64KiB is very slow.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8501

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2025-03-15 01:39:01 +01:00
Aliaksandr Valialkin
0a8d52376e
lib/bytesutil: drop ByteBuffer.B when its capacity is bigger than 64KB at Reset
There is little sense in keeping too big buffers - they just waste RAM and do not reduce
the load on GC too much. So it is better dropping such buffers at Reset instead of keeping them around.

(cherry picked from commit b58e2ab214)
2025-02-19 13:30:03 +01:00
Roman Khavronenko
c41a9b8d17
lib/bytesutil: smooth buffer growth rate ()
Before, buffer growth was always x2 of its size, which could lead to
excessive memory usage when processing big amount of data.
For example, scraping a target with hundreds of MBs in response could
result into hih memory spikes in vmagent because buffer has to double
its size to fit the response. See
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6759

The change smoothes out the growth rate, trading higher allocation rate
for lower mem usage at certain conditions.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit f28f496a9d)
2024-08-07 16:59:23 +02:00
Aliaksandr Valialkin
d6415b2572
all: consistently use 'any' instead of 'interface{}'
'any' type is supported starting from Go1.18. Let's consistently use it
instead of 'interface{}' type across the code base, since `any` is easier to read than 'interface{}'.
2024-07-10 00:23:26 +02:00
Aliaksandr Valialkin
faf07fbc67
lib/bytesutil: optimize internStringMap cleanup
- Make it in a separate goroutine, so it doesn't slow down regular intern() calls.

- Do not lock internStringMap.mutableLock during the cleanup routine, since now
  it is called from a single goroutine and reads only the readonly part of the internStringMap.
  This should prevent from locking regular intern() calls for new strings during cleanups.

- Add jitter to the cleanup interval in order to prevent from synchornous increase in resource usage
  during cleanups.

- Run the cleanup twice per -internStringCacheExpireDuration . This should save 30% CPU time spent
  on cleanup comparing to the previous code, which was running the cleanup 3 times per -internStringCacheExpireDuration .
2024-06-13 15:09:42 +02:00
Aliaksandr Valialkin
9135b404d9
lib/logstorage: work-in-progress 2024-06-11 17:51:01 +02:00
Aliaksandr Valialkin
6470eac7dc
lib/bytesutil: reduce the number of memory allocations per each interned string in bytesutil.InternString() from 5 to 1
This should reduce GC overhead when tens of millions of strings are interned (for example, during stream deduplication
of millions of active time series).
2024-06-10 18:06:24 +02:00
Aliaksandr Valialkin
87338633b1
lib/slicesutil: add helper functions for setting slice length and extending its capacity
The added helper functions - SetLength() and ExtendCapacity() - replace error-prone code with simple function calls.
2024-05-12 11:33:49 +02:00
Aliaksandr Valialkin
99269ea640
lib/bytesutil: use unsafe.String instead of unsafe conversion of slice header to string header 2024-02-29 17:28:04 +02:00
Aliaksandr Valialkin
3383f73191
lib/bytesutil: make BenchmarkToUnsafeString and BenchmarkToUnsafeBytes more reliable
This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5880
2024-02-29 17:12:30 +02:00
helen
74b0605232
Optimize TouUnsafeBytes to make it leaner, more standards-compliant and ()
slightly faster.
2024-02-29 17:12:04 +02:00
Aliaksandr Valialkin
d845edc24b
lib: consistently use atomic.* types instead of atomic.* functions
See ea9e2b19a5
2024-02-24 02:10:04 +02:00
Aliaksandr Valialkin
d9ecc3f6d7
lib/logger: add -loggerMaxArgLen command-line flag for fine-tuning the maximum length of logged args 2023-11-13 09:43:49 +01:00
Aliaksandr Valialkin
d8afd7fe98
Makefile: update golangci-lint from v1.51.2 to v1.54.2
See https://github.com/golangci/golangci-lint/releases/tag/v1.54.2
2023-09-01 10:25:49 +02:00
Roman Khavronenko
80768d53dd
docs: follow-up after aec4b5db81 ()
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-19 14:48:17 -07:00
Aliaksandr Valialkin
af6c14d5e7
lib/bytesutil: substitute parentheses with slashes in ByteBuffer.Path() output, so it can be passed to path manipulating functions
This is needed for the upcoming VictoriaLogs
2023-07-06 17:23:52 -07:00
Aliaksandr Valialkin
a24e08e6de
lib/bytesutil: go fmt after 2ec17bed2c 2023-05-10 20:29:15 -07:00
Aliaksandr Valialkin
b7239c2221
lib/bytesutil: add benchmarks for ToUnsafeString() and ToUnsafeBytes() 2023-05-10 13:05:33 -07:00
Alexander Marshalov
d321ea91f2
fixed typos in documentation and commandline flags descriptions () 2023-05-10 02:22:06 -07:00
Aliaksandr Valialkin
36559dfec2
lib/fs: improve error logging inside MustWriteData
Log the path to file on errors inside MustWriteData().
This improves debuggability of errors, which may occur inside MustWriteData().
2023-04-14 14:33:45 -07:00
Aliaksandr Valialkin
086a4b4fca
lib/bytesutil: add -internStringDisableCache and -internStringCacheExpireDuration command-line flags
This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3872
2023-02-27 14:18:02 -08:00
Aliaksandr Valialkin
a522bbc8b4
lib/bytesutil/internstring.go: increase the limit on the maximum string lengths, which can be interned
The limit has been increased from 300 bytes to 500 bytes according to the collected production stats.
This allows reducing CPU usage without significant increase of RAM usage in most practical cases.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3692
2023-01-31 11:04:09 -08:00
Aliaksandr Valialkin
0698467ae5
lib/bytesutil: do not intern long strings, since they may need big amounts of additional memory for the cache
Allow users fine-tuning the maximum string length for interning via -internStringMaxLen command-line flag.
This may be used for fine-tuning RAM vs CPU usage for certain workloads.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3692
2023-01-23 23:37:08 -08:00
Aliaksandr Valialkin
d3f8298739
lib/bytesutil: add InternBytes() function as a shortcut to InternString(ToUnsafeString(..)) 2023-01-03 22:15:49 -08:00
Aliaksandr Valialkin
fcee36081b
lib/bytesutil: make sure that the cleanup code is performed only by a single goroutine out of many concurrently running goroutines
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3466
2022-12-21 12:58:27 -08:00
Aliaksandr Valialkin
f3e5c9c246
lib/bytesutil: cache results for all the input strings, which were passed during the last 5 minutes from FastStringMatcher.Match(), FastStringTransformer.Transform() and InternString()
Previously only up to 100K results were cached.
This could result in sub-optimal performance when more than 100K unique strings were actually used.
For example, when the relabeling rule was applied to a million of unique Graphite metric names
like in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3466

This commit should reduce the long-term CPU usage for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3466
after all the unique Graphite metrics are registered in the FastStringMatcher.Transform() cache.

It is expected that the number of unique strings, which are passed to FastStringMatcher.Match(),
FastStringTransformer.Transform() and to InternString() during the last 5 minutes,
is limited, so the function results fit memory. Otherwise OOM crash can occur.
This should be the case for typical production workloads.
2022-12-12 14:47:00 -08:00
Aliaksandr Valialkin
be6da5053f
lib/promscrape: optimize service discovery speed
- Return meta-labels for the discovered targets via promutils.Labels
  instead of map[string]string. This improves the speed of generating
  meta-labels for discovered targets by up to 5x.

- Remove memory allocations in hot paths during ScrapeWork generation.
  The ScrapeWork contains scrape settings for a single discovered target.
  This improves the service discovery speed by up to 2x.
2022-11-29 21:26:23 -08:00
Aliaksandr Valialkin
ed324aad66
lib/bytesutil: make sure that the string passed to FastStringMather.Match() is copied before using it as a key in the internal cache map
This prevents from possible corruption of the internal cache map
when the underlying byte slice used by the string key is modified.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3227
2022-10-14 09:52:18 +03:00
Aliaksandr Valialkin
b4bb1477fe
lib/regexutil: cache MatchString results for unoptimized regexps
This increases relabeling performance by 3x for unoptimized regexs
2022-09-30 12:28:25 +03:00
Aliaksandr Valialkin
c0aa10bd73
lib/bytesutil: move InternString() from lib/promscrape/discoverytutils to lib/bytesutil
lib/bytesutil is more appropriate place for InternString() function
2022-09-30 07:34:59 +03:00
Aliaksandr Valialkin
4afa25fb38
lib/bytesutil: add FastStringTransformer and use it in the rest of the code where needed 2022-09-28 10:39:42 +03:00
Aliaksandr Valialkin
4fb0f15322
all: readability improvements for query traces
- show dates in human-readable format, e.g. 2022-05-07, instead of a numeric value
- limit the maximum length of queries and filters shown in trace messages
2022-06-30 18:19:43 +03:00
Aliaksandr Valialkin
02b2bfcff3
lib/bytesutil: split Resize* funcs to MayOverallocate and NoOverallocate for more fine-grained control over memory allocations
Follow-up for f4989edd96

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-02-01 11:20:20 +02:00
Aliaksandr Valialkin
6232eaa938
lib/bytesutil: split Resize() into ResizeNoCopy() and ResizeWithCopy() functions
Previously bytesutil.Resize() was copying the original byte slice contents to a newly allocated slice.
This wasted CPU cycles and memory bandwidth in some places, where the original slice contents wasn't needed
after slize resizing. Switch such places to bytesutil.ResizeNoCopy().

Rename the original bytesutil.Resize() function to bytesutil.ResizeWithCopy() for the sake of improved readability.

Additionally, allocate new slice with `make()` instead of `append()`. This guarantees that the capacity of the allocated slice
exactly matches the requested size. The `append()` could return a slice with bigger capacity as an optimization for further `append()` calls.
This could result in excess memory usage when the returned byte slice was cached (for instance, in lib/blockcache).

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-25 15:28:42 +02:00
Aliaksandr Valialkin
76064ba9e7 Perform conversion from string to []byte according to rule at https://golang.org/pkg/unsafe/#Pointer 2020-08-05 11:55:12 +03:00
Aliaksandr Valialkin
304f9499cf lib/bytesutil: prevent from garbage collecting s before returning from ToUnsafeBytes 2020-06-03 00:23:27 +03:00
Aliaksandr Valialkin
da19fffa08 all: rename ReadAt* to MustReadAt* in order to dont clash with io.ReaderAt 2020-01-30 15:16:16 +02:00
Artem Navoiev
e05500cbd4 add unittests for bytesutil and storage () 2019-11-04 00:57:24 +02:00
Aliaksandr Valialkin
24ae3ef532 lib/prompb: remove superflouos bytes copying in ReadSnappy 2019-06-18 21:02:02 +03:00
Aliaksandr Valialkin
bdf696ef18 all: fix misspellings 2019-05-25 21:51:24 +03:00
Aliaksandr Valialkin
1836c415e6 all: open-sourcing single-node version 2019-05-23 00:18:06 +03:00