Commit graph

2367 commits

Author SHA1 Message Date
Aliaksandr Valialkin
2a5c6e1cd5
app/vmstorage: deprecate -snapshotCreateTimeout command-line flag
Creating snapshot shouldn't time out under normal conditions.
The timeout was related to the bug, which has been fixed in 6460475e3b .

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551
2024-02-23 04:51:57 +02:00
Aliaksandr Valialkin
42437e05c7
lib/storage: do not drop (date, metricID) entries for the date older than 2 days if samples are ingested at this date
Previously the (date, metricID) entries for dates older than the last 2 days were removed.
This could lead to slow check for the (date, metricID) entry in the indexdb during ingesting historical data (aka backfilling).

The issue has been introduced in 431aa16c8d
2024-02-23 04:06:54 +02:00
Aliaksandr Valialkin
83217b7473
app/vmselect: add -search.maxLabelsAPIDuration and -search.maxLabelsAPISeries options for fine-tuning CPU and RAM usage for /api/v1/series , /api/v1/labels and /api/v1/label/.../values
This commit returns back limits for these endpoints, which have been removed at 5d66ee88bd ,
since it has been appeared that missing limits result in high CPU usage, while the introduced concurrency limiter
results in failed lightweight requests to these endpoints because of timeout when heavyweight requests are executed.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5055
2024-02-23 02:56:58 +02:00
Aliaksandr Valialkin
21170e558c
lib/promutils: hide the math.Round() logic inside ParseTimeMsec() function
This should prevent from bugs similar to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5801 in the future

This is a follow-up for ce3ec3ff2e
2024-02-23 01:21:42 +02:00
Aliaksandr Valialkin
dfcbcf4368
lib/mergeset: run go fmt after bace9a2501 2024-02-23 01:21:31 +02:00
Aliaksandr Valialkin
19032f9913
lib/{mergeset,storage}: convert bufferred items to searchable parts more optimally
Do not convert shard items to part when a shard becomes full. Instead, collect multiple
full shards and then convert them to a searchable part at once. This reduces
the number of searchable parts, which, in turn, should increase query performance,
since queries need to scan smaller number of parts.
2024-02-23 01:21:03 +02:00
Nikolay
22762d7a69
app/vmselect: change export/csv timestamp format for rfc3339 to respect milliseconds (#5853)
* app/vmselect: adds milliseconds to the csv export response for rfc3339
* milliseconds is a standard prescion for VictoriaMetrics query request responses
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5837

* app/victoria-metrics: adds tests for csv export/import
follow-up after 3541a8d0cf96dd4f8563624c4aab6816615d0756

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-02-23 01:16:08 +02:00
Aliaksandr Valialkin
08c5250a7b
lib/storage: handle common case when the number of rows passed to flushRowsToInmemoryParts() doesnt exceed maxRawRowsPerShard 2024-02-23 01:12:18 +02:00
Aliaksandr Valialkin
8669584e9f
lib/{storage,mergeset}: convert beffered items into searchable in-memory parts exactly once per the given flush interval
Previously the interval between item addition and its conversion to searchable in-memory part
could vary significantly because of too coarse per-second precision. Switch from fasttime.UnixTimestamp()
to time.Now().UnixMilli() for millisecond precision. It is OK to use time.Now() for tracking
the time when buffered items must be converted to searchable in-memory parts, since time.Now()
calls aren't located in hot paths.

Increase the flush interval for converting buffered samples to searchable in-memory parts
from one second to two seconds. This should reduce the number of blocks, which are needed
to be processed during high-frequency alerting queries. This, in turn, should reduce CPU usage.

While at it, hardcode the maximum size of rawRows shard to 8Mb, since this size gives the optimal
data ingestion pefromance according to load tests. This reduces memory usage and CPU usage on systems
with big amounts of RAM under high data ingestion rate.
2024-02-23 01:11:57 +02:00
Aliaksandr Valialkin
5f1fa8e7f7
lib/storage: avoid superflouos copy of block header data 2024-02-23 01:11:31 +02:00
Aliaksandr Valialkin
a982ab6bfb
app/vmstorage: expose vm_snapshots metric, which shows the current number of snapshots
While at it, refresh docs about snapshots - https://docs.victoriametrics.com/#how-to-work-with-snapshots
2024-02-23 01:07:04 +02:00
Aliaksandr Valialkin
3f9022bc08
lib/storage: do not pool rawRowsBlock when flushing rawRows to in-memory blocks
The pooled rawRowsBlock objects occupies big amounts of memory between flushes,
and the flushes are relatively rare. So it is better to don't use the pool
and to allocate rawRow blocks on demand. This should reduce the average
memory usage between flushes.
2024-02-23 01:06:28 +02:00
Aliaksandr Valialkin
bf07e2ac87
lib/storage: do not keep rawRows buffer across flush() calls
The buffer can be quite big under high ingestion rate (e.g. more than 100MB).
This leads to increased memory usage between buffer flushes.
So it is better to re-create the buffer on every flush in order to reduce memory usage
between buffer flushes.
2024-02-23 01:06:09 +02:00
Alexander Marshalov
8322425364
[lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5814)
* [lib/promutils, lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5801)

* fixed tests

* fixed test

* Revert "fixed test"

This reverts commit 8a29764806.

* Revert "fixed tests"

This reverts commit 9ce13d1042.

* Revert "[lib/promutils, lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5801)"

This reverts commit a7a04bd4

* [lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5801)

---------

Co-authored-by: Nikolay <nik@victoriametrics.com>
2024-02-23 00:58:26 +02:00
Aliaksandr Valialkin
b58c429044
app/vlselect: follow-up for 451d2abf50
- Consistently return the first `limit` log entries if the total size of found log entries doesn't exceed 1Mb.
  See app/vlselect/logsql/sort_writer.go . Previously random log entries could be returned with each request.
- Document the change at docs/VictoriaLogs/CHANGELOG.md
- Document the `limit` query arg at docs/VictoriaLogs/querying/README.md
- Make the change less intrusive.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5674
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5778
2024-02-18 23:06:08 +02:00
Dmytro Kozlov
2d674f98d4
Enable the limit query param for the /select/logsql/query (#5778)
* app/vlselect: add limit for logs query

* app/vlselect: CHANGELOG.md

* app/vlselect: stop search process if limit is reached, update logic, remove default limit

* app/vlselect: fix tests

* app/vlselect: fix filter tests

* app/vlselect: fix tests
2024-02-18 22:59:16 +02:00
Aliaksandr Valialkin
82e38e1627
lib/promscrape: add support for enable_compression option in the same way as Prometheus does
Updates https://github.com/prometheus/prometheus/pull/13166
Updates https://github.com/prometheus/prometheus/issues/12319

Do not document enable_compression option at docs/sd_configs.md, since vmagent already supports
more clear disable_compression option - see https://docs.victoriametrics.com/vmagent/#scrape_config-enhancements
2024-02-18 19:42:09 +02:00
Aliaksandr Valialkin
f0db7d474f
lib/promscrape/discovery/kuma: add support for client_id option
See https://github.com/prometheus/prometheus/pull/13278
2024-02-18 19:19:55 +02:00
Aliaksandr Valialkin
55bba932d4
docs/CHANGELOG.md: document f8207e33a2 2024-02-17 17:55:01 +02:00
Alexander Marshalov
89e9bfc276
lib/httputils: fixed error message for getting zero duration (#5795) (#5812)
(cherry picked from commit f8207e33a2)
2024-02-16 15:31:59 +01:00
Aliaksandr Valialkin
33b2553c78
app/vmstorage: expose vm_last_partition_parts metrics, which may help identifying performance issues related to the increased number of parts in the last partition 2024-02-15 14:52:53 +02:00
Aliaksandr Valialkin
d4875cccdf
lib/uint64set: go fmt after c0a9b87f46 2024-02-15 14:52:53 +02:00
Aliaksandr Valialkin
4e9b70e8b4
lib/mergeset: optimize Set.AddMulti() a bit for len(items) < 10000
This should improve the search speed for time series matching the given label filters
2024-02-15 14:31:00 +02:00
Aliaksandr Valialkin
c89f4c97f3
lib/uint64set: benchmark AddMulti on small number of items, since this case is the most frequent in lib/storage 2024-02-15 14:31:00 +02:00
Aliaksandr Valialkin
06da06dac0
lib/promrelabel: store the original labels before returning them them to promutils.PutLabels()
This should reduce memory allocations.

This is a follow-up for b09bd6c42a

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389
2024-02-14 16:09:38 +02:00
Aliaksandr Valialkin
990a46c478
lib/promrelabel: factor out applyInternal code into ApplyDebug and Apply functions
This improves readability and maintanability

Also remove memory allocation from SortLabels()
2024-02-14 14:27:44 +02:00
Aliaksandr Valialkin
61608b6303
lib/promscrape: avoid copying labels when -promscrape.dropOriginalLabels command-line flag is set
This should save some CPU

This regression has been introduced in 487f6380d0
when working on https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389
2024-02-14 03:26:32 +02:00
Aliaksandr Valialkin
f5680a6857
all: upgrade Go builder from Go1.21.7 to Go1.22.0
See https://go.dev/doc/go1.22
2024-02-12 22:14:00 +02:00
Aliaksandr Valialkin
99aaa5067f
lib/mergeset: do not panic on too long items passed to Table.AddItems()
Instead, log a sample of these long items once per 5 seconds into error log,
so users could notice and fix the issue with too long labels or too many labels.

Previously this panic could occur in production when ingesting samples with too long labels.
2024-02-12 20:18:19 +02:00
Aliaksandr Valialkin
397bb8771b
lib/mergeset: properly record the firstItem in metaindexRow at blockStreamWriter.WriteBlock
The 3c246cdf00 added an optimization where the previous metaindexRow
could be saved to disk when the current block header couldn't be added indexBlock because the resulting
indexBlock size became too big. This could result in an empty metaindexRow.firstItem for the next metaindexRow.
2024-02-12 20:16:50 +02:00
Aliaksandr Valialkin
838b2275d7
lib/storage: do not append headerData to bsw.indexData if its size exceeds maxBlockSize
This is a follow-up optimization after 3c246cdf00
2024-02-12 20:16:32 +02:00
Aliaksandr Valialkin
ae12ac69ba
lib/snapshot: move Time, Validate and NewName into lib/snapshot/snapshotutil package
This allows removing importing unneeded command-line flags into binaries, which import lib/storage,
which, in turn, was importing lib/snapshot in order to use Time, Validate and NewName functions.

This is a follow-up for 83e55456e2

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5738
2024-02-09 04:19:30 +02:00
Aliaksandr Valialkin
cf64597878
all: add support for specifying multiple -httpListenAddr options 2024-02-09 03:22:49 +02:00
Aliaksandr Valialkin
ae7da12280
lib/httpserver: do not close client connections every 2 minutes by default
Closing client connections every 2 minutes doesn't help load balancing -
this just leads to "jumpy" connections between multiple backend servers,
e.g. the load isn't spread evenly among backend servers, and instead jumps
between the servers every 2 minutes.

It is still possible periodically closing client connections by specifying non-zero -http.connTimeout command-line flag.

This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1304#issuecomment-1636997037

This is a follow-up for d387da142e
2024-02-08 21:10:54 +02:00
Khushi Jain
a076cb4a93
app/vmbackup: support client-side TLS configuration for create/delete snapshot API (#5738)
(cherry picked from commit 83e55456e2)
2024-02-08 15:58:34 +01:00
Aliaksandr Valialkin
1856c9fcc1
lib/mergeset: add a test for too long item passed to Table.AddItems() 2024-02-08 14:14:23 +02:00
Aliaksandr Valialkin
d2a846eddd
lib/mergeset: typo fix: indexdb/indexBlock -> indexdb/indexBlocks 2024-02-08 14:14:23 +02:00
Aliaksandr Valialkin
950b126a09
lib/{storage,mergeset}: do not create index blocks with sizes exceeding 64Kb in common case
This should reduce memory fragmentation and memory usage for indexdb/indexBlocks and storage/indexBlocks caches
2024-02-08 14:14:22 +02:00
Aliaksandr Valialkin
1c3eac5c1e
lib/mergeset: verify that the index block for in-memory part doesnt exceed the 3*maxIndexBlockSize 2024-02-08 14:14:22 +02:00
Aliaksandr Valialkin
9a3a88b321
lib/mergeset: do not store commonPrefix in blockHeader if the block contains only a single item
There is no sense in storing commonPrefix for blockHeader containing only a single item,
since this only increases blockHeader size without any benefits.
2024-02-08 14:14:22 +02:00
Aliaksandr Valialkin
ae2a9c8195
lib/mergeset: prevent from possible too big indexBlockSize panic
This panic could occur when samples with too long label values are ingested into VictoriaMetrics.
This could result in too long fistItem and commonPrefix values at blockHeader (up to 64kb each).
This may inflate the maximum index block size by 4 * maxIndexBlockSize.
2024-02-08 12:55:58 +02:00
Aliaksandr Valialkin
ec02e9ba19
lib/protoparser/datadogsketches: use math.RoundToEven() for calculating the rank
The original code uses this function - see 48d52eeea6/pkg/quantile/sparse.go (L138)

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5775
2024-02-07 21:45:05 +02:00
Aliaksandr Valialkin
28fffdfcc7
lib/protoparser/datadogsketches: add more permalinks to the original source code
These permalinks should help verifying the correctness of the code

This is a follow-up after 07213f4e0c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5775
2024-02-07 21:45:05 +02:00
Andrii Chubatiuk
3aa439a618
added ddsketch permalink (#5775)
Co-authored-by: Andrew Chubatiuk <andrew.chubatiuk@motional.com>
2024-02-07 21:45:04 +02:00
Aliaksandr Valialkin
5d9e0ab71e
docs/CHANGELOG.md: support empty command-line flag values in short array notation
For example, -fooDuration=',10s,' is now supported - it sets three command-line flag values:

- the first and the last one are set to the default value for `-fooDuration`
- the second one is set to 10s
2024-02-07 20:55:01 +02:00
Aliaksandr Valialkin
82f4e4e070
app/{vmagent,vminsert}: follow-up after a1d1ccd6f2
- Document the change at docs/CHANGELOG.md
- Copy changes from docs/Single-server-VictoriaMetrics.md to README.md
- Add missing handler for processing multitenant requests ( https://docs.victoriametrics.com/vmagent/#multitenancy )
- Substitute github.com/stretchr/testify dependency with 3 lines of code in the added tests
- Comment unclear code at lib/protoparser/datadogsketches/parser.go , so @AndrewChubatiuk could update it
  and add permalinks to the original source code there.
- Various code cleanups

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5584
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3091
2024-02-07 01:31:52 +02:00
Andrii Chubatiuk
c634859c4f
support datadog /api/beta/sketches API (#5584)
Co-authored-by: Andrew Chubatiuk <andrew.chubatiuk@motional.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-02-07 01:30:00 +02:00
Aliaksandr Valialkin
293617028d
lib/storage: move fixupTimestamps() call to Block.Init()
This is a follow-up for 0bf7921721
2024-02-06 22:44:09 +02:00
Zakhar Bessarab
fdbc44d813
lib/storage/raw_row: properly initialize TS for tmp blocks (#5762)
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2024-02-06 22:44:08 +02:00
Aliaksandr Valialkin
e19b53748a
lib/fs: lazily open the file at ReaderAt on the first access
This should significantly reduce the number of open ReaderAt files
on VictoriaMetrics and VictoriaLogs startup.

The open files can be tracked via vm_fs_readers metric
2024-02-06 21:10:00 +02:00