github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2025-04-10 16:00:50 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	7a46af3920	victorialogs: add cluster mode Cluster mode is enabled when -storageNode command-line flag is passed to VictoriaLogs. In this mode it spreads the ingested logs among storage nodes specified in the -storageNode flag. It also queries storage nodes during `select` queries. Cluster mode allows building multi-level cluster setup when top-level select node can query multiple lower-level clusters and get global querying view. See https://docs.victoriametrics.com/victorialogs/cluster/ Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5077 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7950 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8223	2025-04-10 16:55:23 +02:00
Aliaksandr Valialkin	adae788b18	lib/logstorage: pad pipeStatsProcessorShard.groupMapShards in order to avoid false sharing when merging these shards in parallel on many CPU cores	2025-04-03 22:21:18 +02:00
Aliaksandr Valialkin	a65d10fcce	lib/logstorage: add padding between hitsMap items at hitsMapAdaptive.shards in order to avoid false sharing when processing the hitsMapAdaptive.shards on multiple CPU cores	2025-04-03 20:14:20 +02:00
Dan Dascalescu	0a49d8c930	chore: minor grammar fix in error messages (#8580) ### Describe Your Changes `its'` -> `its` ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).	2025-03-27 10:21:52 +01:00
Aliaksandr Valialkin	baf701889a	lib/logstorage: typo fix in the comment to Storage.GetStreamFieldValues() function	2025-03-19 13:22:16 +01:00
Aliaksandr Valialkin	ea4534d154	lib/logstorage: support for `{field in ()}` and `{field not_in ()}` syntax in LogsQL This is needed for https://github.com/VictoriaMetrics/victorialogs-datasource/issues/238 to be consistent with `in(*)` feature, which has been added in the commit `84d5771b41`	2025-03-19 12:09:55 +01:00
Guillem Jover	76d205feae	spelling and grammar fixes via codespell (#8497) ### Describe Your Changes Fix many spelling errors and some grammar, including misspellings in filenames. The change also fixes a typo in metric `vm_mmaped_files` to `vm_mmapped_files`. While this is a breaking change, this metric isn't used in alerts or dashboards. So it seems to have low impact on users. The change also deprecates `cspell` as it is much heavier and less usable. --------- Co-authored-by: Andrii Chubatiuk <achubatiuk@victoriametrics.com> Co-authored-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>	2025-03-17 16:32:10 +01:00
Aliaksandr Valialkin	336f954056	lib/logstorage: switch the type of LogRows.streamTagCanonicals from [][]byte to []string This reduces the size of LogRows.streamTagCanonicals by 1/3 because of the eliminated `cap` field in the slice header (reflect.SliceHeader) compared to the string header (reflect.StringHeader).	2025-03-17 15:02:51 +01:00
Aliaksandr Valialkin	13ff9a8ebd	lib/{mergeset,storage,logstorage}: use chunked buffer instead of bytesutil.ByteBuffer as a storage for in-memory parts This commit adds lib/chunkedbuffer.Buffer - an in-memory chunked buffer optimized for random access via MustReadAt() function. It is better than bytesutil.ByteBuffer for storing large volumes of data, since it stores the data in chunks of a fixed size (4KiB at the moment) instead of using a contiguous memory region. This has the following benefits over bytesutil.ByteBuffer: - reduced memory fragmentation - reduced memory re-allocations when new data is written to the buffer - reduced memory usage, since the allocated chunks can be re-used by other Buffer instances after Buffer.Reset() call Performance tests show up to 2x memory reduction for VictoriaLogs when ingesting logs with big number of fields (aka wide events) under high speed.	2025-03-15 20:58:33 +01:00
Aliaksandr Valialkin	73aae546e0	lib/logstorage: pre-allocate buffers for fields and rows inside block.appendRowsTo() This reduces the number of memory re-allocations inside the loop, which copies the rows.	2025-03-15 17:18:45 +01:00
Aliaksandr Valialkin	174a6db19f	lib/logstorage: pre-allocated buffers for fields and rows inside rows.appendRows() This should reduce the number of memory re-allocations inside the loop, which copies the rows.	2025-03-15 16:39:19 +01:00
Aliaksandr Valialkin	0e413a7efb	lib/logstorage: pre-allocate the buffer needed for marshaling a block of strings inside marshalStringsBlock This reduces the number of memory re-allocations when appending the strings to the buffer in the loop.	2025-03-15 15:56:33 +01:00
Aliaksandr Valialkin	9769ad3a24	lib/logstorage: optimize copying dict values inside valuesDict.copyFrom a bit Pre-allocate the needed slice of strings and then assign items to it by index instead of appending them. This reduces the number of memory allocations and improves performance a bit.	2025-03-15 15:32:21 +01:00
Aliaksandr Valialkin	8e773564b1	lib/logstorage: intern column names instead of cloning them during data ingestion This reduces the number of memory allocations when ingesting logs with big number of fields (aka wide events)	2025-03-15 15:29:54 +01:00
Aliaksandr Valialkin	2c7dd2b991	lib/logstorage: support for `{label in (v1,...,vN)}` and `{label not_in (v1, ..., vN)}` syntax	2025-03-15 01:35:13 +01:00
Aliaksandr Valialkin	c60b4175bb	app/vlinsert: add an ability to ignore log fields starting with the given prefixes The `ignore_fields` HTTP query args can contain prefixes ending with '*'. For example, `ignore_fields=foo.*,bar` skips all the fields starting with `foo.` during data ingestion.	2025-03-15 00:03:02 +01:00
Aliaksandr Valialkin	f874a3aa7b	lib/logstorage: show a link to query options docs in the error message emitted during failure to parse query options This should help figuring out and fixing the error by the user.	2025-03-15 00:03:02 +01:00
Aliaksandr Valialkin	8c079602c1	lib/logstorage: optimize handling long constant fields Long constant fields cannot be stored in columnsHeader as a const column, because their size exceeds maxConstColumnValueSize, so they are stored as regular values. This commit optimizes storing such fields by storing only a single value across the field values in a block instead of storing multiple values. This should improve data ingestion performance a bit. This also should improve query performance when the query accesses such fields because of better cache locality. Also improve persisting of constant string lengths by storing them only once.	2025-03-14 03:14:01 +01:00
Aliaksandr Valialkin	974d504043	lib/logstorage: add a test for marshalUint64Block / unmarshalUint64Block	2025-03-14 03:14:00 +01:00
Aliaksandr Valialkin	c62ccf11ae	lib/logstorage: newTestLogRows: create a const column, which cannot be stored in the column header because its length exceeds maxConstColumnValueSize	2025-03-14 03:14:00 +01:00
Aliaksandr Valialkin	4d44c3e154	lib/logstorage: properly parse floating-point numbers with leading zeroes in fractional part Parsing for floating-point numbers with leading zeroes such as 1.023, 1.00234 has been broken in the commit `ae5e28524e` . Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8464 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8361	2025-03-12 15:25:58 +01:00
Roman Khavronenko	63f6ac3ff8	lib/promutils: move time-related funcs from `promutils` to `timeutil` (#8403) Since funcs `ParseDuration` and `ParseTimeMsec` are used in vlogs, vmalert, victoriametrics and other components, importing promutils only for this reason makes them export irrelevant `vm_rows_invalid_total{type="prometheus"}` metric. This change removes `vm_rows_invalid_total{type="prometheus"}` metric from /metrics page for these components. ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2025-03-03 10:25:42 +01:00
Aliaksandr Valialkin	744ac496bd	lib/logstorage: add ability to specify field name prefixes inside `fields (...)` lists passed to `pack_json` and `pack_logfmt` pipes	2025-02-27 22:54:18 +01:00
Aliaksandr Valialkin	84d5771b41	lib/logstorage: allow passing `*` at `in()`, `contains_any()` and `contains_all()` Such filters are equivalent to `match all` filter aka `*`. These filters are needed for VictoriaLogs plugin for Grafana. See https://github.com/VictoriaMetrics/victorialogs-datasource/issues/238#issuecomment-2685447673	2025-02-27 11:37:43 +01:00
Aliaksandr Valialkin	ae5e28524e	lib/logstorage: do not treat a string with leading zeros as a number at tryParseUint64 The "00123" string shouldn't be treated as 123 number. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8361	2025-02-25 22:10:36 +01:00
Aliaksandr Valialkin	1f75e5bb59	lib/logstorage: optimize common regex filters generated by Grafana For example, `field:~".+"`, `field:~".*"` or `field:""` Replace such filters with faster ones. For example, `field:~".*"` is replaced with `*`, while `field:~".+"` is replaced with `field:*`.	2025-02-25 20:34:28 +01:00
Aliaksandr Valialkin	82cdcec6c6	lib/logstorage: run `make fmt` after `30974e7f3f`	2025-02-25 18:37:40 +01:00
Aliaksandr Valialkin	30974e7f3f	lib/logstorage: add `le_field` and `lt_field` filters These filters can be used for selecting logs where one field value is less than another field value. These filter complement `<=` and `<` filters for constant literals.	2025-02-25 18:24:50 +01:00
Aliaksandr Valialkin	edc750dd55	lib/logstorage: optimize eq_filter when it is applied to fields of the same type	2025-02-25 18:24:15 +01:00
Aliaksandr Valialkin	bc69d5f1a4	lib/mergeset: explicitly pass the interval for flushing in-memory data to disk at MustOpenTable() This allows using different intervals for flushing in-memory data among different mergeset.Table instances. The initial user of this feature is lib/logstorage.Storage, which explicitly passes Storage.flushInterval to every created mergeset.Table instance. Previously mergeset.Table instances were using 5 seconds flush interval, which didn't depend on the Storage.flushInterval. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4775	2025-02-24 15:23:50 +01:00
Aliaksandr Valialkin	6764a03fbe	lib/logstorage: properly use datadb.flushInterval as an interval between flushes for the in-memory parts The dataFlushInterval variable has been mistakenly introduced in the commit `9dbd0f9085` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4775	2025-02-24 15:06:36 +01:00
Aliaksandr Valialkin	2857188939	lib/logstorage: limit the maximum log field name length, which can be generated by JSONParser.ParseLogMessage Make sure that the maximum log field name, which can be generated by JSONParser.ParseLogMessage, doesn't exceed the hardcoded limit maxFieldNameSize. Stop flattening of nested JSON objects when the resulting field name becomes longer than maxFieldNameSize, and return the nested JSON object as a string instead. This should prevent from parse errors when ingesting deeply nested JSON logs with long field names.	2025-02-24 12:55:24 +01:00
Aliaksandr Valialkin	c0b9732fc8	lib/logstorage: add a benchmark for JSONParser.ParseLogMessage	2025-02-24 12:50:49 +01:00
Aliaksandr Valialkin	f1eac36a80	lib/logstorage: add `contains_any` and `contains_all` filters - `contains_any` selects logs with fields containing at least one word/phrase from the provided list. The provided list can be generated by a subquery. - `contains_all` selects logs with fields containing all the words and phrases from the provided list. The provided list can be generated by a subquery.	2025-02-22 21:55:58 +01:00
Aliaksandr Valialkin	4275653f03	lib/logstorage: do not spend CPU time on preparing values for already filtered out rows according to bm at filterEqField.applyToBlockSearch	2025-02-22 21:55:58 +01:00
Aliaksandr Valialkin	1ffd5e9b69	lib/logstorage: avoid extra memory allocations at getEmptyStrings()	2025-02-22 21:55:57 +01:00
Aliaksandr Valialkin	c372e10937	lib/logstorage: add an ability to drop duplicate words at unpack_words pipe	2025-02-22 21:55:57 +01:00
Aliaksandr Valialkin	7da98b540b	lib/logstorage: rename unpack_tokens to unpack_words pipe The LogsQL defines a word at https://docs.victoriametrics.com/victorialogs/logsql/#word , so it is more natural to use unpack_words instead of unpack_tokens name for the pipe.	2025-02-22 21:55:56 +01:00
Aliaksandr Valialkin	387d0369da	lib/logstorage: optimize `OR` filter a bit for many inner filters Use two operations on bitmaps per each inner filter instead of three operations.	2025-02-22 21:55:56 +01:00
Aliaksandr Valialkin	69f02f83ae	lib/logstorage: use clear() for clearing bitmap bits at resetBits() instead of a loop The clear() call is easier to read and understand than the loop.	2025-02-22 21:55:55 +01:00
Aliaksandr Valialkin	3c1d738196	lib/logstorage: avoid calling bitmap.reset() at getBitmap() The bitmap at getBitmap() must be already reset when it was returned to the pool via putBitmap(). This saves CPU a bit.	2025-02-22 21:55:55 +01:00
Aliaksandr Valialkin	35e1c35281	lib/logstorage: improve error logging for improperly escaped backslashes inside quoted strings This should simplify debugging LogsQL queries by users	2025-02-22 21:55:54 +01:00
Aliaksandr Valialkin	dfcfaba374	lib/logstorage: add `field1:eq_field(field2)` filter, which returns logs with identical values at field1 and field2	2025-02-22 21:55:54 +01:00
Aliaksandr Valialkin	cd5b24b377	lib/logstorage: optimize `len`, `hash` and `json_array_len` pipes for repeated values Re-use the previous result instead of calculating new result for repeated input values	2025-02-22 21:55:54 +01:00
Aliaksandr Valialkin	d33e24ab9b	lib/logstorage: add `json_array_len` pipe for calculating the length of JSON arrays	2025-02-22 21:55:53 +01:00
Aliaksandr Valialkin	cd73c1bafb	lib/logstorage: refactor unroll_tokens into unpack_tokens pipe unpack_tokens pipe generates a JSON array of unpacked tokens from the source field. This composes better with other pipes such as unroll pipe.	2025-02-22 21:55:53 +01:00
Aliaksandr Valialkin	d32c697361	lib/logstorage: add `unroll_tokens` pipe for unrolling individual word tokens from the log field	2025-02-22 21:55:52 +01:00
Aliaksandr Valialkin	1ea3f72d50	lib/logstorage: simplify usage of `top`, `uniq` and `unroll` pipes by allowing comma-separated list of fields without parens Examples: - `top 5 x, y` is equivalent to `top 5 by (x, y)` - `uniq foo, bar` is equivalent to `uniq by (foo, bar)` - `unroll foo, bar` is equivalent to `unroll (foo, bar)`	2025-02-20 22:36:09 +01:00
Aliaksandr Valialkin	31e88a692d	lib/logstorage: properly handle _time:<=max_time filter _time:<=max_time filter must include logs with timestamps matching max_time. For example, _time:<=2025-02-24Z must include logs with timestamps until the end of February 24, 2025.	2025-02-20 19:15:37 +01:00
Aliaksandr Valialkin	ffbd0ebbae	lib/logstorage: allow using '>', '>=', '<' and '<=' in '_time:...' filter Examples: _time:>=2025-02-24Z selects logs with timestamps bigger or equal to 2025-02-24 UTC _time:>1d selects logs with timestamps older than one day comparing to the current time This simplifies writing queries with _time filters. See https://docs.victoriametrics.com/victorialogs/logsql/#time-filter	2025-02-20 19:04:51 +01:00

336 commits