github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-11 14:53:49 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	246c339e3d	lib/logstorage: read timestamps column when it is really needed during query execution Previously timestamps column was read unconditionally on every query. This could significantly slow down queries, which do not need reading this column like in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7070 .	2024-09-25 19:18:37 +02:00
Aliaksandr Valialkin	180137a377	lib/logstorage: improve the performance of obtaining _stream column value Substitute global streamTagsCache with per-blockSearch cache for ((stream.id) -> (_stream value)) entries. This improves scalability of obtaining _stream values on a machine with many CPU cores, since every CPU has its own blockSearch instance. This also should reduce memory usage when querying logs over big number of streams, since per-blockSearch cache of ((stream.id) -> (_stream value)) entries is limited in size, and its lifetime is bounded by a single query.	2024-09-24 20:57:39 +02:00
Aliaksandr Valialkin	9d11a21541	lib/logstorage/consts.go: document that it isn't recommended setting maxColumnsPerBlock constant to too big values This should help avoiding cases like this one - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6425#issuecomment-2337446083	2024-09-24 18:52:54 +02:00
Aliaksandr Valialkin	1264350566	lib/logstorage: improve performance for streamID.marshalString() by more than 2x The streamID.marshalString() is executed in hot path if the query selects _stream_id field. Command to run the benchmark: go test ./lib/logstorage/ -run=NONE -bench=BenchmarkStreamIDMarshalString -benchtime=5s Results before the commit: BenchmarkStreamIDMarshalString-16 438480714 14.04 ns/op 71.23 MB/s 0 B/op 0 allocs/op Results after the commit: BenchmarkStreamIDMarshalString-16 982459660 6.049 ns/op 165.30 MB/s 0 B/op 0 allocs/op	2024-09-24 18:38:21 +02:00
Aliaksandr Valialkin	d944c162da	lib/logstorage: add benchmark for streamID.marshalString	2024-09-24 18:38:21 +02:00
Aliaksandr Valialkin	472b6b326e	lib/logstorage: make sure that getCommonTokens returns common tokens in the original order of tokens inside tokenSets arg This fixes flaky test TestGetCommonTokensForOrFilters: filter_or_test.go:143: unexpected tokens for field "_msg"; got ["foo" "bar"]; want ["bar" "foo"]	2024-09-19 16:00:21 +02:00
Aliaksandr Valialkin	cad236003b	app/vlselect: consistently reuse the original query timestamp when executing /select/logsql/query with positive limit=N query arg Previously the query could return incorrect results, since the query timestamp was updated with every Query.Clone() call during iterative search for the time range with up to limit=N rows. While at it, optimize queries, which find low number of matching logs, while spend a lot of CPU time for searching across big number of logs. The optimization reduces the upper bound of the time range to search if the current time range contains zero matching rows. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6785	2024-09-08 14:34:46 +02:00
Aliaksandr Valialkin	297301e8c0	lib/logstorage: preserve the order of tokens to check against bloom filters in AND filters Previously tokens from AND filters were extracted in random order. This could slow down checking them against bloom filters if the most specific tokens go at the beginning of the AND filters. Preserve the original order of tokens when matching them against bloom filters, so the user could control the performance of the query by putting the most specific AND filters at the beginning of the query. While at it, add tests for getCommonTokensForAndFilters() and getCommonTokensForOrFilters(). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6554 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6556	2024-09-08 12:28:34 +02:00
Aliaksandr Valialkin	4b49b62a58	lib/logstorage: improve error logging for incorrect queries passed to /select/logsql/stats_query and /select/logsql/stats_query_range functions	2024-09-08 12:28:33 +02:00
Aliaksandr Valialkin	edb1afe804	lib/logstorage: properly extract common tokens from unsupported OR filters Previously the following query could miss rows matching !bar if these rows do not contain foo: foo OR !bar This is because of incorrect detection of common tokens for OR filters - all the unsupported filters were skipped (including the NOT filter (aka `!`)), while in this case zero common tokens must be returned. While at it, move repetitive code in TestFilterAnd and TestFilterOr into f function. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6554 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6556	2024-09-08 12:28:33 +02:00
Aliaksandr Valialkin	c448189f69	app/vlselect: add /select/logsql/stats_query_range endpoint for building time series panels in VictoriaLogs plugin for Grafana Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6943 Updates https://github.com/VictoriaMetrics/victorialogs-datasource/issues/61	2024-09-07 00:44:34 +02:00
Aliaksandr Valialkin	01c8e12370	app/vlselect: add /select/logsql/stats_query endpoint, which is going to be used by vmalert Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6942 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6706	2024-09-06 23:00:58 +02:00
Aliaksandr Valialkin	c5badeea08	lib/logstorage: substitute `|` operator with `or` operator at `math` pipe This is needed for avoiding confusion between the `|` operator at `math` pipe and `|` pipe delimiter. For example, the following query was parsed unexpectedly: `* | math foo / bar | fields x` as `* | math foo / (bar | fields) as x` Substituting `|` with `or` inside `math` pipe fixes this ambiguity.	2024-09-06 22:43:29 +02:00
Aliaksandr Valialkin	08fe7949d1	lib/logstorage: consistently use nsecsPerDay constant and remove nsecPerDay constant	2024-09-06 16:18:15 +02:00
Aliaksandr Valialkin	7dcce1ca02	lib/logstorage: pre-calculate hashes from tokens used in bloom filter search Previously per-token hashes for per-block bloom filters were re-calculated on every scanned block. This could be slow when the number of tokens is big or when the number of blocks to scan is big. Pre-calculate hashes for bloom filters and then use them for searching in bloom filters. This improves performance by 2.5x for in(...) filters with many values to search inside `in()`.	2024-09-05 19:44:42 +02:00
Aliaksandr Valialkin	2630497e2c	lib/logstorage: delete unused function - bloomfilter.containsAny	2024-09-05 16:57:47 +02:00
Aliaksandr Valialkin	5763a957ef	lib/logstorage: properly fix incorrect extraction of common tokens for `OR` filters at distinct log fields Previously (f1:foo OR f2:bar) was incorrectly returning `foo` token for `f1` and `bar` token for `f2`. These tokens were used for checking against bloom filter for every data block, so the data block, which didn't contain simultaneously `foo` token for `f1` field and `bar` token for `f2` field, was skipped. This was incorrect, since such a block may contain logs matching the original OR filter. The fix is to return common tokens from `OR`-delimited filters only if these tokens exist at EVERY such filter for the given field name. If some `OR`-delimited filter misses the given field name, then `OR`-delimited filters do not contain common tokens, which could be used for checking against bloom filter. While at it, add more tests covering various edge cases for filters delimited by AND and OR. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6554 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6556	2024-09-05 16:57:47 +02:00
jackyin	66789a8144	lib/logstorage: `and` filter results in unexpected response (#6556 ) fix #6554 andfilter shouldn't return orfilter field which result in bloomfilter return false. --------- Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `975ed27a76`)	2024-09-03 10:49:25 +02:00
Aliaksandr Valialkin	a1decb5ca1	app/vlinsert/loki: use easyproto instead for parsing Loki protobuf messages	2024-07-10 03:05:55 +02:00
Aliaksandr Valialkin	b8a8d3d6f1	lib/logstorage: drop all the pipes from the query when calculating the number of matching logs at /select/logsql/hits API	2024-07-10 00:39:16 +02:00
Aliaksandr Valialkin	d6415b2572	all: consistently use 'any' instead of 'interface{}' 'any' type is supported starting from Go1.18. Let's consistently use it instead of 'interface{}' type across the code base, since `any` is easier to read than 'interface{}'.	2024-07-10 00:23:26 +02:00
Aliaksandr Valialkin	9edeecabc8	lib: consistently use f-tests instead of table-driven tests This makes easier to read and debug these tests. This also reduces test lines count by 15% from 3K to 2.5K . See https://itnext.io/f-tests-as-a-replacement-for-table-driven-tests-in-go-8814a8b19e9e . While at it, consistently use t.Fatal* instead of t.Error, since t.Error usually leads to more complicated and fragile tests, while it doesn't bring any practical benefits over t.Fatal*.	2024-07-09 22:39:13 +02:00
Aliaksandr Valialkin	6397c38a0a	lib/logstorage: use quicktemplate.AppendJSONString instead of strconv.AppendQuote for encoding JSON strings The strconv.AppendQuote improperly encodes special chars such as \x1b . They must be encoded as \u001b . See https://github.com/VictoriaMetrics/victorialogs-datasource/issues/24	2024-07-05 01:22:49 +02:00
Aliaksandr Valialkin	7426b40250	lib/logstorage: allow writing `after N` in front of `before N` at `stream_context` pipe	2024-07-02 01:39:45 +02:00
Aliaksandr Valialkin	208a624d4d	lib/logstorage: properly search for the surrounding logs in `stream_context` pipe The set of log fields in the found logs may differ from the set of log fields present in the log stream. So compare only the log fields in the found logs when searching for the matching log entry in the log stream. While at it, return _stream field in the delimiter log entry, since this field is used by VictoriaLogs Web UI for grouping logs by log streams.	2024-07-01 02:33:00 +02:00
Aliaksandr Valialkin	76a58ae08d	lib/logstorage: add ability to store sorted log position into a separate field with `sort ... rank <fieldName>` syntax	2024-07-01 01:46:03 +02:00
Aliaksandr Valialkin	d0dca7b8c5	lib/logstorage: add delimiter between log chunks returned from `| stream_context` pipe	2024-07-01 01:46:02 +02:00
Aliaksandr Valialkin	4b3477e62b	lib/logstorage: add `stream_context` pipe, which allows selecting surrounding logs for the matching logs	2024-06-28 19:15:19 +02:00
Aliaksandr Valialkin	2f28819bb1	lib/logstorage: it is safe using `| unroll` pipe in live tailing `| unroll` pipe can make multiple copies of rows from the input row. This doesn't break live tailing, so allow `| unroll` pipe in live tailing.	2024-06-27 19:45:12 +02:00
Aliaksandr Valialkin	dd62a2b9d6	lib/logstorage: work-in-progress	2024-06-27 14:21:03 +02:00
Aliaksandr Valialkin	d5cbda3424	app/vlstorage: add -retention.maxDiskSpaceUsageBytes command-line flag for limiting the retention at VictoriaLogs by disk space usage	2024-06-25 17:30:46 +02:00
Aliaksandr Valialkin	f24123a776	lib/logstorage: parse syslog structured data into separate fields in order to simplify further querying of this data	2024-06-25 14:54:25 +02:00
Aliaksandr Valialkin	1716c4e609	lib/logstorage: properly parse timezone offset at TryParseTimestampRFC3339Nano() The TryParseTimestampRFC3339Nano() must properly parse RFC3339 timestamps with timezone offsets. While at it, make tryParseTimestampISO8601 function private in order to prevent from improper usage of this function from outside the lib/logstorage package. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6508	2024-06-25 14:54:24 +02:00
Aliaksandr Valialkin	2a7fcba330	lib/logstorage: make golangci-lint happy	2024-06-25 03:06:28 +02:00
Aliaksandr Valialkin	7de6f5b4ce	lib/logstorage: work-in-progress	2024-06-25 00:44:57 +02:00
Aliaksandr Valialkin	d5224f3363	lib/logstorage: work-in-progress	2024-06-20 03:10:37 +02:00
Aliaksandr Valialkin	c10a646d19	app/vlinsert/syslog: allow accepting syslog messages with different configs at different ports	2024-06-17 23:16:58 +02:00
Aliaksandr Valialkin	1750991119	lib/logstorage: work-in-progress	2024-06-17 12:13:25 +02:00
Aliaksandr Valialkin	9135b404d9	lib/logstorage: work-in-progress	2024-06-11 17:51:01 +02:00
Aliaksandr Valialkin	37a8cc0b12	lib/logstorage: work-in-progress	2024-06-10 18:42:31 +02:00
Aliaksandr Valialkin	53382ae837	lib/logstorage: work-in-progress	2024-06-06 12:27:11 +02:00
Aliaksandr Valialkin	a200fb433a	lib/logstorage: allow using `eval` keyword instead of `math` keyword in `math` pipe	2024-06-05 10:08:08 +02:00
Aliaksandr Valialkin	b45e466a1b	lib/logstorage: work-in-progress	2024-06-05 03:18:25 +02:00
Aliaksandr Valialkin	1ce8a9a751	lib/logstorage: allow typing `asc` in `sort` pipe for the sake of consistency with `desc`	2024-06-04 02:29:18 +02:00
Aliaksandr Valialkin	b7b3a9e9a3	lib/logstorage: work-in-progress	2024-06-04 01:50:55 +02:00
Aliaksandr Valialkin	540bbb63a2	lib/logstorage: work-in-progress	2024-05-30 16:19:36 +02:00
Aliaksandr Valialkin	e83fd4a117	lib/logstorage: work-in-progress	2024-05-29 01:52:34 +02:00
Aliaksandr Valialkin	79c03fc35f	lib/logstorage: work-in-progress	2024-05-28 19:29:50 +02:00
Aliaksandr Valialkin	ce5e4c842a	lib/logstorage: fix golangci-lint warnings	2024-05-26 02:02:41 +02:00
Aliaksandr Valialkin	afa597ce2a	lib/logstorage: work-in-progress	2024-05-26 01:56:12 +02:00
Aliaksandr Valialkin	6427b3c3c0	lib/logstorage: work-in-progress	2024-05-25 22:59:21 +02:00
Aliaksandr Valialkin	9edbeca46b	lib/logstorage: re-use per-shard fields across processed blocks in pipePackJSON and pipeUnroll	2024-05-25 22:13:44 +02:00
Aliaksandr Valialkin	03fe4c8963	lib/logstorage: work-in-progress	2024-05-25 21:36:24 +02:00
Aliaksandr Valialkin	3152df2bce	lib/logstorage: work-in-progress	2024-05-25 00:31:55 +02:00
Aliaksandr Valialkin	7a2a2f173e	lib/logstorage: work-in-progress	2024-05-24 03:07:07 +02:00
Alexander Marshalov	0b70c4c1f1	[vmlogs] fixed time parsing with millisecond precision time (#6293 ) (#6295 ) fix for #6293 Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-05-22 21:54:50 +02:00
Aliaksandr Valialkin	04d0dd2542	lib/logstorage: work-in-progress	2024-05-22 21:01:28 +02:00
Aliaksandr Valialkin	45fbcc74e0	lib/logstorage: fix golangci-lint warnings	2024-05-20 11:04:37 +02:00
Aliaksandr Valialkin	582e7d5439	lib/logstorage: work-in-progress	2024-05-20 04:09:15 +02:00
Aliaksandr Valialkin	28626db066	lib/logstorage: work-in-progress (cherry picked from commit `0aa19a2837`)	2024-05-16 09:35:55 +02:00
Aliaksandr Valialkin	b1ee7bca1a	lib/logstorage: work-in-progress	2024-05-14 03:06:02 +02:00
Aliaksandr Valialkin	f52275bbd7	lib/logstorage: work-in-progress Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6258	2024-05-14 01:49:58 +02:00
Aliaksandr Valialkin	32193b6059	lib/encoding: optimize UnmarshalVarUint64, UnmarshalVarInt64 and UnmarshalBytes a bit Change the return values for these functions - now they return the unmarshaled result plus the size of the unmarshaled result in bytes, so the caller could re-slice the src for further unmarshaling. This improves performance of these functions in hot loops of VictoriaLogs a bit.	2024-05-14 01:30:25 +02:00
hagen1778	84a896cd6e	lib/logstorage: make linter happy Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `17283fab6c`)	2024-05-13 16:49:37 +02:00
Aliaksandr Valialkin	147704aab0	lib/logstorage: initial implementation of pipes in LogsQL See https://docs.victoriametrics.com/victorialogs/logsql/#pipes	2024-05-12 16:36:01 +02:00
Aliaksandr Valialkin	87338633b1	lib/slicesutil: add helper functions for setting slice length and extending its capacity The added helper functions - SetLength() and ExtendCapacity() - replace error-prone code with simple function calls.	2024-05-12 11:33:49 +02:00
wanshuangcheng	52a4ae0b28	chore: fix function names in comment (#6076 ) Signed-off-by: wanshuangcheng <wanshuangcheng@outlook.com>	2024-04-08 15:38:51 +02:00
Aliaksandr Valialkin	00f59d6ddf	all: fix golangci-lint(revive) warnings after `0c0ed61ce7` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6001	2024-04-03 03:00:45 +03:00
XLONG96	88b9088499	lib/logstorage: avoid panic when parsing regex with stream filter (#5897 )	2024-02-29 15:32:25 +02:00
Aliaksandr Valialkin	ca1e78bd16	lib/logstorage: consistently use atomic.* types instead of atomic.* functions on regular types See `ea9e2b19a5`	2024-02-24 00:29:39 +02:00
Aliaksandr Valialkin	d0538d11d3	lib/mergeset: consistently use atomic.* types instead of atomic.* function calls on ordinary types See `ea9e2b19a5`	2024-02-24 00:29:12 +02:00
Aliaksandr Valialkin	92e098012a	lib/logstorage: consistently use atomic.* type for refCount and mustDrop fields in datadb and storage structs in the same way as it is used in lib/storage See `ea9e2b19a5` and `a204fd69f1`	2024-02-24 00:28:56 +02:00
Aliaksandr Valialkin	b58c429044	app/vlselect: follow-up for `451d2abf50` - Consistently return the first `limit` log entries if the total size of found log entries doesn't exceed 1Mb. See app/vlselect/logsql/sort_writer.go . Previously random log entries could be returned with each request. - Document the change at docs/VictoriaLogs/CHANGELOG.md - Document the `limit` query arg at docs/VictoriaLogs/querying/README.md - Make the change less intrusive. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5674 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5778	2024-02-18 23:06:08 +02:00
Dmytro Kozlov	2d674f98d4	Enable the `limit` query param for the `/select/logsql/query` (#5778 ) * app/vlselect: add limit for logs query * app/vlselect: CHANGELOG.md * app/vlselect: stop search process if limit is reached, update logic, remove default limit * app/vlselect: fix tests * app/vlselect: fix filter tests * app/vlselect: fix tests	2024-02-18 22:59:16 +02:00
noodles2hg	60a8e59366	lib/logstorage: proper exit during block search (#5400 )	2024-02-01 14:11:20 +02:00
Jiajing LU	9c75e3ee15	count inmemoryParts that have not been taken for merge (#5447 )	2024-02-01 14:07:13 +02:00
Aliaksandr Valialkin	230ef43a32	lib/logstorage: make sure that WaitGroup.Add isn't called after stopCh is closed and WaitGroup.Wait is called This protects from rare panic, which may occur during graceful shutdown of VictoriaLogs	2024-01-26 21:18:07 +01:00
Aliaksandr Valialkin	d52fd73f18	all: add up to 10% random jitter to the interval between periodic tasks performed by various components This should smooth CPU and RAM usage spikes related to these periodic tasks, by reducing the probability that multiple concurrent periodic tasks are performed at the same time.	2024-01-22 18:39:16 +02:00
Aliaksandr Valialkin	9760221214	lib/logstorage: always check the previous indexBlockHeader for blocks with matching tenantID and/or streamID The previous indexBlockHeader may contain blocks for the matching tenantID and/or streamID, so it must be scanned unconditionally during the search. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5295 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4856 This is a follow-up for `89dcbc2fe7`	2023-11-14 01:02:02 +01:00
XLONG96	77033dbfb6	lib/logstorage: fix streamID and tenantID search (#4856 ) (#5295 )	2023-11-14 01:02:02 +01:00
Aliaksandr Valialkin	36a1fdca6c	all: consistently use %w instead of %s when error is passed to fmt.Errorf() This allows consistently using errors.Is() for verifying whether the given error wraps some other known error.	2023-10-26 09:44:40 +02:00
Zakhar Bessarab	85b604d414	lib/logstorage: fix free space check (#5113 ) Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-10-03 17:50:16 +02:00
Aliaksandr Valialkin	7bb5f75a2a	lib/logstorage: follow-up for `94627113db` - Move uniqueFields from rows to blockStreamMerger struct. This allows localizing all the references to uniqueFields inside blockStreamMerger.mustWriteBlock(), which should improve readability and maintainability of the code. - Remove logging of the event when blocks cannot be merged because they contain more than maxColumnsPerBlock, since the provided logging didn't provide the solution for the issue with too many columns. I couldn't figure out the proper solution, which could be helpful for end user, so decided to remove the logging until we find the solution. This commit also contains the following additional changes: - It truncates field names longer than 128 chars during logs ingestion. This should prevent from ingesting bogus field names. This also should prevent from too big columnsHeader blocks, which could negatively affect search query performance, since columnsHeader is read on every scan of the corresponding data block. - It limits the maximum length of const column value to 256. Longer values are stored in an ordinary columns. This helps limiting the size of columnsHeader blocks and improving search query performance by avoiding reading too long const columns on every scan of the corresponding data block. - It deduplicates columns with identical names during data ingestion and background merging. Previously it was possible to pass columns with duplicate names to block.mustInitFromRows(), and they were stored as is in the block. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4762 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4969	2023-10-02 21:06:49 +02:00
Aliaksandr Valialkin	120f3bc467	lib/logstorage: follow-up for `8a23d08c21` - Compare the actual free disk space to the value provided via -storage.minFreeDiskSpaceBytes directly inside the Storage.IsReadOnly(). This should work fast in most cases. This simplifies the logic at lib/storage. - Do not take into account -storage.minFreeDiskSpaceBytes during background merges, since it results in uncontrolled growth of small parts when the free disk space approaches -storage.minFreeDiskSpaceBytes. The background merge logic uses another mechanism for determining whether there is enough disk space for the merge - it reserves the needed disk space before the merge and releases it after the merge. This prevents from out of disk space errors during background merge. - Properly handle corner cases for flushing in-memory data to disk when the storage enters read-only mode. This is better than losing the in-memory data. - Return back Storage.MustAddRows() instead of Storage.AddRows(), since the only case when AddRows() can return error is when the storage is in read-only mode. This case must be handled by the caller by calling Storage.IsReadOnly() before adding rows to the storage. This simplifies the code a bit, since the caller of Storage.MustAddRows() shouldn't handle errors returned by Storage.AddRows(). - Properly store parsed logs to Storage if parts of the request contain invalid log lines. Previously the parsed logs could be lost in this case. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4737 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4945	2023-10-02 20:38:00 +02:00
Aliaksandr Valialkin	cbbdf9cdf5	lib/logstorage: run up to GOMAXPROCS flushers of old in-memory parts to disk One flusher isn't enough under high data ingestion rate. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4775	2023-10-02 20:36:53 +02:00
Aliaksandr Valialkin	78e9cda4b1	lib/logstorage: assist merging in-memory parts at data ingestion path if their number starts exceeding maxInmemoryPartsPerPartition This is a follow-up for `9310e9f584` , which removed data ingestion pacing. This can result in uncontrolled growth of in-memory parts under high data ingestion rate, which, in turn, can result in unbounded RAM usage, OOM crashes and slow query performance. While at it, consistently reset isInMerge field for parts passed to mergeParts() before returning from this function. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4775 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4828	2023-10-02 20:35:20 +02:00
Zakhar Bessarab	876bce5a57	lib/logstorage: prevent from panic during background merge (#4969 ) * lib/logstorage: prevent from panic during background merge Fixes panic during background merge when resulting block would contain more columns than maxColumnsPerBlock. Buffered data will be flushed and replaced by the next block. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4762 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/logstorage: clarify field description and comment Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-10-02 19:29:31 +02:00
Zakhar Bessarab	dfdada055c	lib/logstorage: switch to read-only mode when running out of disk space (#4945 ) * lib/logstorage: switch to read-only mode when running out of disk space Added support of `--storage.minFreeDiskSpaceBytes` command-line flag to allow graceful handling of running out of disk space at `--storageDataPath`. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4737 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/logstorage: fix error handling logic during merge Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/logstorage: fix log level Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2023-10-02 17:09:57 +02:00
Zakhar Bessarab	53268ebc66	lib/logstorage/datadb: remove parts merge cond (#4828 ) It was added in order to limit number of goroutines performing assisted merges during ingestion. It turned out that blocking ingestion goroutines lowers ingestion performance and limits overall ingestion to around 40k items per second because of lock contention. Removing parts merge sync.Cond allows to remove lock contention at write path and significantly improves write performance. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4775 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-10-02 17:09:12 +02:00
Zakhar Bessarab	47d9e82b52	lib/storage/partition: add check to ensure parts exist on disk (#5017 ) * lib/storage/partition: add check to ensure parts exist on disk If part exists in parts.json but is missing on disk there will be a misleading error similar to "unexpected number of substrings in the part name". This change forces verification of part existence and throws a correct error in case it is missing on disk. Such issue can be result of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5005 or disk corruption. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/partition: use filepath.Join instead of string concatenation Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * lib/storage/partition: add action points for error message Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * all: add a check for missing part in lib/mergeset and lib/logstorage --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-09-19 11:18:21 +02:00
Aliaksandr Valialkin	d8afd7fe98	Makefile: update golangci-lint from v1.51.2 to v1.54.2 See https://github.com/golangci/golangci-lint/releases/tag/v1.54.2	2023-09-01 10:25:49 +02:00
crossoverJie	64b27c9217	lib/logstorage: Set ptwHot to nil when the partition pointed by ptwHot is dropped (#4902 )	2023-08-29 11:22:53 +02:00
Aliaksandr Valialkin	1e1aa94ffb	lib/logstorage: eliminate data race when clearing s.ptwHot after deleting the corresponding partition The previous code could result in the following data race: 1. The s.ptwHot partition is marked to be deleted 2. ptw.decRef() is called on it 3. ptw.pt is set to nil 4. s.ptwHot.pt is accessed from concurrent goroutine, which leads to panic. The change clears s.ptwHot under s.partitionsLock in order to prevent from the data race. This is a follow-up for `8d50032dd6` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4895	2023-08-29 11:12:07 +02:00
crossoverJie	db0ae3fffb	lib/logstorage: add nil check for ptwHot.pt (#4896 ) (cherry picked from commit `cde5029bce`)	2023-08-27 09:14:28 +02:00
Aliaksandr Valialkin	8b4bf5d269	app/vlstorage: expose vl_data_size_bytes metric at /metrics page for tracking the on-disk data size (both indexdb and the data itself)	2023-07-31 07:56:16 -07:00
Aliaksandr Valialkin	30098ac8bd	app/vlinsert/loki: follow-up after `09df5b66fd` - Parse protobuf if Content-Type isn't set to `application/json` - this behavior is documented at https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki - Properly handle gzip'ped JSON requests. The `gzip` header must be read from `Content-Encoding` instead of `Content-Type` header - Properly flush all the parsed logs with the explicit call to vlstorage.MustAddRows() at the end of query handler - Check JSON field types more strictly. - Allow parsing Loki timestamp as floating-point number. Such a timestamp can be generated by some clients, which store timestamps in float64 instead of int64. - Optimize parsing of Loki labels in Prometheus text exposition format. - Simplify tests. - Remove lib/slicesutil, since there are no more users for it. - Update docs with missing info and fix various typos. For example, it should be enough to have `instance` and `job` labels as stream fields in most Loki setups. - Allow empty or missing timestamps in the ingested logs. The current timestamp at VictoriaLogs side is then used for the ingested logs. This simplifies debugging and testing of the provided HTTP-based data ingestion APIs. The remaining MAJOR issue, which needs to be addressed: victoria-logs binary size increased from 13MB to 22MB after adding support for Loki data ingestion protocol at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4482 . This is because of shitty protobuf dependencies. They must be replaced with another protobuf implementation similar to the one used at lib/prompb or lib/prompbmarshal .	2023-07-20 21:52:11 -07:00
Zakhar Bessarab	5b3cbd4db1	app/vlinsert: add support of loki push protocol (#4482 ) * app/vlinsert: add support of loki push protocol - implemented loki push protocol for both Protobuf and JSON formats - added examples in documentation - added example docker-compose Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vlinsert: move protobuf metric into its own file Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * deployment/docker/victorialogs/promtail: update reference to docker image Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * deployment/docker/victorialogs/promtail: make volume name unique Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vlinsert/loki: add license reference Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * deployment/docker/victorialogs/promtail: fix volume name Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs/VictoriaLogs/data-ingestion: add stream fields for loki JSON ingestion example Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vlinsert/loki: move entities to places where those are used Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vlinsert/loki: refactor to use common components - use CommonParameters from insertutils - stop ingestion after first error similar to elasticsearch and jsonline Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vlinsert/loki: address review feedback - add missing logstorage.PutLogRows calls - refactor tenant ID parsing to use common function - reduce number of allocations for parsing by reusing logfields slices - add tests and benchmarks for requests processing funcs Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2023-07-20 16:49:43 -07:00
Yury Molodov	3ad80e281f	vmui: add Active Queries page (#4653 ) * feat: add page to display a list of active queries (#4598) * app/vmagent: code formatting * fix: remove console --------- Co-authored-by: dmitryk-dk <kozlovdmitriyy@gmail.com>	2023-07-19 16:02:58 -07:00
Aliaksandr Valialkin	5819d4e6f7	lib/logstorage: properly encode `"offset"` search word just after _time filter	2023-07-18 16:03:57 -07:00
Aliaksandr Valialkin	da2ef397fa	lib/logstorage: add ability to specify offset for the selected _time filter The following syntax is supported: _time:filter offset off For example: - _time:5m offset 1h - 5-minute duration one hour before the current time - _time:2023 offset 2w - 2023 year with the 2 weeks offset in the past	2023-07-17 19:07:14 -07:00
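
The common-token rule described in commits 5763a957ef and 472b6b326e (return a token only if it is present in every `OR`-delimited filter for the field, and keep the original token order) can be illustrated with a minimal sketch. The `commonTokens` function below is a hypothetical helper written for this illustration; it is not the actual `getCommonTokens` from lib/logstorage, and it assumes each filter has already been reduced to a slice of tokens for one field.

```go
package main

import "fmt"

// commonTokens is a hypothetical helper (not the lib/logstorage code).
// It returns the tokens present in every token set, in the order in which
// they appear in the first set.
func commonTokens(tokenSets [][]string) []string {
	if len(tokenSets) == 0 {
		return nil
	}
	// Count in how many sets each token occurs (at most once per set).
	counts := make(map[string]int)
	for _, tokens := range tokenSets {
		seen := make(map[string]bool, len(tokens))
		for _, token := range tokens {
			if !seen[token] {
				seen[token] = true
				counts[token]++
			}
		}
	}
	// Walk the first set so the result keeps the original token order.
	var common []string
	emitted := make(map[string]bool)
	for _, token := range tokenSets[0] {
		if counts[token] == len(tokenSets) && !emitted[token] {
			emitted[token] = true
			common = append(common, token)
		}
	}
	return common
}

func main() {
	// Both filters contain "bar" and "foo"; the order of the first set is kept.
	fmt.Println(commonTokens([][]string{{"bar", "foo"}, {"foo", "bar", "baz"}}))
	// One filter has no tokens for the field, so there are no common tokens.
	fmt.Println(commonTokens([][]string{{"foo"}, {}}))
}
```

An empty token set stands in for an `OR`-delimited filter that does not mention the field at all; in that case the sketch returns no common tokens, so no data block would be skipped on the basis of a bloom filter check.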


161 commits