Commit graph

25 commits

Author SHA1 Message Date
Aliaksandr Valialkin
2023f017b1
lib/logstorage: optimize performance for queries, which select all the log fields for logs containing hundreds of log fields (aka "wide events")
Unpack the full columnsHeader block instead of unpacking meta-information per each individual column
when the query, which selects all the columns, is executed. This improves performance when scanning
logs with big number of fields.
2024-10-18 02:22:42 +02:00
Aliaksandr Valialkin
507b206a7d
lib/logstorage: move getConstColumnValue() and getColumnHeader() methods from columnsHeader to blockSearch
This localizes blockSearch.getColumnsHeader() call at block_search.go .
This call is going to be optimized in the next commits in order to avoid
unmarshaling of header data for unneeded columns, which weren't requested
by getConstColumnValue() / getColumnHeader().
2024-10-13 14:29:02 +02:00
Aliaksandr Valialkin
867f671cc4
lib/logstorage: make sure that bs.br is non-nil before checking br.bs.bsw.bh.rowsCount there
br.bs may be nil when br contains the block with additional filters applied during pipe calculations.
For example, `* | count() if (error) errors`.
2024-10-12 20:51:29 +02:00
Aliaksandr Valialkin
a350be48b6
lib/logstorage: do not count dictionary values which have no matching logs in count_uniq stats function
Create blockResultColumn.forEachDictValue* helper functions for visiting matching
dictionary values. These helper functions should prevent from counting dictionary values
without matching logs in the future.

This is a follow-up for 0c0f013a60
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7152
2024-10-01 13:34:45 +02:00
Aliaksandr Valialkin
b82bd0c2ec
lib/logstorage: improve performance for stream_context pipe over streams with big number of log entries
Do not read timestamps for blocks, which cannot contain surrounding logs.
This should improve peformance for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6730 .

Also optimize min(_time) and max(_time) calculations a bit by avoiding conversion
of timestamp to string when it isn't needed.
This should improve performance for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7070 .
2024-09-26 22:22:23 +02:00
Aliaksandr Valialkin
65b93b17b1
lib/logstorage: lazily read column headers metadata during queries
This improves performance for analytical queries, which do not need column headers metadata.
For example, the following query doesn't need column headers metadata, since _stream and min(_time)
are stored in block header, which is read separately from colum headers metadata:

  _time:1w | stats by (_stream) min(_time) min_time

This commit significantly improves the performance for this query.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7070
2024-09-25 19:17:48 +02:00
Aliaksandr Valialkin
4599429f51
lib/logstorage: read timestamps column when it is really needed during query execution
Previously timestamps column was read unconditionally on every query.
This could significantly slow down queries, which do not need reading this column
like in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7070 .
2024-09-25 19:17:47 +02:00
Aliaksandr Valialkin
7f1ba18719
lib/logstorage: improve the performance of obtaining _stream column value
Substitute global streamTagsCache with per-blockSearch cache for ((stream.id) -> (_stream value)) entries.
This improves scalability of obtaining _stream values on a machine with many CPU cores, since every CPU
has its own blockSearch instance.

This also should reduce memory usage when querying logs over big number of streams, since per-blockSearch
cache of ((stream.id) -> (_stream value)) entries is limited in size, and its lifetime is bounded by a single query.
2024-09-24 20:57:00 +02:00
Aliaksandr Valialkin
9e1c037249
lib/logstorage: properly parse timezone offset at TryParseTimestampRFC3339Nano()
The TryParseTimestampRFC3339Nano() must properly parse RFC3339 timestamps with timezone offsets.

While at it, make tryParseTimestampISO8601 function private in order to prevent
from improper usage of this function from outside the lib/logstorage package.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6508
2024-06-25 14:53:38 +02:00
Aliaksandr Valialkin
7229dd8c33
lib/logstorage: work-in-progress 2024-06-20 03:10:08 +02:00
Aliaksandr Valialkin
2b6a634ec0
lib/logstorage: work-in-progress 2024-06-17 12:13:18 +02:00
Aliaksandr Valialkin
8f5dc966f6
lib/logstorage: work-in-progress 2024-06-11 17:50:32 +02:00
Aliaksandr Valialkin
43cf221681
lib/logstorage: work-in-progress 2024-06-05 03:18:12 +02:00
Aliaksandr Valialkin
539fce9227
lib/logstorage: work-in-progress 2024-06-04 01:49:02 +02:00
Aliaksandr Valialkin
1de187bcb7
lib/logstorage: work-in-progress 2024-05-29 01:52:13 +02:00
Aliaksandr Valialkin
dc55146752
lib/logstorage: work-in-progress 2024-05-25 21:36:16 +02:00
Aliaksandr Valialkin
e2590f0485
lib/logstorage: work-in-progress 2024-05-25 00:30:58 +02:00
Aliaksandr Valialkin
4b458370c1
lib/logstorage: work-in-progress 2024-05-24 03:06:55 +02:00
Aliaksandr Valialkin
22107421eb
lib/logstorage: work-in-progress 2024-05-22 21:01:20 +02:00
Aliaksandr Valialkin
ad505a7a9a
lib/logstorage: work-in-progress 2024-05-20 04:08:30 +02:00
Aliaksandr Valialkin
0aa19a2837
lib/logstorage: work-in-progress 2024-05-15 04:55:44 +02:00
Aliaksandr Valialkin
da3af090c6
lib/logstorage: work-in-progress 2024-05-14 03:05:03 +02:00
Aliaksandr Valialkin
cb35e62e04
lib/logstorage: work-in-progress
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6258
2024-05-14 01:49:23 +02:00
hagen1778
17283fab6c
lib/logstorage: make linter happy
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-05-13 15:35:11 +02:00
Aliaksandr Valialkin
9dbd0f9085
lib/logstorage: initial implementation of pipes in LogsQL
See https://docs.victoriametrics.com/victorialogs/logsql/#pipes
2024-05-12 16:33:31 +02:00