Commit graph

10 commits

Author SHA1 Message Date
Aliaksandr Valialkin
fea934936b
lib/logstorage: properly propagate extra filters to all the subqueries
The purpose of extra filters ( https://docs.victoriametrics.com/victorialogs/querying/#extra-filters )
is to limit the subset of logs, which can be queried. For example, it is expected that all the queries
with `extra_filters={tenant=123}` can access only logs, which contain `123` value for the `tenant` field.

Previously this wasn't the case, since the provided extra filters weren't applied to subqueries.
For example, the following query could be used to select all the logs outside `tenant=123`, for any `extra_filters` arg:

    * | union({tenant!=123})

This commit fixes this by propagating extra filters to all the subqueries.

While at it, this commit also properly propagates [start, end] time range filter from HTTP querying APIs
into all the subqueries, since this is what most users expect. This behaviour can be overriden on per-subquery
basis with the `options(ignore_global_time_filter=true)` option - see https://docs.victoriametrics.com/victorialogs/logsql/#query-options

Also properly apply apply optimizations across all the subqueries. Previously the optimizations at Query.optimize()
function were applied only to the top-level query.
2025-01-26 22:05:05 +01:00
Aliaksandr Valialkin
a326a4747e
lib/logstorage: reduce memory allocations when splitting in(...) values into tokens and calculating hashes for these tokens
While at it, reduce memory allocations at Storage.getFieldValuesNoHits and make it more scalable on multi-CPU systems.

This improves performance of in(<query>) filter when the <query> returns big number of values.
2024-12-23 19:45:03 +01:00
Aliaksandr Valialkin
6b0da64b30
lib/logstorage: reduce memory allocations at stats and top pipes
Use chunked allocator in order to reduce memory allocations. It allocates objects from slices of up to 64Kb size.
This improves performance for `stats` and `top` pipes by up to 2x when they are applied to big number of `by (...)` groups.

Also parallelize execution of `count_uniq`, `count_uniq_hash` and `uniq_values` stats functions,
so they are executed faster on hosts with many CPU cores when applied to fields with big number
of unique values.
2024-12-23 19:45:02 +01:00
Aliaksandr Valialkin
e71a8e3a6c
lib/logstorage: add facets pipe for returning the most frequent values across all the log fields seen in the selected logs
(cherry picked from commit dbec34bafc)
2024-12-09 12:23:27 +01:00
Aliaksandr Valialkin
a4ea3b87d7
lib/logstorage: optimize query imeediately after its parsing
This eliminates possible bugs related to forgotten Query.Optimize() calls.

This also allows removing optimize() function from pipe interface.

While at it, drop filterNoop inside filterAnd.

(cherry picked from commit 66b2987f49)
2024-11-08 17:07:56 +01:00
Aliaksandr Valialkin
1ea65d662f
lib/logstorage: properly reset cached output fields for extract and extract_regexp pipes after the log entry matches if(...) condition
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7162

(cherry picked from commit c5d08d317c)
2024-10-31 14:11:08 +01:00
Aliaksandr Valialkin
246c339e3d
lib/logstorage: read timestamps column when it is really needed during query execution
Previously timestamps column was read unconditionally on every query.
This could significantly slow down queries, which do not need reading this column
like in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7070 .
2024-09-25 19:18:37 +02:00
Aliaksandr Valialkin
dd62a2b9d6
lib/logstorage: work-in-progress 2024-06-27 14:21:03 +02:00
Aliaksandr Valialkin
540bbb63a2
lib/logstorage: work-in-progress 2024-05-30 16:19:36 +02:00
Aliaksandr Valialkin
79c03fc35f
lib/logstorage: work-in-progress 2024-05-28 19:29:50 +02:00