Commit graph

155 commits

Author SHA1 Message Date
Hui Wang
68bad22fd2
vmalert: integrate with victorialogs (#7255)
address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6706.
See
https://github.com/VictoriaMetrics/VictoriaMetrics/blob/vmalert-support-vlog-ds/docs/VictoriaLogs/vmalert.md.

Related fix
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7254.

Note: in this pull request, vmalert doesn't support
[backfilling](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/vmalert-support-vlog-ds/docs/VictoriaLogs/vmalert.md#rules-backfilling)
for rules with a customized time filter. It might be added in the
future, see [this
issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7289)
for details.

Feature can be tested with image
`victoriametrics/vmalert:heads-vmalert-support-vlog-ds-0-g420629c-scratch`.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-10-29 16:30:39 +01:00
Zakhar Bessarab
837d0d136d
lib/mergeset: add sparse indexdb cache (#7269)
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7182

- add a separate index cache for searches which might read through large
amounts of random entries. The primary use case is retention and
downsampling filters: when applying filters, background merge needs to
fetch a large number of random entries, which pollutes the index cache.
Using a separate cache reduces the effect on memory usage and cache
efficiency of the main cache while still keeping a high cache hit rate. The
separate cache size is 5% of allowed memory.

- reduce size of indexdb/dataBlocks cache in order to free memory for
new sparse cache. Reduced size by 5% and moved this to a separate cache.

- add a separate metricName search which does not cache metric names -
this is needed in order to allow disabling metric name caching when
applying downsampling/retention filters. Applying filters during
background merge accesses random entries, this fills up cache and does
not provide an actual improvement due to random access nature.


Merge performance and memory usage stats before and after the change:

- before

![image](https://github.com/user-attachments/assets/485fffbb-c225-47ae-b5c5-bc8a7c57b36e)


- after

![image](https://github.com/user-attachments/assets/f4ba3440-7c1c-4ec1-bc54-4d2ab431eef5)

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2024-10-24 15:21:17 +02:00
Aliaksandr Valialkin
0f24078146
lib/logstorage: use simpler in-memory cache instead of workingsetcache for caching recently ingested _stream values and recently queried set of streams
These caches aren't expected to grow big, so it is OK to use the simplest cache based on sync.Map.
The benefit of this cache compared to workingsetcache is better scalability on systems with many CPU cores,
since it doesn't use mutexes on the fast path.
An additional benefit is lower memory usage on average, since the size of in-memory cache equals
working set for the last 3 minutes.

The downside is that there is no upper bound for the cache size, so it may grow big during workload spikes.
But this is very unlikely for typical workloads.
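
A minimal sketch of how such a sync.Map-based cache could bound itself to a recent working set, assuming a two-generation design with periodic rotation. The type and method names here (`cache`, `rotate`) are hypothetical and are not taken from the commit; the rotation period that would yield the 3-minute working set is left to the caller.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// cache is a hypothetical two-generation cache built on sync.Map.
// Lookups and stores do not take an explicit mutex.
type cache struct {
	curr atomic.Pointer[sync.Map]
	prev atomic.Pointer[sync.Map]
}

func newCache() *cache {
	var c cache
	c.curr.Store(&sync.Map{})
	c.prev.Store(&sync.Map{})
	return &c
}

// Get looks up k in the current generation, then in the previous one.
// An entry found only in the previous generation is promoted,
// so entries that are still in use survive the next rotation.
func (c *cache) Get(k string) (any, bool) {
	if v, ok := c.curr.Load().Load(k); ok {
		return v, true
	}
	v, ok := c.prev.Load().Load(k)
	if ok {
		c.curr.Load().Store(k, v)
	}
	return v, ok
}

// Set stores the entry in the current generation.
func (c *cache) Set(k string, v any) {
	c.curr.Load().Store(k, v)
}

// rotate drops the previous generation and starts an empty current one.
// Entries that were not touched between two rotations are evicted.
func (c *cache) rotate() {
	c.prev.Store(c.curr.Load())
	c.curr.Store(&sync.Map{})
}

func main() {
	c := newCache()
	c.Set("stream-a", 1)
	c.rotate()
	_, ok := c.Get("stream-a")
	fmt.Println(ok) // true: found in the previous generation and promoted
	c.rotate()
	c.rotate()
	_, ok = c.Get("stream-a")
	fmt.Println(ok) // false: not touched across two rotations
}
```

Note that nothing in this sketch limits the number of entries within one generation, which matches the downside described above.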
2024-10-18 02:22:43 +02:00
Aliaksandr Valialkin
8aa144fa74
lib/logstorage: do not persist streamIDCache, since it may go out of sync with partition directories, which can be changed manually between VictoriaLogs restarts
Partition directories can be manually deleted and copied from other sources such as backups or other VictoriaLogs instances.
In this case the persisted cache becomes out of sync with partitions. This can result in missing index entries
during data ingestion or in incorrect results during querying. So it is better not to persist caches.
This shouldn't hurt VictoriaLogs performance just after the restart too much, since its caches usually contain
small amounts of data, which can be quickly re-populated from the persisted data.
2024-10-18 02:22:43 +02:00
Aliaksandr Valialkin
1892e357c3
lib/logstorage: consistently use "pHits := m[..]" pattern
Consistency improves maintainability of the code a bit.
2024-10-18 02:22:43 +02:00
Aliaksandr Valialkin
2023f017b1
lib/logstorage: optimize performance for queries, which select all the log fields for logs containing hundreds of log fields (aka "wide events")
Unpack the full columnsHeader block instead of unpacking meta-information per each individual column
when the query, which selects all the columns, is executed. This improves performance when scanning
logs with big number of fields.
2024-10-18 02:22:42 +02:00
Aliaksandr Valialkin
78c6fb0883
lib/logstorage: improve performance of top and field_values pipes on systems with many CPU cores
- Parallelize merging of per-CPU results.
- Parallelize writing the results to the next pipe.
2024-10-18 02:22:42 +02:00
Aliaksandr Valialkin
c4b2fdff70
lib/logstorage: optimize 'stats by(...)' calculations for by(...) fields with millions of unique values on multi-CPU systems
- Parallelize merging of per-CPU `stats by(...)` result shards.
- Parallelize writing `stats by(...)` results to the next pipe.
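
One way to picture a parallel merge of per-CPU shards is to split the key space into disjoint buckets by hash and let one goroutine build each bucket. The sketch below is an illustration under that assumption only; `bucketOf` and `mergeParallel` are hypothetical names, and a production implementation would likely pre-partition the shards instead of re-hashing every key in every worker.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// bucketOf maps a key to one of n buckets.
func bucketOf(key string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(n))
}

// mergeParallel merges per-CPU counters into n disjoint result maps.
// Every worker owns exactly one bucket, so workers never write to the same map.
// The input shards are only read, which is safe to do concurrently.
func mergeParallel(shards []map[string]uint64, n int) []map[string]uint64 {
	results := make([]map[string]uint64, n)
	var wg sync.WaitGroup
	for b := 0; b < n; b++ {
		wg.Add(1)
		go func(b int) {
			defer wg.Done()
			m := make(map[string]uint64)
			for _, shard := range shards {
				for k, v := range shard {
					if bucketOf(k, n) == b {
						m[k] += v
					}
				}
			}
			results[b] = m
		}(b)
	}
	wg.Wait()
	return results
}

func main() {
	shards := []map[string]uint64{
		{"a": 1, "b": 2},
		{"a": 3, "c": 4},
	}
	for _, m := range mergeParallel(shards, 4) {
		for k, v := range m {
			fmt.Println(k, v)
		}
	}
}
```

Since the result maps are disjoint, they can also be written to the next stage independently of each other.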
2024-10-18 02:22:41 +02:00
Aliaksandr Valialkin
192c07f76a
lib/logstorage: optimize performance for top pipe when it is applied to a field with millions of unique values
- Use parallel merge of per-CPU shard results. This improves merge performance on multi-CPU systems.
- Use topN heap sort of per-shard results. This improves performance when results contain millions of entries.
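
The top-N selection can be sketched with a bounded min-heap: only N entries are kept in memory, and an entry is replaced only when a bigger one arrives. This is a generic illustration built on the standard container/heap package, not the code from the commit; `entry` and `topN` are hypothetical names.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

type entry struct {
	value string
	hits  uint64
}

// minHeap keeps the entry with the fewest hits at index 0.
type minHeap []entry

func (h minHeap) Len() int           { return len(h) }
func (h minHeap) Less(i, j int) bool { return h[i].hits < h[j].hits }
func (h minHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x any)        { *h = append(*h, x.(entry)) }
func (h *minHeap) Pop() any {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// topN returns up to n entries with the most hits, ordered by descending hits.
// It needs O(n) extra memory regardless of the number of unique values.
func topN(hits map[string]uint64, n int) []entry {
	if n <= 0 {
		return nil
	}
	h := make(minHeap, 0, n)
	for v, c := range hits {
		e := entry{value: v, hits: c}
		if len(h) < n {
			heap.Push(&h, e)
		} else if e.hits > h[0].hits {
			// Replace the smallest kept entry and restore the heap order.
			h[0] = e
			heap.Fix(&h, 0)
		}
	}
	sort.Slice(h, func(i, j int) bool { return h[i].hits > h[j].hits })
	return h
}

func main() {
	hits := map[string]uint64{"a": 5, "b": 1, "c": 9, "d": 3}
	for _, e := range topN(hits, 2) {
		fmt.Println(e.value, e.hits)
	}
}
```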
2024-10-18 02:21:56 +02:00
Aliaksandr Valialkin
508e498ae3
lib/logstorage: follow-up for 72941eac36
- Allow dropping metrics if the query result contains at least a single metric.
- Allow copying by(...) fields.
- Disallow overriding by(...) fields via `math` pipe.
- Allow using `format` pipe in stats query. This is useful for constructing some labels from the existing by(...) fields.
- Add more tests.
- Remove the check for time range in the query filter according to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7254/files#r1803405826

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7254
2024-10-16 19:43:52 +02:00
Hui Wang
72941eac36
victorialogs: add more checks for stats query APIs (#7254)
1. Verify if field in [fields
pipe](https://docs.victoriametrics.com/victorialogs/logsql/#fields-pipe)
exists. If not, it generates a metric with illegal float value "" for
prometheus metrics protocol.
2. Check if multiple time range filters produce a conflicting query time
range, for instance:
```
query: _time: 5m | stats count(), 
start:2024-10-08T10:00:00.806Z, 
end: 2024-10-08T12:00:00.806Z, 
time: 2024-10-10T10:02:59.806Z
```
must give no result due to invalid final time range.

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-10-16 19:25:43 +02:00
Aliaksandr Valialkin
202eb429a7
lib/logstorage: refactor storage format to be more efficient for querying wide events
It turned out that VictoriaLogs is frequently used for collecting logs with tens of fields.
For example, a standard Kubernetes setup on top of Filebeat generates more than 20 fields per log entry.
Such logs are also known as "wide events".

The previous storage format was optimized for logs with a few fields. When at least a single field
was referenced in the query, all the meta-information about all the log fields was unpacked
and parsed per each scanned block during the query. This could require a lot of additional disk IO
and CPU time when logs contain many fields. Resolve this issue by providing a (field -> metainfo_offset)
index per each field in every data block. This index allows reading and extracting only the needed
metainfo for fields used in the query. This index is stored in columnsHeaderIndexFilename ( columns_header_index.bin ).
This allows increasing performance for queries over wide events by 10x and more.

Another issue was that the data for bloom filters and field values across all the log fields except _msg
was intermixed in two files - fieldBloomFilename ( field_bloom.bin ) and fieldValuesFilename ( field_values.bin ).
This could result in huge disk read IO overhead when some small field was referenced in the query,
since the Operating System usually reads more data than requested. It reads the data from disk
in at least 4KiB blocks (usually the block size is much bigger in the range 64KiB - 512KiB).
So, if 512-byte bloom filter or values' block is read from the file, then the Operating System
reads up to 512KiB of data from disk, which results in 1000x disk read IO overhead. This overhead isn't visible
for recently accessed data, since this data is usually stored in RAM (aka Operating System page cache),
but this overhead may become very annoying when performing the query over large volumes of data
which isn't present in OS page cache.

The solution for this issue is to split bloom filters and field values across multiple shards.
This reduces the worst-case disk read IO overhead by at least Nx where N is the number of shards,
while the disk read IO overhead is completely removed in best case when the number of columns doesn't exceed N.
Currently the number of shards is 8 - see bloomValuesShardsCount . This solution increases
performance for queries over large volumes of newly ingested data by up to 1000x.
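
The numbers above can be reproduced with a short sketch. The shard count of 8 comes from this message; the way a column is assigned to a shard is an assumption here (a hash of the column name modulo the shard count), and `shardIdx` and `readAmplification` are hypothetical helpers rather than functions from lib/logstorage.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardsCount mirrors the value of 8 mentioned in the commit message.
const shardsCount = 8

// shardIdx assigns a column to a shard by hashing its name.
// This assignment rule is an assumption made for illustration.
func shardIdx(column string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(column))
	return h.Sum64() % shardsCount
}

// readAmplification returns how many times more bytes are read from disk
// than requested when the OS reads osReadSize bytes for a request of requested bytes.
func readAmplification(requested, osReadSize int) int {
	return osReadSize / requested
}

func main() {
	// A 512-byte block read with a 512KiB OS read size.
	fmt.Println(readAmplification(512, 512*1024)) // 1024, i.e. roughly 1000x

	// Columns are spread over the shards, so a read for one column
	// pulls in neighbouring data only from its own shard.
	for _, c := range []string{"level", "host", "trace_id"} {
		fmt.Println(c, shardIdx(c))
	}
}
```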

The new storage format is versioned as v1, while the old storage format is versioned as v0.
It is stored in the partHeader.FormatVersion.

Parts with the old storage format are converted into parts with the new storage format during background merge.
It is possible to force merge by querying /internal/force_merge HTTP endpoint - see https://docs.victoriametrics.com/victorialogs/#forced-merge .
2024-10-16 17:35:07 +02:00
Aliaksandr Valialkin
bac193e50b
app/vlselect: do not show empty fields in query results
Empty fields are treated as non-existing fields by VictoriaLogs data model.
So there is no sense in returning empty fields in query results, since they may mislead and confuse users.
2024-10-14 23:43:58 +02:00
Aliaksandr Valialkin
3c73dbbacc
app/vlstorage: add support for forced merge via /internal/force_merge HTTP endpoint 2024-10-13 22:20:31 +02:00
Aliaksandr Valialkin
b4b79a4961
lib/logstorage: make a copy of s.partitions slice when performing queries over the selected partitions
s.partitions can be changed when new partition is registered or when old partition is dropped.
This could lead to data races and panics when s.partitions slice is accessed by concurrently executed queries.

The fix is to make a copy of the selected partitions under s.partitionsLock before performing the query.
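
The pattern can be sketched as follows. The field names `partitions` and `partitionsLock` follow the commit message; everything else (`snapshot`, `dropOldest`, the `partition` type) is a simplified, hypothetical stand-in, and details such as reference counting of partitions are left out.

```go
package main

import (
	"fmt"
	"sync"
)

type partition struct{ name string }

type storage struct {
	partitionsLock sync.Mutex
	partitions     []*partition
}

// snapshot returns a copy of s.partitions made under the lock.
// The caller can iterate over the copy without holding the lock.
func (s *storage) snapshot() []*partition {
	s.partitionsLock.Lock()
	defer s.partitionsLock.Unlock()
	return append([]*partition(nil), s.partitions...)
}

// dropOldest removes the first partition by shifting the slice in place.
// Without the copy in snapshot, a concurrent reader would observe this shift.
func (s *storage) dropOldest() {
	s.partitionsLock.Lock()
	defer s.partitionsLock.Unlock()
	if len(s.partitions) == 0 {
		return
	}
	copy(s.partitions, s.partitions[1:])
	s.partitions[len(s.partitions)-1] = nil
	s.partitions = s.partitions[:len(s.partitions)-1]
}

func main() {
	s := &storage{partitions: []*partition{{"2024-10-12"}, {"2024-10-13"}}}
	snap := s.snapshot()
	s.dropOldest()
	// The snapshot still refers to both partitions in the original order.
	fmt.Println(len(snap), snap[0].name, snap[1].name)
}
```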
2024-10-13 22:14:34 +02:00
Aliaksandr Valialkin
507b206a7d
lib/logstorage: move getConstColumnValue() and getColumnHeader() methods from columnsHeader to blockSearch
This localizes blockSearch.getColumnsHeader() call at block_search.go .
This call is going to be optimized in the next commits in order to avoid
unmarshaling of header data for unneeded columns, which weren't requested
by getConstColumnValue() / getColumnHeader().
2024-10-13 14:29:02 +02:00
Aliaksandr Valialkin
279e25e7c8
lib/logstorage: avoid redundant copying of column names and column values for dictionary-encoded columns during querying
Refer the original byte slice with the marshaled columnsHeader for columns names and dictionary-encoded column values.
This improves query performance a bit when big number of blocks with big number of columns are scanned during the query.
2024-10-13 13:25:38 +02:00
Aliaksandr Valialkin
9e48074b59
lib/logstorage: avoid calling columnsHeader.initFromBlockHeader() multiple times for the same blockSearch
This should improve performance when blockSearch.getColumnsHeader() is called multiple times
from different places of the code.
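
A minimal sketch of the idea, assuming the parsed header is cached on the blockSearch after the first call. The types below are simplified, hypothetical stand-ins for the real ones, and the comma-separated header is only a placeholder for the marshaled data.

```go
package main

import (
	"fmt"
	"strings"
)

type columnsHeader struct{ names []string }

type blockSearch struct {
	rawHeader      string         // stand-in for the marshaled header bytes
	csh            *columnsHeader // cached result of the first unmarshal
	unmarshalCalls int            // counts how many times the header was parsed
}

// getColumnsHeader parses the header on the first call and reuses it afterwards.
func (bs *blockSearch) getColumnsHeader() *columnsHeader {
	if bs.csh == nil {
		bs.unmarshalCalls++
		bs.csh = &columnsHeader{names: strings.Split(bs.rawHeader, ",")}
	}
	return bs.csh
}

func main() {
	bs := &blockSearch{rawHeader: "_msg,level,host"}
	bs.getColumnsHeader()
	bs.getColumnsHeader()
	fmt.Println(bs.unmarshalCalls) // 1
}
```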
2024-10-13 12:56:12 +02:00
Aliaksandr Valialkin
867f671cc4
lib/logstorage: make sure that br.bs is non-nil before checking br.bs.bsw.bh.rowsCount there
br.bs may be nil when br contains the block with additional filters applied during pipe calculations.
For example, `* | count() if (error) errors`.
2024-10-12 20:51:29 +02:00
Aliaksandr Valialkin
7b475ed95d
lib/logstorage: disallow using pipe names as the first unquoted words in filter pipe
Improperly written pipes could be silently parsed as filter pipe.
For example, the following query:

   * | by (x)

was silently parsed to:

   * | filter "by" x

It is better to return error, so the user could identify and fix invalid pipe
instead of silently executing invalid query with `filter` pipe.
2024-10-09 16:10:13 +02:00
Aliaksandr Valialkin
6acf543b90
lib/logstorage: disallow using by as the first word in log filters, since it frequently clashes with stats by(...) pipe where stats word is omitted 2024-10-09 15:53:15 +02:00
Aliaksandr Valialkin
89686094a0
lib/logstorage: allow special chars in unquoted _stream tag names and values
This simplifies writing _stream filters. For example,

{foo-bar=abc:de}

can be written instead of

{"foo-bar"="abc:de"}
2024-10-07 15:10:03 +02:00
Aliaksandr Valialkin
462b7cd597
lib/logstorage: quote logfmt strings only if they contain special chars, which could break logfmt parsing and/or reading 2024-10-07 14:31:30 +02:00
Aliaksandr Valialkin
364f084b43
lib/logstorage: add len pipe for calculating byte length of log field values 2024-10-03 18:21:10 +02:00
Aliaksandr Valialkin
a350be48b6
lib/logstorage: do not count dictionary values which have no matching logs in count_uniq stats function
Create blockResultColumn.forEachDictValue* helper functions for visiting matching
dictionary values. These helper functions should prevent counting dictionary values
without matching logs in the future.

This is a follow-up for 0c0f013a60
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7152
2024-10-01 13:34:45 +02:00
Aliaksandr Valialkin
630211cfed
app/vlogscli: add interactive command-line tool for querying VictoriaLogs 2024-10-01 12:23:07 +02:00
Aliaksandr Valialkin
0c0f013a60
lib/logstorage: skip values with zero hits for 'uniq', 'top' and 'field_values' pipes
See https://github.com/VictoriaMetrics/victorialogs-datasource/issues/72#issuecomment-2352078483
2024-09-30 14:15:07 +02:00
Aliaksandr Valialkin
1da4650143
lib/logstorage: allow using ! in unescaped phrase
Previously the phrase filter with `!` was treated unexpectedly.
For example, `foo!bar` filter was treated as `foo AND NOT bar`,
while most users expect that it matches "foo!bar" phrase.

This commit aligns with users' expectations.
2024-09-29 11:14:15 +02:00
Aliaksandr Valialkin
60183c7c79
lib/logstorage: allow using - instead of ! in front of (...) 2024-09-29 11:12:22 +02:00
Aliaksandr Valialkin
b52862badf
lib/logstorage: return the expected hits results from uniq pipe when the number of unique values reaches the specified limit
Previously `uniq` pipe could return zero `hits` if the number of found unique values equals the specified limit.
This wasn't expected in most cases.
2024-09-29 10:51:09 +02:00
Aliaksandr Valialkin
55eb321f77
lib/logstorage: clear hits slice obtained from encoding.GetUint64s() before updating it with hits for valueTypeDict column
encoding.GetUint64s() returns uninitialized slice, which may contain arbitrary values.
So values in this slice must be reset to zero before using it for counting hits in `uniq` and `top` pipes.
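
The bug class can be illustrated with a pooled slice. In the sketch below `getUint64s` and `putUint64s` are hypothetical stand-ins built on sync.Pool, not the real encoding.GetUint64s(); the point is only that a reused slice keeps values from its previous user unless it is zeroed.

```go
package main

import (
	"fmt"
	"sync"
)

var pool sync.Pool

// getUint64s returns a slice of length n, possibly reused from the pool.
// A reused slice may still hold values written by its previous user.
func getUint64s(n int) *[]uint64 {
	if v := pool.Get(); v != nil {
		p := v.(*[]uint64)
		if cap(*p) >= n {
			*p = (*p)[:n]
			return p
		}
	}
	s := make([]uint64, n)
	return &s
}

func putUint64s(p *[]uint64) { pool.Put(p) }

// countHits counts how many times every dictionary index occurs in dictIdxs.
func countHits(dictIdxs []byte, dictLen int) []uint64 {
	p := getUint64s(dictLen)
	hits := *p
	// Reset the counters before use; otherwise stale values leak into the result.
	for i := range hits {
		hits[i] = 0
	}
	for _, idx := range dictIdxs {
		hits[idx]++
	}
	out := append([]uint64(nil), hits...)
	putUint64s(p)
	return out
}

func main() {
	fmt.Println(countHits([]byte{0, 0, 1}, 2))
	fmt.Println(countHits([]byte{1}, 2))
}
```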
2024-09-29 10:29:13 +02:00
Aliaksandr Valialkin
94afcbd9a9
lib/logstorage: postpone initialization of per-shard stateSizeBudget until the first call to pipeProcessor.writeBlock()
This simplifies pipeProcessor initialization logic a bit.
This also doesn't mangle the original maxStateSize value, which is used in error messages when the state size exceeds maxStateSize.
2024-09-29 10:29:13 +02:00
Aliaksandr Valialkin
0b91452ca4
lib/logstorage: add non-empty if (...) condition to automatically generated result names in stats pipe
This allows executing queries with `stats` pipe, which calculate multiple results with the same functions,
but with different `if (...)` conditions. For example:

  _time:5m | count(), count() if (error)

Previously such queries couldn't be executed because the automatically generated name for the second result
didn't include `if (error)`, so names for both results were identical - `count(*)`.
2024-09-29 09:51:28 +02:00
Aliaksandr Valialkin
8772aea24b
lib/logstorage: support order alias for sort pipe
Now the following queries are equivalent:

    _time:5s | sort by (_time)

    _time:5s | order by (_time)

This is needed for convenience, since `order by` is commonly used in other query languages such as SQL.
2024-09-29 09:51:27 +02:00
Aliaksandr Valialkin
09b309a82e
lib/logstorage: allow using - instead of ! as a shorthand for NOT operator in LogsQL 2024-09-27 13:14:47 +02:00
Aliaksandr Valialkin
76c1b0b8ea
lib/logstorage: support skipping _stream: prefix for stream filters
'_stream:{...}' can be written as '{...}'

This simplifies writing queries with stream filters, and makes them more familiar to Loki users.
2024-09-27 13:14:46 +02:00
Aliaksandr Valialkin
9367a9a6a2
lib/logstorage: consistently sort stream contexts belonging to different streams by the minimum time seen in the matching logs
This should simplify debugging of stream_context output, since it remains stable over repeated requests.
2024-09-27 11:19:26 +02:00
Aliaksandr Valialkin
b49d1ea809
lib/logstorage: add _msg="---" delimiter between different log streams in stream_context output
This should help investigating contexts, which belong to different log streams.
2024-09-27 11:01:13 +02:00
Aliaksandr Valialkin
b82bd0c2ec
lib/logstorage: improve performance for stream_context pipe over streams with big number of log entries
Do not read timestamps for blocks, which cannot contain surrounding logs.
This should improve performance for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6730 .

Also optimize min(_time) and max(_time) calculations a bit by avoiding conversion
of timestamp to string when it isn't needed.
This should improve performance for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7070 .
2024-09-26 22:22:23 +02:00
Aliaksandr Valialkin
4b1611267f
lib/logstorage: properly return surrounding logs outside the selected time range by stream_context pipe
Previously only logs inside the selected time range could be returned by stream_context pipe.
For example, the following query could return up to 10 surrounding logs only for the last 5 minutes,
while most users expect this query should return up to 10 surrounding logs without restrictions on the time range.

    _time:5m panic | stream_context before 10

This enables the ability to implement stream context feature at VictoriaLogs web UI: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7063 .

Reduce memory usage when returning stream context over big log streams with millions of entries.
The new logic scans over all the log messages for the selected log stream, while keeping in memory only
the given number of surrounding logs. Previously all the logs for the given log stream on the selected time range
were loaded in memory before selecting the needed surrounding logs.
This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6730 .

Improve the scan performance for big log streams by fetching only the requested fields. For example, the following
query should be executed much faster than before if logs contain many fields other than _stream, _msg and _time:

    panic | stream_context after 30 | fields _stream, _msg, _time
2024-09-26 17:03:45 +02:00
Aliaksandr Valialkin
037652d5ae
app/vlinsert: support _time field without timezone information during data ingestion
Use local timezone of the host server in this case. The timezone can be overridden
with TZ environment variable if needed.

While at it, allow using whitespace instead of T as a delimiter between date and time
in the ingested _time field. For example, '2024-09-20 10:20:30' is now accepted
during data ingestion. This is valid ISO8601 format, which is used by some log shippers,
so it should be supported. This format is also known as SQL datetime format.

Also assume local time zone when time without timezone information is passed to querying APIs.
Previously such a time was parsed in UTC timezone. Add `Z` to the end of the time string
if the old behaviour is preferred.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6721
2024-09-26 12:49:35 +02:00
Aliaksandr Valialkin
255d1d4e13
app/vlselect/logsql: clone the query with the current timestamp when performing live tailing requests in the loop
Previously the original timestamp was used in the copied query, so _time:duration filters
were applied to the original time range: (timestamp-duration ... timestamp]. This resulted
in stopped live tailing, since new logs have timestamps bigger than the original time range.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7028
2024-09-26 08:57:23 +02:00
Aliaksandr Valialkin
e9950f6307
lib/logstorage: add blocks_count pipe
This pipe is useful for debugging purposes when the number of processed blocks must be calculated for the given query:

    <query> | blocks_count

This helps detecting the root cause of query performance slowdown in cases like https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7070
2024-09-25 19:17:48 +02:00
Aliaksandr Valialkin
65b93b17b1
lib/logstorage: lazily read column headers metadata during queries
This improves performance for analytical queries, which do not need column headers metadata.
For example, the following query doesn't need column headers metadata, since _stream and min(_time)
are stored in block header, which is read separately from column headers metadata:

  _time:1w | stats by (_stream) min(_time) min_time

This commit significantly improves the performance for this query.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7070
2024-09-25 19:17:48 +02:00
Aliaksandr Valialkin
4599429f51
lib/logstorage: read timestamps column when it is really needed during query execution
Previously timestamps column was read unconditionally on every query.
This could significantly slow down queries, which do not need reading this column
like in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7070 .
2024-09-25 19:17:47 +02:00
Aliaksandr Valialkin
7f1ba18719
lib/logstorage: improve the performance of obtaining _stream column value
Substitute global streamTagsCache with per-blockSearch cache for ((stream.id) -> (_stream value)) entries.
This improves scalability of obtaining _stream values on a machine with many CPU cores, since every CPU
has its own blockSearch instance.

This also should reduce memory usage when querying logs over big number of streams, since per-blockSearch
cache of ((stream.id) -> (_stream value)) entries is limited in size, and its lifetime is bounded by a single query.
2024-09-24 20:57:00 +02:00
Aliaksandr Valialkin
cf2e7d0d92
lib/logstorage/consts.go: document that it isn't recommended to set maxColumnsPerBlock constant to too big values
This should help avoiding cases like this one - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6425#issuecomment-2337446083
2024-09-24 18:51:46 +02:00
Aliaksandr Valialkin
f86e093b20
lib/logstorage: improve performance for streamID.marshalString() by more than 2x
The streamID.marshalString() is executed in hot path if the query selects _stream_id field.

Command to run the benchmark:

go test ./lib/logstorage/ -run=NONE -bench=BenchmarkStreamIDMarshalString -benchtime=5s

Results before the commit:

BenchmarkStreamIDMarshalString-16    	438480714	        14.04 ns/op	  71.23 MB/s	       0 B/op	       0 allocs/op

Results after the commit:

BenchmarkStreamIDMarshalString-16    	982459660	         6.049 ns/op	 165.30 MB/s	       0 B/op	       0 allocs/op
2024-09-24 18:35:04 +02:00
Aliaksandr Valialkin
919d2dc90e
lib/logstorage: add benchmark for streamID.marshalString 2024-09-24 18:31:38 +02:00
Aliaksandr Valialkin
a3d8077959
lib/logstorage: make sure that getCommonTokens returns common tokens in the original order of tokens inside tokenSets arg
This fixes flaky test TestGetCommonTokensForOrFilters:

    filter_or_test.go:143: unexpected tokens for field "_msg"; got ["foo" "bar"]; want ["bar" "foo"]
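
A deterministic result order can be obtained by walking an input slice instead of a map, since Go map iteration order is unspecified. The sketch below assumes that "the original order" means the order of tokens in the first token set; `commonTokens` is a hypothetical simplification of getCommonTokens, and whether map iteration caused the flakiness is an assumption as well.

```go
package main

import "fmt"

// commonTokens returns the tokens present in every set,
// in the order they appear in the first set.
func commonTokens(tokenSets [][]string) []string {
	if len(tokenSets) == 0 {
		return nil
	}
	// Count in how many sets every token occurs, ignoring duplicates within a set.
	counts := make(map[string]int)
	for _, set := range tokenSets {
		seen := make(map[string]bool, len(set))
		for _, tok := range set {
			if !seen[tok] {
				seen[tok] = true
				counts[tok]++
			}
		}
	}
	// Walk the first set (a slice, not a map) to keep the output order stable.
	var out []string
	emitted := make(map[string]bool)
	for _, tok := range tokenSets[0] {
		if counts[tok] == len(tokenSets) && !emitted[tok] {
			emitted[tok] = true
			out = append(out, tok)
		}
	}
	return out
}

func main() {
	sets := [][]string{
		{"bar", "foo", "baz"},
		{"foo", "bar"},
	}
	fmt.Println(commonTokens(sets)) // [bar foo]
}
```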
2024-09-19 15:59:48 +02:00