github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-01 14:47:38 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	92b9b13df1	lib/logstorage: optimize performance for queries, which select all the log fields for logs containing hundreds of log fields (aka "wide events") Unpack the full columnsHeader block instead of unpacking meta-information per each individual column when the query, which selects all the columns, is executed. This improves performance when scanning logs with big number of fields. (cherry picked from commit `2023f017b1`)	2024-10-18 11:42:15 +02:00
Aliaksandr Valialkin	5d541322c6	lib/logstorage: improve performance of `top` and `field_values` pipes on systems with many CPU cores - Parallelize merging of per-CPU results. - Parallelize writing the results to the next pipe. (cherry picked from commit `78c6fb0883`)	2024-10-18 11:42:15 +02:00
Aliaksandr Valialkin	cd7823a310	lib/logstorage: optimize 'stats by(...)' calculations for by(...) fields with millions of unique values on multi-CPU systems - Parallelize merging of per-CPU `stats by(...)` result shards. - Parallelize writing `stats by(...)` results to the next pipe. (cherry picked from commit `c4b2fdff70`)	2024-10-18 11:42:15 +02:00
Aliaksandr Valialkin	1000ae437c	lib/logstorage: optimize performance for `top` pipe when it is applied to a field with millions of unique values - Use parallel merge of per-CPU shard results. This improves merge performance on multi-CPU systems. - Use topN heap sort of per-shard results. This improves performance when results contain millions of entries. (cherry picked from commit `192c07f76a`)	2024-10-18 11:42:15 +02:00
Andrii Chubatiuk	7b49d4f5dc	vlogs: added basic alerts (#7252 ) ### Describe Your Changes Added basic VLogs alerts Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-10-17 11:11:45 -03:00
Aliaksandr Valialkin	e675b98b77	docs/VictoriaLogs/CHANGELOG.md: add missing part of the sentence	2024-10-17 11:09:18 -03:00
Aliaksandr Valialkin	1e19d9df3f	docs/VictoriaLogs/CHANGELOG.md: typo fix: refer the correct endpoints for stats results	2024-10-17 11:09:17 -03:00
Aliaksandr Valialkin	1ee0de573d	docs/VictoriaLogs/CHANGELOG.md: cut v0.36.0-victorialogs release	2024-10-17 11:09:17 -03:00
Aliaksandr Valialkin	54ccf09fdd	lib/logstorage: follow-up for `72941eac36` - Allow dropping metrics if the query result contains at least a single metric. - Allow copying by(...) fields. - Disallow overriding by(...) fields via `math` pipe. - Allow using `format` pipe in stats query. This is useful for constructing some labels from the existing by(...) fields. - Add more tests. - Remove the check for time range in the query filter according to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7254/files#r1803405826 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7254	2024-10-17 11:09:16 -03:00
Hui Wang	21864de527	victorialogs: add more checks for stats query APIs (#7254) 1. Verify that the field in [fields pipe](https://docs.victoriametrics.com/victorialogs/logsql/#fields-pipe) exists. If not, it generates a metric with illegal float value "" for prometheus metrics protocol. 2. Check whether multiple time range filters produce a conflicting query time range, for instance: ``` query: _time: 5m \| stats count(), start:2024-10-08T10:00:00.806Z, end: 2024-10-08T12:00:00.806Z, time: 2024-10-10T10:02:59.806Z ``` must give no result due to invalid final time range. --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-10-17 11:09:16 -03:00
Aliaksandr Valialkin	3346576a3a	lib/logstorage: refactor storage format to be more efficient for querying wide events It turned out that VictoriaLogs is frequently used for collecting logs with tens of fields. For example, a standard Kubernetes setup on top of Filebeat generates more than 20 fields per each log. Such logs are also known as "wide events". The previous storage format was optimized for logs with a few fields. When at least a single field was referenced in the query, then all the meta-information about all the log fields was unpacked and parsed per each scanned block during the query. This could require a lot of additional disk IO and CPU time when logs contain many fields. Resolve this issue by providing a (field -> metainfo_offset) index per each field in every data block. This index allows reading and extracting only the needed metainfo for fields used in the query. This index is stored in columnsHeaderIndexFilename (columns_header_index.bin). This allows increasing performance for queries over wide events by 10x and more. Another issue was that the data for bloom filters and field values across all the log fields except _msg was intermixed in two files - fieldBloomFilename (field_bloom.bin) and fieldValuesFilename (field_values.bin). This could result in huge disk read IO overhead when some small field was referenced in the query, since the Operating System usually reads more data than requested. It reads the data from disk in at least 4KiB blocks (usually the block size is much bigger, in the range 64KiB - 512KiB). So, if a 512-byte bloom filter or values block is read from the file, then the Operating System reads up to 512KiB of data from disk, which results in up to 1000x disk read IO overhead.
This overhead isn't visible for recently accessed data, since this data is usually stored in RAM (aka Operating System page cache), but this overhead may become very annoying when performing the query over large volumes of data which isn't present in OS page cache. The solution for this issue is to split bloom filters and field values across multiple shards. This reduces the worst-case disk read IO overhead by at least Nx where N is the number of shards, while the disk read IO overhead is completely removed in the best case when the number of columns doesn't exceed N. Currently the number of shards is 8 - see bloomValuesShardsCount. This solution increases performance for queries over large volumes of newly ingested data by up to 1000x. The new storage format is versioned as v1, while the old storage format is versioned as v0. It is stored in the partHeader.FormatVersion. Parts with the old storage format are converted into parts with the new storage format during background merge. It is possible to force merge by querying /internal/force_merge HTTP endpoint - see https://docs.victoriametrics.com/victorialogs/#forced-merge .	2024-10-17 11:09:16 -03:00
Yury Molodov	066ed48c95	vmui: fix alert display with long messages (#7228 ) ### Describe Your Changes Fix `Alert` component to prevent it from overflowing the screen when displaying long messages. Related issue: #7207 ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `86029de0d4`)	2024-10-15 16:36:44 +02:00
Yury Molodov	7bc20086ec	vmui: add the ability to cancel running queries (#7204 ) ### Describe Your Changes - Added functionality to cancel running queries on the Explore Logs and Query pages. - The loader was changed from a spinner to a top bar within the block. This still indicates loading, but solves the issue of the spinner "flickering," especially during graph dragging. Related issue: #7097 https://github.com/user-attachments/assets/98e59aeb-905b-4b9d-bbb2-688223b22a82 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). (cherry picked from commit `6c9772b101`)	2024-10-15 14:53:27 +02:00
Aliaksandr Valialkin	0881e5fd5c	app/vlselect: do not show empty fields in query results Empty fields are treated as non-existing fields by VictoriaLogs data model. So there is no sense in returning empty fields in query results, since they may mislead and confuse users. (cherry picked from commit `bac193e50b`)	2024-10-15 11:49:32 +02:00
Aliaksandr Valialkin	f627d7f686	app/vlstorage: add support for forced merge via /internal/force_merge HTTP endpoint (cherry picked from commit `3c73dbbacc`)	2024-10-15 11:49:31 +02:00
Aliaksandr Valialkin	ac2b6e8704	lib/logstorage: make a copy of s.partitions slice when performing queries over the selected partitions s.partitions can be changed when new partition is registered or when old partition is dropped. This could lead to data races and panics when s.partitions slice is accessed by concurrently executed queries. The fix is to make a copy of the selected partitions under s.partitionsLock before performing the query. (cherry picked from commit `b4b79a4961`)	2024-10-15 11:49:31 +02:00
Aliaksandr Valialkin	e581338b84	lib/logstorage: make sure that br.bs is non-nil before checking br.bs.bsw.bh.rowsCount there br.bs may be nil when br contains the block with additional filters applied during pipe calculations. For example, `* \| count() if (error) errors`. (cherry picked from commit `867f671cc4`)	2024-10-15 11:49:29 +02:00
Aliaksandr Valialkin	ff63816b06	docs/VictoriaLogs: cut v0.35.0 release (cherry picked from commit `252aa792f7`)	2024-10-11 14:27:46 +02:00
Aliaksandr Valialkin	4bb5f588bc	app/vlogscli: add -accountID and -projectID command-line flags for querying the given tenants (cherry picked from commit `ad5d8097da`)	2024-10-11 14:27:45 +02:00
Aliaksandr Valialkin	d07e09b1e4	app/vlogscli: add support for live tailing (cherry picked from commit `e31625e0b2`) Signed-off-by: hagen1778 <roman@victoriametrics.com> # Conflicts: # Makefile	2024-10-11 14:27:26 +02:00
Aliaksandr Valialkin	db75455fbd	docs/VictoriaLogs/CHANGELOG.md: cut v0.34.0 release	2024-10-08 12:21:50 +02:00
Aliaksandr Valialkin	efe5935497	app/vlogscli: add ability to display query results in logfmt, single-line and multi-line json modes (cherry picked from commit `492190885d`)	2024-10-07 14:46:21 +02:00
Aliaksandr Valialkin	2e5dbd6f91	app/vlogscli: return back sorting result fields by name This simplifies locating the needed field when the number of fields per each returned result is big (cherry picked from commit `daad96b3a5`)	2024-10-07 14:46:20 +02:00
Aliaksandr Valialkin	026560df73	app/vlogscli: preserve the original order of fields in the displayed responses	2024-10-05 21:30:10 +02:00
Aliaksandr Valialkin	7a44614e0b	lib/logstorage: add `len` pipe for calculating byte length of log field values (cherry picked from commit `364f084b43`)	2024-10-04 10:42:51 +02:00
Aliaksandr Valialkin	b2c3dbef09	docs/VictoriaLogs/CHANGELOG.md: cut v0.33.0-victorialogs release	2024-10-01 13:42:27 +02:00
Aliaksandr Valialkin	81f3e07e1e	lib/logstorage: do not count dictionary values which have no matching logs in `count_uniq` stats function Create blockResultColumn.forEachDictValue* helper functions for visiting matching dictionary values. These helper functions should prevent from counting dictionary values without matching logs in the future. This is a follow-up for `0c0f013a60` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7152	2024-10-01 13:36:27 +02:00
Aliaksandr Valialkin	8c55b699f4	app/vlogscli: add interactive command-line tool for querying VictoriaLogs	2024-10-01 12:24:53 +02:00
Aliaksandr Valialkin	c07746c1af	docs/VictoriaLogs/CHANGELOG.md: cut v0.32.1-victorialogs release	2024-09-30 14:30:34 +02:00
Aliaksandr Valialkin	dbcf06cd85	lib/logstorage: skip values with zero hits for 'uniq', 'top' and 'field_values' pipes See https://github.com/VictoriaMetrics/victorialogs-datasource/issues/72#issuecomment-2352078483	2024-09-30 14:16:21 +02:00
Aliaksandr Valialkin	3babcb0bbd	docs/VictoriaLogs/CHANGELOG.md: cut v0.32.0-victorialogs	2024-09-29 14:48:36 +02:00
Aliaksandr Valialkin	58d1e517de	lib/logstorage: clear hits slice obtained from encoding.GetUint64s() before updating it with hits for valueTypeDict column encoding.GetUint64s() returns uninitialized slice, which may contain arbitrary values. So values in this slice must be reset to zero before using it for counting hits in `uniq` and `top` pipes.	2024-09-29 10:29:50 +02:00
Aliaksandr Valialkin	7f8b1300a9	lib/logstorage: add non-empty `if (...)` condition to automatically generated result names in `stats` pipe This allows executing queries with `stats` pipe, which calculate multiple results with the same functions, but with different `if (...)` conditions. For example: _time:5m \| count(), count() if (error) Previously such queries couldn't be executed because the automatically generated name for the second result didn't include `if (error)`, so names for both results were identical - `count(*)`.	2024-09-29 09:52:19 +02:00
Aliaksandr Valialkin	04c73d54d4	lib/logstorage: support `order` alias for `sort` pipe Now the following queries are equivalent: _time:5s \| sort by (_time) _time:5s \| order by (_time) This is needed for convenience, since `order by` is commonly used in other query languages such as SQL.	2024-09-29 09:52:18 +02:00
Aliaksandr Valialkin	0f1b3852dd	app/vlinsert: support unix timestamps in seconds and milliseconds in JSON stream data ingestion API	2024-09-28 21:57:19 +02:00
Aliaksandr Valialkin	b8fa213310	app/vlinsert: accept unix timestamp in seconds in addition to milliseconds at ElasticSearch bulk API Timestamps in seconds are sometimes used for data ingestion via ElasticSearch bulk API	2024-09-28 21:21:19 +02:00
Aliaksandr Valialkin	8c62845211	docs/VictoriaLogs/CHANGELOG.md: cut v0.31.0-victorialogs release	2024-09-27 13:54:24 +02:00
Yury Molodov	64793ff5f0	vmui/logs: improve graph usability (#7025 ) ### Describe Your Changes - Show the time range in the tooltip when hovering over staircase graphs. - Use bolder lines for staircase graphs. - Increase the number of steps on the staircase graph to 100. - Reduce the maximum width of the tooltip to 1/3 of the screen. - Insert only the label name under the cursor into the query input field when `Ctrl`-clicking the line legend. See [this comment](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6545#issuecomment-2336805237). ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-09-27 13:20:18 +02:00
Aliaksandr Valialkin	1a6313ca68	lib/logstorage: allow using `-` instead of `!` as a shorthand for `NOT` operator in LogsQL	2024-09-27 13:15:55 +02:00
Aliaksandr Valialkin	b60cb98377	lib/logstorage: support skipping _stream: prefix for stream filters '_stream:{...}' can be written as '{...}' This simplifies writing queries with stream filters, and makes them more familiar to Loki users.	2024-09-27 13:15:55 +02:00
Yury Molodov	b95af2accf	vmui: add functionality to preserve selected columns (#7037 ) ### Describe Your Changes 1) Changed table settings from a popup to a modal window to simplify future functionality additions. 2) Added functionality to save selected columns when data is modified or the page is reloaded. See #7016. <details> <summary>Example screenshots</summary> <img alt="demo-1" width="600" src="https://github.com/user-attachments/assets/a5d9a910-363c-4931-8b12-18ea8b3d97d8"/> </details> ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> (cherry picked from commit `c896bf340d`)	2024-09-27 12:40:52 +02:00
Aliaksandr Valialkin	ebb2c605f9	docs/VictoriaLogs/CHANGELOG.md: cut v0.30.1-victorialogs release	2024-09-27 11:21:28 +02:00
Aliaksandr Valialkin	bc0bb0c36a	lib/logstorage: consistently sort stream contexts belonging to different streams by the minimum time seen in the matching logs This should simplify debugging of stream_context output, since it remains stable over repeated requests.	2024-09-27 11:21:28 +02:00
Aliaksandr Valialkin	ce8eda4b51	docs/VictoriaLogs/CHANGELOG.md: cut v0.30.0-victorialogs release	2024-09-27 09:18:47 +02:00
Aliaksandr Valialkin	f5dfe1cacd	lib/logstorage: properly return surrounding logs outside the selected time range by stream_context pipe Previously only logs inside the selected time range could be returned by stream_context pipe. For example, the following query could return up to 10 surrounding logs only for the last 5 minutes, while most users expect this query should return up to 10 surrounding logs without restrictions on the time range. _time:5m panic \| stream_context before 10 This enables the ability to implement stream context feature at VictoriaLogs web UI: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7063 . Reduce memory usage when returning stream context over big log streams with millions of entries. The new logic scans over all the log messages for the selected log stream, while keeping in memory only the given number of surrounding logs. Previously all the logs for the given log stream on the selected time range were loaded in memory before selecting the needed surrounding logs. This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6730 . Improve the scan performance for big log streams by fetching only the requested fields. For example, the following query should be executed much faster than before if logs contain many fields other than _stream, _msg and _time: panic \| stream_context after 30 \| fields _stream, _msg, _time	2024-09-26 17:04:39 +02:00
Aliaksandr Valialkin	4d27933041	app/vlinsert: support `_time` field without timezone information during data ingestion Use local timezone of the host server in this case. The timezone can be overridden with TZ environment variable if needed. While at it, allow using whitespace instead of T as a delimiter between date and time in the ingested _time field. For example, '2024-09-20 10:20:30' is now accepted during data ingestion. This is valid ISO8601 format, which is used by some log shippers, so it should be supported. This format is also known as SQL datetime format. Also assume local time zone when time without timezone information is passed to querying APIs. Previously such a time was parsed in UTC timezone. Add `Z` to the end of the time string if the old behaviour is preferred. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6721	2024-09-26 12:50:14 +02:00
Aliaksandr Valialkin	e63d50e0c0	docs/VictoriaLogs/CHANGELOG.md: typo fix: itentifying -> identifying	2024-09-26 09:41:50 +02:00
Zhu Jiekun	3fa72b2c1b	feature: [victorialogs] drop logs without non-empty _msg field (#7056) ### Describe Your Changes VictoriaLogs allows logs in which the `_msg` field is missing or empty. This leads to incorrect search results. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6785 This pull request checks for a non-empty `_msg` field before a log entry is added to `LogRows`. New counter `vl_rows_dropped_total{reason="msg_not_exist"}` is introduced. Example log output: ``` 2024-09-23T02:33:19.719Z warn app/vlinsert/insertutils/common_params.go:189 dropping log line without _msg field; [{@timestamp 2024-09-18T13:42:16.600000000Z} {Attributes.array.attribute ["many","values"]} {Attributes.boolean.attribute true} {Attributes.double.attribute 637.704} {Attributes.int.attribute 10} {Attributes.map.attribute.some.map.key some value} {Attributes.string.attribute some string} {Body Example ddddddddddlog record} {Resource.service.name my.service} {Scope.my.scope.attribute some scope attribute} {Scope.name my.library} {Scope.version 1.0.0} {SeverityNumber 10} {SeverityText Information} {SpanId eee19b7ec3c1b174} {TraceFlags 0} {TraceId 5b8efff798038103d269b633813fc60c}] ``` ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). - [ ] Benchmark for potential performance loss. --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-09-26 09:35:58 +02:00
Aliaksandr Valialkin	92885f99dd	docs/VictoriaLogs/CHANGELOG.md: document the fix for Windows build This is a follow-up for `264c2ec6bd` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6998 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6973	2024-09-26 09:17:23 +02:00
Aliaksandr Valialkin	15db8d3c47	docs/VictoriaLogs/CHANGELOG.md: typo fix after `255d1d4e13`: returns -> return	2024-09-26 09:01:01 +02:00

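Commit ac2b6e8704 in the list above describes copying the selected partitions under s.partitionsLock before running a query, so that concurrent changes to s.partitions cannot race with the query. A minimal Go sketch of that pattern follows. Only the names `partitions` and `partitionsLock` come from the commit message; the `storage` and `partition` types and every method here are hypothetical stand-ins, not the actual lib/logstorage code.

```go
package main

import (
	"fmt"
	"sync"
)

// partition is a hypothetical stand-in for a per-day partition.
type partition struct {
	day int64
}

// storage is a hypothetical stand-in holding a mutable list of partitions.
type storage struct {
	partitionsLock sync.Mutex
	partitions     []*partition
}

// addPartition registers a new partition under the lock.
func (s *storage) addPartition(p *partition) {
	s.partitionsLock.Lock()
	defer s.partitionsLock.Unlock()
	s.partitions = append(s.partitions, p)
}

// dropOldest removes the first partition, shifting the slice in place.
func (s *storage) dropOldest() {
	s.partitionsLock.Lock()
	defer s.partitionsLock.Unlock()
	if len(s.partitions) > 0 {
		s.partitions = append(s.partitions[:0], s.partitions[1:]...)
	}
}

// selectPartitions returns a private copy of the partitions with
// minDay <= day <= maxDay. The copy does not share a backing array
// with s.partitions, so later mutations cannot affect it.
func (s *storage) selectPartitions(minDay, maxDay int64) []*partition {
	s.partitionsLock.Lock()
	defer s.partitionsLock.Unlock()
	var selected []*partition
	for _, p := range s.partitions {
		if p.day >= minDay && p.day <= maxDay {
			selected = append(selected, p)
		}
	}
	return selected
}

func main() {
	s := &storage{}
	for _, day := range []int64{1, 2, 3} {
		s.addPartition(&partition{day: day})
	}
	snapshot := s.selectPartitions(1, 3)
	s.dropOldest() // mutates s.partitions in place; snapshot is unaffected
	for _, p := range snapshot {
		fmt.Println("querying partition for day", p.day)
	}
}
```

Returning a sub-slice of s.partitions instead of a copy would share the backing array, so the in-place shift in `dropOldest` could change what a running query sees.
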

188 commits