`searchOptions` is always passed by ref, so there is no need to pass the atomic.Uint64 by ref as well.
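A minimal Go sketch of the idea (struct and field names are illustrative assumptions, not the actual code):
```
package storage

import "sync/atomic"

// searchOptions is always passed around by pointer, so an atomic.Uint64
// stored by value inside it is shared correctly by all callers; an extra
// *atomic.Uint64 indirection buys nothing.
type searchOptions struct {
	seriesRead atomic.Uint64 // hypothetical counter field
}

func (so *searchOptions) trackSeriesRead(n uint64) {
	so.seriesRead.Add(n)
}
```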
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Use summaries in order to reduce the cardinality generated by these metrics.
Also update the label to include the source of the metric. Include the request path in cases where it is guaranteed to be used by a single endpoint, and use a generic name in other cases ("search" is used by both "/query" and "/query_range", "search_metric_names" is used by "/series" and the Graphite tag handlers).
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Added metrics to track the number of series read per individual query. Based on checking the number of metricIDs fetched when performing a search.
See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7029
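A minimal sketch of this approach using the github.com/VictoriaMetrics/metrics package (the metric and label names are illustrative assumptions):
```
package search

import "github.com/VictoriaMetrics/metrics"

// A summary tracks quantiles over the observed values instead of emitting
// a separate series per value, keeping cardinality constant.
var seriesReadPerQuery = metrics.NewSummary(`vm_series_read_per_query{path="search"}`)

func onSearchFinished(metricIDsFetched int) {
	seriesReadPerQuery.Update(float64(metricIDsFetched))
}
```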
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
### Fix scrapePool name
If I do some magic in the scrape file and manipulate the job name, then
Prometheus will show scrapePool as the original job name in the targets
API, but vmagent will set it to the final value, which is wrong.
Example:
```
job: consul-targets
...
- source_labels: [ __meta_consul_service ]
regex: (\w+)[_-]exporter
target_label: job
replacement: $1
```
A curl to the Prometheus API will show
`"scrapePool": "consul-targets",`
while vmagent shows:
`"scrapePool": "node",`
Before changes:
```
curl -s 'http://localhost:8429/api/v1/targets' | jq -r '.data.activeTargets[].scrapePool'| sort|uniq
blackbox
pgbackrest
postgres
```
After changes:
```
curl -s 'http://localhost:8429/api/v1/targets' | jq -r '.data.activeTargets[].scrapePool'| sort|uniq
blackbox
consul-targets
```
### Checklist
The following checks are **mandatory**:
- [x] My change adheres to the [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).
---------
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 486b9e1c64)
### Describe Your Changes
Fixes #8469
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres to the [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).
(cherry picked from commit c174a046e2)
This feature allows tracking query requests by metric name. The tracker
state is stored in memory and is capped at 1/100 of the memory allocated to the
storage. If the cap is exceeded, the tracker rejects adding any new items and
instead registers query requests only for already observed metric names.
This feature is disabled by default; the new flag
`-storage.trackMetricNamesStats` enables it.
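A minimal sketch of the capped tracker behaviour described above (an assumption for illustration, not the actual implementation):
```
package storage

import "sync"

type metricNamesTracker struct {
	mu        sync.Mutex
	counts    map[string]uint64
	sizeBytes uint64
	maxBytes  uint64 // ~1/100 of the memory allocated to the storage
}

func newMetricNamesTracker(maxBytes uint64) *metricNamesTracker {
	return &metricNamesTracker{
		counts:   make(map[string]uint64),
		maxBytes: maxBytes,
	}
}

// registerQuery increments the usage counter for metricName. Once the
// memory cap is reached, new names are rejected, while counters for
// already observed names keep incrementing.
func (t *metricNamesTracker) registerQuery(metricName string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if _, ok := t.counts[metricName]; !ok {
		if t.sizeBytes+uint64(len(metricName)) > t.maxBytes {
			return // cap exceeded: don't add new items
		}
		t.sizeBytes += uint64(len(metricName))
	}
	t.counts[metricName]++
}
```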
New API endpoints were added to the select component:
* /api/v1/status/metric_names_stats - returns a JSON object
with usage statistics.
* /admin/api/v1/status/metric_names_stats/reset - resets the internal
state of the tracker and the tsid cache.
New metrics were added for this feature:
* vm_cache_size_bytes{type="storage/metricNamesUsageTracker"}
* vm_cache_size{type="storage/metricNamesUsageTracker"}
* vm_cache_size_max_bytes{type="storage/metricNamesUsageTracker"}
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4458
---------
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Previously, the opentelemetry attribute parser added extra field names according to
the Go JSON serialization rules for structs:
```
type AnyValue struct {
	StringValue string
}
```
This was serialized into:
```
{"StringValue": "some-string"}
```
While opentelemetry-collector serializes it as
```
"some-string"
```
This commit changes this behaviour and makes the parser compatible with the opentelemetry-collector format. See the test cases for examples.
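A hypothetical sketch of the collector-compatible encoding (not the actual parser code): the wrapped scalar is emitted directly instead of as a struct field.
```
package otel

import "encoding/json"

type AnyValue struct {
	StringValue string
}

// marshalAnyValue matches the opentelemetry-collector output:
// "some-string" instead of {"StringValue": "some-string"}.
func marshalAnyValue(v AnyValue) ([]byte, error) {
	return json.Marshal(v.StringValue)
}
```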
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8384
Since the funcs `ParseDuration` and `ParseTimeMsec` are used in vlogs,
vmalert, victoriametrics and other components, importing promutils only
for this reason makes them export the irrelevant
`vm_rows_invalid_total{type="prometheus"}` metric.
This change removes the `vm_rows_invalid_total{type="prometheus"}` metric
from the /metrics page of these components.
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres to the [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 63f6ac3ff8)
Fix parsing of IPv6 addresses after service discovery. Previously, it could lead
to a target being discovered and then discarded.
See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8374
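An illustrative sketch of the underlying pitfall using the standard library (not the actual discovery code):
```
package main

import (
	"fmt"
	"net"
)

func main() {
	// Naive concatenation yields "2001:db8::1:9100", which can't be
	// split back into host and port unambiguously.
	bad := "2001:db8::1" + ":" + "9100"

	// net.JoinHostPort brackets IPv6 hosts: "[2001:db8::1]:9100".
	good := net.JoinHostPort("2001:db8::1", "9100")
	fmt.Println(bad, good)
}
```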
---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
(cherry picked from commit 99de272b72)
`MustParsePromMetrics` imports `lib/protoparser/prometheus`, and this
package exposes the following metrics:
```
vm_protoparser_rows_read_total{type="promscrape"}
vm_rows_invalid_total{type="prometheus"}
```
It means every package that uses `lib/prompbmarshal` will start exposing
these metrics. For example, vlogs imports `lib/protoparser/common`, which
uses `lib/prompbmarshal.Label`. And only because of this, vlogs starts
exposing unrelated Prometheus metrics on the /metrics page.
Moving `MustParsePromMetrics` to `lib/protoparser/prometheus` seems like
the least intrusive change.
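An illustrative sketch of how such metrics leak through the import graph (a simplified assumption, not the actual code): the parser package registers its metrics at package level, so any binary that transitively imports it exposes them on /metrics.
```
package prometheus // stands in for lib/protoparser/prometheus

import "github.com/VictoriaMetrics/metrics"

// Registered at package initialization: every importer of this package
// (even an indirect one) ends up exposing the metric.
var rowsInvalid = metrics.NewCounter(`vm_rows_invalid_total{type="prometheus"}`)
```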
---------
Depends on another change
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8403
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Previously, if an indexDB search failed for some reason while searching the previous indexDB (aka extDB), VictoriaMetrics stored an empty search result in the cache. This could cause incorrect search results for subsequent requests.
This commit checks the search error and stores the request results only on success.
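A simplified sketch of the fix (types and helpers are hypothetical stand-ins for the real indexDB search and its results cache):
```
package indexdb

import "errors"

type tagFilters struct{ key string }

var cache = map[string][]uint64{}

func searchInExtDB(tfs *tagFilters) ([]uint64, error) {
	return nil, errors.New("extDB unavailable") // stand-in for a real search
}

// searchWithCache stores results in the cache only on success. Previously
// an empty result could be cached even when the extDB search failed,
// poisoning subsequent identical requests.
func searchWithCache(tfs *tagFilters) ([]uint64, error) {
	metricIDs, err := searchInExtDB(tfs)
	if err != nil {
		return nil, err // don't cache failed searches
	}
	cache[tfs.key] = metricIDs
	return metricIDs, nil
}
```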
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8345
Optimize trivial filters such as `field:~".+"`, `field:~".*"` or `field:""`
by replacing them with faster equivalents. For example, `field:~".*"` is replaced with `*`,
while `field:~".+"` is replaced with `field:*`.
These filters can be used for selecting logs where one field value is less than another field value.
These filters complement the `<=` and `<` filters for constant literals.
(cherry picked from commit 30974e7f3f)
This allows using different intervals for flushing in-memory data among different mergeset.Table instances.
The initial user of this feature is lib/logstorage.Storage, which explicitly passes Storage.flushInterval
to every created mergeset.Table instance. Previously mergeset.Table instances used a 5-second
flush interval, which didn't depend on Storage.flushInterval.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4775
Make sure that the maximum log field name, which can be generated by JSONParser.ParseLogMessage,
doesn't exceed the hardcoded limit maxFieldNameSize. Stop flattening nested JSON objects
when the resulting field name becomes longer than maxFieldNameSize, and return the nested JSON object
as a string instead.
This should prevent parse errors when ingesting deeply nested JSON logs with long field names.
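A simplified sketch of this capped flattening (an assumption, not the actual lib/logstorage code; the limit value is illustrative):
```
package jsonflatten

import "encoding/json"

const maxFieldNameSize = 256 // illustrative value, not the actual limit

// flatten turns nested JSON objects into dotted field names, but stops
// once the resulting name would exceed maxFieldNameSize; the remaining
// nested object is then stored as a JSON string instead.
func flatten(prefix string, obj map[string]any, out map[string]string) {
	for k, v := range obj {
		name := k
		if prefix != "" {
			name = prefix + "." + k
		}
		if child, ok := v.(map[string]any); ok && len(name) <= maxFieldNameSize {
			flatten(name, child, out)
			continue
		}
		b, _ := json.Marshal(v)
		out[name] = string(b)
	}
}
```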
- `contains_any` selects logs with fields containing at least one word/phrase from the provided list.
The provided list can be generated by a subquery.
- `contains_all` selects logs with fields containing all the words and phrases from the provided list.
The provided list can be generated by a subquery.
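For example (illustrative queries, assuming the syntax above): `log.level:contains_any("error", "fatal")` selects logs whose log.level field contains either word, while `_msg:contains_all("connection", "refused")` selects logs whose _msg field contains both words.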
Examples:
- `top 5 x, y` is equivalent to `top 5 by (x, y)`
- `uniq foo, bar` is equivalent to `uniq by (foo, bar)`
- `unroll foo, bar` is equivalent to `unroll (foo, bar)`
_time:<=max_time filter must include logs with timestamps matching max_time.
For example, _time:<=2025-02-24Z must include logs with timestamps until the end of February 24, 2025.
Examples:
_time:>=2025-02-24Z selects logs with timestamps greater than or equal to 2025-02-24 UTC
_time:>1d selects logs with timestamps older than one day compared to the current time
This simplifies writing queries with _time filters.
See https://docs.victoriametrics.com/victorialogs/logsql/#time-filter
Inside PARAM-VALUE, the characters '"' (ABNF %d34), '\' (ABNF %d92),
and ']' (ABNF %d93) MUST be escaped. This is necessary to avoid
parsing errors. Escaping ']' would not strictly be necessary but is
REQUIRED by this specification to avoid syslog application
implementation errors. Each of these three characters MUST be
escaped as '\"', '\\', and '\]' respectively. The backslash is used
for control character escaping for consistency with its use for
escaping in other parts of the syslog message as well as in traditional syslog.
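A minimal Go sketch of this escaping rule (an illustration of the RFC requirement, not the actual VictoriaLogs code):
```
package main

import (
	"fmt"
	"strings"
)

// escapeParamValue escapes '"', '\' and ']' inside an RFC 5424
// PARAM-VALUE, as required by section 6.3.3.
func escapeParamValue(s string) string {
	r := strings.NewReplacer(`\`, `\\`, `"`, `\"`, `]`, `\]`)
	return r.Replace(s)
}

func main() {
	fmt.Println(escapeParamValue(`say "hi" [ok] C:\tmp`))
	// Output: say \"hi\" [ok\] C:\\tmp
}
```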
Related RFC:
https://datatracker.ietf.org/doc/html/rfc5424#section-6.3.3
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8282
### Describe Your Changes
By default, stream aggregation and deduplication store a single state
per aggregation output result.
The data for each aggregator is flushed independently once per
aggregation interval. But there's no guarantee that
incoming samples with timestamps close to the aggregation interval's end
will get into it. For example, when aggregating
with `interval: 1m`, a data sample with timestamp 1739473078 (18:57:59)
can fall into the aggregation round `18:58:00` or `18:59:00`.
It depends on network lag, load, clock synchronization, etc. In most
scenarios it doesn't impact aggregation or
deduplication results, which are consistent within a margin of error. But
for metrics represented as a collection of series,
like
[histograms](https://docs.victoriametrics.com/keyconcepts/#histogram),
such inaccuracy leads to invalid aggregation results.
For this case, streaming aggregation and deduplication support a mode with
aggregation windows for the current and previous state. In this mode, the
flush doesn't happen immediately but is shifted by a calculated sample
lag, which improves correctness for delayed data.
Enabling this mode increases resource usage: memory usage is
expected to double, as aggregation will store two states
instead of one. However, this significantly improves the accuracy of
calculations. Aggregation windows can be enabled via
the following settings:
- `-streamAggr.enableWindows` at [single-node
VictoriaMetrics](https://docs.victoriametrics.com/single-server-victoriametrics/)
and [vmagent](https://docs.victoriametrics.com/vmagent/). At
[vmagent](https://docs.victoriametrics.com/vmagent/), the
`-remoteWrite.streamAggr.enableWindows` flag can be specified
individually for each `-remoteWrite.url`.
If one of these flags is set, then all aggregators will use fixed
windows. In conjunction with `-remoteWrite.streamAggr.dedupInterval` or
`-streamAggr.dedupInterval`, fixed aggregation windows are enabled on the
deduplicator as well.
- `enable_windows` option in [aggregation
config](https://docs.victoriametrics.com/stream-aggregation/#stream-aggregation-config).
It allows enabling aggregation windows for a specific aggregator; see the usage example after this list.
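For example (an illustrative invocation, not copied from the docs), running vmagent with `-remoteWrite.streamAggr.enableWindows=true` together with a `-remoteWrite.streamAggr.config` file enables fixed aggregation windows for that `-remoteWrite.url`; adding `-remoteWrite.streamAggr.dedupInterval=1m` enables them on the deduplicator as well.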
### Checklist
The following checks are **mandatory**:
- [ ] My change adheres to the [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit c8fc903669)
Signed-off-by: hagen1778 <roman@victoriametrics.com>