github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2025-04-20 16:09:25 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	4202917eda	lib/protoparser/protoparserutil: optimize ReadUncompressedData for zstd and snappy It is faster to read the whole data and then decompress it in one go for zstd and snappy encodings. This reduces the number of potential read() syscalls and decompress CGO calls needed for reading and decompressing the data.	2025-03-27 15:22:16 +01:00
Aliaksandr Valialkin	f83e780a55	lib/httputil: automatically initialize data transfer metrics for the created HTTP transports via NewTransport()	2025-03-27 15:22:15 +01:00
Dan Dascalescu	1d29bf503d	chore: minor grammar fix in error messages (#8580 ) ### Describe Your Changes `its'` -> `its` ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). (cherry picked from commit `0a49d8c930`)	2025-03-27 10:41:14 +01:00
Max Kotliar	2121c727bd	vmagent: fix stream parse flaky test (#8581) ### Describe Your Changes It was spotted that the test introduced in https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8515#issuecomment-2741063155 was flaky. This PR fixes it. ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). (cherry picked from commit `75995fc4db`)	2025-03-27 10:41:13 +01:00
Aliaksandr Valialkin	4e85206c25	lib/{httputil,promauth}: move functions, which create TLS config and TLS-based HTTP transport, from lib/httputil to lib/promauth - Move lib/httputil.Transport to lib/promauth.NewTLSTransport. Remove the first arg to this function (URL), since it has zero relation to the created transport. - Move lib/httputil.TLSConfig to lib/promauth.NewTLSConfig. Re-use the existing functionality from lib/promauth.Config for creating TLS config. This enables the following features: - Ability to load key, cert and CA files from http urls. - Ability to change the key, cert and CA files without the need to restart the service. It automatically re-loads the new files after they change.	2025-03-26 20:22:33 +01:00
Aliaksandr Valialkin	88e82614bf	lib/httputil: add NewTransport() function for creating pre-initialized net/http.Transport	2025-03-26 20:16:39 +01:00
Aliaksandr Valialkin	e887879a8c	lib/promscrape: rename lib/promscrape/discoveryutils to lib/promscrape/discoveryutil for the sake of consistency of *util package naming	2025-03-26 18:01:37 +01:00
Aliaksandr Valialkin	a7b20ff241	lib: rename lib/influxutils to lib/influxutil for the sake of consistent naming of *util packages	2025-03-26 17:39:01 +01:00
Aliaksandr Valialkin	f3f9141ebb	lib: rename lib/promutils to lib/promutil for the sake of consistency for *util package naming	2025-03-26 17:33:13 +01:00
Aliaksandr Valialkin	e9bd27753b	lib/protoparser: rename lib/protoparser/datadogutils to lib/protoparser/datadogutil for the sake of consistency for *util package naming	2025-03-26 17:13:36 +01:00
Aliaksandr Valialkin	7ee4621617	lib: rename lib/httputils to lib/httputil for the sake of consistency for *util package naming	2025-03-26 16:48:09 +01:00
Aliaksandr Valialkin	420cd074c3	lib/promauth: follow-up for the commit `eefae85450` - Avoid a data race when multiple goroutines access and update roundTripper.trBase inside roundTripper.getTransport(). The way to go is to make sure the roundTripper.trBase is updated only during roundTripper creation, and then can be only read without updating. - Use the http.DefaultTransport for http2 client connections at Kubernetes service discovery. Previously golang.org/x/net/http2.Transport was used there. This had the following issues: - An additional dependency on golang.org/x/net/http2. - Missing initialization of Transport.DialContext with netutil.Dialer.DialContext for http2 client. - Missing initialization of Transport.TLSHandshakeTimeout for http2 client. - Introduction of the lib/promauth.Config.NewRoundTripperFromGetter() method, which is hard to use properly. - Unnecessary complications of the lib/promauth.roundTripper, which led to the data race described above. - Avoid a data race when multiple goroutines access and update tls config shared between multiple net/http.Transport instances at the TLSClientConfig field. The way to go is to always make a copy of the tls config before assigning it to the net/http.Transport.TLSClientConfig field. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5971 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7114	2025-03-26 16:39:37 +01:00
Aliaksandr Valialkin	5a1d828753	lib/promauth: panic when programming error is detected at Config.GetTLSConfig() It is much better to panic instead of returning an error on programming error (aka BUG), since this significantly increases chances that the bug will be noticed, reported and fixed ASAP. The returned error can be ignored, even if it is logged, while panic is much harder to ignore. The code must always panic instead of returning errors when any programming error (aka unexpected state) is detected. This is a follow-up for the commit `9feee15493` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6783 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6771	2025-03-26 15:44:57 +01:00
Artem Fetishev	be43aca14f	lib/{mergeset,storage}: Update MustClose() method comments with the condition when the method must be called (#8568) Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>	2025-03-25 14:49:02 +01:00
Max Kotliar	0778c90901	lib/promscrape: improve streamParse performance Previously, performance of stream.Parse could be limited by mutex.Lock on callback function. It used shared writeContext. With complicated relabeling rules and any slowness at pushData function, it could significantly decrease parsed rows processing performance. This commit removes locks and makes parsed rows processing lock-free in the same manner as `stream.Parse` processing implemented at push ingestion processing. Implementation details: - Removing global lock around stream.Parse callback. - Using atomic operations for counters - Creating write contexts per callback instead of sharing - Improving series limit checking with sync.Once - Optimizing labels hash calculation with buffer pooling - Adding comprehensive tests for concurrency correctness Benchmark performance: ``` # before BenchmarkScrapeWorkScrapeInternalStreamBigData-10 13 81973945 ns/op 37.68 MB/s 18947868 B/op 197 allocs/op # after goos: darwin goarch: arm64 pkg: github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape cpu: Apple M1 Pro BenchmarkScrapeWorkScrapeInternalStreamBigData-10 74 15761331 ns/op 195.98 MB/s 15487399 B/op 148 allocs/op PASS ok github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape 1.806s ``` Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8159 --------- Signed-off-by: Maksim Kotlyar <kotlyar.maksim@gmail.com> Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>	2025-03-20 16:56:05 +01:00
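
Note on `0778c90901`: the general idea of replacing a shared, mutex-protected write context with per-callback contexts and atomic counters can be sketched as follows. This is a simplified illustration under assumed names (`writeContext`, `processBlocks`), not the actual lib/promscrape code.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// writeContext is a hypothetical per-callback scratch buffer.
type writeContext struct {
	rows []string
}

// processBlocks handles every block in its own goroutine
// and returns the total number of processed rows.
func processBlocks(blocks [][]string) int64 {
	var rowsTotal atomic.Int64
	var wg sync.WaitGroup
	for _, block := range blocks {
		wg.Add(1)
		go func(rows []string) {
			defer wg.Done()
			// Every callback owns its context, so no mutex is needed.
			wc := &writeContext{}
			wc.rows = append(wc.rows, rows...)
			// The shared counter is updated atomically instead of under a lock.
			rowsTotal.Add(int64(len(wc.rows)))
		}(block)
	}
	wg.Wait()
	return rowsTotal.Load()
}

func main() {
	blocks := [][]string{{"a", "b"}, {"c"}, {"d", "e", "f"}}
	fmt.Println(processBlocks(blocks))
}
```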
Zakhar Bessarab	2ee91f6c5a	lib/backup/s3remote: add retries for "IncompleteBody" errors These errors could be caused by intermittent network issues, especially when proxies are used for accessing S3 storage. Previously, such an error would abort the backup/restore process and require manual intervention to ensure backup consistency. This commit adds automatic retries for such errors, which improves backup reliability and resilience to network issues.	2025-03-20 15:36:50 +01:00
Andrii Chubatiuk	ba8708af34	lib/streamaggr: fix threshold update, when deduplication and windows are enabled (#8525 ) ### Describe Your Changes during initial flush with deduplication and windows enabled lower timestamps threshold is set to an upper bound of the next deduplication interval, which leads to ignoring all samples on subsequent intervals ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `511517f491`)	2025-03-20 09:56:12 +01:00
Aliaksandr Valialkin	1f17c7f397	lib/chunkedbuffer: add Buffer.Len() method, which returns the byte length of the data stored in the buffer	2025-03-19 14:04:48 +01:00
Aliaksandr Valialkin	04b23fba33	lib/logstorage: typo fix in the comment to Storage.GetStreamFieldValues() function	2025-03-19 14:04:48 +01:00
Aliaksandr Valialkin	a93bb3c22d	lib/logstorage: support for `{field in ()}` and `{field not_in ()}` syntax in LogsQL This is needed for https://github.com/VictoriaMetrics/victorialogs-datasource/issues/238 to be consistent with `in(*)` feature, which has been added in the commit `84d5771b41`	2025-03-19 14:04:48 +01:00
Nikolay	16972a078f	lib/promscrape: properly send staleness markers Previously, vmagent could incorrectly store a partial scrape response in case of a scraping error. This could happen if the `sw.ReadData` call fetched a part of a chunked response and stored it in the buffer, and a context deadline exceeded error happened later. As a result, at the next scrape iteration this partial response could be forwarded to the `sw.sendStaleSeries(lastScrape...)` function call and lead to a `Prometheus line` parsing error. This commit properly sets the response body to an empty value in case of a scraping error. This prevents storing a partial scrape response body, so vmagent no longer sends partial staleness markers to the remote storage. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8528	2025-03-19 14:04:47 +01:00
Aliaksandr Valialkin	c0e9b15606	lib/protoparser: rename lib/protoparser/common to lib/protoparser/protoparserutil This improves readability of the code, which uses this package.	2025-03-18 16:40:06 +01:00
Aliaksandr Valialkin	5cec930842	lib/protoparser/common: limit the maximum memory, which could be occupied by snappy-compressed message at ReadUncompressedData	2025-03-18 11:18:00 +01:00
Alexander Frolov	51e293d351	lib/promrelabel: comment typo (#8520 ) ### Describe Your Changes `prasedRelabelConfig` -> `parsedRelabelConfig` ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). (cherry picked from commit `127d4f37b8`)	2025-03-17 16:44:16 +01:00
Guillem Jover	1d8b7faf71	spelling and grammar fixes via codespell (#8497 ) ### Describe Your Changes Fix many spelling errors and some grammar, including misspellings in filenames. The change also fixes a typo in metric `vm_mmaped_files` to `vm_mmapped_files`. While this is a breaking change, this metric isn't used in alerts or dashboards. So it seems to have low impact on users. The change also deprecates `cspell` as it is much heavier and less usable. --------- Co-authored-by: Andrii Chubatiuk <achubatiuk@victoriametrics.com> Co-authored-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com> (cherry picked from commit `76d205feae`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2025-03-17 16:38:11 +01:00
Aliaksandr Valialkin	d7918d4caa	lib/logstorage: switch the type of LogRows.streamTagCanonicals from [][]byte to []string This reduces the size of LogRows.streamTagCanonicals by 1/3 because of the eliminated `cap` field in the slice header (reflect.SliceHeader) compared to the string header (reflect.StringHeader).	2025-03-17 15:04:27 +01:00
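
Note on `d7918d4caa`: the 1/3 reduction refers to the per-item header size (three machine words for a slice versus two for a string). A minimal check with `unsafe.Sizeof`; the `headerSizes` helper is hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerSizes returns the in-memory sizes of a []byte header
// (pointer, len, cap) and of a string header (pointer, len).
func headerSizes() (sliceHeader, stringHeader uintptr) {
	var b []byte
	var s string
	return unsafe.Sizeof(b), unsafe.Sizeof(s)
}

func main() {
	sliceHeader, stringHeader := headerSizes()
	// On 64-bit platforms this is expected to print 24 and 16.
	fmt.Println(sliceHeader, stringHeader)
}
```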
Aliaksandr Valialkin	0217198d5c	lib/prompb: use clear() function instead of loops for clearing WriteRequest fields inside WriteRequest.Reset This makes the code shorter without losing clarity.	2025-03-17 14:32:02 +01:00
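
Note on `0217198d5c`: the built-in `clear()` (available since Go 1.21) zeroes every item of a slice, which replaces the usual reset loop. A minimal sketch with a hypothetical `label` type, not the actual lib/prompb code:

```go
package main

import "fmt"

// label is a hypothetical stand-in for a label type.
type label struct {
	Name, Value string
}

// resetLabels zeroes all the items, so the strings they reference can be
// garbage-collected, and returns an empty slice backed by the same memory.
func resetLabels(labels []label) []label {
	clear(labels) // replaces: for i := range labels { labels[i] = label{} }
	return labels[:0]
}

func main() {
	labels := []label{{"job", "node"}, {"instance", "host:9100"}}
	labels = resetLabels(labels)
	fmt.Println(len(labels), cap(labels))
}
```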
Aliaksandr Valialkin	d0cbf0ab9c	app/vlinsert/opentelemetry: follow-up for `a884949aba` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8502 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8511	2025-03-16 01:09:38 +01:00
Devops	4fd2cb9102	fix:Fixed an issue where and were incorrectly displayed (#8511 ) ### Describe Your Changes Fixed an issue where and were incorrectly displayed when sent from OpenTelemetry Collector to Victoria Logs Fixes #8502	2025-03-16 01:09:38 +01:00
Aliaksandr Valialkin	6f9d70ae89	lib/{mergeset,storage,logstorage}: use chunked buffer instead of bytesutil.ByteBuffer as a storage for in-memory parts This commit adds lib/chunkedbuffer.Buffer - an in-memory chunked buffer optimized for random access via MustReadAt() function. It is better than bytesutil.ByteBuffer for storing large volumes of data, since it stores the data in chunks of a fixed size (4KiB at the moment) instead of using a contiguous memory region. This has the following benefits over bytesutil.ByteBuffer: - reduced memory fragmentation - reduced memory re-allocations when new data is written to the buffer - reduced memory usage, since the allocated chunks can be re-used by other Buffer instances after Buffer.Reset() call Performance tests show up to 2x memory reduction for VictoriaLogs when ingesting logs with big number of fields (aka wide events) under high speed.	2025-03-15 21:20:04 +01:00
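
Note on `6f9d70ae89`: a chunked buffer with random-access reads can be sketched as below. This is a simplified illustration under assumed names (`chunkedBuffer`, `readAt`), not the lib/chunkedbuffer implementation; in particular it omits chunk re-use across buffers.

```go
package main

import "fmt"

// chunkSize matches the 4KiB chunk size mentioned in the commit message.
const chunkSize = 4 * 1024

// chunkedBuffer is a hypothetical buffer, which stores data in fixed-size
// chunks instead of a single contiguous memory region.
type chunkedBuffer struct {
	chunks [][]byte
	n      int
}

// Write appends p to the buffer, allocating a new chunk only when the last one is full.
func (cb *chunkedBuffer) Write(p []byte) (int, error) {
	written := len(p)
	for len(p) > 0 {
		if len(cb.chunks) == 0 || len(cb.chunks[len(cb.chunks)-1]) == chunkSize {
			cb.chunks = append(cb.chunks, make([]byte, 0, chunkSize))
		}
		last := &cb.chunks[len(cb.chunks)-1]
		k := chunkSize - len(*last)
		if k > len(p) {
			k = len(p)
		}
		*last = append(*last, p[:k]...)
		p = p[k:]
	}
	cb.n += written
	return written, nil
}

// Len returns the number of bytes stored in the buffer.
func (cb *chunkedBuffer) Len() int { return cb.n }

// readAt fills p with the data starting at offset off.
// It panics on out-of-range access, since this is a programming error.
func (cb *chunkedBuffer) readAt(p []byte, off int) {
	if off < 0 || off+len(p) > cb.n {
		panic("BUG: out-of-range read")
	}
	for len(p) > 0 {
		chunk := cb.chunks[off/chunkSize]
		k := copy(p, chunk[off%chunkSize:])
		p = p[k:]
		off += k
	}
}

func main() {
	var cb chunkedBuffer
	data := make([]byte, 10000)
	for i := range data {
		data[i] = byte(i % 251)
	}
	cb.Write(data)

	// Read a span that crosses the boundary between the first two chunks.
	got := make([]byte, 20)
	cb.readAt(got, chunkSize-10)
	fmt.Println(cb.Len(), len(cb.chunks), got[0] == data[chunkSize-10])
}
```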
Aliaksandr Valialkin	9ef0d7002e	lib/logstorage: pre-allocate buffers for fields and rows inside block.appendRowsTo() This reduces the number of memory re-allocations inside the loop, which copies the rows.	2025-03-15 21:20:03 +01:00
Aliaksandr Valialkin	22eec97422	lib/logstorage: pre-allocate buffers for fields and rows inside rows.appendRows() This should reduce the number of memory re-allocations inside the loop, which copies the rows.	2025-03-15 21:20:03 +01:00
Aliaksandr Valialkin	0019621d38	lib/logstorage: pre-allocate the buffer needed for marshaling a block of strings inside marshalStringsBlock This reduces the number of memory re-allocations when appending the strings to the buffer in the loop.	2025-03-15 21:20:02 +01:00
Aliaksandr Valialkin	2f3e55f41f	lib/logstorage: optimize copying dict values inside valuesDict.copyFrom a bit Pre-allocate the needed slice of strings and then assign items to it by index instead of appending them. This reduces the number of memory allocations and improves performance a bit.	2025-03-15 21:20:02 +01:00
Aliaksandr Valialkin	b0ac8c1f35	lib/logstorage: intern column names instead of cloning them during data ingestion This reduces the number of memory allocations when ingesting logs with big number of fields (aka wide events)	2025-03-15 21:20:01 +01:00
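
Note on `b0ac8c1f35`: interning replaces per-row copies of identical column names with a single shared string. A minimal, single-goroutine sketch with a hypothetical `interner` type (the real code may differ, for example in concurrency handling and eviction):

```go
package main

import "fmt"

// interner is a hypothetical string interner, which is not safe
// for concurrent use.
type interner struct {
	m map[string]string
}

func newInterner() *interner {
	return &interner{m: make(map[string]string)}
}

// intern returns a shared string equal to b.
// A new string is allocated only on the first occurrence of b.
func (in *interner) intern(b []byte) string {
	// The Go compiler avoids allocating for string(b) inside a map lookup.
	if s, ok := in.m[string(b)]; ok {
		return s
	}
	s := string(b)
	in.m[s] = s
	return s
}

func main() {
	in := newInterner()
	for _, name := range []string{"host", "level", "host", "host"} {
		in.intern([]byte(name))
	}
	fmt.Println(len(in.m)) // the number of distinct column names
}
```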
Aliaksandr Valialkin	619c9a4eeb	lib/protoparser/common: properly decode snappy-encoded requests Snappy-encoded requests are encoded in block mode instead of stream mode. Stream mode is incompatible with block mode. See https://pkg.go.dev/github.com/golang/snappy That's why Snappy-encoded requests must be read in block mode. Also add a protection against passing invalid readers to PutUncompressedReader(). This is a follow-up for `0451a1c9e0` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8416	2025-03-15 14:45:33 +01:00
Roman Khavronenko	53904f8816	lib/bytesutil: don't drop ByteBuffer.B when its capacity is bigger than 64KB at Reset (#8510) This commit reverts `b58e2ab214` as it has negative impacts when ByteBuffer is used for workloads that always exceed 64KiB size. This significantly slows down affected components because: * buffers aren't being reused; * growing new buffers to >64KiB is very slow. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8501 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). Signed-off-by: hagen1778 <roman@victoriametrics.com>	2025-03-15 01:39:01 +01:00
Aliaksandr Valialkin	32128e5d4b	lib/logstorage: support for `{label in (v1,...,vN)}` and `{label not_in (v1, ..., vN)}` syntax	2025-03-15 01:36:41 +01:00
Aliaksandr Valialkin	f8aeb0e7fc	app/vlinsert: follow-up for `37ed1842ab` - Properly decode protobuf-encoded Loki request if it has no Content-Encoding header. Protobuf Loki message is snappy-encoded by default, so snappy decoding must be used when Content-Encoding header is missing. - Return back the previous signatures of parseJSONRequest and parseProtobufRequest functions. This eliminates the churn in tests for these functions. This also fixes broken benchmarks BenchmarkParseJSONRequest and BenchmarkParseProtobufRequest, which consume the whole request body on the first iteration and do nothing on subsequent iterations. - Put the CHANGELOG entries into correct places, since they were incorrectly put into already released versions of VictoriaMetrics and VictoriaLogs. - Add support for reading zstd-compressed data ingestion requests into the remaining protocols at VictoriaLogs and VictoriaMetrics. - Remove the `encoding` arg from PutUncompressedReader() - it has enough information about the passed reader arg in order to properly deal with it. - Add ReadUncompressedData to lib/protoparser/common for reading uncompressed data from the reader until EOF. This allows removing repeated code across request-based protocol parsers without streaming mode. - Consistently limit data ingestion request sizes, which can be read by ReadUncompressedData function. Previously this wasn't the case for all the supported protocols. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8416 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8380 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8300	2025-03-15 00:11:58 +01:00
Aliaksandr Valialkin	cd7be54731	app/vlinsert: add an ability to ignore log fields starting with the given prefixes The `ignore_fields` HTTP query arg can contain prefixes ending with '*'. For example, `ignore_fields=foo.*,bar` skips all the fields starting with `foo.` during data ingestion.	2025-03-15 00:06:16 +01:00
Aliaksandr Valialkin	5069b253de	lib/logstorage: show a link to query options docs in the error message emitted during failure to parse query options This should help figuring out and fixing the error by the user.	2025-03-15 00:06:15 +01:00
Aliaksandr Valialkin	4b8f2ee5d4	lib/logstorage: optimize handling long constant fields Long constant fields cannot be stored in columnsHeader as a const column, because their size exceeds maxConstColumnValueSize, so they are stored as regular values. This commit optimizes storing such fields by storing only a single value across the field values in a block instead of storing multiple values. This should improve data ingestion performance a bit. This also should improve query performance when the query accesses such fields because of better cache locality. Also improve persisting of constant string lengths by storing them only once.	2025-03-14 03:17:18 +01:00
Aliaksandr Valialkin	46b408d054	lib/logstorage: add a test for marshalUint64Block / unmarshalUint64Block	2025-03-14 03:17:18 +01:00
Aliaksandr Valialkin	375c86b077	lib/logstorage: newTestLogRows: create a const column, which cannot be stored in the column header because its length exceeds maxConstColumnValueSize	2025-03-14 03:17:17 +01:00
f41gh7	dd32d2f99d	lib/protoparser: support zstd in all logs http ingestion, datadog and otel metrics protocols (#8416) This commit introduces common readers for multiple compression encoding algorithms. Currently, supported encodings are: * zstd * gzip * deflate * snappy It adds the new common reader to all the VictoriaLogs ingestion protocols and updates opentelemetry metrics parsing for VictoriaMetrics components. Also, it ports zstd stream parsing from the cluster branch. Related issues: fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8380 fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8300 --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com> Co-authored-by: f41gh7 <nik@victoriametrics.com>	2025-03-14 00:44:50 +01:00
Zhu Jiekun	bcd775098f	app/vmagent: prevent dropping persistent queue if -remoteWrite.showURL changed Previously, if the value of the `-remoteWrite.showURL` command-line flag changed, vmagent dropped the contents of persistent queues. This is unexpected behavior, which may lead to loss of the queued data. Furthermore, if `-remoteWrite.showURL` is set to `true`, any change to url query arguments leads to dropping the persistent queue. This most commonly affects kafka and gcp pub-sub integrations, which use url query arguments for client configuration. It also complicates copying persistent queue contents between vmagents, since this requires properly changing the name inside metainfo.json. This commit removes the persistent queue name equality check from `lib/persistentqueue`. This check was added as an additional protection from on-disk data corruption. It's safe to skip this check for vmagent, because vmagent encodes remoteWrite.url as part of the path to the queue, which guarantees that there will be no collisions. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8477. ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: f41gh7 <nik@victoriametrics.com> Co-authored-by: f41gh7 <nik@victoriametrics.com>	2025-03-14 00:16:52 +01:00
Andrii Chubatiuk	7c2874ff39	lib/awsapi: add EKS Pod Identity auth method AWS introduced a new secure way for Kubernetes Pod authorization at AWS API. The feature is called Pod Identity. It adds the following env variables to the Pod: * AWS_CONTAINER_CREDENTIALS_FULL_URI - endpoint URI served by the EKS Pod Identity Agent running on the worker node. * AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE - projected JWT token that is used to exchange for IAM credentials. See related blog post https://aws.amazon.com/blogs/containers/amazon-eks-pod-identity-a-new-way-for-applications-on-eks-to-obtain-iam-credentials/ related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5780	2025-03-14 00:16:52 +01:00
Zakhar Bessarab	a43c317e8f	lib/httputils: always set up TLS config Previously, TLS config was only created for URLs with `https` scheme. This could lead to unexpected errors when original URL was redirecting to `https` one as TLS config is not applied. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8494	2025-03-14 00:16:52 +01:00
Artem Fetishev	415f1a1527	lib/storage: Deduplication integration test (#8480 ) Add an integration test to confirm that deduplication works for the current month. See #6965. Signed-off-by: Artem Fetishev <rtm@victoriametrics.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2025-03-13 17:09:33 +01:00
Artem Fetishev	ca49ac9c8c	lib/storage: increment indexdb refcount during data ingestion and retrieval (#8437 ) Almost all storage API operations, both ingestion and retrieval, involve writing and/or reading the indexdb. However, during these operations, the indexdb refcount is not incremented. This may lead to panics if indexdb is rotated more than once during these operations. This commit increments the refcount before using indexdb and decrements it after use. Note that rotating indexdb more than once during some operation is an impossible case under normal circumstances as the min retention period is 1 day (i.e. the indexdb will be rotated once per day). However, we want the storage to behave correctly in all cases. Signed-off-by: Artem Fetishev <rtm@victoriametrics.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>	2025-03-13 12:26:52 +01:00
Artem Fetishev	4c853c1dd3	lib/storage: fix active timeseries collection when per-day index is disabled (#8485 ) Fix metric that shows number of active time series when per-day index is disabled. Previously, once per-day index was disabled, the active time series metric would stop being populated and the `Active time series` chart would show 0. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8411. Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>	2025-03-12 17:54:14 +01:00
Aliaksandr Valialkin	ca65aa1cce	lib/logstorage: properly parse floating-point numbers with leading zeroes in fractional part Parsing for floating-point numbers with leading zeroes such as 1.023, 1.00234 has been broken in the commit `ae5e28524e` . Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8464 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8361	2025-03-12 15:29:21 +01:00
Evgeny	c223aade17	lib/promscrape: use original job name as scrapePool value in targets api (#8457) ### Fix scrapePool name If the job name is manipulated via relabeling in the scrape file, then Prometheus shows the original job name as scrapePool in the targets API, while vmagent sets it to the final value, which is wrong. Example: ``` job: consul-targets ... - source_labels: [ __meta_consul_service ] regex: (\w+)[_-]exporter target_label: job replacement: $1 ``` curl to prom API will show `"scrapePool": "consul-targets",` vmagent: `"scrapePool": "node",` before changes: ``` curl -s 'http://localhost:8429/api/v1/targets' | jq -r '.data.activeTargets[].scrapePool'| sort|uniq blackbox pgbackrest postgres ``` after changes ``` curl -s 'http://localhost:8429/api/v1/targets' | jq -r '.data.activeTargets[].scrapePool'| sort|uniq blackbox consul-targets ``` ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `486b9e1c64`)	2025-03-11 13:13:41 +01:00
Andrii Chubatiuk	394654c127	lib/streamaggr: fixed streamaggr panic (#8471 ) ### Describe Your Changes fixes #8469 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). (cherry picked from commit `c174a046e2`)	2025-03-10 13:54:23 +01:00
f41gh7	e24a9d3053	lib/metricnamestats: follow-up after `b85b28d30a` * properly save state for cross-device mount points * properly check empty state for tracker Signed-off-by: f41gh7 <nik@victoriametrics.com>	2025-03-06 23:18:42 +01:00
Nikolay	773b8b0b28	lib/storage: add tracker for time series metric names statistics This feature allows tracking query requests by metric names. Tracker state is stored in memory and is capped by 1/100 of the memory allocated to the storage. If the cap is exceeded, the tracker rejects new items and only registers query requests for already observed metric names. This feature is disabled by default; the new flag `-storage.trackMetricNamesStats` enables it. New API added to the select component: * /api/v1/status/metric_names_stats - which returns a JSON object with usage statistics. * /admin/api/v1/status/metric_names_stats/reset - which resets internal state of the tracker and resets tsid/cache. New metrics were added for this feature: * vm_cache_size_bytes{type="storage/metricNamesUsageTracker"} * vm_cache_size{type="storage/metricNamesUsageTracker"} * vm_cache_size_max_bytes{type="storage/metricNamesUsageTracker"} Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4458 --------- Signed-off-by: f41gh7 <nik@victoriametrics.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2025-03-06 22:10:41 +01:00
Andrii Chubatiuk	c72d5690cc	lib/protoparser/opentelemetry: properly marshal nested attributes into JSON Previously, the opentelemetry attribute parser added extra field names according to the Go JSON marshaling rules for structs: ``` struct AnyValue{ StringValue string } ``` was serialized into: ``` {"StringValue": "some-string"} ``` while opentelemetry-collector serializes it as ``` "some-string" ``` This commit changes this behaviour and makes the parser compatible with the opentelemetry-collector format. See test cases for examples. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8384	2025-03-05 18:38:25 +01:00
hagen1778	a0501d01fd	lib/timeutil: add test for `ParseDuration` See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8403#discussion_r1976110052 Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `6db97d6f79`)	2025-03-03 10:46:23 +01:00
Roman Khavronenko	d5d143f849	lib/promutils: move time-related funcs from `promutils` to `timeutil` (#8403) Since funcs `ParseDuration` and `ParseTimeMsec` are used in vlogs, vmalert, victoriametrics and other components, importing promutils only for this reason makes them export the irrelevant `vm_rows_invalid_total{type="prometheus"}` metric. This change removes the `vm_rows_invalid_total{type="prometheus"}` metric from the /metrics page for these components. ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `63f6ac3ff8`)	2025-03-03 10:28:07 +01:00
Zakhar Bessarab	04b6939c34	lib/promrelabel/scrape_url: properly parse IPv6 address from __address__ label Fix parsing of IPv6 addresses after discovery. Previously, it could lead to target being discovered and discarded afterwards. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8374 --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> (cherry picked from commit `99de272b72`)	2025-02-28 14:20:24 +04:00
Aliaksandr Valialkin	c8a12435ec	lib/logstorage: add ability to specify field name prefixes inside `fields (...)` lists passed to `pack_json` and `pack_logfmt` pipes	2025-02-27 22:56:14 +01:00
Roman Khavronenko	3ec0247ee3	lib/prompbmarshal: move MustParsePromMetrics to protoparser/prometheus (#8405) `MustParsePromMetrics` imports `lib/protoparser/prometheus`, and this package exposes the following metrics: ``` vm_protoparser_rows_read_total{type="promscrape"} vm_rows_invalid_total{type="prometheus"} ``` It means every package that uses `lib/prompbmarshal` will start exposing these metrics. For example, vlogs imports `lib/protoparser/common` which uses `lib/prompbmarshal.Label`. And only because of this vlogs starts exposing unrelated prometheus metrics on /metrics page. Moving `MustParsePromMetrics` to `lib/protoparser/prometheus` seems like the least intrusive change. ----------- Depends on another change https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8403 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2025-02-27 22:55:32 +01:00
Aliaksandr Valialkin	a1aa4b7aa9	lib/logstorage: allow passing `*` at `in()`, `contains_any()` and `contains_all()` Such filters are equivalent to `match all` filter aka `*`. These filters are needed for VictoriaLogs plugin for Grafana. See https://github.com/VictoriaMetrics/victorialogs-datasource/issues/238#issuecomment-2685447673	2025-02-27 11:41:39 +01:00
Zhu Jiekun	6631899ead	lib/storage: properly cache extDB metricsID on search error Previously, if indexDB search failed for some reason during search at previous indexDB (aka extDB), VictoriaMetrics stored empty search result at cache. It could cause incorrect search results at subsequent requests. This commit checks search error and stores request results only on success. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8345	2025-02-26 16:07:48 +01:00
Aliaksandr Valialkin	a3ff49def0	lib/logstorage: do not treat a string with leading zeros as a number at tryParseUint64 The "00123" string shouldn't be treated as 123 number. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8361	2025-02-26 16:07:47 +01:00
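
Note on `a3ff49def0`: the rule "a string with leading zeros is not a number" can be sketched with a small wrapper around `strconv.ParseUint`. The `parseUintStrict` name is hypothetical; this is not the actual `tryParseUint64` implementation.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseUintStrict is a hypothetical helper, which parses s as a decimal
// uint64 and rejects strings with leading zeros such as "00123".
// A single "0" is still accepted.
func parseUintStrict(s string) (uint64, bool) {
	if len(s) > 1 && s[0] == '0' {
		return 0, false
	}
	n, err := strconv.ParseUint(s, 10, 64)
	if err != nil {
		return 0, false
	}
	return n, true
}

func main() {
	for _, s := range []string{"123", "0", "00123", ""} {
		n, ok := parseUintStrict(s)
		fmt.Printf("%q -> %d, %v\n", s, n, ok)
	}
}
```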
Aliaksandr Valialkin	dd1c0e3bb7	lib/logstorage: optimize common regex filters generated by Grafana For example, `field:~".+"`, `field:~".*"` or `field:""`. Replace such filters with faster ones. For example, `field:~".*"` is replaced with `*`, while `field:~".+"` is replaced with `field:*`.	2025-02-25 20:35:04 +01:00
Aliaksandr Valialkin	e36e28a2b0	lib/regexutil: speed up Regex.MatchString for ".*"	2025-02-25 20:35:03 +01:00
Aliaksandr Valialkin	14a5ccdc83	lib/logstorage: run `make fmt` after `30974e7f3f` (cherry picked from commit `82cdcec6c6`)	2025-02-25 19:13:31 +01:00
Aliaksandr Valialkin	9e0581533c	lib/logstorage: add `le_field` and `lt_field` filters These filters can be used for selecting logs where one field value is less than another field value. These filters complement `<=` and `<` filters for constant literals. (cherry picked from commit `30974e7f3f`)	2025-02-25 19:13:31 +01:00
Aliaksandr Valialkin	3bc89226bb	lib/logstorage: optimize eq_filter when it is applied to fields of the same type (cherry picked from commit `edc750dd55`)	2025-02-25 19:13:30 +01:00
Aliaksandr Valialkin	dc09d0bff4	lib/mergeset: explicitly pass the interval for flushing in-memory data to disk at MustOpenTable() This allows using different intervals for flushing in-memory data among different mergeset.Table instances. The initial user of this feature is lib/logstorage.Storage, which explicitly passes Storage.flushInterval to every created mergeset.Table instance. Previously mergeset.Table instances were using 5 seconds flush interval, which didn't depend on the Storage.flushInterval. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4775	2025-02-24 15:34:59 +01:00
Aliaksandr Valialkin	a964cc7a0c	lib/logstorage: properly use datadb.flushInterval as an interval between flushes for the in-memory parts The dataFlushInterval variable has been mistakenly introduced in the commit `9dbd0f9085` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4775	2025-02-24 15:34:59 +01:00
Aliaksandr Valialkin	d56f9327ec	lib/logstorage: limit the maximum log field name length, which can be generated by JSONParser.ParseLogMessage Make sure that the maximum log field name length, which can be generated by JSONParser.ParseLogMessage, doesn't exceed the hardcoded limit maxFieldNameSize. Stop flattening of nested JSON objects when the resulting field name becomes longer than maxFieldNameSize, and return the nested JSON object as a string instead. This should prevent parse errors when ingesting deeply nested JSON logs with long field names.	2025-02-24 15:34:59 +01:00
Aliaksandr Valialkin	dc536d5626	lib/logstorage: add a benchmark for JSONParser.ParseLogMessage	2025-02-24 15:34:58 +01:00
Aliaksandr Valialkin	0d3ee707ba	lib/encoding/zstd: reduce the number of cached zstd.Encoder instances Use the real compression level supported by github.com/klauspost/compress/zstd as a cache map key. The number of real compression levels is smaller than the number of zstd compression levels. This should reduce the number of cached zstd.Encoder instances. See https://github.com/klauspost/compress/discussions/1025 See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7503#issuecomment-2500088591	2025-02-24 15:34:58 +01:00
Aliaksandr Valialkin	3ee4b3ef24	lib/logstorage: add `contains_any` and `contains_all` filters - `contains_any` selects logs with fields containing at least one word/phrase from the provided list. The provided list can be generated by a subquery. - `contains_all` selects logs with fields containing all the words and phrases from the provided list. The provided list can be generated by a subquery.	2025-02-24 15:34:58 +01:00
Aliaksandr Valialkin	3e941920f6	lib/logstorage: do not spend CPU time on preparing values for already filtered out rows according to bm at filterEqField.applyToBlockSearch	2025-02-24 15:34:57 +01:00
Aliaksandr Valialkin	6975352d5a	lib/logstorage: avoid extra memory allocations at getEmptyStrings()	2025-02-24 15:34:57 +01:00
Aliaksandr Valialkin	a2d0846e86	lib/logstorage: add an ability to drop duplicate words at unpack_words pipe	2025-02-24 15:34:57 +01:00
Aliaksandr Valialkin	518ed87a3a	lib/logstorage: rename unpack_tokens to unpack_words pipe The LogsQL defines a word at https://docs.victoriametrics.com/victorialogs/logsql/#word , so it is more natural to use unpack_words instead of unpack_tokens name for the pipe.	2025-02-24 15:34:57 +01:00
Aliaksandr Valialkin	4beceb67ab	lib/logstorage: optimize `OR` filter a bit for many inner filters Use two operations on bitmaps per each inner filter instead of three operations.	2025-02-24 15:34:57 +01:00
Aliaksandr Valialkin	bff5551ba5	lib/logstorage: use clear() for clearing bitmap bits at resetBits() instead of a loop The clear() call is easier to read and understand than the loop.	2025-02-24 15:34:56 +01:00
Aliaksandr Valialkin	4dfd1407ba	lib/logstorage: avoid calling bitmap.reset() at getBitmap() The bitmap at getBitmap() must be already reset when it was returned to the pool via putBitmap(). This saves CPU a bit.	2025-02-24 15:34:56 +01:00
Aliaksandr Valialkin	bc3e557f02	lib/logstorage: improve error logging for improperly escaped backslashes inside quoted strings This should simplify debugging LogsQL queries by users	2025-02-24 15:34:56 +01:00
Aliaksandr Valialkin	1f11bc948e	lib/logstorage: add `field1:eq_field(field2)` filter, which returns logs with identical values at field1 and field2	2025-02-24 15:34:56 +01:00
Aliaksandr Valialkin	504c034cbf	lib/logstorage: optimize `len`, `hash` and `json_array_len` pipes for repeated values Re-use the previous result instead of calculating a new result for repeated input values	2025-02-24 15:34:56 +01:00
Aliaksandr Valialkin	959282090a	lib/logstorage: add `json_array_len` pipe for calculating the length of JSON arrays	2025-02-24 15:34:56 +01:00
Aliaksandr Valialkin	aef939dc20	lib/logstorage: refactor unroll_tokens into unpack_tokens pipe unpack_tokens pipe generates a JSON array of unpacked tokens from the source field. This composes better with other pipes such as unroll pipe.	2025-02-24 15:34:55 +01:00
Aliaksandr Valialkin	afd74d82db	lib/logstorage: add `unroll_tokens` pipe for unrolling individual word tokens from the log field	2025-02-24 15:34:55 +01:00
Aliaksandr Valialkin	2dfd6bb689	lib/logstorage: simplify usage of `top`, `uniq` and `unroll` pipes by allowing comma-separated list of fields without parens Examples: - `top 5 x, y` is equivalent to `top 5 by (x, y)` - `uniq foo, bar` is equivalent to `uniq by (foo, bar)` - `unroll foo, bar` is equivalent to `unroll (foo, bar)`	2025-02-21 12:43:26 +01:00
Aliaksandr Valialkin	061fd098b5	lib/logstorage: properly handle _time:<=max_time filter _time:<=max_time filter must include logs with timestamps matching max_time. For example, _time:<=2025-02-24Z must include logs with timestamps until the end of February 24, 2025.	2025-02-21 12:43:26 +01:00
Aliaksandr Valialkin	80d173471f	lib/logstorage: allow using '>', '>=', '<' and '<=' in '_time:...' filter Examples: `_time:>=2025-02-24Z` selects logs with timestamps greater than or equal to 2025-02-24 UTC; `_time:>1d` selects logs with timestamps older than one day compared to the current time. This simplifies writing queries with _time filters. See https://docs.victoriametrics.com/victorialogs/logsql/#time-filter	2025-02-21 12:43:26 +01:00
Hui Wang	93bbe10074	app/vmselect: add query resource limits priority This commit adds support for overriding vmstorage `maxUniqueTimeseries` with specific resource limits: 1. `-search.maxLabelsAPISeries` for [/api/v1/labels](https://docs.victoriametrics.com/url-examples/#apiv1labels), [/api/v1/label/.../values](https://docs.victoriametrics.com/url-examples/#apiv1labelvalues) 2. `-search.maxSeries` for [/api/v1/series](https://docs.victoriametrics.com/url-examples/#apiv1series) 3. `-search.maxTSDBStatusSeries` for [/api/v1/status/tsdb](https://docs.victoriametrics.com/#tsdb-stats) 4. `-search.maxDeleteSeries` for [/api/v1/admin/tsdb/delete_series](https://docs.victoriametrics.com/url-examples/#apiv1admintsdbdelete_series) Currently, this limit priority logic cannot be applied to flags `-search.maxFederateSeries` and `-search.maxExportSeries`, because they share the same RPC `search_v7` with the /api/v1/query and /api/v1/query_range APIs, preventing vmstorage from identifying the actual API of the request. To address that, we need to add additional information to the protocol between vmstorage and vmselect, which should be introduced in the future when possible. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7857	2025-02-19 18:14:54 +01:00
Andrii Chubatiuk	94bf90842a	app/vlinsert/syslog: properly parse log line with characters escaped by rfc5424 Inside PARAM-VALUE, the characters '"' (ABNF %d34), '\' (ABNF %d92), and ']' (ABNF %d93) MUST be escaped. This is necessary to avoid parsing errors. Escaping ']' would not strictly be necessary but is REQUIRED by this specification to avoid syslog application implementation errors. Each of these three characters MUST be escaped as '\"', '\\', and '\]' respectively. The backslash is used for control character escaping for consistency with its use for escaping in other parts of the syslog message as well as in traditional syslog. Related RFC: https://datatracker.ietf.org/doc/html/rfc5424#section-6.3.3 Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8282	2025-02-19 18:12:40 +01:00
Andrii Chubatiuk	99de7456c3	lib/protoparser/influx: add -influx.forceStreamMode flag to force parsing all Influx data in stream mode (#8319) Addresses #8269 Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2025-02-19 17:40:32 +01:00
Andrii Chubatiuk	a041488786	lib/streamaggr: added aggregation windows (#6314) ### Describe Your Changes By default, stream aggregation and deduplication stores a single state per each aggregation output result. The data for each aggregator is flushed independently once per aggregation interval. But there's no guarantee that incoming samples with timestamps close to the aggregation interval's end will get into it. For example, when aggregating with `interval: 1m` a data sample with timestamp 1739473078 (18:57:59) can fall into aggregation round `18:58:00` or `18:59:00`. It depends on network lag, load, clock synchronization, etc. In most scenarios it doesn't impact aggregation or deduplication results, which are consistent within margin of error. But for metrics represented as a collection of series, like [histograms](https://docs.victoriametrics.com/keyconcepts/#histogram), such inaccuracy leads to invalid aggregation results. For this case, streaming aggregation and deduplication support a mode with aggregation windows for current and previous state. With this mode, flush doesn't happen immediately but is shifted by a calculated samples lag that improves correctness for delayed data. Enabling this mode increases resource usage: memory usage is expected to double as aggregation will store two states instead of one. However, this significantly improves accuracy of calculations. Aggregation windows can be enabled via the following settings: - `-streamAggr.enableWindows` at [single-node VictoriaMetrics](https://docs.victoriametrics.com/single-server-victoriametrics/) and [vmagent](https://docs.victoriametrics.com/vmagent/). At [vmagent](https://docs.victoriametrics.com/vmagent/) `-remoteWrite.streamAggr.enableWindows` flag can be specified individually per each `-remoteWrite.url`. If one of these flags is set, then all aggregators will be using fixed windows. 
In conjunction with `-remoteWrite.streamAggr.dedupInterval` or `-streamAggr.dedupInterval` fixed aggregation windows are enabled on deduplicator as well. - `enable_windows` option in [aggregation config](https://docs.victoriametrics.com/stream-aggregation/#stream-aggregation-config). It allows enabling aggregation windows for a specific aggregator. ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `c8fc903669`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2025-02-19 13:31:37 +01:00
hagen1778	bb302df170	lib/logstorage: adjust expected compression ratio in tests A follow-up after `9bb5ba5d2f` that impacted compression ratio for data compressed with native Go zstd lib (`make test-pure`). Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `38bded4e58`)	2025-02-19 13:30:05 +01:00
Aliaksandr Valialkin	697b775a46	lib/logstorage: remove optimizations from LogRows.sortFieldsInRows It turned out that these optimizations do not give measurable performance improvements, while they complicate the code too much and may result in slowdown when the ingested logs have different sets of fields. This is a follow-up for `630601488e` (cherry picked from commit `dce5eb88d3`)	2025-02-19 13:30:04 +01:00
Aliaksandr Valialkin	d0d9fb2818	lib/logstorage: revert the maximum number of files for log fields data from 256 back to 128 It turned out that 256 files increase RAM usage too much compared to 128 files when ingesting logs with hundreds of fields (aka wide events). So let's return to the 128 files limit for now. This is a follow-up for `9bb5ba5d2f` (cherry picked from commit `a50ab10998`)	2025-02-19 13:30:04 +01:00
Aliaksandr Valialkin	0a8d52376e	lib/bytesutil: drop ByteBuffer.B when its capacity is bigger than 64KB at Reset There is little sense in keeping too big buffers - they just waste RAM and do not reduce the load on GC too much. So it is better to drop such buffers at Reset instead of keeping them around. (cherry picked from commit `b58e2ab214`)	2025-02-19 13:30:03 +01:00


3079 commits