Commit graph

1891 commits

Author SHA1 Message Date
Aliaksandr Valialkin
43a4dcdaf8
lib/promscrape/discovery/nomad: sync nomad_sd_configs fields with the Prometheus implementation
See the list of configs supported by Prometheus at f88a0a7d83/discovery/nomad/nomad.go (L76-L84)

- Removed "token" option. In can be set either via NOMAD_TOKEN env var or via `bearer_token` config option.
- Removed "scheme" option. It is automatically detected depending on whether the `tls_config` is set.
- Removed "services" and "tags" options, since they aren't supported by Prometheus.
- Added "region" option. If it is missing, then the region is read from NOMAD_REGION env var.
  If this var is empty, then it is set to "global" in the same way as Nomad client does.
  See 865ee8d37c/api/api.go (L297)
  and 865ee8d37c/api/api.go (L555-L556)
- If the "server" option is missing, then it is read from NOMAD_ADDR in the same way
  as Nomad client does - see 865ee8d37c/api/api.go (L294-L296)

This is a follow-up for 8aee209c53

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367
2023-01-09 21:30:19 -08:00
Roman Khavronenko
ca5136a0ee
lib/promscrape: remove datacenter field from nomad_sd_config (#3612)
Looks like `datacenter` field isn't part of `/v1/services` API.
See https://developer.hashicorp.com/nomad/api-docs/services#list-services
and https://developer.hashicorp.com/nomad/api-docs/services#read-service

Related issues:
https://github.com/traefik/traefik/issues/9109
https://github.com/prometheus/prometheus/issues/11776

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-01-09 21:24:46 -08:00
Aliaksandr Valialkin
7792ba3272
lib/promscrape/discoveryutils: cleanup after 5df9fddaf2
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468
2023-01-07 01:27:16 -08:00
Zakhar Bessarab
e8624fd781
lib/promscrape/discoveryutils: use correct timeout for blocking requests (#3609)
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-01-07 01:27:10 -08:00
Aliaksandr Valialkin
eb9a542c1f
lib/storage: simplify the fix from 488940502c
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3566
2023-01-07 01:11:35 -08:00
Dmytro Kozlov
f739e44802
lib/storage: fix returning camelcase label names (#3608)
* lib/storage: fix returning camelcase label names

* doc: add change log

* Update docs/CHANGELOG.md

* Update docs/CHANGELOG.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-07 01:11:10 -08:00
Aliaksandr Valialkin
c630115be0
lib/streamaggr: limit the the number of concurrent flushes of the aggregate data to the exact number of available CPUs
This should reduce the maximum memory usage during concurrent flushes of the aggregate data
2023-01-07 00:19:34 -08:00
Aliaksandr Valialkin
0a14b7bb82
lib/promscrape: reduce the number of concurrently executed processScrapedData calls from 2x of the number of CPUs to the number of CPUs
This should reduce the maximum memory usage for processScrapedData() function by 2x.
The only part, which can be IO-bound in the processScrapedData() is pushData() call,
when it buffers data to persistent queue if the remote storage cannot keep up
with the data ingestion speed. In this case it is OK if the scrape pace will be limited.
2023-01-07 00:17:52 -08:00
Aliaksandr Valialkin
5876821a16
all: small improvements in error messages and command-line flag descriptions related to concurrency limiters 2023-01-07 00:12:24 -08:00
Aliaksandr Valialkin
3864357772
lib/writeconcurrencylimiter: moved the error generation from incConcurrency() to the caller place 2023-01-07 00:01:44 -08:00
Aliaksandr Valialkin
7fb02f536a
lib/promscrape: limit the concurrency during parsing and relabeling the scraped samples
This should reduce memory usage when scraping big number of targets,
since this limits the summary memory usage during concurrent parsing and relabeling
by the number of available CPU cores.
2023-01-06 23:01:18 -08:00
Aliaksandr Valialkin
3461ae8f13
lib/streamaggr: limit the number of concurrent flushes of aggregate metrics in order to limit memory usage 2023-01-06 22:40:19 -08:00
Aliaksandr Valialkin
2ca48444e2
lib/vmselectapi: typo fix after 20e9598254 2023-01-06 22:13:32 -08:00
Aliaksandr Valialkin
b275983403
lib/writeconcurrencylimiter: improve the logic behind -maxConcurrentInserts limit
Previously the -maxConcurrentInserts was limiting the number of established client connections,
which write data to VictoriaMetrics. Some of these connections could be idle.
Such connections do not consume big amounts of CPU and RAM, so there is a little sense in limiting
the number of such connections. So now the -maxConcurrentInserts command-line option
limits the number of concurrently executed insert requests, not including idle connections.

It is recommended removing -maxConcurrentInserts command-line option, since the default value
for this option should work good for most cases.
2023-01-06 22:07:16 -08:00
Aliaksandr Valialkin
20e9598254
lib/vmselectapi: limit the number of concurrently executed requests
This should prevent from out of memory errors when big number of vmselect
nodes send many concurrent requests to vmstorage

The limit can be controlled at vmstorage via the following command-line flags:
- search.maxConcurrentRequests
- search.maxQueueDuration

See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#resource-usage-limits
2023-01-06 18:39:46 -08:00
Aliaksandr Valialkin
be896ddfd4
lib/protoparser/clusternative: typo fix in the comment: thic -> this 2023-01-06 18:16:25 -08:00
Aliaksandr Valialkin
ec7a3b79ab
lib/promscrape/discovery/{consul,nomad}: wait until the deleted serviceWatchers are stopped inside updateServices() call
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367
2023-01-05 21:53:08 -08:00
Aliaksandr Valialkin
54410bf51b
lib/promscrape: follow-up after bced9fb978
- Document the bugfix at docs/CHANGELOG.md
- Wait until all the worker goroutines are done in consulWatcher.mustStop()
- Do not log `context canceled` errors when discovering consul serviceNames
- Removed explicit handling of gzipped responses at lib/promscrape/discoveryutils.Client,
  since this handling is automatically performed by net/http.Transport.
  See DisableCompression option at https://pkg.go.dev/net/http#Transport .
- Remove explicit handling of the proxyURL, since it is automatically handled
  by net/http.Transport. See Proxy option at https://pkg.go.dev/net/http#Transport .
- Expliticly set MaxIdleConnsPerHost, since its default value equals to 2.
  Such a small value may result in excess tcp connection churn
  when more than 2 concurrent requests are processed by lib/promscrape/discoveryutils.Client.
- Do not set explicitly the `Host` request header, since it is automatically set by net/http.Client.
- Backport the bugfix to the recently added nomad_sd_configs - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3468
2023-01-05 21:23:21 -08:00
Zakhar Bessarab
de5aad2cde
lib/promscrape/discoveryutils: switch to native http client from fasthttp (#3568) 2023-01-05 21:23:15 -08:00
Roman Khavronenko
57277ed6bc
vmstorage: add more context to the flock acquiring msg (#3584)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3578

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-01-05 18:32:53 -08:00
Aliaksandr Valialkin
750d309f63
lib/promscrape/discovery/nomad: follow-up after 48f371a46c
- Remove undocumented `username` and `password` config options from `nomad_sd_config`.
  TODO: probably, remove these options from `consul_sd_config` too?
  These options exist there for backwards compatibility purposes.

- Add __meta_nomad_service_alloc_id and __meta_nomad_service_job_id meta-labels
  These labels contain AllocID and JobID fields for the discovered Nomad services.

- Various typo fixes.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3367
2023-01-05 18:09:23 -08:00
Karan Sharma
8f42c5a024
lib/promscrape: add Prometheus-compatible service discovery for Nomad (#3549)
Add nomad_sd_config support for service discovery
2023-01-05 18:07:02 -08:00
Aliaksandr Valialkin
6eda4c6da2
lib/promscrape: use strconv.Atoi instead of strconv.ParseInt for parsing -promscrape.cluster.memberNum
In this case there is no need in converting int64 to int
2023-01-05 16:46:03 -08:00
Aliaksandr Valialkin
1af6e0b233
lib/promrelabel: pass query args via query string at /metric-relabel-debug and /target-relabel-debug pages if their length doesnt exceed 1000
This allows copy-n-pasting the url to another browser window and seeing the same result.

The limit in 1000 chars is selected in order to prevent from potential issues with systems
which limit the url length such as Internet Explorer - see https://stackoverflow.com/questions/812925/what-is-the-maximum-possible-length-of-a-query-string

If the limit is exceeded, then query args are sent via POST method and aren't visible in the url.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3580
2023-01-05 16:45:42 -08:00
Zakhar Bessarab
99f9b02283
lib/promscrape/discovery/dockerswarm: fix query encoding of filters (#3586)
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-01-05 03:35:18 -08:00
Aliaksandr Valialkin
1d16cc9349
lib/promscrape: pre-fetch metric_relabel_configs rules when debugging metric relabeling for a particular target
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407
2023-01-05 03:28:14 -08:00
Aliaksandr Valialkin
634e24e685
lib/promscrape: follow-up for a7e29c38bc
- Document the bugfix at docs/CHANGELOG.md
- Make the fix more durable against future changes when droppedTargetsMap.Register may be called from other places.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3580
2023-01-05 02:51:45 -08:00
Zakhar Bessarab
52226c392f
lib/promscrape/targetstatus: fix crash during droppedTarget registration (#3595)
* lib/promscrape/targetstatus: fix crash during droppedTarget registration in case original labels are not present

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/promscrape/targetstatus: address review comment

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-01-05 02:48:39 -08:00
Aliaksandr Valialkin
c97d6ed6a4
lib/streamaggr: sort by and without labels in the aggregate output metric name
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3460
2023-01-05 02:08:59 -08:00
Aliaksandr Valialkin
ccec8c26ed
lib/streamaggr: remove unused fields 2023-01-04 13:33:21 -08:00
Aliaksandr Valialkin
ebb8aeb0cf
app/vmselect: remove dependency on lib/promscrape from app/vmselect 2023-01-03 23:27:36 -08:00
Aliaksandr Valialkin
3369371636
app/{vmagent,vminsert}: add support for streaming aggregation
See https://docs.victoriametrics.com/stream-aggregation.html

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3460
2023-01-03 22:22:07 -08:00
Aliaksandr Valialkin
d3f8298739
lib/bytesutil: add InternBytes() function as a shortcut to InternString(ToUnsafeString(..)) 2023-01-03 22:15:49 -08:00
Aliaksandr Valialkin
a7942c6c0d
lib/promrelabel: allow calling Match on nil IfExpression
This simplifies the caller side of IfExpression
2022-12-30 16:47:59 -08:00
Roman Khavronenko
c22122212d
csvimport: support empty values (#3565)
Before, if the imported line contained multiple metrics and one
or more of them had an empty values - the whole line was ignored.

Now, only metrics with empty values are ignored, and the rest
of the metrics are accepted successfully.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3540

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-12-29 11:56:02 -08:00
Aliaksandr Valialkin
8e9548f050
lib/promscrape: log the actual response size in the error message when the response size exceeds -promscrape.maxScrapeSize
This is a follow-up for 7ad9fff7e5
2022-12-28 14:42:45 -08:00
Aliaksandr Valialkin
8dc04a86f6
lib/{storage,mergeset}: tune the threshold for assisted merge
The https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425#issuecomment-1359117221
reveals that CPU usage for incoming queries may significantly increase when the number
of in-memory parts becomes too big.

This commit reduces the maximum number of in-memory parts before starting the assisted merge
during data ingestion. This should reduce CPU usage for incoming queries,
since they need to inspect lower number of in-memory parts.

This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
2022-12-28 14:42:45 -08:00
Clément Nussbaumer
04d536c15a
fix(promscrape): check MaxScrapeSize after gzip decompression (#3550) 2022-12-28 14:42:45 -08:00
Aliaksandr Valialkin
ae0a77d778
lib/snapshot: improve log message on unexpected status code during attempts to create or delete snapshots
Use "unexpected status code returned from %q: %d; expecting %d" log message format
instead of less clear format "unexpected status code returned from %q; expecting %d; got %d"

This is a follow-up for c612bb165e
2022-12-28 11:46:23 -08:00
Zakhar Bessarab
990c874b25
lib/snapshot: fix error message format for failed HTTP request (#3559) 2022-12-28 11:44:06 -08:00
Aliaksandr Valialkin
c2fc996e01
lib/promscrape/discovery/azure: typo fix 2022-12-21 21:25:25 -08:00
Aliaksandr Valialkin
7888712185
lib/promrelabel: make fmt after d3de110070 2022-12-21 20:25:37 -08:00
Aliaksandr Valialkin
cc482b89a3
lib/promrelabel: add support for keepequal and dropequal relabeling actions
These actions are supported by Prometheus starting from v2.41.0

See https://github.com/prometheus/prometheus/pull/11564 ,
https://github.com/prometheus/prometheus/issues/11556
and https://github.com/prometheus/prometheus/issues/3756

Side note:

It's a pity that Prometheus developers decided inventing `keepequal` and `dropequal`
relabeling actions instead of adding support for `keep_if_equal` and `drop_if_equal` relabeling
actions supported by VictoriaMetrics since June 2020 - see 2a39ba639d .
2022-12-21 20:06:09 -08:00
Aliaksandr Valialkin
fcee36081b
lib/bytesutil: make sure that the cleanup code is performed only by a single goroutine out of many concurrently running goroutines
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3466
2022-12-21 12:58:27 -08:00
Zakhar Bessarab
decf46d72b
app/vmbackupmanager: add metrics for better observability (#488)
* app/vmbackupmanager: add metrics for better observability, include more information to `/api/v1/backups` API call response

* app/vmbackupmanager: drop old metrics before creating new ones

* app/vmbackupmanager: use `_total` postfix for counter metrics

* app/vmbackupmanager: remove `_total` postfix for gauge-like metrics

* app/vmbackupmanager: add `_last_run_failed` metrics for backups and retention

* app/vmbackupmanager: address review feedback

* app/vmbackupmanager: fix metric name

* app/vmbackupmanager: address review feedback, remove background updates of metrics, add restoring state of `_last_run_failed` metric from remote storage

* app/vmbackupmanager: improve performance for backup size calculation

* app/vmbackupmanager: refactor backup and retention runs to deduplicate each run logic

* {app/vmbackupmanager,lib/formatutil}: move HumanizeBytes into lib package

* app/vmbackupmanager: fix creating new metrics instead of reusing existing ones

* lit/formatutil: add comment to make linter happy

* app/vmbackupmanager: address review feedback
2022-12-20 14:18:43 -08:00
Aliaksandr Valialkin
1ff62629f4
lib/storage: clear the err if it is set to io.EOF when searching for the TSID by metricID
This is expected error after when recently added indexdb data isn't available for search yet
or wasn't flushed to disk after unclean shutdown of VictoriaMetrics.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3515
2022-12-20 14:05:53 -08:00
Aliaksandr Valialkin
2184de3bf2
lib/storage: do not check for the result returned by db.doExtDB() where this isn't necessary
This simplifies the code a bit
2022-12-19 13:23:30 -08:00
Aliaksandr Valialkin
9330da3195
lib/promscrape/discovery/consul: expose service tags in individual labels __meta_consul_tag_<tagname>
This simplifies copying service tags to target labels with the following relabeling rule:

- action: labelmap
  regex: __meta_consul_tag_(.+)

See https://stackoverflow.com/questions/44339461/relabeling-in-prometheus
2022-12-19 13:02:56 -08:00
Aliaksandr Valialkin
11bd290201
lib/storage: search for TSIDs for the given metricIDs in the previous indexdb if they aren't found in the current indexdb
The issue triggers after the indexdb rotation for time series, which stop receiving new samples.
This results in missing data for such time series in query responses.

This commit should address the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3502

The issue has been introduced in 2dd93449d8
2022-12-19 11:56:49 -08:00
Aliaksandr Valialkin
8c08d625ee
lib/storage: optimize partSearch.searchBHS() for common case when the TSID for the current block header is bigger or equal to the current tsid
This should help improving performance at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
2022-12-19 10:31:39 -08:00
Aliaksandr Valialkin
512c73cef9
lib/storage: properly set buf capacity inside marshalMetricID
Previously it was always set to 0. In theory this could result into incorrect marshaling
of metricIDs.

The issue has been introduced in 5e4dfe50c6
2022-12-19 10:31:38 -08:00
Aliaksandr Valialkin
2fad03d85e
lib/logger: follow-up for 72f8fce107
- Document the change at docs/CHANELOG.md
- Log fatal errors if the -loggerJSONFields contains unexpected values
- Rename -loggerJsonFields to -loggerJSONFields for the sake of consistency naming commonly used in Go

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2348
2022-12-16 17:44:05 -08:00
Michal Kralik
cffd2f79a1
lib/logger: support for renaming json fields (#3488) 2022-12-16 17:43:39 -08:00
Aliaksandr Valialkin
70c720d640
lib/logger: follow-up for 72f8fce107
- Document the change at docs/CHANELOG.md
- Log fatal errors if the -loggerJSONFields contains unexpected values
- Rename -loggerJsonFields to -loggerJSONFields for the sake of consistency naming commonly used in Go

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2348
2022-12-16 17:42:31 -08:00
Aliaksandr Valialkin
2e597555ec
lib/promscrape: stop dropping metric name if relabeling rules do not instruct to do this on the /metric-relabel-debug page 2022-12-16 16:44:12 -08:00
Aliaksandr Valialkin
fbeebe4869
lib/storage: skip missing tsids in the current block header by using binary search
This improves performance by up to 10x when big number of the requested TSIDs
are missing in the searched parts.

This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
2022-12-14 22:07:55 -08:00
Aliaksandr Valialkin
1a88fe5b1f
lib/flagutil/bytes.go: properly handle values bigger than 2GiB on 32-bit architectures
This fixes handling of values bigger than 2GiB for the following command-line flags:

- -storage.minFreeDiskSpaceBytes
- -remoteWrite.maxDiskUsagePerURL
2022-12-14 19:29:57 -08:00
Aliaksandr Valialkin
3a28a52667
lib/flagutil: support for TB and TiB suffixes for command-line flags, which accept byte sizes 2022-12-14 17:53:18 -08:00
Zakhar Bessarab
1e58eabde6
lib/backup/azremote: fix copying for parts larger than 256M by using async copy (#3479)
* lib/backup/azremote: fix copying for parts larger than 256M by using async copy

* lib/backup/azremote: add description of an error for log message
2022-12-13 09:36:36 -08:00
Aliaksandr Valialkin
ea7940e5a7
lib/mergeset: reduce the parts threshold before starting assisted merges
This should improve query speed in general case.

This is a follow-up for d1af6046c7
2022-12-13 09:14:08 -08:00
Aliaksandr Valialkin
2a190f6451
lib/{mergeset,storage}: do not block small merges by pending big merges - assist with small merges instead
Blocked small merges may result into big number of small parts, which, in turn,
may result in increased CPU and memory usage during queries, since queries need to inspect
all the existing small parts.

The issue has been introduced in 8189770c50

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2022-12-12 17:01:33 -08:00
Aliaksandr Valialkin
f3e5c9c246
lib/bytesutil: cache results for all the input strings, which were passed during the last 5 minutes from FastStringMatcher.Match(), FastStringTransformer.Transform() and InternString()
Previously only up to 100K results were cached.
This could result in sub-optimal performance when more than 100K unique strings were actually used.
For example, when the relabeling rule was applied to a million of unique Graphite metric names
like in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3466

This commit should reduce the long-term CPU usage for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3466
after all the unique Graphite metrics are registered in the FastStringMatcher.Transform() cache.

It is expected that the number of unique strings, which are passed to FastStringMatcher.Match(),
FastStringTransformer.Transform() and to InternString() during the last 5 minutes,
is limited, so the function results fit memory. Otherwise OOM crash can occur.
This should be the case for typical production workloads.
2022-12-12 14:47:00 -08:00
Aliaksandr Valialkin
87390443d1
lib/protoparser/datadog: do not re-use previously parsed field values if they are missing in the currently parsed message
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3432
2022-12-11 13:09:41 -08:00
Aliaksandr Valialkin
a521135b7b
lib/promscrape: allow editing relabeling configs and labels at /target-relabel-debug page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407
2022-12-10 12:47:47 -08:00
Aliaksandr Valialkin
97b41e727c
lib/promscrape: implement target-level and metric-level relabel debugging
Target-level debugging is performed by clicking the 'debug' link at the corresponding target
on either http://vmagent:8429/targets page or on http://vmagent:8428/service-discovery page.

Metric-level debugging is perfromed at http://vmagent:8429/metric-relabel-debug page.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407

See https://docs.victoriametrics.com/vmagent.html#relabel-debug
2022-12-10 02:25:56 -08:00
Aliaksandr Valialkin
04abd5e113
docs/CHANGELOG.md: document the bugfix at 05b42601c3
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3247
2022-12-08 18:35:47 -08:00
Zakhar Bessarab
c939a8e8a2
lib/promscrape/discovery/azure: remove API server from URL returned by azure (#3403)
* lib/promscrape/discovery/azure: remove API server from URL returned by azure

* lib/promscrape/discovery/azure: validate nextLink contains same URL as apiServer
2022-12-08 18:35:46 -08:00
Aliaksandr Valialkin
6e390f3b99
lib/querytracer: fix remaining tests after 49ebc48809 2022-12-08 18:18:50 -08:00
Aliaksandr Valialkin
e56d5e1918
lib/storage: follow-up after 7c0ae3a86a
- Update docs at https://docs.victoriametrics.com/#deduplication
- Optimize the deduplication loop a bit

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333
2022-12-08 18:18:49 -08:00
Roman Khavronenko
909cd04c55
lib/storage: keep sample with the biggest value on timestamp conflict (#3421)
The change leaves raw sample with the biggest value for identical
timestamps per each `-dedup.minScrapeInterval` discrete interval
when the deduplication is enabled.

```
benchstat old.txt new.txt
name                                         old time/op    new time/op    delta
DeduplicateSamples/minScrapeInterval=1s-10      817ns ± 2%     832ns ± 3%      ~     (p=0.052 n=10+10)
DeduplicateSamples/minScrapeInterval=2s-10     1.56µs ± 1%    2.12µs ± 0%   +35.19%  (p=0.000 n=9+7)
DeduplicateSamples/minScrapeInterval=5s-10     1.32µs ± 3%    1.65µs ± 2%   +25.57%  (p=0.000 n=10+10)
DeduplicateSamples/minScrapeInterval=10s-10    1.13µs ± 2%    1.50µs ± 1%   +32.85%  (p=0.000 n=10+10)

name                                         old speed      new speed      delta
DeduplicateSamples/minScrapeInterval=1s-10   10.0GB/s ± 2%   9.9GB/s ± 3%      ~     (p=0.052 n=10+10)
DeduplicateSamples/minScrapeInterval=2s-10   5.24GB/s ± 1%  3.87GB/s ± 0%   -26.03%  (p=0.000 n=9+7)
DeduplicateSamples/minScrapeInterval=5s-10   6.22GB/s ± 3%  4.96GB/s ± 2%   -20.37%  (p=0.000 n=10+10)
DeduplicateSamples/minScrapeInterval=10s-10  7.28GB/s ± 2%  5.48GB/s ± 1%   -24.74%  (p=0.000 n=10+10)
```

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3333
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-08 18:18:36 -08:00
Aliaksandr Valialkin
323fc14e0a
lib/querytracer: fix tests after 49ebc48809 2022-12-08 17:21:47 -08:00
Aliaksandr Valialkin
a13f16e48a
lib/promscrape: allow using sample_limit and series_limit options in stream parsing mode
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3458
2022-12-08 17:04:12 -08:00
Aliaksandr Valialkin
255c04bc20
lib/querytracer: put the version of VictoriaMetrics in the first message of query trace
This should simplify further debugging, since the first thing to start the debugging by query trace
is to know the version of VictoriaMetrics, which produced this trace.
2022-12-07 09:49:51 -08:00
Pedro Gonçalves
84cfe4fcaf
Datadog - Add device as a tag if it's present as a field in the series object (#3431)
* Datadog - Add device as a tag if it's present as a field in the series object

* address PR comments
2022-12-05 23:10:42 -08:00
Aliaksandr Valialkin
0a9992a9c6
lib/{storage,mergeset}: log the duration for flushing in-memory parts on graceful shutdown 2022-12-05 21:55:21 -08:00
Aliaksandr Valialkin
7d5c64eb7a
all: add -inmemoryDataFlushInterval command-line flag for controlling the frequency of saving in-memory data to disk
The main purpose of this command-line flag is to increase the lifetime of low-end flash storage
with the limited number of write operations it can perform. Such flash storage is usually
installed on Raspberry PI or similar appliances.

For example, `-inmemoryDataFlushInterval=1h` reduces the frequency of disk write operations
to up to once per hour if the ingested one-hour worth of data fits the limit for in-memory data.

The in-memory data is searchable in the same way as the data stored on disk.
VictoriaMetrics automatically flushes the in-memory data to disk on graceful shutdown via SIGINT signal.
The in-memory data is lost on unclean shutdown (hardware power loss, OOM crash, SIGKILL).

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337
2022-12-05 15:28:09 -08:00
Aliaksandr Valialkin
9ac1174493
lib/{mergeset,storage}: add start background workers via startBackgroundWorkers() function 2022-12-04 00:01:14 -08:00
Aliaksandr Valialkin
a13d21513e
lib/mergeset: panic when too long item is passed to Table.AddItems() 2022-12-03 23:37:20 -08:00
Aliaksandr Valialkin
dccd70ce10
lib/storage: remove duplicate logging for filepath on errors 2022-12-03 23:15:28 -08:00
Aliaksandr Valialkin
813e8402f6
lib/storage: pass a single arg - rowsPerBlock - to getCompressLevel() function instead of two args 2022-12-03 23:10:26 -08:00
Aliaksandr Valialkin
bb93494eac
lib/{storage,mergeset}: use a single sync.WaitGroup for all background workers
This simplifies the code
2022-12-03 23:03:32 -08:00
Aliaksandr Valialkin
106332cd9f
lib/storage: properly pass retentionMsecs to OpenStorage() at TestIndexDBRepopulateAfterRotation 2022-12-03 23:03:30 -08:00
Aliaksandr Valialkin
ea55c16422
lib/{mergeset,storage}: pass compressLevel to blockStreamWriter.InitFromInmemoryPart
This allows packing in-memory blocks with different compression levels
depending on its contents. This may save memory usage.
2022-12-03 22:47:06 -08:00
Aliaksandr Valialkin
eca7f32151
lib/mergeset: use the given compressLevel for index and metaindex compression in in-memory part
Previously only data was compressed with the given compressLevel
2022-12-03 22:35:16 -08:00
Aliaksandr Valialkin
7ffa66d249
lib/{mergeset,storage}: take into account byte slice capacity when returning the size of in-memory part
This results in more correct reporting of memory usage for in-memory parts
2022-12-03 22:31:34 -08:00
Aliaksandr Valialkin
886ce94739
lib/mergeset: reduce the time needed for the slowest tests 2022-12-03 22:26:46 -08:00
Aliaksandr Valialkin
10a17bfa16
lib/{storage,mergeset}: consistency rename: `flushRaw{Rows,Items} -> flushPending{Rows,Items} 2022-12-03 22:18:05 -08:00
Aliaksandr Valialkin
233301a549
lib/storage: optimization: do not scan block for rows outside retention if it is covered by the retention 2022-12-03 22:14:20 -08:00
Aliaksandr Valialkin
fd9d0a550b
lib/storage: remove logging redundant path values in a single error message 2022-12-03 22:14:19 -08:00
Aliaksandr Valialkin
82f64072d2
lib/filestream: remove logging redundant path values in a single error message 2022-12-03 22:02:04 -08:00
Aliaksandr Valialkin
81400c80f0
lib/fs: remove logging redundant path values in a single error message 2022-12-03 22:00:43 -08:00
Aliaksandr Valialkin
dc890ab80c
lib/backup: remove logging duplicate path values in a single error message 2022-12-03 21:55:12 -08:00
Aliaksandr Valialkin
6910b1de2e
all: typo fix: the the -> the 2022-12-03 21:53:07 -08:00
Aliaksandr Valialkin
0b8e7deabd
lib/mergeset: drop the crufty code responsible for direct upgrade from releases prior v1.28.0
Upgrade to v1.84.0, wait until the "finished round 2 of background conversion" message
appears in the log and then upgrade to newer release.
2022-12-03 21:18:41 -08:00
Aliaksandr Valialkin
8e9822bc7f
lib/storage: speed up search for data block for the given tsids
Use binary search instead of linear scan for looking up the needed
data block inside index block.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425
2022-12-03 20:59:59 -08:00
Aliaksandr Valialkin
d8303845ef
lib/storage: fix TestUpdateCurrHourMetricIDs test when it runs on the first hour of the day by UTC 2022-12-02 17:23:59 -08:00
Aliaksandr Valialkin
d8d4d21d7a
lib/{mergeset,storage}: re-use the code for removing isInMerge flag at parts
Move the common code into releasePartsToMerge() method and consistently use it throughout the code.
2022-12-02 17:07:52 -08:00
Aliaksandr Valialkin
be6da5053f
lib/promscrape: optimize service discovery speed
- Return meta-labels for the discovered targets via promutils.Labels
  instead of map[string]string. This improves the speed of generating
  meta-labels for discovered targets by up to 5x.

- Remove memory allocations in hot paths during ScrapeWork generation.
  The ScrapeWork contains scrape settings for a single discovered target.
  This improves the service discovery speed by up to 2x.
2022-11-29 21:26:23 -08:00
Aliaksandr Valialkin
2524d94fe1
lib/promscrape/discovery: add a benchmark for measuring the performance of creating pod meta-labels 2022-11-29 21:11:03 -08:00
Aliaksandr Valialkin
027ab74efb
lib/httpserver: link to url format docs in error message emitted on parse error for the provided url path
This should help users identifying and fixing improperly set up urls.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3402
2022-11-28 18:49:03 -08:00
Aliaksandr Valialkin
8ce5b095b7
lib/promscrape: add exported_ prefix to metric names exported by scrape targets if they clash with automatically generated metrics
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3406
2022-11-28 18:37:34 -08:00
匠心零度
d4808d5b84
lib/storage: remove extra error check (#3396) 2022-11-28 17:07:11 +01:00
Aliaksandr Valialkin
c5eebaffd8
app/{vminsert,vmagent}: follow-up after 53a63c6c4c
Extend /api/v1/import/prometheus with the support for Pushgateway way of specifying additional labels.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1415
2022-11-25 16:42:38 -08:00
Zakhar Bessarab
e407e7243a
{app/vmstorage,app/vmselect}: add API to get list of existing tenants (#3348)
* {app/vmstorage,app/vmselect}: add API to get list of existing tenants

* {app/vmstorage,app/vmselect}: add API to get list of existing tenants

* app/vmselect: fix error message

* {app/vmstorage,app/vmselect}: fix error messages

* app/vmselect: change log level for error handling

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-11-25 10:32:45 -08:00
Roman Khavronenko
d1169c1559
vmagent: expose metrics for tracking config state (#3375)
Expose `vm_relabel_config_*` and `vm_promscrape_config_*` metrics
for tracking relabel and scrape configuration hot-reloads.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3345
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-22 00:48:12 +02:00
Aliaksandr Valialkin
c33bcae457
lib/promscrape/discovery/gce: do not pass filter arg when discovering zones
The filter arg isn't supported by zones API in GCE.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3202
2022-11-21 22:32:45 +02:00
Aliaksandr Valialkin
48dda17e36
lib/workingsetcache: expose -cacheExpireDuration command-line flag for fine-tuning of the cache expiration
While at it, decrease -prevCacheRemovalPercent from 0.2 to 0.1 and increase -cacheExpireDuration from 20 minutes to 30 minutes.

This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-17 21:55:11 +02:00
Aliaksandr Valialkin
047fe3ee67
lib/promscrape: add a benchmark for internLabelStrings() 2022-11-16 23:02:41 +02:00
Aliaksandr Valialkin
b0b8f05fa4
lib/mergeset: properly reset bsr.bhIdx after the call to blockStreamReader.readNextBHS()
The issue has been introduced in 58b40f514c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-16 21:22:51 +02:00
Aliaksandr Valialkin
c8d7e1312c
lib/workingsetcache: add -prevCacheRemovalPercent command-line flag for tuning memory usage vs CPU usage ratio
Reduce the default value of this flag from 1% to 0.2% after 71335e6024

This flag should help determining the best ratio for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-16 12:41:37 +02:00
Aliaksandr Valialkin
06300fe9b8
lib/mergeset: retain the buffer with the data used by indexBlock.bhs, inside indexBlock.buf
Previously indexBlock.bhs pointed to the buffer, which could be changed over time.
This could result in incorrect time series search over time.

This is a follow-up for 58b40f514c

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-16 12:10:15 +02:00
Aliaksandr Valialkin
454060fd78
lib/mergeset: remove string allocation and copying when unmarshaling blockHeader
This should reduce CPU usage for the case from https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3343
2022-11-16 12:10:14 +02:00
Aliaksandr Valialkin
a5a0fa75bf
lib/workingsetcache: tune cache miss threshold for resetting the previous cache from 5% to 1%
It has been appeared that some production workloads could suffer for some time
after every reset of the previous cache when it gets less than 5% of requests
after the needed item isn't found in the current cache. This could result
in reduced cache hit rates, which, in turn, could increase CPU, disk IO and RAM
usage needed for reading, unpacking and caching the missed data from disk.

This commit reduces the cache miss threshold for resetting the previous cache from 5% to 1%.
This should reduce the possible negative impact after each cache reset by at least 5x,
while reducing the total memory used by caches.

This is a follow-up for d906d8573e
2022-11-10 12:38:53 +02:00
Aliaksandr Valialkin
24213eaeba
lib/promscrape: add more cases to TestAddRowToTimeseries
This is a follow-up for 16fdd2af8a
2022-11-09 16:15:32 +02:00
Jeremy PLANCKEEL
87375b004a
test(golang): add test to function addRowToTimeseries (#3282)
Co-authored-by: jplanckeel-externe <jplanckeel.externe@bedrockstreaming.com>
2022-11-09 16:15:30 +02:00
Aliaksandr Valialkin
2091693f16
lib/protoparser/opentsdb: follow-up after 04b0e4e7bf
- Simplify the parser code to be less error prone
- Document the change
- Add a test for OpenTSDB put line with trailing whitespace without tags

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3290
2022-11-09 15:36:15 +02:00
Roman Khavronenko
71dfe4d697
protoparser/opentsdb: allow lines without tags (#3303)
According to http://opentsdb.net/docs/build/html/api_telnet/put.html
"At least one tag pair must be present".
However, in VictoriaMetrics datamodel tags aren't required.
This could be confusing for users. Allowing accept lines without
tags seems to do no harm.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3290
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-11-09 15:36:13 +02:00
Aliaksandr Valialkin
abf7e4e72f
lib/promscrape/discovery/consul: add __meta_consul_partition label in the same way as Prometheus does
See https://github.com/prometheus/prometheus/pull/11482
2022-11-07 15:26:45 +02:00
Aliaksandr Valialkin
d3035b1ca1
lib/storage: follow-up for 790768f20b
- Document the bugfix at docs/CHANGELOG.md
- Simplify the bugfix a bit
2022-11-07 14:18:06 +02:00
Aliaksandr Valialkin
be78950011
lib/storage: typo fix after 32d48f8dfbb03174858c00bdfe6d9d22431dc8d8 2022-11-07 13:58:13 +02:00
Aliaksandr Valialkin
9d901ee55a
lib/envtemplate: allow non-env var names inside "%{ ... }" 2022-11-07 13:16:00 +02:00
Aliaksandr Valialkin
99e6a937a5
lib/storage: remove unused isFull field from hourMetricIDs struct 2022-11-07 13:15:59 +02:00
Aliaksandr Valialkin
4a6d5ab1b1
lib/promrelabel: go fmt after 5cec9706dc 2022-10-29 05:17:49 +03:00
Aliaksandr Valialkin
a72bf87e04
lib/promrelabel: add a test from https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3251
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3251
2022-10-29 04:34:08 +03:00
Aliaksandr Valialkin
eae334f70e
lib/envflag: small refactoring after 518c340ae3 and 02096e06d0 2022-10-29 02:29:19 +03:00
Aliaksandr Valialkin
ac5528cb46
lib/promscrape: properly add exported_ prefix to labels, which clash with target labels if honor_labels: true option isn't set.
The issue was in the `labels := dst[offset:]` line in the beginning of appendExtraLabels() function.
The `dst` may be re-allocated when adding extra labels to it. In this case the addition of `exported_`
prefix to labels inside `labels` slice become invisible in the returned `dst` labels.

While at it, properly handle some corner cases:

- Add additional `exported_` prefix to clashing metric labels with already existing `exported_` prefix.
- Store scraped metric names in `exported___name__` label if scrape target contains `__name__` label.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3278

Thanks to @jplanckeel for the initial attempt to fix this issue
at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3281
2022-10-28 22:15:31 +03:00
Aliaksandr Valialkin
ba739c8052
lib/promscrape/discovery/kubernetes: do not print an empty kubeconfig_file option in yaml at /config page 2022-10-28 22:15:30 +03:00
Aliaksandr Valialkin
450a32970a
lib/envtemplate: allow referring env vars from other env vars via %{ENV_VAR} syntax
This is a follow-up for 02096e06d0
2022-10-26 14:51:02 +03:00
Aliaksandr Valialkin
36b92f07f7
lib/envflag: allow referring environment variables in command-line flags 2022-10-26 01:55:23 +03:00
Aliaksandr Valialkin
ecb71a7221
lib/fs: add canOverwrite arg to WriteFileAtomically when it is allowed to overwrite the file atomically if it already exists 2022-10-26 01:08:35 +03:00
Aliaksandr Valialkin
4f53147ed4
app/{vminsert,vmselect}/netstorage: allow calling Init()+MustStop() in a loop
Previously netstorage.MustStop() call didn't free up all the resources,
so the subsequent call to nestorage.Init() would panic.

This allows writing tests, which call nestorage.Init() + nestorage.MustStop() in a loop.
2022-10-25 14:43:05 +03:00
Aliaksandr Valialkin
a6d4711ac6
lib/storage: add support for retention filters (aka multiple retentions for distinct sets of time series)
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/143
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/289
2022-10-24 16:41:59 +03:00
Aliaksandr Valialkin
51f2e473f5
lib/storage: skip blocks outside the configured retention during search
Blocks outside the configured retention are eventually deleted during background merge.
But such blocks may reside in the storage for long time until background merge.
Previously VictoriaMetrics could spend additional CPU time on processing such blocks
during search queries. Now these blocks are skipped.
2022-10-24 02:56:13 +03:00
Aliaksandr Valialkin
2fc82b846e
lib/storage: do not pass retentionMsecs and isReadOnly args explicitly - access them via Storage arg
This makes code easier to read.

This is a follow-up after d2d30581a0
2022-10-24 01:32:56 +03:00
Aliaksandr Valialkin
d51f9b9284
lib/storage: small code cleanups 2022-10-24 01:17:58 +03:00
Aliaksandr Valialkin
5ace1587e6
lib/storage: re-use newTestStorage() instead of manually initializing Storage mock
This is a follow-up for d2d30581a0
2022-10-23 16:24:42 +03:00
Aliaksandr Valialkin
57ea7a3ee8
lib/storage: pass Storage to table and partition instead of getDeletedMetricIDs callback
This improves code readability a bit.
2022-10-23 16:11:02 +03:00
Aliaksandr Valialkin
63419d8e7c
lib/storage: small refactoring: move retentionDeadline to blockStreamMerger
This allows defining per-block retention in the future by updating the getRetentionDeadline function
2022-10-23 16:11:01 +03:00
Aliaksandr Valialkin
31071347ca
lib/storage: use a single reference to the currently merged block - bsm.Block during the block merge loop 2022-10-23 14:09:14 +03:00
Aliaksandr Valialkin
5d0a91afd5
lib/storage: properly pass uint64 constant to fmt.Errorf on 32-bit platforms 2022-10-23 12:48:43 +03:00
Aliaksandr Valialkin
2dd93449d8
lib/storage: subsitute searchTSIDs functions with more lightweight searchMetricIDs function
The searchTSIDs function was searching for metricIDs matching the the given tag filters
and then was locating the corresponding TSID entries for the found metricIDs.

The TSID entries aren't needed when searching for time series names (aka MetricName),
so this commit removes the uneeded TSID search from the implementation of /api/v1/series API.
This improves perfromance of /api/v1/series calls.

This commit also improves performance a bit for /api/v1/query and /api/v1/query_range calls,
since now these calls cache small metricIDs instead of big TSID entries
in the indexdb/tagFilters cache (now this cache is named indexdb/tagFiltersToMetricIDs)
without the need to compress the saved entries in order to save cache space.

This commit also removes concurrency limiter during searching for matching time series,
which was introduced in 8f16388428, since the concurrency
for all the read queries is already limited with -search.maxConcurrentRequests command-line flag.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2022-10-23 12:43:44 +03:00
Aliaksandr Valialkin
fe5611d6e1
lib/storage: free up memory occupied by Storage.pendingHourEntries after a temporary spike in its memory usage
This reduces vmstorage memory usage by up to 20% in production workload
2022-10-21 14:59:14 +03:00
Aliaksandr Valialkin
32b6ce691b
lib/storage: move common code to newRawRowsBlock() function 2022-10-21 14:46:06 +03:00
Aliaksandr Valialkin
2f8861ed9c
lib/storage: simplify code a bit after 3f5959c053 2022-10-21 14:39:44 +03:00
Aliaksandr Valialkin
1fb2be0cae
lib/{mergeset,storage}: simplify the code a bit after ae55ad8749 2022-10-21 14:33:15 +03:00
Aliaksandr Valialkin
af648279ce
lib/storage: validate timestamps in the block only if they use encoding, which needs validation
This reduces CPU usage when there is no sense in validating timestamps.

This is a follow-up for 5fa9525498

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2998
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3011
2022-10-21 00:54:37 +03:00
Aliaksandr Valialkin
edf3b7be47
lib/storage: try generating initial parts from inmemory rows with identical sizes under high ingestion rate
This should improve background merge rate under high load a bit
2022-10-20 23:27:44 +03:00
Aliaksandr Valialkin
4d71023eb9
lib/workingsetcache: increase default cache expiration from 10 minutes to 20 minutes
This increases the maximum time for cache population with new entries from 20 minutes to 40 minutes.
This

This change shouldn't increase memory usage for caches, since the prev cache cleaner
should free up memory by deleting unused prev cache as soon as possible.
See 08ca45d238 for details on prev cache cleaner.
2022-10-20 21:59:08 +03:00
Aliaksandr Valialkin
9a52b56b89
lib/workingsetcache: move the cleaner for the prev cache into a separate goroutine
This makes the code more clear after d906d8573e
2022-10-20 21:59:02 +03:00
Aliaksandr Valialkin
324e119172
lib/procutil: stop immediately after receiving the second SIGINT or SIGTERM signal
Previously VictoriaMetrics apps could stop responding to SIGINT and SIGTERM signals
if they hang for some reason in graceful shutdown procedure.
2022-10-20 21:58:49 +03:00