Commit graph

1985 commits

Author SHA1 Message Date
Aliaksandr Valialkin
d9bbf24183
app/{vminsert,vmselect}/netstorage: allow calling Init()+MustStop() in a loop
Previously netstorage.MustStop() call didn't free up all the resources,
so the subsequent call to netstorage.Init() would panic.

This allows writing tests, which call netstorage.Init() + netstorage.MustStop() in a loop.
2022-10-25 17:47:17 +03:00
Aliaksandr Valialkin
8e998aa1a1
lib/storage: add support for retention filters (aka multiple retentions for distinct sets of time series)
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/143
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/289
2022-10-24 16:40:20 +03:00
Aliaksandr Valialkin
dba218a8ce
lib/storage: skip blocks outside the configured retention during search
Blocks outside the configured retention are eventually deleted during background merge.
But such blocks may reside in the storage for a long time until background merge.
Previously VictoriaMetrics could spend additional CPU time on processing such blocks
during search queries. Now these blocks are skipped.
2022-10-24 02:52:44 +03:00
Aliaksandr Valialkin
e2f0b76ebf
lib/storage: do not pass retentionMsecs and isReadOnly args explicitly - access them via Storage arg
This makes code easier to read.

This is a follow-up after d2d30581a0
2022-10-24 01:31:04 +03:00
Aliaksandr Valialkin
89a1108b1a
lib/storage: small code cleanups
2022-10-24 01:17:47 +03:00
Aliaksandr Valialkin
05512fdd74
lib/storage: re-use newTestStorage() instead of manually initializing Storage mock
This is a follow-up for d2d30581a0
2022-10-23 16:24:00 +03:00
Aliaksandr Valialkin
d2d30581a0
lib/storage: pass Storage to table and partition instead of getDeletedMetricIDs callback
This improves code readability a bit.
2022-10-23 16:10:04 +03:00
Aliaksandr Valialkin
54f35c175c
lib/storage: small refactoring: move retentionDeadline to blockStreamMerger
This allows defining per-block retention in the future by updating the getRetentionDeadline function
2022-10-23 16:10:02 +03:00
Aliaksandr Valialkin
187e294a53
lib/storage: use a single reference to the currently merged block - bsm.Block during the block merge loop
2022-10-23 14:08:57 +03:00
Aliaksandr Valialkin
d0a9ca1bc2
lib/storage: properly pass uint64 constant to fmt.Errorf on 32-bit platforms
2022-10-23 12:48:00 +03:00
Aliaksandr Valialkin
5e4dfe50c6
lib/storage: substitute searchTSIDs functions with more lightweight searchMetricIDs function
The searchTSIDs function was searching for metricIDs matching the given tag filters
and then was locating the corresponding TSID entries for the found metricIDs.

The TSID entries aren't needed when searching for time series names (aka MetricName),
so this commit removes the unneeded TSID search from the implementation of /api/v1/series API.
This improves performance of /api/v1/series calls.

This commit also improves performance a bit for /api/v1/query and /api/v1/query_range calls,
since now these calls cache small metricIDs instead of big TSID entries
in the indexdb/tagFilters cache (now this cache is named indexdb/tagFiltersToMetricIDs)
without the need to compress the saved entries in order to save cache space.

This commit also removes concurrency limiter during searching for matching time series,
which was introduced in 8f16388428, since the concurrency
for all the read queries is already limited with -search.maxConcurrentRequests command-line flag.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2022-10-23 12:23:47 +03:00
Aliaksandr Valialkin
4128ad71e2
lib/storage: move common code to newRawRowsBlock() function
2022-10-21 14:46:55 +03:00
Aliaksandr Valialkin
b5674164c6
lib/storage: simplify code a bit after 3f5959c053
2022-10-21 14:39:27 +03:00
Aliaksandr Valialkin
fd7c86ae25
lib/{mergeset,storage}: simplify the code a bit after ae55ad8749
2022-10-21 14:33:03 +03:00
Aliaksandr Valialkin
99d67ac8ad
lib/storage: validate timestamps in the block only if they use encoding, which needs validation
This reduces CPU usage when there is no sense in validating timestamps.

This is a follow-up for 5fa9525498

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2998
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3011
2022-10-21 00:52:32 +03:00
Aliaksandr Valialkin
3f5959c053
lib/storage: try generating initial parts from inmemory rows with identical sizes under high ingestion rate
This should improve background merge rate under high load a bit
2022-10-20 23:28:24 +03:00
Aliaksandr Valialkin
891ff6af2a
lib/workingsetcache: increase default cache expiration from 10 minutes to 20 minutes
This increases the maximum time for cache population with new entries from 20 minutes to 40 minutes.

This change shouldn't increase memory usage for caches, since the prev cache cleaner
should free up memory by deleting unused prev cache as soon as possible.
See 08ca45d238 for details on prev cache cleaner.
2022-10-20 21:48:25 +03:00
Aliaksandr Valialkin
08ca45d238
lib/workingsetcache: move the cleaner for the prev cache into a separate goroutine
This makes the code more clear after d906d8573e
2022-10-20 21:45:29 +03:00
Aliaksandr Valialkin
4cd173bbaa
lib/procutil: stop immediately after receiving the second SIGINT or SIGTERM signal
Previously VictoriaMetrics apps could stop responding to SIGINT and SIGTERM signals
if they hung for some reason during the graceful shutdown procedure.
2022-10-20 21:40:20 +03:00
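The behaviour described in 4cd173bbaa can be sketched roughly as follows. This is a minimal illustration, not the lib/procutil code: the names waitForShutdown and simulate are hypothetical, and signals are fed through a plain channel so the example finishes on its own. A real program would register the channel via signal.Notify.

```go
package main

import (
	"fmt"
	"os"
)

// waitForShutdown blocks until the first signal arrives on sigCh and then
// runs gracefulStop in the background. If a second signal arrives before
// gracefulStop returns, forceExit is called immediately.
// Hypothetical helper; not the lib/procutil API.
func waitForShutdown(sigCh <-chan os.Signal, gracefulStop, forceExit func()) {
	<-sigCh
	done := make(chan struct{})
	go func() {
		gracefulStop()
		close(done)
	}()
	select {
	case <-done:
		// Graceful shutdown finished in time.
	case <-sigCh:
		// Second signal: do not wait for the stuck shutdown any longer.
		forceExit()
	}
}

// simulate delivers the given number of signals (at least one) and reports
// whether the forced exit path was taken. A stuck graceful shutdown is
// modelled by a channel that is never closed.
func simulate(signals int, stuck bool) bool {
	sigCh := make(chan os.Signal, signals)
	for i := 0; i < signals; i++ {
		sigCh <- os.Interrupt
	}
	hang := make(chan struct{})
	if !stuck {
		close(hang)
	}
	forced := false
	waitForShutdown(sigCh, func() { <-hang }, func() { forced = true })
	return forced
}

func main() {
	// In a real program forceExit would typically call os.Exit.
	fmt.Println("forced exit after second signal:", simulate(2, true))
}
```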
Aliaksandr Valialkin
150e99d403
lib/{mergeset,storage}: avoid unaligned 64-bit atomic operation panic on 32-bit platforms
The panic has been introduced in 68f3a02589

While at it, add padding to shard structs in order to avoid false sharing on modern CPUs

This should improve scalability on systems with many CPU cores
2022-10-20 16:25:43 +03:00
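A minimal sketch of the two ideas mentioned in 150e99d403, assuming a 64-byte cache line: the 64-bit counter is placed first in the struct, and the struct is padded to a full cache line so that counters of neighbouring shards never share a line. The shard type, its methods and the cacheLineSize constant are hypothetical and are not taken from lib/mergeset or lib/storage.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"unsafe"
)

// cacheLineSize is an assumption; 64 bytes is typical for current x86-64 CPUs.
const cacheLineSize = 64

// shard is a hypothetical per-shard struct.
type shard struct {
	// The 64-bit counter goes first: sync/atomic guarantees 64-bit alignment
	// only for the first word of an allocated struct, array or slice on
	// 32-bit platforms.
	hits uint64

	// Padding keeps counters of adjacent shards in distinct cache lines.
	_ [cacheLineSize - 8]byte
}

// inc atomically increments the counter and returns the new value.
func (s *shard) inc() uint64 {
	return atomic.AddUint64(&s.hits, 1)
}

// shardSize returns the size of the padded struct in bytes.
func shardSize() uintptr {
	return unsafe.Sizeof(shard{})
}

func main() {
	shards := make([]shard, 4)
	shards[1].inc()
	fmt.Println(shardSize(), atomic.LoadUint64(&shards[1].hits))
}
```

Since every element of the slice is 64 bytes long, each counter also stays 8-byte aligned, which is what 64-bit atomic operations need on 32-bit platforms.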
Aliaksandr Valialkin
d906d8573e
lib/workingsetcache: drop the previous cache whenever it receives less than 5% of requests compared to the current cache
This means that the majority of requests are successfully served from the current cache,
so the previous cache can be reset in order to free up memory.
2022-10-20 10:47:58 +03:00
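The 5% rule from d906d8573e could look roughly like the check below. This is only a sketch of the condition itself; the function name and the way request counters are obtained are assumptions, and the real lib/workingsetcache code may differ.

```go
package main

import "fmt"

// shouldDropPrev reports whether the previous cache received less than 5%
// of the requests compared to the current cache.
// Hypothetical helper: prev/curr < 0.05 is rewritten as prev*20 < curr
// in order to stay in integer arithmetic.
func shouldDropPrev(currRequests, prevRequests uint64) bool {
	if currRequests == 0 {
		// No traffic yet, so there is nothing to compare against.
		return false
	}
	return prevRequests*20 < currRequests
}

func main() {
	fmt.Println(shouldDropPrev(1000, 49)) // 4.9% of requests hit the prev cache
	fmt.Println(shouldDropPrev(1000, 50)) // exactly 5%, so the prev cache is kept
}
```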
Aliaksandr Valialkin
817aeafd69
lib/workingsetcache: use per-bucket stats counters instead of global stats counters for cache hits/misses
This should improve cache scalability on systems with many CPU cores.
2022-10-20 09:12:17 +03:00
Aliaksandr Valialkin
9c02c39487
lib/workingsetcache: randomize interval for swapping curr and prev caches
This should make CPU usage smoother over time, since different caches
will be swapped at different times.
2022-10-20 08:42:43 +03:00
Nikolay
1059c4d84a
lib/promscrape/discovery/kubernetes: correctly wrap error (#3250)
* lib/promscrape/discovery/kubernetes: correctly wrap error
follow-up after 1304824201

* Update docs/CHANGELOG.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-18 20:37:42 +03:00
Aliaksandr Valialkin
069401a304
all: log error when environment variables referred from -promscrape.config are missing
This should prevent the use of incorrect config files
2022-10-18 10:47:16 +03:00
Aliaksandr Valialkin
fb50730ba7
lib/storage: double the number of rawRows shards on multi-core systems
This should increase data ingestion scalability on multi-core systems at the cost of slightly higher memory usage
2022-10-17 18:19:51 +03:00
Aliaksandr Valialkin
ae55ad8749
lib/{storage,mergeset}: do not hold per-shard lock in fast path when adding per-shard items to the flush list
2022-10-17 18:01:26 +03:00
Aliaksandr Valialkin
b6e8c1403a
lib/promrelabel: add relabeling tests when the source label is missing
2022-10-17 14:47:52 +03:00
Aliaksandr Valialkin
2e3be68617
lib/bytesutil: make sure that the string passed to FastStringMatcher.Match() is copied before using it as a key in the internal cache map
This prevents possible corruption of the internal cache map
when the underlying byte slice used by the string key is modified.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3227
2022-10-14 09:51:19 +03:00
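The problem fixed in 2e3be68617 can be illustrated with a small sketch. The matchCache type below is hypothetical and much simpler than FastStringMatcher; it only shows why the key must be copied when the caller passes a string that shares memory with a reusable byte slice.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
	"unsafe"
)

// toUnsafeString converts b to a string without copying. The returned string
// silently changes when b is modified later.
func toUnsafeString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

// matchCache caches the results of an expensive match function.
// Hypothetical type for illustration only.
type matchCache struct {
	mu    sync.Mutex
	m     map[string]bool
	match func(s string) bool
}

func newMatchCache(match func(s string) bool) *matchCache {
	return &matchCache{m: make(map[string]bool), match: match}
}

// Match returns the cached result for s or calculates and caches it.
func (c *matchCache) Match(s string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.m[s]; ok {
		return v
	}
	v := c.match(s)
	// Clone s, so the map key owns its memory and cannot be changed
	// by the caller through the original byte slice.
	c.m[strings.Clone(s)] = v
	return v
}

// has reports whether key is present in the cache.
func (c *matchCache) has(key string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	_, ok := c.m[key]
	return ok
}

func main() {
	c := newMatchCache(func(s string) bool { return strings.HasPrefix(s, "foo") })
	buf := []byte("foobar")
	c.Match(toUnsafeString(buf))
	copy(buf, "xxxxxx") // the caller re-uses the buffer
	fmt.Println(c.has("foobar"))
}
```

Without the strings.Clone call the stored key would point into buf, so overwriting the buffer would change the key behind the map's back.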
Nikolay
b856581ad3
lib/backup: set s3 default region to us-west-2 (#3224)
* lib/backup: set s3 default region to us-west-2
it should fix an error with region detection for bucket, if AWS_REGION env var is not set

* Update lib/backup/s3remote/s3.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-13 10:30:07 +03:00
Aliaksandr Valialkin
185cff307b
lib/mergeset: mention in the error message the path to the part, which triggered the error
This should improve debuggability
2022-10-12 09:54:21 +03:00
Aliaksandr Valialkin
50f5eae0e0
lib/promrelabel: remove unconditional sorting of the labels in ParsedConfigs.Apply(), since the sorting isn't needed in many places
Sort labels explicitly after calling the ParsedConfigs.Apply() when needed.

This reduces CPU usage when performing metric-level relabeling, where labels' sorting isn't needed.
2022-10-09 14:51:16 +03:00
Aliaksandr Valialkin
5269b1ad77
lib/promscrape: allow controlling staleness tracking on a per-scrape_config basis
Add support for no_stale_markers option at scrape_config section.
See https://docs.victoriametrics.com/sd_configs.html#scrape_configs and
https://docs.victoriametrics.com/vmagent.html#prometheus-staleness-markers
2022-10-07 23:36:14 +03:00
Aliaksandr Valialkin
f9df0cae16
lib/promscrape: allow specifying full target url in __address__ label
Previously the `__address__` label could contain only `host:port` part of the target url,
while the scheme and metrics path were obtained from `__scheme__` and `__metrics_path__`
labels. Now it is possible to set the full url in `__address__` label.

This makes valid the following scrape config, which is frequently used by novice users:

scrape_configs:
- job_name: foo
  static_configs:
  - targets:
    - http://host1/metrics1
    - https://host2/metrics2
2022-10-07 22:43:04 +03:00
Aliaksandr Valialkin
711698b858
lib/backup/azremote: typo fixes after 03872025b747fcc4ee98710ad10fc98764328511
2022-10-07 01:02:06 +03:00
Zakhar Bessarab
176f10f5b2
app/vmbackup: fix compatibility with latest azure sdk (#461)
2022-10-07 01:02:03 +03:00
Aliaksandr Valialkin
d9282027e6
app: follow-up after ec04fcac93
* Optimize fast path for /api/v1/import when importing numeric values
* Move the docs about the change from features to bugfixes at docs/CHANGELOG.md
* Update tests at lib/protoparser/vmimport

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3161
2022-10-06 14:52:02 +03:00
Dmytro Kozlov
ec04fcac93
Properly parse JSON when exporting and importing metrics (#3180)
* app/vmselect: properly handle JSON exported and imported via the `api/v1/{export, import}` API

* app/vmselect: update convert function

* app/vmselect: export null if `math.IsNaN(v)`

* app/vmselect: get float from json

* lib/protoparser: add test

* docs: add change log

* lib/protoparser: make export import api compatible
2022-10-06 13:54:20 +03:00
Zakhar Bessarab
97239e05ce
lib/backup/s3remote: fix error checking for alternative S3 providers (#3191)
2022-10-06 13:36:40 +03:00
Aliaksandr Valialkin
1e93ad84e3
lib/backup/azremote: remove unused methods after 262ce77e2d
2022-10-06 13:08:58 +03:00
Zakhar Bessarab
262ce77e2d
lib/backup: add support of Azure Blob Storage (#460)
* lib/backup: add support of Azure Blob Storage

* lib/backup: add enterprise support of Azure Blob Storage
2022-10-06 00:32:46 +03:00
Aliaksandr Valialkin
0dc93cca7f
app/vmagent/remotewrite: allow specifying per--remoteWrite.url disk limits for persistent queue with pending data
This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3071

Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2970
2022-10-01 18:40:59 +03:00
Aliaksandr Valialkin
c1fa9828b3
lib/flagutil: rename Array to ArrayString
This makes the ArrayString more consistent with other Array* types.

While at it, add ArrayBytes type, which will be used for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3071
2022-10-01 18:26:36 +03:00
Zakhar Bessarab
87c77727e4
vmbackup: update AWS SDK to v2 (#3174)
* lib/backup/s3remote: update AWS SDK to v2

* Update lib/backup/s3remote/s3.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>

* lib/backup/s3remote: refactor error handling

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-10-01 17:12:07 +03:00
Aliaksandr Valialkin
725dfb0ed6
lib/httpserver: use 302 redirects instead of 301 redirects
Incorrect 301 redirects can be cached by user agents such as web browsers.
This can complicate recovery procedure after the incorrect redirect is fixed,
e.g. web browser cache must be reset.

The related issue - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1752
2022-10-01 16:53:35 +03:00
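A minimal sketch of issuing a temporary redirect with the standard library, in the spirit of 725dfb0ed6. The helper names are hypothetical and the snippet does not reproduce lib/httpserver; it only shows that http.StatusFound (302) is passed to http.Redirect instead of http.StatusMovedPermanently (301).

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// redirectTemporarily sends a 302 redirect to target. Unlike 301 responses,
// 302 responses are not cached by user agents by default, so an incorrect
// redirect stops affecting clients as soon as it is fixed on the server.
func redirectTemporarily(w http.ResponseWriter, r *http.Request, target string) {
	http.Redirect(w, r, target, http.StatusFound)
}

// redirectStatus runs redirectTemporarily against an in-memory recorder
// and returns the resulting status code and Location header.
func redirectStatus(target string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, "/old", nil)
	rec := httptest.NewRecorder()
	redirectTemporarily(rec, req, target)
	return rec.Code, rec.Header().Get("Location")
}

func main() {
	code, location := redirectStatus("/new/")
	fmt.Println(code, location)
}
```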
Aliaksandr Valialkin
4998402004
lib/promscrape: add external_labels from global section of -promscrape.config after the relabeling is applied to the scraped metrics
This aligns with Prometheus behaviour.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3137
2022-10-01 16:13:19 +03:00
Aliaksandr Valialkin
3a98ef2f5f
lib/promrelabel: export MustParseMetricWithLabels function, which can be used for simplifying tests
2022-10-01 16:05:51 +03:00
Aliaksandr Valialkin
f86070169d
lib/promscrape/discovery/azure: remove unneeded conversion to string
2022-10-01 16:04:37 +03:00
Aliaksandr Valialkin
db16759c68
lib/storage: optimize matching speed for non-trivial regexp filters
Wrap re.Match into bytesutil.FastStringMatcher.

This increases performance for `{foo=~"complex_regex_here"}` filters
by up to 4x.
2022-10-01 12:06:06 +03:00
Aliaksandr Valialkin
e8a64f6e7a
lib/promrelabel: remove redundant memory allocations by using interned strings
2022-10-01 11:50:21 +03:00
Aliaksandr Valialkin
73dc17ef64
lib/promrelabel: add a benchmark for realistic Kubernetes relabeling
The benchmark name is BenchmarkApplyRelabelConfigs/kubernetes

This benchmark has been copied from d521933053/model/relabel/relabel_test.go (L505)

See also https://github.com/prometheus/prometheus/pull/11147
2022-10-01 10:38:22 +03:00
Aliaksandr Valialkin
c54e14cdec
lib/promscrape/discovery/ec2: expose __meta_ec2_region label in the same way as Prometheus 2.39 does
See https://github.com/prometheus/prometheus/pull/11326
2022-09-30 20:48:32 +03:00
Nikolay
33f40f4a5f
app/vminsert: allows parsing tenant id from labels (#3009)
* app/vminsert: allows parsing tenant id from labels
it should help mitigate issues with vmagent's multiTenant mode, which works incorrectly under heavy load
and cannot handle more than 100 different tenants.
This functionality is hidden behind a flag and does not change the default vminsert behaviour
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2970

* Update docs/Cluster-VictoriaMetrics.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* wip

* app/vminsert/netstorage: clean remaining labels in order to free up GC

* docs/Cluster-VictoriaMetrics.md: typo fix

* wip

* wip

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-30 18:35:53 +03:00
Aliaksandr Valialkin
171dd14aa3
lib/promrelabel: go fmt
2022-09-30 12:28:55 +03:00
Aliaksandr Valialkin
a18d6d5ccc
lib/promrelabel: optimize action: replace for non-trivial regex values
Cache `action: replace` results for non-trivial regexs and return them next time
instead of performing CPU-intensive regex replacement.

Optimize also `action: labelmap_all` and `action: replace_all` in the same way.
2022-09-30 12:25:05 +03:00
Aliaksandr Valialkin
146021a076
lib/promrelabel: there is no need to call regex.HasPrefix() after the optimization at 17289ff481
2022-09-30 10:49:18 +03:00
Aliaksandr Valialkin
899d2c40fb
lib/promrelabel: optimize action: labelmap for non-trivial regexs
2022-09-30 10:43:31 +03:00
Aliaksandr Valialkin
17289ff481
lib/regexutil: cache MatchString results for unoptimized regexps
This increases relabeling performance by 3x for unoptimized regexs
2022-09-30 10:41:29 +03:00
Aliaksandr Valialkin
fda60b3d4d
lib/promrelabel: properly parse regex with escaped $ at the end
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3131

Thanks to @dmitryk-dk for the initial fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3179
2022-09-30 08:15:43 +03:00
Aliaksandr Valialkin
593da3603e
lib/bytesutil: move InternString() from lib/promscrape/discoveryutils to lib/bytesutil
lib/bytesutil is a more appropriate place for the InternString() function
2022-09-30 07:44:35 +03:00
Nikolay
f61b8cec69
lib/awsapi: fixes sign encoding (#3183)
* lib/awsapi: fixes sign encoding

Previously white spaces in the filter were incorrectly encoded.
The encoding tip was copied from the aws signing lib:
For example, the space character must be encoded as %20 (not using '+', as some encoding schemes do)
https://docs.aws.amazon.com/general/latest/gr/sigv4-create-canonical-request.html
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3171

* Update lib/awsapi/sign.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-30 07:43:44 +03:00
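The encoding rule quoted in f61b8cec69 can be sketched with the standard library as follows. The function name encodeQueryValue is hypothetical and the snippet is not the lib/awsapi/sign.go code. It relies on the fact that url.QueryEscape encodes a literal '+' as %2B, so every '+' left in its output stands for a space.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// encodeQueryValue percent-encodes s for use in a canonical query string,
// where the space character must be encoded as %20 instead of '+'.
// Hypothetical helper for illustration.
func encodeQueryValue(s string) string {
	return strings.ReplaceAll(url.QueryEscape(s), "+", "%20")
}

func main() {
	fmt.Println(encodeQueryValue("tag:Name my host"))
	fmt.Println(encodeQueryValue("a+b"))
}
```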
Aliaksandr Valialkin
6a32a64073
lib/bytesutil: add FastStringTransformer and use it in the rest of the code where needed
2022-09-28 10:41:00 +03:00
Aliaksandr Valialkin
92b3622253
lib/protoparser/datadog: optimize sanitizeName() function by using result cache for input strings
This is a follow-up for 7c2474dac7

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3105
2022-09-28 10:40:59 +03:00
Aliaksandr Valialkin
ef435f8cc4
lib/promrelabel: add SanitizeName() function for sanitizing Prometheus metric names and label names
Optimize this function by using results cache for input strings.
Use this function all over the code.

This is a follow-up for fcffdba9dc

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3113
2022-09-28 10:40:59 +03:00
Aliaksandr Valialkin
6411bbcce7
lib/netutil/tls.go: consistently use tlsMinVersion name across source code
This should simplify further code maintenance and refactoring

This is a follow-up after 6ab1cede62
2022-09-26 17:58:01 +03:00
Dmytro Kozlov
6ab1cede62
lib/{httpserver,netutil}: allow to define min and max TLS version of the http server (#3109)
* lib/{httpserver,netutil}: allow to define min and max TLS version of the http server

* lib/httpserver: added descriptions about tls supported versions

* lib/netutil: check minimal tls version, added supported tls versions to error

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-26 17:35:45 +03:00
Roman Khavronenko
e96ccf3f71
lib/mergeset: follow-up after a0e7432e42 (#3145)
* lib/mergeset: follow-up after a0e7432e42

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Apply suggestions from code review

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-09-26 16:39:56 +03:00
Zakhar Bessarab
f022296d96
vmbackup: configure retries for GCS remote FS (#3156)
2022-09-26 16:28:20 +03:00
Aliaksandr Valialkin
41f8c2987d
lib/protoparser/graphite: accept whitespace in metric names and tags according to the specification
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/99
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3102

See the specification https://graphite.readthedocs.io/en/latest/tags.html
2022-09-26 15:17:25 +03:00
Aliaksandr Valialkin
7c2474dac7
lib/protoparser/datadog: sanitize metric names by default in the same way as DataDog does
This commit is based on the pull request https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3105

Thanks to @PerGon for the idea and initial implementation.
2022-09-26 13:57:23 +03:00
匠心零度
3d5509a720
lib/querytracer: fix comment (#3135)
2022-09-22 19:19:48 +03:00
Aliaksandr Valialkin
56ce7ce85b
lib/promscrape: typo fix after 74c00a8762
2022-09-14 15:06:50 +03:00
Aliaksandr Valialkin
74c00a8762
lib/promscrape: read response body into memory in stream parsing mode before parsing it
This reduces scrape duration for targets returning big responses.

The response body was already read into memory in stream parsing mode before this change,
so this commit shouldn't increase memory usage.
2022-09-14 13:15:29 +03:00
Aliaksandr Valialkin
ccad651a61
lib/promscrape/discovery/kubernetes: add more context on WatchEvent parse error
This should improve debugging issues with Kubernetes API server
2022-09-13 19:36:55 +03:00
Aliaksandr Valialkin
ce2c07c5a7
lib/mergeset: atomically remove part dirs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:38 +03:00
Aliaksandr Valialkin
042a532f70
lib/storage: substitute remaining calls to fs.MustRemoveAll with fs.MustRemoveDirAtomic
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:38 +03:00
Aliaksandr Valialkin
68e32b0764
lib/storage: atomically remove parts inside partitions
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:38 +03:00
Aliaksandr Valialkin
340ada871d
lib/storage: atomically remove partitions, which went outside the configured retention
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:37 +03:00
Aliaksandr Valialkin
978dcb4574
lib/storage: properly remove cache directory contents if reset_cache_on_startup file is located there
Previously the cache directory was removed. This could result in error when the cache directory
is mounted to a separate filesystem.
2022-09-13 16:17:36 +03:00
Aliaksandr Valialkin
5f28ca1f42
lib/storage: atomically remove snapshot directories
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3038
2022-09-13 16:17:36 +03:00
Aliaksandr Valialkin
5fa9525498
lib/storage: verify that timestamps in block are in the range specified by blockHeader.{Min,Max}Timestamp when unpacking the block
This should reduce chances of unnoticed on-disk data corruption.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2998
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3011

This change modifies the format for data exported via /api/v1/export/native -
now this data contains MaxTimestamp and PrecisionBits fields from blockHeader.

This is OK, since the native export format is undocumented.
2022-09-06 13:08:09 +03:00
Bryce Lampe
74f8e12e87
Support "HTTP" and "HTTPS" schemes (#3019)
* Support "HTTP" and "HTTPS" schemes

* Update lib/promscrape/config.go

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2022-08-27 02:22:37 +03:00
Aliaksandr Valialkin
30b8d91727
lib/promscrape/discoveryutils: always store just allocated string to sanitized label names cache
This is a follow-up for c06e7a142c
2022-08-27 00:28:39 +03:00
Aliaksandr Valialkin
c06e7a142c
lib/promscrape: optimize discoveryutils.SanitizeLabelName()
Cache sanitized label names and return them next time.
This reduces the number of allocations and speeds up the SanitizeLabelName()
function for the common case when the number of unique label names is smaller than 100k
2022-08-27 00:17:45 +03:00
Aliaksandr Valialkin
a2cd79576f
lib/promrelabel: call PromRegex.MatchString() on a slow path only if it contains non-empty literal prefix
This should improve slow path speed for regexps without literal prefixes
2022-08-26 21:48:30 +03:00
Aliaksandr Valialkin
f49c9bb700
lib/promrelabel: optimize common regex mismatch cases for action: replace and action: labelmap
2022-08-26 15:45:31 +03:00
Aliaksandr Valialkin
4c6916f32a
lib/promrelabel: use regexutil.PromRegex for regex matching in actions labeldrop,labelkeep,drop and keep
This makes possible optimizing additional cases inside regexutil.PromRegex
2022-08-26 15:23:45 +03:00
Aliaksandr Valialkin
7afe8450fc
lib/promrelabel: optimize matching for commonly used regex patterns in if option
The following regex patterns are optimized:

- literal string match, e.g. "foo"
- prefix match, e.g. "foo.*" and "foo.+"
- substring match, e.g. ".*foo.*" and ".+foo.+"
- alternate values match, e.g. "foo|bar|baz"
2022-08-26 14:53:06 +03:00
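The kind of optimization listed in 7afe8450fc can be sketched as below. This is an illustration under stated assumptions rather than the lib/promrelabel or lib/regexutil implementation: newMatcher and isLiteral are hypothetical names, the pattern is assumed to be fully anchored, the dot is assumed to match any character, and only the `.*` forms are given a fast path here (patterns with `.+` fall back to the regexp engine).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// isLiteral reports whether s contains no regexp metacharacters.
func isLiteral(s string) bool {
	return regexp.QuoteMeta(s) == s
}

// newMatcher returns a match function for the fully anchored pattern.
// Simple patterns are served by plain string operations; everything else
// falls back to the regexp engine. It panics on an invalid pattern.
func newMatcher(pattern string) func(s string) bool {
	// Literal string match ("foo") and alternate values ("foo|bar|baz").
	alts := strings.Split(pattern, "|")
	allLiteral := true
	for _, a := range alts {
		if !isLiteral(a) {
			allLiteral = false
			break
		}
	}
	if allLiteral {
		return func(s string) bool {
			for _, a := range alts {
				if s == a {
					return true
				}
			}
			return false
		}
	}
	if strings.HasSuffix(pattern, ".*") {
		rest := strings.TrimSuffix(pattern, ".*")
		// Prefix match: "foo.*"
		if isLiteral(rest) {
			return func(s string) bool { return strings.HasPrefix(s, rest) }
		}
		// Substring match: ".*foo.*"
		if strings.HasPrefix(rest, ".*") {
			mid := strings.TrimPrefix(rest, ".*")
			if isLiteral(mid) {
				return func(s string) bool { return strings.Contains(s, mid) }
			}
		}
	}
	// Slow path: anchored regexp where the dot matches any character.
	re := regexp.MustCompile("^(?s:" + pattern + ")$")
	return re.MatchString
}

func main() {
	for _, p := range []string{"foo", "foo.*", ".*foo.*", "foo|bar|baz", "fo+"} {
		fmt.Printf("%q matches %q: %v\n", p, "foobar", newMatcher(p)("foobar"))
	}
}
```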
Aliaksandr Valialkin
0ad3bbadd3
lib/regexutil: add Simplify() function for simplifying the regular expression
2022-08-26 11:57:12 +03:00
Aliaksandr Valialkin
b373661988
lib/promrelabel: optimize action: {drop,keep,labeldrop,labelkeep} with anchored regex prefix
The following commonly used relabeling rules must work faster now:

- action: labeldrop
  regex: "^foo.+$"

- action: labeldrop
  regex: "^bar.*"
2022-08-25 23:23:55 +03:00
Aliaksandr Valialkin
0d4ea03a73
lib/promrelabel: optimize action: {labeldrop,labelkeep,keep,drop} with regex containing alternate values
For example, the following relabeling rule must work much faster now:

- action: labeldrop
  regex: "foo|bar|baz"
2022-08-24 17:54:29 +03:00
Aliaksandr Valialkin
0d46e24af5
lib/storage: increase the maximum possible number of `or` values extracted from regexp from 20 to 100
This should improve time series search speed for regexp filters with big number of `or` values.
2022-08-24 17:15:25 +03:00
Aliaksandr Valialkin
fdbf5b5795
lib/storage: ignore start text and end text anchors in getOrValues(regexp) function
This is OK, since the anchors are implicitly applied to the whole regexp.
This optimization should improve the speed for regexp series filters with explicit $ and ^ anchors.
For example, `{label=~"^(foo|bar)$"}`
2022-08-24 17:12:52 +03:00
Aliaksandr Valialkin
796aa310c2
app/vmstorage: expose vm_{hourly,daily}_series_limit_{max,current}_series metrics if -storage.max{Hourly,Daily}Series limits are set
These metrics allow alerting when the number of unique series approaches the limit.
For example, the following query alerts when the number of series reaches 90% of the configured limit:

    vm_hourly_series_limit_current_series / vm_hourly_series_limit_max_series > 0.9
2022-08-24 13:44:04 +03:00
Aliaksandr Valialkin
1f89278d88
all: substitute ioutil.ReadAll with io.ReadAll
ioutil.ReadAll is deprecated since Go1.16 - see https://tip.golang.org/doc/go1.16#ioutil
VictoriaMetrics requires at least Go1.18, so it is OK to switch from ioutil.ReadAll to io.ReadAll.

This is a follow-up for 02ca2342ab
2022-08-22 00:16:37 +03:00
Aliaksandr Valialkin
2c3a89339d
all: use os.ReadDir instead of ioutil.ReadDir
The ioutil.ReadDir is deprecated since Go1.16 - see https://tip.golang.org/doc/go1.16#ioutil
VictoriaMetrics requires at least Go1.18, so it is time to switch from ioutil.ReadDir to os.ReadDir

This is a follow-up for 02ca2342ab
2022-08-22 00:02:25 +03:00
Aliaksandr Valialkin
9f94c295ab
all: use os.{Read|Write}File instead of ioutil.{Read|Write}File
The ioutil.{Read|Write}File is deprecated since Go1.16 -
see https://tip.golang.org/doc/go1.16#ioutil

VictoriaMetrics needs at least Go1.18, so it is safe to remove ioutil usage
from source code.

This is a follow-up for 02ca2342ab
2022-08-21 23:52:35 +03:00
Roman Khavronenko
d59d829cdb
lib/storage: bump max merge concurrency for small parts to 15 (#2997)
* lib/storage: bump max merge concurrency for small parts to 15

The change is based on the feedback from users on github.
Their examples show that the limit of 8 sometimes becomes a
bottleneck. Users report that without limit concurrency
can climb up to 15-20 merges at once.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update lib/storage/partition.go

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-08-21 23:32:08 +03:00
Aliaksandr Valialkin
8550c44e31
app/vmagent: add ability to construct a label from multiple existing labels by referring them in the replacement field during relabeling
For example:

- target_label: composite-label
  replacement: "{{source_label1}}-{{source_label2}}"
2022-08-21 22:50:01 +03:00
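The placeholder substitution described in 8550c44e31 can be sketched as follows. The function expandReplacement is hypothetical and is not the vmagent implementation; in particular, replacing a missing label with an empty string is an assumption made for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholderRe matches {{label_name}} placeholders.
var placeholderRe = regexp.MustCompile(`\{\{([^{}]+)\}\}`)

// expandReplacement substitutes every {{name}} placeholder in replacement
// with the value of the label with the given name.
// Assumption: missing labels are replaced with an empty string.
func expandReplacement(replacement string, labels map[string]string) string {
	return placeholderRe.ReplaceAllStringFunc(replacement, func(m string) string {
		name := m[2 : len(m)-2] // strip the surrounding "{{" and "}}"
		return labels[name]
	})
}

func main() {
	labels := map[string]string{
		"source_label1": "foo",
		"source_label2": "bar",
	}
	fmt.Println(expandReplacement("{{source_label1}}-{{source_label2}}", labels))
}
```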
Roman Khavronenko
31f922944e
lib/storage: fix the search for empty label name (#2991)
* lib/storage: fix the search for empty label name

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Apply suggestions from code review

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-08-17 21:32:25 +03:00
Aliaksandr Valialkin
7d26414b2e
lib/promscrape: automatically generate additional per-target metrics for targets with non-zero series limit
The following metrics are generated:

- scrape_series_limit
- scrape_series_current
- scrape_series_limit_samples_dropped

These metrics simplify alerting on targets, which expose too many time series

See https://docs.victoriametrics.com/vmagent.html#automatically-generated-metrics
and https://docs.victoriametrics.com/vmagent.html#cardinality-limiter for more details
2022-08-17 13:19:33 +03:00
Aliaksandr Valialkin
bb68ab99fa
lib/promscrape: retry http requests if the server returns 429 status code
The 429 status code means that the server is overwhelmed with requests.
The client can retry the request after some wait time.
Implement this strategy for service discovery and scrape requests.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2940
2022-08-16 15:01:08 +03:00
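A rough sketch of the retry strategy from bb68ab99fa. The function getWithRetry, the doubling wait time and the attempt limit are assumptions for illustration; the real lib/promscrape client may use different wait times and limits. A fake transport is used instead of a real server, so the example runs without network access.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// getWithRetry issues GET requests to url until the server returns a status
// code other than 429 or maxAttempts requests have been made.
// The wait time is doubled after every retry. Hypothetical helper.
func getWithRetry(client *http.Client, url string, maxAttempts int, wait time.Duration) (*http.Response, error) {
	for attempt := 1; ; attempt++ {
		resp, err := client.Get(url)
		if err != nil {
			return nil, err
		}
		if resp.StatusCode != http.StatusTooManyRequests || attempt >= maxAttempts {
			return resp, nil
		}
		// The server is overwhelmed: release the response and retry later.
		resp.Body.Close()
		time.Sleep(wait)
		wait *= 2
	}
}

// fakeTransport answers with 429 for the first `failures` requests
// and with 200 afterwards.
type fakeTransport struct {
	failures int
	calls    int
}

func (t *fakeTransport) RoundTrip(r *http.Request) (*http.Response, error) {
	t.calls++
	code := http.StatusOK
	if t.calls <= t.failures {
		code = http.StatusTooManyRequests
	}
	return &http.Response{StatusCode: code, Header: http.Header{}, Body: http.NoBody, Request: r}, nil
}

// demo returns the final status code and the number of requests made.
func demo(failures, maxAttempts int) (int, int) {
	t := &fakeTransport{failures: failures}
	client := &http.Client{Transport: t}
	resp, err := getWithRetry(client, "http://target.invalid/metrics", maxAttempts, time.Millisecond)
	if err != nil {
		return 0, t.calls
	}
	resp.Body.Close()
	return resp.StatusCode, t.calls
}

func main() {
	status, calls := demo(2, 5)
	fmt.Println(status, calls)
}
```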
Aliaksandr Valialkin
b0e1bb517e
lib/storage: typo fix in comments after f830edc0bc
2022-08-16 13:44:45 +03:00
Aliaksandr Valialkin
f830edc0bc
lib/storage: improve performance for /api/v1/labels and /api/v1/label/.../values endpoints when match[] filter matches small number of time series
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2978
2022-08-16 13:32:40 +03:00
Aliaksandr Valialkin
c3f8481011
lib/promscrape: update links to sd_configs from Prometheus site to https://docs.victoriametrics.com/sd_configs.html
2022-08-15 01:40:20 +03:00
Aliaksandr Valialkin
95d36da358
lib/promscrape/discovery/kubernetes: add __meta_kubernetes_pod_container_image label in the same way as Prometheus 2.38 does
See https://github.com/prometheus/prometheus/pull/11034
2022-08-15 01:18:23 +03:00
Aliaksandr Valialkin
c4fcd9f1c5
lib/promscrape/discovery/kubernetes: add __meta_kubernetes_service_port_number label to role: service in the same way as Prometheus 2.38 does
See https://github.com/prometheus/prometheus/pull/11002
2022-08-15 01:06:34 +03:00
Aliaksandr Valialkin
511805d88d
lib/promscrape/discovery/dns: add support for resolving MX records
See https://github.com/prometheus/prometheus/pull/10099
2022-08-15 00:32:34 +03:00
Roman Khavronenko
a0e7432e42
lib/storage: prevent excessive loops when storage is in RO (#2962)
* lib/storage: prevent excessive loops when storage is in RO

Returning nil error when storage is in RO mode results
in excessive loops and function calls which could
result in CPU exhaustion. Returning an err instead
will trigger delays in the for loop and save some resources.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-08-09 12:17:00 +03:00
Aliaksandr Valialkin
46d7792b72
lib/promscrape: follow-up after 2c553d5a2f
- fix broken tests
- cosmetic code cleanup
- document the change at https://docs.victoriametrics.com/vmagent.html#multitenancy
- document the change at https://docs.victoriametrics.com/CHANGELOG.html
2022-08-08 14:46:26 +03:00
Fury
2c553d5a2f
add support to scrape multi tenant metrics (#2950)
* add support to scrape multi tenant metrics

* add support to scrape multi tenant metrics

Co-authored-by: 赵福玉 <zhaofuyu@zhaofuyudeMac-mini.local>
2022-08-08 14:10:18 +03:00
Roman Khavronenko
d3f13ab85b
lib/promrelabel: fix expected test result (#2957)
follow-up after 68c4ec9472

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-08-08 13:47:29 +03:00
Aliaksandr Valialkin
68c4ec9472
lib/promrelabel: do not split regex into multiple lines if it contains groups
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2928
2022-08-08 03:15:26 +03:00
Aliaksandr Valialkin
892c97e350
lib/auth: follow-up after b6a6a659f4
2022-08-07 23:14:39 +03:00
Dmytro Kozlov
b6a6a659f4
lib/auth: add tests for NewToken function (#2921)
* lib/auth: add tests for NewToken function

* lib/auth: update test, fix problem with type conversion

* lib/auth: update test description

* lib/auth: simplify failure tests
2022-08-07 23:07:57 +03:00
Aliaksandr Valialkin
9fa6b25fb2
lib/logger: prettify logging the defined command-line flags 2022-08-07 22:58:29 +03:00
Aliaksandr Valialkin
0ef29ceb14
lib/promscrape/discovery/kubernetes: add missing __meta_kubernetes_ingress_class_name label for role: ingress
See 7e65ad3e43
and 7e1111ff14
2022-08-05 20:55:00 +03:00
Aliaksandr Valialkin
f2816ef031
lib/promscrape/discovery/ec2: properly handle custom endpoint option in ec2_sd_configs
This option was ignored since d289ecded1

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1287
2022-08-05 18:50:02 +03:00
Aliaksandr Valialkin
3e8890e71b
lib/promscrape/discovery/dockerswarm: properly set __meta_dockerswarm_container_label_* labels instead of __meta_dockerswarm_task_label_* labels
See https://github.com/prometheus/prometheus/issues/9187
2022-08-05 16:11:28 +03:00
Aliaksandr Valialkin
68de1f4e4a
lib/promscrape/discovery/consul: allow stale responses from Consul service discovery by default
This aligns with Prometheus behaviour.

See `allow_stale` option description at https://prometheus.io/docs/prometheus/latest/configuration/configuration/#consul_sd_config
2022-08-05 14:41:40 +03:00
Aliaksandr Valialkin
02de848c88
lib/promscrape/discovery/yandexcloud: further code cleanup after 83a4abda3f 2022-08-05 10:30:47 +03:00
Aliaksandr Valialkin
83a4abda3f
lib/promscrape/discovery/yandexcloud: follow-up after 6e5ac32fba
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1386
2022-08-04 22:26:43 +03:00
Igor Tiunov
6e5ac32fba
YC service discovery (#2923)
* YC service discovery

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1386

* Fixed linter suggestions

* fixed golint errors
2022-08-04 20:44:16 +03:00
Aliaksandr Valialkin
d5df08e9c2
lib/mergeset: cleanup after de6dd1cd5a
Remove unused getInmemoryPart and putInmemoryPart functions

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2249
2022-08-04 18:23:01 +03:00
Aliaksandr Valialkin
7c99b9eaad
lib/backup/actions: rename removeLockFile -> removeRestoreLock to have consistent naming with createRestoreLock function 2022-08-04 17:42:43 +03:00
Aliaksandr Valialkin
6b0550c023
app/{vmselect,vmalert}: properly generate http redirects if -http.pathPrefix command-line flag is set
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2918
2022-08-02 12:59:07 +03:00
Aliaksandr Valialkin
5a4c58f9a2
lib/storage: explain why the GetOrCreateTSIDByName function doesn't check whether the per-day entry for the given date exists if TSID is found in global index 2022-08-02 09:12:29 +03:00
Aliaksandr Valialkin
78520f2702
lib/storage: do not compress small number of tsids when storing them in tagFiltersCache
This speeds up tsids retrieval from the cache for 0-2 tsids
2022-07-30 00:08:51 +03:00
Aliaksandr Valialkin
de6dd1cd5a
lib/mergeset: optimize mergeInmemoryBlocks() function
Do not spend CPU time on converting inmemoryBlock structs to inmemoryPart structs.
Just merge inmemoryBlock structs directly.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2249
2022-07-27 23:58:05 +03:00
Aliaksandr Valialkin
a3f5822dc2
lib/mergeset: do not update blockStreamReader.bh.firstItem during the merge
Just read the current item directly from blockStreamReader.Block.Items
with the helper method - blockStreamReader.CurrItem()
2022-07-27 23:05:02 +03:00
Aliaksandr Valialkin
be1c82beb1
benchmark inmemoryBlock.{Marshal,Unmarshal} for different prefix length
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2254

This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2913
2022-07-27 22:20:27 +03:00
Aliaksandr Valialkin
5c84f09762
lib/mergeset: add tests and benchmarks for commonPrefixLen function
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2254

This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2913
2022-07-27 21:24:51 +03:00
Aliaksandr Valialkin
f5676123cc
lib/pushmetrics: make fmt 2022-07-26 20:40:19 +03:00
Aliaksandr Valialkin
da11056d85
all: rename -pushmetrics.extraLabels to -pushmetrics.extraLabel for the sake of consistency 2022-07-26 19:24:24 +03:00
Aliaksandr Valialkin
ad6b3cd47d
lib/pushmetrics: properly handle errors when initializing pushmetrics 2022-07-22 13:36:06 +03:00
Aliaksandr Valialkin
4c2f9a1a2e
lib/promscrape: set up=0 for partially failed scrape in stream parsing mode
This behaviour aligns with Prometheus behavior
2022-07-22 13:29:44 +03:00
Roman Khavronenko
2914ce5ca5
vmalert: remove dependency on datasource pkg from config (#2905)
* vmalert: remove dependency on datasource pkg from config

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-22 10:44:55 +02:00
Aliaksandr Valialkin
4ce5875fa8
all: add ability to push internal metrics to remote storage system specified via -pushmetrics.url 2022-07-21 20:36:27 +03:00
Roman Khavronenko
88edb3f6cf
vmalert: allow configuring custom headers per group (#2901)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2860

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-07-21 15:59:55 +02:00
Aliaksandr Valialkin
0fd86e2364
lib/promscrape: reload all the scrape configs when the global section is changed inside -promscrape.config
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2884
2022-07-18 17:15:07 +03:00
Boris Petersen
2f9668eba5
fix assume role when running in ECS. (#2876)
This fixes #2875

Signed-off-by: Boris Petersen <boris.petersen@idealo.de>
2022-07-18 12:33:52 +03:00
Aliaksandr Valialkin
814bb1685f
all: fix other typos in the same way as 6f4d9b2a48 does 2022-07-18 12:08:15 +03:00
zhenyuxie
f3ea7823f3
fix inmemoryBlock's Less method (#2881) 2022-07-18 11:56:17 +03:00
Nikolay
7301aa678c
lib/promscrape: adds azure service discovery (#2743)
* lib/promscrape: adds azure service discovery
Adds azure service discovery mechanism
implements authorization with oauth and msi
lists virtual machines and virtual machines managed by scaleSet

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1364

* makes linter happy

* Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* wip

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-07-13 23:43:18 +03:00
guidao
91faa152a5
add next retention metric (#2863)
Co-authored-by: wangfeng <wangfeng@zhihu.com>
2022-07-13 12:37:04 +03:00
Dmytro Kozlov
306ec10c39
lib/mergeset: fix linter error (#2864) 2022-07-13 12:31:35 +03:00
Aliaksandr Valialkin
17b5ac1608
lib/mergeset: optimize merge speed a bit
Use heap.Fix instead of heap.Pop + heap.Push when merging blocks
2022-07-12 12:50:26 +03:00
Aliaksandr Valialkin
5c8eee26bf
all: make fmt via the upcoming Go1.19 2022-07-11 19:22:15 +03:00
Aliaksandr Valialkin
f97355d9fb
lib/promscrape: properly set Host header when sending requests via http proxy
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2794
2022-07-07 02:27:52 +03:00
Aliaksandr Valialkin
10cb67adb5
app/{vmagent,vminsert}: follow-up after d19e46de55
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2839
2022-07-07 01:30:58 +03:00
Aliaksandr Valialkin
01f55bc66b
lib/promscrape/discovery/kubernetes: properly populate service-level labels for role: endpointslice targets
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2823
2022-07-07 00:32:26 +03:00
Aliaksandr Valialkin
b186b63e07
lib/promscrape/discovery/kubernetes: allow attaching node-level labels to role: endpoints and role: endpointslice targets in the same way as Prometheus does
See https://github.com/prometheus/prometheus/pull/10759
2022-07-06 23:18:59 +03:00
Aliaksandr Valialkin
e6ba2af7a1
lib/promscrape: fix a test after c66f676f3b 2022-07-06 13:26:35 +03:00
Aliaksandr Valialkin
c66f676f3b
lib/promscrape: push scrape_samples_limit metric to remote storage if sample_limit option is set in scrape_config for this target
See https://github.com/VictoriaMetrics/operator/issues/497
2022-07-06 12:37:55 +03:00
Aliaksandr Valialkin
77cbbacfdb
lib/vmselectapi: pass storage.SearchQuery to API calls instead of []*storage.TagFilters + storage.TimeRange + maxMetrics
This reduces the number of args to vmselectapi calls
2022-07-06 12:37:54 +03:00
Aliaksandr Valialkin
e1b8059086
lib/vmselectapi: rename deleteMetrics to more correct deleteSeries 2022-07-06 12:37:54 +03:00
Aliaksandr Valialkin
a60e03b3a7
lib/vmselectapi: use string type for tagKey and tagValuePrefix args at TagValueSuffixes()
This improves the API consistency
2022-07-06 12:37:53 +03:00
Aliaksandr Valialkin
edc76286ac
lib/storage: put the (date, metricID) entry in dateMetricIDCache just after the corresponding series is registered in the per-day inverted index
Previously the time series could be put into dateMetricIDCache without
registering in the per-day inverted index if GetOrCreateTSIDByName
finds TSID entry in the global index. This could lead to missing
series in query results.

The issue has been introduced in the commit 55e7afae3a,
which has been included in VictoriaMetrics v1.78.0
2022-07-05 14:54:03 +03:00
Aliaksandr Valialkin
855436efd2
lib/promauth: refactor NewConfig in order to improve maintainability
1. Split NewConfig into smaller functions
2. Introduce Options struct for simplifying construction of the Config with various options

This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2684
2022-07-04 14:31:12 +03:00
Aliaksandr Valialkin
c392d6d173
app/vmagent/remotewrite: add -remoteWrite.header command-line flag for setting additional http headers to send to -remoteWrite.url
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2805
2022-06-30 20:00:23 +03:00
Aliaksandr Valialkin
e40b40afe6
Revert "lib/promscrape, vmagent: fix path to files (#2801)"
This reverts commit 0a8e35835c.

Reason for revert: it incorrectly fixes the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2799

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2799#issuecomment-1171392005
2022-06-30 18:23:56 +03:00
Aliaksandr Valialkin
3e2dd85f7d
all: readability improvements for query traces
- show dates in human-readable format, e.g. 2022-05-07, instead of a numeric value
- limit the maximum length of queries and filters shown in trace messages
2022-06-30 18:20:33 +03:00
Dmytro Kozlov
0a8e35835c
lib/promscrape, vmagent: fix path to files (#2801)
vmagent: respect `-pathPrefix` flag for static files and links
2022-06-30 16:22:54 +02:00
ttyv
bdf9f4669a
lib/promscrape: fix vmagent tickerCh reload behaviour (#2786)
Co-authored-by: Dmitriy <dab@ttyv.ru>
2022-06-30 12:33:01 +02:00
Aliaksandr Valialkin
a350d1e81c
lib/storage: return marshaled metric names from SearchMetricNames
Previously SearchMetricNames was returning unmarshaled metric names.
This wasn't great for vmstorage, which should spend additional CPU time
for marshaling the metric names before sending them to vmselect.

While at it, remove possible duplicate metric names, which could occur when
multiple samples for new time series are ingested via concurrent requests.

Also sort the metric names before returning them to the client.
This simplifies debugging of the returned metric names across repeated requests to /api/v1/series
2022-06-28 18:17:15 +03:00
Aliaksandr Valialkin
2c836bd398
lib/storage: put into query trace the number of found entries in SearchMetricNames 2022-06-28 14:50:53 +03:00
Aliaksandr Valialkin
e578549b8a
app/vmselect: optimize /api/v1/series a bit for time ranges smaller than one day 2022-06-28 13:02:47 +03:00
Aliaksandr Valialkin
a963b2a0aa
all: show timeRange in traces in human-readable format instead of timestamps in milliseconds 2022-06-27 13:45:51 +03:00
Aliaksandr Valialkin
ba514284f1
lib/storage: add querytracer to more contexts
querytracer has been added to the following storage.Storage methods:
- RegisterMetricNames
- DeleteMetrics
- SearchTagValueSuffixes
- SearchGraphitePaths
2022-06-27 13:45:51 +03:00
Aliaksandr Valialkin
134751e43e
all: locate throttled loggers via logger.WithThrottler() only once and then use them
This reduces the contention on logThrottlerRegistryMu mutex when logger.WithThrottler()
is called frequently from concurrent goroutines.
2022-06-27 13:45:50 +03:00
Aliaksandr Valialkin
52eadb729e
lib/promscrape: always send stale markers with the real scrape timestamp
This guarantees that a query won't return data just after the series has disappeared.
2022-06-23 11:34:18 +03:00
Aliaksandr Valialkin
1c4f67c5d2
lib/promauth: add ability to send additional http headers in requests to scrape targets
This solves https://stackoverflow.com/questions/66032498/prometheus-scrape-metric-with-custom-header
2022-06-22 20:39:43 +03:00
Aliaksandr Valialkin
e6ed92529b
all: remove explicit "xxhash" name when importing github.com/cespare/xxhash/v2 package
This package already has the same name, so there is no need for an explicit name
2022-06-21 20:23:32 +03:00
Loki's Wager
ac411be904
BugFix part_header.go (#2763)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2757

Co-authored-by: haotingyi <haotingyi@corp.netease.com>
2022-06-21 15:56:41 +03:00
Aliaksandr Valialkin
49586566a3
docs: follow-up after e4d6b750f6 2022-06-20 17:14:43 +03:00
Nikolay
e4d6b750f6
lib/httpserver: adds flagsAuthKey command-line flag (#2758)
* lib/httpserver: adds flagsAuthKey command-line flag
It protects /flags endpoint with authKey.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2753

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-20 17:09:32 +03:00
Aliaksandr Valialkin
b958fc7846
lib/storage: properly take into account already registered series when -storage.maxHourlySeries or -storage.maxDailySeries limits are enabled
The commit 5fb45173ae takes into account only newly registered series
when applying cardinality limits. This means that the cardinality limit could be exceeded with already registered series.
This commit restores accounting for already registered series when applying cardinality limits.
2022-06-20 13:47:47 +03:00
Aliaksandr Valialkin
55e7afae3a
lib/storage: create per-day indexes together with global indexes when registering new time series
Previously the creation of per-day indexes and global indexes
for the newly registered time series was decoupled.

Now global indexes and per-day indexes for the current day are created together for new time series.
This should speed up registering new time series a bit.
2022-06-19 22:42:10 +03:00
Aliaksandr Valialkin
5fb45173ae
lib/storage: do not register new series if -storage.maxHourlySeries or -storage.maxDailySeries limits are exceeded
Previously samples for new series weren't added as expected when series limits were reached,
but new series were still registered in indexdb.
2022-06-19 22:42:09 +03:00
Aliaksandr Valialkin
62e2371a67
lib/storage: reset metric id caches for the previous and the current hour
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698
2022-06-19 22:42:09 +03:00
Aliaksandr Valialkin
c18f8cccfa
lib/promrelabel: support action: graphite relabeling
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2737
2022-06-16 20:24:22 +03:00
Aliaksandr Valialkin
ec7963208d
app/vmselect: accept focusLabel query arg at /api/v1/status/tsdb
This allows filling the seriesCountByFocusLabelValue list in the /api/v1/status/tsdb response
with label values for the specified focusLabel, which contain the highest number of time series.

TODO: add this to Cardinality explorer at VMUI - https://docs.victoriametrics.com/#cardinality-explorer
2022-06-14 18:36:54 +03:00
Aliaksandr Valialkin
b6c1ca12b7
lib/storage: show top labels with the highest number of series in cardinality explorer 2022-06-14 16:32:38 +03:00
Aliaksandr Valialkin
a75e59700f
lib/storage: improve error message when -search.max* command-line flag values are exceeded 2022-06-14 13:27:59 +03:00
Aliaksandr Valialkin
52cf05c6d2
lib/storage: test GetTSDBStatusWithFiltersForDate on a global time range 2022-06-12 14:27:40 +03:00
Aliaksandr Valialkin
374beb350e
app/vmselect: optimize /api/v1/labels and /api/v1/label/.../values handlers when match[] query arg is passed to them 2022-06-12 04:32:13 +03:00
Aliaksandr Valialkin
2bcb960f17
all: improve query tracing coverage for indexdb search
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403
2022-06-09 20:07:07 +03:00
Howie
76f05f8670
feat: rule limit (#2676)
vmalert: support `limit` param in groups definition

`limit` param limits number of time series samples produced by a single rule
during execution.
On reaching the limit rule will return an err.

Signed-off-by: lihaowei <haoweili35@gmail.com>
2022-06-09 08:21:30 +02:00
Aliaksandr Valialkin
12ac255dae
lib/querytracer: make it easier to use by passing trace context message to New and NewChild
The context message can be extended by calling Donef.
If there is no need to extend the message, then just call Done.
2022-06-08 21:06:52 +03:00
Dmytro Kozlov
018d2303c4
Cardinality explorer (#2625)
* Cardinality explorer

* vmui, vmselect: updated field name, added description to spinner

* make vmui-update

* updated const name, make vmui-update

* lib/storage: changes calculation for totalSeries values

* added static files

* wip

* wip

* wip

* wip

* docs/CHANGELOG.md: document cardinality explorer feature

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2233

Co-authored-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-06-08 18:43:05 +03:00
Roman Khavronenko
63b538ecd1
vmagent: update SD duration histogram metric if SD is active (#2677)
The change updates histogram for registering SD update duration
only if SD is considered `active`. SD is active if at least
one scraper for this SD has started.

This change is supposed to reduce metrics cardinality produced
by duration histogram which gets updated even if SD isn't configured.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2671

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-07 15:46:44 +03:00
Roman Khavronenko
1ee1e986da
lib/storage: limit max mergeConcurrency value for systems with high number of CPUs (#2673)
Workers count for merges affects the max part size during merges. Such behaviour
protects storage from running out of disk space for scenario when all workers
are merging parts with the max size.

This works very well for most cases. But for systems where high number of CPUs
is allocated for vmstorage components this could significantly impact the max
part size and result in more unmerged parts than expected.

While checking multiple production highly loaded setups it was discovered that
`max_over_time(vm_active_merges{type="storage/big"}[1h])` rarely exceeds 2,
and `max_over_time(vm_active_merges{type="storage/small"}[1h])` rarely exceeds 4.
The change in this commit limits the max value for concurrency accordingly.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-06-07 14:55:09 +03:00
Aliaksandr Valialkin
a5814fe16a
lib/promscrape/discovery/kubernetes: use unsupportedFieldError() function instead of errContext string
This improves code readability and maintainability a bit, since the format string
is passed as string literal into fmt.Errorf.
2022-06-07 01:22:07 +03:00
Aliaksandr Valialkin
8608dd093c
all: follow-up after 8edb390e21
- Remove unused js bloatware from /targets page. This strips down binary size by more than 100Kb
- Add /service-discovery page for API compatibility with Prometheus
- Properly load bootstrap.min.css from /prometheus/targets
- Serve static contents for /targets page from app/vminsert instead of app/vmselect, because /targets page is served from there
2022-06-07 00:57:09 +03:00
Aliaksandr Valialkin
6f0a0e3072
lib/promscrape/discovery/kubernetes: follow-up after 006b8c7534
- make more clear error logs
- simplify testing for newKubeConfig by passing only the path to kube_config file instead of SDConfig struct
2022-06-06 14:40:52 +03:00
Aliaksandr Valialkin
cfefdde042
lib/promauth: follow-up after 006b8c7534
- Take into account `ca`, `key` and `cert` values when generating string representation of TLSConfig.
  Print hashes instead of real values because of security considerations.
- Properly update Config.tlsCertDigest when `key` and `cert` values are set.
  This allows properly updating scrape targets after these values are updated in configs.
- Do not re-generate certificate from `key` and `cert` values per each call to getTLSCert,
  because these values are immutable.
- Do not set `ca` value from `ca_file` value, so it isn't exposed at `/config` page.
- Generate proper error messages on incorrect `key`, `cert` or `ca` values.
2022-06-04 01:01:16 +03:00
Aliaksandr Valialkin
0922ed2b7e
lib/promscrape: add -promscrape.cluster.name command-line flag
This flag is used for proper data de-duplication when the same target is scraped
from multiple vmagent clusters.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2679
2022-06-04 00:37:01 +03:00
Dmytro Kozlov
8edb390e21
lib/promscrape: adds service discovery visualization for /targets page (#2675)
* lib/promscrape: updated template

* lib/promscrape: fixed click on unhealthy and all btns

* app/vmselect: jquery scripts into static folder

Co-authored-by: f41gh7 <nik@victoriametrics.com>
2022-06-03 15:38:45 +02:00
Nikolay
a18914abee
lib/promscrape/discovery/kubernetes: follow-up after 0b5c874911 (#2672) 2022-06-01 20:44:45 +02:00
hadesy
006b8c7534
promscrape/discovery: support kubeconfig (#2533) 2022-06-01 20:34:00 +02:00
Aliaksandr Valialkin
ca689fec54
docs/CHANGELOG.md: follow-up after 2177089f94 2022-06-01 14:51:26 +03:00
Aliaksandr Valialkin
ea06d2fd3c
lib/storage: stop background merge when storage enters read-only mode
This should prevent `no space left on device` errors when VictoriaMetrics
under-estimates the additional disk space needed for background merge.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2603
2022-06-01 14:36:45 +03:00
Roman Khavronenko
642eb1c534
lib/storage: make indexdb/tagFilters cache size configurable (#2667)
The default size of `indexdb/tagFilters` now can be overridden via
`storage.cacheSizeIndexDBTagFilters` flag.
Please, be careful with changing default size since it may
lead to inefficient work of the vmstorage or OOM exceptions.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2663
Signed-off-by: hagen1778 <roman@victoriametrics.com>

Co-authored-by: Nikolay <nik@victoriametrics.com>
2022-06-01 10:07:53 +02:00
Roman Khavronenko
2177089f94
promrelabel: add support of lowercase and uppercase relabeling actions (#2665)
* promrelabel: add support of `lowercase` and `uppercase` relabeling actions

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2664
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/storage: make golangci-lint happy

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Co-authored-by: Nikolay <nik@victoriametrics.com>
2022-06-01 10:02:37 +02:00
Aliaksandr Valialkin
41958ed5dd
all: add initial support for query tracing
See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#query-tracing

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403
2022-06-01 02:29:23 +03:00
Aliaksandr Valialkin
d2567ccdd6
lib/promscrape: use strconv.Atoi instead of strconv.ParseInt for parsing -promscrape.cluster.memberNum
In this case there is no need to convert int64 to int
2022-06-01 01:42:34 +03:00
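The difference described in commit d2567ccdd6 can be shown with a small sketch. The function names `memberNumViaParseInt` and `memberNumViaAtoi` are hypothetical and only illustrate the two parsing styles; they are not taken from lib/promscrape.

```go
package main

import (
	"fmt"
	"strconv"
)

// memberNumViaParseInt parses s as int64 and then needs an explicit
// conversion to int.
func memberNumViaParseInt(s string) (int, error) {
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return 0, err
	}
	return int(n), nil
}

// memberNumViaAtoi returns int directly, so no conversion is needed.
func memberNumViaAtoi(s string) (int, error) {
	return strconv.Atoi(s)
}

func main() {
	a, errA := memberNumViaParseInt("3")
	b, errB := memberNumViaAtoi("3")
	fmt.Println(a, errA, b, errB)
}
```

Both variants reject non-numeric input with an error. The Atoi form simply avoids the intermediate int64 value.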
Aliaksandr Valialkin
a1add5c2c7
lib/storage: make fmt 2022-05-31 12:54:37 +03:00
Aliaksandr Valialkin
bac75ea8a2
lib/storage: do not take into account series from the next day when match[] filter is passed to /api/v1/status/tsdb 2022-05-31 12:15:26 +03:00
Dmytro Kozlov
11f91532c5
issue-2594: use embedded for static files (#2650)
embed static js and css files from CDN into vmalert, vmagent and vmsingle binaries.

Co-authored-by: f41gh7 <nik@victoriametrics.com>

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2594
2022-05-31 01:55:28 +02:00
Dmytro Kozlov
1eb29794e6
removed redundant return (fixed linter) (#2647)
* removed redundant return

* updated lint package version
2022-05-26 16:24:01 +02:00
Aliaksandr Valialkin
796804e4b0
lib/promscrape: add -promscrape.suppressScrapeErrorsDelay command-line flag
This flag can be used for reducing the amount of logs when scraping unreliable scrape targets.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2575

The patch is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2576 .
Thanks to @jelmd .
2022-05-25 22:59:36 +03:00
Aliaksandr Valialkin
f6d11a49aa
lib/storage: add ability to change the indexdb rotation time offset with -retentionTimezoneOffset command-line flag
This is a follow-up for 0fbf59199a

See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2574
2022-05-25 16:05:29 +03:00
阳明
0fbf59199a
lib/storage: Remove the effect of time zone on next retention period (#2568) (#2574) 2022-05-25 15:08:24 +03:00
Roman Khavronenko
d5eb6afe26
lib/promscrape/discovery/kubernetes: fixes kubernetes service discovery (#2615)
* lib/promscrape/discovery/kubernetes: properly updates discovered scrape works
Previously, added or updated scrapeworks could override previously
discovered ones.
This happens because swosByKey may contain only a small subset of kubernetes
objects with their labels.
This is the case for the objectsUpdated and objectsAdded maps, which include only changed elements

* Properly calculate vm_promscrape_discovery_kubernetes_scrape_works

Co-authored-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-21 01:01:37 +03:00
Boris Petersen
3df8caca15
Add ability to sign requests for all AWS services (#2604)
This adds the ability to utilize sigv4 signing for all AWS services not
just "aps". When the newly introduced property "service" is not set it
will default to "aps".

Signed-off-by: Boris Petersen <boris.petersen@idealo.de>
2022-05-18 14:58:31 +02:00
Aliaksandr Valialkin
a0727ab1b1
docs/vmagent.md: typo fix in the description for -promscrape.cluster.replicationFactor command-line flag 2022-05-12 18:50:29 +03:00
Aliaksandr Valialkin
9ea3f0c0d3
lib/awsapi: remove whitelist arg from GetFiltersQueryString(), since it may break new filters in the future
Let users decide which filters to use. If users start using disallowed filters, then AWS will return an error.
2022-05-09 15:33:22 +03:00
Aliaksandr Valialkin
123aa4c79e
lib/promscrape: properly implement ScrapeConfig.clone()
Previously ScrapeConfig.clone() was improperly copying promauth.Secret fields -
their contents were replaced with `<secret>` value.

This led to inability to use passwords and secrets in `-promscrape.config` file.
The bug has been introduced in v1.77.0 in the commit 67b10896d2

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2551
2022-05-07 00:05:40 +03:00
Aliaksandr Valialkin
1dc4cc243b
lib/promscrape: rename promscrape_stale_samples_created_total metric to vm_promscrape_stale_samples_created_total, so its name is consistent with the rest of vm_promscrape_ metrics 2022-05-06 15:33:13 +03:00
Aliaksandr Valialkin
d5b55fe22d
lib/promscrape/discovery/ec2: add ability to filter Availability Zones in ec2_sd_config via az_filters section 2022-05-06 12:43:29 +03:00
Aliaksandr Valialkin
97f9c2f667
lib/promscrape/discovery/ec2: properly pass filters to DescribeAvailabilityZones API call
Previously filters weren't passed to this call after the commit 0e09fdb8b0

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1626
2022-05-05 11:00:23 +03:00
Aliaksandr Valialkin
d285c2fea7
lib/awsapi: pass filtersQueryString arg to GetEC2APIResponse() function, so the caller could decide whether to use the filters during the AWS API query
The filters shouldn't be passed to DescribeAvailabilityZones API call.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1626
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1287

Related commits:
0e09fdb8b0
d289ecded1
2022-05-05 10:29:34 +03:00
Dmytro Kozlov
7dd9f3b98e
{vmbackup, vmbackup/snapshot}: fixed problem with snapshot backup in another snapshot folder (#2535)
* {vmbackup, vmbackup/snapshot}: validate snapshot name

* vmbackup/snapshot: added other checks

* backup/actions: added check that we ignore backup_complete.ignore file

* vmbackup: moved snapshot to lib directory

* lib/snapshot: added functions description

* lib/snapshot: fixed typo

* vmbackup: code cleanup

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-04 22:12:03 +03:00
Nikolay
d289ecded1
{lib/promscrape,app/vmagent}: adds sigv4 support for vmagent remoteWrite (#2458)
* {lib/promscrape,app/vmagent}: adds sigv4 support for vmagent remoteWrite
moves aws related code into separate lib from lib/promscrape
it allows to write data from vmagent to the AWS managed prometheus (cortex)

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1287

* Apply suggestions from code review

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-04 20:24:19 +03:00
Nikolay
3575aabeaf
lib/promscrape: adds correct http status codes for redirect (#2530)
standard http client accepts multiple http status codes as redirect
it should fix issue with incorrect redirects
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2482
2022-05-03 13:31:31 +03:00
Aliaksandr Valialkin
0d86644d65
lib/storage: leave the last sample per each discrete interval during the deduplication
This aligns better with staleness logic in Prometheus - https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness
2022-05-02 21:50:45 +03:00
Artem Navoiev
37cf509c3a
lib/{storage,flagutil} - Add option for snapshot autoremoval (#2487)
* lib/{storage,flagutil} - Add option for snapshot autoremoval

- add prometheus-like duration as command flag
- add option to delete stale snapshots
- update duration.go flag to re-use own code

* wip

* lib/flagutil: re-use Duration.Set() call in NewDuration

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-05-02 11:00:15 +03:00
Aliaksandr Valialkin
20bc2a2c44
lib/flagutil: re-use Duration.Set() call in NewDuration 2022-05-02 10:56:39 +03:00
Dima Lazerka
84683b8569
Fix targetstatus qtpl paths (#2517)
Ran `make quicktemplate-gen` from the root directory
2022-04-29 10:36:03 +03:00
Aliaksandr Valialkin
6be07e8c25
lib/promscrape/discovery/kubernetes: do not drop pod meta-labels even if the corresponding node objects are missing
This reflects the logic used in Prometheus.

See https://github.com/prometheus/prometheus/pull/10080
2022-04-26 15:26:01 +03:00
Aliaksandr Valialkin
9fe1bf5d53
lib/promauth: take into account tls_config and proxy_url when serializing OAuth2Config to string 2022-04-23 00:23:19 +03:00
Aliaksandr Valialkin
eb5d7ad089
lib/promauth: add support for min_version option at tls_config section in the same way as Prometheus does 2022-04-23 00:16:39 +03:00
Aliaksandr Valialkin
174431e31b
lib/promauth: add support for proxy_url option at oauth2 section in the same way as Prometheus does 2022-04-23 00:00:44 +03:00
Aliaksandr Valialkin
18b14aad8e
lib/promauth: add support for tls_config section at oauth2 config in the same way as Prometheus does 2022-04-22 23:51:07 +03:00
Aliaksandr Valialkin
6f79b2b68b
lib/promscrape/discovery/kubernetes: limit the minimum sleep time between updating dependent ScrapeWork objects
Previously the sleep time could be dropped to nanoseconds, which could result in CPU time waste
2022-04-22 23:14:17 +03:00
Aliaksandr Valialkin
15190fcdae
lib/promscrape/discovery/kubernetes: allow attaching node-level labels and annotations to discovered pod targets in the same way as Prometheus 2.35 does
See https://github.com/prometheus/prometheus/issues/9510
and https://github.com/prometheus/prometheus/pull/10080
2022-04-22 20:15:41 +03:00
Aliaksandr Valialkin
57a0aa204d
lib/promscrape/discovery/kubernetes: improve the performance of urlWatcher.reloadObjects() on multi-CPU systems
Parallelize the generation of ScrapeWork objects there. Previously they were generated in a single goroutine.
2022-04-22 13:22:01 +03:00
Aliaksandr Valialkin
67b10896d2
lib/promscrape: prevent from memory leaks on -promscrape.config reload when only a small part of scrape jobs is updated
This is a follow-up after 26b78ad707
2022-04-22 13:19:43 +03:00
Aliaksandr Valialkin
98129d4a8e
app/vmstorage: expose vm_indexdb_items_added_total and vm_indexdb_items_added_size_bytes_total counters at /metrics page
These counters can be used for monitoring the rate of addition of new entries in indexdb (aka inverted index).

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2471
2022-04-21 13:18:39 +03:00
Aliaksandr Valialkin
167d1bea8f
lib/promscrape/discovery/kubernetes: properly update endpoints and endpointslice objects when the related pod or service objects are updated
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240

This is a follow-up for 2341bd48d7
2022-04-21 13:06:22 +03:00
Aliaksandr Valialkin
c75d0095f5
lib/promscrape: remove possible data race when cleaning up internStringsMap 2022-04-20 18:40:53 +03:00
Aliaksandr Valialkin
82e34984dd
lib/promscrape: zero out labels after duplicate removal inside mergeLabels() 2022-04-20 18:35:33 +03:00
Aliaksandr Valialkin
a2de31f8d3
lib/promscrape/discovery/kubernetes: do not pre-allocate memory for ScrapeWork objects
There is high chance that ScrapeWork objects won't be generated because of relabeling
2022-04-20 16:40:25 +03:00
Aliaksandr Valialkin
2341bd48d7
lib/promscrape: follow-up after 91e290a8ff 2022-04-20 16:11:37 +03:00
Nikolay
91e290a8ff
lib/promscrape: reduce latency for k8s GetLabels (#2454)
replaces internStringMap with sync.Map - it greatly reduces lock contention
concurrently reload scrape work for api watcher - each object labels added by dedicated CPU

changes can be tested with following script https://gist.github.com/f41gh7/6f8f8d8719786aff1f18a85c23aebf70
2022-04-20 16:09:40 +03:00
Aliaksandr Valialkin
3d0549c982
lib/promscrape: optimize getScrapeWork() function
Reduce the number of memory allocations in this function. This improves its performance by up to 50%.
This should improve service discovery speed when big number of potential targets with big number of meta-labels
are generated by service discovery.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2270
2022-04-20 15:37:00 +03:00
Aliaksandr Valialkin
4513893ead
lib/promscrape: use a hash over target labels as a key for dropped targets' map
This reduces the number of allocations and improves the performance for updating dropped targets' map.
This map is exposed at /api/v1/targets as in droppedTargets list.
2022-04-20 15:37:00 +03:00
Dmytro Kozlov
136a44bcfc
lib/promscrape: simply update UI (#2479)
* lib/promscrape: simply update UI

* lib/promscrape: added vm icon
2022-04-20 10:25:04 +02:00
Aliaksandr Valialkin
f6d0e5e74a
all: typo fix: Kuberntes -> Kubernetes 2022-04-20 10:50:49 +03:00
Dmytro Kozlov
a3ee275149
lib/promscrape: Enable filters for endpoint and labels (#2466)
* lib/promscrape: Enable filters for endpoint and labels

* lib/promscrape: cleanup

* lib/promscrape: update template

* lib/promscrape: move filter logic to backend

* lib/promscrape: updated placeholder

* lib/promscrape: updated placeholder

* lib/promscrape: use two different fields for filters, updated form, added error on parsing queries

* lib/promscrape: rename functions

* lib/promscrape: removed unused values

* wip

* wip

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-19 18:26:21 +03:00
Nikolay
26b78ad707
lib/promscrape: adds job restart method (#2455)
* lib/promscrape: adds job restart method
it must restart only ScrapeConfig with changed content
this change greatly reduces the time needed for job restart
and it should decrease possible data loss when the config changes frequently in Kubernetes-based deployments

Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* wip

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-16 20:28:46 +03:00
Aliaksandr Valialkin
1097ebebe6
lib/httpserver: clarify that -tls flag enables TLS for http requests to -httpListenAddr 2022-04-16 16:59:26 +03:00
Aliaksandr Valialkin
cad488fe7e
app/vmstorage: add support for mTLS cipher suites via -cluster.tlsCipherSuites command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2404
2022-04-16 16:39:21 +03:00
Aliaksandr Valialkin
7810375c5f
lib/httpserver: move the code, which creates tls.Config, into lib/netutil/tls.go
This syncs the corresponding code with cluster branch
2022-04-16 15:52:36 +03:00
Aliaksandr Valialkin
7e4bdf31ba
lib/httpserver: follow up after def0032c7d 2022-04-16 15:27:21 +03:00
Dmytro Kozlov
def0032c7d
lib/httpserver: added tlsCipherSuites flag (#2468)
* lib/httpserver: added tlsCipherSuites flag

* lib/httpserver: compare lower case strings

* lib/httpserver: use EqualFold

* lib/httpserver: used flagutil.NewArray, supported only strings cipher suites

* lib/httpserver: updated flag description, added flag to documentation

* Update lib/httpserver/httpserver.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-16 15:07:07 +03:00
Aliaksandr Valialkin
ebaa1c7ad5
lib/promscrape: follow-up after baa1c24b36 2022-04-16 14:25:54 +03:00
Nikolay
baa1c24b36
lib/promscrape: removes omitempty for ScrapeConfig (#2457)
This change fixes incorrect marshalling for ScrapeConfig
it affects http endpoint and ScrapeConfig checksum.

With omitempty, custom Marshaller is not called if field is not a pointer.

Previously this issue happened at vmalert
2022-04-16 13:22:11 +03:00
Aliaksandr Valialkin
c6eb404c69
lib/encoding: explicitly set slice length passed to binary.BigEndian.Uint*
This allows Go compiler to generate more optimal code without bounds checks
2022-04-12 12:55:21 +03:00
Aliaksandr Valialkin
f3d4671bb6
lib/promscrape: follow-up after 7e79adfb55 2022-04-12 12:36:17 +03:00
Nikolay
7e79adfb55
lib/promscrape: allows to use k8s pod name as clusterMemberNum (#2436)
* lib/promscrape: allows to use k8s pod name as clusterMemberNum
it must improve user experience and simplify clustering scrapers.
it must allow to use vmagent cluster with distroless images
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2359

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-04-12 12:24:11 +03:00
Aliaksandr Valialkin
54de0531a4
app/vmstorage: properly handle maxSeries limit passed from vmselect to vmstorage 2022-04-12 11:23:04 +03:00
Aliaksandr Valialkin
deaa8c1ffa
lib/protoparser/native: follow-up after fe01f4803d 2022-04-11 19:27:07 +03:00
Nikolay
fe01f4803d
lib/protoparser/native: fixes parseStream dead-lock (#2423)
Previously, if a native block could not be unmarshaled, wg.Done wasn't called by the unmarshal work.
This led to connection blocking and a possible deadlock on the client side
2022-04-11 19:22:24 +03:00
Aliaksandr Valialkin
a96eb16329
lib/memory: export process_memory_limit_bytes metric, which shows the amounts of memory the current process has access to
This metric is equivalent to `vm_available_memory_bytes`, but it has better name,
since the metric is related to a process, not VictoriaMetrics itself.

Leave `vm_available_memory_bytes` for backwards compatibility.
2022-04-07 15:23:00 +03:00
Aliaksandr Valialkin
57143e9435
lib/storage: increase the number of rawRowsShard shards on systems with more than 4 CPU cores
This should improve data ingestion scalability on systems with many CPU cores
2022-04-06 19:49:20 +03:00
Aliaksandr Valialkin
7bad7133bc
lib/mergeset: use more rawItemsShard shards on multi-CPU systems
This should improve the scalability for registering of new time series on multi-CPU system
2022-04-06 19:35:55 +03:00
Aliaksandr Valialkin
ad35068c3a
lib/mergeset: skip common prefixes when comparing inmemoryBlock items
This should improve the performance for items sorting inside inmemoryBlock.MarshalUnsortedData
if they have common prefix.

While at it, improve the performance for inmemoryBlock.updateCommonPrefix for sorted items.
This should improve performance for inmemoryBlock.MarshalSortedData during background merge.
2022-04-06 18:51:36 +03:00
Aliaksandr Valialkin
5acd70109b
lib/protoparser: remove superfluous memory allocations during protocol parsing 2022-04-06 14:00:08 +03:00
Aliaksandr Valialkin
50cf74ce4b
lib/storage: reuse sync.WaitGroup objects
This reduces GC load by up to 10% according to memory profiling
2022-04-06 13:34:04 +03:00
Aliaksandr Valialkin
077193d87c
lib/cgroup: reduce the default GOGC value from 50% to 30%
This reduces memory usage under production workloads by up to 10%,
while CPU spent on GC remains roughly the same.

The CPU spent on GC can be monitored with go_memstats_gc_cpu_fraction metric
2022-04-06 13:32:07 +03:00
Aliaksandr Valialkin
319e910897
lib/workingsetcache: reuse prev cache after its reset
This should reduce memory churn rate
2022-04-05 20:37:45 +03:00
Aliaksandr Valialkin
29cebb3d95
lib/workingsetcache: check more frequently for cache size overflow
This should reduce the probability of cache size limit overflow
2022-04-05 18:05:43 +03:00
Aliaksandr Valialkin
4785d04312
lib/workingsetcache: reduce the expiration duration from 20 minutes to 10 minutes
This should reduce memory usage for the cache under high churn rate
2022-04-05 17:12:13 +03:00
Nikolay
0c0efc7781
vmctl verify-blocks command (#2390)
* lib/protoparser: changes ParseStream for native format
uses reader instead of http.Request
updates app/vmagent and app/vmagent method usage

* app/vmctl: add verify-block subcommand
it allows checking data blocks exported from VictoriaMetrics in native format
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2362

Update app/vmctl/README.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2022-04-05 16:01:32 +02:00
Nikolay
9a88c1a91e
lib/{storage,regexpcache}: replaces regexpCacheMap with LRU cache (#2293)
* lib/{storage,regexpcache}: replaces regexpCacheMap with LRU cache

It should decrease memory usage for regexp caching.
With cacheEntry stored by pointer, the golang map should be able to effectively shrink its size.
The original issue in this case was unexpected map growth and storage OOM.

Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

Adds missing metrics for regexp cache and regexpPrefixes cache

* wip

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-03-26 12:54:50 +02:00
Aliaksandr Valialkin
6e364e19ef
app/vmselect: add fine-grained limits for the number of returned/scanned time series for various APIs 2022-03-26 11:29:49 +02:00
Aliaksandr Valialkin
e3a10b327c
lib/blockcache: properly remove references to deleted parts
Previously references to deleted parts may remain active as cache.m keys.
This could prevent from proper memory de-allocation.
This could lead to increased memory usage for the following caches starting from v1.73.0:

* indexdb/indexBlocks
* indexdb/dataBlocks
* storage/indexBlocks

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2242
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007

This is a follow-up for 88605a7ea2
2022-03-18 17:07:59 +02:00
Aliaksandr Valialkin
2ae3a9a8a3
lib/storage: reduce the interval for checking for free disk space from 30 seconds to 1 second
This should reduce the probability of out of disk space panics when -storage.minFreeDiskSpaceBytes is set to low values.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2305
2022-03-18 16:52:27 +02:00
Aliaksandr Valialkin
88605a7ea2
lib/blockcache: properly release memory occupied by deleted entries
Previously the deleted entries could remain referenced via lastAccessHeap for long time.
This could lead to increased memory usage for the following caches starting from v1.73.0:

* indexdb/indexBlocks
* indexdb/dataBlocks
* storage/indexBlocks

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2242
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-03-18 16:52:27 +02:00
jduncan0000
e5868b9c29
Fix for issue #2255 - matchTagFilters for positive empty-match filters (#2304)
* fix for issue 2255 - matchTagFilters for positive empty-match filters

* add example to comments

* formatting

* add test for positive empty match

* formatting
2022-03-18 12:58:22 +02:00
Aliaksandr Valialkin
3eef1ddc7d
lib/storage: trashing -> thrashing typo in docs
This is a follow-up for 918ed5cb32
2022-03-16 13:05:26 +02:00
Vic (Shihang) Li
918ed5cb32
fix: change thrashing typo (#2317) 2022-03-15 07:05:52 +00:00
Aliaksandr Valialkin
0a4aadffac
lib/mergeset: remove aux buffers from inmemoryPart
This should reduce the size of inmemoryPart items and may improve performance a bit during registering new time series

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2247
2022-03-03 17:08:44 +02:00
Aliaksandr Valialkin
c84a8b34cc
lib/mergeset: eliminate copying of itemsData and lensData from storageBlock to inmemoryBlock
This should improve performance when registering new time series.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2247
2022-03-03 16:46:37 +02:00
Aliaksandr Valialkin
7da4068f48
lib/mergeset: consistency renaming: ip->mp for inmemoryPart vars 2022-03-03 15:48:22 +02:00
Aliaksandr Valialkin
e8fdb27625
lib/mergeset: move storageBlock from inmemoryPart to a sync.Pool
The lifetime of storageBlock is much shorter comparing to the lifetime of inmemoryPart,
so sync.Pool usage should reduce overall memory usage and improve performance
because of better locality of reference when marshaling inmemoryBlock to inmemoryPart.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2247
2022-03-03 15:44:02 +02:00
Aliaksandr Valialkin
59877d9f32
lib/{mergeset,storage}: tune compression levels for small blocks
This should reduce CPU usage spent on compression
2022-02-25 15:33:40 +02:00
Aliaksandr Valialkin
7e99bbb967
lib/storage: document why job-like and instance-like labels must be stored at mn.Tags[0] and mn.Tags[1]
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2244
2022-02-25 13:21:07 +02:00
Aliaksandr Valialkin
8bf3fb917a
lib/storage: add a comment to indexSearch.containsTimeRange() on why it allows false positives
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2239
2022-02-24 12:47:27 +02:00
Aliaksandr Valialkin
a16f1ae565
lib/storage: properly handle series selector matching multiple metric names plus a negative filter
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2238

This is a follow-up for 00cbb099b6
2022-02-24 12:15:54 +02:00
Aliaksandr Valialkin
af5bdb9254
lib/mergeset: remove superfluous sorting of inmemoryBlock.data at inmemoryBlock.sort()
There is no need to sort the underlying data according to sorted items there.
This should reduce cpu usage when registering new time series in `indexdb`.

Thanks to @ahfuzhang for the suggestion at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2245
2022-02-24 11:20:32 +02:00
Aliaksandr Valialkin
3f49bdaeff
lib/promrelabel: add support for conditional relabeling via if filter
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1998
2022-02-24 02:27:26 +02:00
Aliaksandr Valialkin
d128a5bf99
lib/workingsetcache: do not rotate cache if it is in whole state
This should reduce the maximum memory usage for the cache in `whole` state
2022-02-23 22:55:18 +02:00
Aliaksandr Valialkin
62b46007c5
lib/workingsetcache: reduce the default cache rotation period from hour to 20 minutes
This should reduce memory usage under high time series churn rate
2022-02-23 13:41:45 +02:00
Aliaksandr Valialkin
f72b35665f
lib/storage: optimize /api/v1/status/tsdb call by skipping all the artificially created tag entries at once
This is a follow-up for b71be42d90
2022-02-21 18:23:35 +02:00
Aliaksandr Valialkin
ed12c60826
lib/mergeset: typo fix after b6ed9afd6d 2022-02-21 17:58:22 +02:00
Aliaksandr Valialkin
5d45ea1003
lib/blockcache: evict entries from the cache in LRU order
This should improve hit rate for smaller caches
2022-02-21 17:44:24 +02:00
Roman Khavronenko
69d1893f4c
Consul SD - update services on the watcher's start (#2202)
* lib/discovery/consul: update services on the watcher's start

Previously, watcher's start was only initing goroutines for discovery
but not waiting for the first iteration to end. It means first Consul
discovery wasn't returning discovered targets until the next iteration.

The change makes the watcher's start blocking until we get first discovery
iteration done and all registries updated.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: remove workarounds for consul SD

Now when consul SD lib properly updates services
on the first start, we don't need workarounds in vmalert.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/discovery/consul: update after review

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-21 15:32:45 +02:00
Roman Khavronenko
b6ed9afd6d
lib: allow to configure cache size by type (#2206)
* lib: allow to configure cache size by type

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1940
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Apply suggestions from code review

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-21 13:50:34 +02:00
Aliaksandr Valialkin
2b87b4d183
lib/storage: typo fix after c3affb0c4f 2022-02-17 12:55:54 +02:00
Aliaksandr Valialkin
c3affb0c4f
lib/storage: simplify code for searching for label values
This is a follow-up after 9dd191b27c
2022-02-17 12:29:38 +02:00
Aliaksandr Valialkin
9dd191b27c
lib/storage: properly skip composite tag entries when searching for tag names or tag values
This is a follow-up for b71be42d90

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2200
2022-02-16 23:01:19 +02:00
Aliaksandr Valialkin
5366d9be73
lib/blockcache: fix TestCache by ensuring that the cache size can be divided by the number of cache shards
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2204
2022-02-16 18:47:35 +02:00
Aliaksandr Valialkin
6ff71474a6
lib/storage: document why tsid cache is reset before saving it to disk
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2205
2022-02-16 18:37:56 +02:00
Aliaksandr Valialkin
b71be42d90
lib/storage: use binary search instead of full scan for skipping artificial tags when searching for tag names or tag values
This should improve performance for /api/v1/labels and /api/v1/label/<label_name>/values

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2200
2022-02-16 18:15:41 +02:00
Roman Khavronenko
d91c1d4eee
vmagent: fix js error on CollapseAll/ExpandAll buttons click (#2192)
* vmagent: fix js error on CollapseAll/ExpandAll buttons click

`Uncaught TypeError: Cannot read properties of null (reading 'style')`

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-15 12:52:48 +02:00
Corporte Gadfly
ad6bdd78d0
match fileSDCheckInterval with prometheus file_sd_config default (#2188) 2022-02-15 12:04:26 +02:00
Aliaksandr Valialkin
1215f51043
docs/CHANGELOG.md: document 3d890e89f1 2022-02-14 17:39:12 +02:00
Nikolay
3d890e89f1
Adds server certificate reload for lib/http (#2186)
* Adds server certificate reload for lib/http
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2171

* Update lib/httpserver/httpserver.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-14 17:32:13 +02:00
Nikolay
c90c1c4d54
fixes all_tenants query option usage for openstack service discovery (#2184)
explicitly use the configuration parameter instead of adding it conditionally
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2182
2022-02-14 13:07:30 +02:00
Aliaksandr Valialkin
f10c38b827
lib/promscrape: add expand all and collapse all buttons to /targets page 2022-02-12 18:41:29 +02:00
Aliaksandr Valialkin
96dce63dbd
lib/storage: tune the logic for pre-populating of the per-day inverted index for the next day
- Postpone the pre-population to the last hour of the current day. This should reduce the number
  of useless entries in the next per-day index, which shouldn't be created there
  when the corresponding time series stop being pushed during the current day.

- Make the pre-population more smooth in time by using the hash of MetricID instead of MetricID itself
  when calculating the need for the given MetricID pre-population.

- Sync the logic for pre-population of the next day inverted index with the logic of pre-populating tsid cache
  after indexdb rotation. This should improve code maintainability.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/430
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401
2022-02-12 16:33:16 +02:00
artifactori
ea153e5f90
Show gce sdconfig zone on vmagent:8429/config (#2178)
* vmagent: add test for marshalling gce sdconfig with ZoneYAML

* vmagent: implement MarshalYAML for ZoneYAML on gce sdconfig
2022-02-12 00:39:23 +02:00
Roman Khavronenko
cf1a8bce6b
lib/index: reduce read/write load after indexDB rotation (#2177)
* lib/index: reduce read/write load after indexDB rotation

IndexDB in VM is responsible for storing TSID - ID's used for identifying
time series. The index is stored on disk and used by both ingestion and read path.

IndexDB is stored separately to data parts and is global for all stored data.
It can't be deleted partially as VM deletes data parts. Instead, indexDB is
rotated once in `retention` interval.

The rotation procedure means that `current` indexDB becomes `previous`,
and new freshly created indexDB struct becomes `current`. So in any time,
VM holds indexDB for current and previous retention periods.
When time series is ingested or queried, VM checks if its TSID is present
in `current` indexDB. If it is missing, it checks the `previous` indexDB.
If TSID was found, it gets copied to the `current` indexDB. In this way
`current` indexDB stores only series which were active during the retention
period.

To improve indexDB lookups, VM uses a cache layer called `tsidCache`. Both
write and read path consult `tsidCache` and on miss the real lookup happens.

When rotation happens, VM resets the `tsidCache`. This is needed for ingestion
path to trigger `current` indexDB re-population. Since index re-population
requires additional resources, every index rotation event may cause some extra
load on CPU and disk. While it may be unnoticeable for most of the cases,
for systems with very high number of unique series each rotation may lead
to performance degradation for some period of time.

This PR makes an attempt to smooth out resource usage after the rotation.
The changes are following:
1. `tsidCache` is no longer reset after the rotation;
2. Instead, each entry in `tsidCache` gains a notion of indexDB to which
they belong;
3. On ingestion path after the rotation we check if requested TSID was
found in `tsidCache`. Then we have 3 branches:
3.1 Fast path. It was found, and belongs to the `current` indexDB. Return TSID.
3.2 Slow path. It wasn't found, so we generate it from scratch,
add to `current` indexDB, add it to `tsidCache`.
3.3 Smooth path. It was found but does not belong to the `current` indexDB.
In this case, we add it to the `current` indexDB with some probability.
The probability is based on time passed since the last rotation with some threshold.
The more time has passed since rotation the higher is chance to re-populate `current` indexDB.
The default re-population interval in this PR is set to `1h`, during which entries from
`previous` index are supposed to slowly re-populate `current` index.

The new metric `vm_timeseries_repopulated_total` was added to identify how many TSIDs
were moved from `previous` indexDB to the `current` indexDB. This metric is supposed to
grow only during the first `1h` after the last rotation.
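
The "smooth path" decision described above can be sketched as a time-based probability. The linear ramp and the names `repopulateInterval` and `shouldRepopulate` are assumptions for illustration only. The message states just that the chance grows with the time passed since rotation and that the default interval is `1h`.

```go
package main

import (
	"fmt"
	"time"
)

// repopulateInterval is the window after rotation during which entries migrate.
// The 1h value comes from the commit message; the name is hypothetical.
const repopulateInterval = time.Hour

// shouldRepopulate reports whether a cached TSID that belongs to the previous
// indexDB should be copied into the current one. rnd is a random number in
// [0, 1) supplied by the caller, which keeps the function deterministic.
// Assumption: the probability ramps linearly from 0 to 1 over the interval.
func shouldRepopulate(sinceRotation time.Duration, rnd float64) bool {
	if sinceRotation >= repopulateInterval {
		// After the interval has elapsed, every remaining entry is migrated.
		return true
	}
	pct := float64(sinceRotation) / float64(repopulateInterval)
	return rnd < pct
}

func main() {
	for _, d := range []time.Duration{0, 15 * time.Minute, 45 * time.Minute, 2 * time.Hour} {
		fmt.Println(d, shouldRepopulate(d, 0.5))
	}
}
```

In a real ingestion path `rnd` would come from a random source, so the migration work is spread across the interval rather than happening at the moment of rotation.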

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-12 00:30:08 +02:00
Aliaksandr Valialkin
08428464e9
lib/storage: fix broken BenchmarkHeadPostingForMatchers for {i=~".*"} after f4dead529f
The commit f4dead529f makes such a query return nothing instead of all the time series.
This aligns more with Prometheus behaviour.
2022-02-12 00:27:10 +02:00
Roman Khavronenko
e3adcbec6e
lib/promscrape: support prometheus-like duration in scrape configs (#2169)
* lib/promscrape: support prometheus-like duration in scrape configs

The change allows to specify duration values like `1d`, `1w`
for fields `scrape_interval`, `scrape_timeout`, etc.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817#issuecomment-1033384766
Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/blockcache: make linter happy

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/promscrape: support prometheus-like duration in scrape configs

* add support for extra fields `scrape_align_interval` and `scrape_offset`;
* support Prometheus duration parsing for `__scrape_interval__`
and `__scrape_duration__` labels;

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

* wip

* docs/CHANGELOG.md: document the feature

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-11 16:17:00 +02:00
Aliaksandr Valialkin
3cb72ccc2a
lib/promscrape/discovery/kubernetes: add __meta_kubernetes_endpointslice_{label,annotation}* labels to be consistent with other role values for Kubernetes service discovery 2022-02-11 14:54:47 +02:00
Nikolay
4e7f7f3302
fixes service discovery for kubernetes (#2173)
* fixes service discovery for kubernetes
now it must take into account all pods that belong to the discovered endpoint and endpointslice
adds simple test for endpoints
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2134

* wip

* docs/CHANGELOG.md: document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-11 13:34:22 +02:00
Aliaksandr Valialkin
f9a17cb5fe
lib/mergeset: tune indexdb/{indexBlocks,dataBlocks} cache sizes further according to production stats
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-02-10 19:09:46 +02:00
Aliaksandr Valialkin
a9bb22b213
lib/blockcache: use higher number of shards for higher number of CPU cores
This should reduce mutex contention and increase performance

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-02-10 19:06:12 +02:00
Aliaksandr Valialkin
db8c4054e5
lib/promscrape: fix errors in test config
The errors were discovered after enabling strict parse mode by default.
See 9bb60ab00f
2022-02-08 19:56:37 +02:00
Aliaksandr Valialkin
4507b111a9
lib/blockcache: split the cache into multiple shards
This should reduce contention on cache mutex on hosts with many CPU cores,
which, in turn, should increase overall throughput for the cache.

This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-02-08 19:44:29 +02:00
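Splitting a cache into shards, as described in the commit above, can be sketched like this. It is a simplified illustration and not the lib/blockcache implementation: the types `shard` and `shardedCache`, the string keys and the FNV hash are all assumptions.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// shard is one independently locked part of the cache.
type shard struct {
	mu sync.Mutex
	m  map[string][]byte
}

// shardedCache spreads keys across shards, so goroutines working with
// different keys rarely contend on the same mutex.
type shardedCache struct {
	shards []shard
}

func newShardedCache(n int) *shardedCache {
	c := &shardedCache{shards: make([]shard, n)}
	for i := range c.shards {
		c.shards[i].m = make(map[string][]byte)
	}
	return c
}

// shardFor picks a shard by hashing the key.
func (c *shardedCache) shardFor(key string) *shard {
	h := fnv.New64a()
	h.Write([]byte(key))
	return &c.shards[h.Sum64()%uint64(len(c.shards))]
}

func (c *shardedCache) Put(key string, value []byte) {
	s := c.shardFor(key)
	s.mu.Lock()
	s.m[key] = value
	s.mu.Unlock()
}

func (c *shardedCache) Get(key string) ([]byte, bool) {
	s := c.shardFor(key)
	s.mu.Lock()
	v, ok := s.m[key]
	s.mu.Unlock()
	return v, ok
}

func main() {
	c := newShardedCache(8)
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			c.Put(fmt.Sprintf("block-%d", i), []byte{byte(i)})
		}(i)
	}
	wg.Wait()
	v, ok := c.Get("block-42")
	fmt.Println(v, ok)
}
```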
Aliaksandr Valialkin
2455a988e4
lib/mergeset: tune sizes for indexdb/dataBlocks and indexdb/indexBlocks according to production workload
This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007#issuecomment-1032308742
2022-02-08 17:58:49 +02:00
Aliaksandr Valialkin
9bb60ab00f
lib/promscrape: set -promscrape.config.strictParse to true by default
This allows detecting long-living silent errors in -promscrape.config
2022-02-08 15:41:43 +02:00
Aliaksandr Valialkin
a19e7f8c5b
lib/blockcache: make fmt 2022-02-08 15:24:11 +02:00
Aliaksandr Valialkin
d0f785defd
lib/blockcache: eliminate possible race when Cache.Put is called for the same entry from multiple goroutines
The race could result in incorrect cache size tracking, which, in turn, could result in too frequent cache cleaning
2022-02-08 01:10:43 +02:00
Aliaksandr Valialkin
46bd2c4d6d
lib/blockcache: increase the lifetime for rarely accessed blocks from 2 minutes to 5 minutes
This should improve data ingestion speed if time series samples are ingested with interval bigger than 2 minutes.
The actual interval could exceed 2 minutes if the original interval between samples doesn't exceed 2 minutes
in the case of slow inserts. Slow inserts may appear in the following cases:

* Big number of new time series are pushed to VictoriaMetrics, so they couldn't be registered in 2 minutes.
* MetricName->tsid cache reset on indexdb rotation or due to unclean shutdown.
  In this case VictoriaMetrics needs to load MetricName->tsid entries for all the incoming series from IndexDB.
  IndexDB uses the block cache for increasing lookup performance. If the cache doesn't contain the needed block,
  then IndexDB reads and unpacks the block from disk. This requires an extra disk read IO and CPU.
  See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007

This also should increase performance for periodically executed queries with intervals from 2 minutes to 5 minutes.
See the previous similar commit - 43103be011

It is possible that the timeout can be increased further. Let's collect production numbers for this change
so the timeout could be adjusted further.
2022-02-08 00:15:56 +02:00
Aliaksandr Valialkin
e86b7cc9a5
lib/workingsetcache: use the original cache size limits when rotating caches
Previously limits for new caches were taken from cache stats.
These limits could mismatch the original limits. This could result in failed cache load
if the stored cache has been created with the limits obtained from cache stats.
2022-02-08 00:10:14 +02:00
Aliaksandr Valialkin
cde4664f0d
lib/blockcache: return proper number of entries from the cache
This has been broken in 0d7374ad2f
2022-02-07 19:28:42 +02:00
Aliaksandr Valialkin
b5b3c585b3
lib/promscrape: show the total number of scrapes and the total number of scrape errors per target at /targets page
This information may be useful when debugging unreliable scrape targets
2022-02-03 20:22:41 +02:00
Aliaksandr Valialkin
2968779f16
lib/promscrape: provide the ability to fetch target responses on behalf of vmagent or single-node VictoriaMetrics
This feature may be useful when debugging metrics for the given target located in isolated environment
2022-02-03 19:00:55 +02:00
Aliaksandr Valialkin
9c62b25ad6
lib/mergeset: pre-allocate data and items for inmemoryBlock in order to reduce memory allocations under high churn rate
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-02-01 00:57:14 +02:00
Aliaksandr Valialkin
4bdd10ab90
lib/bytesutil: split Resize* funcs to MayOverallocate and NoOverallocate for more fine-grained control over memory allocations
Follow-up for f4989edd96

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-02-01 00:18:42 +02:00
Aliaksandr Valialkin
e13ce2ee98
lib/encoding: substitute 64-bits.LeadingZeros64() with bits.Len64() 2022-01-31 23:36:48 +02:00
Aliaksandr Valialkin
a8509c112a
lib/storage: avoid allocations of tsidPrev on every blockStreamReader.NextBlock() call
This is a follow-up for 00b7c97d2a

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2082
2022-01-31 22:46:53 +02:00
Aliaksandr Valialkin
f50cf60534
lib/cgroup: fall back to runtime.NumCPU() when determining process_cpu_cores_available metric if it is impossible to determine cpu quota via cgroups
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2107
2022-01-31 20:30:14 +02:00
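The fallback described in commit f50cf60534 can be sketched roughly as follows. This is a simplified illustration, not the lib/cgroup code: `availableCPUCores`, `quotaUs` and `periodUs` are hypothetical names, and reading the values from cgroup files is left out.

```go
package main

import (
	"fmt"
	"runtime"
)

// availableCPUCores is a hypothetical helper. quotaUs and periodUs stand for
// the cgroup CPU quota and period in microseconds. A non-positive value means
// that the quota could not be determined.
func availableCPUCores(quotaUs, periodUs int64) float64 {
	if quotaUs <= 0 || periodUs <= 0 {
		// Fall back to the number of logical CPUs visible to the process.
		return float64(runtime.NumCPU())
	}
	return float64(quotaUs) / float64(periodUs)
}

func main() {
	fmt.Println(availableCPUCores(200000, 100000)) // quota is known
	fmt.Println(availableCPUCores(-1, 100000))     // quota is unknown
}
```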
Aliaksandr Valialkin
ead66155ef
lib/cgroup: expose process_cpu_cores_available metric
This metric shows the number of CPU cores available to the process.
This allows creating alerting rules on CPU saturation with the following query:

    rate(process_cpu_seconds_total[5m]) / process_cpu_cores_available > 0.9

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2107
2022-01-31 20:24:41 +02:00
Aliaksandr Valialkin
96aa3761fc
lib/storage/table.go: add missing tb.ptwsLock.Unlock() before the return
This is a follow-up for a1083d0531

See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2103
2022-01-28 14:15:42 +02:00
匠心零度
1999bbfe82
optimized code (#2103)
* optimized code: only the first error is used, so there is no need for `var errors []error`

Co-authored-by: lirenzuo <lirenzuo@shein.com>
2022-01-28 14:15:41 +02:00
Aliaksandr Valialkin
f4989edd96
lib/bytesutil: split Resize() into ResizeNoCopy() and ResizeWithCopy() functions
Previously bytesutil.Resize() was copying the original byte slice contents to a newly allocated slice.
This wasted CPU cycles and memory bandwidth in some places, where the original slice contents weren't needed
after slice resizing. Switch such places to bytesutil.ResizeNoCopy().

Rename the original bytesutil.Resize() function to bytesutil.ResizeWithCopy() for the sake of improved readability.

Additionally, allocate new slice with `make()` instead of `append()`. This guarantees that the capacity of the allocated slice
exactly matches the requested size. The `append()` could return a slice with bigger capacity as an optimization for further `append()` calls.
This could result in excess memory usage when the returned byte slice was cached (for instance, in lib/blockcache).

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-25 15:24:44 +02:00
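The `make()` versus `append()` point from commit f4989edd96 can be illustrated with a small sketch. The functions `resizeNoCopy` and `resizeWithAppend` below are simplified stand-ins written for this example; they are not the bytesutil implementations.

```go
package main

import "fmt"

// resizeNoCopy returns a slice of length n. It reuses the backing array of b
// when it is big enough. Otherwise it allocates a new slice via make(), so
// the capacity equals n and the old contents are not copied.
func resizeNoCopy(b []byte, n int) []byte {
	if n <= cap(b) {
		return b[:n]
	}
	return make([]byte, n)
}

// resizeWithAppend grows b via append(). The old contents are preserved,
// and the runtime may allocate more capacity than requested.
func resizeWithAppend(b []byte, n int) []byte {
	if n <= cap(b) {
		return b[:n]
	}
	return append(b[:cap(b)], make([]byte, n-cap(b))...)
}

func main() {
	a := resizeNoCopy(nil, 1000)
	b := resizeWithAppend(nil, 1000)
	fmt.Println("make:  ", len(a), cap(a))
	fmt.Println("append:", len(b), cap(b))
}
```

The capacity printed for the `append()` variant depends on the Go runtime, which is the reason cached slices could occupy more memory than their length suggests.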
Aliaksandr Valialkin
91f2af2d7a
lib/mergeset: allocate the needed amounts of memory when unmarshaling inmemoryBlock
This should reduce the memory required for indexdb/dataBlocks cache.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-24 18:50:40 +02:00
Aliaksandr Valialkin
4c13bae1cf
lib/logger: removed broken test after 746ee191e8 2022-01-24 12:14:32 +02:00
Aliaksandr Valialkin
746ee191e8
lib/logger/throttler.go: show the original location of the error and warning message
Previously the location inside LogThrottler implementation was shown. This could complicate debugging.
2022-01-23 13:55:00 +02:00
Aliaksandr Valialkin
0d7374ad2f
lib/blockcache: optimize blockcache a bit
- Optimize Cache.RemoveBlocksFromPart(), so it doesn't need to iterate over all the cached blocks.
- Cache blocks if there were no cache misses during the last 2 minutes.
  This may be the case when new blocks are added simultaneously to the storage and to the cache.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-23 13:13:45 +02:00
Aliaksandr Valialkin
ede93469ea
lib/mergeset: tune caches size limits for indexdb/dataBlocks and indexdb/indexBlocks
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-21 12:45:43 +02:00
Aliaksandr Valialkin
5f84b17ed6
lib/storage: properly limit cardinality when ingesting multiple samples for the same time series in a single request 2022-01-21 12:38:09 +02:00
Aliaksandr Valialkin
00b7c97d2a
lib/storage: verify that blocks in a single part are sorted by TSID when reading sequential blocks from the part
This may help narrow down the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2082
2022-01-20 20:36:37 +02:00
Aliaksandr Valialkin
ea87f21e23
lib/storage: set bsm.Block to nil on error, so the previous block couldn't be used.
This may help nail down the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2082
2022-01-20 20:13:14 +02:00
Aliaksandr Valialkin
9797c928ef
lib/blockcache: add missing dependency after 145337792d
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-20 18:50:44 +02:00
Aliaksandr Valialkin
145337792d
lib/{mergeset,storage}: properly limit cache sizes for indexdb
Previously these caches could exceed limits set via `-memory.allowedPercent` and/or `-memory.allowedBytes`,
since limits were set independently per each data part. If the number of data parts was big, then limits could be exceeded,
which could result in out-of-memory errors.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-20 18:37:17 +02:00
Aliaksandr Valialkin
1d05444b33
lib/promscrape: expose promscrape_stale_samples_created_total metric for monitoring the number of created stale samples 2022-01-14 01:00:46 +02:00
Aliaksandr Valialkin
80f03177c4
lib/promscrape/discovery/kubernetes: add __meta_kubernetes_node_provider_id label for discovered Kubernetes nodes in the same way as Prometheus does
See https://github.com/prometheus/prometheus/pull/9603
2022-01-13 23:16:02 +02:00
Aliaksandr Valialkin
355a63733d
lib/promscrape/discovery/kubernetes: add the ability to limit service discovery to the current namespace
See https://github.com/prometheus/prometheus/issues/9782 and https://github.com/prometheus/prometheus/pull/9881
2022-01-13 22:44:35 +02:00
Aliaksandr Valialkin
17eb86a689
lib/promscrape/discovery/dockerswarm: follow up after 68a117a25a
- Document the bugfix at docs/CHANGELOG.md
- Set __address__ field after copying commonLabels to the resulting map of discovered labels.
  This makes sure that the correct __address__ label is used.
2022-01-11 09:20:10 +02:00
Alexander Shtuchkin
68a117a25a
Fix for #2038: Make correct __address__ value for dockerswarm promscrape (#2041) 2022-01-11 08:59:06 +02:00
Aliaksandr Valialkin
e4e36383e2
lib/promscrape: do not send staleness markers on graceful shutdown
This follows Prometheus behavior.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2013#issuecomment-1006994079
2022-01-07 01:17:57 +02:00
Aliaksandr Valialkin
178dd87e26
lib/storage: follow-up for 38bf5fc136 2022-01-05 16:00:11 +02:00
weng zhao
38bf5fc136
vmstorage: fix queries like {foo=~"bar|"} returning extra time series because of a malfunction in negative filter transformation (#2032)
1. L2749: make kb.B keep the value of commonPrefix instead of tf.prefix
2. L2762: avoid changing tf.value from "bar|" to ".+r|"
2022-01-05 15:59:15 +02:00
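Filters such as `{foo=~"bar|"}` from commit 38bf5fc136 are special because the regexp also matches the empty string, and in Prometheus-style selectors a missing label is treated as a label with an empty value. The sketch below only demonstrates this regexp property with the standard library; `matchesSelector` is a hypothetical helper and has nothing to do with the tag filter code in lib/storage.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesSelector reports whether labelValue matches the fully anchored
// regexp pattern, which is how label matchers treat regexps.
func matchesSelector(pattern, labelValue string) bool {
	re := regexp.MustCompile("^(?:" + pattern + ")$")
	return re.MatchString(labelValue)
}

func main() {
	for _, v := range []string{"bar", "", "baz"} {
		fmt.Printf("%q matches: %v\n", v, matchesSelector("bar|", v))
	}
}
```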
Aliaksandr Valialkin
cbaa2af280
lib/promscrape: scrape replicated targets at different offsets in vmagent replicated clustering mode
This guarantees that the deduplication consistently leaves samples from the same vmagent replica.

See https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets
2021-12-23 00:20:39 +02:00
Nikolay
8ff7da7202
adds restore.lock (#1988)
* adds restore.lock
it must prevent the storage from running after an incomplete restore process
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1958

* return back flock file deletion

* Apply suggestions from code review

* wip

* docs/CHANGELOG.md: document https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1958

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-22 13:10:15 +02:00
Aliaksandr Valialkin
ce333f28d8
all: use logger.WithThrottler() where appropriate 2021-12-21 17:03:25 +02:00
Roman Khavronenko
34fdc8881b
vmagent: add error log for skipped data block when rejected by receiv… (#1956)
* vmagent: add error log for skipped data block when rejected by receiving side

Previously, rejected data blocks were silently dropped - only metrics were updated.
From an operational perspective, having additional logging for such cases is preferable.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1911

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmagent: throttle log messages about skipped blocks

The new type of logger was added to the logger package.
This new type is supposed to limit the number of logged messages
over time.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/logger: make LogThrottler public, so its methods can be inspected by external packages

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-21 16:36:09 +02:00
Aliaksandr Valialkin
b9363d9726
lib/promscrape: take into account the original job_name when creating a unique key per each scrape target
This should handle the case when the original job_name has been changed in -promscrape.config,
while the resulting job label remains the same because it is overridden via relabeling.
2021-12-20 18:38:05 +02:00
Aliaksandr Valialkin
afafeb379a
all: typo fix: unexected -> unexpected 2021-12-20 17:39:52 +02:00
Aliaksandr Valialkin
5a36e241f4
lib/persistentqueue: check that readerOffset doesn't exceed writerOffset after each readerOffset increase
This should help detect the source of the panic from https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1981
2021-12-20 17:25:11 +02:00
Aliaksandr Valialkin
8a7f08ded3
lib/storage: properly update per-part min_dedup_interval file contents after merge
Previously 0s was always written even if -dedup.minScrapeInterval was set to non-zero value

This is a follow-up for 4ff647137a
2021-12-17 20:13:24 +02:00
Aliaksandr Valialkin
a3adf24527
lib/promscrape: allow up to 5 redirects when scraping a target by default
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1945
2021-12-16 00:14:14 +02:00
Aliaksandr Valialkin
4ff647137a
lib/storage: deduplicate samples more thoroughly
Previously some duplicate samples could be left on disk for time series with high churn rate.
This may result in higher disk space usage.
2021-12-15 15:59:58 +02:00
Aliaksandr Valialkin
92070cbb67
lib/storage: return dedup interval in milliseconds from GetDedupInterval()
This removes duplicate .Milliseconds() calls after GetDedupInterval() calls.
2021-12-15 13:26:38 +02:00
Aliaksandr Valialkin
1d20a19c7d
lib/storage: explicitly pass dedupInterval to DeduplicateSamples() and deduplicateSamplesDuringMerge()
This improves the code readability and debuggability, since the output of these functions
stops depending on global state.
2021-12-14 20:49:12 +02:00
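Commit 1d20a19c7d makes the deduplication output depend only on the function arguments. The sketch below shows what such a pure function could look like. It assumes that deduplication keeps the sample with the biggest timestamp in each discrete interval; both this rule and the `deduplicate` signature are simplifications for illustration, not the lib/storage code.

```go
package main

import "fmt"

// deduplicate keeps the last sample in each discrete dedupInterval window.
// timestamps must be non-negative, sorted in ascending order and expressed
// in the same units as dedupInterval (milliseconds in this sketch).
func deduplicate(timestamps []int64, values []float64, dedupInterval int64) ([]int64, []float64) {
	if dedupInterval <= 0 || len(timestamps) < 2 {
		return timestamps, values
	}
	var outTs []int64
	var outVals []float64
	for i, ts := range timestamps {
		isLast := i == len(timestamps)-1
		// Keep the sample if the next one belongs to another window.
		if isLast || timestamps[i+1]/dedupInterval != ts/dedupInterval {
			outTs = append(outTs, ts)
			outVals = append(outVals, values[i])
		}
	}
	return outTs, outVals
}

func main() {
	ts := []int64{0, 400, 900, 1000, 1500, 2100}
	vs := []float64{1, 2, 3, 4, 5, 6}
	fmt.Println(deduplicate(ts, vs, 1000))
}
```

Since the interval is an explicit argument, the same input can be checked against several intervals in a test without touching global flags.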
Aliaksandr Valialkin
e1a715b0f5
lib/storage: convert alternate regexps into Graphite wildcards inside __graphite__ pseudo-label
For example, `{__graphite__=~"foo.(bar|baz)"}` is automatically converted to `{__graphite__=~"foo.{bar,baz}"}` before execution.
This allows using multi-value Grafana template variables such as `{__graphite__=~"foo.($app)"}`.
2021-12-14 19:51:49 +02:00
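The conversion described in commit e1a715b0f5 can be approximated with a few lines of Go. This is a naive sketch: `alternationToGraphite` is a hypothetical helper, which rewrites every parenthesized group and does not handle escaping or other regexp syntax the way the real code may do.

```go
package main

import (
	"fmt"
	"strings"
)

// alternationToGraphite rewrites "(a|b)" groups into Graphite "{a,b}" lists.
func alternationToGraphite(s string) string {
	var sb strings.Builder
	depth := 0
	for _, r := range s {
		switch {
		case r == '(':
			depth++
			sb.WriteByte('{')
		case r == ')' && depth > 0:
			depth--
			sb.WriteByte('}')
		case r == '|' && depth > 0:
			sb.WriteByte(',')
		default:
			sb.WriteRune(r)
		}
	}
	return sb.String()
}

func main() {
	fmt.Println(alternationToGraphite("foo.(bar|baz)"))
}
```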
Yury Molodov
c1fd93e8a0
vmui: multiple queries (#1916)
* feat: change duration by "enter"

* fix: optimize data processing for chart

* feat: set minimum step to 1ms

* update dependencies

* feat: stop saving the last query to local storage

* fix: handle an error in a table with subqueries

* feat: store display type in URL

* Revert "feat: store display type in URL"

This reverts commit ccc242c69a.

* feat: store display type in URL

* refactor: move the time setting to a folder

* refactor: move the query configurator to a folder

* refactor: move the auth settings to a folder

* feat: improve styles

* feat: add multi query

* update package-lock

* feat: add display multiple queries

* feat: add limits for multiple queries

* update dependencies

* feat: add history for multiple queries

* feat: add line type to legend

* feat: change style for switch

* feat: change the logic for axes limits for multiple queries

* update package-lock.json

* update dependencies

* feat: add the filter to legend

* wip

* lib/httpserver: add missing 127.0.0.1 hostname to the logged address for http and pprof server if the address starts with ':'

This allows copy-pasting the url to http server from logs.

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-08 16:40:15 +02:00
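The lib/httpserver change listed in commit c1fd93e8a0 boils down to prefixing the address with a hostname when only a port is given. A minimal sketch, assuming a hypothetical helper named `loggableAddr`:

```go
package main

import (
	"fmt"
	"strings"
)

// loggableAddr returns addr in a form, which can be copy-pasted into a browser.
// Addresses such as ":8428" contain no hostname, so 127.0.0.1 is added.
func loggableAddr(addr string) string {
	if strings.HasPrefix(addr, ":") {
		return "127.0.0.1" + addr
	}
	return addr
}

func main() {
	fmt.Printf("started server at http://%s/\n", loggableAddr(":8428"))
}
```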
Aliaksandr Valialkin
45d082bbe2
app/vminsert: add -maxLabelValueLen command-line flag
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1908
2021-12-06 11:40:34 +02:00
Aliaksandr Valialkin
da402fbdfa
lib/workingsetcache: fix unaligned 64-bit atomic operation panic on 32-bit architectures
The panic has been introduced in 7275ebf91a
2021-12-03 01:21:51 +02:00
Aliaksandr Valialkin
06642d97f5
app: allow specifying http and https urls in the following command-line flags
* -promscrape.config
* -relabelConfig
* -remoteWrite.relabelConfig
* -remoteWrite.urlRelabelConfig
2021-12-03 00:10:02 +02:00
Aliaksandr Valialkin
62b4efb3e7
app/vmauth: follow-up for 13368bed18
* Document the ability to specify http or https urls in `-auth.config` at docs/CHANGELOG.md
* Move the ReadFileOrHTTP to lib/fs, so it can be re-used in other places where a file
  should be read from the given path. For example, in `-promscrape.config` at `vmagent`.
2021-12-02 23:32:05 +02:00
Aliaksandr Valialkin
394a345ae0
lib/httpserver: expose /-/healthy and /-/ready endpoints as Prometheus does
This improves integration with third-party solutions, which rely on these endpoints.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1833
2021-12-02 14:36:58 +02:00
Aliaksandr Valialkin
90c542af12
app: use relative paths instead of absolute paths for the supported http handlers on the main page
This allows hiding VictoriaMetrics components behind proxies, which serve pages at different path prefixes

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1858
2021-12-02 13:52:39 +02:00
Aliaksandr Valialkin
03f5ad3060
lib/protoparser/graphite: allow multiple separators between metric name, value and timestamp 2021-12-02 13:43:49 +02:00
Aliaksandr Valialkin
49a18b8660
lib/protoparser/graphite: properly parse Graphite line with whitespace after the timestamp
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1865
2021-12-02 13:33:26 +02:00
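Commits 03f5ad3060 and 49a18b8660 both relax whitespace handling in Graphite plaintext lines. A tolerant parser can be sketched with `strings.Fields`, which skips repeated separators as well as leading and trailing whitespace. The `parseGraphiteLine` function is hypothetical, ignores Graphite tags and is not the lib/protoparser/graphite implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseGraphiteLine parses "metric value [timestamp]" with arbitrary
// whitespace between and around the fields.
func parseGraphiteLine(line string) (string, float64, int64, error) {
	fields := strings.Fields(line)
	if len(fields) < 2 || len(fields) > 3 {
		return "", 0, 0, fmt.Errorf("expecting 2 or 3 fields; got %d in %q", len(fields), line)
	}
	value, err := strconv.ParseFloat(fields[1], 64)
	if err != nil {
		return "", 0, 0, fmt.Errorf("cannot parse value %q: %w", fields[1], err)
	}
	var ts int64
	if len(fields) == 3 {
		ts, err = strconv.ParseInt(fields[2], 10, 64)
		if err != nil {
			return "", 0, 0, fmt.Errorf("cannot parse timestamp %q: %w", fields[2], err)
		}
	}
	return fields[0], value, ts, nil
}

func main() {
	name, value, ts, err := parseGraphiteLine("foo.bar  \t 123.5   1638446000  ")
	fmt.Println(name, value, ts, err)
}
```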
Aliaksandr Valialkin
c0cbf0de2a
app/{vmbackup,vmrestore}: export internal metrics at /metrics http handler 2021-12-02 11:55:58 +02:00
Aliaksandr Valialkin
7275ebf91a
app/vmstorage: export vm_cache_size_max_bytes metrics for determining capacity of various caches
The vm_cache_size_max_bytes metric can be used for determining caches which reach their capacity via the following query:

   vm_cache_size_bytes / vm_cache_size_max_bytes > 0.9
2021-12-02 10:30:43 +02:00
Aliaksandr Valialkin
2f63dec2e3
lib/fs: add vm_filestream_read_duration_seconds_total and vm_filestream_write_duration_seconds_total metrics
These metrics help determine persistent disk saturation with `rate(vm_filestream_read_duration_seconds_total) > 0.9`
2021-12-02 10:30:42 +02:00
Aliaksandr Valialkin
2fb5a6ca78
lib/storage: do not take into account -storage.minFreeDiskSpaceBytes during background merges 2021-12-01 11:02:36 +02:00
Nikolay
06eff5a72c
removes FileSize from backup part key (#1872)
* removes FileSize from backup part key
it should fix download restoration for backups

* Update lib/backup/common/part.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-01 11:01:28 +02:00
Aliaksandr Valialkin
d666755159
lib/storage: take into account -storage.minFreeDiskSpaceBytes when performing big merges
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
2021-11-30 12:56:35 +02:00
guidao
f05cddd2fc
fix #1830 (#1861)
Co-authored-by: wangfeng <wangfeng@zhihu.com>
2021-11-30 01:12:24 +02:00
Aliaksandr Valialkin
ba927d1c77
lib/protoparser/prometheus: follow-up for 8e338632a3
Do not spend CPU time on error message formatting if error logger is disabled
2021-11-30 00:50:11 +02:00
Nikolay
8e338632a3
Changes unmarshallRow logger to noop for getRowsDiff (#1835) 2021-11-30 00:48:13 +02:00
Aliaksandr Valialkin
d44c585ca4
lib/protoparser: do not log connection reset by peer error when reading the data via InfluxDB, Graphite and OpenTSDB protocols over plain TCP connections
This error is expected, so there is no need to spam the log with it.
2021-11-29 21:47:56 +02:00
Aliaksandr Valialkin
b688960db0
lib/persistentqueue: add vm_persistentqueue_read_duration_seconds_total and vm_persistentqueue_write_duration_seconds_total metrics for determining disk usage saturation at vmagent 2021-11-17 16:41:35 +02:00
Lan
b72eed1f5e
Add flag of S3ForcePathStyle (#1802) 2021-11-17 01:03:03 +02:00
Aliaksandr Valialkin
e5ac9d8e57
all: consistently return application/json content-type without charset=utf-8
The `application/json` content-type has utf-8 encoding by default.
See https://stackoverflow.com/questions/9254891/what-does-content-type-application-json-charset-utf-8-really-mean

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897
2021-11-09 18:04:44 +02:00
Aliaksandr Valialkin
fd596945e7
lib/promscrape: improve logging for scrape_config_files parse errors
Log the actual file path, which led to the parse error.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1789
2021-11-08 13:34:12 +02:00
Aliaksandr Valialkin
cbfc7b7c92
app/{vminsert,vmagent}: hide passwords and auth tokens by default at /config page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1764
2021-11-05 14:41:16 +02:00
Aliaksandr Valialkin
e73a82f7a5
lib/promauth: do not show empty values in oauth2 config section at /config page 2021-11-05 12:53:39 +02:00
Aliaksandr Valialkin
aa534c2582
lib/promscrape: add -promscrape.maxResponseHeadersSize command-line flag for tuning the maximum http response headers size from Prometheus scrape targets 2021-11-03 22:26:56 +02:00
Aliaksandr Valialkin
d1eb87c831
app/{vmagent,vminsert}: add ability to restrict access to /config page with authKey query arg
The authKey can be configured via `-configAuthKey` command-line flag.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1764
2021-11-01 16:44:54 +02:00
Aliaksandr Valialkin
bb87949d5c
lib/protoparser/influx: automatically detect timestamp precision depending on the number of decimal digits in the timestamp 2021-10-28 12:47:22 +03:00
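Detecting timestamp precision from the number of decimal digits, as in commit bb87949d5c, can be sketched by comparing the timestamp against powers of ten. The thresholds and the `timestampToMillis` name below are assumptions chosen for illustration; the commit message doesn't state the exact boundaries.

```go
package main

import "fmt"

// timestampToMillis guesses the precision of ts by its magnitude
// and converts it to milliseconds.
func timestampToMillis(ts int64) int64 {
	switch {
	case ts >= 100_000_000_000_000_000: // 18+ digits: nanoseconds
		return ts / 1_000_000
	case ts >= 100_000_000_000_000: // 15-17 digits: microseconds
		return ts / 1_000
	case ts >= 100_000_000_000: // 12-14 digits: milliseconds
		return ts
	default: // fewer digits: seconds
		return ts * 1_000
	}
}

func main() {
	for _, ts := range []int64{
		1635414442,
		1635414442000,
		1635414442000000,
		1635414442000000000,
	} {
		fmt.Println(ts, "->", timestampToMillis(ts))
	}
}
```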