github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-01 14:47:38 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	f00a6bf837	all: add ability to push internal metrics to remote storage system specified via -pushmetrics.url	2022-07-21 20:15:29 +03:00
Aliaksandr Valialkin	2d1366353c	lib/promscrape: reload all the scrape configs when the `global` section is changed inside `-promscrape.config` See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2884	2022-07-18 17:15:42 +03:00
Boris Petersen	61e5f89cfb	fix assume role when running in ECS. (#2876 ) This fixes #2875 Signed-off-by: Boris Petersen <boris.petersen@idealo.de>	2022-07-18 12:37:33 +03:00
Aliaksandr Valialkin	979444b4ed	all: fix other typos in the same way as `6f4d9b2a48` does	2022-07-18 12:10:41 +03:00
zhenyuxie	14c6212a61	fix inmemoryBlock's Less method (#2881 )	2022-07-18 12:00:45 +03:00
Nikolay	c007b129cb	lib/promscrape: adds azure service discovery (#2743 ) * lib/promscrape: adds azure service discovery Adds azure service discovery mechanism implements authorization with oauth and msi lists virtual machines and virtual machines managed by scaleSet https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1364 * makes linter happy * Apply suggestions from code review Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> * wip Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-07-13 23:45:43 +03:00
guidao	f2d24a660b	add next retention metric (#2863 ) Co-authored-by: wangfeng <wangfeng@zhihu.com>	2022-07-13 12:41:22 +03:00
Dmytro Kozlov	5256af2291	lib/mergeset: fix linter error (#2864 )	2022-07-13 12:34:28 +03:00
Aliaksandr Valialkin	7cbcbea49d	lib/mergeset: optimize merge speed a bit Use heap.Fix instead of heap.Pop + heap.Push when merging blocks	2022-07-12 12:52:36 +03:00
Aliaksandr Valialkin	eab8ebbe11	all: `make fmt` via the upcoming Go1.19	2022-07-11 19:23:25 +03:00
Aliaksandr Valialkin	5794886662	lib/promscrape: properly set Host header when sending requests via http proxy Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2794	2022-07-07 02:28:47 +03:00
Aliaksandr Valialkin	95add1e8e4	app/{vmagent,vminsert}: follow-up after `d19e46de55` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2839	2022-07-07 01:32:11 +03:00
Aliaksandr Valialkin	4d03ac90fc	lib/promscrape/discovery/kubernetes: properly populate service-level labels for `role: endpointslice` targets Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2823	2022-07-07 00:36:25 +03:00
Aliaksandr Valialkin	c4cc45d7f8	lib/promscrape/discovery/kubernetes: allow attaching node-level labels to `role: endpoints` and `role: endpointslice` targets in the same way as Prometheus does See https://github.com/prometheus/prometheus/pull/10759	2022-07-07 00:36:24 +03:00
Aliaksandr Valialkin	f9303e494c	lib/promscrape: fix a test after `c66f676f3b`	2022-07-06 13:25:17 +03:00
Aliaksandr Valialkin	195dccf678	app/vmselect: add ability to query `vmselect` from another `vmselect`	2022-07-06 13:19:45 +03:00
Aliaksandr Valialkin	498c6d6e72	lib/promscrape: push `scrape_samples_limit` metric to remote storage if `sample_limit` option is set in `scrape_config` for this target See https://github.com/VictoriaMetrics/operator/issues/497	2022-07-06 12:46:23 +03:00
Aliaksandr Valialkin	b4489028f3	lib/storage: typo fix in MetricName.Unmarshal error	2022-07-06 12:46:23 +03:00
Aliaksandr Valialkin	1ec4dfd678	lib/vmselectapi: pass storage.SearchQuery to API calls instead of []*storage.TagFilters + storage.TimeRange + maxMetrics This reduces the number of args to vmselectapi calls	2022-07-06 12:46:22 +03:00
Aliaksandr Valialkin	2e721f7d16	lib/vmselectapi: rename Server.MustClose to more clear Server.MustStop	2022-07-06 12:46:22 +03:00
Aliaksandr Valialkin	270e555f47	lib/vmselectapi: pass maxSuffixes arg to tagValueSuffixes RPC call	2022-07-06 12:46:22 +03:00
Aliaksandr Valialkin	78eeca6f0d	lib/vmselectapi: rename deleteMetrics to more correct deleteSeries	2022-07-06 12:46:21 +03:00
Aliaksandr Valialkin	5afa54e845	lib/vmselectapi: use string type for tagKey and tagValuePrefix args at TagValueSuffixes() This improves the API consistency	2022-07-06 12:46:21 +03:00
Aliaksandr Valialkin	78f9a8aafd	lib/storage: put the (date, metricID) entry in dateMetricIDCache just after the corresponding series is registered in the per-day inverted index Previously the time series could be put into dateMetricIDCache without registering in the per-day inverted index if GetOrCreateTSIDByName finds TSID entry in the global index. This could lead to missing series in query results. The issue has been introduced in the commit `55e7afae3a`, which has been included in VictoriaMetrics v1.78.0	2022-07-05 14:56:55 +03:00
Aliaksandr Valialkin	ecc11dc32d	lib/promauth: refactor NewConfig in order to improve maintainability 1. Split NewConfig into smaller functions 2. Introduce Options struct for simplifying construction of the Config with various options This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2684	2022-07-04 14:31:43 +03:00
Aliaksandr Valialkin	7fc03a1deb	app/vmagent/remotewrite: add `-remoteWrite.header` command-line flag for setting additional http headers to send to -remoteWrite.url Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2805	2022-06-30 20:00:59 +03:00
Aliaksandr Valialkin	4fb0f15322	all: readability improvements for query traces - show dates in human-readable format, e.g. 2022-05-07, instead of a numeric value - limit the maximum length of queries and filters shown in trace messages	2022-06-30 18:19:43 +03:00
ttyv	00956e585d	lib/promscrape: fix vmagent tickerCh reload behaviour (#2786 ) Co-authored-by: Dmitriy <dab@ttyv.ru>	2022-06-30 13:52:44 +03:00
Aliaksandr Valialkin	7d5d33fd71	lib/storage: return marshaled metric names from SearchMetricNames Previously SearchMetricNames was returning unmarshaled metric names. This wasn't great for vmstorage, which should spend additional CPU time for marshaling the metric names before sending them to vmselect. While at it, remove possible duplicate metric names, which could occur when multiple samples for new time series are ingested via concurrent requests. Also sort the metric names before returning them to the client. This simplifies debugging of the returned metric names across repeated requests to /api/v1/series	2022-06-28 18:16:32 +03:00
Aliaksandr Valialkin	15da802f5f	lib/storage: put into query trace the number of found entries in SearchMetricNames	2022-06-28 14:52:39 +03:00
Aliaksandr Valialkin	399d4c36ae	app/vmselect: optimize /api/v1/series a bit for time ranges smaller than one day	2022-06-28 12:55:20 +03:00
Aliaksandr Valialkin	64505e924d	app/vmstorage: extract vmselect api server into a separate package - lib/vmselectapi This opens doors for implementing vmselect api server at vmselect level, so top-level vmselect could query lower-level vmselect nodes in the same way as it queries vmstorage nodes. This will create the ability to create highly available querying architecture when multiple independent VictoriaMetrics clusters with the same data are located in distinct availability zones. In this case we can use top-level vmselect instead of Promxy for simultaneous querying of all the clusters in all the AZs.	2022-06-27 14:20:41 +03:00
Aliaksandr Valialkin	6386f117c8	all: show timeRange in traces in human-readable format instead of timestamps in milliseconds	2022-06-27 13:42:57 +03:00
Aliaksandr Valialkin	926fccbb8d	lib/storage: add querytracer to more contexts querytracer has been added to the following storage.Storage methods: - RegisterMetricNames - DeleteMetrics - SearchTagValueSuffixes - SearchGraphitePaths	2022-06-27 12:53:49 +03:00
Aliaksandr Valialkin	6c66804fd3	all: locate throttled loggers via logger.WithThrottler() only once and then use them This reduces the contention on logThrottlerRegistryMu mutex when logger.WithThrottler() is called frequently from concurrent goroutines.	2022-06-27 12:34:30 +03:00
Aliaksandr Valialkin	71b0dfdefa	lib/promscrape: always send stale markers with the real scrape timestamp This guarantees that queries won't return data just after the series has disappeared.	2022-06-23 11:49:13 +03:00
Aliaksandr Valialkin	3ae6300497	lib/promauth: add ability to send additional http headers in requests to scrape targets This solves https://stackoverflow.com/questions/66032498/prometheus-scrape-metric-with-custom-header	2022-06-22 20:40:50 +03:00
Aliaksandr Valialkin	fe2269b999	all: remove explicit "xxhash" name when importing github.com/cespare/xxhash/v2 package This package already has the same name, so there is no need in explicit name	2022-06-21 20:24:28 +03:00
Loki's Wager	ca4730c00f	BugFix part_header.go (#2763 ) https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2757 Co-authored-by: haotingyi <haotingyi@corp.netease.com>	2022-06-21 15:59:11 +03:00
Aliaksandr Valialkin	288d13af8d	lib/netutil: parallelize background pings for remote addresses This should reduce the time needed for determining unavailable remote addresses across a large number of ConnPools. This is a follow-up for `a1629bd3be` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711	2022-06-21 13:32:27 +03:00
Aliaksandr Valialkin	a1629bd3be	lib/netutil.ConnPool: skip dialing remote address if the previous dial attempt was unsuccessful If the previous dial attempt was unsuccessful, then all the new dial attempts are skipped until the background goroutine determines that the given address can be successfully dialed. This reduces query latency when some of vmstorage nodes are unavailable and dialing them is slow. This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711 This commit is based on ideas from the https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2756 The main differences are: - The check for healthy/unhealthy storage nodes is moved one level lower from app/vmselect/netstorage to lib/netutil.ConnPool. This makes possible re-using this feature everywhere lib/netutil.ConnPool is used. - The check doesn't take into account handshake errors for already established connections. Handshake errors usually mean improperly configured VictoriaMetrics cluster, so they shouldn't be ignored.	2022-06-20 17:33:54 +03:00
Aliaksandr Valialkin	45e9732764	docs: follow-up after `e4d6b750f6`	2022-06-20 17:15:52 +03:00
Nikolay	15662c0f29	lib/httpserver: adds flagsAuthKey command-line flag (#2758 ) * lib/httpserver: adds flagsAuthKey command-line flag It protects /flags endpoint with authKey. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2753O * Apply suggestions from code review Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-06-20 17:15:51 +03:00
Aliaksandr Valialkin	b28c6febf9	app/{vminsert,vmselect}: add `-vmstorageDialTimeout` command-line flag for tuning the maximum time needed for establishing connections to vmstorage Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711	2022-06-20 15:17:34 +03:00
Aliaksandr Valialkin	270ad39359	lib/storage: properly take into account already registered series when `-storage.maxHourlySeries` or `-storage.maxDailySeries` limits are enabled The commit `5fb45173ae` takes into account only newly registered series when applying cardinality limits. This means that the cardinality limit could be exceeded with already registered series. This commit returns back accounting for already registered series when applying cardinality limits.	2022-06-20 13:53:41 +03:00
Aliaksandr Valialkin	7a79e7c0ef	lib/storage: create per-day indexes together with global indexes when registering new time series Previously the creation of per-day indexes and global indexes for the newly registered time series was decoupled. Now global indexes and per-day indexes for the current day are created together for new time series. This should speed up registering new time series a bit.	2022-06-19 22:32:41 +03:00
Aliaksandr Valialkin	88e1221b35	lib/storage: do not register new series if `-storage.maxHourlySeries` or `-storage.maxDailySeries` limits are exceeded Previously samples for new series weren't added as expected when series limits were reached, but new series were still registered in indexdb.	2022-06-19 22:03:02 +03:00
Aliaksandr Valialkin	c5ac176153	lib/storage: reset metric id caches for the previous and the current hour Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698	2022-06-19 22:02:51 +03:00
Aliaksandr Valialkin	450aa0ae5a	lib/promrelabel: support `action: graphite` relabeling Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2737	2022-06-16 20:25:49 +03:00
Aliaksandr Valialkin	45fa9d798d	app/vmselect: accept `focusLabel` query arg at /api/v1/status/tsdb	2022-06-14 18:39:00 +03:00
Aliaksandr Valialkin	fb77843639	lib/storage: show top labels with the highest number of series in cardinality explorer	2022-06-14 16:34:13 +03:00
Aliaksandr Valialkin	3167fbc21d	lib/storage: improve error message when -search.max* command-line flag values are exceeded	2022-06-14 13:28:21 +03:00
Nikolay	e23af8f05c	lib/httpserver: backport changes from master branch (#2697 ) * lib/httpserver: backport changes from master branch adds basicAuth adds authKey check for /metrics and /debug/pprof requests it should improve security for cluster components * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-06-14 13:02:44 +03:00
Aliaksandr Valialkin	4af43a4a75	lib/storage: test GetTSDBStatusWithFiltersForDate on a global time range	2022-06-12 14:28:37 +03:00
Aliaksandr Valialkin	61e03f172b	app/vmselect: optimize `/api/v1/labels` and `/api/v1/label/.../values` handlers when `match[]` query arg is passed to them	2022-06-12 14:06:24 +03:00
Aliaksandr Valialkin	cb39eada77	all: improve query tracing coverage for indexdb search Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403	2022-06-09 20:04:02 +03:00
Howie	4afd7aa695	feat: rule limit (#2676 ) vmalert: support `limit` param in groups definition `limit` param limits number of time series samples produced by a single rule during execution. On reaching the limit rule will return an err. Signed-off-by: lihaowei <haoweili35@gmail.com>	2022-06-09 13:15:33 +03:00
Aliaksandr Valialkin	a9ea3fee38	lib/querytracer: make it easier to use by passing trace context message to New and NewChild The context message can be extended by calling Donef. If there is no need to extend the message, then just call Done.	2022-06-08 21:16:12 +03:00
Dmytro Kozlov	f2754c3e90	Cardinality explorer (#2625 ) * Cardinality explorer * vmui, vmselect: updated field name, added description to spinner * make vmui-update * updated const name, make vmui-update * lib/storage: changes calculation for totalSeries values * added static files * wip * wip * wip * wip * docs/CHANGELOG.md: document cardinality explorer feature See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2233 Co-authored-by: f41gh7 <nik@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-06-08 18:54:27 +03:00
Roman Khavronenko	2b5e1dee91	vmagent: update SD duration histogram metric if SD is active (#2677) The change updates the histogram for registering SD update duration only if SD is considered `active`. SD is active if at least one scraper for this SD has started. This change is supposed to reduce the metrics cardinality produced by the duration histogram, which gets updated even if SD isn't configured. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2671 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-06-07 15:53:06 +03:00
Roman Khavronenko	5f33445f66	lib/storage: limit max mergeConcurrency value for systems with high number of CPUs (#2673) Workers count for merges affects the max part size during merges. Such behaviour protects storage from running out of disk space for the scenario when all workers are merging parts with the max size. This works very well for most cases. But for systems where a high number of CPUs is allocated for vmstorage components this could significantly impact the max part size and result in more unmerged parts than expected. While checking multiple highly loaded production setups it was discovered that `max_over_time(vm_active_merges{type="storage/big"}[1h])` rarely exceeds 2, and `max_over_time(vm_active_merges{type="storage/small"}[1h])` rarely exceeds 4. The change in this commit limits the max value for concurrency accordingly. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2022-06-07 15:02:55 +03:00
Aliaksandr Valialkin	b6e3c12811	lib/promscrape/discovery/kubernetes: use unsupportedFieldError() function instead of errContext string This improves code readability and maintainability a bit, since the format string is passed as string literal into fmt.Errorf.	2022-06-07 01:24:14 +03:00
Aliaksandr Valialkin	68b6ddfb14	all: follow-up after `8edb390e21` - Remove unused js bloatware from /targets page. This strips down binary size by more than 100Kb - Add /service-discovery page for API compatibility with Prometheus - Properly load bootstrap.min.css from /prometheus/targets - Serve static contents for /targets page from app/vminsert instead of app/vmselect, because /targets page is served from there	2022-06-07 01:05:53 +03:00
Aliaksandr Valialkin	3dbb19d624	lib/promscrape/discovery/kubernetes: follow-up after `006b8c7534` - make more clear error logs - simplify testing for newKubeConfig by passing only the path to kube_config file instead of SDConfig struct	2022-06-06 14:41:28 +03:00
Aliaksandr Valialkin	dd0d773c13	lib/promauth: follow-up after `006b8c7534` - Take into account `ca`, `key` and `cert` values when generating string representation of TLSConfig. Print hashes instead of real values because of security considerations. - Properly update Config.tlsCertDigets when `key` and `cert` values are set. This allows properly updating scrape targets after these values are updated in configs. - Do not re-generate certificate from `key` and `cert` values per each call to getTLSCert, because these values are immutable. - Do not set `ca` value from `ca_file` value, so it isn't exposed at `/config` page. - Generate proper error messages on incorrect `key`, `cert` or `ca` values.	2022-06-04 01:11:23 +03:00
Aliaksandr Valialkin	6c2fb9d8c4	lib/promscrape: add `-promscrape.cluster.name` command-line flag This flag is used for proper data de-duplication when the same target is scraped from multiple vmagent clusters. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2679	2022-06-04 01:11:23 +03:00
Dmytro Kozlov	ce8aade80e	lib/promscrape: adds service discovery visualization for /targets page(#2675 ) * lib/promscrape: updated template * lib/promscrape: fixed click on unhealthy and all btns * app/vmselect: jquery scripts into static folder Co-authored-by: f41gh7 <nik@victoriametrics.com>	2022-06-04 01:11:23 +03:00
Nikolay	72e43ef2fe	lib/promscrape/discovery/kubernetes: follow-up after `0b5c874911` (#2672 )	2022-06-04 01:11:23 +03:00
hadesy	28d4624f60	promscrape/discovery: support kubeconfig (#2533 )	2022-06-04 01:11:23 +03:00
Aliaksandr Valialkin	cc226e6ebe	docs/CHANGELOG.md: follow-up after `2177089f94`	2022-06-01 14:57:39 +03:00
Roman Khavronenko	e9ee043879	lib/storage: make `indexdb/tagFilters` cache size configurable (#2667 ) The default size of `indexdb/tagFilters` now can be overridden via `storage.cacheSizeIndexDBTagFilters` flag. Please, be careful with changing default size since it may lead to inefficient work of the vmstorage or OOM exceptions. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2663 Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2022-06-01 14:57:39 +03:00
Roman Khavronenko	bca90d7148	promrelabel: add support of `lowercase` and `uppercase` relabeling actions (#2665 ) * promrelabel: add support of `lowercase` and `uppercase` relabeling actions https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2664 Signed-off-by: hagen1778 <roman@victoriametrics.com> * lib/storage: make golangci-lint happy Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2022-06-01 14:57:39 +03:00
Aliaksandr Valialkin	fedfc9e686	lib/storage: stop background merge when storage enters read-only mode This should prevent from `no space left on device` errors when VictoriaMetrics under-estimates the additional disk space needed for background merge. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2603	2022-06-01 14:22:12 +03:00
Aliaksandr Valialkin	afced37c0b	all: add initial support for query tracing See https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#query-tracing Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1403	2022-06-01 02:31:44 +03:00
Aliaksandr Valialkin	386f6110ec	lib/promscrape: use strconv.Atoi instead of strconv.ParseInt for parsing -promscrape.cluster.memberNum In this case there is no need in converting int64 to int	2022-06-01 01:43:25 +03:00
Aliaksandr Valialkin	945e9fa8c4	lib/storage: `make fmt`	2022-05-31 12:42:48 +03:00
Aliaksandr Valialkin	727cc119b6	lib/storage: do not take into account series from the next day when `match[]` filter is passed to /api/v1/status/tsdb	2022-05-31 12:42:48 +03:00
Dmytro Kozlov	cd1fa2e4cd	issue-2594: use embedded for static files (#2650 ) embed static js and css files from CDN into vmalert, vmagent and vmsingle binaries. Co-authored-by: f41gh7 <nik@victoriametrics.com> https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2594	2022-05-31 12:42:48 +03:00
Dmytro Kozlov	6add79143b	removed redundant return (fixed linter) (#2647 ) * removed redundant return * updated lint package version	2022-05-30 12:25:58 +03:00
Aliaksandr Valialkin	f149d56ac2	lib/promscrape: add -promscrape.suppressScrapeErrorsDelay command-line flag This flag can be used for reducing the amounts of logs when scraping unreliable scrape targets. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2575 The patch is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2576 . Thanks to @jelmd .	2022-05-25 23:00:30 +03:00
Aliaksandr Valialkin	38beb9fe04	lib/storage: add ability to change the indexdb rotation time offset with -retentionTimezoneOffset command-line flag This is a follow-up for `0fbf59199a` See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2574	2022-05-25 16:07:14 +03:00
阳明	e4df648ea0	lib/storage: Remove the effect of time zone on next retention period (#2568 ) (#2574 )	2022-05-25 15:10:19 +03:00
Roman Khavronenko	7406665fc3	lib/promscrape/discovery/kubernetes: fixes kubernetes service discovery (#2615) * lib/promscrape/discovery/kubernetes: properly updates discovered scrape works previously, added or updated scrapeworks may override previously discovered ones. It happens because swosByKey may contain a small subset of kubernetes objects with its labels. It happens for objectsUpdated and objectsAdded maps, which include only changed elements * Properly calculate vm_promscrape_discovery_kubernetes_scrape_works Co-authored-by: f41gh7 <nik@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-05-21 01:17:21 +03:00
Boris Petersen	3a8b4fab97	Add ability to sign requests for all AWS services (#2604 ) This adds the ability to utilize sigv4 signing for all AWS services not just "aps". When the newly introduced property "service" is not set it will default to "aps". Signed-off-by: Boris Petersen <boris.petersen@idealo.de>	2022-05-20 14:20:00 +03:00
Aliaksandr Valialkin	116c0b8f2e	docs/vmagent.md: typo fix in the description for `-promscrape.cluster.replicationFactor` command-line flag	2022-05-12 18:51:20 +03:00
Aliaksandr Valialkin	d8a276fbe4	lib/netutil: limit the number of concurrently established connections when calling ConnPool.Get() This should reduce potential spikes in the number of established connections in the following cases: - when the connection establishing procedure becomes temporarily slow - after a temporary spike in the rate of ConnPool.Get() calls See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2552	2022-05-11 14:11:06 +03:00
Aliaksandr Valialkin	0d0561ca8c	lib/awsapi: remove whitelist arg from GetFiltersQueryString(), since it may break new filters in the future Let users decide which filters to use. If users start using disallowed filters, then AWS will return an error.	2022-05-09 15:34:56 +03:00
Aliaksandr Valialkin	810dd74fb9	lib/promscrape: properly implement ScrapeConfig.clone() Previously ScrapeConfig.clone() was improperly copying promauth.Secret fields - their contents were replaced with `<secret>` value. This led to inability to use passwords and secrets in `-promscrape.config` file. The bug has been introduced in v1.77.0 in the commit `67b10896d2` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2551	2022-05-07 00:06:19 +03:00
Aliaksandr Valialkin	af0da45d3e	lib/promscrape: rename `promscrape_stale_samples_created_total` metric to `vm_promscrape_stale_samples_created_total`, so its name is consistent with the rest of `vm_promscrape_` metrics	2022-05-06 15:33:43 +03:00
Aliaksandr Valialkin	9d40bb7137	lib/promscrape/discovery/ec2: add ability to filter Availability Zones in `ec2_sd_config` via `az_filters` section	2022-05-06 12:44:01 +03:00
Aliaksandr Valialkin	2ce1d09135	lib/promscrape/discovery/ec2: properly pass filters to DescribeAvailabilityZones API call Previously filters weren't passed to this call after the commit `0e09fdb8b0` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1626	2022-05-05 11:01:17 +03:00
Aliaksandr Valialkin	873f55bac5	lib/awsapi: pass `filtersQueryString` arg to GetEC2APIResponse() function, so the caller could decide whether to use the filters during the AWS API query The filters shouldn't be passed to DescribeAvailabilityZones API call. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1626 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1287 Related commits: `0e09fdb8b0` `d289ecded1`	2022-05-05 10:29:47 +03:00
Dmytro Kozlov	4f40dc9829	{vmbackup, vmbackup/snapshot}: fixed problem with snapshot backup in another snapshot folder (#2535) * {vmbackup, vmbackup/snapshot}: validate snapshot name * vmbackup/snapshot: added other checks * backup/actions: added check that we ignore backup_complete.ignore file * vmbackup: moved snapshot to lib directory * lib/snapshot: added functions description * lib/snapshot: fixed typo * vmbackup: code cleanup * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-05-04 22:12:48 +03:00
Nikolay	7e58cba6cf	{lib/promscrape,app/vmagent}: adds sigv4 support for vmagent remoteWrite (#2458 ) * {lib/promscrape,app/vmagent}: adds sigv4 support for vmagent remoteWrite moves aws related code into separate lib from lib/promscrape it allows to write data from vmagent to the AWS managed prometheus (cortex) https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1287 * Apply suggestions from code review * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-05-04 20:28:37 +03:00
Nikolay	51a77759c1	lib/promscrape: adds correct http status codes for redirect (#2530 ) standard http client accepts multiple http status codes as redirect it should fix issue with incorrect redirects https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2482	2022-05-03 14:01:57 +03:00
Aliaksandr Valialkin	361b08c30e	lib/storage: leave the last sample per each discrete interval during the deduplication This aligns better with staleness logic in Prometheus - https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness	2022-05-02 21:59:31 +03:00
Aliaksandr Valialkin	190c8b463c	lib/netutil: close connections in ConnPool if they are idle for more than 30 seconds Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2508	2022-05-02 15:01:52 +03:00
Artem Navoiev	11db05a4ff	lib/{storage,flagutil} - Add option for snapshot autoremoval (#2487 ) * lib/{storage,flagutil} - Add option for snapshot autoremoval - add prometheus-like duration as command flag - add option to delete stale snapshots - update duration.go flag to re-use own code * wip * lib/flagutil: re-use Duration.Set() call in NewDuration * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-05-02 11:24:12 +03:00
Aliaksandr Valialkin	a436836402	lib/flagutil: re-use Duration.Set() call in NewDuration	2022-05-02 10:58:08 +03:00
Dima Lazerka	837e440865	Fix targetstatus qtpl paths (#2517 ) Ran `make quicktemplate-gen` from the root directory	2022-04-29 11:18:14 +03:00
Aliaksandr Valialkin	aa82987d70	lib/promscrape/discovery/kubernetes: do not drop pod meta-labels even if the corresponding node objects are missing This reflects the logic used in Prometheus. See https://github.com/prometheus/prometheus/pull/10080	2022-04-26 15:27:42 +03:00
Aliaksandr Valialkin	a85ef60b4b	lib/promauth: take into account tls_config and proxy_url when serializing OAuth2Config to string	2022-04-23 00:24:13 +03:00
Aliaksandr Valialkin	4c3cd96db5	lib/promauth: add support for `min_version` option at `tls_config` section in the same way as Prometheus does	2022-04-23 00:24:11 +03:00
Aliaksandr Valialkin	808a2f3b61	lib/promauth: add support for `proxy_url` option at `oauth2` section in the same way as Prometheus does	2022-04-23 00:01:53 +03:00
Aliaksandr Valialkin	4ade8511e2	lib/promauth: add support for `tls_config` section at `oauth2` config in the same way as Prometheus does	2022-04-23 00:01:52 +03:00
Aliaksandr Valialkin	c2b13e6a04	lib/promscrape/discovery/kubernetes: limit the minimum sleep time between updating dependent ScrapeWork objects Previously the sleep time could be dropped to nanoseconds, which could result in CPU time waste	2022-04-22 23:15:34 +03:00
Aliaksandr Valialkin	a89e31b304	lib/promscrape/discovery/kubernetes: allow attaching node-level labels and annotations to discovered pod targets in the same way as Prometheus 2.35 does See https://github.com/prometheus/prometheus/issues/9510 and https://github.com/prometheus/prometheus/pull/10080	2022-04-22 20:15:34 +03:00
Aliaksandr Valialkin	cc6eae6992	lib/promscrape/discovery/kubernetes: improve the performance of urlWatcher.reloadObjects() on multi-CPU systems Parallelize the generation of ScrapeWork objects there. Previously they were generated in a single goroutine.	2022-04-22 13:23:39 +03:00
Aliaksandr Valialkin	60f74dab56	lib/promscrape: prevent from memory leaks on -promscrape.config reload when only a small part of scrape jobs is updated This is a follow-up after `26b78ad707`	2022-04-22 13:23:37 +03:00
Aliaksandr Valialkin	ed1b394a1a	app/vmstorage: expose `vm_indexdb_items_added_total` and `vm_indexdb_items_added_size_bytes_total` counters at `/metrics` page These counters can be used for monitoring the rate of addition of new entries in indexdb (aka inverted index). See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2471	2022-04-21 13:19:42 +03:00
Aliaksandr Valialkin	fea9d1e6ee	lib/promscrape/discovery/kubernetes: properly update endpoints and endpointslice objects when the related pod or service objects are updated Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240 This is a follow-up for `2341bd48d7`	2022-04-21 13:06:49 +03:00
Aliaksandr Valialkin	1e0517b9cd	lib/promscrape: remove possible data race when cleaning up internStringsMap	2022-04-20 18:41:23 +03:00
Aliaksandr Valialkin	1ae16bf671	lib/promscrape: zero out labels after duplicate removal inside mergeLabels()	2022-04-20 18:35:27 +03:00
Aliaksandr Valialkin	e9f08b1e6a	lib/promscrape/discovery/kubernetes: do not pre-allocate memory for ScrapeWork objects There is high chance that ScrapeWork objects won't be generated because of relabeling	2022-04-20 16:42:41 +03:00
Aliaksandr Valialkin	909a3ee0e4	lib/promscrape: follow-up after `91e290a8ff`	2022-04-20 16:12:26 +03:00
Nikolay	429848a67d	lib/promscrape: reduce latency for k8s GetLabels (#2454 ) replaces internStringMap with sync.Map - it greatly reduces lock contention concurrently reload scrape work for api watcher - each object labels added by dedicated CPU changes can be tested with following script https://gist.github.com/f41gh7/6f8f8d8719786aff1f18a85c23aebf70	2022-04-20 16:12:25 +03:00
Dmytro Kozlov	9dbfd99777	lib/promscrape: simply update UI (#2479 ) * lib/promscrape: simply update UI * lib/promscrape: added vm icon	2022-04-20 15:38:04 +03:00
Aliaksandr Valialkin	45385a5dc6	lib/promscrape: optimize getScrapeWork() function Reduce the number of memory allocations in this function. This improves its performance by up to 50%. This should improve service discovery speed when big number of potential targets with big number of meta-labels are generated by service discovery. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2270	2022-04-20 15:34:18 +03:00
Aliaksandr Valialkin	bfa0b8f710	lib/promscrape: use a hash over target labels as a key for dropped targets' map This reduces the number of allocations and improves the performance for updating dropped targets' map. This map is exposed at /api/v1/targets as in droppedTargets list.	2022-04-20 15:23:54 +03:00
Aliaksandr Valialkin	d0bac8e224	all: typo fix: Kuberntes -> Kubernetes	2022-04-20 10:51:41 +03:00
Dmytro Kozlov	17552dba8b	lib/promscrape: Enable filters for endpoint and labels (#2466 ) * lib/promscrape: Enable filters for endpoint and labels * lib/promscrape: cleanup * lib/promscrape: update template * lib/promscrape: move logic filter logic to backend * lib/promscrape: updated placeholder * lib/promscrape: updated placeholder * lib/promscrape: use two different fields for filters, updated form, added error on parsing queries * lib/promscrape: rename functions * lib/promscrape: removed unused values * wip * wip * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-04-19 18:27:44 +03:00
Nikolay	628905f080	lib/promscrape: adds job restart method (#2455 ) * lib/promscrape: adds job restart method it must restart only ScrapeConfig with changed content this change greatly reduce time, that needed for job restart and it should decrease possible data loss when config frequently changed at kubernetes based deployments Apply suggestions from code review Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> * wip Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-04-16 20:29:33 +03:00
Aliaksandr Valialkin	7debf57ca6	lib/httpserver: clarify that `-tls` flag enables TLS for http requests to `-httpListenAddr`	2022-04-16 16:59:41 +03:00
Aliaksandr Valialkin	a7689e1b0c	app/vmstorage: add support for mTLS cipher suites via `-cluster.tlsCipherSuites` command-line flag Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2404	2022-04-16 16:36:38 +03:00
Aliaksandr Valialkin	27e74f25d6	lib/httpserver: follow up after `def0032c7d`	2022-04-16 15:52:44 +03:00
Dmytro Kozlov	26ae50ec26	lib/httpserver: added tlsCipherSuites flag (#2468 ) * lib/httpserver: added tlsCipherSuites flag * lib/httpserver: compare lower case strings * lib/httpserver: use EqualFold * lib/httpserver: used flagutil.NewArray, supported only strings cipher suites * lib/httpserver: updated flag description, added flag to documentation * Update lib/httpserver/httpserver.go Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-04-16 15:52:42 +03:00
Aliaksandr Valialkin	c50e48a74c	lib/promscrape: follow-up after `baa1c24b36`	2022-04-16 14:26:38 +03:00
Nikolay	a56ee034af	lib/promscrape: removes omitempty for ScrapeConfig (#2457 ) This change fixes incorrect marshalling for ScrapeConfig it affects http endpoint and ScrapeConfig checksum. With omitempty, custom Marshaller is not called if field is not a pointer. Previously this issue happened at vmalert	2022-04-16 14:26:36 +03:00
Aliaksandr Valialkin	4a3172f150	lib/encoding: explicitly set slice length passed to binary.BigEndian.Uint* This allows Go compiler to generate more optimal code without bound checks	2022-04-12 12:56:52 +03:00
Aliaksandr Valialkin	70ad171070	lib/promscrape: follow-up after `7e79adfb55`	2022-04-12 12:37:03 +03:00
Nikolay	e26bcb8bbb	lib/promscrape: allows to use k8s pod name as clusterMemberNum (#2436 ) * lib/promscrape: allows to use k8s pod name as clusterMemberNum it must improve user experience and simplify clustering scrapers. it must allow to use vmagent cluster with distroless images https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2359 * Apply suggestions from code review Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-04-12 12:37:02 +03:00
Aliaksandr Valialkin	81b7a31cb1	app/vmstorage: properly handle `maxSeries` limit passed from vmselect to vmstorage	2022-04-12 11:19:07 +03:00
Aliaksandr Valialkin	e3bf464f11	lib/protoparser/native: follow-up after `fe01f4803d`	2022-04-11 19:27:53 +03:00
Nikolay	39225fc809	lib/protoparser/native: fixes parseStream dead-lock (#2423 ) previously, if native block cannot be unmarshaled, wg.Done wasn't called by unmarshal work. It leads to connection blocking and possible dead-lock at client side	2022-04-11 19:27:51 +03:00
Aliaksandr Valialkin	edb139cfe4	lib/memory: export `process_memory_limit_bytes` metric, which shows the amounts of memory the current process has access to This metric is equivalent to `vm_available_memory_bytes`, but it has better name, since the metric is related to a process, not VictoriaMetrics itself. Leave `vm_available_memory_bytes` for backwards compatibility.	2022-04-07 15:24:08 +03:00
Aliaksandr Valialkin	cb319b15bb	lib/storage: increase the number of rawRowsShard shards on systems with more than 4 CPU cores This should improve data ingestion scalability on systems with many CPU cores	2022-04-06 19:50:41 +03:00
Aliaksandr Valialkin	8ef9348801	lib/mergeset: use more rawItemsShard shards on multi-CPU systems This should improve the scalability for registering of new time series on multi-CPU system	2022-04-06 19:50:41 +03:00
Aliaksandr Valialkin	db00ddd23e	lib/mergeset: skip common prefixes when comparing inmemoryBlock items This should improve the performance for items sorting inside inmemoryBlock.MarshalUnsortedData if they have common prefix. While at it, improve the performance for inmemoryBlock.updateCommonPrefix for sorted items. This should improve performance for inmemoryBlock.MarshalSortedData during background merge.	2022-04-06 18:55:25 +03:00
Aliaksandr Valialkin	88c2631320	lib/protoparser: remove superfluous memory allocations during protocol parsing	2022-04-06 14:00:50 +03:00
Aliaksandr Valialkin	123a88bb65	lib/storage: reuse sync.WaitGroup objects This reduces GC load by up to 10% according to memory profiling	2022-04-06 14:00:50 +03:00
Aliaksandr Valialkin	f526c7814e	lib/cgroup: reduce the default GOGC value from 50% to 30% This reduces memory usage under production workloads by up to 10%, while CPU spent on GC remains roughly the same. The CPU spent on GC can be monitored with go_memstats_gc_cpu_fraction metric	2022-04-06 14:00:50 +03:00
Aliaksandr Valialkin	0f1ebd911d	lib/workingsetcache: reuse prev cache after its reset This should reduce memory churn rate	2022-04-05 20:39:44 +03:00
Aliaksandr Valialkin	ac93c36be7	lib/workingsetcache: check more frequently for cache size overflow This should reduce the probability of cache size limit overflow	2022-04-05 18:05:33 +03:00
Nikolay	7eb49d204f	vmctl verify-blocks command (#2390 ) * lib/protoparser: changes ParseStream for native format uses reader instead of http.Request updates app/vmagent and app/vmagent method usage * app/vmctl: add verify-block subcommand it allows to check exported from VictoriaMetrics data block in native format https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2362 Update app/vmctl/README.md Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2022-04-05 17:46:36 +03:00
Aliaksandr Valialkin	fca0cb8156	lib/workingsetcache: reduce the expiration duration from 20 minutes to 10 minutes This should reduce memory usage for the cache under high churn rate	2022-04-05 17:08:43 +03:00
Aliaksandr Valialkin	8752cce157	app/vminsert: reduce the max packet size, which vminsert can send to vmstorage This reduces the max memory usage for vminsert and vmstorage under heavy ingestion rate by up to 50% on production workload	2022-04-05 15:39:58 +03:00
Nikolay	4cf6219e07	lib/{storage,regexpcache}: replaces regexpCacheMap with LRU cache (#2293 ) * lib/{storage,regexpcache}: replaces regexpCacheMap with LRU cache It should decrease memory usage for regexp caching with storing cacheEntry by pointer - golang map should be able to effectively shrink its size original issue with this case - unexpected map grows and storage OOM Apply suggestions from code review Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> Adds missing metrics for regexp cache and regexpPrefixes cache * wip * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-03-26 12:57:27 +02:00
Aliaksandr Valialkin	b843f0e229	app/vmselect: add fine-grained limits for the number of returned/scanned time series for various APIs	2022-03-26 11:28:14 +02:00
Aliaksandr Valialkin	a8a4581c37	lib/blockcache: properly remove references to deleted parts Previously references to deleted parts may remain active as cache.m keys. This could prevent from proper memory de-allocation. This could lead to increased memory usage for the following caches starting from v1.73.0: * indexdb/indexBlocks * indexdb/dataBlocks * storage/indexBlocks Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2242 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007 This is a follow-up for `88605a7ea2`	2022-03-18 17:07:54 +02:00
Aliaksandr Valialkin	e35c9124b7	lib/storage: reduce the interval for checking for free disk space from 30 seconds to 1 second This should reduce the probability of out of disk space panics when -storage.minFreeDiskSpaceBytes is set to low values. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2305	2022-03-18 16:53:19 +02:00
Aliaksandr Valialkin	7c92aaeaa4	lib/blockcache: properly release memory occupied by deleted entries Previously the deleted entries could remain referenced via lastAccessHeap for long time. This could lead to increased memory usage for the following caches starting from v1.73.0: * indexdb/indexBlocks * indexdb/dataBlocks * storage/indexBlocks Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2242 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007	2022-03-18 16:53:19 +02:00
Aliaksandr Valialkin	a6d65fc824	lib/storage: typo fix after `e7831ae154`	2022-03-18 16:53:19 +02:00
jduncan0000	e7831ae154	Fix for issue #2255 - matchTagFilters for positive empty-match filters (#2304 ) * fix for issue 2255 - matchTagFilters for positive empty-match filters * add example to comments * formatting * add test for positive empty match * formatting	2022-03-18 13:08:54 +02:00
Aliaksandr Valialkin	698458b742	lib/httpserver: extract the code responsible for initializing server-side TLS config into netutil.GetServerTLSConfig	2022-03-17 19:46:20 +02:00
Aliaksandr Valialkin	191977b324	lib/storage: trashing -> thrashing typo in docs This is a follow-up for `918ed5cb32`	2022-03-16 13:28:29 +02:00
Vic (Shihang) Li	9767fcd837	fix: change thrashing typo (#2317 )	2022-03-16 13:05:55 +02:00
Aliaksandr Valialkin	3f999b11f7	lib/mergeset: remove aux buffers from inmemoryPart This should reduce the size of inmemoryPart items and may improve performance a bit during registering new time series Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2247	2022-03-03 17:12:25 +02:00
Aliaksandr Valialkin	ecf68da79e	lib/mergeset: eliminate copying of itemsData and lensData from storageBlock to inmemoryBlock This should improve performance when registering new time series. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2247	2022-03-03 17:12:25 +02:00
Aliaksandr Valialkin	ecf4f7bf21	lib/mergeset: consistency renaming: ip->mp for inmemoryPart vars	2022-03-03 17:12:25 +02:00
Aliaksandr Valialkin	f4e466955d	lib/mergeset: move storageBlock from inmemoryPart to a sync.Pool The lifetime of storageBlock is much shorter comparing to the lifetime of inmemoryPart, so sync.Pool usage should reduce overall memory usage and improve performance because of better locality of reference when marshaling inmemoryBlock to inmemoryPart. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2247	2022-03-03 17:12:25 +02:00
Aliaksandr Valialkin	b47f18f555	lib/{mergeset,storage}: tune compression levels for small blocks This should reduce CPU usage spent on compression	2022-02-25 15:34:13 +02:00
Aliaksandr Valialkin	28b610db07	lib/storage: document why job-like and instance-like labels must be stored at mn.Tags[0] and mn.Tags[1] Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2244	2022-02-25 13:21:53 +02:00
Aliaksandr Valialkin	d1881fa582	lib/storage: add a comment to indexSearch.containsTimeRange() on why it allows false positives Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2239	2022-02-24 12:48:33 +02:00
Aliaksandr Valialkin	02a922b53f	lib/storage: properly handle series selector matching multiple metric names plus a negative filter Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2238 This is a follow-up for `00cbb099b6`	2022-02-24 12:11:53 +02:00
Aliaksandr Valialkin	1967b9c211	lib/mergeset: remove superfluous sorting of inmemoryBlock.data at inmemoryBlock.sort() There is no need to sort the underlying data according to sorted items there. This should reduce cpu usage when registering new time series in `indexdb`. Thanks to @ahfuzhang for the suggestion at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2245	2022-02-24 11:19:44 +02:00
Aliaksandr Valialkin	2431c9cf81	lib/promrelabel: add support for conditional relabeling via `if` filter Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1998	2022-02-24 02:35:13 +02:00
Aliaksandr Valialkin	6fd85117ac	lib/workingsetcache: do not rotate cache if it is in `whole` state This should reduce the maximum memory usage for the cache in `whole` state	2022-02-23 22:55:10 +02:00
Aliaksandr Valialkin	244c23ea2c	lib/workingsetcache: reduce the default cache rotation period from hour to 20 minutes This should reduce memory usage under high time series churn rate	2022-02-23 13:42:27 +02:00
Aliaksandr Valialkin	fcaa0c5202	lib/storage: optimize `/api/v1/status/tsdb` call by skipping all the artificially created tag entries at once This is a follow-up for `b71be42d90`	2022-02-21 19:00:04 +02:00
Aliaksandr Valialkin	986c12487a	lib/mergeset: typo fix after `b6ed9afd6d`	2022-02-21 19:00:04 +02:00
Aliaksandr Valialkin	d4df32cc6b	lib/blockcache: evict entries from the cache in LRU order This should improve hit rate for smaller caches	2022-02-21 19:00:04 +02:00
Roman Khavronenko	5a4b16794d	Consul SD - update services on the watcher's start (#2202 ) * lib/discovery/consul: update services on the watcher's start Previously, watcher's start was only initing goroutines for discovery but not waiting for the first iteration to end. It means first Consul discovery wasn't returning discovered targets until the next iteration. The change makes the watcher's start blocking until we get first discovery iteration done and all registries updated. Signed-off-by: hagen1778 <roman@victoriametrics.com> * vmalert: remove workarounds for consul SD Now when consul SD lib properly updates services on the first start, we don't need workarounds in vmalert. Signed-off-by: hagen1778 <roman@victoriametrics.com> * lib/discovery/consul: update after review Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-02-21 15:33:33 +02:00
Roman Khavronenko	bd7837d524	lib: allow to configure cache size by type (#2206 ) * lib: allow to configure cache size by type https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1940 Signed-off-by: hagen1778 <roman@victoriametrics.com> * Apply suggestions from code review * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-02-21 13:55:51 +02:00
Aliaksandr Valialkin	5260d7a954	lib/storage: typo fix after `c3affb0c4f`	2022-02-17 12:56:33 +02:00
Aliaksandr Valialkin	d9bdb42219	lib/storage: simplify code for searching for label values This is a follow-up after `9dd191b27c`	2022-02-17 12:39:14 +02:00
Aliaksandr Valialkin	2ebc3d21c3	lib/storage: properly skip composite tag entries when searching for tag names or tag values This is a follow-up for `b71be42d90` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2200	2022-02-16 23:02:18 +02:00
Aliaksandr Valialkin	3107224306	lib/blockcache: fix TestCache by ensuring that the cache size can be divided by the number of cache shards Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2204	2022-02-16 18:48:09 +02:00
Aliaksandr Valialkin	63bc89dd81	lib/storage: document why tsid cache is reset before saving it to disk Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2205	2022-02-16 18:37:29 +02:00
Aliaksandr Valialkin	ee066aa0d5	lib/storage: use binary search instead of full scan for skipping artificial tags when searching for tag names or tag values This should improve performance for /api/v1/labels and /api/v1/label/<label_name>/values See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2200	2022-02-16 18:17:27 +02:00
Roman Khavronenko	8cbcc560b9	vmagent: fix js error on CollapseAll/ExpandAll buttons click (#2192 ) * vmagent: fix js error on CollapseAll/ExpandAll buttons click `Uncaught TypeError: Cannot read properties of null (reading 'style')` Signed-off-by: hagen1778 <roman@victoriametrics.com> * Apply suggestions from code review Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com> Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2021	2022-02-15 12:54:39 +02:00
Corporte Gadfly	cf8df24227	match fileSDCheckInterval with prometheus file_sd_config default (#2188 )	2022-02-15 12:05:57 +02:00
Aliaksandr Valialkin	5d8ea8c918	docs/CHANGELOG.md: document `3d890e89f1`	2022-02-14 17:42:33 +02:00
Nikolay	748034e7af	Adds server certificate reload for lib/http (#2186 ) * Adds server certificate reload for lib/http https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2171 * Update lib/httpserver/httpserver.go Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-02-14 17:42:33 +02:00
Nikolay	c11d0949c8	fixes all_tenants query option usage for openstack service discovery (#2184 ) explicit use configuration parameter instead of conditional add https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2182	2022-02-14 13:13:03 +02:00
Aliaksandr Valialkin	31b42e9c57	lib/promscrape: add `expand all` and `collapse all` buttons to `/targets` page See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2021	2022-02-12 18:42:01 +02:00
Aliaksandr Valialkin	53c2135d2a	lib/storage: tune the logic for pre-populating of the per-day inverted index for the next day - Postpone the pre-population to the last hour of the current day. This should reduce the number of useless entries in the next per-day index, which shouldn't be created there, when the corresponding time series are stopped to be pushed during the current day. - Make the pre-population more smooth in time by using the hash of MetricID instead of MetricID itself when calculating the need for the given MetricID pre-population. - Sync the logic for pre-population of the next day inverted index with the logic of pre-populating tsid cache after indexdb rotation. This should improve code maintainability. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/430 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401	2022-02-12 16:39:33 +02:00
artifactori	943fc056ca	Show gce sdconfig zone on vmagent:8429/config (#2178 ) * vmagent: add test for marshalling gce sdconfig with ZoneYAML * vmagent: implement MarshalYAML for ZoneYAML on gce sdconfig	2022-02-12 00:48:38 +02:00
Roman Khavronenko	d107f86fbc	lib/index: reduce read/write load after indexDB rotation (#2177 ) * lib/index: reduce read/write load after indexDB rotation IndexDB in VM is responsible for storing TSID - ID's used for identifying time series. The index is stored on disk and used by both ingestion and read path. IndexDB is stored separately to data parts and is global for all stored data. It can't be deleted partially as VM deletes data parts. Instead, indexDB is rotated once in `retention` interval. The rotation procedure means that `current` indexDB becomes `previous`, and new freshly created indexDB struct becomes `current`. So in any time, VM holds indexDB for current and previous retention periods. When time series is ingested or queried, VM checks if its TSID is present in `current` indexDB. If it is missing, it checks the `previous` indexDB. If TSID was found, it gets copied to the `current` indexDB. In this way `current` indexDB stores only series which were active during the retention period. To improve indexDB lookups, VM uses a cache layer called `tsidCache`. Both write and read path consult `tsidCache` and on miss the relad lookup happens. When rotation happens, VM resets the `tsidCache`. This is needed for ingestion path to trigger `current` indexDB re-population. Since index re-population requires additional resources, every index rotation event may cause some extra load on CPU and disk. While it may be unnoticeable for most of the cases, for systems with very high number of unique series each rotation may lead to performance degradation for some period of time. This PR makes an attempt to smooth out resource usage after the rotation. The changes are following: 1. `tsidCache` is no longer reset after the rotation; 2. Instead, each entry in `tsidCache` gains a notion of indexDB to which they belong; 3. On ingestion path after the rotation we check if requested TSID was found in `tsidCache`. Then we have 3 branches: 3.1 Fast path. It was found, and belongs to the `current` indexDB. Return TSID. 3.2 Slow path. It wasn't found, so we generate it from scratch, add to `current` indexDB, add it to `tsidCache`. 3.3 Smooth path. It was found but does not belong to the `current` indexDB. In this case, we add it to the `current` indexDB with some probability. The probability is based on time passed since the last rotation with some threshold. The more time has passed since rotation the higher is chance to re-populate `current` indexDB. The default re-population interval in this PR is set to `1h`, during which entries from `previous` index supposed to slowly re-populate `current` index. The new metric `vm_timeseries_repopulated_total` was added to identify how many TSIDs were moved from `previous` indexDB to the `current` indexDB. This metric supposed to grow only during the first `1h` after the last rotation. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip * wip Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-02-12 00:34:44 +02:00
Aliaksandr Valialkin	21a42e990f	lib/storage: fix broken BenchmarkHeadPostingForMatchers for `{i=~".*"}` after `f4dead529f` The commit `f4dead529f` makes such query to return nothing instead of all the time series. This aligns more with Prometheus behaviour.	2022-02-12 00:28:21 +02:00
Roman Khavronenko	791cad8c2e	lib/promscrape: support prometheus-like duration in scrape configs (#2169 ) * lib/promscrape: support prometheus-like duration in scrape configs The change allows to specify duration values like `1d`, `1w` for fields `scrape_interval`, `scrape_timeout`, etc. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817#issuecomment-1033384766 Signed-off-by: hagen1778 <roman@victoriametrics.com> * lib/blockcache: make linter happy Signed-off-by: hagen1778 <roman@victoriametrics.com> * lib/promscrape: support prometheus-like duration in scrape configs * add support for extra fields `scrape_align_interval` and `scrape_offset`; * support Prometheus duration parsing for `__scrape_interval__` and `__scrape_duration__` labels; Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip * wip * docs/CHANGELOG.md: document the feature Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-02-11 16:17:51 +02:00
Aliaksandr Valialkin	727b29d4a3	lib/promscrape/discovery/kubernetes: add `__meta_kubernetes_endpointslice_{label,annotation}*` labels to be consistent with other `role` values for Kubernetes service discovery	2022-02-11 14:56:10 +02:00
Nikolay	265938a385	fixes service discovery for kubernetes (#2173 ) * fixes service discovery for kubernetes now it must take in account all pods that belong to the discovered endpoint and endpointslice adds simple test for endpoints https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2134 * wip * docs/CHANGELOG.md: document the change Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2022-02-11 13:35:34 +02:00
Aliaksandr Valialkin	895c9f4f11	lib/mergeset: tune indexdb/{indexBlocks,dataBlocks} cache sizes further according to production stats Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007	2022-02-10 19:11:13 +02:00
Aliaksandr Valialkin	102c9a4bf9	lib/blockcache: use higher number of shards for higher number of CPU cores This should reduce mutex contention and increase performance Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007	2022-02-10 19:11:11 +02:00
Aliaksandr Valialkin	ad19c9d302	lib/promscrape: fix errors in test config The errors were discovered after enabling strict parse mode by default. See `9bb60ab00f`	2022-02-08 20:10:28 +02:00
Aliaksandr Valialkin	fae3040868	lib/blockcache: split the cache into multiple shards This should reduce contention on cache mutex on hosts with many CPU cores, which, in turn, should increase overall throughput for the cache. This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007	2022-02-08 19:48:32 +02:00
Aliaksandr Valialkin	a0a56d6c1c	lib/mergeset: tune sizes for `indexdb/dataBlocks` and `indexdb/indexBlocks` according to production workload This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007#issuecomment-1032308742	2022-02-08 18:04:03 +02:00
Aliaksandr Valialkin	eed66b6640	lib/promscrape: set `-promscrape.config.strictParse` to true by default This allows detecting long-living silent errors in -promscrape.config	2022-02-08 15:42:33 +02:00
Aliaksandr Valialkin	8863339b6b	lib/blockcache: `make fmt`	2022-02-08 15:42:31 +02:00
Aliaksandr Valialkin	1caee74235	lib/blockcache: eliminate possible race when Cache.Put is called for the same entry from multiple goroutines The race could result in incorrect cache size tracking, which, in turn, could result in too frequent cache cleaning	2022-02-08 01:18:27 +02:00
Aliaksandr Valialkin	10476738a8	lib/blockcache: increase the lifetime for rarely accessed blocks from 2 minutes to 5 minutes This should improve data ingestion speed if time series samples are ingested with interval bigger than 2 minutes. The actual interval could exceed 2 minutes if the original interval between samples doesn't exceed 2 minutes in the case of slow inserts. Slow inserts may appear in the following cases: * Big number of new time series are pushed to VictoriaMetrics, so they couldn't be registered in 2 minutes. * MetricName->tsid cache reset on indexdb rotation or due to unclean shutdown. In this case VictoriaMetrics needs to load MetricName->tsid entries for all the incoming series from IndexDB. IndexDB uses the block cache for increasing lookup performance. If the cache has no the needed block, then IndexDB reads and unpacks the block from disk. This requires an extra disk read IO and CPU. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007 This also should increase performance for periodically executed queries with intervals from 2 minutes to 5 minutes. See the previous similar commit - `43103be011` It is possible that the timeout can be increased further. Let's collect production numbers for this change so the timeout could be adjusted further.	2022-02-08 01:18:27 +02:00
Aliaksandr Valialkin	b7cefff7b0	lib/workingsetcache: use the original cache size limits when rotating caches Previously limits for new caches were taken from cache stats. These limits could mismatch the original limits. This could result in failed cache load if the stored cache has been created with the limits obtained from cache stats.	2022-02-08 01:18:27 +02:00
Aliaksandr Valialkin	87071640a7	lib/blockcache: return proper number of entries from the cache This has been broken in `0d7374ad2f`	2022-02-08 01:18:27 +02:00
Aliaksandr Valialkin	34d14c4940	all: substitute zeroTime with time.Time{}, since this generates more optimal binary code	2022-02-07 14:36:41 +02:00
Aliaksandr Valialkin	e2d12a25e0	lib/netutil: increase dial timeout from 1 second to 5 seconds There are real-world cases when TCP connection needs more than 1 second to be established.	2022-02-07 12:33:40 +02:00
Aliaksandr Valialkin	d24e5d9efd	lib/promscrape: show the total number of scrapes and the total number of scrape errors per target at /targets page This information may be useful when debugging unreliable scrape targets	2022-02-03 20:23:27 +02:00
Aliaksandr Valialkin	678b3e71db	lib/promscrape: provide the ability to fetch target responses on behalf of vmagent or single-node VictoriaMetrics This feature may be useful when debugging metrics for the given target located in isolated environment	2022-02-03 19:02:12 +02:00
Aliaksandr Valialkin	5f266370c5	all: follow-up after `4bdd10ab90` Properly use new bytesutil.Resize* functions	2022-02-01 17:49:28 +02:00
Aliaksandr Valialkin	d8d59ff760	lib/mergeset: pre-allocate data and items for inmemoryBlock in order to reduce memory allocations under high churn rate Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007	2022-02-01 11:20:20 +02:00
Aliaksandr Valialkin	02b2bfcff3	lib/bytesutil: split Resize* funcs to MayOverallocate and NoOverallocate for more fine-grained control over memory allocations Follow-up for `f4989edd96` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007	2022-02-01 11:20:20 +02:00
Aliaksandr Valialkin	084664d780	lib/encoding: substitute `64-bits.LeadingZeros64()` with `bits.Len64()`	2022-02-01 11:20:20 +02:00
Aliaksandr Valialkin	0fbfa8c245	lib/storage: avoid allocations of tsidPrev on every blockStreamReader.NextBlock() call This is a follow-up for `00b7c97d2a` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2082	2022-01-31 22:47:16 +02:00
Aliaksandr Valialkin	a02dde6cc7	lib/cgroup: fall back to runtime.NumCPU() when determining process_cpu_cores_available metric if it is impossible to determine cpu quota via cgroups Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2107	2022-01-31 20:31:12 +02:00
Aliaksandr Valialkin	566e12874d	lib/cgroup: expose `process_cpu_cores_available` metric This metric shows the number of CPU cores available to the process. This allows creating alerting rules on CPU saturation with the following query: rate(process_cpu_seconds_total[5m]) / process_cpu_cores_available > 0.9 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2107	2022-01-31 20:25:15 +02:00
Aliaksandr Valialkin	776b7bc9f8	lib/storage/table.go: add missing `tb.ptwsLock.Unlock()` before the return This is a follow-up for `a1083d0531` See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2103	2022-01-28 12:12:58 +02:00
匠心零度	a1083d0531	optimized code (#2103 ) * optimized code ,because only the first error,so no need var errors []error * optimized code ,because only the first error,so no need var errors []error Co-authored-by: lirenzuo <lirenzuo@shein.com>	2022-01-28 12:10:47 +02:00
Aliaksandr Valialkin	6232eaa938	lib/bytesutil: split Resize() into ResizeNoCopy() and ResizeWithCopy() functions Previously bytesutil.Resize() was copying the original byte slice contents to a newly allocated slice. This wasted CPU cycles and memory bandwidth in some places, where the original slice contents weren't needed after slice resizing. Switch such places to bytesutil.ResizeNoCopy(). Rename the original bytesutil.Resize() function to bytesutil.ResizeWithCopy() for the sake of improved readability. Additionally, allocate new slice with `make()` instead of `append()`. This guarantees that the capacity of the allocated slice exactly matches the requested size. The `append()` could return a slice with bigger capacity as an optimization for further `append()` calls. This could result in excess memory usage when the returned byte slice was cached (for instance, in lib/blockcache). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007	2022-01-25 15:28:42 +02:00
Aliaksandr Valialkin	7ec0705b98	lib/mergeset: allocate the needed amounts of memory when unmarshaling inmemoryBlock This should reduce the memory required for indexdb/dataBlocks cache. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007	2022-01-24 18:52:22 +02:00
Aliaksandr Valialkin	65176726b3	lib/logger: removed broken test after `746ee191e8`	2022-01-24 12:15:11 +02:00
Aliaksandr Valialkin	49650fe6aa	lib/logger/throttler.go: show the original location of the error and warning message Previously the location inside LogThrottler implementation was shown. This could complicate debugging.	2022-01-23 13:55:48 +02:00
Aliaksandr Valialkin	233101137d	lib/blockcache: optimize blockcache a bit - Optimize Cache.RemoveBlocksFromPart(), so it doesn't need to iterate over all the cached blocks. - Cache blocks if there were no cache misses during the last 2 minutes. This may be the case when new blocks are added simultaneously to the storage and to the cache. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007	2022-01-23 13:08:55 +02:00
Aliaksandr Valialkin	9edf407144	lib/mergeset: tune caches size limits for `indexdb/dataBlocks` and `indexdb/indexBlocks` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007	2022-01-21 12:46:05 +02:00
Aliaksandr Valialkin	4e05298756	lib/storage: properly limit cardinality when ingesting multiple samples for the same time series in a single request	2022-01-21 12:38:22 +02:00
Aliaksandr Valialkin	e3277918e4	lib/storage: verify that blocks in a single part are sorted by TSID when reading sequential blocks from the part This may help narrowing down the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2082	2022-01-20 20:37:28 +02:00
Aliaksandr Valialkin	54ee71e16d	lib/storage: set bsm.Block to nil on error, so the previous block couldn't be used. This may help nailing down the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2082	2022-01-20 20:37:24 +02:00
Aliaksandr Valialkin	5159a9451f	lib/blockcache: add missing dependency after `145337792d` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007	2022-01-20 18:51:03 +02:00
Aliaksandr Valialkin	6ae584b9b3	lib/{mergeset,storage}: properly limit cache sizes for indexdb Previously these caches could exceed limits set via `-memory.allowedPercent` and/or `-memory.allowedBytes`, since limits were set independently per each data part. If the number of data parts was big, then limits could be exceeded, which could result in out of memory errors. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007	2022-01-20 18:45:03 +02:00
Aliaksandr Valialkin	da95516a1f	lib/promscrape: expose promscrape_stale_samples_created_total metric for monitoring the number of created stale samples	2022-01-14 01:00:40 +02:00
Aliaksandr Valialkin	dd91759f1f	lib/promscrape/discovery/kubernetes: add `__meta_kubernetes_node_provider_id` label for discovered Kubernetes nodes in the same way as Prometheus does See https://github.com/prometheus/prometheus/pull/9603	2022-01-13 23:17:24 +02:00
Aliaksandr Valialkin	bc18368c15	lib/promscrape/discovery/kubernetes: add the ability to limit service discovery to the current namespace See https://github.com/prometheus/prometheus/issues/9782 and https://github.com/prometheus/prometheus/pull/9881	2022-01-13 22:44:59 +02:00
Aliaksandr Valialkin	de8299f465	lib/promscrape/discovery/dockerswarm: follow up after `68a117a25a` - Document the bugfix at docs/CHANGELOG.md - Set __address__ field after copying commonLabels to the resulting map of discovered labels. This makes sure that the correct __address__ label is used.	2022-01-11 09:22:03 +02:00
Alexander Shtuchkin	45a92e6ce1	Fix for #2038 : Make correct __address__ value for dockerswarm promscrape (#2041 )	2022-01-11 09:22:02 +02:00
Aliaksandr Valialkin	fa89f3e5a5	lib/promscrape: do not send staleness markers on graceful shutdown This follows Prometheus behavior. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2013#issuecomment-1006994079	2022-01-07 01:19:06 +02:00
Aliaksandr Valialkin	80fc3fda07	lib/storage: follow-up for `38bf5fc136`	2022-01-05 16:02:17 +02:00
weng zhao	1e0fe615ad	vmstorage: fix queries like `{foo=~"bar\|"}` returning extra time series caused by a malfunction in negative filter transformation (#2032 ) 1. L2749 make kb.B remain the value of comonPrefix instead of tf.prefix 2. L2762 avoid changing tf.value from "bar\|" to ".+r\|"	2022-01-05 15:57:54 +02:00
Aliaksandr Valialkin	c1722003a2	lib/promscrape: scrape replicated targets at different offsets in vmagent replicated clustering mode This guarantees that the deduplication consistently leaves samples from the same vmagent replica. See https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets	2021-12-23 00:21:41 +02:00
Nikolay	6cdc934c3d	adds restore.lock (#1988 ) * adds restore.lock it must prevent from running storage after incomplete restore process https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1958 * return back flock file deletion * Apply suggestions from code review * wip * docs/CHANGELOG.md: document https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1958 Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2021-12-22 13:10:56 +02:00
Aliaksandr Valialkin	727797a6fd	all: use logger.WithThrottler() where appropriate	2021-12-21 17:10:54 +02:00
Aliaksandr Valialkin	4dbf12254d	lib/promscrape: take into account the original job_name when creating a unique key per each scrape target This should handle the case when the original job_name has been changed in -promscrape.config , while the resulting job label remains the same because it is overridden via relabeling.	2021-12-21 16:42:42 +02:00
Roman Khavronenko	23e1de06ee	vmagent: add error log for skipped data block when rejected by receiv… (#1956 ) * vmagent: add error log for skipped data block when rejected by receiving side Previously, rejected data blocks were silently dropped - only metrics were updated. From an operational perspective, having additional logging for such cases is preferable. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1911 Signed-off-by: hagen1778 <roman@victoriametrics.com> * vmagent: throttle log messages about skipped blocks The new type of logger was added to the logger package. This new type is supposed to control the number of logged messages by time. Signed-off-by: hagen1778 <roman@victoriametrics.com> * lib/logger: make LogThrottler public, so its methods can be inspected by external packages Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2021-12-21 16:42:38 +02:00
Aliaksandr Valialkin	053e85ff3d	all: typo fix: unexected -> unexpected	2021-12-20 17:40:13 +02:00
Aliaksandr Valialkin	406cb06f8c	lib/persistentqueue: check that readerOffset doesn't exceed writerOffset after each readerOffset increase This should help detecting the source of the panic from https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1981	2021-12-20 17:26:07 +02:00
Aliaksandr Valialkin	f22aab411b	lib/storage: properly update per-part `min_dedup_interval` file contents after merge Previously 0s was always written even if -dedup.minScrapeInterval was set to non-zero value This is a follow-up for `4ff647137a`	2021-12-17 20:12:18 +02:00
Aliaksandr Valialkin	5bd4e47a9e	lib/promscrape: allow up to 5 redirects when scraping a target by default See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1945	2021-12-16 00:14:45 +02:00
Aliaksandr Valialkin	d36fdbe537	lib/storage: deduplicate samples more thoroughly Previously some duplicate samples may be left on disk for time series with high churn rate. This may result in higher disk space usage.	2021-12-15 16:00:30 +02:00
Aliaksandr Valialkin	bc3923111b	lib/storage: return dedup interval in milliseconds from GetDedupInterval() This removes duplicate .Milliseconds() calls after GetDedupInterval() calls.	2021-12-15 13:27:27 +02:00
Aliaksandr Valialkin	cdfe854c9b	lib/storage: explicitly pass dedupInterval to DeduplicateSamples() and deduplicateSamplesDuringMerge() This improves the code readability and debuggability, since the output of these functions stops depending on global state.	2021-12-14 20:52:29 +02:00
Aliaksandr Valialkin	c922c7af9a	lib/storage: convert alternate regexps into Graphite wildcards inside `__graphite__` pseudo-label For example, `{__graphite__=~"foo.(bar\|baz)"}` is automatically converted to `{__graphite__=~"foo.{bar,baz}"}` before execution. This allows using multi-value Grafana template variables such as `{__graphite__=~"foo.($app)"}`.	2021-12-14 19:55:59 +02:00
Aliaksandr Valialkin	38f5bc7451	lib/httpserver: add missing 127.0.0.1 hostname to the logged address for http and pprof server if the address starts with ':' This allows copy-pasting the url to http server from logs.	2021-12-08 16:15:12 +02:00
Aliaksandr Valialkin	9aa9b081a4	app/vminsert: add `-maxLabelValueLen` command-line flag See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1908	2021-12-06 11:42:24 +02:00
Aliaksandr Valialkin	51f2eb3c46	lib/workingsetcache: fix `unaligned 64-bit atomic operation` panic on 32-bit architectures The panic has been introduced in `7275ebf91a`	2021-12-03 01:22:30 +02:00
Aliaksandr Valialkin	d40441947a	app: allow specifying http and https urls in the following command-line flags * -promscrape.config * -relabelConfig * -remoteWrite.relabelConfig * -remoteWrite.urlRelabelConfig	2021-12-03 00:11:47 +02:00
Aliaksandr Valialkin	daaea1eb2c	app/vmauth: follow-up for `13368bed18` * Document the ability to specify http or https urls in `-auth.config` at docs/CHANGELOG.md * Move the ReadFileOrHTTP to lib/fs, so it can be re-used in other places where a file should be read from the given path. For example, in `-promscrape.config` at `vmagent`.	2021-12-02 23:34:15 +02:00
Aliaksandr Valialkin	b885a3b6e9	lib/httpserver: expose `/-/healthy` and `/-/ready` endpoints as Prometheus does This improves integration with third-party solutions, which rely on these endpoints. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1833	2021-12-02 14:37:50 +02:00
Aliaksandr Valialkin	c540235470	app: use relative paths instead of absolute paths for the supported http handlers on the main page This allows hiding VictoriaMetrics components behind proxies, which serve pages at different path prefixes See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1858	2021-12-02 13:54:15 +02:00
Aliaksandr Valialkin	d1289383eb	lib/protoparser/graphite: allow multiple separators between metric name, value and timestamp	2021-12-02 13:44:01 +02:00
Aliaksandr Valialkin	37a2bea072	lib/protoparser/graphite: properly parse Graphite line with whitespace after the timestamp See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1865	2021-12-02 13:33:50 +02:00
Aliaksandr Valialkin	2b7dee15dd	app/{vmbackup,vmrestore}: export internal metrics at `/metrics` http handler	2021-12-02 11:56:34 +02:00
Aliaksandr Valialkin	ab4be24397	app/vmstorage: export vm_cache_size_max_bytes metrics for determining capacity of various caches The vm_cache_size_max_bytes metric can be used for determining caches which reach their capacity via the following query: vm_cache_size_bytes / vm_cache_size_max_bytes > 0.9	2021-12-02 10:30:01 +02:00
Aliaksandr Valialkin	d4655beae8	lib/fs: add `vm_filestream_read_duration_seconds_total` and `vm_filestream_write_duration_seconds_total` metrics These metrics help determining persistent disk saturation with `rate(vm_filestream_read_duration_seconds_total) > 0.9`	2021-12-02 09:13:20 +02:00
Aliaksandr Valialkin	2e43cd9d62	lib/storage: do not take into account -storage.minFreeDiskSpaceBytes during background merges	2021-12-01 12:30:03 +02:00
Nikolay	cf1d2f289b	removes FileSize from backup part key (#1872 ) * removes FileSize from backup part key it should fix download restoration for backups * Update lib/backup/common/part.go Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2021-12-01 12:30:03 +02:00
Aliaksandr Valialkin	71c0f7cce3	lib/storage: take into account `-storage.minFreeDiskSpaceBytes` when performing big merges Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269	2021-11-30 12:56:53 +02:00
guidao	6fa7ad69fc	fix #1830 (#1861 ) Co-authored-by: wangfeng <wangfeng@zhihu.com>	2021-11-30 01:16:15 +02:00
Aliaksandr Valialkin	975498d402	lib/protoparser/prometheus: follow-up for `8e338632a3` Do not spend CPU time on error message formatting if error logger is disabled	2021-11-30 00:51:15 +02:00
Nikolay	40f0726147	Changes unmarshallRow logger to noop for getRowsDiff (#1835 )	2021-11-30 00:51:14 +02:00
Aliaksandr Valialkin	4ad397188e	lib/protoparser: do not log `connection reset by peer` error when reading the data via InfluxDB, Graphite and OpenTSDB protocols over plain TCP connections This error is expected, so there is no need in spamming the log with this error.	2021-11-29 21:58:11 +02:00
Aliaksandr Valialkin	e93f46187d	lib/persistentqueue: add vm_persistentqueue_read_duration_seconds_total and vm_persistentqueue_write_duration_seconds_total metrics for determining disk usage saturation at vmagent	2021-11-17 16:42:12 +02:00
Lan	6662714c6c	Add flag of S3ForcePathStyle (#1802 )	2021-11-17 01:10:22 +02:00
Aliaksandr Valialkin	4fb19fe34b	all: consistently return `application/json` content-type without `charset=utf-8` The `application/json` content-type has utf-8 encoding by default. See https://stackoverflow.com/questions/9254891/what-does-content-type-application-json-charset-utf-8-really-mean Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897	2021-11-09 18:07:22 +02:00
Aliaksandr Valialkin	f41c02e475	lib/promscrape: improve logging for `scrape_config_files` parse errors Log the actual file path, which led to the parse error. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1789	2021-11-08 13:34:26 +02:00
Aliaksandr Valialkin	847004fa77	app/{vminsert,vmagent}: hide passwords and auth tokens by default at `/config` page Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1764	2021-11-05 14:42:13 +02:00
Aliaksandr Valialkin	bc72b83102	lib/promauth: do not show empty values in `oauth2` config section at `/config` page	2021-11-05 12:54:10 +02:00
Aliaksandr Valialkin	d445d22c0c	lib/promscrape: add `-promscrape.maxResponseHeadersSize` command-line flag for tuning the maximum http response headers size from Prometheus scrape targets	2021-11-03 22:27:55 +02:00
Aliaksandr Valialkin	6873d6d893	lib/protoparser/influx: automatically detect timestamp precision depending on the number of decimal digits in the timestamp	2021-10-28 12:48:34 +03:00
Aliaksandr Valialkin	105deb164c	lib/logger: show only explicitly set command-line flags in logs This reduces initial verbosity in logs	2021-10-28 11:03:21 +03:00
Aliaksandr Valialkin	b626d6d606	lib/promscrape: add `collapse` and `expand` buttons per each group of targets from the same scrape job	2021-10-27 20:04:03 +03:00
Aliaksandr Valialkin	2ebee4e741	app/{vmalert,vmagent}: improve the distribution of scrape offsets among targets / rules Previously only the lower part of 64-bit hash was used for calculating the offset. This may give uneven distribution in some cases. So let's use all the available 64 bits from the hash for calculating the offset.	2021-10-27 20:04:02 +03:00
Aliaksandr Valialkin	92d01db85a	lib/protoparser/prometheus: optimize GetRowsDiff() function This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1745 , since the provided profile shows that the majority of CPU and memory is spent in this function during `streamParse` when `-promscrape.noStaleMarkers` wasn't set.	2021-10-27 18:55:25 +03:00
Aliaksandr Valialkin	16f1aaf0b5	lib/protoparser/prometheus: add a benchmark for GetRowsDiff	2021-10-27 18:55:23 +03:00
Aliaksandr Valialkin	99784b21c1	all: fix build issues and tests for Apple M1 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1653	2021-10-27 15:07:19 +03:00
Aliaksandr Valialkin	ad445a06cd	lib/promscrape: properly show `proxy_url` option value at `/config` page Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1755	2021-10-26 21:24:22 +03:00
Aliaksandr Valialkin	b08f51f5d3	lib/promscrape: do not populate response body to memory in stream parsing mode if -promscrape.noStaleMarkers is set The response body isn't used if -promscrape.noStaleMarkers is set after the commit `2876137c92` , so there is no sense in populating it in memory. This should reduce memory usage when scraping big responses. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1728#issuecomment-949630694	2021-10-22 16:49:21 +03:00
Aliaksandr Valialkin	6bc10f0623	lib/promscrape: do not sort original labels and do not intern label string for the original labels before the sharding code is executed This should reduce CPU and memory usage in shard mode when service discovery finds big number of scrape targets with many long labels. See https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets This is a follow-up after `9882cda8b9` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1728	2021-10-22 13:55:39 +03:00
Aliaksandr Valialkin	a8bcc3c276	lib/promscrape: reduce memory usage if `-promscrape.noStaleMarkers` command-line flag is passed Do not store in memory the response from the last scrape per each target if -promscrape.noStaleMarkers option is enabled. This should reduce memory usage when the scraped targets return large responses.	2021-10-22 13:22:08 +03:00
Nikolay	83e1dfccba	adds tab as second separator for graphite text protocol (#1733 ) * adds tab as second separator for graphite text protocol * changes indexFunc for indexAny * Update lib/protoparser/graphite/parser_test.go Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2021-10-22 12:29:27 +03:00
Aliaksandr Valialkin	d56e676d71	lib/flagutil: do not expose sensitive info (passwords, keys and urls) at /flags page	2021-10-20 00:51:15 +03:00
Aliaksandr Valialkin	5705f4b6d1	lib/httpserver: expose command-line flags at `/flags` page This should simplify debugging. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1695	2021-10-20 00:46:54 +03:00
Aliaksandr Valialkin	a105b71116	lib/envflag: use flag.Set for setting the flags from env vars This should make visible the set flags at flag.Visit(), which is used later for logging and exporting the `is_set` label for these flags at /metrics page	2021-10-20 00:46:53 +03:00
Aliaksandr Valialkin	93511b4be7	lib/storage: log a warning when the -storageDataPath has less than -storage.minFreeDiskSpaceBytes This should improve the debuggability of the readonly feature. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1727	2021-10-19 23:58:09 +03:00
Aliaksandr Valialkin	ea69eef375	lib/promscrape/discovery/kubernetes: log a warning if `role: endpoints` discovers more than 1000 targets per a single endpoint In this case `role: endpointslice` must be used instead. See the following references: * https://kubernetes.io/docs/reference/labels-annotations-taints/#endpoints-kubernetes-io-over-capacity * https://github.com/kubernetes/kubernetes/pull/99975 * https://github.com/prometheus/prometheus/issues/7572#issuecomment-934779398	2021-10-19 13:22:28 +03:00
Nikolay	e84a063209	changes job source for /target api (#1723 ) use jobNameOriginal instead of relabeled as prometheus does https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1707	2021-10-19 09:00:05 +03:00
Aliaksandr Valialkin	fbcc8b5c7d	lib/promscrape: set `honor_timestamps: true` by default if this option isn't set explicitly in scrape configs This aligns the behavior to Prometheus - see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config	2021-10-16 20:48:53 +03:00
Aliaksandr Valialkin	0a9be5ef9d	lib/promscrape: expose `promscrape_series_limit_max_series` and `promscrape_series_limit_current_series` metrics per each scrape target with the enabled unique series limiter	2021-10-16 19:14:13 +03:00
Aliaksandr Valialkin	99011c6b63	lib/promscrape: always initialize http client for stream parsing mode Stream parsing mode can be automatically enabled when scraping targets with big response bodies exceeding the -promscrape.minResponseSizeForStreamParse , so it must be always initialized.	2021-10-16 13:19:48 +03:00
Aliaksandr Valialkin	0f4fda1bda	lib/promscrape: store the last scraped response in compressed form if its size exceeds -promscrape.minResponseSizeForStreamParse This should reduce memory usage when scraping targets with big response bodies.	2021-10-16 13:00:11 +03:00
Aliaksandr Valialkin	0452a8d4e8	lib/promscrape: store the full response in stream parsing mode in scrapeWork.lastScrape byte slice This allows sending staleness marks and properly calculate scrape_series_added metric in stream parsing mode at the cost of the increased memory usage, since now the potentially big response is kept in the lastScrape byte slice per each scrapeWork. In practice the memory usage increase shouldn't be big, since the response size is usually much smaller than the parsed metrics from this response after the relabeling, which usually adds a big pile of target-specific labels per each metric.	2021-10-15 15:26:24 +03:00
Aliaksandr Valialkin	3e9beb0f8d	lib/promscrape/discovery/kubernetes: rename endpointslices.go -> endpointslice.go in order to be consistent with EndpointSlice struct name This is a follow-up for `31b42b30b6`	2021-10-15 12:27:31 +03:00
Aliaksandr Valialkin	25421fa2ae	lib/promscrape: add `-promscrape.minResponseSizeForStreamParse` command-line option for automatic switching to stream parsing mode when scraping targets with big responses This should reduce memory usage when vmagent scrapes targets with non-uniform response sizes. This is common case in Kubernetes monitoring.	2021-10-14 12:30:55 +03:00
Aliaksandr Valialkin	bee130cc78	lib/promscrape: return error if `sample_limit` or `series_limit` options are set when stream parsing mode is enabled	2021-10-14 12:30:54 +03:00
Aliaksandr Valialkin	5b7d90d178	lib/promscrape: add ability to show the original labels for discovered targets at /targets page See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1698	2021-10-13 16:44:34 +03:00
Roman Khavronenko	5dab25e8ad	lib/promscrape: make errcheck happy (#1703 )	2021-10-13 15:11:45 +03:00
Aliaksandr Valialkin	c3a729d458	lib/promscrape: shard targets among cluster nodes after relabeling is applied This guarantees that targets with the same set of labels go to the same vmagent node. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1687#issuecomment-940629495	2021-10-12 17:06:37 +03:00
Aliaksandr Valialkin	aeedfe2fe2	app/vmagent: expose -promscrape.config contents at /config page as Prometheus does See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1695	2021-10-12 16:27:37 +03:00
Aliaksandr Valialkin	84aa08d93a	lib/promscrape: use Prometheus format for target labels at `/targets` page This should simplify copy-pasting the labels to/from PromQL / MetricsQL	2021-10-11 12:42:18 +03:00
Aliaksandr Valialkin	a7a1305395	lib/storage: fix unaligned access on 32-bit architectures. The bug has been introduced at `a171916ef5`	2021-10-08 19:38:20 +03:00
Aliaksandr Valialkin	a47754b689	lib/protoparser/clusternative: typo fix after `4fddcf4c83`	2021-10-08 15:38:47 +03:00
Aliaksandr Valialkin	4fddcf4c83	app/{vminsert,vmstorage}: follow-up after `a171916ef5` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269	2021-10-08 14:09:51 +03:00
Nikolay	a171916ef5	Adds read-only mode for vmstorage node (#1680 ) * adds read-only mode for vmstorage https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269 * changes order a bit * moves isFreeDiskLimitReached var to storage struct renames functions to be consistent change protoparser api - with optional storage limit check for given opened storage * renames freeSpaceLimit to ReadOnly	2021-10-08 12:52:56 +03:00
Ziqi Zhao	1db3aeab36	fix some typos (#1678 ) Co-authored-by: 柘远 <zzq237937@alibaba-inc.com>	2021-10-06 14:43:56 +03:00
Aliaksandr Valialkin	522a404b79	lib/promscrape: reduce memory allocations in mergeLabels() after `48e3e6c8df`	2021-09-30 16:56:43 +03:00
Aliaksandr Valialkin	7b69d478ec	lib/protoparser: go fmt	2021-09-29 21:17:49 +03:00
Aliaksandr Valialkin	6167890d0e	lib/protoparser/prometheus: compare invalid Prometheus lines in full	2021-09-29 19:41:23 +03:00
Aliaksandr Valialkin	8dcf814c48	app/{vmbackup,vmrestore}: switch from `gcs://...` to `gs://...` urls for backups to GCS The `gs://` urls are commonly used, so prefer them instead of `gcs://` urls, while leaving support for `gcs://` urls for backwards compatibility.	2021-09-29 12:12:37 +03:00
Nikolay	9be5689b3f	changes auth validation for openstack (#1663 ) * changes auth validation for openstack must fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1655 * Apply suggestions from code review Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2021-09-29 00:33:38 +03:00
Aliaksandr Valialkin	4e65bfcc00	app/{vminsert,vmagent}: add ability to ingest data via DataDog "submit metrics" API See https://docs.datadoghq.com/api/latest/metrics/#submit-metrics Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/206	2021-09-29 00:12:26 +03:00
Aliaksandr Valialkin	d15d036a5a	lib/storage: properly handle `{__name__=~"prefix(suffix1\|suffix2)",other_label="..."}` queries They were broken in the commit `00cbb099b6` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1644	2021-09-23 21:52:31 +03:00
Aliaksandr Valialkin	d8de26bbfd	lib/promscrape: add `vm_promscrape_max_scrape_size_exceeded_errors_total` metric for counting of the failed scrapes due to the exceeded response size Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1639	2021-09-23 14:48:16 +03:00
Aliaksandr Valialkin	86bafe796c	lib/httpserver: add `-enterprise` and/or `-cluster` suffixes to `short_version` label of `vm_app_version` metric See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1635	2021-09-21 23:12:50 +03:00
Aliaksandr Valialkin	c3e1f87048	lib/promrelabel: fix parsing `regex: true` in relabeling rules	2021-09-21 23:01:40 +03:00
Nikolay	dd53abf36d	changes protoparser apis for accepting reading from io.Reader (#1624 ) adds InsertHandlerForReader apis to vmagent	2021-09-20 14:54:20 +03:00
Nikolay	1ab2f844a2	makes filters optional for ec2 api requests (#1627 ) filters can be applied only for DescribeInstances requests, like prometheus does. related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1626	2021-09-17 18:12:25 +03:00
Aliaksandr Valialkin	1493461244	lib/storage: follow up after `00cbb099b6`	2021-09-14 14:23:02 +03:00
faceair	61a51f7c15	lib/storage: optimize convert multiple values regexp filter to composite tag filter (#1610 ) * lib/storage: optimize convert multiple values regexp filter to composite tag filter * Apply suggestions from code review Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>	2021-09-14 14:23:01 +03:00
Aliaksandr Valialkin	184e145570	docs: consistency renaming: Influx -> InfluxDB	2021-09-13 17:14:45 +03:00
Aliaksandr Valialkin	b684624f67	lib/promscrape/discovery/docker: support host networking mode See https://github.com/prometheus/prometheus/issues/9116	2021-09-13 13:30:55 +03:00
Aliaksandr Valialkin	6ed9f10da5	lib/promscrape/discovery/kubernetes: properly use https scheme for wildcard TLS certificates in ingress target discovery See https://github.com/prometheus/prometheus/issues/8902	2021-09-13 13:04:43 +03:00
Aliaksandr Valialkin	d90834da70	lib/promscrape: generate `scrape_timeout_seconds` metric per each scrape target in the same way as Prometheus 2.30 does See https://github.com/prometheus/prometheus/pull/9247	2021-09-12 15:21:26 +03:00
Aliaksandr Valialkin	279f37c9e7	lib/promscrape: `make fmt`	2021-09-12 13:35:21 +03:00
Aliaksandr Valialkin	6c97388dde	lib/promscrape: add ability to configure scrape_timeout and scrape_interval via relabeling See https://github.com/prometheus/prometheus/pull/8911	2021-09-12 13:35:20 +03:00
Aliaksandr Valialkin	09670479cd	lib/promscrape: reduce CPU usage for common case when calculating `scrape_series_added` metric Also reduce CPU usage when applying `series_limit` to scrape targets with constant set of metrics. The main idea is to perform the calculations on scrape_series_added and series_limit only if the set of metrics exposed by the target has been changed. Scrape targets rarely change the set of exposed metrics, so this optimization should reduce CPU usage in general case.	2021-09-12 12:53:45 +03:00
Aliaksandr Valialkin	c339642858	lib/promscrape: add the actual job name to the labels of promscrape_series_limit_rows_dropped_total metric	2021-09-11 11:03:38 +03:00
Aliaksandr Valialkin	6d6cf1b6e0	lib/storage: verify that the tsidsFound contain the needed tsids in tests added at `f4dead529f`	2021-09-11 11:02:56 +03:00
Aliaksandr Valialkin	5aaaa686a4	lib/promscrape: send stale markers for disappeared metrics like Prometheus does	2021-09-11 11:02:56 +03:00
Aliaksandr Valialkin	c2f37f049b	lib/storage: properly search series by multiple tag filters matching empty labels such as foo{bar=~"baz\|",x=~"y\|"} Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1601 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/395	2021-09-09 21:12:53 +03:00
Aliaksandr Valialkin	d613a018c8	lib/promscrape: add ability to set `series_limit` and `stream_parse` options via relabeling This allows managing these options on a per-target basis. Typical use case: to manage these options for pods via Kubernetes annotations.	2021-09-09 18:51:23 +03:00
Aliaksandr Valialkin	b64866e64c	lib/promscrape: add `scrape_` prefix to `job` and `target` labels exported by `promscrape_series_limit_rows_dropped_total` metric This is needed in order to prevent from possible clash with the corresponding (job, target) labels for the job, which scrapes this metric.	2021-09-09 17:31:04 +03:00
Aliaksandr Valialkin	75c3514c5c	lib/promrelabel: add `keep_metrics` and `drop_metrics` actions to relabeling rules These actions simplify metrics filtering. For example, - action: keep_metrics regex: 'foo\|bar\|baz' would leave only metrics with `foo`, `bar` and `baz` names, while the rest of metrics will be deleted. The commit also makes it possible to split long regexps into multiple lines. For example, the following config is equivalent to the config above: - action: keep_metrics regex: - foo - bar - baz	2021-09-09 16:25:09 +03:00
mxlxm	42e07cfaea	reset deadline, fix #1562 . (#1597 ) * reset deadline, fix #1562. reset deadline before we put it back to pool. * make errcheck happy	2021-09-07 20:54:17 +03:00
Aliaksandr Valialkin	c4df601f43	lib/promscrape: add the ability to limit the number of unique series per each scrape target The number of series per target can be limited with the following options: * Global limit with `-promscrape.maxSeriesPerTarget` command-line option. * Per-target limit with `max_series: N` option in `scrape_config` section. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1561	2021-09-01 16:08:12 +03:00
Aliaksandr Valialkin	146c14d879	lib/promscrape/discovery/kubernetes: return back support `role: endpointslices`, since it is used by VictoriaMetrics operator This is a follow up commit after `31b42b30b6`	2021-08-29 12:37:36 +03:00
Aliaksandr Valialkin	18d7adf731	lib/protoparser/opentsdb: follow-up after `8ee75ca45a`	2021-08-29 11:50:01 +03:00
envzhu	00dddfe02f	lib/protoparser/opentsdb: accept multiple spaces between fields in a row as a delimiter. (#1575 )	2021-08-29 11:50:00 +03:00
Aliaksandr Valialkin	ca61d7c82b	lib/promscrape/discovery/kubernetes: rename `role: endpointslices` to `role: endpointslice` to be consistent with Prometheus See `2ec6c7dbb8/discovery/kubernetes/kubernetes.go (L99)`	2021-08-29 11:23:59 +03:00
Aliaksandr Valialkin	327034b54f	lib/promscrape/discovery/kubernetes: use v1 API instead of v1beta1 API for `role: ingress` and `role: endpointslices` This should fix service discovery for these roles in Kubernetes v1.22 and newer versions. See https://kubernetes.io/docs/reference/using-api/deprecation-guide/#ingress-v122 The corresponding change in Prometheus - https://github.com/prometheus/prometheus/pull/9205	2021-08-29 11:23:58 +03:00
Aliaksandr Valialkin	7fdb4db73d	lib/promscrape: add ability to load scrape configs from multiple files See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1559	2021-08-26 08:51:53 +03:00
Aliaksandr Valialkin	4a2d7aec7f	lib/promscrape: expose promscrape_discovery_http_errors_total metric for tracking errors per each http_sd config	2021-08-25 13:05:29 +03:00
Aliaksandr Valialkin	b885bd9b7d	lib/{mergeset,storage}: improve the detection of the needed free space for background merge This should prevent from possible out of disk space crashes during big merges. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1560	2021-08-25 10:01:09 +03:00
Aliaksandr Valialkin	67bc407747	lib/promscrape: reduce memory and CPU usage when Prometheus staleness tracking is enabled for metrics from deleted / disappeared scrape targets Store the scraped response body instead of storing the parsed and relabeled metrics. This should reduce memory usage, since the response body takes less memory than the parsed and relabeled metrics. This is especially true for Kubernetes service discovery, which adds many long labels for all the scraped metrics. This should also reduce CPU usage, since the marshaling of the parsed and relabeled metrics has been substituted by response body copying. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526	2021-08-21 21:24:07 +03:00
Aliaksandr Valialkin	c3b24882a7	lib/promscrape: use scrapeTimestamp when storing stale markers for failed scrape This will make timestamps for stale markers more consistent with timestamps for other samples	2021-08-19 14:19:54 +03:00
Aliaksandr Valialkin	8ee575dee9	lib/promscrape: send stale markers for the previously scraped metrics on failed scrapes like Prometheus does	2021-08-18 22:00:46 +03:00
Aliaksandr Valialkin	5d92fafc40	app/vmselect: add `-search.noStaleMarkers` command-line flag for disabling stale markers handling in queries This option allows reducing CPU usage a bit when VictoriaMetrics is used for collecting and processing non-Prometheus data. For example, InfluxDB line protocol, Graphite, OpenTSDB, CSV, etc.	2021-08-18 13:58:06 +03:00
Aliaksandr Valialkin	f21fad53b4	lib/promscrape: add ability to disable sending Prometheus staleness markers with -promscrape.disableStaleMarkers command-line flag This option can be useful when vmagent consumes too much additional memory for staleness markers functionality and when staleness markers aren't needed.	2021-08-18 13:58:05 +03:00
Aliaksandr Valialkin	db34c40aec	lib/promscrape: stop scrapers for the removed targets before starting scrapers for the added targets This should prevent from possible time series overlap when old target is substituted by new target (for example, during Kubernetes deployments). Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1530 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1509	2021-08-17 01:00:40 +03:00
Aliaksandr Valialkin	5f13c519ee	lib/promscrape: restore red highlighting for DOWN targets at /targets page Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1461	2021-08-15 16:04:33 +03:00
Aliaksandr Valialkin	c1f81f08d4	all: add support for Prometheus staleness markers Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1509 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1530 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845	2021-08-13 12:13:15 +03:00
Aliaksandr Valialkin	90efb5831b	lib/envflag: add a link to docs for -envflag.enable	2021-08-11 10:32:40 +03:00
Aliaksandr Valialkin	b877538622	app/vmagent: follow-up after `fe445f753b` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1491	2021-08-05 09:51:00 +03:00
Omar Ghader	fe445f753b	feature: Add multitenant for vmagent (#1505 ) * feature: Add multitenant for vmagent * Minor fix * Fix rcs index out of range * Minor fix * Fix multi Init * Fix multi Init * Fix multi Init * Add default multi * Adjust naming * Add TenantInserted metrics * Add TenantInserted metrics * fix: remove unused metrics for vmagent * fix: remove unused metrics for vmagent Co-authored-by: mghader <marc.ghader@ubisoft.com> Co-authored-by: Sebastian YEPES <syepes@gmail.com>	2021-08-05 09:44:29 +03:00
Aliaksandr Valialkin	77bb9e1656	lib/promscrape/discovery/gce: add __meta_gce_interface_ipv4_<name> labels as in Prometheus 2.29 See https://github.com/prometheus/prometheus/pull/8978	2021-08-03 15:51:45 +03:00
Aliaksandr Valialkin	336a2aa2e0	lib/promscrape/discovery/ec2: add `__meta_ec2_availability_zone_id` label as Prometheus 2.29 does	2021-08-03 13:28:13 +03:00
Aliaksandr Valialkin	c473d8ffe1	lib/storage: re-use the per-day inverted index search code for searching in global index This allows removing a big pile of outdated code for global index search. This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1486	2021-07-30 10:28:20 +03:00
Nikolay	6d47e750be	adds check for region with custom s3 endpoint (#1465 )	2021-07-27 12:39:10 +03:00
Aliaksandr Valialkin	1950f57316	lib/storage: yet another attempt to properly determine disk space shortage, which prevents from optimal merges Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373	2021-07-27 12:03:31 +03:00
Aliaksandr Valialkin	92628f9f07	lib/promrelabel: add tests for verifying that regex works as expected in single quotes and double quotes	2021-07-27 10:53:03 +03:00
Aliaksandr Valialkin	5d255846ac	all: add `go:build` lines for Go1.17 See https://tip.golang.org/doc/go1.17#gofmt for more details	2021-07-26 15:50:46 +03:00
Aliaksandr Valialkin	c857e05604	lib/promscrape: add missing whitespace at /targets page before `up` word	2021-07-26 12:23:06 +03:00
Aliaksandr Valialkin	376af3c956	lib/workingsetcache: switch from split cache to full cache after the cache size exceeds 95% of split capacity Previously the switch occurred when the cache size becomes 100% of its capacity. The cache size could never reach 100% capacity. This could prevent from switching from the split cache to full cache, thus reducing the cache effectiveness.	2021-07-15 16:53:35 +03:00
Aliaksandr Valialkin	9d3f9da5ad	lib/storage: make sure the second call to DeduplicateSamples and deduplicateSamplesDuringMerge doesn't change samples	2021-07-15 12:18:38 +03:00
Aliaksandr Valialkin	e992754e79	lib/storage: remove cache directory if it contains reset_cache_on_startup file See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1447	2021-07-13 17:59:51 +03:00
Aliaksandr Valialkin	e6edb85fa2	lib/httpserver: add `is_set` label to `flag` metrics This label allows determining the set flags with the query `flag{is_set="true"}`	2021-07-13 15:10:18 +03:00
Aliaksandr Valialkin	51cd19d2e3	lib/storage: reset perKeyMisses stats less frequently This should reduce CPU usage for queries executed with intervals higher than 30 seconds	2021-07-12 14:34:54 +03:00
Aliaksandr Valialkin	3f705fe8d7	lib/storage: properly limit the size of `storage/date_metricID` cache	2021-07-12 14:25:28 +03:00
Aliaksandr Valialkin	ef3c58d7a3	lib/storage: properly determine when the deduplication is needed in needsDedup Previously needsDedup() could return true if the de-duplication wasn't needed for the following case: d < interval / \ \| v \| v \| interval interval Now it properly returns false for this case	2021-07-12 10:54:51 +03:00
Aliaksandr Valialkin	41754e12f8	lib/mergeset: cache indexBlock items only on the second request This should reduce the indexdb/indexBlocks cache size, since it won't contain one-time-wonders items.	2021-07-07 15:24:37 +03:00
Aliaksandr Valialkin	ceda2b1df4	lib/httpserver: print full requestURI in httpserver.Errorf This should simplify debugging.	2021-07-07 13:11:29 +03:00
Aliaksandr Valialkin	9826f7c1be	lib/storage: do not cache inmemoryBlock entries requested only once (aka one-time-wonder items) This should reduce the cache size and memory usage for the indexdb/dataBlocks cache	2021-07-07 10:59:45 +03:00
Aliaksandr Valialkin	74ace9340d	lib/storage: periodically reset prefetchedMetricIDs cache in order to limit its size under high churn rate	2021-07-07 10:59:39 +03:00
Aliaksandr Valialkin	a846febc89	Revert "lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method" This reverts commit `7c6d3981bf`. Reason for revert: high contention at bucket16Pool on systems with big number of CPU cores. This slows down query processing significantly.	2021-07-06 18:26:56 +03:00
Aliaksandr Valialkin	b805a675f3	lib/{mergeset,storage}: switch from sync.Pool to chan-based pool for inmemoryPart objects This should reduce memory usage on systems with big number of CPU cores, since every inmemoryPart object occupies at least 64KB of memory and sync.Pool maintains a separate pool inmemoryPart objects per each CPU core. Though the new scheme for the pool worsens per-cpu cache locality, this should be amortized by big sizes of inmemoryPart objects.	2021-07-06 16:33:25 +03:00
Aliaksandr Valialkin	d8e7c1ef27	lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method This reduces the load on memory allocator in Go runtime in production workload.	2021-07-06 16:33:24 +03:00
Aliaksandr Valialkin	db6bd69475	lib/mergeset: increase pool capacity for inmemoryBlock according to collected profiles from production workload CPU and memory profiles show that the pool capacity for inmemoryBlock objects is too small. This results in the increased load on memory allocation code in Go runtime. Increase the pool capacity in order to reduce the load on Go runtime.	2021-07-06 13:44:27 +03:00
Aliaksandr Valialkin	fd32855a6c	lib/mergeset: limit the frequency for flushCallback calls to once per 10 seconds This should improve hit ratio for tagFiltersCache when big number of new time series are constantly registered (aka high churn rate). This, in turn, should reduce CPU usage for queries over such time series.	2021-07-06 12:20:15 +03:00
Aliaksandr Valialkin	22c6e64bbc	lib/storage: consistency renaming: tagCache -> tagFiltersCache This improves code readability	2021-07-06 11:03:30 +03:00
Aliaksandr Valialkin	21abf487c3	lib/workingsetcache: properly update stats for requests and cache misses Previously the stats for cache misses could be improperly counted, because it had inflated cache misses if the entry was missing in the curr cache, but was existing in the prev cache. The same applies to cache requests - they were inflated if the entry was missing in the curr cache.	2021-07-06 10:54:38 +03:00
Aliaksandr Valialkin	e5031d9aee	lib/workingsetcache: fix cache capacity calculations after `4f0003f182`	2021-07-05 17:16:35 +03:00
Aliaksandr Valialkin	bd71f102e8	lib/workingsetcache: typo fixes after `d0c830039d`	2021-07-05 15:35:51 +03:00
Aliaksandr Valialkin	4b25e627f8	lib/workingsetcache: properly switch to `whole` mode Previously the switch from `split` to `whole` mode had been performed too early, e.g. when the current cache size became bigger than 1/4 of the allowed cache size. Now it is performed when the current cache size becomes bigger than 1/2 of the allowed cache size. This change can reduce memory usage for data ingestion path when big number of active time series are ingested.	2021-07-05 15:15:39 +03:00
Aliaksandr Valialkin	51516b96e6	lib/storage: tune cache sizes according to production workload	2021-07-05 15:14:45 +03:00
Aliaksandr Valialkin	f12f97daa1	lib/{storage,mergeset}: increase cache timeout for data and index blocks from a minute to two minutes A one-minute cache timeout results in slower queries in some production workloads where the interval between query executions is in the range 1 minute - 2 minutes.	2021-07-05 14:25:59 +03:00
Aliaksandr Valialkin	377bb06b47	lib/cgroup: set GOGC to 50 by default if it isn't set This should reduce memory usage for typical VictoriaMetrics workloads by up to 50%	2021-07-05 12:34:01 +03:00
Aliaksandr Valialkin	8055439fe4	lib/storage: properly detect free disk space shortage during data merge Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373	2021-07-02 17:42:23 +03:00
Aliaksandr Valialkin	6fc3696260	lib/promscrape/discovery/consul: use case-insensitive comparison for service names Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1424	2021-07-02 14:49:22 +03:00
Aliaksandr Valialkin	61e483a01c	lib/protoparser/clusternative: remove unused field - unmarshalWork.lastResetTime This is a follow-up for `b84aea1e6e`	2021-07-02 13:32:59 +03:00
Aliaksandr Valialkin	72de54f93e	lib/promauth: cache the client TLS certificate for up to a second This should reduce CPU usage when TLS connections are established at a high rate.	2021-07-02 13:20:18 +03:00
Aliaksandr Valialkin	1c12c0f79c	lib/promauth: reload TLS certificates from disk on every mTLS connection as Prometheus does This allows updating client certificates without the need to restart vmagent and/or single-node VictoriaMetrics. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1420 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/470	2021-07-01 15:43:43 +03:00
Nikolay	6bd2309449	fixes /targets button style (#1423 ) * fixes /targets button style https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1422 * updates bootstrap version	2021-07-01 11:52:47 +03:00
Aliaksandr Valialkin	71c856beb8	lib/workingsetcache: reset the cache mode when the cache is reset This should reduce memory usage if the working set is reduced after the cache reset.	2021-07-01 11:52:47 +03:00
Aliaksandr Valialkin	bced9ee666	lib/{mergeset,storage}: reduce the maximum lifetime for cached indexdb and data blocks from 2 minutes to a minute This should reduce memory usage on a system with high number of active time series and a high churn rate. One minute is enough for caching the blocks needed for repeated queries (e.g. alerting rules, recording rules and dashboard refreshes).	2021-06-29 19:57:53 +03:00
Aliaksandr Valialkin	b7c0b3dde3	lib/mergeset: switch from sync.Pool to a channel for a pool for inmemoryBlock structs This should reduce memory usage for the pool on systems with big number of CPU cores. The sync.Pool maintains per-CPU pools, so the total number of objects in the pool is proportional to the number of available CPU cores. The channel limits the number of pooled objects by its own capacity. This means smaller number of pooled objects on average.	2021-06-29 19:57:52 +03:00
Aliaksandr Valialkin	2edfea8c36	lib/promscrape/discovery/docker: fix golint warning: `struct field Id should be ID`	2021-06-29 13:11:33 +03:00
Aliaksandr Valialkin	609ad6d9bf	lib/storage: put indexDBName into the key for dateTagFilter cache and for uselessTagFilters cache This should prevent from stats overwriting when the previous indexdb is queried.	2021-06-29 13:11:32 +03:00
Aliaksandr Valialkin	0c4c630839	lib/promscrape: typo fix in `/targets` output The typo has been introduced in `fb72a2133f` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1408	2021-06-28 21:27:22 +03:00
Aliaksandr Valialkin	97d1ccfc8e	lib/promscrape: split docker and dockerswarm service discovery code bases, since they have very little in common This is a follow up after `c85a5b7fcb`	2021-06-25 13:22:16 +03:00
Aliaksandr Valialkin	4461e20e7d	lib/promscrape: consistently sort service discovery routines This should simplify further maintenance of the code	2021-06-25 13:22:16 +03:00
Lu Jiajing	12b4cbb68f	Support Docker ServiceDiscovery (#1402 ) * add docker discovery * add test * add labels test and add scrape work * remove TODO * refactor to merge apiConfig and sdConfig * apply suggestion	2021-06-25 13:22:16 +03:00
Nikolay	501429c3ff	adds missing MustStop call to do and http sd (#1404 )	2021-06-25 11:43:32 +03:00
Aliaksandr Valialkin	b84aea1e6e	lib/protoparser/clusternative: do not pool unmarshalWork structs, since they can occupy big amounts of memory (more than 100MB per each struct) This should reduce memory usage for vmstorage under high ingestion rate when the vmstorage runs on a system with big number of CPU cores	2021-06-23 15:45:08 +03:00
Aliaksandr Valialkin	a22f37599b	lib/storage: tune tag filters search logic Tune the logic according to the logs provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-864293624 The previous logic had a race when multiple concurrent queries execute the same tag filter without prior stats. This could result in incorrectly stored stats for such tag filter, which then could result in non-optimal sorting of tag filters for further queries. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338	2021-06-23 13:30:36 +03:00
Aliaksandr Valialkin	f10fa0d1d7	lib/promscrape/discovery/consul: properly pass namespace to Consul watcher Follow-up for `58a2989fe7`	2021-06-22 17:43:20 +03:00
Aliaksandr Valialkin	4adf6c9766	lib/promscrape/discovery/http: follow up after `e307bbb29a`	2021-06-22 13:42:10 +03:00
Nikolay	e03a3d3a36	adds http_sd (#1399 ) * adds http_sd * adds X-Prometheus-Refresh-Interval-Seconds header * Update lib/promscrape/discovery/http/api.go Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>	2021-06-22 13:42:09 +03:00
Aliaksandr Valialkin	3ab3902f17	lib/promscrape/discovery: support generic auth configs in Consul service discovery in the same way as Prometheus 2.28 does	2021-06-22 13:18:51 +03:00
Nikolay	827a2396d2	adds consul enterprise namespace support (#1400 ) * adds consul enterprise namespace support * Update lib/promscrape/discovery/consul/consul.go Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>	2021-06-22 12:56:11 +03:00
Aliaksandr Valialkin	f9069ba32a	lib/promscrape: show jobs with empty scrape targets on /targets page	2021-06-18 10:54:12 +03:00
Nikolay	9ea1dca3dd	fixes DO service discovery labels (#1389 ) adds test for digitalocean sd	2021-06-17 17:21:10 +03:00
Aliaksandr Valialkin	a207be3ffb	lib/storage: fix infinite loop introduced in `aa9b56a046` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244	2021-06-17 14:27:30 +03:00
Aliaksandr Valialkin	0efd37cec1	lib/{mergeset,storage}: reduce the number of fsync calls on data ingestion path on systems with many cpu cores VictoriaMetrics maintains a buffer per CPU core for the ingested data. These buffers are flushed to disk every second. These buffers are flushed to disk in parallel starting from the commit `56b6b893ce` . This resulted in increased write disk IO usage on systems with many cpu cores as described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-863046999 . This commit merges the per-CPU buffers into bigger in-memory buffers before flushing them to disk. This should reduce the rate of fsync syscalls and, consequently, the write disk IO on systems with many CPU cores. This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338 See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244	2021-06-17 13:51:42 +03:00
Aliaksandr Valialkin	b133de1e37	lib/storage: move deletedMetricIDs set from indexDB to Storage This makes the list of deleted metricIDs consistent when it is used from both the current indexDB and the previous indexDB (aka extDB). This should fix the issue, which could lead to storing new samples under deleted metricIDs after indexDB rotation. See more details at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347#issuecomment-861232136 . Thanks to @tangqipengleoo for the initial analysis and the pull request - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 . This commit resolves the issue in a more generic way compared to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 . The downside of the commit is that the deletedMetricIDs set isn't cleaned from the metricIDs outside the retention. It needs app restart. This should be OK in most cases.	2021-06-15 15:07:54 +03:00
Aliaksandr Valialkin	ebaf68bcb0	lib/protoparser: stop reading the input stream as soon as the callback provided by the caller returns error This is a follow-up for `af90c3c43b`	2021-06-14 15:20:38 +03:00
faceair	2ea187e801	lib/protoparser: stop read when callback error (#1380 )	2021-06-14 15:20:37 +03:00
Aliaksandr Valialkin	5f91a701fa	lib/promscrape: show the number of samples collected during the last scrape at /targets and /api/v1/targets pages Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1377	2021-06-14 14:04:35 +03:00
Nikolay	e42da47608	adds digital ocean sd (#1376 ) * adds digital ocean sd config * adds digital ocean sd https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1367 * typo fix	2021-06-14 13:19:29 +03:00
Aliaksandr Valialkin	df057177a0	lib/promscrape: increase the duration for reading the full response in stream parsing mode Increase the duration from 10x to 30x of the configured `scrape_interval`. This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365	2021-06-14 12:29:46 +03:00
Aliaksandr Valialkin	074b11fa69	lib/protoparser: measure the duration for reading the whole block of data instead of a single read operation Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365	2021-06-14 12:29:45 +03:00
Aliaksandr Valialkin	87d221f78a	lib/protoparser/common: log the duration for reading a block of data in ReadLinesBlockExt on error This may help debugging issues like https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365	2021-06-14 12:21:21 +03:00
Aliaksandr Valialkin	0672cfffa2	app/vmauth: properly handle http.ErrAbortHandler panic This panic can be raised by the reverseProxy on aborted request to the backend. So handle it (e.g. suppress) at reverseProxy.ServeHTTP call. Do not suppress the panic at lib/httpserver generic HTTP handler, since it may result in an inconsistent state left after the panicking handler. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1353	2021-06-11 12:54:37 +03:00
Aliaksandr Valialkin	ce10bdc82a	lib/storage: reset cache on disk during series deletion and during indexdb rotation This should prevent from inconsistent behavior (aka partially missing data for some time series) after unclean shutdown. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347	2021-06-11 12:54:36 +03:00
Aliaksandr Valialkin	eb335d2c29	lib/storage: consistency renaming: getMaxRawRowsPerPartition -> getMaxRawRowsPerShard	2021-06-11 10:52:31 +03:00
Aliaksandr Valialkin	d06c0e7a94	lib/storage: reduce the amounts of memory which can be occupied by rawRow items during data ingestion on a system with many CPU cores	2021-06-11 10:49:02 +03:00
Nikolay	2c1611d316	disables panic for net/httpAbortHandler (#1355 )	2021-06-09 12:12:45 +03:00
Aliaksandr Valialkin	1e4a64844d	lib/storage: properly account the number of loops spent when matching for `or suffixes` This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338	2021-06-08 13:07:14 +03:00
Aliaksandr Valialkin	e7d353ee6a	lib/promrelabel: add tests for labelsToString() function	2021-06-04 20:42:14 +03:00
Aliaksandr Valialkin	269e35d676	app/{vmagent,vminsert}: follow-up after `2fe045e2a4` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1343	2021-06-04 20:33:22 +03:00
jelmd	d8b46908db	new feature: debug relabeling (#1344 ) * new feature: relabel logging Use scrape_configs[x].relabel_debug = true to log metric names including labels before and after relabeling. After relabeling related metrics get dropped, i.e. not submitted to servers. * vminsert wants relabel logging, too.	2021-06-04 20:33:21 +03:00
Nikolay	3d89c01d07	fixes solaris build (#1345 )	2021-06-04 11:56:06 +03:00
Hason Chan	439c2ed510	fix eureka_sd_configs HTTPClientConfig incorrect parsing (#1350 )	2021-06-04 11:56:06 +03:00
Aliaksandr Valialkin	fc2565b4ee	lib/storage: reduce memory allocations when syncing dateMetricIDCache	2021-06-03 16:20:02 +03:00
Aliaksandr Valialkin	0b9f0de0a1	lib/promscrape: fix tests after `f0c21b6300`	2021-05-28 01:33:28 +03:00
Aliaksandr Valialkin	6865f3b497	Revert "lib/mergeset: remove a pool for inmemoryBlock structs" This reverts commit `793fe39921`. Reason to revert: production testing revealed possible slowdown when registering big number of new time series	2021-05-28 01:11:22 +03:00
Aliaksandr Valialkin	7b33bc67a1	lib/mergeset: remove a pool for inmemoryBlock structs The pool for inmemoryBlock struct doesn't give any performance gains in production workloads, while it may result in excess memory usage for inmemoryBlock structs inside the pool during background merge of indexdb.	2021-05-27 22:00:50 +03:00
Aliaksandr Valialkin	97de72054e	docs: document `f0c21b6300`	2021-05-27 15:04:13 +03:00
faceair	b801b299f0	lib/promscrape: apply body size & sample limit to stream parse (#1331 ) * lib/promscrape: apply body size limit to stream parse Signed-off-by: faceair <git@faceair.me> * lib/promscrape: apply sample limit to stream parse Signed-off-by: faceair <git@faceair.me>	2021-05-27 15:04:11 +03:00
Aliaksandr Valialkin	49490ae5a7	lib/protoparser/clusternative: remove duplicate `cannot read packet size` phrase from the log message Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1336	2021-05-27 12:09:17 +03:00
Aliaksandr Valialkin	c85084b659	lib/handshake: pass io.EOF unmodified to the caller for BufferedConn.Read, so it could properly detect the end of stream	2021-05-27 12:09:17 +03:00
Aliaksandr Valialkin	10b2855949	lib/storage: fix spelling typo: `borken->broken` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1336	2021-05-27 12:09:17 +03:00
Aliaksandr Valialkin	6b90570ed3	lib/uint64set: store pointers to bucket16 instead of bucket16 objects in bucket32 This speeds up bucket32.addBucketAtPos() when bucket32.buckets contains big number of items, since the copying of bucket16 pointers is much faster than the copying of bucket16 objects. This is a cpu profile for copying bucket16 objects: 10ms 13.43s (flat, cum) 32.01% of Total 10ms 120ms 650: b.b16his = append(b.b16his[:pos+1], b.b16his[pos:]...) . . 651: b.b16his[pos] = hi . 13.31s 652: b.buckets = append(b.buckets[:pos+1], b.buckets[pos:]...) . . 653: b16 := &b.buckets[pos] . . 654: *b16 = bucket16{} . . 655: return b16 . . 656:} This is a cpu profile for copying pointers to bucket16: 10ms 1.14s (flat, cum) 2.19% of Total . 100ms 647: b.b16his = append(b.b16his[:pos+1], b.b16his[pos:]...) . . 648: b.b16his[pos] = hi 10ms 700ms 649: b.buckets = append(b.buckets[:pos+1], b.buckets[pos:]...) . 330ms 650: b16 := &bucket16{} . . 651: b.buckets[pos] = b16 . . 652: return b16 . . 653:}	2021-05-25 14:27:52 +03:00
Aliaksandr Valialkin	1c16cbacf5	lib/storage: do not stop data ingestion on the first error in Storage.AddRows Continue data ingestion for the rest of blocks.	2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin	2601844de3	lib/storage: limit the number of rows per each block in Storage.AddRows() This should reduce memory usage when ingesting big blocks of rows.	2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin	95b735a883	lib/storage: allow filling all the rows up to their capacity in rawRowsShard.addRows This should reduce memory usage a bit on data ingestion path	2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin	0f84503880	lib/bloomfilter: fix TestLimiterConcurrent	2021-05-24 05:18:29 +03:00
Aliaksandr Valialkin	745eda9e87	lib/fs: do not pass done callback to tryRemoveAll() func This improves code readability a bit. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313	2021-05-24 05:00:53 +03:00
Aliaksandr Valialkin	402a8ca710	lib/storage: do not populate MetricID->MetricName cache during data ingestion This cache isn't needed during data ingestion, so there is no need in spending RAM on it. This reduces RAM usage on data ingestion path by 30%	2021-05-24 03:06:40 +03:00
Aliaksandr Valialkin	0fc857d363	lib/{mergeset,storage}: reduce the number of INFO log messages like `merged ... items across ... blocks in ... seconds` Log these messages if the merge takes more than 30 seconds instead of 10 seconds.	2021-05-23 14:15:49 +03:00
Aliaksandr Valialkin	71ff7ee18d	lib/promauth: follow-up after `5b8176c68e`	2021-05-22 18:02:03 +03:00
Nikolay	2780d6dbcd	basic OAuth2 support for remoteWrite and scrape targets (#1316 ) * adds OAuth2 support for remoteWrite and scraping * adds tests changes init	2021-05-22 18:02:01 +03:00
Aliaksandr Valialkin	89e1a45cdb	lib/fs: concurrently remove up to 1024 blocked NFS directories Previously the blocked directories were removed sequentially by a single goroutine. This may not be enough for highly loaded VictoriaMetrics that accepts millions of samples per second, when big number of LSM parts are created and removed at high rate. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313	2021-05-21 17:58:08 +03:00
Aliaksandr Valialkin	23355ca34c	lib/fs: wait for a while before giving up on NFS file removal if the removal queue is full This should reduce the probability of the panic on a highly loaded VictoriaMetrics accepting millions of samples per second. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313	2021-05-21 17:21:35 +03:00
Aliaksandr Valialkin	d77db9d813	all: do not skip SIGHUP signal during service initialization This can lead to stale or incomplete configs like in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240	2021-05-21 16:38:20 +03:00
Aliaksandr Valialkin	69e365cd48	Makefile: update golangci-lint from v1.29.0 to v1.40.1	2021-05-20 18:30:24 +03:00
Aliaksandr Valialkin	da0b32c31a	app/vmagent/remotewrite: expose metrics with the current number of active series per day and per hour These numbers are exposed via the following metrics: - vmagent_hourly_series_limit_current_series - vmagent_daily_series_limit_current_series Expose also the limits via the following metrics: - vmagent_hourly_series_limit_max_series - vmagent_daily_series_limit_max_series	2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin	165a9f9200	app/vmstorage: add ability to limit series cardinality via `-storage.maxHourlySeries` and `-storage.maxDailySeries` command-line flags	2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin	7aad5c3f76	app/vmagent: add ability to limit series cardinality on a per-hour and per-day basis	2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin	110a888e39	lib/promscrape/discovery/kubernetes: make `golangci-lint` happy by removing empty branches	2021-05-20 12:00:17 +03:00
Aliaksandr Valialkin	e228f479a5	lib/storage: remove possible data race when logging dropped labels	2021-05-20 11:54:06 +03:00
Aliaksandr Valialkin	9d97f44772	lib/promscrape/discovery/kubernetes: reload objects on object parse error Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240	2021-05-18 23:27:24 +03:00
Aliaksandr Valialkin	74ef40034c	lib/httpserver: typo fix in `-http.shutdownDelay` command-line flag description: servier -> server	2021-05-18 16:25:27 +03:00
Aliaksandr Valialkin	c507faec0b	lib/promscrape/discovery/kubernetes: simplify the reload logic for urlWatcher.objectsByKey	2021-05-18 15:41:51 +03:00
Aliaksandr Valialkin	0f54c0121b	lib/promscrape/discovery/kubernetes: properly update vm_promscrape_discovery_kubernetes_scrape_works metric Previously it wasn't decreased during config update.	2021-05-18 15:41:51 +03:00
Aliaksandr Valialkin	9f62d348db	lib/promscrape/discovery/kubernetes: log errors and stop service discovery when unexpected updates are received from Kubernetes API server Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240	2021-05-18 15:41:51 +03:00
Aliaksandr Valialkin	6ea191d196	docs: dealay -> delay	2021-05-18 01:07:32 +03:00
Aliaksandr Valialkin	c4ed50ae54	lib/promrelabel: add tests for conditional removal of label on another label match Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1294	2021-05-18 00:23:23 +03:00
Aliaksandr Valialkin	8764b0ae21	lib/promscrape/discovery/kubernetes: key ScrapeWork objects by urlWatcher instead of namespace This makes the code less fragile if urlWatcher comes to depend on properties in addition to namespace. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240	2021-05-17 23:49:48 +03:00
Aliaksandr Valialkin	e08287f017	lib/promscrape: reload auth tokens from files every second Previously auth tokens were loaded at startup and couldn't be updated without a vmagent restart. Now a vmagent restart is no longer needed. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1297	2021-05-14 20:03:35 +03:00
Aliaksandr Valialkin	a6cb4f10a7	app/{vmalert,vmauth}: explicitly set MaxIdleConnsPerHost in net/http.Client.Transport By default MaxIdleConnsPerHost is set to 2. This limits the possibility to re-use http keep-alive connections. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1300	2021-05-14 18:13:34 +03:00
Aliaksandr Valialkin	e3f61d540b	lib/promscrape: limit `scrape_timeout` by `scrape_interval` like Prometheus does Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1281	2021-05-13 16:10:42 +03:00
匠心零度	d5285ecaf0	fix vmagent imbalance problem (#1292 ) /path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=0 -promscrape.config=/path/to/config.yml ... /path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=1 -promscrape.config=/path/to/config.yml ... /path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=2 -promscrape.config=/path/to/config.yml ... Co-authored-by: lirenzuo <lirenzuo@shein.com>	2021-05-13 11:19:30 +03:00
Aliaksandr Valialkin	f13585dc5d	vendor: update github.com/VictoriaMetrics/fasthttp from v1.0.14 to v1.0.15 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1289	2021-05-13 10:47:09 +03:00
Aliaksandr Valialkin	d13906bf1f	lib/promscrape: exponentially increase retry interval on unsuccessful requests to scrape targets or to service discovery services This should reduce CPU load at vmagent and at the remote side when the remote side doesn't accept HTTP requests. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1289	2021-05-13 10:47:07 +03:00
Aliaksandr Valialkin	66c6976723	lib/cgroup: document the ability to detect cgroup v2 memory and cpu limits. This is follow-up for `b50024812e`	2021-05-13 09:27:35 +03:00
Nikolay	8743bf541f	adds cgroupsv2 support (#1283 ) * adds cgroupv2 limits support https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1269 * small fix * changes Atoi to ParseUint	2021-05-13 09:27:33 +03:00
Aliaksandr Valialkin	2839055513	lib/storage: substitute GetTSDBStatusForDate with GetTSDBStatusWithFiltersForDate with nil tfss	2021-05-13 09:01:05 +03:00
Aliaksandr Valialkin	008ae25b3a	lib/storage: merge getTSDBStatusForDate with getTSDBStatusWithFiltersForDate These functions are non-trivial, while their code has minimal differences. It is better from maintainability PoV to merge these functions into a single function.	2021-05-12 18:01:08 +03:00
Nikolay	be87be34a4	Adds tsdb match filters (#1282 ) * init work on filters * init propose for status filters * fixes tsdb status adds test * fix bug * removes checks from test	2021-05-12 17:16:58 +03:00
Aliaksandr Valialkin	027607db3e	lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices scrape targets every 5 seconds, since they may depend on changed service and pod objects This should make endpoints and endpointslices scrape targets eventually consistent with the maximum delay of 5 seconds after the related service or pod object changes. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240	2021-05-12 14:12:43 +03:00
Aliaksandr Valialkin	1d32b008c6	lib/httpserver: add new X-Server-Hostname header instead of overwriting already existing header This makes it possible to track the origins of chained requests over multiple hops.	2021-05-11 23:47:19 +03:00
Aliaksandr Valialkin	f1317f7c6c	lib/httpserver: return X-Server-Hostname http header in all the responses for better debuggability	2021-05-11 22:04:41 +03:00
Aliaksandr Valialkin	4e59cf4380	lib/storage: properly apply time range when matching an empty filter It must match all the time series on the given time range. Previously it was matched to all the time series without the restriction on the given time range.	2021-05-11 01:09:35 +03:00
Aliaksandr Valialkin	326cf83eb4	lib/storage: remove dead code after the commit `3ccf7ea20c`	2021-05-08 20:15:59 +03:00
Aliaksandr Valialkin	9c505d27dd	lib/ingestserver: properly close incoming connections during graceful shutdown	2021-05-08 19:53:45 +03:00
Aliaksandr Valialkin	4a5f45c77e	app/vminsert: add support for data ingestion via other vminsert nodes	2021-05-08 19:53:45 +03:00
Aliaksandr Valialkin	e6c19cb09d	lib/promscrape/discovery/kubernetes: start watchers for pods and services before starting watchers for endpoints This should eliminate possible race when an update on endpoints depends on pods and/or services, which are missing in the cache yet. This could result in missing targets based on endpoints or endpointslices. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240	2021-05-05 12:23:16 +03:00
Aliaksandr Valialkin	43c52ff77a	lib/storage: use WARNING instead of INFO level for logging dropped labels	2021-05-03 13:57:28 +03:00
Aliaksandr Valialkin	ec6becd3f5	lib/httpserver: stop the process on panics in request handlers Panics may leave the process in an inconsistent state. That's why it is better to stop the process after the panic instead of recovering from the panic. Unfortunately, the standard net/http.Server recovers panics in request handlers. See https://github.com/golang/go/issues/16542 . That's why lib/httpserver must stop the process itself after the panic.	2021-05-03 12:00:44 +03:00
Nikolay	62d58324dd	adds stalePartsRemover (#1261 ) for new created partitions	2021-05-03 11:34:33 +03:00
Aliaksandr Valialkin	60ffbcbb99	lib/promrelabel: add tests for removing the specified {label="value"} pair	2021-05-03 11:26:58 +03:00
Aliaksandr Valialkin	b43ba6d85f	lib/storage: log dropped labels if the number of labels in a metric exceeds `-maxLabelsPerTimeseries` command-line flag value This should improve debuggability for this case.	2021-05-01 09:29:56 +03:00
Aliaksandr Valialkin	8be1cb297b	app/vmagent: list user-visible endpoints at `http://vmagent:8429/` While at it, use common WriteAPIHelp function for the listing in vmagent, vmalert and victoria-metrics	2021-04-30 09:38:23 +03:00
Aliaksandr Valialkin	421a92983a	lib/promscrape/discovery/kubernetes: remove a mutex at urlWatcher - use groupWatcher mutex for accessing all the urlWatcher children This simplifies the code a bit and reduces the probability of improper mutex handling and deadlocks.	2021-04-29 10:17:45 +03:00
Nikolay	535b3ff618	vmagent kubernetes_sd tests (#1253 ) * first part of tests for kubernetes sd * makes linter happy * added more test cases * adds pub/sub for tests	2021-04-29 10:17:45 +03:00

1968 commits