github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	c42ddce159	lib/promscrape: add support for `enable_compression` option in the same way as Prometheus does Updates https://github.com/prometheus/prometheus/pull/13166 Updates https://github.com/prometheus/prometheus/issues/12319 Do not document enable_compression option at docs/sd_configs.md, since vmagent already supports more clear disable_compression option - see https://docs.victoriametrics.com/vmagent/#scrape_config-enhancements	2024-02-18 19:40:39 +02:00
Aliaksandr Valialkin	5a092e161c	lib/promscrape/discovery/kuma: add support for `client_id` option See https://github.com/prometheus/prometheus/pull/13278	2024-02-18 19:19:40 +02:00
Aliaksandr Valialkin	2e30842582	vendor: update github.com/VictoriaMetrics/metricsql from v0.72.1 to v0.73.0 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5383	2024-02-18 18:41:41 +02:00
Aliaksandr Valialkin	dfbb6e0826	docs/LTS-releases.md: cosmetic fixes	2024-02-17 18:09:36 +02:00
Aliaksandr Valialkin	d03719e72d	docs/CHANGELOG.md: document `f8207e33a2`	2024-02-17 17:52:53 +02:00
Aliaksandr Valialkin	cee901cdf4	docs/CHANGELOG.md: add missing `for` in the description of the TLS configuration features for vmctl This is a follow-up for `6a07cb1bdb` and `f973711e56`	2024-02-17 17:44:22 +02:00
hagen1778	f973711e56	app/vmctl: follow-up after `0c293a66ec` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-16 15:22:44 +01:00
Khushi Jain	0c293a66ec	app/vmctl : support TLS config options for remote read mode (#5798 )	2024-02-16 15:12:43 +01:00
hagen1778	6a07cb1bdb	app/vmctl: follow-up after `7cd1b7d047` * cleanup code * update docs Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-16 15:08:51 +01:00
Khushi Jain	7cd1b7d047	app/vmctl : support TLS config options for InfluxDB datasource (#5783 ) * vmctl: TLS flags for influx DB * added httputils function * Add changelog and doc --------- Co-authored-by: Khushi Jain <khushi.jain@nokia.com>	2024-02-16 14:59:18 +01:00
Aliaksandr Valialkin	6b9bedd0f9	app/vmstorage: expose vm_last_partition_parts metrics, which may help identifying performance issues related to the increased number of parts in the last partition	2024-02-15 14:51:19 +02:00
Aliaksandr Valialkin	926854b0f3	docs/CHANGELOG.md: document v1.93.12 LTS release	2024-02-14 20:17:44 +02:00
Aliaksandr Valialkin	39cba7e4fa	docs/CHANGELOG.md: document v1.97.2 LTS release	2024-02-14 18:50:50 +02:00
Aliaksandr Valialkin	baaa88001e	lib/promrelabel: store the original labels before returning them to promutils.PutLabels() This should reduce memory allocations. This is a follow-up for `b09bd6c42a` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5389	2024-02-14 16:09:10 +02:00
Aliaksandr Valialkin	e4ad41b5ff	docs/CHANGELOG.md: cut v1.98.0 release	2024-02-14 16:00:50 +02:00
Aliaksandr Valialkin	4954eee187	docs/CHANGELOG.md: various typo cleanups	2024-02-14 02:45:17 +02:00
Yury Molodov	1c9f13d6c7	vmui: improve the context for autocomplete #5736 #5737 #5739 (#5804 ) Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-14 00:22:51 +00:00
Aliaksandr Valialkin	d6e22f2888	app/vmselect: add sum_eq_over_time, sum_gt_over_time and sum_le_over_time functions to MetricsQL See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4641	2024-02-13 23:40:07 +02:00
Nikolay	88329d84ca	app/vmauth: properly release memory during config reload (#5805 ) * app/vmauth: properly release memory during config reload Previously the metrics package held a reference to the channels for users' concurrent requests. In case of churn in the `name` field of the users configuration, a new metric was created, but the previous one wasn't deleted. This prevented the full parsed configuration from being garbage collected. Now all config-related metrics are bound to the corresponding metrics.Set and unregistered during the config reload process. This should also fix an issue with incorrect values for current concurrent user requests https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4690 * wip --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-13 18:49:17 +00:00
Aliaksandr Valialkin	2f3091460f	vendor: update github.com/VictoriaMetrics/metricsql from v0.70.1 to v0.71.0 This adds an ability to propagate label filters across label_set() and alias() functions. This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1827#issuecomment-1654095358	2024-02-13 06:36:59 +02:00
Aliaksandr Valialkin	e963d6c789	app/vmagent/remotewrite: add -remoteWrite.tlsHandshakeTimeout command-line flag for tuning tls handshake timeout to -remoteWrite.url Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1699	2024-02-13 02:46:33 +02:00
Aliaksandr Valialkin	062cbb1130	app/vmauth: add support for mTLS-based routing of incoming requests to different backends depending on the subject field in the TLS certificate provided by the user Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1547	2024-02-13 01:03:20 +02:00
Aliaksandr Valialkin	95222b2079	all: upgrade Go builder from Go1.21.7 to Go1.22.0 See https://go.dev/doc/go1.22	2024-02-12 21:59:51 +02:00
Roman Khavronenko	8850c7431d	app/vmalert: support filtering for /api/v1/rule like Prometheus does (#5787 ) Follow-up after `62e5e2a4c8` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-09 14:35:31 +01:00
Victor Amorim dos Santos	62e5e2a4c8	app/vmalert: support `type` param for filtering /api/v1/rules response by rule type (#5749 ) Co-authored-by: Hui Wang <haley@victoriametrics.com>	2024-02-09 09:02:35 +01:00
Aliaksandr Valialkin	ae8a867924	all: add support for specifying multiple -httpListenAddr options	2024-02-09 03:15:04 +02:00
Aliaksandr Valialkin	b161e889b5	docs/CHANGELOG.md: typo fixes	2024-02-08 21:18:45 +02:00
Aliaksandr Valialkin	d8c1db7953	lib/httpserver: do not close client connections every 2 minutes by default Closing client connections every 2 minutes doesn't help load balancing - this just leads to "jumpy" connections between multiple backend servers, e.g. the load isn't spread evenly among backend servers, and instead jumps between the servers every 2 minutes. It is still possible periodically closing client connections by specifying non-zero -http.connTimeout command-line flag. This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1304#issuecomment-1636997037 This is a follow-up for `d387da142e`	2024-02-08 21:10:25 +02:00
Aliaksandr Valialkin	b718e555a6	docs/CHANGELOG.md: add a link to docs describing -disableReroutingOnUnavailable command-line flag	2024-02-08 17:03:19 +02:00
hagen1778	e1926f286b	docs: follow-up after `83e55456e2` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-08 15:55:57 +01:00
Khushi Jain	83e55456e2	app/vmbackup: support client-side TLS configuration for create/delete snapshot API (#5738 )	2024-02-08 15:52:00 +01:00
Aliaksandr Valialkin	a354924b0d	app/victoria-metrics: properly send staleness markers on victoriametrics shutdown if -selfScrapeInterval > 0 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/943 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526	2024-02-08 15:29:19 +02:00
Aliaksandr Valialkin	61d9df4c36	app/vmselect: add ability to reset rollup result cache on startup by passing -search.resetRollupResultCacheOnStartup command-line flag Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/834	2024-02-08 14:40:40 +02:00
Aliaksandr Valialkin	f7b68b466c	docs/CHANGELOG.md: clarify the bugfix description for `e1bf8440eb`	2024-02-08 13:01:19 +02:00
Aliaksandr Valialkin	e1bf8440eb	lib/mergeset: prevent from possible `too big indexBlockSize` panic This panic could occur when samples with too long label values are ingested into VictoriaMetrics. This could result in too long firstItem and commonPrefix values at blockHeader (up to 64kb each). This may inflate the maximum index block size by 4 * maxIndexBlockSize.	2024-02-08 12:54:10 +02:00
hagen1778	3380043424	dashboards: follow-up `4369bc1df2` * add more details to changelog * simplify panels description * remove capacity planning recommendation, as it proves it incompetent Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-08 09:51:43 +01:00
Hui Wang	4369bc1df2	deployment/dashboards: fix `Storage full ETA` panels (#5747 ) During background downsampling, rate(vm_deduplicated_samples_total{type="merge"}) could be much bigger than rate(vm_rows_added_to_storage_total) and it could last quite some time, which causes negative values of Storage full ETA and confuses users, see playground. Instead of trying to get more accurate results during downsampling, I think it's ok to ignore vm_deduplicated_samples_total at all, it's more reasonable to see Storage full ETA increase after downsampling. --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-08 09:43:39 +01:00
Aliaksandr Valialkin	0cf56c1ba5	app/vmselect/promql: properly handle precision errors in rollup functions changes(), increases_over_time() and resets() shouldn't take into account value changes, which may occur because of precision errors. The maximum guaranteed precision for raw samples stored in VictoriaMetrics is 12 decimal digits. So do not count relative changes for values if they are smaller than 1e-12 comparing to the value. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/767	2024-02-08 02:30:57 +02:00
Aliaksandr Valialkin	19c1066a25	docs/CHANGELOG.md: properly document the change at `b74006e2ca` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5774	2024-02-07 22:05:12 +02:00
Nihal	b74006e2ca	[vmsingle/vminsert]: change http success response code to 200 for -/reload request handler (#5776 ) * change vmsingle's response code to 200 for reload request handler Signed-off-by: Syed Nihal <syed.nihal@nokia.com> * change vmsingle's response code to 200 for the reload request handler Signed-off-by: Syed Nihal <syed.nihal@nokia.com> * change vmsingle's response code to 200 for the reload request handler. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5774 Signed-off-by: Syed Nihal <syed.nihal@nokia.com> --------- Signed-off-by: Syed Nihal <syed.nihal@nokia.com>	2024-02-07 20:00:04 +00:00
Aliaksandr Valialkin	e78e5ccfaa	docs/CHANGELOG.md: support empty command-line flag values in short array notation For example, -fooDuration=',10s,' is now supported - it sets three command-line flag values: - the first and the last one are set to the default value for `-fooDuration` - the second one is set to 10s	2024-02-07 20:53:13 +02:00
Aliaksandr Valialkin	b431ccea5b	all: update Go builder from Go1.21.6 to Go1.21.7 See https://github.com/golang/go/issues?q=milestone%3AGo1.21.7+label%3ACherryPickApproved	2024-02-07 04:00:37 +02:00
Aliaksandr Valialkin	541b644d3d	app/{vmagent,vminsert}: follow-up after `a1d1ccd6f2` - Document the change at docs/CHANGELOG.md - Copy changes from docs/Single-server-VictoriaMetrics.md to README.md - Add missing handler for processing multitenant requests ( https://docs.victoriametrics.com/vmagent/#multitenancy ) - Substitute github.com/stretchr/testify dependency with 3 lines of code in the added tests - Comment unclear code at lib/protoparser/datadogsketches/parser.go , so @AndrewChubatiuk could update it and add permalinks to the original source code there. - Various code cleanups Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5584 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3091	2024-02-07 01:28:05 +02:00
Yury Molodov	dcbdbc760e	vmui: improve select component functionality (#5755 ) * vmui: fix select closing on click outside (#5728) * vmui: clear entered text in select after selecting a value (#5727) --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-06 20:50:04 +00:00
Yury Molodov	a81ccbd749	vmui: fix handling invalid timezone (#5758 ) * vmui: fix handling invalid timezone (#5732) * vmui: switch browser timezone flag to isValid --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-06 20:47:30 +00:00
Yury Molodov	65b8002aeb	vmui: fix graph dragging (#5769 ) Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-06 20:32:03 +00:00
Aliaksandr Valialkin	61524ad87b	vendor: update github.com/VictoriaMetrics/metricsql from v0.70.0 to v0.70.1 This should help with the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5604	2024-02-06 21:52:35 +02:00
Aliaksandr Valialkin	f4f1caea2a	docs/CHANGELOG.md: add link to mTLS feature request	2024-02-06 19:32:40 +02:00
Aliaksandr Valialkin	7bc3af1224	lib/httpserver: add support for mTLS for requests to -httpListenAddr	2024-02-06 17:46:19 +02:00
Aliaksandr Valialkin	c1e50848c5	docs/CHANGELOG.md: move descriptions for recently added features from `v1.97.1` to `tip`	2024-02-06 16:31:18 +02:00
Aliaksandr Valialkin	209c96fc42	docs/Cluster-VictoriaMetrics.md: document -disableReroutingOnUnavailable command-line flag This is a follow-up for `88f0d1572e` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5713	2024-02-05 15:18:01 +02:00
Aliaksandr Valialkin	5f836c8729	docs/CHANGELOG.md: add a link to all VictoriaMetrics dashboards for Grafana	2024-02-05 11:42:05 +02:00
Aliaksandr Valialkin	1684766152	docs/CHANGELOG.md: add missing links to the corresponding dashboards	2024-02-05 10:55:30 +02:00
hagen1778	487a94565b	dashboards/all: add new panel `CPU spent on GC` It should help identifying cases when too much CPU is spent on garbage collection, and advice users on how this can be addressed. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-02 16:21:21 +01:00
hagen1778	29a9b31584	dashboards: add `Targets scraped/s` A new stat panel shows the number of targets scraped by the vmagent per-second. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-02 15:48:26 +01:00
Aliaksandr Valialkin	deed8ddfb8	docs/CHANGELOG.md: document v1.93.11 LTS release	2024-02-01 18:21:28 +02:00
Aliaksandr Valialkin	87bf1900e4	docs/CHANGELOG.md: document v1.87.14 LTS release	2024-02-01 17:08:56 +02:00
Aliaksandr Valialkin	31c53adbde	docs: mark v1.97.x as long-term support release	2024-02-01 15:16:39 +02:00
Aliaksandr Valialkin	bdfa4aee0d	docs/CHANGELOG.md: cut v1.97.1	2024-02-01 15:08:40 +02:00
Aliaksandr Valialkin	8aaa828ba3	lib/prompbmarshal: return back custom protobuf marshaler for lib/prompbmarshal.WriteRequest The easyproto-based marshaler is 2x slower than the previous custom marshaler, so let's stick with it. This improves the performance for sending data to remote storage at vmagent and reduces CPU usage to pre-v1.97.0 levels.	2024-02-01 06:33:06 +02:00
Aliaksandr Valialkin	b7fd7ee0b6	lib/promauth: follow-up for `fca3b14b7b` - Simplify the code for handling BasicAuthConfig at lib/promauth/config.go - Move the description of the change into correct place at docs/CHANGELOG.md - Put tests for username in front of tests for password at lib/promauth/config_test.go Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5720 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5511	2024-01-31 19:45:16 +02:00
Nihal	fca3b14b7b	Support for username_file in scrape config (basic_auth) similar to Prometheus for having config compatibility (#5720 ) * adding support for username_file in basic_auth of scrape config Signed-off-by: Syed Nihal <syed.nihal@nokia.com> * adding support for username_file in basic_auth of scrape config. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5511 Signed-off-by: Syed Nihal <syed.nihal@nokia.com> * adding support for username_file in basic_auth of scrape config Signed-off-by: Syed Nihal <syed.nihal@nokia.com> * adding support for username_file in basic_auth of scrape config Signed-off-by: Syed Nihal <syed.nihal@nokia.com> * adding support for username_file in basic_auth of scrape config Signed-off-by: Syed Nihal <syed.nihal@nokia.com> --------- Signed-off-by: Syed Nihal <syed.nihal@nokia.com>	2024-01-31 17:41:16 +00:00
Aliaksandr Valialkin	db4623efc2	app/vmselect/netstorage: properly handle the case when an empty brsPool points to the end of brs.brs This case is possible after a new brsPool is allocated. The fix is to verify whether len(brsPool) >= len(brs.brs) before trying to append a new item to brsPool and sharing its contents with brs.brs. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5733	2024-01-31 10:27:50 +02:00
hagen1778	02492bc1a4	dashboards/single: fix typo in query for `version` annotation The typo falsely produced many version change events. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-31 09:13:46 +01:00
Aliaksandr Valialkin	ec0ca8e7eb	app/vmselect/promql: really keep metric names when keep_metric_names modifier is applied to binary operator Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5556	2024-01-31 02:32:55 +02:00
Aliaksandr Valialkin	fcc8b14f86	deployment/docker: upgrade base Docker image from Alpine 3.19.0 to 3.19.1 See https://www.alpinelinux.org/posts/Alpine-3.19.1-released.html	2024-01-30 22:47:18 +02:00
Aliaksandr Valialkin	26488726a8	docs/CHANGELOG.md: cut v1.97.0	2024-01-30 22:45:04 +02:00
Roman Khavronenko	6939c53e48	app/vmselect: set proper timestamp for cached instant responses (#5723 ) * app/vmselect: set proper timestamp for cached instant responses The change updates `getSumInstantValues` to prefer timestamp from the most recent results. Before, timestamp from cached series was used. The old behavior had negative impact on recording rules as they were getting responses with shifted timestamps in past. Subsequent recording or alerting rules fetching results of these recording rules could get no result due to staleness interval. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5659 Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-30 20:03:34 +00:00
Yury Molodov	81b5db04f6	vmui: add the ability to expand all tracing entries (#5677 ) (#5726 )	2024-01-30 19:10:10 +00:00
Aliaksandr Valialkin	f768d5d797	docs/CHANGELOG.md: document the enhancement, which reduces initial memory usage when `vmagent` scrapes targets with large responses Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5567	2024-01-30 20:51:13 +02:00
Aliaksandr Valialkin	17f8ed8948	docs/CHANGELOG.md: refer to the related pull request for the bugfix for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1945	2024-01-30 20:21:44 +02:00
Aliaksandr Valialkin	ea2752ce62	docs/CHANGELOG.md: document the bugfix addressed by the commit `bc7cf4950b` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1945	2024-01-30 20:16:22 +02:00
Aliaksandr Valialkin	5d66ee88bd	lib/storage: do not check the limit for -search.maxUniqueTimeseries when performing /api/v1/labels and /api/v1/label/.../values requests This limit makes little sense for these APIs, since: - These APIs frequently result in scanning of all the time series on the given time range. For example, if extra_filters={datacenter="some_dc"} . - Users expect these APIs shouldn't hit the -search.maxUniqueTimeseries limit, which is intended for limiting resource usage at /api/v1/query and /api/v1/query_range requests. Also limit the concurrency for /api/v1/labels, /api/v1/label/.../values and /api/v1/series requests in order to limit the maximum memory usage and CPU usage for these APIs. This limit shouldn't affect typical use cases for these APIs: - Grafana dashboard load when dashboard labels should be loaded - Auto-suggestion list load when editing the query in Grafana or vmui Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5055	2024-01-29 16:45:12 +01:00
Roman Khavronenko	aaa526e8ff	lib/streamaggr: skip unfinished aggregation state on shutdown by default (#5689 ) Sending unfinished aggregate states tend to produce unexpected anomalies with lower values than expected. The old behavior can be restored by specifying `flush_on_shutdown: true` setting in streaming aggregation config Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-26 22:45:23 +01:00
Roman Khavronenko	df59ac7f0e	app/vmalert: fix data race during hot-config reload (#5698 ) * app/vmalert: fix data race during hot-config reload During hot-reload, the logic evokes the group update and rules evaluation interruption simultaneously. Falsely assuming that interruption happens before the update. However, it could happen that group will be updated first and only after the rules evaluation will be cancelled. Which will result in permanent interruption for all rules within the group. The fix caches the cancel context function into local variable first. And only after performs the group update. With cached cancel function we can safely call it without worrying that we cancel the evaluation for already updated group. Signed-off-by: hagen1778 <roman@victoriametrics.com> * Revert "app/vmalert: fix data race during hot-config reload" This reverts commit `a4bb7e8932`. * app/vmalert: fix data race during hot-config reload During hot-reload, the logic evokes the group update and rules evaluation interruption simultaneously. Falsely assuming that interruption happens before the update. However, it could happen that group will be updated first and only after the rules evaluation will be cancelled. Which will result in permanent interruption for all rules within the group. The fix cancels the evaluation context before applying the update, making sure that the context will be cancelled for old group always. Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-26 22:42:21 +01:00
Yury Molodov	a7b11eff7c	vmui: fix `Enter` key in query field (#5667 ) (#5681 )	2024-01-26 22:38:32 +01:00
Aliaksandr Valialkin	bb7a419cc3	lib/{mergeset,storage}: make background merge more responsive and scalable - Maintain a separate worker pool per each part type (in-memory, file, big and small). Previously a shared pool was used for merging all the part types. A single merge worker could merge parts with mixed types at once. For example, it could merge simultaneously an in-memory part plus a big file part. Such a merge could take hours for big file part. During the duration of this merge the in-memory part was pinned in memory and couldn't be persisted to disk under the configured -inmemoryDataFlushInterval . Another common issue, which could happen when parts with mixed types are merged, is uncontrolled growth of in-memory parts or small parts when all the merge workers were busy with merging big files. Such growth could lead to significant performance degradation for queries, since every query needs to check ever growing list of parts. This could also slow down the registration of new time series, since VictoriaMetrics searches for the internal series_id in the indexdb for every new time series. The third issue is graceful shutdown duration, which could be very long when a background merge is running on in-memory parts plus big file parts. This merge couldn't be interrupted, since it merges in-memory parts. A separate pool of merge workers per every part type elegantly resolves these issues: - In-memory parts are merged to file-based parts in a timely manner, since the maximum size of in-memory parts is limited. - Long-running merges for big parts do not block merges for in-memory parts and small parts. - Graceful shutdown duration is now limited by the time needed for flushing in-memory parts to files. Merging for file parts is instantly canceled on graceful shutdown now. - Deprecate -smallMergeConcurrency command-line flag, since the new background merge algorithm should automatically self-tune according to the number of available CPU cores.
- Deprecate -finalMergeDelay command-line flag, since it wasn't working correctly. It is better to run forced merge when needed - https://docs.victoriametrics.com/#forced-merge - Tune the number of shards for pending rows and items before the data goes to in-memory parts and becomes visible for search. This improves the maximum data ingestion rate and the maximum rate for registration of new time series. This should reduce the duration of data ingestion slowdown in VictoriaMetrics cluster on e.g. re-routing events, when some of vmstorage nodes become temporarily unavailable. - Prevent from possible "sync: WaitGroup misuse" panic on graceful shutdown. This is a follow-up for `fa566c68a6` . Thanks to @misutoth for the inspiration at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5212 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5190 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3790 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3641 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291	2024-01-26 22:27:47 +01:00
Roman Khavronenko	b11f4ef5ea	app/vmalert: autogenerate `ALERTS_FOR_STATE` time series for alerting rules with `for: 0` (#5680 ) * app/vmalert: autogenerate `ALERTS_FOR_STATE` time series for alerting rules with `for: 0` Previously, `ALERTS_FOR_STATE` was generated only for alerts with `for > 0`. This behavior differs from Prometheus behavior - it generates ALERTS_FOR_STATE time series for alerting rules with `for: 0` as well. Such time series can be useful for tracking the moment when alerting rule became active. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5648 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3056 Signed-off-by: hagen1778 <roman@victoriametrics.com> * app/vmalert: support ALERTS_FOR_STATE in `replay` mode Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-25 15:42:57 +01:00
Alexander Marshalov	806c07ddd5	vmsingle/vmselect returns http status 429 (TooManyRequests) instead of 503 (ServiceUnavailable) when max concurrent requests limit is reached. (#5682 )	2024-01-24 17:55:06 +01:00
Aliaksandr Valialkin	ef12598ad4	lib/promscrape/discovery/kubernetes: do not generate targets for already terminated pods and containers Already terminated pods and containers cannot be scraped and will never resurrect, so there is zero sense in creating scrape targets for them.	2024-01-24 14:57:53 +02:00
Aliaksandr Valialkin	4d961c70f7	app/{vmselect,vmstorage}: return compression of the data passed from vmstorage to vmselect This reverts `cd4f641d32` , since it turned out that disabling compression for vmstorage->vmselect data increases network bandwidth usage by more than 10x on typical production workloads, while it decreases CPU usage at vmstorage by up to 10% and improves query latency by up to 10%. The 10x increase in network usage is too high a price for 10% improvements in query latency and vmstorage CPU usage. This may result in network bandwidth bottlenecks, which can reduce the overall performance and stability of VictoriaMetrics cluster. That's why the vmstorage->vmselect data compression is enabled by default again. The vmstorage->vmselect compression can be disabled by passing -rpc.disableCompression command-line flag to vmstorage. The vmselect->vmselect compression in multi-level cluster setup can be disabled by passing -clusternative.disableCompression command-line flag.	2024-01-24 13:39:28 +02:00
Aliaksandr Valialkin	f888a019fe	lib/streamaggr: expand `%{ENV}` placeholders in stream aggregation configs	2024-01-24 12:31:27 +02:00
Aliaksandr Valialkin	fa566c68a6	lib/mergeset: really limit the number of in-memory parts to 15 It turned out that the registration of new time series slows down linearly with the number of indexdb parts, since VictoriaMetrics needs to check every indexdb part when it searches for TSID by newly ingested metric name. The number of in-memory parts grows when new time series are registered at high rate. The number of in-memory parts grows faster on systems with a large number of CPU cores, because the mergeset maintains per-CPU buffers with newly added entries for the indexdb, and every such entry is transformed eventually into a separate in-memory part. The solution has been suggested in https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5212 by @misutoth - to limit the number of in-memory parts with buffered channel. This solution is implemented in this commit. Additionally, this commit merges per-CPU parts into a single part before adding it to the list of in-memory parts. This reduces CPU load when searching for TSID by newly ingested metric name. The https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5212 recommends setting the limit on the number of in-memory parts to 100, but my internal testing shows that a much lower limit of 15 works with the same efficiency on a system with 16 CPU cores while reducing memory usage for `indexdb/dataBlocks` cache by up to 50%. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5190	2024-01-24 03:38:12 +02:00
Roman Khavronenko	89e3c70ccd	lib/promscrape: respect `0` value for `series_limit` param (#5663 ) * lib/promscrape: respect `0` value for `series_limit` param Respect `0` value for `series_limit` param in `scrape_config` even if global limit was set via `-promscrape.seriesLimitPerTarget`. Previously, `0` value will be ignored in favor of `-promscrape.seriesLimitPerTarget`. This behavior aligns with possibility to override `series_limit` value via relabeling with `__series_limit__` label. Signed-off-by: hagen1778 <roman@victoriametrics.com> * Update docs/CHANGELOG.md --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-23 13:09:14 +02:00
Aliaksandr Valialkin	114822d585	app/{vmstorage,vmselect}: disable vmstorage->vmselect RPC compression by default in order to improve query performance	2024-01-23 04:24:57 +02:00
Zakhar Bessarab	bf4742526d	lib/storage: print tenant ID in log when discarding or truncating labels (#5658 ) Previously, it was not possible to determine which tenant sends metrics with an excessive number of labels or label values. Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-23 04:24:56 +02:00
Yury Molodov	38231d5994	vmui: query report (#5497 ) * vmui: add query analyzer page * vmui: fix tabs for query analyzer * vmui: add help to export query * vmui: add time params to query analyzer * docs/vmui: add query analyzer * vmui: fix validation JSON form --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-23 04:23:26 +02:00
Yury Molodov	eb6def0695	vmui: add flag for default timezone setting (#5611 ) * vmui: add flag for default timezone setting #5375 * vmui: validate timezone before client return * Update app/vmselect/vmui.go --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-23 04:11:19 +02:00
Yury Molodov	633e6b48ad	vmui: fix cache autocomplete (#5591 ) * vmui: fix the logic of closing the popper #5470 * vmui: fix the logic of caching autocomplete results #5472 --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-23 04:06:14 +02:00
Dmytro Kozlov	38b2a5bc44	deployment/docker: add grafana datasource to the docker-compose files (#5363 ) https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3920 https://github.com/VictoriaMetrics/grafana-datasource/issues/113	2024-01-22 15:45:31 +01:00
Aliaksandr Valialkin	d3ee3e0ef5	Revert "lib/promscrape: do not store last scrape response when stale markers … (#5577)" This reverts commit `cfec258803`. Reason for revert: the original code already doesn't store the last scrape response when stale markers are disabled. The scrapeWork.areIdenticalSeries() function always returns true if stale markers are disabled. This prevents storing the last response at scrapeWork.processScrapedData(). It looks like the reverted commit could also bring back the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3660 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5577	2024-01-22 00:43:48 +02:00
Aliaksandr Valialkin	1c7f990fad	app/vmselect: handle negative time range start in a generic manner inside NewSearchQuery() This is a follow-up for `cf03e11d89` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5553 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5630	2024-01-21 23:45:31 +02:00
Hui Wang	4e3242b02d	lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice (#5557) * lib/promscrape/discovery/kubernetes: fix watcher start order for roles endpoints and endpointslice Previously the groupWatcher could be mistakenly stopped when requests for pod or services resources take too long. * remove misleading comment * docs/sd_configs.md: mention -promscrape.kubernetes.attachNodeMetadataAll flag in the description for attach_metadata section Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4640 * wip * lib/promscrape/kubernetes: prevent from stopping groupWatcher when there are in-flight apiWatcher.mustStart() calls groupWatcher is stopped if it has zero registered apiWatchers during 14 seconds. But such a groupWatcher can be still in use if apiWatcher for `role: endpoints` or `role: endpointslice` is being registered and the discovery of the associated `pod` and/or `service` objects takes longer than 14 seconds - see the beginning of groupWatcher.startWatchersForRole() function for details. Track the number of in-flight calls to apiWatcher.mustStart() and prevent from stopping the associated groupWatcher if the number of in-flight calls is non-zero. P.S. postponing the discovery of `pod` and/or `service` objects associated with `endpoints` or `endpointslice` roles isn't the best solution, since it slows down initial discovery of `endpoints` and `endpointslice` targets. * typo fix --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-21 23:13:15 +02:00
Aliaksandr Valialkin	1f105dde98	all: allow dynamically reading *AuthKey flag values from files and urls Examples: 1) -metricsAuthKey=file:///abs/path/to/file - reads flag value from the given absolute filepath 2) -metricsAuthKey=file://./relative/path/to/file - reads flag value from the given relative filepath 3) -metricsAuthKey=http://some-host/some/path?query_arg=abc - reads flag value from the given url The flag value is automatically updated when the file contents changes.	2024-01-21 22:03:38 +02:00
Nikolay	b3598ba2c1	app/vmauth: adds metric_labels and backend_errors counter (#5585) * app/vmauth: adds metric_labels and backend_errors counter it must improve observability for user requests with new metric - per user backend errors counter. it's needed to calculate requests fail rate to the configured backends. metric_labels configuration allows to perform additional aggregations on top of multiple users from configuration section. It could be multiple clients or clients with separate read/write tokens https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5565 * wip --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-21 04:40:52 +02:00
Aliaksandr Valialkin	7fba73ce11	lib/promscrape/discovery/kubernetes: add -promscrape.kubernetes.attachNodeMetadataAll command-line flag This flag allows setting attach_metadata.node=true for all the kubernetes_sd_configs defined at -promscrape.config Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4640 Thanks to wasim-nihal for the initial implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5593	2024-01-21 03:13:56 +02:00
Hui Wang	fad212c39c	app/vmselect/promql: properly handle possible negative results caused… (#5608) * app/vmselect/promql: properly handle possible negative results caused by float operations precision error in rollup functions like rate() or increase() * fix test	2024-01-21 02:53:29 +02:00
Nikolay	c9f39fd51f	app/vmselect/netstorage (#5649) * app/vmselect/netstorage correctly handle errGlobal set * wip Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5649 --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-21 02:47:29 +02:00
Nikolay	8ab0ce3ded	app/vmselect: abort streaming connections for vmselect (#5650) * app/vmselect: abort streaming connections for vmselect due to streaming nature of export APIs, curl and similar tools cannot detect errors that happened after http.Header with status 200 was written to it. This PR tracks if body write was already started and closes connection. It allows client to detect an unexpected chunk sequence and return error to the caller. Mostly it affects vmselect at cluster version https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5645 * wip Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5645 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5650 --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-21 02:12:51 +02:00
Aliaksandr Valialkin	74448a7e57	lib/promscrape/discovery/hetzner: follow-up after `03a97dc678` - docs/sd_configs.md: moved hetzner_sd_configs docs to the correct place according to alphabetical order of SD names, document missing __meta_hetzner_role label. - lib/promscrape/config.go: added missing MustStop() call for Hetzner SD, and moved the code to the correct place according to alphabetical order of SD names. - lib/promscrape/discovery/hetzner: properly handle pagination for hcloud API responses, populate missing __meta_hetzner_role label like Prometheus does. - Properly populate __meta_hetzner_public_ipv6_network label like Prometheus does. - Remove unused SDConfig.Token. - Remove "omitempty" annotation from SDConfig.Role field, since this field is mandatory. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5550 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3154	2024-01-20 17:01:53 +02:00

1953 commits