github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-12-11 14:53:49 +00:00

Author	SHA1	Message	Date
Yury Molodov	140aaafad0	vmui: add a time picker to the "Logs Explorer" page (#5808 ) * vmui: add a time picker to the "Logs Explorer" page #5673 * Update app/vmui/packages/vmui/src/pages/ExploreLogs/hooks/useFetchLogs.ts --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-23 01:38:54 +02:00
Yury Molodov	b2b9f6e900	vmui: fix display Popper.tsx (#5842 ) * vmui: fix display Popper.tsx * vmui/docs: fix display Popper.tsx --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-23 01:33:12 +02:00
Aliaksandr Valialkin	21170e558c	lib/promutils: hide the math.Round() logic inside ParseTimeMsec() function This should prevent from bugs similar to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5801 in the future This is a follow-up for `ce3ec3ff2e`	2024-02-23 01:21:42 +02:00
Nikolay	22762d7a69	app/vmselect: change export/csv timestamp format for rfc3339 to respect milliseconds (#5853 ) * app/vmselect: adds milliseconds to the csv export response for rfc3339 * milliseconds is a standard precision for VictoriaMetrics query request responses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5837 * app/victoria-metrics: adds tests for csv export/import follow-up after 3541a8d0cf96dd4f8563624c4aab6816615d0756 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-02-23 01:16:08 +02:00
Aliaksandr Valialkin	a982ab6bfb	app/vmstorage: expose vm_snapshots metric, which shows the current number of snapshots While at it, refresh docs about snapshots - https://docs.victoriametrics.com/#how-to-work-with-snapshots	2024-02-23 01:07:04 +02:00
Aliaksandr Valialkin	477fdc21aa	app/vmselect/promql: add `count_values_over_time()` MetricsQL function See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5847	2024-02-23 01:05:31 +02:00
Aliaksandr Valialkin	65fb54ab8f	app/vmselect/promql: move needSilenceIntervalForRollupFunc from eval.go to rollup.go This should improve maintainability of the code related to rollup functions, since it is located in rollup.go While at it, properly return empty results from holt_winters(), rate_over_sum(), sum2_over_time(), geomean_over_time() and distinct_over_time() when there are no real samples on the selected lookbehind window. Previously the previous sample value was mistakenly returned from these functions.	2024-02-23 01:05:11 +02:00
Alexander Marshalov	8322425364	[lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5814 ) * [lib/promutils, lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5801) * fixed tests * fixed test * Revert "fixed test" This reverts commit `8a29764806`. * Revert "fixed tests" This reverts commit `9ce13d1042`. * Revert "[lib/promutils, lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5801)" This reverts commit `a7a04bd4` * [lib/httputils] fixed floating-point error when parsing time in RFC3339 format (#5801) --------- Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-02-23 00:58:26 +02:00
Anton L	8b7ff0f66e	#5833 Fix Deadlock when using shardByURL of VMAgent (#5834 )	2024-02-22 11:54:53 +02:00
Dan Dascalescu	0c7eda7c88	app/vmselect: simplify wording for `too many samples` error (#5827 ) (cherry picked from commit `17cf031fa1`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-02-20 16:29:11 +01:00
Roman Khavronenko	2e172b9361	vmctl : Provide TLS config options for Open TSDB datasource #5797 (#5832 ) Originally implemented here https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5797 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: khushijain21 <khushij393@gmail.com> (cherry picked from commit `bb1279bfc4`)	2024-02-20 16:27:52 +01:00
hagen1778	4474c23aed	app/vmalert: consistently sort groups by name and filename on `/groups` page This should prevent non-deterministic sorting for groups with identical names. Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `e2dad3a2ac`)	2024-02-20 13:51:31 +01:00
hagen1778	6c63fd831d	app/vmalert: follow-up after `b60dcbe11f` * support case-insensitive search * reflect search condition in URL, so link can be sharable * support filtering on /alerts page * fix collapseAll/expandAll logic to respect only shown entries * add changelog `b60dcbe11f` Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `11b03d9fc8`)	2024-02-20 13:35:02 +01:00
Victor Amorim dos Santos	f79abd54b0	vmalert: add filter by group or rule name to UI (#5791 ) Co-authored-by: Yury Molodov <yurymolodov@gmail.com> (cherry picked from commit `b60dcbe11f`)	2024-02-20 13:35:02 +01:00
Yury Molodov	7d15c5abeb	vmui: update package-lock.json (#5822 ) This should address detected security vulnerabilities (cherry picked from commit `524c0a2e07`)	2024-02-20 13:35:02 +01:00
Aliaksandr Valialkin	b58c429044	app/vlselect: follow-up for `451d2abf50` - Consistently return the first `limit` log entries if the total size of found log entries doesn't exceed 1Mb. See app/vlselect/logsql/sort_writer.go . Previously random log entries could be returned with each request. - Document the change at docs/VictoriaLogs/CHANGELOG.md - Document the `limit` query arg at docs/VictoriaLogs/querying/README.md - Make the change less intrusive. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5674 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5778	2024-02-18 23:06:08 +02:00
Dmytro Kozlov	2d674f98d4	Enable the `limit` query param for the `/select/logsql/query` (#5778 ) * app/vlselect: add limit for logs query * app/vlselect: CHANGELOG.md * app/vlselect: stop search process if limit is reached, update logic, remove default limit * app/vlselect: fix tests * app/vlselect: fix filter tests * app/vlselect: fix tests	2024-02-18 22:59:16 +02:00
hagen1778	e53f53aaf5	app/vmctl: follow-up after `0c293a66ec` Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `f973711e56`)	2024-02-16 15:31:58 +01:00
Khushi Jain	9ce7f21a63	app/vmctl : support TLS config options for remote read mode (#5798 ) (cherry picked from commit `0c293a66ec`)	2024-02-16 15:31:58 +01:00
hagen1778	025e52adad	app/vmctl: follow-up after `7cd1b7d047` * cleanup code * update docs Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `6a07cb1bdb`)	2024-02-16 15:31:58 +01:00
Khushi Jain	02c8b5015c	app/vmctl : support TLS config options for InfluxDB datasource (#5783 ) * vmctl: TLS flags for influx DB * added httputils function * Add changelog and doc --------- Co-authored-by: Khushi Jain <khushi.jain@nokia.com> (cherry picked from commit `7cd1b7d047`)	2024-02-16 15:31:57 +01:00
Aliaksandr Valialkin	33b2553c78	app/vmstorage: expose vm_last_partition_parts metrics, which may help identifying performance issues related to the increased number of parts in the last partition	2024-02-15 14:52:53 +02:00
Aliaksandr Valialkin	e16fc81c74	app/vmselect: add missing handler at /select/.../prometheus/vmui/timezone This is a follow-up for `3a26e4d6ec`	2024-02-14 11:18:07 +02:00
Aliaksandr Valialkin	a74dad09ad	app/vmselect/vmui: run `make vmui-update` after `1c9f13d6c7`	2024-02-14 02:36:06 +02:00
Yury Molodov	b08a23c4a5	vmui: improve the context for autocomplete #5736 #5737 #5739 (#5804 ) Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-14 02:36:06 +02:00
Aliaksandr Valialkin	5f2905d120	app/vmselect: add sum_eq_over_time, sum_gt_over_time and sum_le_over_time functions to MetricsQL See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4641	2024-02-13 23:40:30 +02:00
Nikolay	0a2cc0e873	app/vmauth: properly release memory during config reload (#5805 ) * app/vmauth: properly release memory during config reload Previously the metrics package held a reference to the channels used for tracking users' concurrent requests. In case of churn in the `name` field of the users configuration, a new metric was created, but the previous one wasn't deleted. This prevented the fully parsed configuration from being garbage collected. Now all config-related metrics are bound to the corresponding metrics.Set and are unregistered during config reload. This should also fix an issue with incorrect values for current concurrent user requests https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4690 * wip --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-13 20:49:57 +02:00
Aliaksandr Valialkin	67091537ae	app/vmagent/remotewrite: add -remoteWrite.tlsHandshakeTimeout command-line flag for tuning tls handshake timeout to -remoteWrite.url Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1699	2024-02-13 02:46:24 +02:00
Aliaksandr Valialkin	6bc70a883d	app/vmauth: add support for mTLS-based routing of incoming requests to different backends depending on the subject field in the TLS certificate provided by the user Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1547	2024-02-13 01:04:19 +02:00
Aliaksandr Valialkin	f5680a6857	all: upgrade Go builder from Go1.21.7 to Go1.22.0 See https://go.dev/doc/go1.22	2024-02-12 22:14:00 +02:00
Roman Khavronenko	433c3726b2	app/vmalert: support filtering for /api/v1/rule like Prometheus does (#5787 ) Follow-up after `62e5e2a4c8` Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `8850c7431d`)	2024-02-09 14:36:15 +01:00
Victor Amorim dos Santos	56b1d8e9ed	app/vmalert: support `type` param for filtering /api/v1/rules response by rule type (#5749 ) Co-authored-by: Hui Wang <haley@victoriametrics.com> (cherry picked from commit `62e5e2a4c8`)	2024-02-09 14:36:14 +01:00
Aliaksandr Valialkin	ae12ac69ba	lib/snapshot: move Time, Validate and NewName into lib/snapshot/snapshotutil package This allows removing importing unneeded command-line flags into binaries, which import lib/storage, which, in turn, was importing lib/snapshot in order to use Time, Validate and NewName functions. This is a follow-up for `83e55456e2` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5738	2024-02-09 04:19:30 +02:00
Aliaksandr Valialkin	cf64597878	all: add support for specifying multiple -httpListenAddr options	2024-02-09 03:22:49 +02:00
Khushi Jain	a076cb4a93	app/vmbackup: support client-side TLS configuration for create/delete snapshot API (#5738 ) (cherry picked from commit `83e55456e2`)	2024-02-08 15:58:34 +01:00
Aliaksandr Valialkin	202d8e2c40	docs: update -help output after `61d9df4c36` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/834	2024-02-08 14:50:56 +02:00
Aliaksandr Valialkin	b18e608016	app/vmselect: add ability to reset rollup result cache on startup by passing -search.resetRollupResultCacheOnStartup command-line flag Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/834	2024-02-08 14:42:15 +02:00
Aliaksandr Valialkin	def5573f92	app/vmselect/promql: properly handle precision errors in rollup functions changes(), increases_over_time() and resets() shouldn't take into account value changes, which may occur because of precision errors. The maximum guaranteed precision for raw samples stored in VictoriaMetrics is 12 decimal digits. So do not count relative changes for values if they are smaller than 1e-12 compared to the value. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/767	2024-02-08 02:33:17 +02:00
Aliaksandr Valialkin	caf706fcc0	all: update Go builder from Go1.21.6 to Go1.21.7 See https://github.com/golang/go/issues?q=milestone%3AGo1.21.7+label%3ACherryPickApproved	2024-02-07 04:01:05 +02:00
Aliaksandr Valialkin	cc6f05b117	app/vminsert: fix the code after `c634859c4f`	2024-02-07 02:08:34 +02:00
Aliaksandr Valialkin	750ddeef54	app/{vmselect,vlselect}/vmui: run `make vmui-update vmui-logs-update` after the recent changes to app/vmui This is a follow-up for the following commits: - `dcbdbc760e` - `a81ccbd749` - `65b8002aeb`	2024-02-07 01:49:45 +02:00
Aliaksandr Valialkin	82f4e4e070	app/{vmagent,vminsert}: follow-up after `a1d1ccd6f2` - Document the change at docs/CHANGELOG.md - Copy changes from docs/Single-server-VictoriaMetrics.md to README.md - Add missing handler for processing multitenant requests ( https://docs.victoriametrics.com/vmagent/#multitenancy ) - Substitute github.com/stretchr/testify dependency with 3 lines of code in the added tests - Comment unclear code at lib/protoparser/datadogsketches/parser.go , so @AndrewChubatiuk could update it and add permalinks to the original source code there. - Various code cleanups Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5584 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3091	2024-02-07 01:31:52 +02:00
Andrii Chubatiuk	c634859c4f	support datadog /api/beta/sketches API (#5584 ) Co-authored-by: Andrew Chubatiuk <andrew.chubatiuk@motional.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-07 01:30:00 +02:00
Yury Molodov	5778acf9eb	vmui: improve select component functionality (#5755 ) * vmui: fix select closing on click outside (#5728) * vmui: clear entered text in select after selecting a value (#5727) --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-06 22:50:28 +02:00
Yury Molodov	0b5f5d456c	vmui: fix handling invalid timezone (#5758 ) * vmui: fix handling invalid timezone (#5732) * vmui: switch browser timezone flag to isValid --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-06 22:48:00 +02:00
Yury Molodov	0cf17068b8	vmui: fix graph dragging (#5769 ) Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-02-06 22:32:43 +02:00
Aliaksandr Valialkin	63a43331a3	docs/Cluster-VictoriaMetrics.md: document -disableReroutingOnUnavailable command-line flag This is a follow-up for `88f0d1572e` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5713	2024-02-05 15:17:09 +02:00
Muxa1L	88f0d1572e	Add flag to vminsert to disable rerouting when some of storage nodes are unavailable (#5713 ) * Flag to disable rerouting from unavailable storage nodes * Update netstorage.go * Fix fmt for netstorage.go	2024-02-05 12:46:57 +00:00
Aliaksandr Valialkin	f222cf9200	lib/cgroup: remove SetGOGC() function GOGC can be already set via environment variable. There is no need in adding new approaches for setting the GOGC (such as command-line flag), since they complicate operations.	2024-02-05 12:13:08 +02:00
Aliaksandr Valialkin	7a9f0b32a2	app/vmselect/netstorage: prevent from disk write IO when closing temporary files Remove temporary file before closing it in order to signal the OS that it shouldn't store the file contents from page cache to disk when the file is closed. Gracefully handle the case when the file cannot be removed before being closed - in this case remove the file after closing it. This allows working on Windows. Also remove superfluous opening of temporary file for reading - re-use already opened file handle for writing. This is a follow-up for `9b1e002287` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4020 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70	2024-02-01 19:54:48 +02:00
Aliaksandr Valialkin	c5f2a2b91f	app/vmselect: add missing whitespace into the description for -vmui.defaultTimezone command-line flag This is a follow-up for `eb6def0695` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5611	2024-02-01 14:49:48 +02:00
Aliaksandr Valialkin	4f5cb17042	app/vmselect/netstorage: properly handle the case when an empty brsPool points to the end of brs.brs This case is possible after a new brsPool is allocated. The fix is to verify whether len(brsPool) >= len(brs.brs) before trying to append a new item to brsPool and sharing its contents with brs.brs. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5733	2024-01-31 10:31:51 +02:00
Aliaksandr Valialkin	2033fe4caf	app/vmselect/promql: really keep metric names when keep_metric_names modifier is applied to binary operator Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5556	2024-01-31 02:33:06 +02:00
Roman Khavronenko	02e609b141	app/vmselect: set proper timestamp for cached instant responses (#5723 ) * app/vmselect: set proper timestamp for cached instant responses The change updates `getSumInstantValues` to prefer timestamp from the most recent results. Before, timestamp from cached series was used. The old behavior had negative impact on recording rules as they were getting responses with shifted timestamps in past. Subsequent recording or alerting rules fetching results of these recording rules could get no result due to staleness interval. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5659 Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-30 22:20:16 +02:00
Aliaksandr Valialkin	3164f7526c	app/vmselect/vmui: run `make vmui-update` after `81b5db04f6`	2024-01-30 21:13:14 +02:00
Yury Molodov	587c5fec2d	vmui: add the ability to expand all tracing entries (#5677 ) (#5726 )	2024-01-30 21:13:13 +02:00
Aliaksandr Valialkin	c5deb226d4	app/vmselect/vmui: run `make vmui-update` after 6e8995cfb92fb5a87fc6ad78609bf9ea5e0e712f	2024-01-30 18:45:52 +02:00
Yury Molodov	8958fb78ad	vmui: fix `Enter` key in query field (#5667 ) (#5717 ) (cherry picked from commit `7007c6a760`)	2024-01-30 14:45:47 +01:00
Aliaksandr Valialkin	3b18659487	app/vmagent/remotewrite: limit the concurrency for marshaling time series before sending them to remote storage There is no sense in running more than GOMAXPROCS concurrent marshalers, since they are CPU-bound. More concurrent marshalers do not increase the marshaling bandwidth, but they may result in more RAM usage.	2024-01-30 12:20:27 +02:00
Roman Khavronenko	a3e198588f	vmalert: set `ActiveAt` to evaluation timestamp in `newAlert` fn (#5657 ) The change fixes flaky test `TestAlertingRule_Exec` which has dependency on the actual timestamps, which resulted into inaccurate test states: https://github.com/VictoriaMetrics/VictoriaMetrics/actions/runs/7608452967/job/20717699688 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-29 17:30:14 +01:00
Aliaksandr Valialkin	f5559c038c	lib/storage: do not check the limit for -search.maxUniqueTimeseries when performing /api/v1/labels and /api/v1/label/.../values requests This limit makes little sense for these APIs, since: - These APIs frequently result in scanning of all the time series on the given time range. For example, if extra_filters={datacenter="some_dc"} . - Users expect these APIs shouldn't hit the -search.maxUniqueTimeseries limit, which is intended for limiting resource usage at /api/v1/query and /api/v1/query_range requests. Also limit the concurrency for /api/v1/labels, /api/v1/label/.../values and /api/v1/series requests in order to limit the maximum memory usage and CPU usage for these API. This limit shouldn't affect typical use cases for these APIs: - Grafana dashboard load when dashboard labels should be loaded - Auto-suggestion list load when editing the query in Grafana or vmui Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5055	2024-01-29 16:44:46 +01:00
Aliaksandr Valialkin	2fba741af2	app/vmui: run `make vmui-update` after `a7b11eff7c`	2024-01-26 22:53:52 +01:00
Roman Khavronenko	562edb72ea	app/vmalert: fix data race during hot-config reload (#5698 ) * app/vmalert: fix data race during hot-config reload During hot-reload, the logic evokes the group update and rules evaluation interruption simultaneously. Falsely assuming that interruption happens before the update. However, it could happen that group will be updated first and only after the rules evaluation will be cancelled. Which will result in permanent interruption for all rules within the group. The fix caches the cancel context function into local variable first. And only after performs the group update. With cached cancel function we can safely call it without worrying that we cancel the evaluation for already updated group. Signed-off-by: hagen1778 <roman@victoriametrics.com> * Revert "app/vmalert: fix data race during hot-config reload" This reverts commit `a4bb7e8932`. * app/vmalert: fix data race during hot-config reload During hot-reload, the logic evokes the group update and rules evaluation interruption simultaneously. Falsely assuming that interruption happens before the update. However, it could happen that group will be updated first and only after the rules evaluation will be cancelled. Which will result in permanent interruption for all rules within the group. The fix cancels the evaluation context before applying the update, making sure that the context will be cancelled for old group always. Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-26 22:43:02 +01:00
Yury Molodov	551f48466c	vmui: fix `Enter` key in query field (#5667 ) (#5681 )	2024-01-26 22:38:51 +01:00
Aliaksandr Valialkin	7a8b92b590	lib/{mergeset,storage}: make background merge more responsive and scalable - Maintain a separate worker pool per each part type (in-memory, file, big and small). Previously a shared pool was used for merging all the part types. A single merge worker could merge parts with mixed types at once. For example, it could merge simultaneously an in-memory part plus a big file part. Such a merge could take hours for big file part. During the duration of this merge the in-memory part was pinned in memory and couldn't be persisted to disk under the configured -inmemoryDataFlushInterval . Another common issue, which could happen when parts with mixed types are merged, is uncontrolled growth of in-memory parts or small parts when all the merge workers were busy with merging big files. Such growth could lead to significant performance degradation for queries, since every query needs to check ever growing list of parts. This could also slow down the registration of new time series, since VictoriaMetrics searches for the internal series_id in the indexdb for every new time series. The third issue is graceful shutdown duration, which could be very long when a background merge is running on in-memory parts plus big file parts. This merge couldn't be interrupted, since it merges in-memory parts. A separate pool of merge workers per every part type elegantly resolves these issues: - In-memory parts are merged to file-based parts in a timely manner, since the maximum size of in-memory parts is limited. - Long-running merges for big parts do not block merges for in-memory parts and small parts. - Graceful shutdown duration is now limited by the time needed for flushing in-memory parts to files. Merging for file parts is instantly canceled on graceful shutdown now. - Deprecate -smallMergeConcurrency command-line flag, since the new background merge algorithm should automatically self-tune according to the number of available CPU cores. - Deprecate -finalMergeDelay command-line flag, since it wasn't working correctly. It is better to run forced merge when needed - https://docs.victoriametrics.com/#forced-merge - Tune the number of shards for pending rows and items before the data goes to in-memory parts and becomes visible for search. This improves the maximum data ingestion rate and the maximum rate for registration of new time series. This should reduce the duration of data ingestion slowdown in VictoriaMetrics cluster on e.g. re-routing events, when some of vmstorage nodes become temporarily unavailable. - Prevent from possible "sync: WaitGroup misuse" panic on graceful shutdown. This is a follow-up for `fa566c68a6` . Thanks to @misutoth for the inspiration at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5212 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5190 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3790 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3425 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3647 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3641 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291	2024-01-26 22:19:52 +01:00
Aliaksandr Valialkin	d8c82b6421	app/vmselect/netstorage: initialize tmpBlocksFileWrapper at goroutine, which continues using it This may improve CPU cache locality	2024-01-26 21:29:30 +01:00
Aliaksandr Valialkin	7c7bfa27ac	app/vmauth: return 503 service unavailable status code when the backend returns response with unsupported status code, but the request cannot be re-tried. While at it, properly close response body. This should prevent from possible http keep-alive connection leak to backends because of unclosed response bodies. This is a follow-up for `3c0aa14b5b` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5688	2024-01-26 21:10:57 +01:00
Roman Khavronenko	a2f83115ae	app/vmalert: autogenerate `ALERTS_FOR_STATE` time series for alerting rules with `for: 0` (#5680 ) * app/vmalert: autogenerate `ALERTS_FOR_STATE` time series for alerting rules with `for: 0` Previously, `ALERTS_FOR_STATE` was generated only for alerts with `for > 0`. This behavior differs from Prometheus behavior - it generates ALERTS_FOR_STATE time series for alerting rules with `for: 0` as well. Such time series can be useful for tracking the moment when alerting rule became active. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5648 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3056 Signed-off-by: hagen1778 <roman@victoriametrics.com> * app/vmalert: support ALERTS_FOR_STATE in `replay` mode Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-26 20:51:50 +01:00
Alexander Marshalov	ef4bb36d99	vmauth: fix `vmauth_user_request_backend_errors_total` metric calc logic for use case when only one backend is available - if we get an error from the retry_status_codes list, but cannot execute retry, we increment vmauth_user_request_backend_errors_total as well (#5688 )	2024-01-26 20:49:18 +01:00
Aliaksandr Valialkin	0715f1efcd	lib/storage: rename AssistedMerges to AssistedMergesCount in order to make these field names less misleading These fields are counters, not gauges, so adding Count suffix to them makes easier to understand this while reading the code	2024-01-25 10:21:13 +02:00
Alexander Marshalov	14712e3b99	vmsingle/vmselect returns http status 429 (TooManyRequests) instead of 503 (ServiceUnavailable) when max concurrent requests limit is reached. (#5682 )	2024-01-25 10:21:09 +02:00
Aliaksandr Valialkin	1cdef56d84	lib/mergeset: start assisted merge for file parts only if the number of file parts is bigger than maxFileParts The maxFileParts usage has been accidentally removed in `fa566c68a6` While at it, add Count suffix to *AssistedMerges counter names in order to make them less misleading. Previously their names were falsely suggesting that these are gauges, which show the number of concurrently executed assisted merges.	2024-01-24 15:10:48 +02:00
Aliaksandr Valialkin	0dca3c4025	app/{vmselect,vmstorage}: return compression of the data passed from vmstorage to vmselect This reverts `cd4f641d32` , since it turned out that disabling compression for vmstorage->vmselect data increases network bandwidth usage by more than 10x on typical production workloads, while it decreases CPU usage at vmstorage by up to 10% and improves query latency by up to 10%. The 10x increase in network usage is too high a price for 10% improvements in query latency and vmstorage CPU usage. This may result in network bandwidth bottlenecks, which can reduce the overall performance and stability of VictoriaMetrics cluster. That's why the vmstorage->vmselect data compression is enabled by default again. The vmstorage->vmselect compression can be disabled by passing -rpc.disableCompression command-line flag to vmstorage. The vmselect->vmselect compression in multi-level cluster setup can be disabled by passing -clusternative.disableCompression command-line flag.	2024-01-24 13:37:05 +02:00
Aliaksandr Valialkin	be320c81bc	app/vminsert/clusternative: explain why lower-level vminsert doesn't compress responses to upper-level vminsert	2024-01-23 18:14:19 +02:00
Aliaksandr Valialkin	cfc1193d15	app/vmselect/netstorage: limit the maximum brsPool size to 32Kb at ProcessSearchQuery() This avoids slow path in Go runtime for allocating objects bigger than 32Kb - see `704401ffa0/src/runtime/malloc.go (L11)` This also reduces memory usage a bit for vmselect and single-node VictoriaMetrics after the commit `5dd37ad836` . Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 14:12:27 +02:00
Aliaksandr Valialkin	fe4ea30a79	app/vmselect/netstorage: limit the size of metricNamesBuf to 32Kb in order to avoid slow path at Go runtime for allocating a byte slice of bigger size See `704401ffa0/src/runtime/malloc.go (L11)` This also reduces the average memory usage a bit for vmselect and single-node VictoriaMetrics after the commit `508c608062` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 13:50:59 +02:00
Aliaksandr Valialkin	1880c656c1	app/vmselect/vmui: run `make vmui-update` in order to sync recent changes in app/vmui	2024-01-23 04:31:57 +02:00
Yury Molodov	1db2b991b7	vmui: query report (#5497 ) * vmui: add query analyzer page * vmui: fix tabs for query analyzer * vmui: add help to export query * vmui: add time params to query analyzer * docs/vmui: add query analyzer * vmui: fix validation JSON form --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-23 04:26:04 +02:00
Yury Molodov	3a26e4d6ec	vmui: add flag for default timezone setting (#5611 ) * vmui: add flag for default timezone setting #5375 * vmui: validate timezone before client return * Update app/vmselect/vmui.go --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-23 04:15:14 +02:00
Yury Molodov	574d69775e	vmui: fix cache autocomplete (#5591 ) * vmui: fix the logic of closing the popper #5470 * vmui: fix the logic of caching autocomplete results #5472 --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-23 04:06:39 +02:00
Aliaksandr Valialkin	e20cbfcbc3	app/vmselect/promql: remove superfluous memory allocations at aggrPrepareSeries() While at it, also remove unneeded map lookup	2024-01-23 02:29:14 +02:00
Aliaksandr Valialkin	cd4f641d32	app/{vmstorage,vmselect}: disable vmstorage->vmselect RPC compression by default in order to improve query performance	2024-01-23 02:29:13 +02:00
Aliaksandr Valialkin	b7cc1af3eb	app/vmselect/promql/aggr_incremental.go: eliminate unnecessary memory allocation in incrementalAggrFuncContext.updateTimeseries	2024-01-23 02:29:13 +02:00
Aliaksandr Valialkin	953b96ced2	app/vmselect/netstorage: remove tswPool, since it isn't efficient	2024-01-23 02:29:13 +02:00
Aliaksandr Valialkin	68a59bfabd	app/vmselect/netstorage: avoid metricName->blockRef lookup when processing multiple blocks for the same time series This saves a few CPU cycles for common case	2024-01-23 02:29:12 +02:00
Aliaksandr Valialkin	f8a9ef8cbd	app/vmselect/netstorage: group per-vmstorage fields at tmpBlocksFileWrapperShard This improves code readability a bit	2024-01-23 02:29:12 +02:00
Aliaksandr Valialkin	d52b121222	app/vmselect/netstorage: use []blockRef from blockRefPool in order to reduce memory allocations	2024-01-23 02:29:11 +02:00
Aliaksandr Valialkin	5b05224eb9	app/vmselect/netstorage: substitute pointer to blockRefs by brssPool index at the metricName->blockRefs map This should reduce the pressure on Go GC, since it will see lower number of pointers. This change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 02:29:11 +02:00
Aliaksandr Valialkin	b289f15f02	app/vmselect/netstorage: reduce the number of allocations for blockRefs objects in ProcessSearchQuery() This should reduce pressure on Go GC at vmselect The change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 02:29:11 +02:00
Aliaksandr Valialkin	2ab9a75cca	app/vmselect/netstorage: reduce the number of memory allocations in ProcessSearchQuery() by storing all the metric names in a single byte slice This reduces the number of memory allocations at the cost of possible memory usage increase, since now different metric name strings may hold references to the previous byte slice. This is good tradeoff, since ProcessSearchQuery is called in vmselect, and vmselect isn't usually limited by memory. This change has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5527	2024-01-23 02:29:10 +02:00
hagen1778	8138499439	app/vmctl/backoff: fix flaky test The change removes artificial delay before returning error, which sometimes caused less retry events than expected. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-22 18:42:05 +02:00
hagen1778	ede466be56	docs: fix Grafana link example for vmalert Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-22 18:41:38 +02:00
Aliaksandr Valialkin	d52fd73f18	all: add up to 10% random jitter to the interval between periodic tasks performed by various components This should smooth CPU and RAM usage spikes related to these periodic tasks, by reducing the probability that multiple concurrent periodic tasks are performed at the same time.	2024-01-22 18:39:16 +02:00
Aliaksandr Valialkin	3230525c36	docs: use persistent links to Grafana dashboards These links do not depend on the dashboard name, so they do not break after the renaming of the dashboard. This is a follow-up for `ff33e60a3d`	2024-01-22 01:45:42 +02:00
Aliaksandr Valialkin	d4a1a28543	app/vmselect: handle negative time range start in a generic manner inside NewSearchQuery() This is a follow-up for `cf03e11d89` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5553 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5630	2024-01-22 01:39:27 +02:00
Aliaksandr Valialkin	885ee160c2	all: allow dynamically reading *AuthKey flag values from files and urls Examples: 1) -metricsAuthKey=file:///abs/path/to/file - reads flag value from the given absolute filepath 2) -metricsAuthKey=file://./relative/path/to/file - reads flag value from the given relative filepath 3) -metricsAuthKey=http://some-host/some/path?query_arg=abc - reads flag value from the given url The flag value is automatically updated when the file contents changes.	2024-01-22 01:23:23 +02:00
Aliaksandr Valialkin	5f5fcab217	all: call atomic.Load* in front of atomic.CompareAndSwap* at places where the atomic.CompareAndSwap* returns false most of the time This allows avoiding slow inter-CPU synchronization induced by atomic.CompareAndSwap*	2024-01-22 01:13:41 +02:00
Nikolay	73c51072e6	app/vmauth: adds metric_labels and backend_errors counter (#5585 ) * app/vmauth: adds metric_labels and backend_errors counter it must improve observability for user requests with new metric - per user backend errors counter. it's needed to calculate requests fail rate to the configured backends. metric_labels configuration allows to perform additional aggregations on top of multiple users from configuration section. It could be multiple clients or clients with separate read/write tokens https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5565 * wip --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-22 01:09:51 +02:00
Yury Molodov	0582ec5c8c	vmui: add autofocus to input for desktop version #5479 (#5592 )	2024-01-22 01:09:27 +02:00
Hui Wang	e086ef16da	app/vmselect/promql: properly handle possible negative results caused… (#5608 ) * app/vmselect/promql: properly handle possible negative results caused by float operations precision error in rollup functions like rate() or increase() * fix test	2024-01-22 01:04:50 +02:00
Roman Khavronenko	fe4934f0ec	app/vmui: send `step` param for instant queries (#5639 ) The change reverts https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3896 due to reasons explained in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3896#issuecomment-1896704401 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-22 00:25:31 +02:00
Roman Khavronenko	148e14b3f2	app/vmselect: properly calculate `start` param for queries with too big look-behind window (#5630 ) Properly determine time range search for instant queries with too big look-behind window like `foo[100y]`. Previously, such queries could return empty responses even if `foo` is present in database. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5553 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-21 23:47:09 +02:00
Aliaksandr Valialkin	2b67944eb4	app/vmselect/graphite: properly handle -N index for the array of N items This is a follow-up for `70cd09e736` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5581	2024-01-17 00:16:37 +02:00
Aliaksandr Valialkin	b3823bc091	app/vmctl: disallow insecure https connections to vm-native-dst-addr and vm-native-src-addr by default It is better from security PoV to disallow insecure https connections to vm-native-dst-addr and vm-native-src-addr . This also maintains backwards compatibility with vmctl before the commit `828aca82e9` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5595 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5606	2024-01-17 00:15:50 +02:00
Yury Molodov	9588b9bd19	vmui/vmanomaly: add support models that produce only `anomaly_score` (#5594 ) * vmui/vmanomaly: add support models that produce only `anomaly_score` * vmui/vmanomaly: fix display legend * vmui/vmanomaly: update comment on anomaly threshold	2024-01-17 00:12:43 +02:00
Aliaksandr Valialkin	b7fcdb1985	deployment/docker: update Go builder from Go1.21.5 to Go1.21.6	2024-01-17 00:05:24 +02:00
Aliaksandr Valialkin	84f11a9e6d	app/vmselect/promql: simplify the code after `388d020b7c` Add a test, which verifies the correct sorting of float64 slices with NaNs. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5506 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5509	2024-01-16 22:35:51 +02:00
Aliaksandr Valialkin	6ba2fd3312	app/vmselect/promql: follow-up for `ce4f26db02` - Document the bugfix at docs/CHANGELOG.md - Filter out NaN values before sorting as suggested at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5509#discussion_r1447369218 - Revert unrelated changes in lib/filestream and lib/fs - Use simpler test at app/vmselect/promql/exec_test.go Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5509 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5506	2024-01-16 22:13:13 +02:00
Zongyang	cb37df5723	FIX bottomk doesn't return any data when there are no time range overlap between timeseries (#5509 ) * FIX sort order in bottomk * Add lessWithNaNsReversed for bottomk * Add ut for TopK * Move lt from loop * FIX lint * FIX lint * FIX lint * Mod log format --------- Co-authored-by: xiaozongyang <xiaozngyang@kanyun.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2024-01-16 22:12:49 +02:00
Aliaksandr Valialkin	015c0a4d1a	app/vmselect/promql: consistently sort results of `a or b` query Previously the order of results returned from `a or b` query could change with each request because the sorting for such query has been disabled in order to satisfy https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4763 . This commit executes `a or b` query as `sortByMetricName(a) or sortByMetricName(b)`. This makes the order of returned time series consistent across requests, while maintaining the requirement from https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4763 , e.g. `b` results are consistently put after `a` results. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5393	2024-01-16 22:12:15 +02:00
Aliaksandr Valialkin	825cfdb5ef	app/vmstorage: expose proper types for storage metrics when -metrics.exposeMetadata command-line flag is set This is a follow-up for `326a77c697`	2024-01-16 22:03:31 +02:00
Aliaksandr Valialkin	9b449dfadf	app/vmstorage: deregister storage metrics before stopping the storage This prevents from possible nil pointer dereference issues when the storage metrics are read after the storage is stopped. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548	2024-01-16 21:26:03 +02:00
Aliaksandr Valialkin	9e5e514faf	lib/pushmetrics: wait until the background goroutines, which push metrics, are stopped at pushmetrics.Stop() Previously there was a race condition when the background goroutine still could try collecting metrics from already stopped resources after returning from pushmetrics.Stop(). Now the pushmetrics.Stop() waits until the background goroutine is stopped before returning. This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5549 and the commit `fe2d9f6646` . Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548	2024-01-16 21:18:22 +02:00
rbizos	62db64e71b	Handling negative index in Graphite groupByNode/aliasByNode (#5581 ) Handling the error case with -1 Signed-off-by: Raphael Bizos <r.bizos@criteo.com> Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-01-16 20:55:27 +02:00
Aliaksandr Valialkin	d566aa7d78	lib/prompbmarshal: switch to github.com/VictoriaMetrics/easyproto	2024-01-16 20:48:30 +02:00
Aliaksandr Valialkin	063ea8c773	app/vmalert/remotewrite: properly calculate vmalert_remotewrite_dropped_rows_total It was calculating the number of dropped time series instead of the number of dropped samples. While at it, drop vmalert_remotewrite_dropped_bytes_total metric, since it was inconsistently calculated - at one place it was calculating raw protobuf-encoded sample sizes, while at another place it was calculating the size of snappy-compressed prompbmarshal.WriteRequest protobuf message. Additionally, this metric has zero practical sense, so just drop it in order to reduce the level of confusion.	2024-01-16 20:47:13 +02:00
Aliaksandr Valialkin	f7b589e38a	lib/prompb: switch to github.com/VictoriaMetrics/easyproto	2024-01-16 20:43:09 +02:00
Aliaksandr Valialkin	7d40506744	lib/prompb: change type of Label.Name and Label.Value from []byte to string This makes it more consistent with lib/prompbmarshal.Label	2024-01-16 20:41:37 +02:00
Dmytro Kozlov	b95d6f5f5e	app/vmctl: add insecure skip verify flags for source and destination addresses for native protocol (#5606 ) https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5595	2024-01-16 20:29:41 +02:00
hagen1778	2a7207f38a	app/all: follow-up after `84d710beab` https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5548 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-09 13:17:09 +01:00
zhdd99	84d710beab	lib/pushmetrics: fix a panic caused by pushing metrics during the graceful shutdown process of vmstorage nodes. (#5549 ) Co-authored-by: zhangdongdong <zhangdongdong@kuaishou.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2024-01-09 13:01:03 +01:00
Dmytro Kozlov	37a76de800	app/vmui: fix broken link for the statistic inaccuracy explanation (#5568 ) (cherry picked from commit `105c6b2eb7`)	2024-01-08 20:15:04 +01:00
Hui Wang	c14e229b20	vmalert: automatically add `exported_` prefix for original evaluation… (#5398 ) automatically add `exported_` prefix for original evaluation result label if it's conflicted with external or reserved one, previously it was overridden. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5161 Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `1f477aba41`)	2023-12-22 16:10:33 +01:00
Aliaksandr Valialkin	12de0d39eb	lib/protoparser/datadogv2: take into account source_type_name field, since it contains useful value such as kubernetes, docker, system, etc.	2023-12-21 23:05:52 +02:00
Aliaksandr Valialkin	62a105d9e9	app/{vminsert,vmagent}: preliminary support for /api/v2/series ingestion from new versions of DataDog Agent This commit adds only JSON support - https://docs.datadoghq.com/api/latest/metrics/#submit-metrics , while recent versions of DataDog Agent send data to /api/v2/series in undocumented Protobuf format. The support for this format will be added later. Thanks to @AndrewChubatiuk for the initial implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4451	2023-12-21 20:50:27 +02:00
Roman Khavronenko	4837616df6	app/vmselect: drop `rollupDefault` function as duplicate (#5502 ) * app/vmselect: drop `rollupDefault` function as duplicate It is unclear why there are two identical fns `rollupDefault` and `rollupDistinct`. Dropping one of them. Signed-off-by: hagen1778 <roman@victoriametrics.com> * Update app/vmselect/promql/rollup.go * Update app/vmselect/promql/rollup.go --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-12-21 11:23:20 +02:00
Aliaksandr Valialkin	3a9cf13aaa	app/{vmagent,vmalert}: add the ability to set OAuth2 endpoint params via the corresponding *.oauth2.endpointParams command-line flags This is a follow-up for `5ebd5a0d7b` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5427	2023-12-20 21:38:16 +02:00
Morgan	64e96fccd9	Expose OAuth2 Endpoint Parameters to cli (#5427 ) The user may wish to control the endpoint parameters, for instance to set the audience when requesting an access token. Exposing the parameters as a map allows for additional use cases without requiring modification.	2023-12-20 21:38:13 +02:00
Aliaksandr Valialkin	c888d76c4b	app/vmselect/netstorage: make sure that at least a single result is collected from every storage group before deciding whether it is OK to skip results from the remaining storage nodes	2023-12-20 19:53:49 +02:00
Aliaksandr Valialkin	261c173f4b	all: use Gauge instead of Counter for `*_config_last_reload_successful` metrics This allows exposing the correct TYPE metadata for these labels when the app runs with -metrics.exposeMetadata command-line flag. See https://github.com/VictoriaMetrics/metrics/pull/61#issuecomment-1860085508 for more details. This is follow-up for `326a77c697`	2023-12-20 14:25:44 +02:00
Yury Molodov	d0b047a2bf	vmui: add vmanomaly explorer (#5401 )	2023-12-20 14:15:25 +02:00
Roman Khavronenko	bcb2b8247c	vmctl: rename `vm-native-disable-retries` to `vm-native-disable-per-metric-migration` (#5476 ) The change is supposed to better reflect the meaning of this flag. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-12-17 19:19:22 +02:00
Roman Khavronenko	5f14fa94dd	vmctl: retry requests that failed in the very end for `vm-native` (#5475 ) Before, retries happened only on writes into a network connection between source and destination. But errors returned by server after all the data was transmitted were logged, but not retried. Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `664fa5cb78`)	2023-12-15 11:54:08 +01:00
Hui Wang	ed4f77575f	vmalert: validate schema for `-external.url` (#5450 ) Requests with wrong or no schema in `-external.url` could be rejected by alertmanager. So we validate schema on start up. (cherry picked from commit `9253c24dd6`)	2023-12-15 11:54:07 +01:00
Dima Lazerka	44c113f829	VMUI: Handle unknown query error response type (#5451 ) * VMUI: Handle unknown query error response type * vmui: add error text for unknown error type * Simplify nested `if`s for unknown error Accepting @Loori-R's suggestion Co-authored-by: Yury Molodov <yurymolodov@gmail.com> --------- Co-authored-by: Yury Moladau <yurymolodov@gmail.com> (cherry picked from commit `cd277e3f84`)	2023-12-15 11:54:07 +01:00
Aliaksandr Valialkin	42629dead1	app/vmstorage: add missing -inmemoryDataFlushInterval command-line flag Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3337	2023-12-14 20:47:49 +02:00
Anton Tykhyy	51af1dfff7	Fix sum(aggr_over_time) 'got 1 args' error (#3028 ) (#5414 ) app/vmselect/promql/eval.go:evalAggrFunc shunts evaluation of AggrFuncExpr over rollupFunc over MetricsExpr to an optimized path. tryGetArgRollupFuncWithMetricExpr() checks whether expression can be shunted, but it mangles the AggrFuncExpr when the aggregation function has more than one argument. This results in queries like `sum(aggr_over_time("avg_over_time",m))` failing with error message 'expecting at least 2 args to "aggr_over_time"; got 1 args' while the analogous query `sum(avg_over_time(m))` executes successfully. This fix removes the unnecessary mangling. Signed-off-by: Anton Tykhyy <atykhyy@gmail.com>	2023-12-14 12:49:01 +02:00
Aliaksandr Valialkin	4ee42e9e73	app/vmauth: allow specifying an empty retry_status_codes and zero drop_src_path_prefix_parts in order to override user-level setting Previously `retry_status_codes: []` and `drop_src_path_prefix_parts: 0` at `url_map` were equivalent to missing values. This was resulting in using the user-level values instead.	2023-12-14 01:06:50 +02:00
Aliaksandr Valialkin	51acf0179c	app/vmauth: add ability to route requests to different backends depending on the request host	2023-12-14 00:47:00 +02:00
Yury Molodov	e76c44c5b4	vmui: autocomplete usability improvements (#5422 ) * vmui: add show quick tip for autocomplete * vmui: auto-completion usability improvements #5348 * vmui: add const for min symbols in autocomplete * Use proper queries to VictoriaMetrics * vmui: fix comments for autocomplete * app/vmselect: run `make vmui-update` --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-12-13 00:33:27 +02:00
Aliaksandr Valialkin	e4bb2808f1	app/vmselect: add support for vmstorage groups with independent -replicationFactor per group Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5197 See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#vmstorage-groups-at-vmselect Thanks to @zekker6 for the initial pull request at https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/718	2023-12-13 00:14:34 +02:00
hagen1778	3b841fe9ce	app/vmctl: follow-up after `6af732b6f7` Make docs more clear about new feature. Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `242472086b`)	2023-12-12 13:45:35 +01:00
Dmytro Kozlov	63fc200f16	app/vmctl: enable range steps in reverse order (#5444 ) See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5376 (cherry picked from commit `6af732b6f7`)	2023-12-12 13:45:35 +01:00
hagen1778	307bcb8d4d	app/vmctl: follow-up after `27668c9d01` * remove duplications in error messages * mention the change in CHANGELOG.md `27668c9d01` Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `39c405ed4d`)	2023-12-11 15:38:22 +01:00
wozz	61f940591f	vmctl: check error in response from influxdb (#5446 ) (cherry picked from commit `27668c9d01`)	2023-12-11 15:38:20 +01:00
Aliaksandr Valialkin	842aba3f46	deployment/docker: update base Docker image from alpine:3.18.5 to alpine:3.19.0 See https://www.alpinelinux.org/posts/Alpine-3.19.0-released.html	2023-12-10 02:28:31 +02:00
Aliaksandr Valialkin	3d6517b05e	app/vmselect: add -search.maxResponseSeries command-line flag for limiting the number of time series a single response can return This limit can be used for preventing from high memory usage at Grafana when the response returns too many series. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5372	2023-12-10 00:54:32 +02:00
Aliaksandr Valialkin	55eb48f5ee	app: make more clear that -tls enables https at -httpListenAddr	2023-12-10 00:25:23 +02:00
Aliaksandr Valialkin	49552eaa15	app/vmauth: add support for `hot standby` mode via `first_available` load balancing policy vmauth in `hot standby` mode sends requests to the first url_prefix while it is available. If the first url_prefix becomes unavailable, then vmauth falls back to the next url_prefix. This allows building highly available setup as described at https://docs.victoriametrics.com/vmauth.html#high-availability Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4893 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4792	2023-12-08 23:32:10 +02:00
Roman Khavronenko	276e9301f4	app/vmalert: sanitize label names before sending to Alertmanager (#5442 ) Before, vmalert would send notifications with labels containing characters not supported by Alertmanager validator, resulting in validation errors like `msg="Failed to validate alerts" err="invalid label set: invalid name "foo.bar"` Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-12-08 18:09:07 +02:00
Alexander Marshalov	e9cf39f519	added field `version` to the response for `/api/v1/status/buildinfo` API for using more efficient API in Grafana for receiving label values, added additional info about setup Grafana datasource (#5370 ) (#5437 )	2023-12-07 16:41:56 +02:00
Aliaksandr Valialkin	32aea90847	app/vmselect/prometheus: go fmt after `b39e9257eb`	2023-12-07 16:05:01 +02:00
Aliaksandr Valialkin	9f79342e6a	app/vmselect/prometheus: properly encode Prometheus label values at /federate endpoint Prometheus spec says that only \, \n and " must be escaped inside label values. See `995743836e/content/docs/instrumenting/exposition_formats.md (L90)` See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5431	2023-12-07 15:36:50 +02:00
Aliaksandr Valialkin	12e94f10cc	deployment/docker: update Go builder from Go1.21.4 to Go1.21.5 See https://github.com/golang/go/issues?q=milestone%3AGo1.21.5+label%3ACherryPickApproved	2023-12-06 22:33:27 +02:00
Dmytro Kozlov	6a41e1ec0c	app/vmalert: replace error metrics for gauges with counter metrics (#5217 ) See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5160 Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `935bec447b`)	2023-12-06 19:41:34 +01:00
Aliaksandr Valialkin	509339bf63	app/vmselect: properly adjust the lower bound for the time range where raw samples must be selected for default_rollup() function Previously the lower bound could be too small, which could result in missing values at the beginning of the graph for default_rollup() function. This function is automatically applied to all the series selectors if they aren't explicitly wrapped into a rollup function - see https://docs.victoriametrics.com/MetricsQL.html#implicit-query-conversions While at it, properly take into account `-search.minStalenessInterval` command-line flag when adjusting the lower bound for the selected time range. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5388	2023-12-06 14:46:18 +02:00
Aliaksandr Valialkin	559e4db512	Revert "add datadog /api/v2/series and /api/beta/sketches support (#5094 )" This reverts commit `d6b4c8e4ef`. Reason for revert: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5094#issuecomment-1839789080	2023-12-05 02:30:40 +02:00
Aliaksandr Valialkin	bf187b2dc9	app/vmagent: add `-enableMultitenantHandlers` command-line flag This flag allows converting tenant id to (vm_account_id, vm_project_id) labels. this flag deprecates `-remoteWrite.multitenantURL` command-line flag, because `-enableMultitenantHandlers` is easier to use and combine with multitenant url at vminsert - https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy-via-labels See https://docs.victoriametrics.com/vmagent.html#multitenancy Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1505	2023-12-05 01:35:59 +02:00
Aliaksandr Valialkin	85fcefaa34	app/vmagent: code cleanup for Kafka and Google PubSub consumers / producers - Add links to relevant docs into descriptions for every -kafka.* and -gcp.pubsub.* command-line flags. - Wait until message processing goroutines are stopped before returning from gcppubsub.Stop(). - Prevent from multiple calls to Init() without Stop(). - Drop message if tenantID cannot be parsed properly. - Take into account tenantID for all the supported message formats. - Support gzip-compressed messages for graphite format. - Use exponential backoff sleep when the message cannot be pushed to remote storage systems because of disabled on-disk persistence - https://docs.victoriametrics.com/vmagent.html#disabling-on-disk-persistence - Unblock from sleep as soon as Stop() is called. Previously the sleep could take up to 2 seconds after Stop() is called. - Remove unused globalCtx and initContext from app/vmagent/remotewrite/gcppubsub - Mention Google PubSub support at docs/enterprise.md - Make Google PubSub docs more clear at docs/vmagent.md This is a follow-up for commits 115245924a5f096c5a3383d6cc8e8b6fbd421984 and e6eab781ce42285a6a1750dc01eba6801dd35516 . Updates https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/717 Updates https://github.com/VictoriaMetrics/VictoriaMetrics-enterprise/pull/713	2023-12-04 22:51:04 +02:00
Dmytro Kozlov	6770bad207	app/vmalert: expose `/vmalert/api/v1/rule` and `/api/v1/rule` API which returns rule status in JSON format (#5397 ) * app/vmalert: expose `/vmalert/api/v1/rule` and `/api/v1/rule` API which returns rule status in JSON format * app/vmalert: hide updates if query param not set * app/vmalert: fix panic (recursion call) * app/vmalert: add needed group name and file name * app/vmalert: fix comment, update behavior * app/vmalert: fix description * app/vmalert: simplify API for /api/v1/rule Signed-off-by: hagen1778 <roman@victoriametrics.com> * app/vmalert: simplify API for /api/v1/rule Signed-off-by: hagen1778 <roman@victoriametrics.com> * app/vmalert: simplify API for /api/v1/rule Signed-off-by: hagen1778 <roman@victoriametrics.com> * app/vmalert: simplify API for /api/v1/rule Signed-off-by: hagen1778 <roman@victoriametrics.com> * app/vmalert: simplify API for /api/v1/rule Signed-off-by: hagen1778 <roman@victoriametrics.com> --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2023-12-04 22:49:39 +02:00
Aliaksandr Valialkin	d868155751	app/vmselect: do not limit concurrency for static and fast queries Previously concurrency for static and fast queries was limited with the -search.maxConcurrentRequests command-line flag. This could complicate identifying heavy queries via `vmui` at `Top queries` and `Active queries` pages, since `vmui` and these pages couldn't be opened on overloaded vmselect. Thanks to @f41gh7 for the idea.	2023-12-04 18:14:29 +02:00
Aliaksandr Valialkin	9f352f1b93	app/vminsert/newrelic: simplify the code a bit after `1fb8dc0092` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5416 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5421	2023-12-04 16:26:52 +02:00
Dmytro Kozlov	1fb8dc0092	app/vminsert: fix newrelic ingestion in cluster version (#5421 ) Properly pass tenant ID to ingested data from newrelic. Before tenant ID was mistakenly skipped.	2023-12-04 09:38:32 +01:00
Aliaksandr Valialkin	e017176f45	docs/vmauth.md: add typical use cases (cherry picked from commit `837f6f0975`)	2023-12-01 14:00:23 +01:00
Andrii Chubatiuk	d6b4c8e4ef	add datadog /api/v2/series and /api/beta/sketches support (#5094 ) Co-authored-by: Andrew Chubatiuk <andrew.chubatiuk@motional.com> Co-authored-by: Nikolay <https://github.com/f41gh7> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> (cherry picked from commit `543f218fe9`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-12-01 13:55:32 +01:00
luckyxiaoqiang	8ce82c5400	app/vmselect/promql: add day_of_year() function (#5368 ) Co-authored-by: dingxiaoqiang <dingxiaoqiang@bytedance.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> (cherry picked from commit `d7897e0d70`)	2023-11-28 12:49:48 +01:00
Aliaksandr Valialkin	5ccc22d66d	app/vmagent: properly increase vmagent_remotewrite_samples_dropped_total when scraped samples cannot be sent to the remote storage and -remoteWrite.dropSamplesOnOverload is set This is a follow-up for `5034aa0773` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110	2023-11-25 14:44:42 +02:00
Aliaksandr Valialkin	2f14394335	app/vmagent: follow-up for `090cb2c9de` - Add Try* prefix to functions, which return bool result in order to improve readability and reduce the probability of missing check for the result returned from these functions. - Call the adjustSampleValues() only once on input samples. Previously it was called on every attempt to flush data to persistent queue. - Properly restore the initial state of WriteRequest passed to tryPushWriteRequest() before returning from this function after unsuccessful push to persistent queue. Previously a part of WriteRequest samples may be lost in such case. - Add -remoteWrite.dropSamplesOnOverload command-line flag, which can be used for dropping incoming samples instead of returning 429 Too Many Requests error to the client when -remoteWrite.disableOnDiskQueue is set and the remote storage cannot keep up with the data ingestion rate. - Add vmagent_remotewrite_samples_dropped_total metric, which counts the number of dropped samples. - Add vmagent_remotewrite_push_failures_total metric, which counts the number of unsuccessful attempts to push data to persistent queue when -remoteWrite.disableOnDiskQueue is set. - Remove vmagent_remotewrite_aggregation_metrics_dropped_total and vm_promscrape_push_samples_dropped_total metrics, because they are replaced with vmagent_remotewrite_samples_dropped_total metric. - Update 'Disabling on-disk persistence' docs at docs/vmagent.md - Update stale comments in the code Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5088 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110	2023-11-25 12:13:39 +02:00
Nikolay	25ac2aac31	app/vmagent: allow disabling on-disk persistence (#5088 ) * app/vmagent: allow disabling on-disk queue Previously, it wasn't possible to build data processing pipeline with a chain of vmagents. In case when remoteWrite for the last vmagent in the chain wasn't accessible, it persisted data only when it had enough disk capacity. If disk queue was full, it started to silently drop ingested metrics. New flags allow disabling on-disk persistence and immediately returning an error if remoteWrite is not accessible anymore. It blocks any writes and notifies the client that data ingestion isn't possible. Main use case for this feature - use external queue such as kafka for data persistence. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2110 * adds test, updates readme * apply review suggestions * update docs for vmagent * makes linter happy --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-25 12:12:29 +02:00
Roman Khavronenko	26242f526e	lib/protoparser: decrease `import.maxLineLen` from 100MB to 10MB (#5364 ) Tests showed that importing a single line with 70MB size takes 5.3GiB RSS memory for VictoriaMetrics single-node. In the scenario when user exports and imports data from one VM to another, it could possibly lead to OOM exception for destination VM. Importing a single line with 16MB size takes 1.3GiB RSS memory. Hence, the limit for `import.maxLineLen` was decreased from 100MB to 10MB to improve reliability of VictoriaMetrics during imports. Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-24 13:13:33 +02:00
Aliaksandr Valialkin	a906a7d85c	app/vmagent/remotewrite: do not drop persistent queues when -remoteWrite.multitenantURL is set It is unsafe to drop persistent queues when -remoteWrite.multitenantURL command-line flag is set, since these queues are created on demand when a new sample for the given tenant is pushed to the remote storage. This addresses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5357 The issue appeared in the commit `f3a51e8b1d` when implementing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4014	2023-11-23 20:43:21 +02:00
Aliaksandr Valialkin	10b4dfbbf9	app/vmalert/notifier: remove backticks from the description for -notifier.blackhole command-line flag Backticks in flag description are automatically converted to flag type. See https://pkg.go.dev/flag#PrintDefaults This is a follow-up for `20025d4fd6` and `25317b4e70`	2023-11-22 20:17:45 +02:00
Aliaksandr Valialkin	db6dadf1f7	docs: convert png images to webp in all the docs except of docs/operator/* This reduces the size of docs/* folder from 33MB to 18MB Images inside docs/operator/* must be converted at the https://github.com/VictoriaMetrics/operator/tree/master/docs and then the updated images must be automatically propagated to the docs/operator/* This is a follow-up for `d3f919df3e` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5206	2023-11-22 19:29:47 +02:00
Aliaksandr Valialkin	46e58f3669	app/vmagent/README.md: sync with docs/vmagent.md after `cbe4a5c251` , so `make docs-sync` properly works	2023-11-20 22:43:28 +02:00
Nikolay	c06044ef52	app/vmagent: adds google pubsub as remoteWrite dst and ingest consumer (#713 ) it allows to push and receive metrics from google pubsub queue Adds needed documentation and examples for it	2023-11-20 22:43:26 +02:00
hagen1778	0dbbffbdd5	docs: typo after `3f5a41e35e` Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `20025d4fd6`)	2023-11-20 17:06:21 +01:00
Khanh Quoc Le	03e5ebaea9	Add _stream fields log (#5068 )	2023-11-17 16:04:13 +01:00
Aliaksandr Valialkin	5492ccf0d5	app/vmselect/promql: reduce the number of memory allocations inside copyTimeseriesShallow() Previously the number of memory allocations inside copyTimeseriesShallow() was equal to 1+len(tss) Reduce this number to 2 by pre-allocating a slice of timeseries structs with len(tss) length.	2023-11-17 15:41:38 +01:00
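The allocation reduction described in commit `5492ccf0d5` above can be sketched as follows. This is a minimal illustration rather than the actual VictoriaMetrics code: the `series` type and both function names are hypothetical stand-ins chosen for the example.

```go
package main

import "fmt"

// series is a hypothetical stand-in for a time series struct.
type series struct {
	Name   string
	Values []float64
}

// copyShallowNaive copies every struct separately, so it performs
// one allocation for the result slice plus one per input series.
func copyShallowNaive(src []*series) []*series {
	dst := make([]*series, len(src))
	for i, s := range src {
		c := *s // c escapes to the heap: one allocation per series
		dst[i] = &c
	}
	return dst
}

// copyShallowPrealloc allocates the storage for all the structs at once,
// so it needs only two allocations regardless of len(src).
func copyShallowPrealloc(src []*series) []*series {
	backing := make([]series, len(src)) // allocation 1: all the structs
	dst := make([]*series, len(src))    // allocation 2: the pointers
	for i, s := range src {
		backing[i] = *s // shallow copy: Values still shares memory with src
		dst[i] = &backing[i]
	}
	return dst
}

func main() {
	src := []*series{{Name: "foo", Values: []float64{1, 2}}, {Name: "bar"}}
	dst := copyShallowPrealloc(src)
	fmt.Println(len(dst), dst[0].Name, dst[1].Name)
}
```

Both variants return copies whose `Values` slices still point at the original data. The trade-off of the pre-allocated variant is that the whole backing array stays alive for as long as any single copied struct is referenced.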
Aliaksandr Valialkin	8723c8546a	vendor: run `make vendor-update`	2023-11-16 20:21:16 +01:00
Aliaksandr Valialkin	994b3da361	app/vmselect: simplify code a bit after `63e0f16062` Use only a single call to prometheus.WriteErrorResponse() inside sendPrometheusError	2023-11-16 18:15:08 +01:00
Aliaksandr Valialkin	633ec37022	app/vmselect/promql: typo fix after `7ca8ebef20` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5332	2023-11-16 17:01:19 +01:00
Roman Khavronenko	c0039ce7a3	docs/vmalert: clarify deduplication recommendations for HA setup (#5336 ) Please see discussion here https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5279 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-16 16:27:47 +01:00
Aliaksandr Valialkin	7ca8ebef20	app/vmselect/promql: properly handle duplicate series when merging cached results with the results obtained from the database evalRollupFuncNoCache() may return time series with identical labels (aka duplicate series) when performing queries satisfying all the following conditions: - It must select time series with multiple metric names. For example, {__name__=~"foo\|bar"} - The series selector must be wrapped into rollup function, which drops metric names. For example, rate({__name__=~"foo\|bar"}) - The rollup function must be wrapped into aggregate function, which has no streaming optimization. For example, quantile(0.9, rate({__name__=~"foo\|bar"})) In this case VictoriaMetrics shouldn't return `cannot merge series: duplicate series found` error. Instead, it should fall back to query execution with disabled cache. Also properly store the merged results. Previously they were incorrectly stored because of a typo introduced in the commit `41a0fdaf39` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5332 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5337	2023-11-16 16:16:17 +01:00
Yury Molodov	cc5f1745ca	vmui: change autocomplete hotkey to Alt/Option + A (#5328 )	2023-11-15 23:33:33 +01:00
Aliaksandr Valialkin	68c0038a5d	docs/vmbackup.md: fix links to https://docs.victoriametrics.com/vmbackup.html#permanent-deletion-of-objects-in-s3-compatible-storages This is a follow-up for `2fc7e9f47e` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5121	2023-11-15 23:27:00 +01:00
Aliaksandr Valialkin	9a1354e8a9	docs/vmagent.md: refer to proper command-line flag: -remoteWrite.shardByURL.labels instead of -remoteWrite.shardByURLLabels This is a follow-up for `ed70a40669` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4942	2023-11-15 23:03:30 +01:00
Aliaksandr Valialkin	f9355d34be	docs: mention that VictoriaMetrics and vmagent support data ingestion via New Relic protocol now This is a follow-up for `f60c08a7bd` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3520	2023-11-15 22:56:34 +01:00
Aliaksandr Valialkin	8b01c6caf4	app/vmalert-tool: add missing multiarch directory This is needed for 'make publish-vmalert-tool'	2023-11-15 18:13:05 +01:00
Aliaksandr Valialkin	de3d5943eb	docs/stream-aggregation.md: clarify that stream aggregation is applied after all the configured relabeling This is a follow-up after `68d2cb203d`	2023-11-15 15:54:57 +01:00
Aliaksandr Valialkin	9d3f1ec0d0	app/vmctl/README.md: sync with docs/vmctl.md after `7b2e2a23c2`	2023-11-15 12:58:31 +01:00
John Belmonte	e94ec36ef6	vmctl README.md typo (#5326 )	2023-11-15 12:57:49 +01:00
hagen1778	cfc58dd932	docs: clarify vmalert flag changes Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-14 21:44:46 +01:00
Aliaksandr Valialkin	5b7f40907e	app/vmselect/netstorage: do not retry request when deadline is exceeded	2023-11-14 19:57:29 +01:00
Aliaksandr Valialkin	2f885d8e57	app/vmselect/promql: typo fixes after `7cf7740d18`	2023-11-14 03:34:25 +01:00
Aliaksandr Valialkin	9ff1ee333f	app/vmselect/promql: properly handle instant query optimization corner cases for min_over_time() and max_over_time() - If min_over_time(m[offset] @ timestamp) <= min_over_time(m[offset] @ (timestamp-window)), then the optimization can be applied. - If max_over_time(m[offset] @ timestamp) >= max_over_time(m[offset] @ (timestamp-window)), then the optimization can be applied.	2023-11-14 02:58:18 +01:00
Yury Molodov	0fe02e8d9d	vmui: reduced the number of server requests (#5253 ) * vmui: reduced the number of server requests * run `make vmui-update vmui-logs-update` --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-14 01:50:57 +01:00
Yury Molodov	33e65e2cab	vmui: fix trailing slash in serverURL (#5271 ) * vmui: add function to autoremove slash at the end of serverURL (#5203) * vmui: change removeTrailingSlash func	2023-11-14 01:24:29 +01:00
Noah Labrecque	fbb572a180	fix: apply correct bounds to sf and tf (#5274 )	2023-11-14 01:19:47 +01:00
Zakhar Bessarab	f7834767c1	vmcluster: re-routing enhancement (#5293 ) * app/vmstorage: close vminsert connections gradually before stopping storage Implements graceful shutdown approach suggested here - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4922#issuecomment-1768146878 Test results for this can be found here - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4922#issuecomment-1790640274 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * app/vmstorage: update graceful shutdown logic - close connections from vminsert in deterministic order - update flag description - lower default timeout to 25 seconds. 25 seconds value was chosen because the lowest default value used in default configuration deployments is 30s (default value in Kubernetes and ansible-playbooks). Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs/cluster: add information about re-routing enhancement during restart Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs/changelog: add entry for new command-line flag Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * {app/vmstorage,lib/ingestserver}: address review feedback Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * docs/cluster: add note to update workload scheduler timeout Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> * wip --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-14 01:00:42 +01:00
Aliaksandr Valialkin	c1f651a9f9	app/vmauth: add ability to drop the specified number of `/`-delimited prefix parts from request path This can be done via `drop_src_path_prefix_parts` option at `url_map` and `user` levels. See https://docs.victoriametrics.com/vmauth.html#dropping-request-path-prefix	2023-11-13 22:34:40 +01:00
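Dropping a number of `/`-delimited prefix parts from a request path, as described in commit `c1f651a9f9` above, could look roughly like the sketch below. The `dropPrefixParts` helper is a hypothetical illustration, and its handling of edge cases (repeated slashes, too few parts) is an assumption; see the linked vmauth docs for the actual behavior of `drop_src_path_prefix_parts`.

```go
package main

import (
	"fmt"
	"strings"
)

// dropPrefixParts removes up to `parts` leading /-delimited components
// from path. It is a hypothetical helper written for illustration only.
func dropPrefixParts(path string, parts int) string {
	for parts > 0 {
		// Skip the slashes in front of the next path component.
		path = strings.TrimLeft(path, "/")
		n := strings.IndexByte(path, '/')
		if n < 0 {
			// Assumption: nothing is left once the last component is dropped.
			return ""
		}
		path = path[n:]
		parts--
	}
	return path
}

func main() {
	fmt.Println(dropPrefixParts("/tenantID/vmagent/targets", 1))
	fmt.Println(dropPrefixParts("/tenantID/vmagent/targets", 2))
}
```

With this sketch, dropping one part from `/tenantID/vmagent/targets` yields `/vmagent/targets`, and dropping two parts yields `/targets`.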
Aliaksandr Valialkin	356deada8c	lib/htmlcomponents: use relative links for the top page and for favicon.ico This allows hiding VictoriaMetrics components behind proxies with arbitrary path prefixes. For example, vmagent HTTP handlers can be served via /vmagent/ path prefix: - http://proxy/vmagent/targets - http://proxy/vmagent/service-discovery The path prefix can be arbitrary. For example, below are vmagent urls for /tenantID/vmagent/ path prefix: - http://proxy/tenantID/vmagent/targets - http://proxy/tenantID/vmagent/service-discovery While at it, consistently serve favicon.ico from any path directory. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5306 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5307	2023-11-13 20:28:17 +01:00
Aliaksandr Valialkin	a45cbc101f	all: cleanup: remove `// +build ...` lines, since they are no longer needed after Go1.17, and the minimum supported Go version for VictoriaMetrics source code is Go1.20	2023-11-13 19:15:42 +01:00
Yury Molodov	695bc7ff36	vmui: ui logs enhancements (#5312 ) * vmui/logs: fix time sorting #5300 * vmui/logs: add base query validation * vmui/logs: add a message for empty results	2023-11-13 10:40:18 +01:00
Aliaksandr Valialkin	54c494ae8e	docs/vmauth.md: add missing dashes in front of command-line flags at the `Backend TLS setup` section Dashes must be consistently used in front of command-line flags across the documentation. This is a follow up for `61594d2bd8`	2023-11-13 09:45:52 +01:00
Aliaksandr Valialkin	b9aba7edfb	app/vmauth: properly pass `Host` header to backends Previously the `Host` header remained unchanged when passing it in requests to backends. This may work improperly if the backend uses host-based routing. While at it, allow http/2.0 requests to backends. While VictoriaMetrics components do not accept http/2.0 requests, other backends can require such requests. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5240	2023-11-13 09:45:34 +01:00
Aliaksandr Valialkin	78bc816220	app/vmauth: follow-up for `323f3720ed` - Re-use identically configured http.Transport across multiple users. This fixes handling of the limit on the number of connections, which can be established per each backend via -maxIdleConnsPerBackend command-line flag. This limit stopped working after `323f3720ed` - Add docs about backend TLS setup at https://docs.victoriametrics.com/vmauth.html#backend-tls-setup - Add ability to disable backend TLS verification for all the users via -backend.tlsInsecureSkipVerify command-line flag. This flag may be useful when -auth.config contains big number of users, and every user must disable backend TLS verification. - Add ability to specify TLS Root CA via tls_ca_file option at per-user basis and via -backend.tlsCAFile command-line flag across all the users. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5240	2023-11-13 09:45:16 +01:00
Aliaksandr Valialkin	76384b6d28	app/vmauth: improve docs a bit after `323f3720ed` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5240	2023-11-13 09:44:25 +01:00
Aliaksandr Valialkin	bf12a49087	app/vmagent/README.md: sync with docs/vmagent.md after `930d26b2ff`	2023-11-13 09:44:07 +01:00
Aliaksandr Valialkin	d9ecc3f6d7	lib/logger: add `-loggerMaxArgLen` command-line flag for fine-tuning the maximum length of logged args	2023-11-13 09:43:49 +01:00
Aliaksandr Valialkin	c916294b61	app/vmselect/promql: optimize instant queries with min_over_time() and max_over_time() rollup functions This is a follow-up for `41a0fdaf39`	2023-11-13 09:43:18 +01:00
Aliaksandr Valialkin	7bbdecb79a	deployment: update Go builder from Go1.21.3 to Go1.21.4 See https://github.com/golang/go/issues?q=milestone%3AGo1.21.4+label%3ACherryPickApproved	2023-11-13 09:40:08 +01:00
Roman Khavronenko	becf7bf8df	app/vmalert: update remote-write process (#5284 ) * app/vmalert: update remote-write process * automatically retry remote-write requests on closed connections. The change should reduce the amount of logs produced in environments with short-living connections or environments without support of keep-alive on network balancers. * increment `vmalert_remotewrite_errors_total` metric if all retries to send remote-write request failed. Before, this metric was incremented only if remote-write client's buffer is overloaded. * increment `vmalert_remotewrite_dropped_rows_total` and `vmalert_remotewrite_dropped_bytes_total` metrics if remote-write client's buffer is overloaded. Before, these metrics were incremented only after unsuccessful HTTP calls. Signed-off-by: hagen1778 <roman@victoriametrics.com> * Update docs/CHANGELOG.md --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Hui Wang <haley@victoriametrics.com>	2023-11-13 09:25:29 +01:00
hagen1778	10da9e6e01	app/vmalert: fix typo in `remoteWrite.concurrency` description Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `c07dc45786`)	2023-11-03 22:05:00 +01:00
Yury Molodov	d7c6153f68	vmui: display query error on Explore metrics page (#5272 ) https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5202 (cherry picked from commit `f90d2ec843`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-11-03 16:25:21 +01:00
Zakhar Bessarab	dea4695df5	app/vmauth: add option to skip TLS verification (#5256 ) Add `tls_insecure_skip_verify` option on per-user basis which allows to disable TLS verification for all requests to backend on behalf of this user. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5240 Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> (cherry picked from commit `323f3720ed`)	2023-11-03 12:05:26 +01:00
Aliaksandr Valialkin	3d6f4da3b3	docs: update -help output after recent changes to VictoriaMetrics components	2023-11-02 20:27:16 +01:00
Aliaksandr Valialkin	bf01a97f17	docs/CHANGELOG.md: update the description of the optimization for SLO/SLI-like queries according to latest changes See commits `4497a08e3d` and `92826b0b4a`	2023-11-02 20:09:22 +01:00
Roman Khavronenko	4e8c762fd9	app/vmalert: add label `file` pointing to the group's filename to metrics (#5281 ) The filename should help identifying alerting rules belonging to specific groups with identical names but different filenames. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5267 Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `b5254199c6`)	2023-11-02 16:02:29 +01:00
hagen1778	3773510e8f	app/vmalert: verify alert name correctness in restore test Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `6eb205f8b0`)	2023-11-02 16:02:29 +01:00
Hui Wang	44fcdf0cf0	vmalert: reduce restore query request for each alerting rule (#5265 ) reduce the number of queries for restoring alerts state on start-up. The change should speed up the restore process and reduce pressure on `remoteRead.url`. (cherry picked from commit `90d45574bf`)	2023-11-02 16:02:28 +01:00
Aliaksandr Valialkin	7fc5178a4b	app/vmselect/promql: add missing trace message in rollupResultCache.GetSeries()	2023-11-02 09:17:13 +01:00
Aliaksandr Valialkin	369d37749d	app/vmagent/remotewrite: add -remoteWrite.shardByURL.labels command-line flag This command-line flag can be used for specifying a list of labels used for sharding among -remoteWrite.url entries when -remoteWrite.shardByURL command-line flag is set. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4942	2023-11-01 23:09:08 +01:00
Alexander Marshalov	ffeec24811	vmauth: add browser authorization request for http requests without… (#5234 ) * vmauth: add browser authorization request for http requests without credentials to a route that is not in the `unauthorized_user` section (when `unauthorized_user` is specified). * add link to issue in CHANGELOG * Extend vmauth docs * wip --------- Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-11-01 21:00:52 +01:00
Aliaksandr Valialkin	ece7024f11	app/vmselect/promql: reduce the minimum lookbehind window for enabling SLO/SLI optimizations from 24 hours to 6 hours This reduction is based on production testing. Also expose -search.minWindowForInstantRollupOptimization command-line flag, so users could fine-tune this arg for their needs	2023-11-01 20:19:19 +01:00
Aliaksandr Valialkin	e4365dbe3e	app/vmselect: run `make quicktemplate-gen` after `b8739bc00b`	2023-11-01 17:53:30 +01:00
Aliaksandr Valialkin	ae9b4c94bc	app/vmselect: return stats.seriesFetched as string instead of number vmalert expects string value for stats.seriesFetched, so it is impossible switching to number without breaking compatibility with old vmalert releases :( It is still unclear why stats.seriesFetched has string type in the first place...	2023-11-01 17:49:28 +01:00
Aliaksandr Valialkin	6a98f9df54	app/vmui: show query execution duration in the header of query input field This should simplify the process of query optimization	2023-11-01 16:46:42 +01:00
Hui Wang	4fafdda13e	vmalert: support specifying full http url in notifier static_configs target (#5261 ) * vmalert: support specifying full http or https urls in notifier static_configs target address * show right label results in ui	2023-11-01 16:44:54 +01:00
Aliaksandr Valialkin	c5e3b11762	app/vmselect/promql: apply SLO-like optimization to all the `count_*_over_time()` functions This is a follow-up for `41a0fdaf39`	2023-11-01 09:58:50 +01:00
Aliaksandr Valialkin	b96d55e1e4	app/vmselect/promql: typo fix, which could lead to panic during range query execution The panic is: BUG: unexpected values after merging new values This is a follow-up for `41a0fdaf39`	2023-11-01 09:58:50 +01:00
Aliaksandr Valialkin	28f0610e14	app/vmui: fix non-working `Disable cache` checkbox at `JSON` and `Table` views	2023-10-31 22:58:15 +01:00
Aliaksandr Valialkin	7b7ad44e84	app/vmselect/promql: properly calculate rollup result if lookbehind window isn't set This is a follow-up for `41a0fdaf39`	2023-10-31 22:23:04 +01:00
Aliaksandr Valialkin	744f8c3fe7	app/vmselect/promql: add outliers_iqr(q) and outlier_iqr_over_time(m[d]) functions These functions allow detecting anomalies in series and samples using Interquartile range method. See Outliers section at https://en.wikipedia.org/wiki/Interquartile_range for more details.	2023-10-31 22:14:14 +01:00
Aliaksandr Valialkin	9661918bb4	app/vmselect/promql: optimize repeated SLI-like instant queries with lookbehind windows >= 1d Repeated instant queries with long lookbehind windows, which contain one of the following rollup functions, are optimized via partial result caching: - sum_over_time() - count_over_time() - avg_over_time() - increase() - rate() The basic idea of optimization is to calculate rf(m[d] @ t) as rf(m[offset] @ t) + rf(m[d] @ (t-offset)) - rf(m[offset] @ (t-d)) where rf(m[d] @ (t-offset)) is cached query result, which was calculated previously The offset may be in the range of up to 1 hour.	2023-10-31 20:08:38 +01:00
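The identity from commit `9661918bb4` above, rf(m[d] @ t) = rf(m[offset] @ t) + rf(m[d] @ (t-offset)) - rf(m[offset] @ (t-d)), can be checked on a small example for `sum_over_time()`. The sketch below is only an illustration under assumptions: the `sample` type, the function names and the half-open window `(t-window, t]` are chosen for the example and are not taken from the VictoriaMetrics code.

```go
package main

import "fmt"

// sample is a hypothetical raw sample with a timestamp in seconds.
type sample struct {
	ts    int64
	value float64
}

// sumOverTime returns the sum of values with timestamps in (t-window, t].
func sumOverTime(samples []sample, window, t int64) float64 {
	var sum float64
	for _, s := range samples {
		if s.ts > t-window && s.ts <= t {
			sum += s.value
		}
	}
	return sum
}

// sumViaCache derives sum_over_time(m[d] @ t) from the cached result
// for sum_over_time(m[d] @ (t-offset)), scanning only two short windows.
func sumViaCache(samples []sample, d, t, offset int64, cached float64) float64 {
	head := sumOverTime(samples, offset, t)   // rf(m[offset] @ t)
	tail := sumOverTime(samples, offset, t-d) // rf(m[offset] @ (t-d))
	return head + cached - tail
}

func main() {
	var samples []sample
	for ts := int64(0); ts <= 200; ts += 10 {
		samples = append(samples, sample{ts: ts, value: float64(ts / 10)})
	}
	const d, offset, t = 100, 30, 200
	cached := sumOverTime(samples, d, t-offset) // previously calculated result
	direct := sumOverTime(samples, d, t)
	fmt.Println(sumViaCache(samples, d, t, offset, cached) == direct)
}
```

The gain is that only the two `offset`-sized windows need to be read from storage, while the long `d`-sized window comes from the cache.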
Aliaksandr Valialkin	9ba007a636	app/vmselect/promql: wrap too long line after `a950873fff`	2023-10-31 19:11:05 +01:00
Roman Khavronenko	9d8f93050c	app/vmselect: expose `vm_memory_intensive_queries_total` counter metric (#5208 ) The new metric gets increased each time `-search.logQueryMemoryUsage` memory limit is exceeded by a query. This metric should help to identify expensive and heavy queries without inspecting the logs. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-31 19:02:22 +01:00
Hui Wang	8a786e5df4	vmalert: fix alert firing state in replay mode (#5192 ) fix possible missing firing states for alerting rules in replay mode Previously, if one firing stage was longer than a single query request range, e.g. for a rule with a big `for`, the alerting rule could not be detected as firing. Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `abcb21aa5e`)	2023-10-30 13:55:48 +01:00
Dima Lazerka	ed8fc04898	lib/httpserver: add flags to specify HSTS / Frame-Options / CSP headers for httpserver (#5111 ) support `Strict-Transport-Security`, `Content-Security-Policy` and `X-Frame-Options` HTTP headers in all VictoriaMetrics components. The values for headers can be specified by users via the following flags: `-http.header.hsts`, `-http.header.csp` and `-http.header.frameOptions`. Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `ad839aa492`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-30 11:41:38 +01:00
Aliaksandr Valialkin	a66c261b55	app/vmui: change the order of tables at `Top queries` tab Move the most interesting table - queries with the most summary time to execute - to the top	2023-10-28 11:57:08 +02:00
hagen1778	ddedeb1d42	app/vmalert: remove unclear comment The timestamp alignment should be applied as a last step to keep the timestamp consistent. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2023-10-27 14:09:01 +02:00
Aliaksandr Valialkin	f03e81c693	lib/promauth: follow-up for `e16d3f5639` - Make sure that invalid/missing TLS CA file or TLS client certificate files at vmagent startup don't prevent from processing the corresponding scrape targets after the file becomes correct, without the need to restart vmagent. Previously scrape targets with invalid TLS CA file or TLS client certificate files were permanently dropped after the first attempt to initialize them, and they didn't appear until the next vmagent reload or the next change in other places of the loaded scrape configs. - Make sure that TLS CA is properly re-loaded from file after it changes without the need to restart vmagent. Previously the old TLS CA was used until vmagent restart. - Properly handle errors during http request creation for the second attempt to send data to remote system at vmagent and vmalert. Previously failed request creation could result in nil pointer dereferencing, since the returned request is nil on error. - Add more context to the logged error during AWS sigv4 request signing before sending the data to -remoteWrite.url at vmagent. Previously it could miss details on the source of the request. - Do not create a new HTTP client per second when generating OAuth2 token needed to put in Authorization header of every http request issued by vmagent during service discovery or target scraping. Re-use the HTTP client instead until the corresponding scrape config changes. - Cache error at lib/promauth.Config.GetAuthHeader() in the same way as the auth header is cached, e.g. the error is cached for a second now. This should reduce load on CPU and OAuth2 server when auth header cannot be obtained because of temporary error. - Share tls.Config.GetClientCertificate function among multiple scrape targets with the same tls_config. Cache the loaded certificate and the error for one second. This should significantly reduce CPU load when scraping big number of targets with the same tls_config. 
- Allow loading TLS certificates from HTTP and HTTPS urls by specifying these urls at `tls_config->cert_file` and `tls_config->key_file`. - Improve test coverage at lib/promauth - Skip unreachable or invalid files specified at `scrape_config_files` during vmagent startup, since these files may become valid later. Previously vmagent was exiting in this case. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4959	2023-10-26 09:55:47 +02:00
Aliaksandr Valialkin	19940b5629	app/vmalert/config: fix flaky test TestParseBad It could return either `failed to read` or `failed to parse` errors depending on whether the given url can be loaded or not under the current environment	2023-10-26 09:53:40 +02:00
Aliaksandr Valialkin	36a1fdca6c	all: consistently use %w instead of %s when an error is passed to fmt.Errorf() This allows consistently using errors.Is() for verifying whether the given error wraps some other known error.	2023-10-26 09:44:40 +02:00
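The difference between `%s` and `%w` mentioned in commit `36a1fdca6c` above can be seen in a short example. The error value and helper names below are hypothetical and exist only for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel error used for the illustration.
var errNotFound = errors.New("not found")

// wrapWithS formats the error text only, so the link to err is lost.
func wrapWithS(err error) error {
	return fmt.Errorf("cannot load config: %s", err)
}

// wrapWithW wraps err, so errors.Is() can still find it in the chain.
func wrapWithW(err error) error {
	return fmt.Errorf("cannot load config: %w", err)
}

// isNotFound reports whether err wraps errNotFound.
func isNotFound(err error) bool {
	return errors.Is(err, errNotFound)
}

func main() {
	fmt.Println(isNotFound(wrapWithS(errNotFound))) // false
	fmt.Println(isNotFound(wrapWithW(errNotFound))) // true
}
```

Both variants produce the same message text; only the `%w` variant keeps the original error reachable for `errors.Is()` and `errors.As()`.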
Aliaksandr Valialkin	94e061087f	docs: use https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest instead of https://github.com/VictoriaMetrics/VictoriaMetrics/releases link where needed The https://github.com/VictoriaMetrics/VictoriaMetrics/releases link may show non-latest releases at the top, such as LTS releases or VictoriaLogs releases. So it is better to use https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest link, which always redirects to the latest available release of VictoriaMetrics.	2023-10-26 09:23:17 +02:00
Roman Khavronenko	cd2247b24a	app/vmselect: limit the number of parallel workers by 32 (#5195 ) * app/vmselect: limit the number of parallel workers by 32 The change should improve performance and memory usage during query processing on machines with big number of CPU cores. The number of parallel workers for query processing is controlled via `-search.maxWorkersPerQuery` command-line flag. By default, the number of workers is limited by the number of available CPU cores, but not more than 32. The limit can be increased via `-search.maxWorkersPerQuery`. Signed-off-by: hagen1778 <roman@victoriametrics.com> * wip - The `-search.maxWorkersPerQuery` command-line flag doesn't limit resource usage, so move it from the `resource usage limits` to `troubleshooting` chapter at docs/Single-server-VictoriaMetrics.md - Make the description for the `-search.maxWorkersPerQuery` command-line flag clearer - Add the description of `-search.maxWorkersPerQuery` to docs/Cluster-VictoriaMetrics.md - Limit the maximum value, which can be passed to `-search.maxWorkersPerQuery`, to GOMAXPROCS, because bigger values may worsen query performance and increase CPU usage - Improve the description of the change at docs/CHANGELOG.md. Mark it as FEATURE instead of BUGFIX, since it is closer to a feature than to a bugfix. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5087 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>	2023-10-26 09:15:27 +02:00
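The worker limit described in commit `cd2247b24a` above amounts to a pair of clamps. The sketch below is an assumption-based illustration: `workersPerQuery`, its parameters, and the treatment of a non-positive flag value as "unset" are hypothetical and are not taken from the vmselect source.

```go
package main

import (
	"fmt"
	"runtime"
)

// defaultMaxWorkers is the default cap mentioned in the commit message.
const defaultMaxWorkers = 32

// workersPerQuery returns the number of workers for a query.
// flagValue stands for -search.maxWorkersPerQuery; a value <= 0 is
// assumed here to mean "not set". This helper is hypothetical.
func workersPerQuery(flagValue, gomaxprocs int) int {
	if flagValue <= 0 {
		// Default: the number of CPU cores, but not more than 32.
		if gomaxprocs > defaultMaxWorkers {
			return defaultMaxWorkers
		}
		return gomaxprocs
	}
	// An explicit value may exceed 32, but it is capped by GOMAXPROCS.
	if flagValue > gomaxprocs {
		return gomaxprocs
	}
	return flagValue
}

func main() {
	// runtime.GOMAXPROCS(0) returns the current setting without changing it.
	fmt.Println(workersPerQuery(0, runtime.GOMAXPROCS(0)))
}
```

On a hypothetical 64-core machine this gives 32 workers by default, 48 when the flag is set to 48, and 64 when the flag is set to 100.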
Yury Molodov	45501eccab	vmui: update dependencies (#5194 )	2023-10-26 09:08:05 +02:00
Hui Wang	d7dd7614eb	fix inconsistent behaviors with prometheus when scraping (#5153 ) * fix inconsistent behaviors with prometheus when scraping 1. address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4959. skip job with wrong syntax in `scrape_configs` with error logs instead of exiting; 2. show error messages on vmagent /targets ui if there are wrong auth configs in `scrape_configs`, previously will print error logs and do scrape without auth header; 3. don't send requests if there are wrong auth configs in: 1. vmagent remoteWrite; 2. vmalert datasource/remoteRead/remoteWrite/notifier. * add changelogs * address review comments * fix ut	2023-10-26 08:56:54 +02:00
hagen1778	f00729ee24	app/vmalert: fix typo in tests Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `c07909a20b`)	2023-10-26 08:55:20 +02:00
hagen1778	cf541c757a	app/vmalert: fix tests after `a216fe6728` `a216fe6728` Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `eed0c3c6b0`)	2023-10-26 08:55:06 +02:00
Hui Wang	855c25b6c4	remove vmalert-tool code from branch cluster (#5229 ) Follow up `130e0ea5f0`. vmalert-tool can't be easily adapted for vmcluster now, because it needs to set up the whole vmcluster [vminsert+vmstorage+vmselect] first. You can use vmalert-tool to run unit tests for alerting and recording rules. It will perform the following actions: - sets up an isolated VictoriaMetrics instance; - simulates the periodic ingestion of time series; - queries the ingested data for recording and alerting rules evaluation like vmalert; However, the component packages have functions that are not exported and variables with the same names, so implementing this for the cluster would require a significant amount of refactoring, which does not look beneficial for the packages themselves. So I want to remove it from the cluster branch.	2023-10-25 14:48:11 +02:00

3292 commits