github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2025-03-21 15:45:01 +00:00

Author	SHA1	Message	Date
Max Kotliar	c05ffa906d	lib/promscrape: improve streamParse performance Previously, performance of stream.Parse could be limited by mutex.Lock on callback function. It used shared writeContext. With complicated relabeling rules and any slowness at pushData function, it could significantly decrease parsed rows processing performance. This commit removes locks and makes parsed rows processing lock-free in the same manner as `stream.Parse` processing implemented at push ingestion processing. Implementation details: - Removing global lock around stream.Parse callback. - Using atomic operations for counters - Creating write contexts per callback instead of sharing - Improving series limit checking with sync.Once - Optimizing labels hash calculation with buffer pooling - Adding comprehensive tests for concurrency correctness Benchmark performance: ``` # before BenchmarkScrapeWorkScrapeInternalStreamBigData-10 13 81973945 ns/op 37.68 MB/s 18947868 B/op 197 allocs/op # after goos: darwin goarch: arm64 pkg: github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape cpu: Apple M1 Pro BenchmarkScrapeWorkScrapeInternalStreamBigData-10 74 15761331 ns/op 195.98 MB/s 15487399 B/op 148 allocs/op PASS ok github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape 1.806s ``` Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8159 --------- Signed-off-by: Maksim Kotlyar <kotlyar.maksim@gmail.com> Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>	2025-03-20 16:49:24 +01:00
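The locking change described in the commit above can be illustrated with a minimal Go sketch. It is not the actual `lib/promscrape` code: `writeContext` and `processBlocks` are hypothetical names, and the only points shown are that each callback invocation owns its own context and that shared counters are updated atomically instead of under a mutex.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// writeContext is a hypothetical stand-in for per-callback scratch state.
type writeContext struct {
	rows []string
}

// processBlocks handles every block of parsed rows in its own goroutine.
// Each invocation creates a private writeContext, so no lock is required,
// and the shared row counter is updated with an atomic add.
func processBlocks(blocks [][]string) int64 {
	var total atomic.Int64
	var wg sync.WaitGroup
	for _, block := range blocks {
		wg.Add(1)
		go func(rows []string) {
			defer wg.Done()
			wc := &writeContext{} // private to this callback invocation
			wc.rows = append(wc.rows, rows...)
			total.Add(int64(len(wc.rows)))
		}(block)
	}
	wg.Wait()
	return total.Load()
}

func main() {
	blocks := [][]string{{"a 1", "b 2"}, {"c 3"}, {"d 4", "e 5", "f 6"}}
	fmt.Println(processBlocks(blocks))
}
```

The trade-off in this style is extra allocation per callback, which the commit message addresses with buffer pooling.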
Nikolay	27d7d0b25c	lib/promscrape: properly send staleness markers Previously, vmagent could incorrectly store a partial scrape response after a scraping error. This could happen if the `sw.ReadData` call fetched part of a chunked response and stored it in the buffer, and a context deadline exceeded error occurred afterwards. As a result, at the next scrape iteration this partial response could be forwarded to the `sw.sendStaleSeries(lastScrape...)` function call and lead to a `Prometheus line` parsing error. This commit sets the response body to an empty value in case of a scraping error. This prevents storing a partial scrape response body, so partial staleness markers are no longer sent to the remote storage. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8528	2025-03-19 10:37:54 +01:00
Aliaksandr Valialkin	f645479b5e	lib/protoparser: rename lib/protoparser/common to lib/protoparser/protoparserutil This improves readability of the code, which uses this package.	2025-03-18 16:24:51 +01:00
Guillem Jover	76d205feae	spelling and grammar fixes via codespell (#8497 ) ### Describe Your Changes Fix many spelling errors and some grammar, including misspellings in filenames. The change also fixes a typo in metric `vm_mmaped_files` to `vm_mmapped_files`. While this is a breaking change, this metric isn't used in alerts or dashboards. So it seems to have low impact on users. The change also deprecates `cspell` as it is much heavier and less usable. --------- Co-authored-by: Andrii Chubatiuk <achubatiuk@victoriametrics.com> Co-authored-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>	2025-03-17 16:32:10 +01:00
Aliaksandr Valialkin	0451a1c9e0	app/vlinsert: follow-up for `37ed1842ab` - Properly decode protobuf-encoded Loki request if it has no Content-Encoding header. Protobuf Loki message is snappy-encoded by default, so snappy decoding must be used when Content-Encoding header is missing. - Return back the previous signatures of parseJSONRequest and parseProtobufRequest functions. This eliminates the churn in tests for these functions. This also fixes broken benchmarks BenchmarkParseJSONRequest and BenchmarkParseProtobufRequest, which consume the whole request body on the first iteration and do nothing on subsequent iterations. - Put the CHANGELOG entries into correct places, since they were incorrectly put into already released versions of VictoriaMetrics and VictoriaLogs. - Add support for reading zstd-compressed data ingestion requests into the remaining protocols at VictoriaLogs and VictoriaMetrics. - Remove the `encoding` arg from PutUncompressedReader() - it has enough information about the passed reader arg in order to properly deal with it. - Add ReadUncompressedData to lib/protoparser/common for reading uncompressed data from the reader until EOF. This allows removing repeated code across request-based protocol parsers without streaming mode. - Consistently limit data ingestion request sizes, which can be read by ReadUncompressedData function. Previously this wasn't the case for all the supported protocols. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8416 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8380 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8300	2025-03-15 00:03:03 +01:00
Evgeny	486b9e1c64	lib/promscrape: use original job name as scrapePool value in targets api (#8457 ) ### Fix scrapePool name If the scrape config rewrites the job name via relabeling, Prometheus still shows the original job name as scrapePool in the targets API, but vmagent sets it to the final relabeled value, which is wrong. Example: ``` job: consul-targets ... - source_labels: [ __meta_consul_service ] regex: (\w+)[_-]exporter target_label: job replacement: $1 ``` A curl request to the Prometheus API shows `"scrapePool": "consul-targets",` while vmagent shows `"scrapePool": "node",` Before changes: ``` curl -s 'http://localhost:8429/api/v1/targets' | jq -r '.data.activeTargets[].scrapePool' | sort | uniq blackbox pgbackrest postgres ``` After changes: ``` curl -s 'http://localhost:8429/api/v1/targets' | jq -r '.data.activeTargets[].scrapePool' | sort | uniq blackbox consul-targets ``` ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Co-authored-by: hagen1778 <roman@victoriametrics.com>	2025-03-11 13:11:35 +01:00
Roman Khavronenko	63f6ac3ff8	lib/promutils: move time-related funcs from `promutils` to `timeutil` (#8403 ) Since funcs `ParseDuration` and `ParseTimeMsec` are used in vlogs, vmalert, victoriametrics and other components, importing promutils only for this reason makes them to export irrelevant `vm_rows_invalid_total{type="prometheus"}` metric. This change removes `vm_rows_invalid_total{type="prometheus"}` metric from /metrics page for these components. ### Describe Your Changes Please provide a brief description of the changes you made. Be as specific as possible to help others understand the purpose and impact of your modifications. ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com>	2025-03-03 10:25:42 +01:00
Zakhar Bessarab	99de272b72	lib/promrelabel/scrape_url: properly parse IPv6 address from __address__ label Fix parsing of IPv6 addresses after discovery. Previously, it could lead to target being discovered and discarded afterwards. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8374 --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2025-02-28 14:19:10 +04:00
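The commit message above does not describe the fix itself, so the following Go sketch only illustrates why IPv6 literals in an `__address__`-style value need bracket-aware handling. `normalizeAddress` and its default-port parameter are hypothetical and are not taken from `lib/promrelabel`.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// normalizeAddress is a hypothetical helper that returns addr in host:port
// form, adding defaultPort when addr has no port. IPv6 literals are returned
// in brackets, as required inside URLs.
func normalizeAddress(addr, defaultPort string) string {
	if host, port, err := net.SplitHostPort(addr); err == nil {
		return net.JoinHostPort(host, port)
	}
	// No port could be split off: treat the whole value as a host and strip
	// optional brackets around a bare IPv6 literal.
	host := strings.TrimSuffix(strings.TrimPrefix(addr, "["), "]")
	return net.JoinHostPort(host, defaultPort)
}

func main() {
	for _, addr := range []string{"[::1]:9100", "::1", "[fe80::1]", "example.com"} {
		fmt.Println(normalizeAddress(addr, "80"))
	}
}
```

Splitting on the last colon by hand would misread `::1` as host `:` plus port `1`, which is the kind of mistake that bracket-aware parsing avoids.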
Roman Khavronenko	38d46d149f	lib/prompbmarshal: move MustParsePromMetrics to protoparser/prometheus (#8405 ) `MustParsePromMetrics` imports `lib/protoparser/prometheus`, and this package exposes the following metrics: ``` vm_protoparser_rows_read_total{type="promscrape"} vm_rows_invalid_total{type="prometheus"} ``` It means every package that uses `lib/prompbmarshal` will start exposing these metrics. For example, vlogs imports `lib/protoparser/common` which uses `lib/prompbmarshal.Label`. And only because of this vlogs starts exposing unrelated prometheus metrics on /metrics page. Moving `MustParsePromMetrics` to `lib/protoparser/prometheus` seems like the least intrusive change. ----------- Depends on another change https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8403 Signed-off-by: hagen1778 <roman@victoriametrics.com>	2025-02-27 22:50:27 +01:00
Aliaksandr Valialkin	467cdd8a3d	lib: consistently use logger.Panicf("BUG: ...") for logging programming bugs logger.Fatalf("BUG: ...") complicates investigating the bug, since it doesn't show the call stack, which led to the bug. So it is better to consistently use logger.Panicf("BUG: ...") for logging programming bugs.	2025-01-24 16:39:21 +01:00
Zhu Jiekun	276989716f	lib/promscrape: add Marathon service discovery This commit adds support for [Marathon](https://mesosphere.github.io/marathon/) service discovery to the scrape configuration. The following flag is introduced: ``` -promscrape.marathonSDCheckInterval duration Interval for checking for changes in Marathon service discovery. This works only if marathon_sd_configs is configured in '-promscrape.config' file. See https://docs.victoriametrics.com/sd_configs.html#marathon_sd_configs for details (default 30s) ``` The service discovery could be config like: ```yaml scrape_configs: - job_name: marathon_job marathon_sd_configs: servers: - "..." - "..." ``` See: [`b555d94d1a/docs/sd_configs.md (marathon_sd_configs)`) related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6642 --------- Co-authored-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2025-01-08 18:57:22 +01:00
cuiweiyuan	d064e14933	chore: fix function name in comment (#7926 ) ### Describe Your Changes fix function name in comment ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). Signed-off-by: cuiweiyuan <cuiweiyuan@aliyun.com>	2025-01-08 13:58:22 +01:00
Zhu Jiekun	43181b67b1	discovery/dockerswarm: add missing service labels to tasks discovery role Previously service labels won't be attached when `role: tasks` is set. Because the `addServicesLabels` function is shared by `role: tasks` and `role: services`, and it will return nothing when `vip.Addr` is invalid or empty. In Prometheus, even if `vip.Addr` is empty, it attach common service labels with [a standalone function](`f10c3454e9/discovery/moby/services.go (L129)`), which offers: - `__meta_dockerswarm_service_id`: the id of the service. - `__meta_dockerswarm_service_name`: the name of the service. - `__meta_dockerswarm_service_mode`: the mode of the service. - `__meta_dockerswarm_service_label_<labelname>`: each label of the service, with any unsupported characters converted to an underscore. This PR add a `addServicesLabelsForTask`, to replace the usage of `addServicesLabels` when `role: tasks` is set. This function offers common service labels listed above. related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7800	2024-12-13 11:28:04 +01:00
Zhu Jiekun	6b48126603	discovery/docker: add match_first_network support for docker_sd_configs This commit aligns the behaviour of docker service discovery with the Prometheus implementation. It adds the following changes: * introduce a new config param `match_first_network` with a default value of `true`. It uses the first network if the container has multiple networks defined. It should help to avoid the duplicate targets error in multi-network setups. * add `networks` for containers that are linked to the network of other containers with the `network_mode: container:id` setting. It resolves an issue with attached containers aka `pods` in Kubernetes. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7398	2024-12-10 20:15:33 +01:00
Zhu Jiekun	7374a8813d	lib/promscrape/discovery: properly apply the resource_group filter for Azure service discovery Previously, this filter did not apply to virtual machine scale sets, causing all virtual machines to be discovered. This commit conditionally adds `resource_group` filter for Azure service discovery on virtual machine scale sets. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7630.	2024-11-26 19:06:43 +01:00
Andrei Baidarov	727bc02a5c	vmagent: set up a timeout for tcp connection establishment during k8s discovery Previously, default dial timeout was used for kubernetes API server connection. This commit changes it for custom dialer used by the all VictoriaMetrics components. It has lower connection timeout (30s by default). Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7127 --------- Co-authored-by: f41gh7 <nik@victoriametrics.com>	2024-11-25 18:02:09 +01:00
Nikolay	bba08f7846	lib/promscrape: add relabel configs to `global` section This commit adds `metric_relabel_configs` and `relabel_configs` fields into the `global` section of scrape configuration file. New fields are used as global relabeling rules for the scrape targets. These relabel configs are prepended to the target relabel configs. This feature is useful to: * apply global rules to __meta labels from service discovery targets. * drop noisy labels during scrapping. * mutate labels without affecting metrics ingested via any of push protocols. Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6966 --------- Signed-off-by: f41gh7 <nik@victoriametrics.com> Co-authored-by: Zhu Jiekun <jiekun@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-10-31 19:58:22 +01:00
Zhu Jiekun	f06c7e99fe	lib/promscrape: adds support for PuppetDB service discovery This commit adds support for [PuppetDB](https://www.puppet.com/docs/puppetdb/8/overview.html) service discovery to the `vmagent` and `victoria-metrics-single` components. Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5744	2024-10-27 20:38:34 +01:00
Andrii Chubatiuk	fc537bea00	lib/promscrape/discovery/kubernetes: support kubernetes native sidecars (#7324 ) This commit adds Kubernetes Native Sidecar support. It's the special type of init containers, that have restartPolicy == "Always" and continue to run after container initialization. related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7287	2024-10-24 17:04:12 +02:00
Andrii Chubatiuk	965a33c893	lib/promscrape: fixed reload on max_scrape_size change (#7282 ) ### Describe Your Changes fixed reload on max_scrape_size change https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7260 ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-10-18 11:35:23 +02:00
Zakhar Bessarab	eefae85450	vmagent: add support of HTTP2 client for Kubernetes SD (#7114 ) ### Describe Your Changes Currently, vmagent always uses a separate `http.Client` for every group watcher in Kubernetes SD. With a high number of group watchers this leads to large amount of opened connections. This PR adds 2 changes to address this: - re-use of existing `http.Client` - in case `http.Client` is connecting to the same API server and uses the same parameters it will be re-used between group watchers - HTTP2 support - this allows to reuse connections more efficiently due to ability of using streaming via existing connections. See this issue for the details and test results - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5971 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2024-10-08 10:36:31 +02:00
Artem Fetishev	c1cd3e85a7	lib/promscrape: Fix TestClientProxyReadOk flaky test (#7173 ) This PR fixes #7062 For hijacked connections, one has to read from the connection buffer, but still write directly to the connection. Otherwise, when reading directly from such connections, the first byte may be lost. This, in turn corrupts the ClientHello TLS handshake message and when the backend server receives it, it closes the connection and reports the following error in the log: ``` http: TLS handshake error from 127.0.0.1:33150: tls: first record does not look like a TLS handshake ``` The first byte may be lost because underlying HTTP request handler may read it from the connection and put it into the buffer. As the result, subsequent connection reads won't see that byte. - See: https://github.com/golang/go/issues/27408 - The fix is taken from : https://github.com/k3s-io/k3s/pull/6216 ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>	2024-10-03 18:27:15 +02:00
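The read-from-buffer, write-to-connection rule from the commit above can be sketched in Go as follows. This is an illustration and not the test code from the repository: `bufferedConn` and `readThroughBuffer` are hypothetical names, and `net.Pipe` stands in for a hijacked connection whose first bytes were already pulled into a `bufio.Reader`.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
)

// bufferedConn reads through a bufio.Reader, which may already hold bytes
// consumed from the socket, while writes go directly to the embedded net.Conn.
type bufferedConn struct {
	net.Conn
	r *bufio.Reader
}

// Read overrides net.Conn.Read so that buffered bytes are not lost.
func (c *bufferedConn) Read(p []byte) (int, error) {
	return c.r.Read(p)
}

// readThroughBuffer simulates a handler that has already moved data from the
// connection into its buffer, then reads everything via bufferedConn.
func readThroughBuffer() (string, error) {
	client, server := net.Pipe()
	defer server.Close()
	go func() {
		defer client.Close()
		client.Write([]byte("hello"))
	}()
	br := bufio.NewReader(server)
	// Peek forces the reader to pull data from the connection into its buffer.
	if _, err := br.Peek(1); err != nil {
		return "", err
	}
	conn := &bufferedConn{Conn: server, r: br}
	data, err := io.ReadAll(conn)
	return string(data), err
}

func main() {
	fmt.Println(readThroughBuffer())
}
```

With a real hijacked connection, the reader half of the `*bufio.ReadWriter` returned by `http.Hijacker.Hijack` would play the role of `br`.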
Zhu Jiekun	7bb8853a5c	feature: [vmagent] Add service discovery support for OVH Cloud VPS and dedicated server (#6160 ) ### Describe Your Changes related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6071 #### Added - Added service discovery support for OVH Cloud: - VPS. - Dedicated server. #### Docs - `CHANGELOG.md`, `sd_configs.md`, `vmagent.md` are updated. #### Note - Useful links: - OVH Cloud VPS API: https://eu.api.ovh.com/console/#/vps~GET - OVH Cloud Dedicated server API: https://eu.api.ovh.com/console/#/dedicated/server~GET - OVH Cloud SDK: https://github.com/ovh/go-ovh - Prometheus SD: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#ovhcloud_sd_config Tested on OVH Cloud VPS and dedicated server. <img width="1722" alt="image" src="https://github.com/VictoriaMetrics/VictoriaMetrics/assets/30280396/d3f0adc8-b0ef-423e-9379-8a9b9b0792ee"> <img width="1724" alt="image" src="https://github.com/VictoriaMetrics/VictoriaMetrics/assets/30280396/18b5b730-3512-4fc0-8b2c-f2450ac550fd"> --- Signed-off-by: Jiekun <jiekun@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-09-30 14:42:46 +02:00
hagen1778	8bb3f2fd43	lib/promscrape: make linter happy Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-09-24 15:12:55 +02:00
hagen1778	c7569dac50	lib/promscrape: temporary disable TestClientProxyReadOk This test is very flaky and prevents other tests from running in CI. Disabling this test should improve tests quality, since it isn't reliable anyway. There is a ticket to fix this test - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7062 Once fixed, this test should be uncommented. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-09-24 14:59:25 +02:00
Dmytro Kozlov	cbeb7d50e8	lib/promscrape: show only unhealthy targets if `show_only_unhealthy` filter is enabled (#6960 ) ### Describe Your Changes It is better to show only unhealthy targets instead of all of them when `show_only_unhealthy` filter is enabled. Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3536 ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>	2024-09-24 12:18:24 +02:00
Zhu Jiekun	c193e6d43e	lib/discovery/azure: fix host check in next link in Azure SD (#6915 ) Previous bugfix at `49f63b2` only partially fixed pagination host validation error. Before this fix it was: ``` unexpected nextLink host \"management.azure.com\", expecting \"https://management.azure.com\" ``` Now we only check the `Host` without schema. However, when Azure respond `nextLink` in `Host:Port` format, the `nextLink` check will fail: ``` unexpected nextLink host \"management.azure.com:443\", expecting \"management.azure.com\" ``` This pull request further relaxes the checks by only checking the `Hostname`. --- related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6912	2024-09-05 16:48:09 +02:00
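The difference between the two checks in the commit above comes down to `url.URL.Host` versus `url.URL.Hostname()` in Go's standard library. The sketch below is an illustration with hypothetical function names and an invented path, not the code from the Azure discovery package.

```go
package main

import (
	"fmt"
	"net/url"
)

// sameHost compares the Host fields, which include an explicit port if present.
func sameHost(apiURL, nextLink string) (bool, error) {
	a, n, err := parseBoth(apiURL, nextLink)
	if err != nil {
		return false, err
	}
	return a.Host == n.Host, nil
}

// sameHostname compares only the hostnames, ignoring scheme and port.
func sameHostname(apiURL, nextLink string) (bool, error) {
	a, n, err := parseBoth(apiURL, nextLink)
	if err != nil {
		return false, err
	}
	return a.Hostname() == n.Hostname(), nil
}

// parseBoth parses both URLs and returns the first parsing error, if any.
func parseBoth(x, y string) (*url.URL, *url.URL, error) {
	a, err := url.Parse(x)
	if err != nil {
		return nil, nil, err
	}
	b, err := url.Parse(y)
	if err != nil {
		return nil, nil, err
	}
	return a, b, nil
}

func main() {
	api := "https://management.azure.com"
	next := "https://management.azure.com:443/example?page=2" // hypothetical path
	fmt.Println(sameHost(api, next))
	fmt.Println(sameHostname(api, next))
}
```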
Roman Khavronenko	f586082520	attempt to fix flaky TestClientProxyReadOk (#6899 ) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-08-30 13:23:32 +02:00
Zhu Jiekun	e97e966f82	lib/promrelabel: follow-up for `8958cecad6` In the previous commit `8958cecad6` the default ports (80/443) were removed for both the `scrapeURL` and `instance` label values for those targets without a port in `__address__`. Different values in the `instance` label generate new time series. This commit reverts the changes made to the `instance` label. Now, for those targets: - `scrapeURL` will remain unchanged. - The `instance` label value will include the default port. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6792	2024-08-27 13:04:26 +02:00
Nikolay	9feee15493	lib/promscrape: fixes proxy authorization (#6783 ) * Adds a custom dial func for HTTP-Connect and socks5 proxy tunnels. The standard golang http.Transport exposes the GetProxyConnectHeader function, but it doesn't allow using a separate tls config for the proxy. It is also not possible to enforce HTTP-Connect with the standard http lib. * For http scrape targets, the http.Transport.Proxy function must be used by default, since it has a special case with full uri forward. * Adds proxy.URL json methods that allow properly copying internal fields, like User/Password. It should fix the bug with proxy_url, when credentials specified in the URL were ignored. * Adds tests for scrape client proxy requests related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6771	2024-08-19 22:31:18 +02:00
Zhu Jiekun	723d834c1a	lib/promrelabel: stop adding default port 80/443 to address label * It was necessary to add default ports for fasthttp client. After migration to the std.httpclient it's no longer needed. * An additional configuration is required at proxy servers with implicitly set 80/443 ports to the host header (such as HAProxy). It's expected that after upgrade the `__address__` label may change. But it should be a rare case. 80/443 ports are not widely used in the monitoring ecosystem. And it shouldn't have much impact. Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6792 Co-authored-by: Nikolay <nik@victoriametrics.com>	2024-08-19 22:28:49 +02:00
Zhu Jiekun	9e2bd82376	app/vmagent: fixes azure service discovery pagination The Azure API response with a link to the next page was incorrectly validated. Validation used the url.Host value to match the configured API URL. https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6784	2024-08-09 15:22:47 +02:00
Anzor	994796367b	app/vmagent: read __sample_limit__ from labels (#6665 ) (#6666 ) By introducing this feature, users will have the ability to customize the sampleLimit parameter on a per-target basis, providing more flexibility and control over the job execution behavior.	2024-08-07 09:36:14 +02:00
Aliaksandr Valialkin	9c4b0334f2	all: consistently use stringsutil.JSONString() for formatting JSON strings with fmt.* functions instead of using "%q" formatter The %q formatter may result in incorrectly formatted JSON string if the original string contains special chars such as \x1b . They must be encoded as \u001b , otherwise the resulting JSON string cannot be parsed by JSON parsers. This is a follow-up for `c0caa69939` See https://github.com/VictoriaMetrics/victorialogs-datasource/issues/24	2024-07-17 13:52:13 +02:00
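The `%q` problem described in the commit above can be reproduced with the Go standard library alone. In this sketch `encoding/json` is used as a stand-in for `stringsutil.JSONString()`, whose implementation is not shown in the commit message; the helper names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// quoteGo formats s with the %q verb, which follows Go string-literal syntax.
func quoteGo(s string) string {
	return fmt.Sprintf("%q", s)
}

// quoteJSON formats s as a JSON string literal via encoding/json
// (a stand-in here, not the repository's stringsutil.JSONString).
func quoteJSON(s string) string {
	b, err := json.Marshal(s)
	if err != nil {
		panic(err) // marshaling a plain string is not expected to fail
	}
	return string(b)
}

// isValidJSON reports whether s is a syntactically valid JSON value.
func isValidJSON(s string) bool {
	return json.Valid([]byte(s))
}

func main() {
	s := "a\x1bb"
	fmt.Println(quoteGo(s), isValidJSON(quoteGo(s)))
	fmt.Println(quoteJSON(s), isValidJSON(quoteJSON(s)))
}
```

`%q` emits the escape `\x1b`, which JSON does not define, while the JSON encoder emits `\u001b`.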
Aliaksandr Valialkin	58a757cd01	lib: consistently use regexp.Regexp.ReplaceAllLiteralString instead of regexp.Regexp.ReplaceAllString in places where the replacement cannot contain matching group placeholders	2024-07-17 12:41:54 +02:00
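A short Go sketch of the difference between the two `regexp` methods named in the commit above. The pattern and strings are invented for illustration: `ReplaceAllString` expands `$`-placeholders in the replacement, while `ReplaceAllLiteralString` inserts the replacement verbatim.

```go
package main

import (
	"fmt"
	"regexp"
)

var priceRe = regexp.MustCompile(`PRICE`)

// withExpansion treats "$10" in repl as a reference to capture group 10,
// which does not exist in priceRe and therefore expands to an empty string.
func withExpansion(src, repl string) string {
	return priceRe.ReplaceAllString(src, repl)
}

// literal inserts repl as-is, without interpreting "$" sequences.
func literal(src, repl string) string {
	return priceRe.ReplaceAllLiteralString(src, repl)
}

func main() {
	fmt.Printf("%q\n", withExpansion("cost: PRICE", "$10"))
	fmt.Printf("%q\n", literal("cost: PRICE", "$10"))
}
```

When the replacement is a fixed string with no intended placeholders, the literal variant avoids this kind of accidental expansion.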
Aliaksandr Valialkin	4304950391	lib/promscrape/discovery/yandexcloud: follow-up for `070abe5c71` - Obtain IAM token via GCE-like API instead of Amazon EC2 IMDSv2 API, since it looks like IMDSv2 API isn't supported by Yandex Cloud according to https://yandex.cloud/en/docs/security/standard/authentication#aws-token : > So far, Yandex Cloud does not support version 2, so it is strongly recommended > to technically disable getting a service account token via the Amazon EC2 metadata service. - Try obtaining IAM token via GCE-like API at first and then fall back to the deprecated Amazon EC2 IMDSv1. This should prevent auth errors for instances with disabled GCE-like auth API. This addresses @ITD27M01 concern at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5513#issuecomment-1867794884 - Clarify the description of the change at docs/CHANGELOG.md , add reference to the related issue. P.S. This change wasn't tested in prod because I have no access to Yandex Cloud. It is recommended to test this change by @ITD27M01 and @vmazgo , who filed the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5513 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6524	2024-07-16 17:58:40 +02:00
Aliaksandr Valialkin	57000f5105	lib/promscrape: follow-up for `1e83598be3` - Clarify that the -promscrape.maxScrapeSize value is used for limiting the maximum scrape size if max_scrape_size option isn't set at https://docs.victoriametrics.com/sd_configs/#scrape_configs - Fix query example for scrape_response_size_bytes metric at https://docs.victoriametrics.com/vmagent/#automatically-generated-metrics - Mention about max_scrape_size option at the -help description for -promscrape.maxScrapeSize command-line flag - Treat zero value for max_scrape_size option as 'no scrape size limit' - Change float64 to int type for scrapeResponseSize struct fields and function args, since response size cannot be fractional - Optimize isAutoMetric() function a bit - Sort auto metrics in alphabetical order in isAutoMetric() and in scrapeWork.addAutoMetrics() functions for better maintainability in the future Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6434 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6429	2024-07-16 12:38:21 +02:00
Aliaksandr Valialkin	a468a6e985	lib/{httputils,netutil}: move httputils.GetStatDialFunc to netutil.NewStatDialFunc - Rename GetStatDialFunc to NewStatDialFunc, since it returns new function with every call - NewStatDialFunc isn't related to http in any way, so it must be moved from lib/httputils to lib/netutil - Simplify the implementation of NewStatDialFunc by removing sync.Map from there. - Use netutil.NewStatDialFunc at app/vmauth and lib/promscrape/discoveryutils - Use gauge instead of counter type for *_conns metric This is a follow-up for `d7b5062917` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6299	2024-07-15 23:02:34 +02:00
Aliaksandr Valialkin	3c02937a34	all: consistently use 'any' instead of 'interface{}' 'any' type is supported starting from Go1.18. Let's consistently use it instead of 'interface{}' type across the code base, since `any` is easier to read than 'interface{}'.	2024-07-10 00:20:37 +02:00
Aliaksandr Valialkin	a9525da8a4	lib: consistently use f-tests instead of table-driven tests This makes easier to read and debug these tests. This also reduces test lines count by 15% from 3K to 2.5K See https://itnext.io/f-tests-as-a-replacement-for-table-driven-tests-in-go-8814a8b19e9e While at it, consistently use t.Fatal* instead of t.Error, since t.Error usually leads to more complicated and fragile tests, while it doesn't bring any practical benefits over t.Fatal*.	2024-07-09 22:40:50 +02:00
Aliaksandr Valialkin	35b3b95cbc	lib/promscrape/discovery/vultr: follow-up after `17e3d019d2` - Sort the discovered labels in alphabetical order at https://docs.victoriametrics.com/sd_configs/#vultr_sd_configs - Rename VultrConfigs to VultrSDConfigs to be consistent with the naming for other SD configs. - Prepare query arg filters for `list instances API` at newAPIConfig() instead of passing them in a separate listParams struct. This simplifies the code a bit. - Return error when bearer token isn't set at vultr_sd_configs, since this token is mandatory according to https://docs.victoriametrics.com/sd_configs/#vultr_sd_configs - Remove unused fields from the parsed response from Vultr list instances API in order to simplify the code a bit. - Remove double logging of errors inside getInstances() function, since these errors must be already logged by the caller. - Simplify tests, so they are easier to maintain. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6041 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6068	2024-07-05 17:40:03 +02:00
Aliaksandr Valialkin	d8c7cc266b	lib/promscrape: use prompbmarshal.MustParsePromMetrics function at parseData() test function The prompbmarshal.MustParsePromMetrics function has been added in the commit `cc4d57d650`	2024-07-03 16:08:13 +02:00
Aliaksandr Valialkin	bb00bae353	Revert "Exemplar support (#5982 )" This reverts commit `5a3abfa041`. Reason for revert: exemplars aren't in wide use because they have numerous issues which prevent their adoption (see below). Adding support for exemplars into VictoriaMetrics introduces non-trivial code changes. These code changes need to be supported forever once the release of VictoriaMetrics with exemplar support is published. That's why I don't think this is a good feature, even though the source code of the reverted commit is of excellent quality. See https://docs.victoriametrics.com/goals/ . Issues with Prometheus exemplars: - Prometheus still has only experimental support for exemplars after more than three years since they were introduced. It stores exemplars in memory, so they are lost after Prometheus restart. This doesn't look like production-ready feature. See `0a2f3b3794/content/docs/instrumenting/exposition_formats.md (L153-L159)` and https://prometheus.io/docs/prometheus/latest/feature_flags/#exemplars-storage - It is very non-trivial to expose exemplars alongside metrics in your application, since the official Prometheus SDKs for metrics' exposition ( https://prometheus.io/docs/instrumenting/clientlibs/ ) either have very hard-to-use API for exposing histograms or do not have this API at all. For example, try figuring out how to expose exemplars via https://pkg.go.dev/github.com/prometheus/client_golang@v1.19.1/prometheus . - It looks like exemplars are supported for Histogram metric types only - see https://pkg.go.dev/github.com/prometheus/client_golang@v1.19.1/prometheus#Timer.ObserveDurationWithExemplar . Exemplars aren't supported for Counter, Gauge and Summary metric types. - Grafana has very poor support for Prometheus exemplars. It looks like it supports exemplars only when the query contains histogram_quantile() function. 
It queries exemplars via special Prometheus API - https://prometheus.io/docs/prometheus/latest/querying/api/#querying-exemplars - (which is still marked as experimental, btw.) and then displays all the returned exemplars on the graph as special dots. The issue is that this doesn't work in production in most cases when the histogram_quantile() is calculated over thousands of histogram buckets exposed by big number of application instances. Every histogram bucket may expose an exemplar on every timestamp shown on the graph. This makes the graph unusable, since it is literally filled with thousands of exemplar dots. Neither the Prometheus API nor Grafana provides the ability to filter out unneeded exemplars. - Exemplars are usually connected to traces. While traces are good for some I doubt exemplars will become production-ready in the near future because of the issues outlined above. Alternative to exemplars: Exemplars are marketed as a silver bullet for the correlation between metrics, traces and logs - just click the exemplar dot on some graph in Grafana and instantly see the corresponding trace or log entry! This doesn't work as expected in production as shown above. Are there better solutions, which work in production? Yes - just use time-based and label-based correlation between metrics, traces and logs. Assign the same `job` and `instance` labels to metrics, logs and traces, so you can quickly find the needed trace or log entry by these labels on the time range with the anomaly on metrics' graph.	2024-07-03 15:30:21 +02:00
Andrii Chubatiuk	070abe5c71	added IMDSv2 for YC SD (#6524 ) ### Describe Your Changes Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5513 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).	2024-06-26 18:03:21 +02:00
Andrii Chubatiuk	1e83598be3	app/vmagent: add max_scrape_size to scrape config (#6434 ) Related to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6429 ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com>	2024-06-20 13:58:42 +02:00
Zakhar Bessarab	34071ac660	lib/promscrape: increase default value for promscrape.maxDroppedTargets to 10_000 (#6459 ) ### Describe Your Changes This limit can be increased since after `4513893ead` tracking of dropped targets uses much less memory per entry. See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6381#issuecomment-2156708228 ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>	2024-06-12 16:34:18 +02:00
Hui Wang	d7b5062917	app/vmalert: support DNS SRV record in `-remoteWrite.url` (#6299 ) part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6053, supports [DNS SRV](https://en.wikipedia.org/wiki/SRV_record) address in `-remoteWrite.url` command-line option.	2024-05-22 10:52:51 +02:00
Zhu Jiekun	17e3d019d2	feature: [vmagent] Add service discovery support for Vultr (#6068 ) ### Describe Your Changes related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6041 #### Added - Added service discovery support for Vultr. #### Docs - `CHANGELOG.md`, `sd_configs.md`, `vmagent.md` are updated. #### Note - Useful links: - Vultr API: https://www.vultr.com/api/#tag/instances/operation/list-instances - Vultr client SDK: https://github.com/vultr/govultr - Prometheus SD: https://github.com/prometheus/prometheus/tree/main/discovery/vultr --- ### Checklist The following checks are mandatory: - [X] I have read the [Contributing Guidelines](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/CONTRIBUTING.md) - [x] All commits are signed and include `Signed-off-by` line. Use `git commit -s` to include `Signed-off-by` your commits. See this [doc](https://git-scm.com/book/en/v2/Git-Tools-Signing-Your-Work) about how to sign your commits. - [x] Tests are passing locally. Use `make test` to run all tests locally. - [x] Linting is passing locally. Use `make check-all` to run all linters locally. Further checks are optional for External Contributions: - [X] Include a link to the GitHub issue in the commit message, if issue exists. - [x] Mention the change in the [Changelog](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/docs/CHANGELOG.md). Explain what has changed and why. If there is a related issue or documentation change - link them as well. Tips for writing a good changelog message:: * Write a human-readable changelog message that describes the problem and solution. * Include a link to the issue or pull request in your changelog message. * Use specific language identifying the fix, such as an error message, metric name, or flag name. * Provide a link to the relevant documentation for any new features you add or modify. 
- [ ] After your pull request is merged, please add a message to the issue with instructions for how to test the fix or try the feature you added. Here is an [example](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4048#issuecomment-1546453726) - [x] Do not close the original issue before the change is released. Please note, in some cases Github can automatically close the issue once PR is merged. Re-open the issue in such case. - [x] If the change somehow affects public interfaces (a new flag was added or updated, or some behavior has changed) - add the corresponding change to documentation. Signed-off-by: Jiekun <jiekun.dev@gmail.com>	2024-05-08 10:01:48 +02:00
Ted Possible	5a3abfa041	Exemplar support (#5982 ) This code adds Exemplars to VMagent and the promscrape parser adhering to OpenMetrics Specifications. This will allow forwarding of exemplars to Prometheus and other third party apps that support OpenMetrics specs. --------- Signed-off-by: Ted Possible <ted_possible@cable.comcast.com>	2024-05-07 12:09:44 +02:00
Aliaksandr Valialkin	7531e9084a	all: use clear() built-in Go function for clearing []prompbmarshal.TimeSeries and []prompbmarshal.Label slices This makes the code a bit clearer.	2024-04-20 21:00:03 +02:00


726 commits