Commit graph

1235 commits

Author SHA1 Message Date
Aliaksandr Valialkin
2825a1e86d lib/promscrape: add follow_redirect option to scrape_configs section like Prometheus does
See https://github.com/prometheus/prometheus/pull/8546
2021-04-02 21:20:37 +03:00
Aliaksandr Valialkin
245eba8896 lib/promscrape/discovery/kubernetes: properly track objects with the same names in multiple namespaces
This is a follow-up for 12e4785fe8

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:46:34 +03:00
Aliaksandr Valialkin
eee860f83d lib/promscrape/discovery/kubernetes: properly discover targets in multiple namespaces
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:29:24 +03:00
Nikolay
00157f3ec6 Adds aws ECS credentials support (#1175) 2021-04-02 13:11:03 +03:00
Aliaksandr Valialkin
512addc608 app/{vminsert,vmagent}: add -sortLabels command-line option for sorting time series labels before ingesting them in the storage
This option can be useful when samples for the same time series are ingested with distinct order of labels.
For example, metric{k1="v1",k2="v2"} and metric{k2="v2",k1="v1"}.
2021-03-31 23:27:21 +03:00
Aliaksandr Valialkin
ae1c653d55 lib/storage: reduce memory usage when ingesting samples for the same time series with distinct order of labels 2021-03-31 21:22:40 +03:00
Aliaksandr Valialkin
887e67f13b lib/uint64set: improve Set.Has() performance scalability on multi-CPU system
Do not update bucket32.hint on Set.Has() call, since it leads to memory ping-pong between CPU cores multi-CPU system
2021-03-29 12:34:19 +03:00
Aliaksandr Valialkin
940a547116 lib/storage: do not update b.nextIdx if no samples are removed because of retention 2021-03-29 12:13:38 +03:00
Aliaksandr Valialkin
7d87d42a91 lib/promscrape/discovery/kubernetes: typo fix in error message 2021-03-26 12:46:33 +02:00
Aliaksandr Valialkin
a920e71809 lib/promscrape/discovery/kubernetes: properly handle too old resource version error message from Kubernetes watch API 2021-03-26 12:28:35 +02:00
Aliaksandr Valialkin
9c2be144cf app/vmselect: log the metric which trigger rollup result cache reset
This should help finding the source of stale metrics
2021-03-25 21:32:28 +02:00
Aliaksandr Valialkin
f971fe86cd lib/storage: tune loopsCountPerMetricNameMatch according to production workload 2021-03-25 13:48:17 +02:00
Aliaksandr Valialkin
6b1f807418 app/vmagent: add -promscrape.consul.waitTime command-line flag for configuring Consul service discovery wait time
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1144
2021-03-23 19:34:12 +02:00
Aliaksandr Valialkin
9947c65df3 lib/storage: do not reload metricName for the same metricID in Search.NextMetricBlock
This should speed up Search.NextMetricBlock a bit
2021-03-23 17:59:34 +02:00
Nikolay
19a40faf8e changes consul_service label value (#1143)
according to prometheus discovery.
 It should mitigate issue with case sensetive services
https://github.com/hashicorp/consul/issues/5707
2021-03-23 15:37:06 +02:00
Aliaksandr Valialkin
12ca0efc19 lib/storage: respect the deadline passed to Storage.SearchMetricNames 2021-03-22 23:03:00 +02:00
Aliaksandr Valialkin
40e47935e7 lib/storage: improve Search.NextMetricBlock performance by using MetricID->MetricName cache 2021-03-22 23:02:59 +02:00
Aliaksandr Valialkin
1618b0ca6d lib/storage: tune loopsCountPerMetricNameMatch 2021-03-22 12:57:54 +02:00
Aliaksandr Valialkin
7503111feb lib/storage: small code simplification after 6cee5338b2 2021-03-18 15:22:39 +02:00
Aliaksandr Valialkin
4443254fb9 lib/storage: prevent from infinite loop if {__graphite__="..."} filter matches a metric name with *, [ or { chars
The idea has been borrowed from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1137
2021-03-18 14:57:39 +02:00
Aliaksandr Valialkin
e03233f441 lib/fs: reduce the frequency of failed to remove directory ... due to NFS lock log warnings
Log `failed to remove directory ... due to NFS lock` warning only if the directory cannot be removed in one second.
2021-03-18 13:23:43 +02:00
Aliaksandr Valialkin
b859fe7879 lib/storage: faster move heavy filters to the end of list 2021-03-17 15:11:56 +02:00
Aliaksandr Valialkin
5e77a939c2 all: make go vet happy 2021-03-17 00:48:44 +02:00
Aliaksandr Valialkin
41fe707bec lib/storage: limit loops count in order to reduce max CPU usage during filter search 2021-03-17 00:48:44 +02:00
Aliaksandr Valialkin
e2a0c8bd72 lib/storage: do not modify filterLoopsCount stats with loopsCount stats
Such a modification can result in incorrect filter sorting later
2021-03-17 00:48:44 +02:00
Aliaksandr Valialkin
b997f4a418 all: make golangci-lint happy after the commit 6378205415 2021-03-17 00:24:31 +02:00
Aliaksandr Valialkin
8005ba26b9 lib/netutil: enable IPv6 UDP listening if -enableTCP6 command-line flag is passed to VictoriaMetrics
This is a follow-up for 18cfc4be7b

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1131
2021-03-17 00:19:30 +02:00
Nikolay
b293f81eb8 Adds udp6 support for ingest servers (#1134)
with flag -enableUDP6  https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1131
2021-03-17 00:19:29 +02:00
Aliaksandr Valialkin
1cfe3f872f lib/uint64set: optimize bucket16.addMulti a bit 2021-03-16 21:08:38 +02:00
Aliaksandr Valialkin
727ded9d4e lib/storage: time series search optimization according to production workload profiling
Do not pass filter metric ids to getMetricIDsForTagFilter, since it has been appeared that this slows down
the function by multiple times when it finds big number of metricIDs (tens of millions).
2021-03-16 20:08:43 +02:00
Aliaksandr Valialkin
f4a44d6c0d lib/storage: further tuning for time series search 2021-03-16 18:47:29 +02:00
Aliaksandr Valialkin
d074326970 app/vmstorage: add -logNewSeries command-line flag for determining the source of series churn rate 2021-03-15 22:40:28 +02:00
Aliaksandr Valialkin
c4d0c20de2 lib/influxutils: return response compatible with InfluxDB 1.8.4
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1124
2021-03-15 22:20:38 +02:00
f41gh7
b7c9f3676e typo fix 2021-03-15 22:20:37 +02:00
Aliaksandr Valialkin
e2717d84c0 all: various fixes in command-line flag descriptions 2021-03-15 22:03:49 +02:00
Aliaksandr Valialkin
776b8b32ca app/{vminsert,vmagent}: a follow-up for b1aa8c3d8f
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1124
2021-03-15 22:03:49 +02:00
Aliaksandr Valialkin
fc902734d9 lib/storage: further tuning for time series selector code 2021-03-15 20:32:37 +02:00
Aliaksandr Valialkin
dc70e7de76 lib/uint64set: reduce the size of bucket16 by storing smallPool by pointer.
This reduces CPU time spent on bucket16 copying inside bucket13.addBucketAtPos.
2021-03-15 17:27:03 +02:00
Aliaksandr Valialkin
cc6b1a5941 lib/uint64set: optimize Set.AddMulti for large sorted sets 2021-03-15 17:10:54 +02:00
Aliaksandr Valialkin
f023e92f1a lib/uint64set: optimize bucket16.add and bucket16.addMulti a bit 2021-03-15 16:58:24 +02:00
Aliaksandr Valialkin
1c26020080 lib/storage: tune per-day index search 2021-03-15 13:36:36 +02:00
Aliaksandr Valialkin
7f52aae20c lib/promscrape: an attempt to reduce memory usage when vmagent scrapes targets with varying number of metrics
Do not cache too big byte buffers and too big writeRequestCtx objects,
since it is cheaper to re-create them instead of wasting RAM for their caching.

This reverts 7f6f350ee1
2021-03-15 11:49:29 +02:00
Aliaksandr Valialkin
33cd6c26d3 lib/promscrape: return back the logic for flushing big buffers to storage from the commit 3fd8653b40
This should reduce memory usage when vmagent scrapes targets with big number of metrics and `-promscrape.streamParse` isn't enabled
2021-03-14 22:25:37 +02:00
Aliaksandr Valialkin
894246176f lib/promscrape/discovery/kubernetes: do not start object watcher until initial objects are loaded 2021-03-14 21:56:16 +02:00
Aliaksandr Valialkin
9e55db4a53 lib/promscrape: retry service discovery in a few seconds if it starts returning 0 targets
This should reduce recovery time from temporary issues during service discovery
2021-03-14 21:56:16 +02:00
Aliaksandr Valialkin
3b46ae1c05 lib/promscrape: remove duplicate target word in error message 2021-03-14 21:56:16 +02:00
Aliaksandr Valialkin
b0b28eeb93 lib/promscrape/discovery/kubernetes: further optimize kubernetes service discovery for the case with many scrape jobs
Do not re-calculate labels per each scrape job - reuse them instead for scrape jobs with identical Kubernetes role
2021-03-14 21:16:41 +02:00
Aliaksandr Valialkin
620f05cd2c lib/promscrape/discovery: fixes after 133b288681
- Removed a deadlock in addAPIWatcher
- Do not create unused ScrapeWork objects
- Do not spend CPU resources on creating objectByKey map in addAPIWatcher

This work is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1125
2021-03-13 15:22:38 +02:00
Aliaksandr Valialkin
54f902467d lib/proxy: there is no need in cloning tlsCfg, which has been created two lines above 2021-03-12 10:48:01 +02:00
Aliaksandr Valialkin
72a8fa484b lib/proxy: set proxy address in tls.Config.ServerName instead of the target address
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116
2021-03-12 10:41:25 +02:00
Aliaksandr Valialkin
60e0280a94 lib/promscrape: add ability to configure proxy options via proxy_tls_config, proxy_basic_auth, proxy_bearer_token and proxy_bearer_token_file
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116
2021-03-12 03:36:11 +02:00
Aliaksandr Valialkin
b2732575f7 lib/storage: further tune filters sorting logic 2021-03-12 00:51:35 +02:00
Aliaksandr Valialkin
8fc29ffc67 lib/promscrape/discovery/kubernetes: use a single watcher per apiURL
Previously multiple scrape jobs could create multiple watchers for the same apiURL. Now only a single watcher is used.
This should reduce load on Kubernetes API server when many scrape job configs use Kubernetes service discovery.
2021-03-11 17:04:14 +02:00
Aliaksandr Valialkin
8b8d4cbcfe lib/proxy: do not show inline basic auth passwords when logging errors related to proxy_url 2021-03-11 13:44:14 +02:00
Aliaksandr Valialkin
41f641b132 lib/promscrape/discovery/kubernetes: localize Bookmark parsing code
This is a follow-up for e772d1c920
2021-03-11 13:08:56 +02:00
Aliaksandr Valialkin
6c9cd3f7c1 lib/promscrape/discovery/kubernetes: reduce load on Kubernetes API server by using watch bookmarks
This allows continuing object watch from the last bookbark instead of reloading all the objects
on watch errors or timeouts.

See https://kubernetes.io/docs/reference/using-api/api-concepts/#watch-bookmarks
2021-03-10 15:08:40 +02:00
Aliaksandr Valialkin
bd8b7a88a7 lib/httpserver: export vm_available_memory_bytes and vm_available_cpu_cores metrics
These metrics are useful for tracking the available memory and CPU cores for VictoriaMetrics apps.
2021-03-10 12:08:26 +02:00
Aliaksandr Valialkin
e15f3f4f2a lib/proxy: pass proxy hostname in Host header of the CONNECT request
This should resolve the following issue when connecting to tls proxy:

  cannot validate certificate for ... because it doesn't contain any IP SANs
2021-03-09 20:41:18 +02:00
Aliaksandr Valialkin
9d8223eafb lib/proxy: set missing ServerName in TLS config for proxy_url.
While at it, allow setting Proxy-Authorization for `proxy_url` via `basic_auth` and `bearer_token` configs.
2021-03-09 19:01:14 +02:00
Nikolay
1310f84122 Changes tlsConfig init for proxy connections (#1121)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116
2021-03-09 19:01:13 +02:00
Aliaksandr Valialkin
0554430d7e lib/promscrape: apply sample_limit after metric relabeling is applied as Prometheus does
See the description for `sample_limit` option from Prometheus docs:

Per-scrape limit on number of scraped samples that will be accepted.
If more than this number of samples are present after metric relabeling
the entire scrape will be treated as failed. 0 means no limit.

https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
2021-03-09 15:52:41 +02:00
Aliaksandr Valialkin
7b66c8cbf8 lib/promscrape/discovery/kubernetes: remove too verbose logs about starting and stopping the watchers
Log the number of objects loaded per each watch url

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-09 15:07:12 +02:00
John Belmonte
edf39aa225 spelling fix: adjacent (#1115) 2021-03-09 09:19:16 +02:00
Aliaksandr Valialkin
502fab797a lib/promscrape: add scrape_offset option to scrape_config
This option can be used for specifying the particular offset per each scrape interval for target scraping
2021-03-08 11:59:32 +02:00
Aliaksandr Valialkin
c4a0bd5eac lib/storage: go fmt 2021-03-08 11:59:31 +02:00
Aliaksandr Valialkin
c76a904bb0 lib/storage: tune loopsCount estimations in getMetricIDsForTagFilterSlow
The adjusted estmations give up to 2x lower median response times on 200qps /api/v1/query_range workload
2021-03-07 21:17:48 +02:00
Aliaksandr Valialkin
c04505e585 lib/promscrape/discovery/kubernetes: reduce memory usage further when big number of scrape jobs are configured for the same kubernetes_sd_config role
Serialize reloading per-role objects, so they don't occupy too much memory when objects for many scrape jobs are simultaneously refreshed.
Do not reload per-role objects if they were already refreshed by concurrent goroutines. This should reduce load on Kubernetes API server
when big number of scrape jobs are configured for the same Kubernetes role.

This is a follow-up for 17b87725ed

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-07 20:03:22 +02:00
Aliaksandr Valialkin
175466bb41 lib/decimal: prevent exponent overflow when processing values close to zero
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1114
2021-03-05 18:53:41 +02:00
Aliaksandr Valialkin
5807ff57f3 lib/promscrape/discovery/kubernetes: reduce memory usage when Kubernetes service discovery is configured on a big number of scrape jobs
Previously vmagent was creating a separate Kubernetes object cache per each scrape job.
This could result in increased memory usage when monitoring a Kubernetes cluster with big number of objects (pods / nodes / services, etc.)
as seen at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113

Now it uses a shared map of scrape objects across multiple scrape jobs.
2021-03-05 17:32:33 +02:00
Aliaksandr Valialkin
92ddb8f197 lib/promscrape/discovery/kubernetes: move apiWatcher code to a separate file 2021-03-05 17:32:32 +02:00
Aliaksandr Valialkin
02c0959380 lib/promscrape: make cluster membership calculations consistent across 32-bit and 64-bit architectures 2021-03-05 09:06:08 +02:00
Aliaksandr Valialkin
133fb9fc00 lib/promscrape: add -promscrape.cluster.replicationFactor command-line flag for replicating scrape targets among vmagent instances in the cluster 2021-03-04 10:21:27 +02:00
Aliaksandr Valialkin
bae7a1b47a lib/promscrape/discovery/kubernetes: fix tests after e154f4a644 2021-03-03 22:42:04 +02:00
Nikolay
7d92ef3acd Fix ingress discovery api (#1110) 2021-03-03 10:45:50 +02:00
Aliaksandr Valialkin
25f453ce1a lib/promscrape/discovery/kubernetes: properly check for nil pointer inside interface
See https://mangatmodi.medium.com/go-check-nil-interface-the-right-way-d142776edef1

This fixes a panic when the ScrapeWork is filtered out in swcFunc.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1108
2021-03-03 10:42:54 +02:00
Aliaksandr Valialkin
fea27b9845 lib/promscrape: go fmt 2021-03-02 21:20:23 +02:00
Aliaksandr Valialkin
c67a07b469 lib/handshake: log read/write operation duration on connection errors
This improve debuggability of network errors
2021-03-02 21:20:20 +02:00
Aliaksandr Valialkin
c8dde1fd6b lib/storage: typo fix: umarshal -> unmarshal 2021-03-02 20:48:44 +02:00
Aliaksandr Valialkin
3fbe2bf1c8 lib/promscrape: pre-allocate space for a map in mergeLabels
This should reduce the number of memory allocations when discovering big number of targets
2021-03-02 18:42:44 +02:00
Aliaksandr Valialkin
ac5c47a9f5 lib/promscrape/discovery: properly track vm_promscrape_discovery_kubernetes_objects_removed_total metric 2021-03-02 18:33:29 +02:00
Aliaksandr Valialkin
62d7e07ff7 lib/promrelabel: remove unneded optimizations for labeldrop and labelkeep actions
These optimizations may slow down code execution by matching the same label against regexp two times instead of a single time
2021-03-02 18:01:08 +02:00
Aliaksandr Valialkin
f9c1fe3852 lib/promscrape/discovery/kubernetes: cache ScrapeWork objects as soon as the corresponding k8s objects are changed
This should reduce CPU usage and memory usage when Kubernetes contains tens of thousands of objects
2021-03-02 16:44:19 +02:00
Aliaksandr Valialkin
f686174329 lib/promscrape/discovery/ec2: follow-up after f6114345de 2021-03-02 13:47:35 +02:00
Nikolay
fd8ca7df50 Adds webIndentity token for aws (#1099)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1080
2021-03-02 13:47:34 +02:00
Aliaksandr Valialkin
e45c399467 lib/protoparser/prometheus: properly unescape label values in Prometheus exposition format
Unescape only `\n`, `\"` and `\\` sequences as Prometheus does. Other escape sequences shouldn't be unescaped.
2021-03-02 13:22:10 +02:00
Aliaksandr Valialkin
f4969a624d lib/protoparser/graphite: fix parsing of a Graphite line with empty tags such as foo; 1 2
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1100
2021-03-01 17:17:01 +02:00
Aliaksandr Valialkin
b89a4fac2f lib/promscrape/discovery/kubernetes: deflake tests; a follow-up for 05fb08713c 2021-03-01 14:31:44 +02:00
Aliaksandr Valialkin
c3bf72992f lib/promscrape: explicitly stop and cleanup service discovery routines when new config is read from -promscrape.config
This should reduce memory usage when `-promscrape.config` file frequently changes
2021-03-01 14:15:16 +02:00
Aliaksandr Valialkin
8af9370bf2 lib/promscrape: use target arg in ScrapeWork cache 2021-03-01 12:31:11 +02:00
Aliaksandr Valialkin
5328506c3e lib/promscrape: typo fix, which prevented from caching ScrapeWork entries 2021-03-01 12:13:13 +02:00
Aliaksandr Valialkin
d5058caccc lib/promscrape: add vm_promscrape_scrapework_cache_* metrics for tracking ScrapeWork cache effectiveness 2021-03-01 12:05:58 +02:00
Aliaksandr Valialkin
ee5d26a546 lib/httpserver: make make errcheck happy after the commit 9fc7726d84 2021-03-01 00:35:30 +02:00
Aliaksandr Valialkin
109bfaadad lib/promscrape: reduce CPU usage an memory allocations when constructing scrapeWorkKey 2021-02-28 22:30:36 +02:00
Aliaksandr Valialkin
9e644ef111 lib/httpserver: make sure the gzipResponseWriter.Write() is called on Flush() and Close() calls
This should fix the `http: superfluous response.WriteHeader call` issue

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1078
2021-02-28 19:23:26 +02:00
Aliaksandr Valialkin
9a2bf65134 lib/promscrape: add ability to spread scrape targets among multiple vmagent instances
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1084
2021-02-28 18:40:42 +02:00
Aliaksandr Valialkin
3e44d9947e lib/promscrape/discovery/kubernetes: properly account the number of objects when watcher is stopped
A follow-up for b21b110b7a
2021-02-28 17:06:49 +02:00
Aliaksandr Valialkin
0ef7a94056 lib/promscrape/discovery/kubernetes: add vm_promscrape_discovery_kubernetes_* metrics for monitoring internal state of k8s service discovery 2021-02-28 16:58:45 +02:00
Aliaksandr Valialkin
f52bdbe2a3 lib/promscrape/discovery/kubernetes: remove resourceVersionMatch=NotOlderThan query arg when watching for k8s object changes, since it cannot be used when watch=1 query arg is passed 2021-02-28 16:08:43 +02:00
Aliaksandr Valialkin
6f8866a40a lib/promscrape: fix possible deadlock in parallel execution of target relabeling 2021-02-28 16:05:32 +02:00
Aliaksandr Valialkin
5c9e657808 lib/promscrape/discovery/kubernetes: fix deadlock in startWatcherForURL
reloadObjects must be called without holding aw.mu lock
2021-02-28 15:25:33 +02:00
Aliaksandr Valialkin
e77f2f8630 lib/promscrape/discovery/kubernetes: typo fix after 241ffd1f3b 2021-02-28 15:15:27 +02:00
Aliaksandr Valialkin
82441537ff lib/promscrape/discovery/kubernetes: pre-populate labelsByKey in reloadObject() 2021-02-28 15:09:43 +02:00
Aliaksandr Valialkin
e003453941 lib/promscrape/discovery/kubernetes: compare sorted sets of labels in tests
This should deflake tests where the order of labels isn't stable
2021-02-28 14:12:32 +02:00
Aliaksandr Valialkin
6a21ef87b7 lib/promscrape: add missing startWatchersForRole() call at the beginning of apiWatcher.getLabelsForRole 2021-02-28 14:00:00 +02:00
Aliaksandr Valialkin
6d0e7fb8b0 lib/promscrape/discovery/kubernetes: reload k8s resources on every error
This is needed for obtaining fresh resourceVersion
2021-02-27 01:46:59 +02:00
Aliaksandr Valialkin
7f1302688f lib/fs: follow-up after f3a03c4164 2021-02-27 01:09:37 +02:00
Nikolay
d88fa5ebe4 Adds windows build (#1040)
* fixes windows compilation,
adds signal impl for windows,
adds free space usage for windows,
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1036

NOTE victoria metrics database still CANNOT work under windows system,
only vmagent is supported.
To completly port victoria metrics, you have to fix issues with separators,
parsing and posix file removall

* rollback separator

* Adds windows setInformation api,
it must behave like unix, need to test it.
changes procutil

* check for invlaid param

* Fixes posix delete semantic

* refactored a bit

* fixes openbsd build

* removed windows api call

* Fixes code after windows add

* Update lib/procutil/signal_windows.go

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-02-27 01:06:22 +02:00
Aliaksandr Valialkin
44975e28fe lib/promscrape: yet another typo fix after ed8441ec52 2021-02-26 23:36:34 +02:00
Aliaksandr Valialkin
c1b8729bd8 lib/fs: properly handle stale NFS file handle error during file deletion
This error can appear when -storageDataPath points to NFS volume and the given file has been already removed.
2021-02-26 23:24:46 +02:00
Aliaksandr Valialkin
50e74c439c lib/promscrape: typo fix after ed8441ec52 2021-02-26 23:04:40 +02:00
Aliaksandr Valialkin
fa3ce450fb lib/promscrape: cache ScrapeWork
This should reduce the time needed for updating big number of scrape targets.
2021-02-26 21:43:41 +02:00
Aliaksandr Valialkin
efcdf613c2 lib/promscrape/discovery/kubernetes: cache target labels
This should reduce CPU usage on repeated SDConfig.GetLabels() calls.
2021-02-26 20:24:29 +02:00
Aliaksandr Valialkin
22822feea3 lib/promscrape/discovery/kubernetes: errcheck fix 2021-02-26 19:09:12 +02:00
Aliaksandr Valialkin
dc8c045378 lib/promscrape: cleanup after 9b2246c29b
Main points:

* Revert changes outside lib/promscrape/discovery/kuberntes . These changes can be applied later in a separate commit
* Minimize changes in lib/promscrape/discovery/kubernetes compared to a93e644001
* Corner case fixes.
2021-02-26 19:09:12 +02:00
Nikolay
cf9262b01f vmagent kubernetes watch stream discovery. (#1082)
* started work on sd for k8s

* continue work on watch sd

* fixes

* continue work

* continue work on sd k8s

* disable gzip

* fixes typos

* log errror

* minor fix

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-02-26 19:09:12 +02:00
Aliaksandr Valialkin
4cfac70cde lib/promscrape: remove duplicate code a bit 2021-02-26 15:54:07 +02:00
Aliaksandr Valialkin
55eb983193 lib/promscrape: reduce processing time for big number of discovered targets by processing them in parallel 2021-02-26 12:48:48 +02:00
Aliaksandr Valialkin
f8baf1a76d lib/promrelabel: optimize labeldrop and labelkeep relabeling for prefix.* and prefix.+ regexps 2021-02-24 17:58:11 +02:00
Aliaksandr Valialkin
06676c8feb lib/storage: consistency renaming: durationsPerDateTagFilterCache -> loopsPerDateTagFilterCache 2021-02-23 15:50:08 +02:00
faceair
b1409a7413 lib/storage: correct tagfilter match cost (#1079) 2021-02-22 21:54:37 +02:00
Aliaksandr Valialkin
197ecca426 lib/promrelabel: add more optimizations for relabeling for common cases 2021-02-22 16:36:54 +02:00
Aliaksandr Valialkin
63c16c3fdf lib/promrelabel: optimize relabeling performance for common cases 2021-02-22 00:51:07 +02:00
Aliaksandr Valialkin
8b87398333 lib/promscrape: export vm_promscrape_target_relabel_duration_seconds metric 2021-02-21 23:21:35 +02:00
Aliaksandr Valialkin
587132555f lib/mergeset: reduce memory usage for inmemoryBlock by using more compact items representation
This also should reduce CPU time spent by GC, since inmemoryBlock.items don't have pointers now,
so GC doesn't need visiting them.
2021-02-21 22:09:10 +02:00
Aliaksandr Valialkin
bd3bcdc43c lib/storage: do not re-calculate stats for heavy tag filters
This should reduce the number of slow queries when stats for heavy tag filters was recalculated.
2021-02-21 21:43:37 +02:00
Aliaksandr Valialkin
b8a5ee2e93 lib/{mergeset,storage}: allow merging smaller number of small parts
While this may increase CPU and disk IO usage needed for background merge,
this also recudes CPU usage during queries in production. This is because
such queries tend to read recently added data and it is better to have lower number
of parts for such data in order to reduce CPU usage.

This partially reverts ebf8da3730
2021-02-21 21:43:37 +02:00
Aliaksandr Valialkin
34195218e1 lib/{mergeset,storage}: do not use pools for indexBlock and inmemoryBlock during their caching, since this results in higher memory usage in production without any performance gains 2021-02-21 21:43:37 +02:00
Aliaksandr Valialkin
f09b46b2b1 lib/promscrape: typo fix after the commit f26162ec99 2021-02-19 00:34:18 +02:00
Aliaksandr Valialkin
72eef964d9 app/vmagent: properly perform graceful shutdown, which was broken in the commit 1d1ba889fe
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1065
2021-02-19 00:34:17 +02:00
Aliaksandr Valialkin
502d0e2524 lib/promscrape: add scrape_align_interval config option into scrape config
This option allows aligning scrapes to a particular intervals.
2021-02-18 23:53:04 +02:00
Aliaksandr Valialkin
10ccb92e4d lib/storage: use composite index for a query with a name filter and negative filters 2021-02-18 18:57:45 +02:00
Aliaksandr Valialkin
418de71509 lib/storage: properly handle queries containing a filter on metric name plus any number of negative filters and zero non-negative filters
Example: `node_cpu_seconds_total{mode!="idle"}`
2021-02-18 18:33:05 +02:00
Aliaksandr Valialkin
628b8eb55e lib/storage: prevent from running identical heavy tag filters in concurrent queries when measuring the number of loops for such tag filter.
This should reduce CPU usage spikes when measuring the number of loops needed for heavy tag filters
2021-02-18 14:01:18 +02:00
Aliaksandr Valialkin
fd41f070db lib/storage: sort tag filters by the number of loops they need for the execution
This metric should work better than the filter execution duration, since it cannot be distorted
by concurrently running queries.
2021-02-18 12:52:29 +02:00
Aliaksandr Valialkin
9566015a36 Revert "lib/mergeset: tune lifetime for entries inside block caches"
This reverts commit 458c89324d.

Production testing revealed zero improvements for memory usage with reduced lifetime for entries in block caches.
2021-02-17 20:42:15 +02:00
Aliaksandr Valialkin
b4c7d5992b lib/storage: move composite filters to the top during sorting 2021-02-17 20:26:32 +02:00
Aliaksandr Valialkin
2005d8212a lib/storage: return back filter arg to getMetricIDsForTagFilter function
The filter arg has been removed in the commit c7ee2fabb8
because it was preventing from caching the number of matching time series per each tf.

Now the cache contains duration for tf execution, so the filter shouldn't break such caching.
2021-02-17 19:33:15 +02:00
Aliaksandr Valialkin
83da939947 app/vmstorage: export vm_composite_filter_success_conversions_total and vm_composite_filter_missing_conversions_total metrics 2021-02-17 19:13:49 +02:00
Aliaksandr Valialkin
0c5bb2a397 lib/storage: revert ecf132933e, since negative filters require the same amount of work as non-negative filters 2021-02-17 18:56:13 +02:00
Aliaksandr Valialkin
bbc287ea6a lib/storage: tag filters sorting... 2021-02-17 18:56:12 +02:00
Aliaksandr Valialkin
f5e841c1e9 lib/storage: further tune tag filters sorting 2021-02-17 17:28:36 +02:00
Aliaksandr Valialkin
9d1f14d94c lib/storage: tune the logic for sorting tag filters according the their exeuction times 2021-02-17 15:02:19 +02:00
Aliaksandr Valialkin
1a19702d92 lib/storage: make sure that nobody uses partitions when closing the table 2021-02-17 15:02:18 +02:00
Aliaksandr Valialkin
3062ff0fdb app/vmselect: export per-tenant stats on the number of requests and the cumulative request duration
The metrics are:
- vm_vmselect_http_requests_total{accountID="...",projectID="..."} - the total number of select requests per each tenant
- vm_vmselect_http_duration_ms_total{accountID="...",projectID="..."} - the total duration in milliseconds for per-tenant select requests

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/932
2021-02-16 23:30:29 +02:00
Aliaksandr Valialkin
3b8596bfec lib/httpserver: make errcheck happy 2021-02-16 22:05:59 +02:00
Aliaksandr Valialkin
1ab7a8dfd5 lib/storage: more tuning for tag filters sorting according the time they take 2021-02-16 21:27:27 +02:00
Aliaksandr Valialkin
35a23234ca lib/mergeset: tune lifetime for entries inside block caches
This should reduce memory usage in general case without significant CPU usage increase
2021-02-16 18:12:32 +02:00
Aliaksandr Valialkin
500acb958c lib/mergeset: clarify comments in the code a bit 2021-02-16 18:03:11 +02:00
Aliaksandr Valialkin
da05c10544 lib/uint64set: remove memory allocation in bucket16.appendTo when sorting smallPool 2021-02-16 15:31:59 +02:00
Aliaksandr Valialkin
18305caadb lib/httpserver: cache /metrics output for a second
This should reduce CPU load when `/metrics` output is scraped with a frequency exceeding a request per second
2021-02-16 14:57:13 +02:00
Aliaksandr Valialkin
7869d38043 lib/protoparser/influx: make sure that escaped whitespace can be put in measurement, tag names and field names 2021-02-16 13:59:11 +02:00
Aliaksandr Valialkin
d788876a7a lib/mergeset: remove unused code after a4140de9e6 2021-02-16 13:40:29 +02:00
Aliaksandr Valialkin
55952f8f2e lib/storage: tune sorting for tag filters 2021-02-16 13:07:42 +02:00
Aliaksandr Valialkin
3eae03a337 lib/storage: increase match cost for negative tag filters, since they need to scan all the label pairs 2021-02-15 16:39:52 +02:00
Aliaksandr Valialkin
46e98ed490 vendor: update github.com/VictoriaMetrics/metrics from v1.13.1 to v1.14.0
The new version switches from log-linear histograms to log-based histograms,
which provide up to 3.6 times better accuracy.
2021-02-15 15:11:15 +02:00
Aliaksandr Valialkin
93ff866e91 lib/storage: reduce the minimum supported retention for inverted index from one month to one day 2021-02-15 15:11:15 +02:00
Aliaksandr Valialkin
5d5d310dcc lib/flagutil: prevent from integer overflow when parsing duration 2021-02-15 15:11:15 +02:00
Aliaksandr Valialkin
fccb481de2 lib/promscrape/discovery/kubernetes: add __meta_kubernetes_endpoints_label_* and __meta_kuberntes_endpoints_annotation_* labels to role: endpoints
This syncs kubernetes SD with Prometheus 2.25
See 617c56f55a
2021-02-15 02:51:36 +02:00
Aliaksandr Valialkin
54a09de037 lib/logger: explicitly import "time/tzdata" package for embedding tzdata into the app
The approach with `timetzdata` build tag didn't work for GOARCH=arm and GOARCH=ppc64le
due to the issue https://github.com/golang/go/issues/44073#issuecomment-778854298
2021-02-15 01:00:30 +02:00
Aliaksandr Valialkin
6f3bbf21b8 lib/storage: sort tag filters by actual execution time instead of by the number of matching time series
This should improve query speed for queries with regexp filters matching small number of time series
on a label with big number of unique values.
2021-02-15 00:19:46 +02:00
Aliaksandr Valialkin
9e3993c585 lib/storage: properly hanle regexp tag filters with dots, which can be converted to full string match filters.
For example `{label=~"foo\.bar"}` should be converted to `{label="foo.bar"}`. Previously it has was mistakenly conveted to `{label="foo\.bar"}` .
This could result in missing time series for such tag filters.
2021-02-14 23:39:19 +02:00
Aliaksandr Valialkin
18c2075159 lib/promscrape: remove vm_promscrape_scrapes_failed_per_url_total and vm_promscrape_scrapes_skipped_by_sample_limit_per_url_total metrics
These metrics may result in big number of time series when vmagent scrapes thousands of targets and these targets constantly changes.

* It is better using `up == 0` query for determining failing targets.
* It is better using the following query for determining targets with exceeded limit on the number of metrics:

  scrape_samples_scraped > 0 if up == 0
2021-02-12 05:23:27 +02:00
Aliaksandr Valialkin
4e645a5fd3 lib/storage: return back in-order applying of tag filters, since concurrently executing tag filters can result in CPU and RAM waste in common case 2021-02-10 22:43:07 +02:00
Aliaksandr Valialkin
eeb92eb7fc lib/storage: load metadata before loading indexdb, since indexdb depends on the metadata 2021-02-10 17:55:51 +02:00
Aliaksandr Valialkin
08f21d8761 app/vmstorage: export vm_composite_index_min_timestamp metric 2021-02-10 17:14:00 +02:00
Aliaksandr Valialkin
b27288f1b0 lib/storage: parallelize tag filters execution a bit
This should reduce execution time when a query contains multiple tag filters and each such filter matches big number of time series.
2021-02-10 16:32:27 +02:00
Aliaksandr Valialkin
4262c2f7c2 lib/storage: remove filter arg from getMetricIDsForDateTagFilter function
The `filter` arg breaks the logic for sorting tag filters by the matching metrics,
which may result in non-optimal performance during time series search.
2021-02-10 16:32:26 +02:00
Aliaksandr Valialkin
681dfb7485 lib/storage: fix inconsistencies in error logs 2021-02-10 16:32:21 +02:00
Aliaksandr Valialkin
148422bcba lib/storage: disable composite index usage when querying old data 2021-02-10 14:57:58 +02:00
Aliaksandr Valialkin
17d5a03f6e lib/storage: fix metric name match for composite filter 2021-02-10 01:27:34 +02:00
Aliaksandr Valialkin
fa0ef143b1 lib/storage: optimize search by label filters matching big number of time series 2021-02-10 00:46:17 +02:00
Aliaksandr Valialkin
5c9715a89a lib/storage: reduce lock contention in dateMetricIDCache when registering new time series for the current day
This should help systems with multiple CPU cores
2021-02-10 00:04:19 +02:00
Aliaksandr Valialkin
082eabf51e lib/fs: remove the code for tracking whether the given memory region is in page cache
This code didn't give performance gains under production workload, so let's remove it in order to simplify the code.
2021-02-09 16:51:11 +02:00
Aliaksandr Valialkin
afa9cf9c57 lib/mergeset: remove dead code left after a4140de9e6 2021-02-09 16:51:09 +02:00
Aliaksandr Valialkin
9ed7789fef optimize Storage.updatePerDateData() 2021-02-09 02:59:53 +02:00
Aliaksandr Valialkin
ea328b7391 lib/storage: skip deduplication when creating inmemory data blocks
The deduplication will be performed later during merging such blocks.
2021-02-09 02:26:16 +02:00
Aliaksandr Valialkin
7b7963a77f lib/mergeset: unconditionally cache indexdb blocks
Production workloads show that indexdb blocks must be cached unconditionally for reducing CPU usage.
This shouldn't increase memory usage too much, since unused blocks are removed from the cache every two minutes.
2021-02-09 00:49:59 +02:00
Aliaksandr Valialkin
e8ee9fa7fe app/vmstorage: export missing vm_cache_size_bytes metrics for indexdb and data caches 2021-02-09 00:49:58 +02:00
Aliaksandr Valialkin
2b36eb3d82 lib/cgroup: follow-up after b9bf3cbe3e 2021-02-08 16:01:26 +02:00
Nikolay
32603f57cc refactored cgroups limits, (#1061)
adds tests, remove os.Exec
2021-02-08 16:01:26 +02:00
Aliaksandr Valialkin
2dbb12563b lib/storage: optimize data ingestion in the beginning of every hour
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1046
2021-02-08 12:04:51 +02:00
Aliaksandr Valialkin
1a6eb0c3cf lib/logger: exit the app if unsupported timezone value has been passed to -loggerTimezone
While at it, clarify descrption for `-loggerTimezone` command-line flag.
2021-02-07 23:33:35 +02:00
Aliaksandr Valialkin
c6a7288109 lib/storage: check for prevHourMetricIDs cache before falling back to checking for (date, metricID) entries during data ingestion
This should reduce possible CPU usage spikes at the beginning of every hour.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1046
2021-02-04 18:46:23 +02:00
Aliaksandr Valialkin
b5eba70595 lib/httpserver: expose process_open_fds and process_max_fds metrics
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/402
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1037
2021-02-04 16:42:32 +02:00
Nikolay
6fc2a8e544 fixes dockerswarm (#1053)
fixes improper usage of host network services
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1028
2021-02-04 15:57:41 +02:00
Aliaksandr Valialkin
8249f13104 app/vmselect,lib/storage: properly parse Graphite selectors with inner wildcards
Example: foo{bar{x,yz},a[b-c],*de}
2021-02-03 20:16:28 +02:00
Aliaksandr Valialkin
2976ec89b8 lib/storage: fix a bug, which breaks searching by Graphite wildcard filters 2021-02-03 20:15:50 +02:00
Aliaksandr Valialkin
45a63a1da9 sort orSuffixes in tagFilter.InitFromGraphiteQuery for faster seeks 2021-02-03 20:15:37 +02:00
Aliaksandr Valialkin
4b930b9ffe app/vmselect: add ability to set Graphite-compatible filter via {__graphite__="foo.*.bar"} syntax 2021-02-03 01:17:19 +02:00
Aliaksandr Valialkin
755b0998ce lib/promscrape: add vm_promscrape_service_discovery_duration_seconds metric 2021-02-02 16:16:51 +02:00
Aliaksandr Valialkin
4c59dbc127 lib/promscrape: add vm_promscrape_scrape_retries_total, vm_promscrape_discovery_retries_total and vm_promscrape_discovery_requests_total metrics 2021-02-01 20:06:16 +02:00
Aliaksandr Valialkin
e79799a784 lib/flagutil: typo fix in comment to ArrayInt.GetOptionalArgOrDefault() func 2021-02-01 14:42:15 +02:00
Aliaksandr Valialkin
fdf9de98f8 app/vmagent: add -remoteWrite.roundDigits command-line option for limiting the number of digits after the point for stored values
This commit also adds --vm-round-digits command-line option to vmctl tool.
2021-02-01 14:42:15 +02:00
Aliaksandr Valialkin
2c14c70903 lib/logger: initialize timezone by UTC in order to fix failing tests 2021-01-27 01:00:57 +02:00
Aliaksandr Valialkin
ffec5131ae lib/promscrape: export vm_promscrape_scrapes_failed_per_url_total and vm_promscrape_scrapes_skipped_by_sample_limit_per_url_total metrics
These metrics could be useful for determining imporperly working scrape targets.
Note that these metrics are exported only for failing scrape targets. They aren't exposed for normally working targets.
2021-01-27 00:40:39 +02:00
Aliaksandr Valialkin
4b324da947 all: consistently use timers from timerpool 2021-01-27 00:40:39 +02:00
Aliaksandr Valialkin
588090765c lib/fs: properly initialize cleaner for pageCache bitmaps
Previously it wasnt working because the timer was fired only once
2021-01-27 00:40:39 +02:00
Aliaksandr Valialkin
68a66be811 lib/logger: add -loggerTimezone command-line flag for adjusting timezone for timestamps in log messages 2021-01-26 22:53:25 +02:00
Aliaksandr Valialkin
fdced59278 lib/promscrape: retry scrape and service discovery requests when the remote server closes http keep-alive connection 2021-01-22 13:22:59 +02:00
faceair
cf76d8dd79 lib/mergeset: add missing shouldCacheBlock (#1019) 2021-01-15 11:47:25 +02:00
Aliaksandr Valialkin
81789da731 lib/backup: increase backup chunk size from 128MB to 1GB
This should reduce costs for object storage API calls by 8x. See https://cloud.google.com/storage/pricing#operations-pricing
2021-01-13 12:16:39 +02:00
Nikolay
095be83f2f bumps minimal tls version (#1012) 2021-01-13 00:39:09 +02:00
Aliaksandr Valialkin
3d79471fb3 lib/storage: inline marshalTags function and remove the code for handling duplicate tags from here
This is a follow-up commit after c8ea697db8
2021-01-12 15:20:22 +02:00
Aliaksandr Valialkin
719ad49adf lib/storage: de-duplicate tags in MetricName.sortTags
Leave only the last tag among tags with duplicate keys. This is needed for reliable addition of extra_labels
during data ingestion. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1007 for details.
2021-01-12 15:03:22 +02:00
Nikolay
9f0a4fd00e Fixes error handling for promscrape.streamParse (#1009)
properly return error if client cannot read data,
properly suppress scraper errors
2021-01-12 13:35:09 +02:00
Aliaksandr Valialkin
c97681b45c lib/promscrape: properly show scrape duration on /targets page
Previously it has been shown as 0.000s for any scrape duration.
2021-01-11 21:15:50 +02:00
Aliaksandr Valialkin
4ee53c3961 all: use net.Dial instead of fasthttp.Dial, because fasthttp.Dial limits the number of concurrent dials to 1000 2021-01-11 12:52:51 +02:00
Aliaksandr Valialkin
d5a2b120e9 app/vmstorage: disable final merge by default, since it may result in high disk IO and CPU usage without measurable benefits such as increased query performance and reduced disk space usage 2021-01-08 00:12:12 +02:00
Aliaksandr Valialkin
ca8919e8e1 lib/storage: wait for pending transactions before closing and dropping the partition
This deflakes `make test-full-386` test
2020-12-25 11:46:47 +02:00
Aliaksandr Valialkin
c0511144e3 lib/storage: physically remove stale parts
Previously they were removed from partition struct, but the corresponding directories weren't removed.

This is a follow-up for 46dba00756
2020-12-24 16:56:09 +02:00
Nikolay
dd2f815204 fixes kubernetes_sd (#983)
* fixes kubernetes_sd,
adds missing service metadata for pod ports without endpoint
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/982

* fix test
2020-12-24 11:34:12 +02:00
Aliaksandr Valialkin
367fc17933 lib/promscrape: code prettifying for 8dd03ecf19 2020-12-24 10:57:20 +02:00
Nikolay
b00f7816e2 adds proxy_url support, (#980)
* adds proxy_url support,
adds proxy_url to the dockerswarm, eureka, kubernetes and consul service discovery,
adds proxy_url to the scrape_config for targets scrapping,
http based proxy is supported atm,
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/503

* fixes imports
2020-12-24 10:57:19 +02:00
Aliaksandr Valialkin
66f8fbbb32 lib/storage: do not remove parts outside the configured retention if they are currently merged
These parts are automatically removed after the merge is complete.
2020-12-24 09:02:12 +02:00
Aliaksandr Valialkin
fa3bcf220f lib/storage: remove stale parts as soon as they go outside the configured retention
Previously such parts could remain undeleted for long durations until they are merged with other parts.
This should help for `-retentionPeriod` values smaller than one month.
2020-12-22 19:55:07 +02:00
Aliaksandr Valialkin
6859737329 lib/storage: properly determine max rows for output part when merging small parts 2020-12-18 23:26:28 +02:00
Aliaksandr Valialkin
edbe35509e lib/{storage,mergeset}: tune background merge process in order to reduce CPU usage and disk IO usage 2020-12-18 20:01:20 +02:00
Aliaksandr Valialkin
1ee5a234dc lib/promscrape: remove ID field from ScrapeWork struct. Use a pointer to ScrapeWork as a key in targetStatusMap
This simplifies the code a bit.
2020-12-17 14:31:55 +02:00
Aliaksandr Valialkin
4fd2973e7c lib/protoparser/prometheus: follow-up commit after 7d38627b9f6f212ae602aea6a72f469fe3c70ba2
Document the bugfix in docs/CHANGELOG.md and add a test for the bugfix.
2020-12-16 23:42:17 +02:00
BigFish
60dd48c9eb lib/protoparser/prometheus/parser.go (#970)
fix parse timestamp error if there are some whitespaces after timestamp
2020-12-16 23:42:16 +02:00
Aliaksandr Valialkin
c4ec977934 lib/promscrape: properly remove deleted target from /targets page
Previously `sw` variable wasn't captured correctly by the started goroutine.
2020-12-15 20:58:37 +02:00
Aliaksandr Valialkin
eddc2bd017 lib/promscrape: properly handle scrape errors when stream parsing is enabled
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/967
2020-12-15 14:10:52 +02:00
Nikolay
7064c4eb8e adds new Array Flags (#965)
* adds ArrayDuration and ArrayBool flags,
makes sendTimeout and tlsInsecure configurable per remoteWrite url

* added backward compatibility testcases for ArrayDuration and ArrayBool

* fixes bool flag

* fixes test cases
2020-12-15 12:59:33 +02:00
Aliaksandr Valialkin
104aac170e lib/promscrape: add bootstrap styles to /targets html page 2020-12-15 12:38:29 +02:00
Aliaksandr Valialkin
ad961fe1f1 lib/promscrape: formatting fixes for /tarets page 2020-12-15 11:59:28 +02:00
Aliaksandr Valialkin
38145cfbb8 lib/promscrape: formatting fixes for /targets page 2020-12-15 11:27:22 +02:00
Aliaksandr Valialkin
d98a2f217b lib/persistentqueue: verify that ReaderOffset doesnt exceed WriterOffset when opening the persistent queue
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/964
2020-12-14 19:25:53 +02:00
Aliaksandr Valialkin
b904ae3722 lib/promscrape: add missing whitespace between duration and ago word at /targets page 2020-12-14 14:20:30 +02:00
Aliaksandr Valialkin
a2eb451de4 app/{vmagent,vminsert}: follow-up for ce8c2dd1f1: return /targets page in HTML when requested via web browser 2020-12-14 14:13:01 +02:00
Nikolay
324e3aa1a5 Changes targets api (#961)
* changes /targets api
adds html response if requester accepts text/html,
adds quick template for /targets api,
fixes pathPrefix for / requests

* changes namings

* renamed targets file

* Update app/victoria-metrics/main.go

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>

* adds trimspace to qtpl,
moves content-type for targets response closer to writer

* fixes bug with prefix

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-12-14 14:13:00 +02:00
Aliaksandr Valialkin
c80d38f00c lib/promscrape/discovery/consul: reduce load on Consul API server by increasing timeout for blocking requests from 50 seconds to 9 minutes
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574
2020-12-11 17:26:34 +02:00
Aliaksandr Valialkin
a84467958a lib/promscrape/discovery/consul: properly pass Datacenter filter to Consul API server
Previously it has been passed as `sdc` query arg, while it should be passed as `dc` query arg.
See https://www.consul.io/api-docs/health#list-nodes-for-service for details.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574#issuecomment-740454170
2020-12-08 21:53:23 +02:00
Aliaksandr Valialkin
1a237c6903 all: properly handle CPU limits set on the host system/container
This can reduce memory usage on systems with enabled CPU limits.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/946
2020-12-08 21:07:03 +02:00
Aliaksandr Valialkin
38188e1d6b lib/promscrape: store ScrapeWork items by pointer in the slice returned from get*ScrapeWork()
This should prevent from possible 'memory leaks' when a pointer to ScrapeWork item stored in the slice
could prevent from releasing memory occupied by all the ScrapeWork items stored in the slice when they
are no longer used.

See the related commit e205975716 and the related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-12-08 17:55:21 +02:00
Aliaksandr Valialkin
d5faad0240 lib/promscrape: re-use strings for labels stored in ScrapeWork
This should reduce memory usage when working with big number of scrape targets.
2020-12-08 12:23:44 +02:00
Aliaksandr Valialkin
06091cfdf8 lib/promscrape: export vm_promscrape_scrapers_{started|stopped}_total metrics for monitoring target churn rate 2020-12-08 11:58:44 +02:00
Aliaksandr Valialkin
affcee199c lib/promscrape: store targetStatus entries in targetStatusMap by pointer instead of by value
This guarantees that GC frees memory occupied by targetStatus after it is unregistered from targetStatusMap.
2020-12-08 11:52:20 +02:00
Aliaksandr Valialkin
56a0b058c1 lib/promscrape: export vm_promscrape_active_scrapers{type="<sd_type>"} metric for tracking the number of active scrapers per each service discovery type 2020-12-08 01:54:44 +02:00
Aliaksandr Valialkin
b5b32c65b0 lib/promscrape: do not enable strict config parsing when -promscrape.config.dryRun command-line flag is passed
Strict parsing for -promscrape.config can be enabled by passing `-promscrape.config.strictParse` command-line flag.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/944
2020-12-07 13:18:16 +02:00
Aliaksandr Valialkin
9d79d11a6c lib/promscrape: mention in scrape error message that scrape errors can be disabled by -promscrape.suppressScrapeErrors command-line flag 2020-12-06 23:27:07 +02:00
Aliaksandr Valialkin
f6d32f99d7 lib/promscrape: clarify error message on failed connection to scrape target when -enableTCP6 command-line flag isn't set 2020-12-06 13:19:32 +02:00
Aliaksandr Valialkin
3d00613076 lib/protoparser/influx: allow multiple whitespace chars between measurement, fields and timestamp in Influx line protocol 2020-12-06 12:00:28 +02:00
Aliaksandr Valialkin
1430bbcf33 lib/promscrape/discoveryutils: remove limit on the number of concurrently running blocking queries
Too low limit could result in unexpected errors when performing big number of blocking queries.
2020-12-05 12:15:47 +02:00
Aliaksandr Valialkin
528587deef lib/flagutil: make golangci-lint happy by using strings.TrimPrefix instead of manual prefix removal via strings.HasPrefix 2020-12-03 22:07:26 +02:00
Aliaksandr Valialkin
bdac2171f1 all: do not print usage info for all the flags when incorrect command-line flag is passed
This should improve usability for VictoriaMetrics apps that have big number of command-line flags,
i.e. all the apps.
2020-12-03 21:46:19 +02:00
Aliaksandr Valialkin
96190f9d45 lib/promscrape/discovery/consul: log the time needed for stoppig Consul service watcher 2020-12-03 20:14:48 +02:00
Aliaksandr Valialkin
4e4a93c586 lib/promscrape/discovery/consul: make sure that block response contains X-Consul-Index header 2020-12-03 20:05:54 +02:00
Aliaksandr Valialkin
7a889f6850 lib/promscrape: code cleanup after c6dee6c52d
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574
2020-12-03 19:52:09 +02:00
Nikolay
0b302d33cb Changes consul discovery api (#921)
* adds consul watch api,
it must reduce load on consul service with blocking wait requests,
changed discoveryClient api with fetchResponseMeta callback.

* small fix

* fix after master merge

* adds watch client at discovery utils

* fixes consul watcher,
changes namings,
fixes data race

* small typo fix

* sanity fix

* fix naming and service node update
2020-12-03 19:52:08 +02:00
Aliaksandr Valialkin
9eca96596f lib/storage: add missing (AccountID, ProjectID) in MetricName.String() test 2020-11-29 01:25:50 +02:00
Aliaksandr Valialkin
2385ac11c0 lib/promscrape: fix failing tests after a906b3862f 2020-11-29 01:25:49 +02:00
Aliaksandr Valialkin
2ed721e457 lib/protoparser/prometheus: properly parse OpenMetrics timestamps
OpenMetrics timestamps are floating-point numbers, that represent Unix timestamp in seconds.
This differs from Prometheus exposition format, where timestamps are integer numbers representing Unix timestamp in milliseconds.
2020-11-27 14:54:36 +02:00
Aliaksandr Valialkin
3bb9bf33d6 lib/promscrape: reduce memory allocations when unpacking gzipped responses received from scrape targets 2020-11-26 18:32:16 +02:00
Aliaksandr Valialkin
af667c59c1 all: typo fix: thouthand->thousand 2020-11-26 13:34:05 +02:00
Aliaksandr Valialkin
3dd2282ed9 lib/promscrape: release http response non-200 status code 2020-11-26 13:25:25 +02:00
Aliaksandr Valialkin
3f52e59efe app/{vmagent,victoria-metrics}: add -dryRun option and make more clear handling for -promscrape.config.dryRun 2020-11-25 23:01:39 +02:00
Aliaksandr Valialkin
8f8727cb65 lib/mergeset: tune the number of rawItemsBlocks to merge at once
512 blocks give higher ingestion performance and slightly lower memory usage
2020-11-25 21:53:15 +02:00
Aliaksandr Valialkin
8fcd87a6a5 lib/mergeset: help GC by removing refereces to slices in inmemoryBlock.Reset 2020-11-25 21:20:02 +02:00
Aliaksandr Valialkin
03002f1fe1 lib/storage: log metric name plus all its labels when the metric timestamp is outside the configured retention
This should simplify debugging when the source of the metric with unexpected timestamp must be found.
2020-11-25 14:44:29 +02:00
Aliaksandr Valialkin
4848a05924 lib/storage: typo fix in error message: allowd->allowed 2020-11-25 14:15:54 +02:00
Aliaksandr Valialkin
26e699c440 lib/protoparser/prometheus: properly parse "infinity" values in OpenMetrics format
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/924
2020-11-24 19:02:50 +02:00
Aliaksandr Valialkin
a8b5a6e517 lib/logger: disable rate limiting for error and warn logs by default 2020-11-24 12:42:07 +02:00
Aliaksandr Valialkin
7f3e884a31 all: spelling fix: superflouos->superfluous. This is a follow-up for 0acdab3ab9 2020-11-24 12:42:04 +02:00
Aliaksandr Valialkin
dad8b76a0e lib/protoparser/prometheus: properly parse metrics with exemplars
Examplars have been introduced in OpenMetrics - see https://github.com/OpenObservability/OpenMetrics/blob/master/OpenMetrics.md#exemplars-1
Previously VictoriaMetrics couldn't parse the following metric

    foo{bar="baz"} 123 # exemplar here

This commit fixes this. Note that VictoriaMetrics ignores the exemplar as for now.
2020-11-24 12:30:34 +02:00
Aliaksandr Valialkin
8b82f9d8b8 lib/promscrape: expose __meta_ec2_ipv6_addresses label for ec2_sd_config like Prometheus will do in the next release 2020-11-23 16:57:03 +02:00
Aliaksandr Valialkin
c2186279b7 lib/promscrape: add filters option to dockerswarm_sd_config like Prometheus did in v2.23.0 2020-11-23 16:27:33 +02:00
Aliaksandr Valialkin
f4fd917e4f lib/fs: replace fs.OpenReaderAt with fs.MustOpenReaderAt
All the callers for fs.OpenReaderAt expect that the file will be opened.
So it is better to log fatal error inside fs.MustOpenReaderAt instead of leaving this to the caller.
2020-11-23 09:57:30 +02:00
Aliaksandr Valialkin
d892d63204 lib/promscrape: hint that -enableTCP6 command-line flag can be used for connecting to IPv6 addresses 2020-11-21 14:39:05 +02:00
Aliaksandr Valialkin
8608e956dd lib/promscrape/discovery/eureka: follow-up after eec76718e9 2020-11-20 14:02:14 +02:00
Nikolay
bb2bcb9725 Adds eureka service discovery (#913)
* Adds eureka service discovery
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/851
Netflix service discovery for AWS

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-11-20 14:02:13 +02:00
Aliaksandr Valialkin
8e1f657ef9 lib/logger: follow-up for 09105ff49c 2020-11-19 12:37:05 +02:00
Nikolay
f54a5f3868 Adds log suppression per caller (#908)
* Adds log suppression per caller
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/905

* fixes style and report message
2020-11-19 12:19:20 +02:00
Aliaksandr Valialkin
0895b7f411 lib/logger: add -loggerWarnsPerSecondLimit command-line flag for rate limiting of WARN log messages
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/905
2020-11-18 03:43:17 +02:00
Aliaksandr Valialkin
7d76fdedcc app/vmselect: use storage.NewSearchQuery() instead of constructing storage.SearchQuery in-place
This should prevent from bugs when AccountID and ProjectID aren't set in storage.SearchQuery.
2020-11-16 18:04:33 +02:00
Aliaksandr Valialkin
a9287cf564 lib/storage: do not pass (accountID, projectID) to SearchTagNames(), since they are already passed via tfss 2020-11-16 18:04:30 +02:00
Aliaksandr Valialkin
ac7460abdd lib/storage: add a test for Storage.SearchMetricNames 2020-11-16 13:18:48 +02:00
Aliaksandr Valialkin
eea1be0d5c app/vmselect/graphite: add /tags/findSeries handler from Graphite Tags API
See https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags
2020-11-16 12:52:23 +02:00
Aliaksandr Valialkin
4be5b5733a app/vminsert: add /tags/tagSeries and /tags/tagMultiSeries handlers from Graphite Tags API
See https://graphite.readthedocs.io/en/stable/tags.html#adding-series-to-the-tagdb
2020-11-16 02:40:04 +02:00
Aliaksandr Valialkin
9ec964bff8 lib/storage: do not show artifically created label for reverse Graphite labels at /api/v1/labels page 2020-11-16 00:44:54 +02:00
Aliaksandr Valialkin
f80d6473e1 lib/protoparser/promremotewrite: log the time spent on unsuccessful data read from the network
This should help with debugging `connection timed out` errors.
2020-11-13 17:49:21 +02:00
Vasily
8ba168f3be
Add omitempty for DisableCompression and DisableKeepAlive fields in ScrapeConfig (#796)
* Add omitempty for DisableCompression and DisableKeepAlive fields in ScrapeConfig

* Add omitempty annotation to all the default/optional values

* Fix annotations after review
2020-11-13 16:17:03 +02:00
Aliaksandr Valialkin
d21b1606a1 lib/protoparser/opentsdbhttp: increment errors counter on unmarshal errors
This is a follow-up for 149c0c4a6d
2020-11-13 13:23:11 +02:00
Aliaksandr Valialkin
22c1e29284 lib/protoparser: propagate callback error to the caller of ParseStream for every supported data ingestion protocols
The caller of ParseStream then can generate HTTP 503 responses for non-nil errors occured in callbacks when processing incoming requests.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/896
2020-11-13 13:05:34 +02:00
Aliaksandr Valialkin
9dfe00c962 lib/protoparser/promremotewrite: synchronously process Prometheus remote_write requests
There is no reason in processing these requests asynchronously in the face of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/896
Synchronous processing code is easier to read and understand than the previous async code
2020-11-13 12:17:32 +02:00
Aliaksandr Valialkin
739b88c1e4 lib/protoparser/promremotewrite: forward errors, which can occur during data ingestion, to the caller of ParseStream, so it could properly return HTTP 503 status code on non-nil error
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/896
2020-11-13 11:00:41 +02:00
Aliaksandr Valialkin
7ceaf4ba8f all: consistently return text-based HTTP responses with charset=utf-8
This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897
2020-11-13 10:30:21 +02:00
immerrr again
1ec1a9f27f app/vmstorage: add "/internal/force_flush" endpoint (#893) 2020-11-11 14:46:37 +02:00
Aliaksandr Valialkin
697fd44158 lib/promscrape: make a copy of ScrapeWork from discovered []ScrapeWork slice instead of referring to an item in this slice
This should prevent from holding previously discovered []ScrapeWork slices when a part of discovered targets changes over time.
This should reduce memory usage for the case when big number of discovered scrape targets changes over time.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-11-10 16:13:31 +02:00
Aliaksandr Valialkin
2ec02b7bdb lib/promscrape: pre-allocate slice for discovered targets based on previously discovered targets
This should reduce load on GC a bit when discovering big number of scrape targets
2020-11-10 15:57:43 +02:00
Aliaksandr Valialkin
a8562d643b lib/promscrape: add -promscrape.dropOriginalLabels command-line flag for reducing memory usage when discovering big number of scrape targets
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/878
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-11-10 00:20:49 +02:00
Aliaksandr Valialkin
aa3e46a02d lib/promscrape: further reduce memory usage for per-scrape target labels by making a copy of actually used labels
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-11-09 10:55:03 +02:00
Aliaksandr Valialkin
b8083b7659 lib/promscrape: clean references to label name and label value strings after applying per-target relabeling
This should reduce memory usage when per-target relabeling creates big number of temporary labels
with long names and/or values.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-11-07 16:19:52 +02:00
Aliaksandr Valialkin
b4efe626d7 lib/promscrape/discovery/kubernetes: go fmt 2020-11-07 13:04:09 +02:00
Aliaksandr Valialkin
92bc1afcee lib/promscrape/discovery/kubernetes: reduce memory usage for labels when discovering big number of scrape targets by using string concatenation instead of fmt.Sprintf
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-11-07 13:03:01 +02:00
Aliaksandr Valialkin
535fea3d11 lib/promscrape: eliminate data race in stream parse mode
Previously `-promscrape.streamParse` mode could result in garbage labels for the scraped metrics because of data race.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825#issuecomment-723198247
2020-11-07 12:45:52 +02:00
Aliaksandr Valialkin
72011bcc45 app/vmselect: properly handle errors in GetLabelsOnTimeRange and GetLabelValuesOnTimeRange 2020-11-05 01:36:34 +02:00
Aliaksandr Valialkin
f2bff64933 lib/storage: remove data race when updating rowsDeleted 2020-11-05 01:19:30 +02:00
Aliaksandr Valialkin
c5e6c5f5a6 app/vmselect: optimize querying for /api/v1/labels and /api/v1/label/<name>/values when start and end args are set 2020-11-05 01:19:29 +02:00
Nikolay
5b235b902b Adds ready probe (#874)
* adds leading forward slash check for scrapeURL path
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/835

* adds ready probe for scrape config initialization,
it should prevent metrics loss during vmagent rolling update,
/ready api will return 425 http code, if some scrape config still waits for initialization.

* updates docs

* Update app/vmagent/README.md

* renames var

* Update app/vmagent/README.md

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-11-04 20:33:48 +02:00
Aliaksandr Valialkin
2cd86d0220 lib/promscrape: docs update after e4182dd896
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/878
2020-11-04 17:13:34 +02:00