匠心零度
b4f5be8bd8
fix vagent imbalance problem ( #1292 )
...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=0 -promscrape.config=/path/to/config.yml ...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=1 -promscrape.config=/path/to/config.yml ...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=2 -promscrape.config=/path/to/config.yml ...
Co-authored-by: lirenzuo <lirenzuo@shein.com>
2021-05-13 11:14:51 +03:00
Aliaksandr Valialkin
e6fda03e8f
vendor: update github.com/VictoriaMetrics/fasthttp from v1.0.14 to v1.0.15
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1289
2021-05-13 10:44:14 +03:00
Aliaksandr Valialkin
f6a641de62
lib/promscrape: exponentially increase retry interval on unsuccesful requests to scrape targets or to service discovery services
...
This should reduce CPU load at vmagent and at remote side when the remote side doesn't accept HTTP requests.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1289
2021-05-13 10:38:50 +03:00
Aliaksandr Valialkin
c0ec541559
lib/cgroup: document the ability to detect cgroup v2 memory and cpu limits. This is follow-up for b50024812e
2021-05-13 09:26:20 +03:00
Aliaksandr Valialkin
d7be2753c0
lib/storage: substitute GetTSDBStatusForDate with GetTSDBStatusWithFiltersForDate with nil tfss
2021-05-13 09:02:33 +03:00
Nikolay
b50024812e
adds cgroupsv2 support ( #1283 )
...
* adds cgroupv2 limits support
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1269
* small fix
* changes Atoi to ParseUint
2021-05-13 09:02:13 +03:00
Aliaksandr Valialkin
a22a17dc66
lib/storage: merge getTSDBStatusForDate with getTSDBStatusWithFiltersForDate
...
These functions are non-trivial, while their code has minimal differences.
It is better from maintainability PoV to merge these functions into a single function.
2021-05-12 17:56:53 +03:00
Aliaksandr Valialkin
832651c6c2
app/vmselect: follow up after 8a0678678b
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1168
2021-05-12 17:18:30 +03:00
Nikolay
8a0678678b
Adds tsdb match filters ( #1282 )
...
* init work on filters
* init propose for status filters
* fixes tsdb status
adds test
* fix bug
* removes checks from test
2021-05-12 15:18:45 +03:00
Aliaksandr Valialkin
cfd6aa28e1
lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices scrape targets every 5 seconds, since they may depend on changed service and pod objects
...
This should make endpoints and endpointslices scrape targets eventually consistent with the maximum delay of 5 seconds after the related service or pod object changes.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-12 14:10:34 +03:00
Aliaksandr Valialkin
afec68ad13
lib/httpserver: add new X-Server-Hostname header instead of overwriting already exsiting header
...
This makes possible tracking origins of chained requests over multiple hops.
2021-05-11 23:48:59 +03:00
Aliaksandr Valialkin
f2d5c4e2d0
lib/httpserver: return X-Server-Hostname http header in all the responses for better debuggability
2021-05-11 22:03:48 +03:00
Aliaksandr Valialkin
33f7bacb01
lib/storage: properly apply time range when matching an empty filter
...
It must match all the time series on the given time range.
Previously it was matched to all the time series without the restriction on the given time range.
2021-05-11 01:12:05 +03:00
Aliaksandr Valialkin
75c2c813fc
lib/storage: remove dead code after the commit 3ccf7ea20c
2021-05-08 20:14:11 +03:00
Aliaksandr Valialkin
7aea5f58c4
lib/ingestserver: properly close incoming connections during graceful shutdown
2021-05-08 19:52:58 +03:00
Aliaksandr Valialkin
12d733dd5d
app/vminsert: add support for data ingestion via other vminsert nodes
2021-05-08 19:52:57 +03:00
Aliaksandr Valialkin
904bbffc7f
lib/promscrape/discovery/kubernetes: start watchers for pods and services before starting watchers for endpoints
...
This should eliminate possible race when an update on endpoints depends on pods and/or services, which are missing in the cache yet.
This could result in missing targets based on endpoints or endpointslices.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-05 12:23:50 +03:00
Aliaksandr Valialkin
76027093ca
lib/storage: use WARNING instead of INFO level for logging dropped labels
2021-05-03 13:56:43 +03:00
Aliaksandr Valialkin
ce9e163e94
lib/httpserver: stop the process on panics in request handlers
...
Panics may leave the process in inconsistent state. That's why it is better to stop the process after the panic
instead of recovering from the panic. Unfortunately, the standard net/http.Server recovers panics in request handlers.
See https://github.com/golang/go/issues/16542 . That's lib/httpserver must stop the process on itself after the panic.
2021-05-03 11:59:40 +03:00
Nikolay
477369b62f
adds stalePartsRemover ( #1261 )
...
for new created partitions
2021-05-03 11:34:00 +03:00
Aliaksandr Valialkin
9a930720b3
lib/promrelabel: add tests for removing the specified {label="value"} pair
2021-05-03 11:26:53 +03:00
Aliaksandr Valialkin
58d1e6eeea
lib/storage: log dropped labels if the number of labels in a metric exceeds -maxLabelsPerTimeseries
command-line flag value
...
This should improve debuggability for this case.
2021-05-01 09:27:55 +03:00
Aliaksandr Valialkin
6fa5981e68
app/vmagent: list user-visible endpoints at
http://vmagent:8429/
...
While at it, use common WriteAPIHelp function for the listing in vmagent, vmalert and victoria-metrics
2021-04-30 09:36:43 +03:00
Aliaksandr Valialkin
2ab1266593
lib/promscrape/discovery/kubernetes: remove a mutex at urlWatcher - use groupWatcher mutex for accessing all the urlWatcher children
...
This simplifies the code a bit and reduces the probability of improper mutex handling and deadlocks.
2021-04-29 10:14:26 +03:00
Nikolay
4e5a88114a
vmagent kubernetes_sd tests ( #1253 )
...
* first part of tests for kubernetes sd
* makes linter happy
* added more test cases
* adds pub/sub for tests
2021-04-29 10:10:24 +03:00
Aliaksandr Valialkin
87179c6839
lib/{storage,mergeset}: fix unaligned 64-bit atomic operation
panic for 32-bit architectures
...
The panic has been introduced in 56b6b893ce
2021-04-27 16:41:32 +03:00
Aliaksandr Valialkin
56b6b893ce
lib/mergeset: split rows ingestion among multiple shards
...
This improves rows ingestion on systems with many CPU cores by reducing lock contention.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
Thanks to @waldoweng for the original idea and draft implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1243
2021-04-27 15:36:34 +03:00
Aliaksandr Valialkin
2ec7d8b384
lib/promscrape/discovery/kubernetes: fix a deadlock introduced in eddba29664
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
Thanks to @f41gh7 for providing the initial idea for deadlock fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1248
2021-04-27 14:57:51 +03:00
Aliaksandr Valialkin
f89c1f7f49
lib/storage: typo fix in info message when deleting the part outside the configured retention
...
Previously the message was displaying incorrect retention time
2021-04-27 13:32:46 +03:00
Aliaksandr Valialkin
60947fb2d5
lib/persistentqueue: eliminate possible data race when obtaining vm_persistentqueue_bytes_pending metric value
2021-04-27 00:25:52 +03:00
Aliaksandr Valialkin
908e35affd
lib/promscrape: apply scrape_timeout
on receiving the first response byte for stream_parse: true
scrape targets
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1017#issuecomment-767235047
2021-04-23 21:53:35 +03:00
Aliaksandr Valialkin
eddba29664
lib/promscrape/discovery/kubernetes: refresh role: endpoints
targets on service object removal as Prometheus does
...
This is a follow-up for ae37cfd528
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:26:57 +03:00
Aliaksandr Valialkin
ae37cfd528
lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices targets on service object update like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:11:40 +03:00
Aliaksandr Valialkin
bbebdf9ba1
lib/{storage,mergeset}: remove empty directories on startup. Such directories can be left after unclean shutdown on NFS storage
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1142
2021-04-22 13:02:44 +03:00
Aliaksandr Valialkin
6bc52fe41a
all: rename https://victoriametrics.github.io to https://docs.victoriametrics.com
2021-04-20 20:16:17 +03:00
Aliaksandr Valialkin
9b1999a5ff
lib/uint64set: remove memory allocation in getOrCreateSmallPool()
...
This partially reverts fb82c4b9fa
It has been appeared that the additional memory allocation may result in higher GC pauses.
It is better to spend CPU time on copying bigger bucket16 structs instead of increasing query latencies due to higher GC pauses
2021-04-14 12:30:19 +03:00
Aliaksandr Valialkin
251747e253
lib/storage: code clarification: remove caching the found metricName in searchMetricName
2021-04-13 10:22:21 +03:00
Artem Navoiev
77be3e3a82
improve docs for cli flags ( #1202 )
...
* improve docs for cli flags
* improve docs for cli flags.2
2021-04-12 12:28:04 +03:00
Aliaksandr Valialkin
3f0bcbe067
lib/promscrape: create a single swosFunc
per scrape_config
2021-04-08 09:31:48 +03:00
Aliaksandr Valialkin
5a0938d807
lib/promscrape: do not spend CPU time on constructing scrapeWork key if clustering is disabled
2021-04-07 21:54:22 +03:00
Aliaksandr Valialkin
df32d2836c
lib/storage: properly handle big time ranges passed to /api/v1/labels
and /api/v1/label/<labelName>/values
...
It should be faster querying all the labels and/or all the values instead of querying per-day labels/values on time ranges exceeding maxDaysForPerDaySearch
2021-04-07 13:33:46 +03:00
Aliaksandr Valialkin
59f9960992
lib/promscrape/discovery: remove superflouos check in registerPendingAPIWatchers
...
The check `_, ok := uw.aws[aw]; !ok` isn't needed, since aw cannot exist in uw.aws
because of the check inside subscribeAPIWatcher
2021-04-07 13:07:39 +03:00
Aliaksandr Valialkin
3ec6639bbb
lib/promscrape/discovery/kubernetes: register pending apiWatchers in uw.aws
2021-04-06 11:12:13 +03:00
Aliaksandr Valialkin
5f593b0ed3
lib/promscrape/discovery/kubernetes: remove superflouos mustStart and mustStop functions
2021-04-05 22:44:12 +03:00
Lu Jiajing
b59164cf33
fix access to nil *url.URL ( #1180 )
...
* fix access to nil *url.URL
Signed-off-by: Megrez Lu <lujiajing1126@gmail.com>
* Update lib/promscrape/discovery/kubernetes/api_watcher.go
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-04-05 22:25:31 +03:00
Aliaksandr Valialkin
b46194472f
lib/promscrape/discovery/kubernetes: reduce CPU time spent on registering big number of Kubernetes objects shared among big number of scrape jobs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182
2021-04-05 22:04:30 +03:00
Aliaksandr Valialkin
a51d0ec6ec
lib/promscrape/discovery/kubernetes: load objects missing in local cache from api seriver in getObjectByRole()
...
This should fix possible race for `role: endpoints` and `role: endpointslices` service discovery,
when the referred `pod` and `service` objects aren't propagated to urlWatcher cache yet.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182#issuecomment-813353359 for details.
2021-04-05 20:31:17 +03:00
Aliaksandr Valialkin
95dbebf512
lib/persistentqueue: delete corrupted persistent queue instead of throwing a fatal error
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1030
2021-04-05 19:26:11 +03:00
Aliaksandr Valialkin
f010d773d6
lib/promscrape/discovery/kubernetes: synchronously load Kubernetes objects on first access
...
Remove async registration of apiWatchers, since it breaks discovering `role: endpoints` and `role: endpointslices` targets,
which depend on pod and service objects.
There is no need in reloading `endpoints` and `endpointslices` targets if the referenced `pod` or `service` objects change,
since in this case the corresponding `endpoints` and `endpointslices` objects should also change because they contain
ResourceVersion of the referenced `pod` or `service` objects, which is modified on object update.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182
2021-04-05 14:20:12 +03:00
Aliaksandr Valialkin
2c25b0322b
lib/proxy: typo fix after a5c5b54c22
2021-04-05 13:45:24 +03:00
Aliaksandr Valialkin
a5c5b54c22
lib/proxy: add support for socks5 over tls
proxy
2021-04-05 13:00:05 +03:00
Aliaksandr Valialkin
6742839fd6
lib/promscrape: pass X-Prometheus-Scrape-Timeout-Seconds
header to scrape targets as Prometheus does
2021-04-05 12:15:24 +03:00
Nikolay
a4c6a3b3e1
adds socks5 support for fasthttp client ( #1178 )
...
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1177
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-04-04 01:29:54 +03:00
Aliaksandr Valialkin
500e625e8c
lib/promscrape: properly send full url in GET
request via simple HTTP proxy
...
This is a follow-up for a0ae0f86666a75ec57b45eab2429da7ab4a7b250
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1179
2021-04-04 01:20:06 +03:00
Aliaksandr Valialkin
5153410ced
lib/promscrape: support for simple HTTP proxies without CONNECT
method support such as https://github.com/prometheus-community/PushProx
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1179
2021-04-04 00:40:40 +03:00
Aliaksandr Valialkin
4c56b1a6dd
lib/promscrape: add tests for authorization
config, which has been added in df148f48b7
2021-04-03 22:13:22 +03:00
Aliaksandr Valialkin
43f9842b6f
lib/proxy: log response body on non-200 response code
...
This should improve debuggability for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1179
2021-04-03 03:03:55 +03:00
Aliaksandr Valialkin
df148f48b7
lib/promscrape: add support for authorization
config in -promscrape.config
as Prometheus 2.26 does
...
See https://github.com/prometheus/prometheus/pull/8512
2021-04-02 21:17:45 +03:00
Aliaksandr Valialkin
7f9c68cdcb
lib/promscrape: add follow_redirect
option to scrape_configs
section like Prometheus does
...
See https://github.com/prometheus/prometheus/pull/8546
2021-04-02 19:56:40 +03:00
Aliaksandr Valialkin
5b08e6fb16
lib/promscrape/discovery/kubernetes: properly track objects with the same names in multiple namespaces
...
This is a follow-up for 12e4785fe8
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:45:32 +03:00
Aliaksandr Valialkin
12e4785fe8
lib/promscrape/discovery/kubernetes: properly discover targets in multiple namespaces
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:28:30 +03:00
Nikolay
fdb8995642
Adds aws ECS credentials support ( #1175 )
2021-04-02 11:56:40 +03:00
Aliaksandr Valialkin
dc9eafcd02
app/{vminsert,vmagent}: add -sortLabels
command-line option for sorting time series labels before ingesting them in the storage
...
This option can be useful when samples for the same time series are ingested with distinct order of labels.
For example, metric{k1="v1",k2="v2"} and metric{k2="v2",k1="v1"}.
2021-03-31 23:27:58 +03:00
Aliaksandr Valialkin
e1f699bb6c
lib/storage: reduce memory usage when ingesting samples for the same time series with distinct order of labels
2021-03-31 21:24:46 +03:00
Aliaksandr Valialkin
b75c2ce659
lib/uint64set: improve Set.Has() performance scalability on multi-CPU system
...
Do not update bucket32.hint on Set.Has() call, since it leads to memory ping-pong between CPU cores multi-CPU system
2021-03-29 12:33:47 +03:00
Aliaksandr Valialkin
2601cc0fb0
lib/storage: do not update b.nextIdx if no samples are removed because of retention
2021-03-29 12:00:21 +03:00
Aliaksandr Valialkin
f39c84b21f
lib/promscrape/discovery/kubernetes: typo fix in error message
2021-03-26 12:46:14 +02:00
Aliaksandr Valialkin
9761ffd161
lib/promscrape/discovery/kubernetes: properly handle too old resource version
error message from Kubernetes watch API
2021-03-26 12:28:10 +02:00
Aliaksandr Valialkin
aa81039b42
app/vmselect: log the metric which trigger rollup result cache reset
...
This should help finding the source of stale metrics
2021-03-25 21:31:39 +02:00
Aliaksandr Valialkin
6e855d4b82
lib/storage: tune loopsCountPerMetricNameMatch according to production workload
2021-03-25 13:27:47 +02:00
Aliaksandr Valialkin
d4aadba9fa
app/vmagent: add -promscrape.consul.waitTime
command-line flag for configuring Consul service discovery wait time
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1144
2021-03-23 19:33:25 +02:00
Aliaksandr Valialkin
3a3d2165f9
lib/storage: do not reload metricName for the same metricID in Search.NextMetricBlock
...
This should speed up Search.NextMetricBlock a bit
2021-03-23 17:56:49 +02:00
Nikolay
29f9ef9b7f
changes consul_service label value ( #1143 )
...
according to prometheus discovery.
It should mitigate issue with case sensetive services
https://github.com/hashicorp/consul/issues/5707
2021-03-23 15:35:01 +02:00
Aliaksandr Valialkin
3cfb3a3683
lib/storage: respect the deadline passed to Storage.SearchMetricNames
2021-03-22 23:03:17 +02:00
Aliaksandr Valialkin
8e2afdf568
lib/storage: improve Search.NextMetricBlock performance by using MetricID->MetricName cache
2021-03-22 22:49:18 +02:00
Aliaksandr Valialkin
910092ca4d
lib/storage: tune loopsCountPerMetricNameMatch
2021-03-22 12:53:17 +02:00
Aliaksandr Valialkin
726f6ad804
lib/storage: small code simplification after 6cee5338b2
2021-03-18 15:21:13 +02:00
Aliaksandr Valialkin
6cee5338b2
lib/storage: prevent from infinite loop if {__graphite__="..."}
filter matches a metric name with *
, [
or {
chars
...
The idea has been borrowed from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1137
2021-03-18 14:53:47 +02:00
Aliaksandr Valialkin
4fb049bcba
lib/fs: reduce the frequency of failed to remove directory ... due to NFS lock
log warnings
...
Log `failed to remove directory ... due to NFS lock` warning only if the directory cannot be removed in one second.
2021-03-18 13:24:46 +02:00
Aliaksandr Valialkin
45dabfac1b
lib/storage: faster move heavy filters to the end of list
2021-03-17 15:12:13 +02:00
Aliaksandr Valialkin
828669e4e1
all: make golint
happy
2021-03-17 00:49:28 +02:00
Aliaksandr Valialkin
ccfb0ae2d3
lib/storage: limit loops count in order to reduce max CPU usage during filter search
2021-03-17 00:49:26 +02:00
Aliaksandr Valialkin
576a80b3d9
lib/storage: do not modify filterLoopsCount stats with loopsCount stats
...
Such a modification can result in incorrect filter sorting later
2021-03-17 00:49:26 +02:00
Aliaksandr Valialkin
f104f3eb2a
all: make golangci-lint happy after the commit 6378205415
2021-03-17 00:24:40 +02:00
Aliaksandr Valialkin
6378205415
lib/netutil: enable IPv6 UDP listening if -enableTCP6
command-line flag is passed to VictoriaMetrics
...
This is a follow-up for 18cfc4be7b
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1131
2021-03-17 00:16:17 +02:00
Nikolay
18cfc4be7b
Adds udp6 support for ingest servers ( #1134 )
...
with flag -enableUDP6 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1131
2021-03-17 00:03:06 +02:00
Aliaksandr Valialkin
35bb44d317
lib/uint64set: optimize bucket16.addMulti a bit
2021-03-16 21:09:07 +02:00
Aliaksandr Valialkin
fd86a7dc1d
lib/storage: time series search optimization according to production workload profiling
...
Do not pass filter metric ids to getMetricIDsForTagFilter, since it has been appeared that this slows down
the function by multiple times when it finds big number of metricIDs (tens of millions).
2021-03-16 20:01:43 +02:00
Aliaksandr Valialkin
e36fbfae5b
lib/storage: further tuning for time series search
2021-03-16 18:46:22 +02:00
Aliaksandr Valialkin
dd7e82c34f
app/vmstorage: add -logNewSeries
command-line flag for determining the source of series churn rate
2021-03-15 22:38:50 +02:00
Aliaksandr Valialkin
37323c57c9
lib/influxutils: return response compatible with InfluxDB 1.8.4
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1124
2021-03-15 22:19:59 +02:00
f41gh7
acf2a82d3c
typo fix
2021-03-15 23:03:52 +03:00
Aliaksandr Valialkin
85a95bf60c
all: various fixes in command-line flag descriptions
2021-03-15 21:59:25 +02:00
Aliaksandr Valialkin
7c37e9aea9
app/{vminsert,vmagent}: a follow-up for b1aa8c3d8f
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1124
2021-03-15 21:37:55 +02:00
Aliaksandr Valialkin
9c77e34ef9
lib/storage: further tuning for time series selector code
2021-03-15 20:31:34 +02:00
Aliaksandr Valialkin
fb82c4b9fa
lib/uint64set: reduce the size of bucket16 by storing smallPool by pointer.
...
This reduces CPU time spent on bucket16 copying inside bucket13.addBucketAtPos.
2021-03-15 17:23:31 +02:00
Aliaksandr Valialkin
eb103e1527
lib/uint64set: optimize Set.AddMulti for large sorted sets
2021-03-15 17:10:40 +02:00
Aliaksandr Valialkin
43504ebd14
lib/uint64set: optimize bucket16.add and bucket16.addMulti a bit
2021-03-15 16:58:12 +02:00
Aliaksandr Valialkin
3ccf7ea20c
lib/storage: tune per-day index search
2021-03-15 13:31:55 +02:00
Aliaksandr Valialkin
c14dafce43
lib/promscrape: an attempt to reduce memory usage when vmagent scrapes targets with varying number of metrics
...
Do not cache too big byte buffers and too big writeRequestCtx objects,
since it is cheaper to re-create them instead of wasting RAM for their caching.
This reverts 7f6f350ee1
2021-03-15 11:45:39 +02:00