Aliaksandr Valialkin
d77db9d813
all: do not skip SIGHUP signal during service initialization
...
This can lead to stale or incomplete configs like in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-21 16:38:20 +03:00
Aliaksandr Valialkin
69e365cd48
Makefile: update golangci-lint from v1.29.0 to v1.40.1
2021-05-20 18:30:24 +03:00
Aliaksandr Valialkin
da0b32c31a
app/vmagent/remotewrite: expose metrics with the current number of active series per day and per hour
...
These numbers are exposed via the following metrics:
- vmagent_hourly_series_limit_current_series
- vmagent_daily_series_limit_current_series
Expose also the limits via the following metrics:
- vmagent_hourly_series_limit_max_series
- vmagent_daily_series_limit_max_series
2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin
165a9f9200
app/vmstorage: add ability to limit series cardinality via -storage.maxHourlySeries
and -storage.maxDailySeries
command-line flags
2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin
7aad5c3f76
app/vmagent: add ability to limit series cardinality on a per-hour and per-day basis
2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin
110a888e39
lib/promscrape/discovery/kubernetes: make golangci-lint
happy by removing empty branches
2021-05-20 12:00:17 +03:00
Aliaksandr Valialkin
e228f479a5
lib/storage: remove possible data race when logging dropped labels
2021-05-20 11:54:06 +03:00
Aliaksandr Valialkin
9d97f44772
lib/promscrape/discovery/kubernetes: reload objects on object parse error
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-18 23:27:24 +03:00
Aliaksandr Valialkin
74ef40034c
lib/httpserver: typo fix in -http.shutdownDelay
command-line flag description: servier -> server
2021-05-18 16:25:27 +03:00
Aliaksandr Valialkin
c507faec0b
lib/promscrape/discovery/kubernetes: simplify the reload logic for urlWatcher.objectsByKey
2021-05-18 15:41:51 +03:00
Aliaksandr Valialkin
0f54c0121b
lib/promscrape/discovery/kubernetes: properly update vm_promscrape_discovery_kubernetes_scrape_works metric
...
Previously it wasn't descreased during config update.
2021-05-18 15:41:51 +03:00
Aliaksandr Valialkin
9f62d348db
lib/promscrape/discovery/kubernetes: log errors and stop service discovery when unexpected updates are received from Kubernetes API server
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-18 15:41:51 +03:00
Aliaksandr Valialkin
6ea191d196
docs: dealay -> delay
2021-05-18 01:07:32 +03:00
Aliaksandr Valialkin
c4ed50ae54
lib/promrelabel: add tests for conditional removal of label on another label match
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1294
2021-05-18 00:23:23 +03:00
Aliaksandr Valialkin
8764b0ae21
lib/promscrape/discovery/kubernetes: key ScrapeWork objects by urlWatcher instead of namespace
...
This makes the code less fragile if urlWatcher would depend on additional to namepsace properties.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-17 23:49:48 +03:00
Aliaksandr Valialkin
e08287f017
lib/promscrape: reload auth tokens from files every second
...
Previously auth tokens were loaded at startup and couldn't be updated without vmagent restart.
Now there is no need in vmagent restart.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1297
2021-05-14 20:03:35 +03:00
Aliaksandr Valialkin
a6cb4f10a7
app/{vmalert,vmauth}: explicitly set MaxIdleConnsPerHost in net/http.Client.Transport
...
By default MaxIdleConnsPerHost is set to 2. This limits the possibility to re-use http keep-alive connections.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1300
2021-05-14 18:13:34 +03:00
Aliaksandr Valialkin
e3f61d540b
lib/promscrape: limit scrape_timeout
by scrape_interval
like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1281
2021-05-13 16:10:42 +03:00
匠心零度
d5285ecaf0
fix vagent imbalance problem ( #1292 )
...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=0 -promscrape.config=/path/to/config.yml ...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=1 -promscrape.config=/path/to/config.yml ...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=2 -promscrape.config=/path/to/config.yml ...
Co-authored-by: lirenzuo <lirenzuo@shein.com>
2021-05-13 11:19:30 +03:00
Aliaksandr Valialkin
f13585dc5d
vendor: update github.com/VictoriaMetrics/fasthttp from v1.0.14 to v1.0.15
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1289
2021-05-13 10:47:09 +03:00
Aliaksandr Valialkin
d13906bf1f
lib/promscrape: exponentially increase retry interval on unsuccesful requests to scrape targets or to service discovery services
...
This should reduce CPU load at vmagent and at remote side when the remote side doesn't accept HTTP requests.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1289
2021-05-13 10:47:07 +03:00
Aliaksandr Valialkin
66c6976723
lib/cgroup: document the ability to detect cgroup v2 memory and cpu limits. This is follow-up for b50024812e
2021-05-13 09:27:35 +03:00
Nikolay
8743bf541f
adds cgroupsv2 support ( #1283 )
...
* adds cgroupv2 limits support
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1269
* small fix
* changes Atoi to ParseUint
2021-05-13 09:27:33 +03:00
Aliaksandr Valialkin
2839055513
lib/storage: substitute GetTSDBStatusForDate with GetTSDBStatusWithFiltersForDate with nil tfss
2021-05-13 09:01:05 +03:00
Aliaksandr Valialkin
008ae25b3a
lib/storage: merge getTSDBStatusForDate with getTSDBStatusWithFiltersForDate
...
These functions are non-trivial, while their code has minimal differences.
It is better from maintainability PoV to merge these functions into a single function.
2021-05-12 18:01:08 +03:00
Nikolay
be87be34a4
Adds tsdb match filters ( #1282 )
...
* init work on filters
* init propose for status filters
* fixes tsdb status
adds test
* fix bug
* removes checks from test
2021-05-12 17:16:58 +03:00
Aliaksandr Valialkin
027607db3e
lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices scrape targets every 5 seconds, since they may depend on changed service and pod objects
...
This should make endpoints and endpointslices scrape targets eventually consistent with the maximum delay of 5 seconds after the related service or pod object changes.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-12 14:12:43 +03:00
Aliaksandr Valialkin
1d32b008c6
lib/httpserver: add new X-Server-Hostname header instead of overwriting already exsiting header
...
This makes possible tracking origins of chained requests over multiple hops.
2021-05-11 23:47:19 +03:00
Aliaksandr Valialkin
f1317f7c6c
lib/httpserver: return X-Server-Hostname http header in all the responses for better debuggability
2021-05-11 22:04:41 +03:00
Aliaksandr Valialkin
4e59cf4380
lib/storage: properly apply time range when matching an empty filter
...
It must match all the time series on the given time range.
Previously it was matched to all the time series without the restriction on the given time range.
2021-05-11 01:09:35 +03:00
Aliaksandr Valialkin
326cf83eb4
lib/storage: remove dead code after the commit 3ccf7ea20c
2021-05-08 20:15:59 +03:00
Aliaksandr Valialkin
9c505d27dd
lib/ingestserver: properly close incoming connections during graceful shutdown
2021-05-08 19:53:45 +03:00
Aliaksandr Valialkin
4a5f45c77e
app/vminsert: add support for data ingestion via other vminsert nodes
2021-05-08 19:53:45 +03:00
Aliaksandr Valialkin
e6c19cb09d
lib/promscrape/discovery/kubernetes: start watchers for pods and services before starting watchers for endpoints
...
This should eliminate possible race when an update on endpoints depends on pods and/or services, which are missing in the cache yet.
This could result in missing targets based on endpoints or endpointslices.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-05 12:23:16 +03:00
Aliaksandr Valialkin
43c52ff77a
lib/storage: use WARNING instead of INFO level for logging dropped labels
2021-05-03 13:57:28 +03:00
Aliaksandr Valialkin
ec6becd3f5
lib/httpserver: stop the process on panics in request handlers
...
Panics may leave the process in inconsistent state. That's why it is better to stop the process after the panic
instead of recovering from the panic. Unfortunately, the standard net/http.Server recovers panics in request handlers.
See https://github.com/golang/go/issues/16542 . That's lib/httpserver must stop the process on itself after the panic.
2021-05-03 12:00:44 +03:00
Nikolay
62d58324dd
adds stalePartsRemover ( #1261 )
...
for new created partitions
2021-05-03 11:34:33 +03:00
Aliaksandr Valialkin
60ffbcbb99
lib/promrelabel: add tests for removing the specified {label="value"} pair
2021-05-03 11:26:58 +03:00
Aliaksandr Valialkin
b43ba6d85f
lib/storage: log dropped labels if the number of labels in a metric exceeds -maxLabelsPerTimeseries
command-line flag value
...
This should improve debuggability for this case.
2021-05-01 09:29:56 +03:00
Aliaksandr Valialkin
8be1cb297b
app/vmagent: list user-visible endpoints at
http://vmagent:8429/
...
While at it, use common WriteAPIHelp function for the listing in vmagent, vmalert and victoria-metrics
2021-04-30 09:38:23 +03:00
Aliaksandr Valialkin
421a92983a
lib/promscrape/discovery/kubernetes: remove a mutex at urlWatcher - use groupWatcher mutex for accessing all the urlWatcher children
...
This simplifies the code a bit and reduces the probability of improper mutex handling and deadlocks.
2021-04-29 10:17:45 +03:00
Nikolay
535b3ff618
vmagent kubernetes_sd tests ( #1253 )
...
* first part of tests for kubernetes sd
* makes linter happy
* added more test cases
* adds pub/sub for tests
2021-04-29 10:17:45 +03:00
Aliaksandr Valialkin
e37e1b1e34
lib/{storage,mergeset}: fix unaligned 64-bit atomic operation
panic for 32-bit architectures
...
The panic has been introduced in 56b6b893ce
2021-04-27 16:42:19 +03:00
Aliaksandr Valialkin
2d1d60118d
lib/mergeset: split rows ingestion among multiple shards
...
This improves rows ingestion on systems with many CPU cores by reducing lock contention.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
Thanks to @waldoweng for the original idea and draft implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1243
2021-04-27 15:45:11 +03:00
Aliaksandr Valialkin
b3da457629
lib/promscrape/discovery/kubernetes: fix a deadlock introduced in eddba29664
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
Thanks to @f41gh7 for providing the initial idea for deadlock fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1248
2021-04-27 14:59:56 +03:00
Aliaksandr Valialkin
cba2d13456
lib/storage: typo fix in info message when deleting the part outside the configured retention
...
Previously the message was displaying incorrect retention time
2021-04-27 13:33:36 +03:00
Aliaksandr Valialkin
f14412321b
lib/persistentqueue: eliminate possible data race when obtaining vm_persistentqueue_bytes_pending metric value
2021-04-27 00:26:32 +03:00
Aliaksandr Valialkin
320983f650
lib/promscrape: apply scrape_timeout
on receiving the first response byte for stream_parse: true
scrape targets
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1017#issuecomment-767235047
2021-04-23 22:05:00 +03:00
Aliaksandr Valialkin
34321e5f8d
lib/promscrape/discovery/kubernetes: refresh role: endpoints
targets on service object removal as Prometheus does
...
This is a follow-up for ae37cfd528
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:27:29 +03:00
Aliaksandr Valialkin
db27dbab5e
lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices targets on service object update like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:12:22 +03:00