Aliaksandr Valialkin
92628f9f07
lib/promrelabel: add tests for verifying that regex works as expected in single quotes and double quotes
2021-07-27 10:53:03 +03:00
Aliaksandr Valialkin
5d255846ac
all: add go:build
lines for Go1.17
...
See https://tip.golang.org/doc/go1.17#gofmt for more details
2021-07-26 15:50:46 +03:00
Aliaksandr Valialkin
c857e05604
lib/promscrape: add missing whitespace at /targets page before up
word
2021-07-26 12:23:06 +03:00
Aliaksandr Valialkin
376af3c956
lib/workingsetcache: switch from split cache to full cache after the cache size exceeds 95% of split capacity
...
Previously the switch occurred when the cache size becomes 100% of its capacity. The cache size could never reach 100% capacity.
This could prevent from switching from the split cache to full cache, thus reducing the cache effectiveness.
2021-07-15 16:53:35 +03:00
Aliaksandr Valialkin
9d3f9da5ad
lib/storage: make sure the second call to DeduplicateSamples and deduplicateSamplesDuringMerge doesnt change samples
2021-07-15 12:18:38 +03:00
Aliaksandr Valialkin
e992754e79
lib/storage: remove cache directory if it contains reset_cache_on_startup file
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1447
2021-07-13 17:59:51 +03:00
Aliaksandr Valialkin
e6edb85fa2
lib/httpserver: add is_set
label to flag
metrics
...
This label allows determining the set flags with the query `flag{is_set="true"}`
2021-07-13 15:10:18 +03:00
Aliaksandr Valialkin
51cd19d2e3
lib/storage: reset perKeyMisses stats less frequently
...
This should reduce CPU usage for queries executed with intervals higher than 30 seconds
2021-07-12 14:34:54 +03:00
Aliaksandr Valialkin
3f705fe8d7
lib/storage: properly limit the size of storage/date_metricID
cache
2021-07-12 14:25:28 +03:00
Aliaksandr Valialkin
ef3c58d7a3
lib/storage: properly determine when the deduplication is needed in needsDedup
...
Previously needsDedup() could return true if the de-duplication wasn't needed for the following case:
d < interval
/ \
| v | v |
interval interval
Now it properly returns false for this case
2021-07-12 10:54:51 +03:00
Aliaksandr Valialkin
41754e12f8
lib/mergeset: cache indexBlock items only on the second request
...
This should reduce the indexdb/indexBlocks cache size, since it won't contain one-time-wonders items.
2021-07-07 15:24:37 +03:00
Aliaksandr Valialkin
ceda2b1df4
lib/httpserver: print full requestURI in httpserver.Errorf
...
This should simplify debugging.
2021-07-07 13:11:29 +03:00
Aliaksandr Valialkin
9826f7c1be
lib/storage: do not cache inmemoryBlock entries requested only once (aka one-time-wonder items)
...
This should reduce the cache size and memory usage for the indexdb/dataBlocks cache
2021-07-07 10:59:45 +03:00
Aliaksandr Valialkin
74ace9340d
lib/storage: periodically reset prefetchedMetricIDs cache in order to limit its size under high churn rate
2021-07-07 10:59:39 +03:00
Aliaksandr Valialkin
a846febc89
Revert "lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method"
...
This reverts commit 7c6d3981bf
.
Reason for revert: high contention at bucket16Pool on systems with big number of CPU cores.
This slows down query processing significantly.
2021-07-06 18:26:56 +03:00
Aliaksandr Valialkin
b805a675f3
lib/{mergeset,storage}: switch from sync.Pool to chan-based pool for inmemoryPart objects
...
This should reduce memory usage on systems with big number of CPU cores,
since every inmemoryPart object occupies at least 64KB of memory and sync.Pool maintains
a separate pool inmemoryPart objects per each CPU core.
Though the new scheme for the pool worsens per-cpu cache locality, this should be amortized
by big sizes of inmemoryPart objects.
2021-07-06 16:33:25 +03:00
Aliaksandr Valialkin
d8e7c1ef27
lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method
...
This reduces the load on memory allocator in Go runtime in production workload.
2021-07-06 16:33:24 +03:00
Aliaksandr Valialkin
db6bd69475
lib/mergeset: increase pool capacity for inmemoryBlock according to collected profiles from production workload
...
CPU and memory profiles show that the pool capacity for inmemoryBlock objects is too small.
This results in the increased load on memory allocation code in Go runtime.
Increase the pool capacity in order to reduce the load on Go runtime.
2021-07-06 13:44:27 +03:00
Aliaksandr Valialkin
fd32855a6c
lib/mergeset: limit the frequency for flushCallback calls to once per 10 seconds
...
This should improve hit ratio for tagFiltersCache when big number of new time series are constantly registered
(aka high churn rate). This, in turn, should reduce CPU usage for queries over such time series.
2021-07-06 12:20:15 +03:00
Aliaksandr Valialkin
22c6e64bbc
lib/storage: consistency renaming: tagCache -> tagFiltersCache
...
This improves code readability
2021-07-06 11:03:30 +03:00
Aliaksandr Valialkin
21abf487c3
lib/workingsetcache: properly update stats for requests and cache misses
...
Previously the stats for cache misses could be improperly counted, because it had inflated cache misses
if the entry was missing in the curr cache, but was existing in the prev cache.
The same applies to cache requests - they were inflated if the entry was missing in the curr cache.
2021-07-06 10:54:38 +03:00
Aliaksandr Valialkin
e5031d9aee
lib/workingsetcache: fix cache capacity calculations after 4f0003f182
2021-07-05 17:16:35 +03:00
Aliaksandr Valialkin
bd71f102e8
lib/workingsetcache: typo fixes after d0c830039d
2021-07-05 15:35:51 +03:00
Aliaksandr Valialkin
4b25e627f8
lib/workingsetcache: properly switch to whole
mode
...
Previously the switch from `split` to `whole` mode had been performed too early,
e.g. when the current cache size became bigger than 1/4 of the allowed cache size.
Now it is performed when the current cache size becomes bigger than 1/2 of the allowed cache size.
This change can reduce memory usage for data ingestion path when big number of active time series are ingested.
2021-07-05 15:15:39 +03:00
Aliaksandr Valialkin
51516b96e6
lib/storage: tune cache sizes according to production workload
2021-07-05 15:14:45 +03:00
Aliaksandr Valialkin
f12f97daa1
lib/{storage,mergeset}: increase cache timeout for data and index blocks from a minute to two minutes
...
One minute cache timeout result in slower queries in some production workloads where the interval
between query execution is in the range 1 minute - 2 minutes.
2021-07-05 14:25:59 +03:00
Aliaksandr Valialkin
377bb06b47
lib/cgroup: set GOGC to 50 by default if it isn't set
...
This should reduce memory usage for typical VictoriaMetrics workloads by up to 50%
2021-07-05 12:34:01 +03:00
Aliaksandr Valialkin
8055439fe4
lib/storage: properly detect free disk space shortage during data merge
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-02 17:42:23 +03:00
Aliaksandr Valialkin
6fc3696260
lib/promscrape/discovery/consul: use case-insensitive comparison for service names
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1424
2021-07-02 14:49:22 +03:00
Aliaksandr Valialkin
61e483a01c
lib/protoparser/clusternative: remove unused field - unmarshalWork.lastResetTime
...
This is a follow-up for b84aea1e6e
2021-07-02 13:32:59 +03:00
Aliaksandr Valialkin
72de54f93e
lib/promauth: cache the client TLS certificate for up to a second
...
This should reduce CPU usage when TLS connections are established at a high rate.
2021-07-02 13:20:18 +03:00
Aliaksandr Valialkin
1c12c0f79c
lib/promauth: reload TLS certificates from disk on every mTLS connection as Prometheus does
...
This allows updating client certificates without the need to restart vmagent and/or single-node VictoriaMetrics.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1420
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/470
2021-07-01 15:43:43 +03:00
Nikolay
6bd2309449
fixes /targets button style ( #1423 )
...
* fixes /targets button style
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1422
* updates boostrap version
2021-07-01 11:52:47 +03:00
Aliaksandr Valialkin
71c856beb8
lib/workingsetcache: reset the cache mode when the cache is reset
...
This should reduce memory usage if the working set is reduced after the cache reset.
2021-07-01 11:52:47 +03:00
Aliaksandr Valialkin
bced9ee666
lib/{mergeset,storage}: reduce the maximum lifetime for cached indexdb and data blocks from 2 minutes to a minute
...
This should reduce memory usage on a system with high number of active time series and a high churn rate.
One minute is enough for caching the blocks needed for repeated queries (e.g. alerting rules, recording rules and dashboard refreshes).
2021-06-29 19:57:53 +03:00
Aliaksandr Valialkin
b7c0b3dde3
lib/mergeset: switch from sync.Pool to a channel for a pool for inmemoryBlock structs
...
This should reduce memory usage for the pool on systems with big number of CPU cores.
The sync.Pool maintains per-CPU pools, so the total number of objects in the pool
is proportional to the number of available CPU cores. The channel limits the number
of pooled objects by its own capacity. This means smaller number of pooled objects on average.
2021-06-29 19:57:52 +03:00
Aliaksandr Valialkin
2edfea8c36
lib/promscrape/discovery/docker: fix golint warning: struct field Id should be ID
2021-06-29 13:11:33 +03:00
Aliaksandr Valialkin
609ad6d9bf
lib/storage: put indexDBName into the key for dateTagFilter cache and for uselessTagFilters cache
...
This should prevent from stats overwriting when the previous indexdb is queried.
2021-06-29 13:11:32 +03:00
Aliaksandr Valialkin
0c4c630839
lib/promscrape: typo fix in /targets
output
...
The typo has been introduced in fb72a2133f
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1408
2021-06-28 21:27:22 +03:00
Aliaksandr Valialkin
97d1ccfc8e
lib/promscrape: split docker and dockerswarm service discovery code bases, since they have very little in common
...
This is a follow up after c85a5b7fcb
2021-06-25 13:22:16 +03:00
Aliaksandr Valialkin
4461e20e7d
lib/promscrape: consistently sort service discovery routines
...
This should simplify further maintenance of the code
2021-06-25 13:22:16 +03:00
Lu Jiajing
12b4cbb68f
Support Docker ServiceDiscovery ( #1402 )
...
* add docker discovery
* add test
* add labels test and add scrape work
* remove TODO
* refactor to merge apiConfig and sdConfig
* apply suggestion
2021-06-25 13:22:16 +03:00
Nikolay
501429c3ff
adds missing MustStop call to do and http sd ( #1404 )
2021-06-25 11:43:32 +03:00
Aliaksandr Valialkin
b84aea1e6e
lib/protoparser/clusternative: do not pool unmarshalWork structs, since they can occupy big amounts of memory (more than 100MB per each struct)
...
This should reduce memory usage for vmstorage under high ingestion rate when the vmstorage runs on a system with big number of CPU cores
2021-06-23 15:45:08 +03:00
Aliaksandr Valialkin
a22f37599b
lib/storage: tune tag filters search logic
...
Tune the logic according to the logs provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-864293624
The previous logic had a race when multiple concurrent queries execute the same tag filter without prior stats.
This could result in incorrectly stored stats for such tag filter, which then could result in non-optimal sorting of tag filters
for further queries.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-23 13:30:36 +03:00
Aliaksandr Valialkin
f10fa0d1d7
lib/promscrape/discovery/consul: properly pass namespace to Consul watcher
...
Follow-up for 58a2989fe7
2021-06-22 17:43:20 +03:00
Aliaksandr Valialkin
4adf6c9766
lib/promscrape/discovery/http: follow up after e307bbb29a
2021-06-22 13:42:10 +03:00
Nikolay
e03a3d3a36
adds http_sd ( #1399 )
...
* adds http_sd
* adds X-Prometheus-Refresh-Interval-Seconds header
* Update lib/promscrape/discovery/http/api.go
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-06-22 13:42:09 +03:00
Aliaksandr Valialkin
3ab3902f17
lib/promscrape/discovery: support generic auth configs in Consul service discovery in the same way as Prometheus 2.28 does
2021-06-22 13:18:51 +03:00
Nikolay
827a2396d2
adds consul enterprise namespace support ( #1400 )
...
* adds consul enterprise namespace support
* Update lib/promscrape/discovery/consul/consul.go
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-06-22 12:56:11 +03:00
Aliaksandr Valialkin
f9069ba32a
lib/promscrape: show jobs with empty scrape targets on /targets page
2021-06-18 10:54:12 +03:00
Nikolay
9ea1dca3dd
fixes DO service discovery labels ( #1389 )
...
adds test for digitalocean sd
2021-06-17 17:21:10 +03:00
Aliaksandr Valialkin
a207be3ffb
lib/storage: fix infinite loop introduced in aa9b56a046
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 14:27:30 +03:00
Aliaksandr Valialkin
0efd37cec1
lib/{mergeset,storage}: reduce the number of fsync calls on data ingestion path on systems with many cpu cores
...
VictoriaMetrics maintains a buffer per CPU core for the ingested data. These buffers are flushed to disk every second.
These buffers are flushed to disk in parallel starting from the commit 56b6b893ce
.
This resulted in increased write disk IO usage on systems with many cpu cores
as described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-863046999 .
This commit merges the per-CPU buffers into bigger in-memory buffers before flushing them to disk.
This should reduce the rate of fsync syscalls and, consequently, the write disk IO on systems with many CPU cores.
This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 13:51:42 +03:00
Aliaksandr Valialkin
b133de1e37
lib/storage: move deletedMetricIDs set from indexDB to Storage
...
This makes consitent the list of deleted metricIDs when it is used from both the current indexDB and the previous indexDB (aka extDB).
This should fix the issue, which could lead to storing new samples under deleted metricIDs after indexDB rotation.
See more details at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347#issuecomment-861232136 .
Thanks to @tangqipengleoo for the initial analysis and the pull request - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .
This commit resolves the issue in more generic way compared to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .
The downside of the commit is the deletedMetricIDs set isn't cleaned from the metricIDs outside the retention. It needs app restart.
This should be OK in most cases.
2021-06-15 15:07:54 +03:00
Aliaksandr Valialkin
ebaf68bcb0
lib/protoparser: stop reading the input stream as soon as the callback provided by the caller returns error
...
This is a follow-up for af90c3c43b
2021-06-14 15:20:38 +03:00
faceair
2ea187e801
lib/protoparser: stop read when callback error ( #1380 )
2021-06-14 15:20:37 +03:00
Aliaksandr Valialkin
5f91a701fa
lib/promscrape: show the number of samples collected during the last scrape at /targets and /api/v1/targets pages
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1377
2021-06-14 14:04:35 +03:00
Nikolay
e42da47608
adds digital ocean sd ( #1376 )
...
* adds digital ocean sd config
* adds digital ocean sd
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1367
* typo fix
2021-06-14 13:19:29 +03:00
Aliaksandr Valialkin
df057177a0
lib/promscrape: increase the duration for reading the full response in stream parsing mode
...
Increase the duration from 10x to 30x of the configured `scrape_interval'.
This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365
2021-06-14 12:29:46 +03:00
Aliaksandr Valialkin
074b11fa69
lib/protoparser: measure the duration for reading the whole block of data instead of a single read operation
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365
2021-06-14 12:29:45 +03:00
Aliaksandr Valialkin
87d221f78a
lib/protoparser/common: log the duration for reading a block of data in ReadLinesBlockExt on error
...
This may help debugging issues like https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365
2021-06-14 12:21:21 +03:00
Aliaksandr Valialkin
0672cfffa2
app/vmauth: properly handle http.ErrAbortHandler panic
...
This panic can be raised by the reverseProxy on aborted request to the backend.
So handle it (e.g. suppress) at reverseProxy.ServeHTTP call.
Do not suppress the panic at lib/httpserver generic HTTP handler,
since it may result in an inconsistent state left after the panicking handler.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1353
2021-06-11 12:54:37 +03:00
Aliaksandr Valialkin
ce10bdc82a
lib/storage: reset cache on disk during series deletion and during indexdb rotation
...
This should prevent from inconsistent behavior (aka partially missing data for some time series) after unclean shutdown.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347
2021-06-11 12:54:36 +03:00
Aliaksandr Valialkin
eb335d2c29
lib/storage: consistency renaming: getMaxRawRowsPerPartition -> getMaxRawRowsPerShard
2021-06-11 10:52:31 +03:00
Aliaksandr Valialkin
d06c0e7a94
lib/storage: reduce the amounts of memory which can be occupied by rawRow items during data ingestion on a system with many CPU cores
2021-06-11 10:49:02 +03:00
Nikolay
2c1611d316
disables panic for net/httpAbortHandler ( #1355 )
2021-06-09 12:12:45 +03:00
Aliaksandr Valialkin
1e4a64844d
lib/storage: properly account the number of loops spent when matching for or suffixes
...
This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-08 13:07:14 +03:00
Aliaksandr Valialkin
e7d353ee6a
lib/promrelabel: add tests for labelsToString() function
2021-06-04 20:42:14 +03:00
Aliaksandr Valialkin
269e35d676
app/{vmagent,vminsert}: follow-up after 2fe045e2a4
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1343
2021-06-04 20:33:22 +03:00
jelmd
d8b46908db
new feature: debug relabeling ( #1344 )
...
* new feature: relabel logging
Use scrape_configs[x].relabel_debug = true to log metric names inkl.
labels before and after relabeling. After relabeling related metrics
get dropped, i.e. not submitted to servers.
* vminsert wants relabel logging, too.
2021-06-04 20:33:21 +03:00
Nikolay
3d89c01d07
fixes solaris build ( #1345 )
2021-06-04 11:56:06 +03:00
Hason Chan
439c2ed510
fix eureka_sd_configs HTTPClientConfig incorrect parsing ( #1350 )
2021-06-04 11:56:06 +03:00
Aliaksandr Valialkin
fc2565b4ee
lib/storage: reduce memory allocations when syncing dateMetricIDCache
2021-06-03 16:20:02 +03:00
Aliaksandr Valialkin
0b9f0de0a1
lib/promscrape: fix tests after f0c21b6300
2021-05-28 01:33:28 +03:00
Aliaksandr Valialkin
6865f3b497
Revert "lib/mergeset: remove a pool for inmemoryBlock structs"
...
This reverts commit 793fe39921
.
Reason to revert: production testing revealed possible slowdown when registering big number of new time series
2021-05-28 01:11:22 +03:00
Aliaksandr Valialkin
7b33bc67a1
lib/mergeset: remove a pool for inmemoryBlock structs
...
The pool for inmemoryBlock struct doesn't give any performance gains in production workloads,
while it may result in excess memory usage for inmemoryBlock structs inside the pool during
background merge of indexdb.
2021-05-27 22:00:50 +03:00
Aliaksandr Valialkin
97de72054e
docs: document f0c21b6300
2021-05-27 15:04:13 +03:00
faceair
b801b299f0
lib/promscrape: apply body size & sample limit to stream parse ( #1331 )
...
* lib/promscrape: apply body size limit to stream parse
Signed-off-by: faceair <git@faceair.me>
* lib/promscrape: apply sample limit to stream parse
Signed-off-by: faceair <git@faceair.me>
2021-05-27 15:04:11 +03:00
Aliaksandr Valialkin
49490ae5a7
lib/protoparser/clusternative: remove duplicate cannot read packet size
phrase from the log message
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1336
2021-05-27 12:09:17 +03:00
Aliaksandr Valialkin
c85084b659
lib/handshake: pass io.EOF unmodified to the caller for BufferedConn.Read, so it could properly detect the end of stream
2021-05-27 12:09:17 +03:00
Aliaksandr Valialkin
10b2855949
lib/storage: fix spelling typo: borken->broken
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1336
2021-05-27 12:09:17 +03:00
Aliaksandr Valialkin
6b90570ed3
lib/uint64set: store pointers to bucket16 instead of bucket16 objects in bucket32
...
This speeds up bucket32.addBucketAtPos() when bucket32.buckets contains big number of items,
since the copying of bucket16 pointers is much faster than the copying of bucket16 objects.
This is a cpu profile for copying bucket16 objects:
10ms 13.43s (flat, cum) 32.01% of Total
10ms 120ms 650: b.b16his = append(b.b16his[:pos+1], b.b16his[pos:]...)
. . 651: b.b16his[pos] = hi
. 13.31s 652: b.buckets = append(b.buckets[:pos+1], b.buckets[pos:]...)
. . 653: b16 := &b.buckets[pos]
. . 654: *b16 = bucket16{}
. . 655: return b16
. . 656:}
This is a cpu profile for copying pointers to bucket16:
10ms 1.14s (flat, cum) 2.19% of Total
. 100ms 647: b.b16his = append(b.b16his[:pos+1], b.b16his[pos:]...)
. . 648: b.b16his[pos] = hi
10ms 700ms 649: b.buckets = append(b.buckets[:pos+1], b.buckets[pos:]...)
. 330ms 650: b16 := &bucket16{}
. . 651: b.buckets[pos] = b16
. . 652: return b16
. . 653:}
2021-05-25 14:27:52 +03:00
Aliaksandr Valialkin
1c16cbacf5
lib/storage: do not stop data ingestion on the first error in Storage.AddRows
...
Continue data ingestion for the rest of blocks.
2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin
2601844de3
lib/storage: limit the number of rows per each block in Storage.AddRows()
...
This should reduce memory usage when ingesting big blocks or rows.
2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin
95b735a883
lib/storage: allow filling all the rows up to their capacity in rawRowsShard.addRows
...
This should reduce memory usage a bit on data ingestion path
2021-05-24 15:32:24 +03:00
Aliaksandr Valialkin
0f84503880
lib/bloomfilter: fix TestLimiterConcurrent
2021-05-24 05:18:29 +03:00
Aliaksandr Valialkin
745eda9e87
lib/fs: do not pass done callback to tryRemoveAll() func
...
This improves code readability a bit.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313
2021-05-24 05:00:53 +03:00
Aliaksandr Valialkin
402a8ca710
lib/storage: do not populate MetricID->MetricName cache during data ingestion
...
This cache isn't needed during data ingestion, so there is no need in spending RAM on it.
This reduces RAM usage on data ingestion path by 30%
2021-05-24 03:06:40 +03:00
Aliaksandr Valialkin
0fc857d363
lib/{mergeset,storage}: reduce the number of IFNO log messages like merged ... items across ... blocks in ... seconds
...
Log these messages if the merge takes more than 30 seconds instead of 10 seconds.
2021-05-23 14:15:49 +03:00
Aliaksandr Valialkin
71ff7ee18d
lib/promauth: follow-up after 5b8176c68e
2021-05-22 18:02:03 +03:00
Nikolay
2780d6dbcd
basic OAuth2 support for remoteWrite and scrape targets ( #1316 )
...
* adds OAuth2 support for remoteWrite and scrapping
* adds tests
changes init
2021-05-22 18:02:01 +03:00
Aliaksandr Valialkin
89e1a45cdb
lib/fs: concurrently remove up to 1024 blocked NFS directories
...
Previously the blocked directories were removed sequentially by a single goroutine.
This can be not enough for highly loaded VictoriaMetrics that accepts millions of sample per second,
when big number of LSM parts are created and removed at high rate.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313
2021-05-21 17:58:08 +03:00
Aliaksandr Valialkin
23355ca34c
lib/fs: wait for a while before giving up on NFS file removal if the removal queue is full
...
This should reduce the probability of the panic on a highly loaded VictoriaMetrics
accepting millions of samples per second.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313
2021-05-21 17:21:35 +03:00
Aliaksandr Valialkin
d77db9d813
all: do not skip SIGHUP signal during service initialization
...
This can lead to stale or incomplete configs like in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-21 16:38:20 +03:00
Aliaksandr Valialkin
69e365cd48
Makefile: update golangci-lint from v1.29.0 to v1.40.1
2021-05-20 18:30:24 +03:00
Aliaksandr Valialkin
da0b32c31a
app/vmagent/remotewrite: expose metrics with the current number of active series per day and per hour
...
These numbers are exposed via the following metrics:
- vmagent_hourly_series_limit_current_series
- vmagent_daily_series_limit_current_series
Expose also the limits via the following metrics:
- vmagent_hourly_series_limit_max_series
- vmagent_daily_series_limit_max_series
2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin
165a9f9200
app/vmstorage: add ability to limit series cardinality via -storage.maxHourlySeries
and -storage.maxDailySeries
command-line flags
2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin
7aad5c3f76
app/vmagent: add ability to limit series cardinality on a per-hour and per-day basis
2021-05-20 15:31:57 +03:00
Aliaksandr Valialkin
110a888e39
lib/promscrape/discovery/kubernetes: make golangci-lint
happy by removing empty branches
2021-05-20 12:00:17 +03:00
Aliaksandr Valialkin
e228f479a5
lib/storage: remove possible data race when logging dropped labels
2021-05-20 11:54:06 +03:00
Aliaksandr Valialkin
9d97f44772
lib/promscrape/discovery/kubernetes: reload objects on object parse error
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-18 23:27:24 +03:00
Aliaksandr Valialkin
74ef40034c
lib/httpserver: typo fix in -http.shutdownDelay
command-line flag description: servier -> server
2021-05-18 16:25:27 +03:00
Aliaksandr Valialkin
c507faec0b
lib/promscrape/discovery/kubernetes: simplify the reload logic for urlWatcher.objectsByKey
2021-05-18 15:41:51 +03:00
Aliaksandr Valialkin
0f54c0121b
lib/promscrape/discovery/kubernetes: properly update vm_promscrape_discovery_kubernetes_scrape_works metric
...
Previously it wasn't descreased during config update.
2021-05-18 15:41:51 +03:00
Aliaksandr Valialkin
9f62d348db
lib/promscrape/discovery/kubernetes: log errors and stop service discovery when unexpected updates are received from Kubernetes API server
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-18 15:41:51 +03:00
Aliaksandr Valialkin
6ea191d196
docs: dealay -> delay
2021-05-18 01:07:32 +03:00
Aliaksandr Valialkin
c4ed50ae54
lib/promrelabel: add tests for conditional removal of label on another label match
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1294
2021-05-18 00:23:23 +03:00
Aliaksandr Valialkin
8764b0ae21
lib/promscrape/discovery/kubernetes: key ScrapeWork objects by urlWatcher instead of namespace
...
This makes the code less fragile if urlWatcher would depend on additional to namepsace properties.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-17 23:49:48 +03:00
Aliaksandr Valialkin
e08287f017
lib/promscrape: reload auth tokens from files every second
...
Previously auth tokens were loaded at startup and couldn't be updated without vmagent restart.
Now there is no need in vmagent restart.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1297
2021-05-14 20:03:35 +03:00
Aliaksandr Valialkin
a6cb4f10a7
app/{vmalert,vmauth}: explicitly set MaxIdleConnsPerHost in net/http.Client.Transport
...
By default MaxIdleConnsPerHost is set to 2. This limits the possibility to re-use http keep-alive connections.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1300
2021-05-14 18:13:34 +03:00
Aliaksandr Valialkin
e3f61d540b
lib/promscrape: limit scrape_timeout
by scrape_interval
like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1281
2021-05-13 16:10:42 +03:00
匠心零度
d5285ecaf0
fix vagent imbalance problem ( #1292 )
...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=0 -promscrape.config=/path/to/config.yml ...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=1 -promscrape.config=/path/to/config.yml ...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=2 -promscrape.config=/path/to/config.yml ...
Co-authored-by: lirenzuo <lirenzuo@shein.com>
2021-05-13 11:19:30 +03:00
Aliaksandr Valialkin
f13585dc5d
vendor: update github.com/VictoriaMetrics/fasthttp from v1.0.14 to v1.0.15
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1289
2021-05-13 10:47:09 +03:00
Aliaksandr Valialkin
d13906bf1f
lib/promscrape: exponentially increase retry interval on unsuccesful requests to scrape targets or to service discovery services
...
This should reduce CPU load at vmagent and at remote side when the remote side doesn't accept HTTP requests.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1289
2021-05-13 10:47:07 +03:00
Aliaksandr Valialkin
66c6976723
lib/cgroup: document the ability to detect cgroup v2 memory and cpu limits. This is follow-up for b50024812e
2021-05-13 09:27:35 +03:00
Nikolay
8743bf541f
adds cgroupsv2 support ( #1283 )
...
* adds cgroupv2 limits support
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1269
* small fix
* changes Atoi to ParseUint
2021-05-13 09:27:33 +03:00
Aliaksandr Valialkin
2839055513
lib/storage: substitute GetTSDBStatusForDate with GetTSDBStatusWithFiltersForDate with nil tfss
2021-05-13 09:01:05 +03:00
Aliaksandr Valialkin
008ae25b3a
lib/storage: merge getTSDBStatusForDate with getTSDBStatusWithFiltersForDate
...
These functions are non-trivial, while their code has minimal differences.
It is better from maintainability PoV to merge these functions into a single function.
2021-05-12 18:01:08 +03:00
Nikolay
be87be34a4
Adds tsdb match filters ( #1282 )
...
* init work on filters
* init propose for status filters
* fixes tsdb status
adds test
* fix bug
* removes checks from test
2021-05-12 17:16:58 +03:00
Aliaksandr Valialkin
027607db3e
lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices scrape targets every 5 seconds, since they may depend on changed service and pod objects
...
This should make endpoints and endpointslices scrape targets eventually consistent with the maximum delay of 5 seconds after the related service or pod object changes.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-12 14:12:43 +03:00
Aliaksandr Valialkin
1d32b008c6
lib/httpserver: add new X-Server-Hostname header instead of overwriting already exsiting header
...
This makes possible tracking origins of chained requests over multiple hops.
2021-05-11 23:47:19 +03:00
Aliaksandr Valialkin
f1317f7c6c
lib/httpserver: return X-Server-Hostname http header in all the responses for better debuggability
2021-05-11 22:04:41 +03:00
Aliaksandr Valialkin
4e59cf4380
lib/storage: properly apply time range when matching an empty filter
...
It must match all the time series on the given time range.
Previously it was matched to all the time series without the restriction on the given time range.
2021-05-11 01:09:35 +03:00
Aliaksandr Valialkin
326cf83eb4
lib/storage: remove dead code after the commit 3ccf7ea20c
2021-05-08 20:15:59 +03:00
Aliaksandr Valialkin
9c505d27dd
lib/ingestserver: properly close incoming connections during graceful shutdown
2021-05-08 19:53:45 +03:00
Aliaksandr Valialkin
4a5f45c77e
app/vminsert: add support for data ingestion via other vminsert nodes
2021-05-08 19:53:45 +03:00
Aliaksandr Valialkin
e6c19cb09d
lib/promscrape/discovery/kubernetes: start watchers for pods and services before starting watchers for endpoints
...
This should eliminate possible race when an update on endpoints depends on pods and/or services, which are missing in the cache yet.
This could result in missing targets based on endpoints or endpointslices.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-05 12:23:16 +03:00
Aliaksandr Valialkin
43c52ff77a
lib/storage: use WARNING instead of INFO level for logging dropped labels
2021-05-03 13:57:28 +03:00
Aliaksandr Valialkin
ec6becd3f5
lib/httpserver: stop the process on panics in request handlers
...
Panics may leave the process in inconsistent state. That's why it is better to stop the process after the panic
instead of recovering from the panic. Unfortunately, the standard net/http.Server recovers panics in request handlers.
See https://github.com/golang/go/issues/16542 . That's lib/httpserver must stop the process on itself after the panic.
2021-05-03 12:00:44 +03:00
Nikolay
62d58324dd
adds stalePartsRemover ( #1261 )
...
for new created partitions
2021-05-03 11:34:33 +03:00
Aliaksandr Valialkin
60ffbcbb99
lib/promrelabel: add tests for removing the specified {label="value"} pair
2021-05-03 11:26:58 +03:00
Aliaksandr Valialkin
b43ba6d85f
lib/storage: log dropped labels if the number of labels in a metric exceeds -maxLabelsPerTimeseries
command-line flag value
...
This should improve debuggability for this case.
2021-05-01 09:29:56 +03:00
Aliaksandr Valialkin
8be1cb297b
app/vmagent: list user-visible endpoints at
http://vmagent:8429/
...
While at it, use common WriteAPIHelp function for the listing in vmagent, vmalert and victoria-metrics
2021-04-30 09:38:23 +03:00
Aliaksandr Valialkin
421a92983a
lib/promscrape/discovery/kubernetes: remove a mutex at urlWatcher - use groupWatcher mutex for accessing all the urlWatcher children
...
This simplifies the code a bit and reduces the probability of improper mutex handling and deadlocks.
2021-04-29 10:17:45 +03:00
Nikolay
535b3ff618
vmagent kubernetes_sd tests ( #1253 )
...
* first part of tests for kubernetes sd
* makes linter happy
* added more test cases
* adds pub/sub for tests
2021-04-29 10:17:45 +03:00
Aliaksandr Valialkin
e37e1b1e34
lib/{storage,mergeset}: fix unaligned 64-bit atomic operation
panic for 32-bit architectures
...
The panic has been introduced in 56b6b893ce
2021-04-27 16:42:19 +03:00
Aliaksandr Valialkin
2d1d60118d
lib/mergeset: split rows ingestion among multiple shards
...
This improves rows ingestion on systems with many CPU cores by reducing lock contention.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
Thanks to @waldoweng for the original idea and draft implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1243
2021-04-27 15:45:11 +03:00
Aliaksandr Valialkin
b3da457629
lib/promscrape/discovery/kubernetes: fix a deadlock introduced in eddba29664
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
Thanks to @f41gh7 for providing the initial idea for deadlock fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1248
2021-04-27 14:59:56 +03:00
Aliaksandr Valialkin
cba2d13456
lib/storage: typo fix in info message when deleting the part outside the configured retention
...
Previously the message was displaying incorrect retention time
2021-04-27 13:33:36 +03:00
Aliaksandr Valialkin
f14412321b
lib/persistentqueue: eliminate possible data race when obtaining vm_persistentqueue_bytes_pending metric value
2021-04-27 00:26:32 +03:00
Aliaksandr Valialkin
320983f650
lib/promscrape: apply scrape_timeout
on receiving the first response byte for stream_parse: true
scrape targets
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1017#issuecomment-767235047
2021-04-23 22:05:00 +03:00
Aliaksandr Valialkin
34321e5f8d
lib/promscrape/discovery/kubernetes: refresh role: endpoints
targets on service object removal as Prometheus does
...
This is a follow-up for ae37cfd528
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:27:29 +03:00
Aliaksandr Valialkin
db27dbab5e
lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices targets on service object update like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:12:22 +03:00
Aliaksandr Valialkin
ab8008d6d7
lib/{storage,mergeset}: remove empty directories on startup. Such directories can be left after unclean shutdown on NFS storage
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1142
2021-04-22 13:03:29 +03:00
Aliaksandr Valialkin
6dc5d3b357
all: rename https://victoriametrics.github.io to https://docs.victoriametrics.com
2021-04-20 20:20:01 +03:00
Aliaksandr Valialkin
1088113110
lib/uint64set: remove memory allocation in getOrCreateSmallPool()
...
This partially reverts fb82c4b9fa
It has been appeared that the additional memory allocation may result in higher GC pauses.
It is better to spend CPU time on copying bigger bucket16 structs instead of increasing query latencies due to higher GC pauses
2021-04-14 12:31:53 +03:00
Aliaksandr Valialkin
72c41323fa
lib/storage: code clarification: remove caching the found metricName in searchMetricName
2021-04-13 10:20:35 +03:00
Artem Navoiev
c3dcfdef8c
improve docs for cli flags ( #1202 )
...
* improve docs for cli flags
* improve docs for cli flags.2
2021-04-12 12:28:36 +03:00
Aliaksandr Valialkin
3b1f0cb3f6
lib/promscrape: create a single swosFunc
per scrape_config
2021-04-08 09:31:38 +03:00
Aliaksandr Valialkin
276dbc2133
lib/promscrape: do not spend CPU time on constructing scrapeWork key if clustering is disabled
2021-04-07 21:55:06 +03:00
Aliaksandr Valialkin
59ccc43e3a
lib/storage: properly handle big time ranges passed to /api/v1/labels
and /api/v1/label/<labelName>/values
...
It should be faster querying all the labels and/or all the values instead of querying per-day labels/values on time ranges exceeding maxDaysForPerDaySearch
2021-04-07 13:33:10 +03:00
Aliaksandr Valialkin
02b83e0957
lib/promscrape/discovery: remove superflouos check in registerPendingAPIWatchers
...
The check `_, ok := uw.aws[aw]; !ok` isn't needed, since aw cannot exist in uw.aws
because of the check inside subscribeAPIWatcher
2021-04-07 13:10:04 +03:00
Aliaksandr Valialkin
db56ee0e28
lib/promscrape/discovery/kubernetes: register pending apiWatchers in uw.aws
2021-04-06 11:11:53 +03:00
Aliaksandr Valialkin
edd66b7e82
lib/promscrape/discovery/kubernetes: remove superflouos mustStart and mustStop functions
2021-04-05 22:43:49 +03:00
Lu Jiajing
4ee6def68b
fix access to nil *url.URL ( #1180 )
...
* fix access to nil *url.URL
Signed-off-by: Megrez Lu <lujiajing1126@gmail.com>
* Update lib/promscrape/discovery/kubernetes/api_watcher.go
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-04-05 22:26:43 +03:00
Aliaksandr Valialkin
7eca60694e
lib/promscrape/discovery/kubernetes: reduce CPU time spent on registering big number of Kubernetes objects shared among big number of scrape jobs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182
2021-04-05 22:05:02 +03:00
Aliaksandr Valialkin
9da2ef3d8f
lib/promscrape/discovery/kubernetes: load objects missing in local cache from api seriver in getObjectByRole()
...
This should fix possible race for `role: endpoints` and `role: endpointslices` service discovery,
when the referred `pod` and `service` objects aren't propagated to urlWatcher cache yet.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182#issuecomment-813353359 for details.
2021-04-05 20:31:22 +03:00
Aliaksandr Valialkin
4d6a3aba4d
lib/persistentqueue: delete corrupted persistent queue instead of throwing a fatal error
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1030
2021-04-05 19:26:51 +03:00
Aliaksandr Valialkin
fe084fdd33
lib/promscrape/discovery/kubernetes: synchronously load Kubernetes objects on first access
...
Remove async registration of apiWatchers, since it breaks discovering `role: endpoints` and `role: endpointslices` targets,
which depend on pod and service objects.
There is no need in reloading `endpoints` and `endpointslices` targets if the referenced `pod` or `service` objects change,
since in this case the corresponding `endpoints` and `endpointslices` objects should also change because they contain
ResourceVersion of the referenced `pod` or `service` objects, which is modified on object update.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182
2021-04-05 14:37:07 +03:00
Aliaksandr Valialkin
c8b7c32e23
lib/proxy: typo fix after a5c5b54c22
2021-04-05 14:37:06 +03:00
Aliaksandr Valialkin
e0b687171b
lib/proxy: add support for socks5 over tls
proxy
2021-04-05 13:00:42 +03:00
Aliaksandr Valialkin
90da332ea0
lib/promscrape: pass X-Prometheus-Scrape-Timeout-Seconds
header to scrape targets as Prometheus does
2021-04-05 12:15:14 +03:00
Nikolay
d0b664454b
adds socks5 support for fasthttp client ( #1178 )
...
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1177
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-04-04 01:43:59 +03:00
Aliaksandr Valialkin
dd19fab7c9
lib/promscrape: properly send full url in GET
request via simple HTTP proxy
...
This is a follow-up for a0ae0f86666a75ec57b45eab2429da7ab4a7b250
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1179
2021-04-04 01:20:43 +03:00
Aliaksandr Valialkin
ab9e1eb41f
lib/promscrape: support for simple HTTP proxies without CONNECT
method support such as https://github.com/prometheus-community/PushProx
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1179
2021-04-04 00:40:58 +03:00
Aliaksandr Valialkin
27d8a5d2c0
lib/promscrape: add tests for authorization
config, which has been added in df148f48b7
2021-04-03 22:14:03 +03:00
Aliaksandr Valialkin
ae5c20a34c
lib/proxy: log response body on non-200 response code
...
This should improve debuggability for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1179
2021-04-03 03:03:41 +03:00
Aliaksandr Valialkin
87700f1259
lib/promscrape: add support for authorization
config in -promscrape.config
as Prometheus 2.26 does
...
See https://github.com/prometheus/prometheus/pull/8512
2021-04-02 21:20:37 +03:00
Aliaksandr Valialkin
2825a1e86d
lib/promscrape: add follow_redirect
option to scrape_configs
section like Prometheus does
...
See https://github.com/prometheus/prometheus/pull/8546
2021-04-02 21:20:37 +03:00
Aliaksandr Valialkin
245eba8896
lib/promscrape/discovery/kubernetes: properly track objects with the same names in multiple namespaces
...
This is a follow-up for 12e4785fe8
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:46:34 +03:00
Aliaksandr Valialkin
eee860f83d
lib/promscrape/discovery/kubernetes: properly discover targets in multiple namespaces
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:29:24 +03:00
Nikolay
00157f3ec6
Adds aws ECS credentials support ( #1175 )
2021-04-02 13:11:03 +03:00
Aliaksandr Valialkin
512addc608
app/{vminsert,vmagent}: add -sortLabels
command-line option for sorting time series labels before ingesting them in the storage
...
This option can be useful when samples for the same time series are ingested with distinct order of labels.
For example, metric{k1="v1",k2="v2"} and metric{k2="v2",k1="v1"}.
2021-03-31 23:27:21 +03:00
Aliaksandr Valialkin
ae1c653d55
lib/storage: reduce memory usage when ingesting samples for the same time series with distinct order of labels
2021-03-31 21:22:40 +03:00
Aliaksandr Valialkin
887e67f13b
lib/uint64set: improve Set.Has() performance scalability on multi-CPU system
...
Do not update bucket32.hint on Set.Has() call, since it leads to memory ping-pong between CPU cores multi-CPU system
2021-03-29 12:34:19 +03:00
Aliaksandr Valialkin
940a547116
lib/storage: do not update b.nextIdx if no samples are removed because of retention
2021-03-29 12:13:38 +03:00
Aliaksandr Valialkin
7d87d42a91
lib/promscrape/discovery/kubernetes: typo fix in error message
2021-03-26 12:46:33 +02:00
Aliaksandr Valialkin
a920e71809
lib/promscrape/discovery/kubernetes: properly handle too old resource version
error message from Kubernetes watch API
2021-03-26 12:28:35 +02:00
Aliaksandr Valialkin
9c2be144cf
app/vmselect: log the metric which trigger rollup result cache reset
...
This should help finding the source of stale metrics
2021-03-25 21:32:28 +02:00
Aliaksandr Valialkin
f971fe86cd
lib/storage: tune loopsCountPerMetricNameMatch according to production workload
2021-03-25 13:48:17 +02:00
Aliaksandr Valialkin
6b1f807418
app/vmagent: add -promscrape.consul.waitTime
command-line flag for configuring Consul service discovery wait time
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1144
2021-03-23 19:34:12 +02:00
Aliaksandr Valialkin
9947c65df3
lib/storage: do not reload metricName for the same metricID in Search.NextMetricBlock
...
This should speed up Search.NextMetricBlock a bit
2021-03-23 17:59:34 +02:00
Nikolay
19a40faf8e
changes consul_service label value ( #1143 )
...
according to prometheus discovery.
It should mitigate issue with case sensetive services
https://github.com/hashicorp/consul/issues/5707
2021-03-23 15:37:06 +02:00
Aliaksandr Valialkin
12ca0efc19
lib/storage: respect the deadline passed to Storage.SearchMetricNames
2021-03-22 23:03:00 +02:00
Aliaksandr Valialkin
40e47935e7
lib/storage: improve Search.NextMetricBlock performance by using MetricID->MetricName cache
2021-03-22 23:02:59 +02:00
Aliaksandr Valialkin
1618b0ca6d
lib/storage: tune loopsCountPerMetricNameMatch
2021-03-22 12:57:54 +02:00
Aliaksandr Valialkin
7503111feb
lib/storage: small code simplification after 6cee5338b2
2021-03-18 15:22:39 +02:00
Aliaksandr Valialkin
4443254fb9
lib/storage: prevent from infinite loop if {__graphite__="..."}
filter matches a metric name with *
, [
or {
chars
...
The idea has been borrowed from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1137
2021-03-18 14:57:39 +02:00
Aliaksandr Valialkin
e03233f441
lib/fs: reduce the frequency of failed to remove directory ... due to NFS lock
log warnings
...
Log `failed to remove directory ... due to NFS lock` warning only if the directory cannot be removed in one second.
2021-03-18 13:23:43 +02:00
Aliaksandr Valialkin
b859fe7879
lib/storage: faster move heavy filters to the end of list
2021-03-17 15:11:56 +02:00
Aliaksandr Valialkin
5e77a939c2
all: make go vet
happy
2021-03-17 00:48:44 +02:00
Aliaksandr Valialkin
41fe707bec
lib/storage: limit loops count in order to reduce max CPU usage during filter search
2021-03-17 00:48:44 +02:00
Aliaksandr Valialkin
e2a0c8bd72
lib/storage: do not modify filterLoopsCount stats with loopsCount stats
...
Such a modification can result in incorrect filter sorting later
2021-03-17 00:48:44 +02:00
Aliaksandr Valialkin
b997f4a418
all: make golangci-lint happy after the commit 6378205415
2021-03-17 00:24:31 +02:00
Aliaksandr Valialkin
8005ba26b9
lib/netutil: enable IPv6 UDP listening if -enableTCP6
command-line flag is passed to VictoriaMetrics
...
This is a follow-up for 18cfc4be7b
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1131
2021-03-17 00:19:30 +02:00
Nikolay
b293f81eb8
Adds udp6 support for ingest servers ( #1134 )
...
with flag -enableUDP6 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1131
2021-03-17 00:19:29 +02:00
Aliaksandr Valialkin
1cfe3f872f
lib/uint64set: optimize bucket16.addMulti a bit
2021-03-16 21:08:38 +02:00
Aliaksandr Valialkin
727ded9d4e
lib/storage: time series search optimization according to production workload profiling
...
Do not pass filter metric ids to getMetricIDsForTagFilter, since it has been appeared that this slows down
the function by multiple times when it finds big number of metricIDs (tens of millions).
2021-03-16 20:08:43 +02:00
Aliaksandr Valialkin
f4a44d6c0d
lib/storage: further tuning for time series search
2021-03-16 18:47:29 +02:00
Aliaksandr Valialkin
d074326970
app/vmstorage: add -logNewSeries
command-line flag for determining the source of series churn rate
2021-03-15 22:40:28 +02:00
Aliaksandr Valialkin
c4d0c20de2
lib/influxutils: return response compatible with InfluxDB 1.8.4
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1124
2021-03-15 22:20:38 +02:00
f41gh7
b7c9f3676e
typo fix
2021-03-15 22:20:37 +02:00
Aliaksandr Valialkin
e2717d84c0
all: various fixes in command-line flag descriptions
2021-03-15 22:03:49 +02:00
Aliaksandr Valialkin
776b8b32ca
app/{vminsert,vmagent}: a follow-up for b1aa8c3d8f
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1124
2021-03-15 22:03:49 +02:00
Aliaksandr Valialkin
fc902734d9
lib/storage: further tuning for time series selector code
2021-03-15 20:32:37 +02:00
Aliaksandr Valialkin
dc70e7de76
lib/uint64set: reduce the size of bucket16 by storing smallPool by pointer.
...
This reduces CPU time spent on bucket16 copying inside bucket13.addBucketAtPos.
2021-03-15 17:27:03 +02:00
Aliaksandr Valialkin
cc6b1a5941
lib/uint64set: optimize Set.AddMulti for large sorted sets
2021-03-15 17:10:54 +02:00
Aliaksandr Valialkin
f023e92f1a
lib/uint64set: optimize bucket16.add and bucket16.addMulti a bit
2021-03-15 16:58:24 +02:00
Aliaksandr Valialkin
1c26020080
lib/storage: tune per-day index search
2021-03-15 13:36:36 +02:00
Aliaksandr Valialkin
7f52aae20c
lib/promscrape: an attempt to reduce memory usage when vmagent scrapes targets with varying number of metrics
...
Do not cache too big byte buffers and too big writeRequestCtx objects,
since it is cheaper to re-create them instead of wasting RAM for their caching.
This reverts 7f6f350ee1
2021-03-15 11:49:29 +02:00
Aliaksandr Valialkin
33cd6c26d3
lib/promscrape: return back the logic for flushing big buffers to storage from the commit 3fd8653b40
...
This should reduce memory usage when vmagent scrapes targets with big number of metrics and `-promscrape.streamParse` isn't enabled
2021-03-14 22:25:37 +02:00
Aliaksandr Valialkin
894246176f
lib/promscrape/discovery/kubernetes: do not start object watcher until initial objects are loaded
2021-03-14 21:56:16 +02:00
Aliaksandr Valialkin
9e55db4a53
lib/promscrape: retry service discovery in a few seconds if it starts returning 0 targets
...
This should reduce recovery time from temporary issues during service discovery
2021-03-14 21:56:16 +02:00
Aliaksandr Valialkin
3b46ae1c05
lib/promscrape: remove duplicate target
word in error message
2021-03-14 21:56:16 +02:00
Aliaksandr Valialkin
b0b28eeb93
lib/promscrape/discovery/kubernetes: further optimize kubernetes service discovery for the case with many scrape jobs
...
Do not re-calculate labels per each scrape job - reuse them instead for scrape jobs with identical Kubernetes role
2021-03-14 21:16:41 +02:00
Aliaksandr Valialkin
620f05cd2c
lib/promscrape/discovery: fixes after 133b288681
...
- Removed a deadlock in addAPIWatcher
- Do not create unused ScrapeWork objects
- Do not spend CPU resources on creating objectByKey map in addAPIWatcher
This work is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1125
2021-03-13 15:22:38 +02:00
Aliaksandr Valialkin
54f902467d
lib/proxy: there is no need in cloning tlsCfg, which has been created two lines above
2021-03-12 10:48:01 +02:00
Aliaksandr Valialkin
72a8fa484b
lib/proxy: set proxy address in tls.Config.ServerName instead of the target address
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116
2021-03-12 10:41:25 +02:00
Aliaksandr Valialkin
60e0280a94
lib/promscrape: add ability to configure proxy options via proxy_tls_config
, proxy_basic_auth
, proxy_bearer_token
and proxy_bearer_token_file
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116
2021-03-12 03:36:11 +02:00
Aliaksandr Valialkin
b2732575f7
lib/storage: further tune filters sorting logic
2021-03-12 00:51:35 +02:00
Aliaksandr Valialkin
8fc29ffc67
lib/promscrape/discovery/kubernetes: use a single watcher per apiURL
...
Previously multiple scrape jobs could create multiple watchers for the same apiURL. Now only a single watcher is used.
This should reduce load on Kubernetes API server when many scrape job configs use Kubernetes service discovery.
2021-03-11 17:04:14 +02:00
Aliaksandr Valialkin
8b8d4cbcfe
lib/proxy: do not show inline basic auth passwords when logging errors related to proxy_url
2021-03-11 13:44:14 +02:00
Aliaksandr Valialkin
41f641b132
lib/promscrape/discovery/kubernetes: localize Bookmark parsing code
...
This is a follow-up for e772d1c920
2021-03-11 13:08:56 +02:00
Aliaksandr Valialkin
6c9cd3f7c1
lib/promscrape/discovery/kubernetes: reduce load on Kubernetes API server by using watch bookmarks
...
This allows continuing object watch from the last bookbark instead of reloading all the objects
on watch errors or timeouts.
See https://kubernetes.io/docs/reference/using-api/api-concepts/#watch-bookmarks
2021-03-10 15:08:40 +02:00
Aliaksandr Valialkin
bd8b7a88a7
lib/httpserver: export vm_available_memory_bytes
and vm_available_cpu_cores
metrics
...
These metrics are useful for tracking the available memory and CPU cores for VictoriaMetrics apps.
2021-03-10 12:08:26 +02:00
Aliaksandr Valialkin
e15f3f4f2a
lib/proxy: pass proxy hostname in Host
header of the CONNECT
request
...
This should resolve the following issue when connecting to tls proxy:
cannot validate certificate for ... because it doesn't contain any IP SANs
2021-03-09 20:41:18 +02:00
Aliaksandr Valialkin
9d8223eafb
lib/proxy: set missing ServerName in TLS config for proxy_url
.
...
While at it, allow setting Proxy-Authorization for `proxy_url` via `basic_auth` and `bearer_token` configs.
2021-03-09 19:01:14 +02:00
Nikolay
1310f84122
Changes tlsConfig init for proxy connections ( #1121 )
...
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116
2021-03-09 19:01:13 +02:00
Aliaksandr Valialkin
0554430d7e
lib/promscrape: apply sample_limit
after metric relabeling is applied as Prometheus does
...
See the description for `sample_limit` option from Prometheus docs:
Per-scrape limit on number of scraped samples that will be accepted.
If more than this number of samples are present after metric relabeling
the entire scrape will be treated as failed. 0 means no limit.
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
2021-03-09 15:52:41 +02:00
Aliaksandr Valialkin
7b66c8cbf8
lib/promscrape/discovery/kubernetes: remove too verbose logs about starting and stopping the watchers
...
Log the number of objects loaded per each watch url
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-09 15:07:12 +02:00
John Belmonte
edf39aa225
spelling fix: adjacent ( #1115 )
2021-03-09 09:19:16 +02:00
Aliaksandr Valialkin
502fab797a
lib/promscrape: add scrape_offset
option to scrape_config
...
This option can be used for specifying the particular offset per each scrape interval for target scraping
2021-03-08 11:59:32 +02:00
Aliaksandr Valialkin
c4a0bd5eac
lib/storage: go fmt
2021-03-08 11:59:31 +02:00
Aliaksandr Valialkin
c76a904bb0
lib/storage: tune loopsCount estimations in getMetricIDsForTagFilterSlow
...
The adjusted estmations give up to 2x lower median response times on 200qps /api/v1/query_range workload
2021-03-07 21:17:48 +02:00
Aliaksandr Valialkin
c04505e585
lib/promscrape/discovery/kubernetes: reduce memory usage further when big number of scrape jobs are configured for the same kubernetes_sd_config
role
...
Serialize reloading per-role objects, so they don't occupy too much memory when objects for many scrape jobs are simultaneously refreshed.
Do not reload per-role objects if they were already refreshed by concurrent goroutines. This should reduce load on Kubernetes API server
when big number of scrape jobs are configured for the same Kubernetes role.
This is a follow-up for 17b87725ed
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-07 20:03:22 +02:00
Aliaksandr Valialkin
175466bb41
lib/decimal: prevent exponent overflow when processing values close to zero
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1114
2021-03-05 18:53:41 +02:00
Aliaksandr Valialkin
5807ff57f3
lib/promscrape/discovery/kubernetes: reduce memory usage when Kubernetes service discovery is configured on a big number of scrape jobs
...
Previously vmagent was creating a separate Kubernetes object cache per each scrape job.
This could result in increased memory usage when monitoring a Kubernetes cluster with big number of objects (pods / nodes / services, etc.)
as seen at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
Now it uses a shared map of scrape objects across multiple scrape jobs.
2021-03-05 17:32:33 +02:00
Aliaksandr Valialkin
92ddb8f197
lib/promscrape/discovery/kubernetes: move apiWatcher code to a separate file
2021-03-05 17:32:32 +02:00
Aliaksandr Valialkin
02c0959380
lib/promscrape: make cluster membership calculations consistent across 32-bit and 64-bit architectures
2021-03-05 09:06:08 +02:00
Aliaksandr Valialkin
133fb9fc00
lib/promscrape: add -promscrape.cluster.replicationFactor
command-line flag for replicating scrape targets among vmagent
instances in the cluster
2021-03-04 10:21:27 +02:00
Aliaksandr Valialkin
bae7a1b47a
lib/promscrape/discovery/kubernetes: fix tests after e154f4a644
2021-03-03 22:42:04 +02:00
Nikolay
7d92ef3acd
Fix ingress discovery api ( #1110 )
2021-03-03 10:45:50 +02:00
Aliaksandr Valialkin
25f453ce1a
lib/promscrape/discovery/kubernetes: properly check for nil pointer inside interface
...
See https://mangatmodi.medium.com/go-check-nil-interface-the-right-way-d142776edef1
This fixes a panic when the ScrapeWork is filtered out in swcFunc.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1108
2021-03-03 10:42:54 +02:00
Aliaksandr Valialkin
fea27b9845
lib/promscrape: go fmt
2021-03-02 21:20:23 +02:00
Aliaksandr Valialkin
c67a07b469
lib/handshake: log read/write operation duration on connection errors
...
This improve debuggability of network errors
2021-03-02 21:20:20 +02:00
Aliaksandr Valialkin
c8dde1fd6b
lib/storage: typo fix: umarshal -> unmarshal
2021-03-02 20:48:44 +02:00
Aliaksandr Valialkin
3fbe2bf1c8
lib/promscrape: pre-allocate space for a map in mergeLabels
...
This should reduce the number of memory allocations when discovering big number of targets
2021-03-02 18:42:44 +02:00
Aliaksandr Valialkin
ac5c47a9f5
lib/promscrape/discovery: properly track vm_promscrape_discovery_kubernetes_objects_removed_total metric
2021-03-02 18:33:29 +02:00
Aliaksandr Valialkin
62d7e07ff7
lib/promrelabel: remove unneded optimizations for labeldrop
and labelkeep
actions
...
These optimizations may slow down code execution by matching the same label against regexp two times instead of a single time
2021-03-02 18:01:08 +02:00