Aliaksandr Valialkin
db8c4054e5
lib/promscrape: fix errors in test config
...
The errors were discovered after enabling strict parse mode by default.
See 9bb60ab00f
2022-02-08 19:56:37 +02:00
Aliaksandr Valialkin
4507b111a9
lib/blockcache: split the cache into multiple shards
...
This should reduce contention on cache mutex on hosts with many CPU cores,
which, in turn, should increase overall throughput for the cache.
This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-02-08 19:44:29 +02:00
Aliaksandr Valialkin
2455a988e4
lib/mergeset: tune sizes for indexdb/dataBlocks
and indexdb/indexBlocks
according to production workload
...
This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007#issuecomment-1032308742
2022-02-08 17:58:49 +02:00
Aliaksandr Valialkin
9bb60ab00f
lib/promscrape: set -promscrape.config.strictParse
to true by default
...
This allows detecting long-living silent errors in -promscrape.config
2022-02-08 15:41:43 +02:00
Aliaksandr Valialkin
a19e7f8c5b
lib/blockcache: make fmt
2022-02-08 15:24:11 +02:00
Aliaksandr Valialkin
d0f785defd
lib/blockcache: eliminate possible race when Cache.Put is called for the same entry from multiple goroutines
...
The race could result in incorrect cache size tracking, which, in turn, could result in too frequent cache cleaning
2022-02-08 01:10:43 +02:00
Aliaksandr Valialkin
46bd2c4d6d
lib/blockcache: increase the lifetime for rarely accessed blocks from 2 minutes to 5 minutes
...
This should improve data ingestion speed if time series samples are ingested with interval bigger than 2 minutes.
The actual interval could exceed 2 minutes if the original interval between samples doesn't exceed 2 minutes
in the case of slow inserts. Slow inserts may appear in the following cases:
* Big number of new time series are pushed to VictoriaMetrics, so they couldn't be registered in 2 minutes.
* MetricName->tsid cache reset on indexdb rotation or due to unclean shutdown.
In this case VictoriaMetrics needs to load MetricName->tsid entries for all the incoming series from IndexDB.
IndexDB uses the block cache for increasing lookup performance. If the cache has no the needed block,
then IndexDB reads and unpacks the block from disk. This requires an extra disk read IO and CPU.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
This also should increase performance for periodically executed queries with intervals from 2 minutes to 5 minutes.
See the previous similar commit - 43103be011
It is possible that the timeout can be increased further. Let's collect production numbers for this change
so the timeout could be adjusted further.
2022-02-08 00:15:56 +02:00
Aliaksandr Valialkin
e86b7cc9a5
lib/workingsetcache: use the original cache size limits when rotating caches
...
Previously limits for new caches were taken from cache stats.
These limits could mismatch the original limits. This could result in failed cache load
if the stored cache has been created with the limits obtained from cache stats.
2022-02-08 00:10:14 +02:00
Aliaksandr Valialkin
cde4664f0d
lib/blockcache: return proper number of entries from the cache
...
This has been broken in 0d7374ad2f
2022-02-07 19:28:42 +02:00
Aliaksandr Valialkin
b5b3c585b3
lib/promscrape: show the total number of scrapes and the total number of scrape errors per target at /targets page
...
This information may be useful when debugging unreliable scrape targets
2022-02-03 20:22:41 +02:00
Aliaksandr Valialkin
2968779f16
lib/promscrape: provide the ability to fetch target responses on behalf of vmagent or single-node VictoriaMetrics
...
This feature may be useful when debugging metrics for the given target located in isolated environment
2022-02-03 19:00:55 +02:00
Aliaksandr Valialkin
9c62b25ad6
lib/mergeset: pre-allocate data and items for inmemoryBlock in order to reduce memory allocations under high churn rate
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-02-01 00:57:14 +02:00
Aliaksandr Valialkin
4bdd10ab90
lib/bytesutil: split Resize* funcs to MayOverallocate and NoOverallocate for more fine-grained control over memory allocations
...
Follow-up for f4989edd96
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-02-01 00:18:42 +02:00
Aliaksandr Valialkin
e13ce2ee98
lib/encoding: substitute 64-bits.LeadingZeros64()
with bits.Len64()
2022-01-31 23:36:48 +02:00
Aliaksandr Valialkin
a8509c112a
lib/storage: avoid allocations of tsidPrev on every blockStreamReader.NextBlock() call
...
This is a follow-up for 00b7c97d2a
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2082
2022-01-31 22:46:53 +02:00
Aliaksandr Valialkin
f50cf60534
lib/cgroup: fall back to runtime.NumCPU() when determining process_cpu_cores_available metric if it is impossible to determine cpu quota via cgroups
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2107
2022-01-31 20:30:14 +02:00
Aliaksandr Valialkin
ead66155ef
lib/cgroup: expose process_cpu_cores_available
metric
...
This metric shows the number of CPU cores available to the process.
This allows creating alerting rules on CPU saturation with the following query:
rate(process_cpu_seconds_total[5m]) / process_cpu_cores_available > 0.9
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2107
2022-01-31 20:24:41 +02:00
Aliaksandr Valialkin
96aa3761fc
lib/storage/table.go: add missing tb.ptwsLock.Unlock()
before the return
...
This is a follow-up for a1083d0531
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2103
2022-01-28 14:15:42 +02:00
匠心零度
1999bbfe82
optimized code ( #2103 )
...
* optimized code ,because only the first error,so no need var errors []error
* optimized code ,because only the first error,so no need var errors []error
Co-authored-by: lirenzuo <lirenzuo@shein.com>
2022-01-28 14:15:41 +02:00
Aliaksandr Valialkin
f4989edd96
lib/bytesutil: split Resize() into ResizeNoCopy() and ResizeWithCopy() functions
...
Previously bytesutil.Resize() was copying the original byte slice contents to a newly allocated slice.
This wasted CPU cycles and memory bandwidth in some places, where the original slice contents wasn't needed
after slize resizing. Switch such places to bytesutil.ResizeNoCopy().
Rename the original bytesutil.Resize() function to bytesutil.ResizeWithCopy() for the sake of improved readability.
Additionally, allocate new slice with `make()` instead of `append()`. This guarantees that the capacity of the allocated slice
exactly matches the requested size. The `append()` could return a slice with bigger capacity as an optimization for further `append()` calls.
This could result in excess memory usage when the returned byte slice was cached (for instance, in lib/blockcache).
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-25 15:24:44 +02:00
Aliaksandr Valialkin
91f2af2d7a
lib/mergeset: allocate the needed amounts of memory when unmarshaling inmemoryBlock
...
This should reduce the memory required for indexdb/dataBlocks cache.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-24 18:50:40 +02:00
Aliaksandr Valialkin
4c13bae1cf
lib/logger: removed broken test after 746ee191e8
2022-01-24 12:14:32 +02:00
Aliaksandr Valialkin
746ee191e8
lib/logger/throttler.go: show the original location of the error and warning message
...
Previously the location inside LogThrottler implementation was shown. This could complicate debugging.
2022-01-23 13:55:00 +02:00
Aliaksandr Valialkin
0d7374ad2f
lib/blockcache: optimize blockcache a bit
...
- Optimize Cache.RemoveBlocksFromPart(), so it doesn't need to iterate over all the cached blocks.
- Cache blocks if there were no cache misses during the last 2 minutes.
This may be the case when new blocks are added simultaneously to the storage and to the cache.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-23 13:13:45 +02:00
Aliaksandr Valialkin
ede93469ea
lib/mergeset: tune caches size limits for indexdb/dataBlocks
and indexdb/indexBlocks
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-21 12:45:43 +02:00
Aliaksandr Valialkin
5f84b17ed6
lib/storage: properly limit cardinality when ingesting multiple samples for the same time series in a single request
2022-01-21 12:38:09 +02:00
Aliaksandr Valialkin
00b7c97d2a
lib/storage: verify that blocks in a single part are sorted by TSID when reading sequential blocks from the part
...
This may help narrowing down the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2082
2022-01-20 20:36:37 +02:00
Aliaksandr Valialkin
ea87f21e23
lib/storage: set bsm.Block to nil on error, so the previous block couldn't be used.
...
This may help nailing down the issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2082
2022-01-20 20:13:14 +02:00
Aliaksandr Valialkin
9797c928ef
lib/blockcache: add missing dependency after 145337792d
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-20 18:50:44 +02:00
Aliaksandr Valialkin
145337792d
lib/{mergeset,storage}: properly limit cache sizes for indexdb
...
Previously these caches could exceed limits set via `-memory.allowedPercent` and/or `-memory.allowedBytes`,
since limits were set independently per each data part. If the number of data parts was big, then limits could be exceeded,
which could result to out of memory errors.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2007
2022-01-20 18:37:17 +02:00
Aliaksandr Valialkin
1d05444b33
lib/promscrape: expose promscrape_stale_samples_created_total metric for monitoring the number of created stale samples
2022-01-14 01:00:46 +02:00
Aliaksandr Valialkin
80f03177c4
lib/promscrape/discovery/kubernetes: add __meta_kubernetes_node_provider_id
label for discovered Kubernetes nodes in the same way as Prometheus does
...
See https://github.com/prometheus/prometheus/pull/9603
2022-01-13 23:16:02 +02:00
Aliaksandr Valialkin
355a63733d
lib/promscrape/discovery/kubernetes: add the ability to limit service discovery to the current namespace
...
See https://github.com/prometheus/prometheus/issues/9782 and https://github.com/prometheus/prometheus/pull/9881
2022-01-13 22:44:35 +02:00
Aliaksandr Valialkin
17eb86a689
lib/promscrape/discovery/dockerswarm: follow up after 68a117a25a
...
- Document the bugfix at docs/CHANGELOG.md
- Set __address__ field after copying commonLabels to the resulting map of discovered labels.
This makes sure that the correct __address__ label is used.
2022-01-11 09:20:10 +02:00
Alexander Shtuchkin
68a117a25a
Fix for #2038 : Make correct __address__ value for dockerswarm promscrape ( #2041 )
2022-01-11 08:59:06 +02:00
Aliaksandr Valialkin
e4e36383e2
lib/promscrape: do not send staleness markers on graceful shutdown
...
This follows Prometheus behavior.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2013#issuecomment-1006994079
2022-01-07 01:17:57 +02:00
Aliaksandr Valialkin
178dd87e26
lib/storage: follow-up for 38bf5fc136
2022-01-05 16:00:11 +02:00
weng zhao
38bf5fc136
vmstorage: fix query like {foo=~"bar|"}
return extra timeseries cause by negative filter transformation malfunction ( #2032 )
...
1. L2749 make kb.B remain the value of comonPrefix instead of tf.prefix
2. L2762 avoid change tf.value from "bar|" to ".+r|"
2022-01-05 15:59:15 +02:00
Aliaksandr Valialkin
cbaa2af280
lib/promscrape: scrape replicated targets at different offsets in vmagent replicated clustering mode
...
This guarantees that the deduplication consistently leaves samples from the same vmagent replica.
See https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets
2021-12-23 00:20:39 +02:00
Nikolay
8ff7da7202
adds restore.lock ( #1988 )
...
* adds restore.lock
it must prevent from running storage after incomplete restore process
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1958
* return back flock file deletion
* Apply suggestions from code review
* wip
* docs/CHANGELOG.md: document https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1958
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-22 13:10:15 +02:00
Aliaksandr Valialkin
ce333f28d8
all: use logger.WithThrottler() where appropriate
2021-12-21 17:03:25 +02:00
Roman Khavronenko
34fdc8881b
vmagent: add error log for skipped data block when rejected by receiv… ( #1956 )
...
* vmagent: add error log for skipped data block when rejected by receiving side
Previously, rejected data blocks were silently dropped - only metrics were update.
From operational perspective, having an additional logging for such cases is preferable.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1911
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* vmagent: throttle log messages about skipped blocks
The new type of logger was added to logger pacakge.
This new type supposed to control number of logged messages
by time.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* lib/logger: make LogThrottler public, so its methods can be inspected by external packages
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-21 16:36:09 +02:00
Aliaksandr Valialkin
b9363d9726
lib/promscrape: take into account the original job_name when creating an unique key per each scrape target
...
This should handle the case when the original job_name has been changed in -promscrape.config ,
while the resulting job label remains the same because it is overriden via relabeling.
2021-12-20 18:38:05 +02:00
Aliaksandr Valialkin
afafeb379a
all: typo fix: unexected -> unexpected
2021-12-20 17:39:52 +02:00
Aliaksandr Valialkin
5a36e241f4
lib/persistentqueue: check that readerOffset doesnt exceed writerOffset after each readerOffset increase
...
This should help detecting the source of the panic from https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1981
2021-12-20 17:25:11 +02:00
Aliaksandr Valialkin
8a7f08ded3
lib/storage: properly update per-part min_dedup_interval
file contents after merge
...
Previously 0s was always written even if -dedup.minScrapeInterval was set to non-zero value
This is a follow-up for 4ff647137a
2021-12-17 20:13:24 +02:00
Aliaksandr Valialkin
a3adf24527
lib/promscrape: allow up to 5 redirects when scraping a target by default
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1945
2021-12-16 00:14:14 +02:00
Aliaksandr Valialkin
4ff647137a
lib/storage: deduplicate samples more thoroughly
...
Previously some duplicate samples may be left on disk for time series with high churn rate.
This may result in higher disk space usage.
2021-12-15 15:59:58 +02:00
Aliaksandr Valialkin
92070cbb67
lib/storage: return dedup interval in milliseconds from GetDedupInterval()
...
This removes duplicate .Milliseconds() calls after GetDedupInterval() calls.
2021-12-15 13:26:38 +02:00
Aliaksandr Valialkin
1d20a19c7d
lib/storage: explicitly pass dedupInterval to DeduplicateSamples() and deduplicateSamplesDuringMerge()
...
This improves the code readability and debuggability, since the output of these functions
stops depending on global state.
2021-12-14 20:49:12 +02:00
Aliaksandr Valialkin
e1a715b0f5
lib/storage: convert alternate regexps into Graphite wildcards inside __graphite__
pseudo-label
...
For example, `{__graphite__=~"foo.(bar|baz)"}` is automatically converted to `{__graphite__=~"foo.{bar,baz}"}` before execution.
This allows using multi-value Grafana template variables such as `{__graphite__=~"foo.($app)"}`.
2021-12-14 19:51:49 +02:00
Yury Molodov
c1fd93e8a0
vmui: multiple queries ( #1916 )
...
* feat: change duration by "enter"
* fix: optimize data processing for chart
* feat: set minimum step to 1ms
* update dependencies
* feat: remove save the last query to local storage
* fix: handle an error in a table with subqueries
* feat: store display type in URL
* Revert "feat: store display type in URL"
This reverts commit ccc242c69a
.
* feat: store display type in URL
* refactor: move the time setting to a folder
* refactor: move the query configurator to a folder
* refactor: move the auth settings to a folder
* feat: improve styles
* feat: add multi query
* update package-lock
* feat: add display multiple queries
* feat: add limits for multiple queries
* update dependencies
* feat: add history for multiple queries
* feat: add line type to legend
* feat: change style for switch
* feat: change the logic for axes limits for multiple queries
* update package-lock.json
* update dependencies
* feat: add the filter to legend
* wip
* lib/httpserver: add missing 127.0.0.1 hostname to the logged address for http and pprof server if the address starts with ':'
This allows copy-pasting the url to http server from logs.
* lib/httpserver: add missing 127.0.0.1 hostname to the logged address for http and pprof server if the address starts with ':'
This allows copy-pasting the url to http server from logs.
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-08 16:40:15 +02:00
Aliaksandr Valialkin
45d082bbe2
app/vminsert: add -maxLabelValueLen
command-line flag
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1908
2021-12-06 11:40:34 +02:00
Aliaksandr Valialkin
da402fbdfa
lib/workingsetcache: fix unaligned 64-bit atomic operation
panic on 32-bit architectures
...
The panic has been introduced in 7275ebf91a
2021-12-03 01:21:51 +02:00
Aliaksandr Valialkin
06642d97f5
app: allow specifying http and https urls in the following command-line flags
...
* -promscrape.config
* -relabelConfig
* -remoteWrite.relabelConfig
* -remoteWrite.urlRelabelConfig
2021-12-03 00:10:02 +02:00
Aliaksandr Valialkin
62b4efb3e7
app/vmauth: follow-up for 13368bed18
...
* Document the ability to specify http or https urls in `-auth.config` at docs/CHANGELOG.md
* Move the ReadFileOrHTTP to lib/fs, so it can be re-used in other places where a file
should be read from the given path. For example, in `-promscrape.config` at `vmagent`.
2021-12-02 23:32:05 +02:00
Aliaksandr Valialkin
394a345ae0
lib/httpserver: expose /-/healthy
and /-/ready
endpoints as Prometheus does
...
This improves integration with third-party solutions, which rely on these endpoints.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1833
2021-12-02 14:36:58 +02:00
Aliaksandr Valialkin
90c542af12
app: use relative paths instead of absolute paths for the supported http handlers on the main page
...
This allows hiding VictoriaMetrics components behind proxies, which serve pages at different path prefixes
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1858
2021-12-02 13:52:39 +02:00
Aliaksandr Valialkin
03f5ad3060
lib/protoparser/graphite: allow multiple separators between metric name, value and timestamp
2021-12-02 13:43:49 +02:00
Aliaksandr Valialkin
49a18b8660
lib/protoparser/graphite: properly parse Graphite line with whitespace after the timestamp
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1865
2021-12-02 13:33:26 +02:00
Aliaksandr Valialkin
c0cbf0de2a
app/{vmbackup,vmrestore}: export internal metrics at /metrics
http handler
2021-12-02 11:55:58 +02:00
Aliaksandr Valialkin
7275ebf91a
app/vmstorage: export vm_cache_size_max_bytes metrics for determining capacity of various caches
...
The vm_cache_size_max_bytes metric can be used for determining caches which reach their capacity via the following query:
vm_cache_size_bytes / vm_cache_size_max_bytes > 0.9
2021-12-02 10:30:43 +02:00
Aliaksandr Valialkin
2f63dec2e3
lib/fs: add vm_filestream_read_duration_seconds_total
and vm_filestream_write_duration_seconds_total
metrics
...
These metrics help determining persistent disk saturation with `rate(vm_filestream_read_duration_seconds_total) > 0.9`
2021-12-02 10:30:42 +02:00
Aliaksandr Valialkin
2fb5a6ca78
lib/storage: do not take into account -storage.minFreeDiskSpaceBytes during background merges
2021-12-01 11:02:36 +02:00
Nikolay
06eff5a72c
removes FileSize from backup part key ( #1872 )
...
* removes FileSize from backup part key
it should fix download restoration for backups
* Update lib/backup/common/part.go
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-01 11:01:28 +02:00
Aliaksandr Valialkin
d666755159
lib/storage: take into account -storage.minFreeDiskSpaceBytes
when performing big merges
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
2021-11-30 12:56:35 +02:00
guidao
f05cddd2fc
fix #1830 ( #1861 )
...
Co-authored-by: wangfeng <wangfeng@zhihu.com>
2021-11-30 01:12:24 +02:00
Aliaksandr Valialkin
ba927d1c77
lib/protoparser/prometheus: follow-up for 8e338632a3
...
Do not spend CPU time on error message formatting if error logger is disabled
2021-11-30 00:50:11 +02:00
Nikolay
8e338632a3
Changes unmarshallRow logger to noop for getRowsDiff ( #1835 )
2021-11-30 00:48:13 +02:00
Aliaksandr Valialkin
d44c585ca4
lib/protoparser: do not log connection reset by peer
error when reading the data via InfluxDB, Graphite and OpenTSDB protocols over plain TCP connections
...
This error is expected, so there is no need in spamming the log with this error.
2021-11-29 21:47:56 +02:00
Aliaksandr Valialkin
b688960db0
lib/persistentqueue: add vm_persistentqueue_read_duration_seconds_total and vm_persistentqueue_write_duration_seconds_total metrics for determining disk usage saturation at vmagent
2021-11-17 16:41:35 +02:00
Lan
b72eed1f5e
Add flag of S3ForcePathStyle ( #1802 )
2021-11-17 01:03:03 +02:00
Aliaksandr Valialkin
e5ac9d8e57
all: consistently return application/json
content-type without charset=utf-8
...
The `application/json` content-type has utf-8 encoding by default.
See https://stackoverflow.com/questions/9254891/what-does-content-type-application-json-charset-utf-8-really-mean
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897
2021-11-09 18:04:44 +02:00
Aliaksandr Valialkin
fd596945e7
lib/promscrape: improve logging for scrape_config_files
parse errors
...
Log the actual file path, which led to the parse error.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1789
2021-11-08 13:34:12 +02:00
Aliaksandr Valialkin
cbfc7b7c92
app/{vminsert,vmagent}: hide passwords and auth tokens by default at /config
page
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1764
2021-11-05 14:41:16 +02:00
Aliaksandr Valialkin
e73a82f7a5
lib/promauth: do not show empty values in oauth2
config section at /config
page
2021-11-05 12:53:39 +02:00
Aliaksandr Valialkin
aa534c2582
lib/promscrape: add -promscrape.maxResponseHeadersSize
command-line flag for tuning the maximum http response headers size from Prometheus scrape targets
2021-11-03 22:26:56 +02:00
Aliaksandr Valialkin
d1eb87c831
app/{vmagent,vminsert}: add ability to restrict access to /config page with authKey query arg
...
The authKey can be configured via `-configAuthKey` command-line flag.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1764
2021-11-01 16:44:54 +02:00
Aliaksandr Valialkin
bb87949d5c
lib/protoparser/influx: automatically detect timestamp precision depending on the number of decimal digits in the timestamp
2021-10-28 12:47:22 +03:00
Aliaksandr Valialkin
d0e7c0535e
lib/logger: show only explicitly set command-line flags in logs
...
This reduces initial verbosity in logs
2021-10-28 11:00:52 +03:00
Aliaksandr Valialkin
74b8af9891
lib/promscrape: add collapse
and expand
buttons per each group of targets from the same scrape job
2021-10-27 20:03:24 +03:00
Aliaksandr Valialkin
6608705652
app/{vmalert,vmagent}: improve the distribution of scrape offsets among targets / rules
...
Previously only the lower part of 64-bit hash was used for calculating the offset.
This may give uneven distribution in some cases. So let's use all the available 64 bits from the hash
for calculating the offset.
2021-10-27 19:59:16 +03:00
Aliaksandr Valialkin
e3a91b186a
lib/protoparser/prometheus: optimize GetRowsDiff() function
...
This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1745 ,
since the provided profile shows that the majority of CPU and memory is spent in this function
during `streamParse` when `-promscrape.noStaleMarkers` wasn't set.
2021-10-27 18:54:45 +03:00
Aliaksandr Valialkin
95d44157fc
lib/protoparser/prometheus: add a benchmark for GetRowsDiff
2021-10-27 18:53:54 +03:00
Aliaksandr Valialkin
1952ab99aa
all: fix build issues and tests for Apple M1
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1653
2021-10-27 15:06:34 +03:00
Aliaksandr Valialkin
4821adfd95
lib/promscrape: properly show proxy_url
option value at /config
page
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1755
2021-10-26 21:23:54 +03:00
Aliaksandr Valialkin
7fa15f7f86
lib/promscrape: do not populate response body to memory in stream parsing mode if -promscrape.noStaleMarkers is set
...
The response body isn't used if -promscrape.noStaleMarkers is set after the commit 2876137c92
,
so there is no sense in pupulating it in memory. This should reduce memory usage when scraping big responses.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1728#issuecomment-949630694
2021-10-22 16:44:44 +03:00
Aliaksandr Valialkin
6106d4069d
lib/promscrape: do not sort original labels and do not intern label string for the original labels before the sharding code is executed
...
This should reduce CPU and memory usage in shard mode when service discovery finds big number of scrape targets with many long labels.
See https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets
This is a follow-up after 9882cda8b9
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1728
2021-10-22 13:54:30 +03:00
Aliaksandr Valialkin
2876137c92
lib/promscrape: reduce memory usage if -promscrape.noStaleMarkers
command-line flag is passed
...
Do not store in memory the response from the last scrape per each target if -promscrape.noStaleMarkers option is enabled.
This should reduce memory usage when the scraped targets return large responses.
2021-10-22 13:10:29 +03:00
Nikolay
a3684fe3de
adds tab as second separator for graphite text protocol ( #1733 )
...
* adds tab as second separator for graphite text protocol
* changes indexFunc for indexAny
* Update lib/protoparser/graphite/parser_test.go
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-10-22 12:23:45 +03:00
Aliaksandr Valialkin
8991c8b589
lib/flagutil: do not expose sensitive info (passwords, keys and urls) at /flags page
2021-10-20 00:51:26 +03:00
Aliaksandr Valialkin
8ad95f0db7
lib/httpserver: expose command-line flags at /flags
page
...
This should simplify debugging.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1695
2021-10-20 00:45:09 +03:00
Aliaksandr Valialkin
676ad70d9f
lib/envflag: use flag.Set for setting the flags from env vars
...
This should make visible the set flags at flag.Visit(), which is used later for logging
and exporting the `is_set` label for these flags at /metrics page
2021-10-20 00:41:08 +03:00
Aliaksandr Valialkin
53bb58ed2a
lib/storage: log a warning when the -storageDataPath has less than -storage.minFreeDiskSpaceBytes
...
This should improve the debuggability of the readonly feature.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1727
2021-10-19 23:59:13 +03:00
Aliaksandr Valialkin
3408a05d12
lib/promscrape/discovery/kubernetes: log a warning if role: endpoints
discovers more than 1000 targets per a single endpoint
...
In this case `role: endpointslice` must be used instead.
See the following references:
* https://kubernetes.io/docs/reference/labels-annotations-taints/#endpoints-kubernetes-io-over-capacity
* https://github.com/kubernetes/kubernetes/pull/99975
* https://github.com/prometheus/prometheus/issues/7572#issuecomment-934779398
2021-10-19 13:20:40 +03:00
Nikolay
cbcc622786
changes job source for /target api ( #1723 )
...
use jobNameOriginal instead of relabeled as prometheus does
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1707
2021-10-19 08:49:36 +03:00
Aliaksandr Valialkin
c37f285466
lib/promscrape: set honor_timestamps: true
by default if this option isnt set explicitly in scrape configs
...
This aligns the behavior to Prometheus - see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
2021-10-16 20:49:08 +03:00
Aliaksandr Valialkin
c055bc478c
lib/promscrape: expose promscrape_series_limit_max_series
and promscrape_series_limit_current_series
metrics per each scrape target with the enabled unique series limiter
2021-10-16 18:47:13 +03:00
Aliaksandr Valialkin
06b0982d6b
lib/promscrape: always initialize http client for stream parsing mode
...
Stream parsing mode can be automatically enabled when scraping targets with big response bodies
exceeding the -promscrape.minResponseSizeForStreamParse , so it must be always initialized.
2021-10-16 13:18:23 +03:00
Aliaksandr Valialkin
32793adbd9
lib/promscrape: store the last scraped response in compressed form if its size exceeds -promscrape.minResponseSizeForStreamParse
...
This should reduce memory usage when scraping targets with big response bodies.
2021-10-16 13:00:30 +03:00
Aliaksandr Valialkin
9866dd95c1
lib/promscrape: store the full response in stream parsing mode in scrapeWork.lastScrape byte slice
...
This allows sending staleness marks and properly calculate scrape_series_added metric in stream parsing mode
at the cost of the increased memory usage, since now the potentially big response is kept
in the lastScrape byte slice per each scrapeWork.
In practice the memory usage increase shouldn't be big, since the response size
is usually much smaller than the parsed metrics from this response after the relabeling,
which usually adds a big pile of target-specific labels per each metric.
2021-10-15 15:39:23 +03:00
Aliaksandr Valialkin
f6d33596ff
lib/promscrape/discovery/kubernetes: rename endpointslices.go -> endpointslice.go in order to be consistent with EndpointSlice struct name
...
This is a follow-up for 31b42b30b6
2021-10-15 12:27:12 +03:00
Aliaksandr Valialkin
bbd34fa15e
lib/promscrape: add -promscrape.minResponseSizeForStreamParse
command-line option for automatic switching to stream parsing mode when scraping targets with big responses
...
This should reduce memory usage when vmagent scrapes targets with non-uniform response sizes.
This is common case in Kubernetes monitoring.
2021-10-14 12:29:35 +03:00
Aliaksandr Valialkin
1a7287c408
lib/promscrape: return error if sample_limit
or series_limit
options are set when stream parsing mode is enabled
2021-10-14 12:11:23 +03:00
Aliaksandr Valialkin
e3c8304deb
lib/promscrape: add ability to show the original labels for discovered targets at /targets page
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1698
2021-10-13 15:59:58 +03:00
Roman Khavronenko
c0a932a55f
lib/promscrape: make errcheck happy ( #1703 )
2021-10-13 14:57:30 +03:00
Aliaksandr Valialkin
9882cda8b9
lib/promscrape: shard targets among cluster nodes after relabeling is applied
...
This guarantees that targets with the same set of labels go to the same vmagent node.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1687#issuecomment-940629495
2021-10-12 17:06:00 +03:00
Aliaksandr Valialkin
5a58c041c2
app/vmagent: expose -promscrape.config contents at /config page as Prometheus does
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1695
2021-10-12 16:25:37 +03:00
Aliaksandr Valialkin
873aac584e
lib/promscrape: use Prometheus format for target labels at /targets
page
...
This should simplify copy-pasting the labels to/from PromQL / MetricsQL
2021-10-11 12:41:37 +03:00
Aliaksandr Valialkin
001750c239
lib/storage: fix unaligned access on 32-bit architectures.
...
The bug has been introduced at a171916ef5
2021-10-08 19:43:03 +03:00
Aliaksandr Valialkin
cf5cbd1c70
app/{vminsert,vmstorage}: follow-up after a171916ef5
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
2021-10-08 14:35:49 +03:00
Nikolay
4290b46e8c
Adds read-only mode for vmstorage node ( #1680 )
...
* adds read-only mode for vmstorage
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
* changes order a bit
* moves isFreeDiskLimitReached var to storage struct
renames functions to be consistent
change protoparser api - with optional storage limit check for given openned storage
* renames freeSpaceLimit to ReadOnly
2021-10-08 14:35:48 +03:00
Ziqi Zhao
402c995d6d
fix some typos ( #1678 )
...
Co-authored-by: 柘远 <zzq237937@alibaba-inc.com>
2021-10-06 14:43:10 +03:00
Aliaksandr Valialkin
6ee66fb6b1
lib/promscrape: reduce memory allocations in mergeLabels() after 48e3e6c8df
2021-09-30 16:56:12 +03:00
Aliaksandr Valialkin
463a5bf76e
lib/protoparser: go fmt
2021-09-29 21:19:00 +03:00
Aliaksandr Valialkin
58964d52a5
lib/protoparser/prometheus: compare invalid Prometheus lines in full
2021-09-29 19:41:28 +03:00
Aliaksandr Valialkin
d80d72efec
app/{vmbackup,vmrestore}: switch from gcs://...
to gs://...
urls for backups to GCS
...
The `gs://` urls are commonly used, so prefer them instead of `gcs://` urls,
while leaving support for `gcs://` urls for backwards compatibility.
2021-09-29 12:10:29 +03:00
Nikolay
3d17112a7e
changes auth validation for openstack ( #1663 )
...
* changes auth validation for openstack
must fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1655
* Apply suggestions from code review
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-09-29 00:28:49 +03:00
Aliaksandr Valialkin
91b3c601bc
app/{vminsert,vmagent}: add ability to ingest data via DataDog "submit metrics" API
...
See https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/206
2021-09-29 00:13:08 +03:00
Aliaksandr Valialkin
718eca33ab
lib/storage: properly handle {__name__=~"prefix(suffix1|suffix2)",other_label="..."}
queries
...
They were broken in the commit 00cbb099b6
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1644
2021-09-23 21:48:51 +03:00
Aliaksandr Valialkin
a0313c046b
lib/promscrape: add vm_promscrape_max_scrape_size_exceeded_errors_total
metric for counting of the failed scrapes due to the exceeded response size
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1639
2021-09-23 14:47:54 +03:00
Aliaksandr Valialkin
9ca1cbced1
lib/httpserver: add -enterprise
and/or -cluster
suffixes to short_version
label of vm_app_version
metric
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1635
2021-09-21 23:12:42 +03:00
Aliaksandr Valialkin
207c5760ce
lib/promrelabel: fix parsing regex: true
in relabeling rules
2021-09-21 23:00:53 +03:00
Nikolay
ad08d9dfc0
changes protoparser apis for accepting reading from io.Reader ( #1624 )
...
adds InsertHandlerForReader apis to vmagent
2021-09-20 14:49:28 +03:00
Nikolay
0e09fdb8b0
makes filters optional for ec2 api requests ( #1627 )
...
filters can be applied only for DescribeInstances requests, like prometheus does.
related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1626
2021-09-17 18:00:37 +03:00
Aliaksandr Valialkin
8f685d81c6
lib/storage: follow up after 00cbb099b6
2021-09-14 14:16:25 +03:00
faceair
00cbb099b6
lib/storage: optimize convert multiple values regexp filter to composite tag filter ( #1610 )
...
* lib/storage: optimize convert multiple values regexp filter to composite tag filter
* Apply suggestions from code review
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-09-14 12:47:07 +03:00
Aliaksandr Valialkin
7f0a8d4bdb
docs: consistency renaming: Influx -> InfluxDB
2021-09-13 17:05:16 +03:00
Aliaksandr Valialkin
fb6ed0ce19
lib/promscrape/discovery/docker: support host networking mode
...
See https://github.com/prometheus/prometheus/issues/9116
2021-09-13 13:30:16 +03:00
Aliaksandr Valialkin
6295861acd
lib/promscrape/discovery/kubernetes: properly use https scheme for wildcard TLS certificates in ingress target discovery
2021-09-13 13:03:42 +03:00
Aliaksandr Valialkin
728c4c3841
lib/promscrape: generate scrape_timeout_seconds
metric per each scrape target in the same way as Prometheus 2.30 does
...
See https://github.com/prometheus/prometheus/pull/9247
2021-09-12 15:20:44 +03:00
Aliaksandr Valialkin
0b4eb0fa7d
lib/promscrape: make fmt
2021-09-12 13:34:15 +03:00
Aliaksandr Valialkin
48e3e6c8df
lib/promscrape: add ability to configure scrape_timeout and scrape_interval via relabeling
...
See https://github.com/prometheus/prometheus/pull/8911
2021-09-12 13:33:41 +03:00
Aliaksandr Valialkin
f3e89754a9
lib/promscrape: reduce CPU usage for common case when calculating scrape_series_added
metric
...
Also reduce CPU usage when applying `series_limit` to scrape targets with constant set of metrics.
The main idea is to perform the calculations on scrape_series_added and series_limit
only if the set of metrics exposed by the target has been changed.
Scrape targets rarely change the set of exposed metrics,
so this optimization should reduce CPU usage in general case.
2021-09-12 12:53:14 +03:00
Aliaksandr Valialkin
cebcb15ba4
lib/storage: verify that the tsidsFound contain the needed tsids in tests added at f4dead529f
2021-09-11 10:57:13 +03:00
Aliaksandr Valialkin
9286107e82
lib/promscrape: send stale markers for disappeared metrics like Prometheus does
2021-09-11 10:51:04 +03:00
Aliaksandr Valialkin
f4dead529f
lib/storage: properly search series by multiple tag filters matching empty labels such as foo{bar=~"baz|",x=~"y|"}
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1601
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/395
2021-09-09 21:09:21 +03:00
Aliaksandr Valialkin
4aeb8db83f
lib/promscrape: add ability to set series_limit
and stream_parse
options via relabeling
...
This allows managing these options on a per-target basis.
Typical use case: to manage these options for pods via Kubernetes annotations.
2021-09-09 18:49:39 +03:00
Aliaksandr Valialkin
468f941f7e
lib/promscrape: add the actual job name to the labels of promscrape_series_limit_rows_dropped_total metric
2021-09-09 17:37:37 +03:00
Aliaksandr Valialkin
086b5d0cf1
lib/promscrape: add scrape_
prefix to job
and target
labels exported by promscrape_series_limit_rows_dropped_total
metric
...
This is needed in order to prevent from possible clash with the corresponding (job, target) labels for the job, which scrapes this metric.
2021-09-09 17:29:21 +03:00
Aliaksandr Valialkin
d6bd956930
lib/promrelabel: add keep_metrics
and drop_metrics
actions to relabeling rules
...
These actions simlify metrics filtering. For example,
- action: keep_metrics
regex: 'foo|bar|baz'
would leave only metrics with `foo`, `bar` and `baz` names, while the rest of metrics will be deleted.
The commit also makes possible to split long regexps into multiple lines. For example, the following config is equivalent to the config above:
- action: keep_metrics
regex:
- foo
- bar
- baz
2021-09-09 16:18:21 +03:00
Aliaksandr Valialkin
f77dde837a
lib/promscrape: add the ability to limit the number of unique series per each scrape target
...
The number of series per target can be limited with the following options:
* Global limit with `-promscrape.maxSeriesPerTarget` command-line option.
* Per-target limit with `max_series: N` option in `scrape_config` section.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1561
2021-09-01 16:03:59 +03:00
Aliaksandr Valialkin
5c63d69454
lib/promscrape/discovery/kubernetes: return back support role: endpointslices
, since it is used by VictoriaMetrics operator
...
This is a follow up commit after 31b42b30b6
2021-08-29 12:37:03 +03:00
Aliaksandr Valialkin
db330232ac
lib/protoparser/opentsdb: follow-up after 8ee75ca45a
2021-08-29 11:49:21 +03:00
envzhu
8ee75ca45a
lib/protoparser/opentsdb: accept multiple spaces between fields in a row as a deliminator. ( #1575 )
2021-08-29 11:38:32 +03:00
Aliaksandr Valialkin
31b42b30b6
lib/promscrape/discovery/kubernetes: rename role: endpointslices
to role: endpointslice
to be consistent with Prometheus
...
See 2ec6c7dbb8/discovery/kubernetes/kubernetes.go (L99)
2021-08-29 11:23:08 +03:00
Aliaksandr Valialkin
2e001db4de
lib/promscrape/discovery/kubernetes: use v1 API instead of v1beta1 API for role: ingress
and role: endpointslices
...
This should fix service discovery for these roles in Kubernetes v1.22 and newer versions.
See https://kubernetes.io/docs/reference/using-api/deprecation-guide/#ingress-v122
The corresponding change in Prometheus - https://github.com/prometheus/prometheus/pull/9205
2021-08-29 11:16:59 +03:00
Aliaksandr Valialkin
10f960fa0c
lib/promscrape: add ability to load scrape configs from multiple files
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1559
2021-08-26 08:51:16 +03:00
Aliaksandr Valialkin
c27ee35c5c
lib/promscrape: expose promscrape_discovery_http_errors_total metric for tracking errors per each http_sd config
2021-08-25 13:05:49 +03:00
Aliaksandr Valialkin
ffc0ab1774
lib/{mergeset,storage}: improve the detection of the needed free space for background merge
...
This should prevent from possible out of disk space crashes during big merges.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1560
2021-08-25 09:35:44 +03:00
Aliaksandr Valialkin
d5622b32e2
lib/promscrape: reduce memory and CPU usage when Prometheus staleness tracking is enabled for metrics from deleted / disappeared scrape targets
...
Store the scraped response body instead of storing the parsed and relabeld metrics.
This should reduce memory usage, since the response body takes less memory than the parsed and relabeled metrics.
This is especially true for Kubernetes service discovery, which adds many long labels for all the scraped metrics.
This should also reduce CPU usage, since the marshaling of the parsed
and relabeld metrics has been substituted by response body copying.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
2021-08-21 21:17:26 +03:00
Aliaksandr Valialkin
f46a73dcdd
lib/promscrape: use scrapeTimestamp when storing stale markers for failed scrape
...
This will make timestamps for stale markers more consistent for timestamps for other samples
2021-08-19 14:18:05 +03:00
Aliaksandr Valialkin
c09446a9aa
lib/promscrape: send stale markers for the previously scraped metrics on failed scrapes like Prometheus does
2021-08-18 21:59:03 +03:00
Aliaksandr Valialkin
cdc372bb98
app/vmselect: add -search.noStaleMarkers
command-line flag for disabling stale markers handling in queries
...
This option allows reducing CPU usage a bit when VictoriaMetrics is used
for collecting and processing non-Prometheus data. For example, InfluxDB line protocol, Graphite, OpenTSDB, CSV, etc.
2021-08-18 13:59:02 +03:00
Aliaksandr Valialkin
226143f31b
lib/promscrape: add ability to disable sending Prometheus staleness markers with -promscrape.disableStaleMarkers command-line flag
...
This option can be useful when vmagent consumes too much additional memory
for staleness markers functionality and when staleness markers aren't needed.
2021-08-18 13:43:21 +03:00
Aliaksandr Valialkin
03c959f1df
lib/promscrape: stop scrapers for the removed targets before starting scrapers for the added targets
...
This should prevent from possible time series overlap when old target is substituted by new target (for example, during Kubernetes deployments).
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1530
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1509
2021-08-17 00:55:51 +03:00
Aliaksandr Valialkin
a0e18f06eb
lib/promscrape: restore red highlighting for DOWN targets at /targets page
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1461
2021-08-15 16:03:57 +03:00
Aliaksandr Valialkin
4401464c22
all: add support for Prometheus staleness markers
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1509
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1530
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845
2021-08-13 12:10:17 +03:00
Aliaksandr Valialkin
d375d9b878
lib/envflag: add a link to docs for -envflag.enable
2021-08-11 10:29:33 +03:00
Aliaksandr Valialkin
d826352688
app/vmagent: follow-up after fe445f753b
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1491
2021-08-05 09:52:32 +03:00
Omar Ghader
46e27d60a6
feature: Add multitenant for vmagent ( #1505 )
...
* feature: Add multitenant for vmagent
* Minor fix
* Fix rcs index out of range
* Minor fix
* Fix multi Init
* Fix multi Init
* Fix multi Init
* Add default multi
* Adjust naming
* Add TenantInserted metrics
* Add TenantInserted metrics
* fix: remove unused metrics for vmagent
* fix: remove unused metrics for vmagent
Co-authored-by: mghader <marc.ghader@ubisoft.com>
Co-authored-by: Sebastian YEPES <syepes@gmail.com>
2021-08-05 09:52:31 +03:00
Aliaksandr Valialkin
50663ba41f
lib/promscrape/discovery/gce: add __meta_gce_interface_ipv4_<name> labels as in Prometheus 2.29
...
See https://github.com/prometheus/prometheus/pull/8978
2021-08-03 16:11:49 +03:00
Aliaksandr Valialkin
3cad8b4564
lib/promscrape/discovery/ec2: add __meta_ec2_availability_zone_id
label as Prometheus 2.29 does
2021-08-03 16:11:49 +03:00
Aliaksandr Valialkin
d05cac6c98
li/storage: re-use the per-day inverted index search code for searching in global index
...
This allows removing a big pile of outdated code for global index search.
This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1486
2021-07-30 10:31:37 +03:00
Aliaksandr Valialkin
8ee8660ac4
app/vmselect: follow-up for 626073bca8
...
* Rename -search.maxMetricsPointSearch to -search.maxSamplesPerQuery, so it is more consistent with the existing -search.maxSamplesPerSeries
* Move the -search.maxSamplesPerQuery from vmstorage to vmselect, so it could effectively limit the number of raw samples obtained from all the vmstorage nodes
* Document the -search.maxSamplesPerQuery in docs/CHANGELOG.md
2021-07-28 18:00:23 +03:00
Nikolay
9d45b46f4c
adds check for region with custom s3 endpoint ( #1465 )
2021-07-27 12:35:38 +03:00
Aliaksandr Valialkin
c2deee9911
lib/storage: yet another attempt to properly determine disk space shortage, which prevents from optimal merges
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-27 12:04:50 +03:00
Aliaksandr Valialkin
bb31117555
lib/promrelabel: add tests for verifying that regex works as expected in single quotes and double quotes
2021-07-27 10:50:55 +03:00
Aliaksandr Valialkin
8b7917cd81
all: add go:build
lines for Go1.17
...
See https://tip.golang.org/doc/go1.17#gofmt for more details
2021-07-26 15:48:21 +03:00
Aliaksandr Valialkin
1318736ad1
lib/promscrape: add missing whitespace at /targets page before up
word
2021-07-26 12:22:59 +03:00
Aliaksandr Valialkin
4ba3fd9e6d
lib/workingsetcache: switch from split cache to full cache after the cache size exceeds 95% of split capacity
...
Previously the switch occurred when the cache size becomes 100% of its capacity. The cache size could never reach 100% capacity.
This could prevent from switching from the split cache to full cache, thus reducing the cache effectiveness.
2021-07-15 16:12:04 +03:00
Aliaksandr Valialkin
d472b03e34
lib/storage: make sure the second call to DeduplicateSamples and deduplicateSamplesDuringMerge doesnt change samples
2021-07-15 12:17:45 +03:00
Aliaksandr Valialkin
682662b2ae
lib/storage: remove cache directory if it contains reset_cache_on_startup file
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1447
2021-07-13 17:58:51 +03:00
Aliaksandr Valialkin
2df66dad7b
lib/httpserver: add is_set
label to flag
metrics
...
This label allows determining the set flags with the query `flag{is_set="true"}`
2021-07-13 15:10:13 +03:00
Aliaksandr Valialkin
f9de546139
lib/storage: reset perKeyMisses stats less frequently
...
This should reduce CPU usage for queries executed with intervals higher than 30 seconds
2021-07-12 14:33:42 +03:00
Aliaksandr Valialkin
4f80b2f230
lib/storage: properly limit the size of storage/date_metricID
cache
2021-07-12 14:25:44 +03:00
Aliaksandr Valialkin
8ca2799478
lib/storage: properly determine when the deduplication is needed in needsDedup
...
Previously needsDedup() could return true if the de-duplication wasn't needed for the following case:
d < interval
/ \
| v | v |
interval interval
Now it properly returns false for this case
2021-07-12 10:53:30 +03:00
Aliaksandr Valialkin
6e0553c92e
lib/mergeset: cache indexBlock items only on the second request
...
This should reduce the indexdb/indexBlocks cache size, since it won't contain one-time-wonders items.
2021-07-07 15:23:06 +03:00
Aliaksandr Valialkin
766edbc421
lib/httpserver: print full requestURI in httpserver.Errorf
...
This should simplify debugging.
2021-07-07 13:09:40 +03:00
Aliaksandr Valialkin
e843bd7bd7
lib/storage: do not cache inmemoryBlock entries requested only once (aka one-time-wonder items)
...
This should reduce the cache size and memory usage for the indexdb/dataBlocks cache
2021-07-07 10:58:51 +03:00
Aliaksandr Valialkin
8b262d4ba7
lib/storage: periodically reset prefetchedMetricIDs cache in order to limit its size under high churn rate
2021-07-07 10:58:51 +03:00
Aliaksandr Valialkin
a7694092b8
Revert "lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method"
...
This reverts commit 7c6d3981bf
.
Reason for revert: high contention at bucket16Pool on systems with big number of CPU cores.
This slows down query processing significantly.
2021-07-06 18:21:35 +03:00
Aliaksandr Valialkin
8aa9bba9bd
lib/{mergeset,storage}: switch from sync.Pool to chan-based pool for inmemoryPart objects
...
This should reduce memory usage on systems with big number of CPU cores,
since every inmemoryPart object occupies at least 64KB of memory and sync.Pool maintains
a separate pool inmemoryPart objects per each CPU core.
Though the new scheme for the pool worsens per-cpu cache locality, this should be amortized
by big sizes of inmemoryPart objects.
2021-07-06 16:28:41 +03:00
Aliaksandr Valialkin
7c6d3981bf
lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method
...
This reduces the load on memory allocator in Go runtime in production workload.
2021-07-06 15:35:03 +03:00
Aliaksandr Valialkin
78c9174682
lib/mergeset: increase pool capacity for inmemoryBlock according to collected profiles from production workload
...
CPU and memory profiles show that the pool capacity for inmemoryBlock objects is too small.
This results in the increased load on memory allocation code in Go runtime.
Increase the pool capacity in order to reduce the load on Go runtime.
2021-07-06 13:41:34 +03:00
Aliaksandr Valialkin
f71e4d1853
lib/mergeset: limit the frequency for flushCallback calls to once per 10 seconds
...
This should improve hit ratio for tagFiltersCache when big number of new time series are constantly registered
(aka high churn rate). This, in turn, should reduce CPU usage for queries over such time series.
2021-07-06 12:17:17 +03:00
Aliaksandr Valialkin
f3acf065c9
lib/storage: consistency renaming: tagCache -> tagFiltersCache
...
This improves code readability
2021-07-06 11:03:51 +03:00
Aliaksandr Valialkin
0020b9f904
lib/workingsetcache: properly update stats for requests and cache misses
...
Previously the stats for cache misses could be improperly counted, because it had inflated cache misses
if the entry was missing in the curr cache, but was existing in the prev cache.
The same applies to cache requests - they were inflated if the entry was missing in the curr cache.
2021-07-06 10:53:32 +03:00
Aliaksandr Valialkin
4cf47163c1
lib/workingsetcache: fix cache capacity calculations after 4f0003f182
2021-07-05 17:11:57 +03:00
Aliaksandr Valialkin
4f0003f182
lib/workingsetcache: typo fixes after d0c830039d
2021-07-05 15:35:37 +03:00
Aliaksandr Valialkin
d0c830039d
lib/storage: tune cache sizes according to production workload
2021-07-05 15:16:11 +03:00
Aliaksandr Valialkin
8f973e34fb
lib/workingsetcache: properly switch to whole
mode
...
Previously the switch from `split` to `whole` mode had been performed too early,
e.g. when the current cache size became bigger than 1/4 of the allowed cache size.
Now it is performed when the current cache size becomes bigger than 1/2 of the allowed cache size.
This change can reduce memory usage for data ingestion path when big number of active time series are ingested.
2021-07-05 15:16:11 +03:00
Aliaksandr Valialkin
43103be011
lib/{storage,mergeset}: increase cache timeout for data and index blocks from a minute to two minutes
...
One minute cache timeout result in slower queries in some production workloads where the interval
between query execution is in the range 1 minute - 2 minutes.
2021-07-05 15:16:11 +03:00
Aliaksandr Valialkin
54b9e1d3cb
lib/cgroup: set GOGC to 50 by default if it isn't set
...
This should reduce memory usage for typical VictoriaMetrics workloads by up to 50%
2021-07-05 15:16:11 +03:00
Aliaksandr Valialkin
9a83e9018d
lib/storage: properly detect free disk space shortage during data merge
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-02 17:40:54 +03:00
Aliaksandr Valialkin
7088f17494
lib/promscrape/discovery/consul: use case-insensitive comparison for service names
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1424
2021-07-02 14:49:27 +03:00
Aliaksandr Valialkin
6e406083f2
lib/promauth: cache the client TLS certificate for up to a second
...
This should reduce CPU usage when TLS connections are established at a high rate.
2021-07-02 13:21:51 +03:00
Aliaksandr Valialkin
158c50c0ee
lib/promauth: reload TLS certificates from disk on every mTLS connection as Prometheus does
...
This allows updating client certificates without the need to restart vmagent and/or single-node VictoriaMetrics.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1420
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/470
2021-07-01 15:27:47 +03:00
Aliaksandr Valialkin
c25b839078
lib/workingsetcache: reset the cache mode when the cache is reset
...
This should reduce memory usage if the working set is reduced after the cache reset.
2021-07-01 11:50:11 +03:00
Nikolay
ae485c2bfd
fixes /targets button style ( #1423 )
...
* fixes /targets button style
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1422
* updates boostrap version
2021-07-01 11:48:07 +03:00
Aliaksandr Valialkin
c93cee8de8
lib/{mergeset,storage}: reduce the maximum lifetime for cached indexdb and data blocks from 2 minutes to a minute
...
This should reduce memory usage on a system with high number of active time series and a high churn rate.
One minute is enough for caching the blocks needed for repeated queries (e.g. alerting rules, recording rules and dashboard refreshes).
2021-06-29 19:57:07 +03:00
Aliaksandr Valialkin
fc12484734
lib/mergeset: switch from sync.Pool to a channel for a pool for inmemoryBlock structs
...
This should reduce memory usage for the pool on systems with big number of CPU cores.
The sync.Pool maintains per-CPU pools, so the total number of objects in the pool
is proportional to the number of available CPU cores. The channel limits the number
of pooled objects by its own capacity. This means smaller number of pooled objects on average.
2021-06-29 19:56:59 +03:00
Aliaksandr Valialkin
9ce211a514
lib/promscrape/discovery/docker: fix golint warning: struct field Id should be ID
2021-06-29 13:12:28 +03:00
Aliaksandr Valialkin
5506cff76e
lib/storage: put indexDBName into the key for dateTagFilter cache and for uselessTagFilters cache
...
This should prevent from stats overwriting when the previous indexdb is queried.
2021-06-29 12:40:05 +03:00
Aliaksandr Valialkin
1b0501a09e
lib/promscrape: typo fix in /targets
output
...
The typo has been introduced in fb72a2133f
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1408
2021-06-28 21:26:37 +03:00
Aliaksandr Valialkin
cb5453953f
lib/promscrape: split docker and dockerswarm service discovery code bases, since they have very little in common
...
This is a follow up after c85a5b7fcb
2021-06-25 13:20:20 +03:00
Aliaksandr Valialkin
a69045e440
lib/promscrape: consistently sort service discovery routines
...
This should simplify further maintenance of the code
2021-06-25 12:10:46 +03:00
Lu Jiajing
c85a5b7fcb
Support Docker ServiceDiscovery ( #1402 )
...
* add docker discovery
* add test
* add labels test and add scrape work
* remove TODO
* refactor to merge apiConfig and sdConfig
* apply suggestion
2021-06-25 11:42:47 +03:00
Nikolay
434e33da9b
adds missing MustStop call to do and http sd ( #1404 )
2021-06-25 11:39:18 +03:00
Aliaksandr Valialkin
c22114c6f0
lib/storage: tune tag filters search logic
...
Tune the logic according to the logs provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-864293624
The previous logic had a race when multiple concurrent queries execute the same tag filter without prior stats.
This could result in incorrectly stored stats for such tag filter, which then could result in non-optimal sorting of tag filters
for further queries.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-23 13:29:39 +03:00
Aliaksandr Valialkin
e8a5bb92b7
lib/promscrape/discovery/consul: properly pass namespace to Consul watcher
...
Follow-up for 58a2989fe7
2021-06-22 17:42:41 +03:00
Aliaksandr Valialkin
ac54f34f9e
lib/promscrape/discovery/http: follow up after e307bbb29a
2021-06-22 13:40:33 +03:00
Aliaksandr Valialkin
755040a171
lib/promscrape/discovery: support generic auth configs in Consul service discovery in the same way as Prometheus 2.28 does
2021-06-22 13:34:02 +03:00
Nikolay
e307bbb29a
adds http_sd ( #1399 )
...
* adds http_sd
* adds X-Prometheus-Refresh-Interval-Seconds header
* Update lib/promscrape/discovery/http/api.go
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-06-22 13:33:37 +03:00
Nikolay
58a2989fe7
adds consul enterprise namespace support ( #1400 )
...
* adds consul enterprise namespace support
* Update lib/promscrape/discovery/consul/consul.go
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-06-22 12:49:44 +03:00
Aliaksandr Valialkin
fb72a2133f
lib/promscrape: show jobs with empty scrape targets on /targets page
2021-06-18 10:53:52 +03:00
Nikolay
6c434b260e
fixes DO service discovery labels ( #1389 )
...
adds test for digitalocean sd
2021-06-17 15:12:20 +03:00
Aliaksandr Valialkin
dcbc22552f
lib/storage: fix infinite loop introduced in aa9b56a046
2021-06-17 14:28:10 +03:00
Aliaksandr Valialkin
aa9b56a046
lib/{mergeset,storage}: reduce the number of fsync calls on data ingestion path on systems with many cpu cores
...
VictoriaMetrics maintains a buffer per CPU core for the ingested data. These buffers are flushed to disk every second.
These buffers are flushed to disk in parallel starting from the commit 56b6b893ce
.
This resulted in increased write disk IO usage on systems with many cpu cores
as described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-863046999 .
This commit merges the per-CPU buffers into bigger in-memory buffers before flushing them to disk.
This should reduce the rate of fsync syscalls and, consequently, the write disk IO on systems with many CPU cores.
This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 13:52:08 +03:00
Aliaksandr Valialkin
84fb59b0ba
lib/storage: move deletedMetricIDs set from indexDB to Storage
...
This makes consitent the list of deleted metricIDs when it is used from both the current indexDB and the previous indexDB (aka extDB).
This should fix the issue, which could lead to storing new samples under deleted metricIDs after indexDB rotation.
See more details at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347#issuecomment-861232136 .
Thanks to @tangqipengleoo for the initial analysis and the pull request - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .
This commit resolves the issue in more generic way compared to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .
The downside of the commit is the deletedMetricIDs set isn't cleaned from the metricIDs outside the retention. It needs app restart.
This should be OK in most cases.
2021-06-15 15:04:30 +03:00
Aliaksandr Valialkin
e028ad241a
lib/protoparser: stop reading the input stream as soon as the callback provided by the caller returns error
...
This is a follow-up for af90c3c43b
2021-06-14 15:18:49 +03:00
faceair
af90c3c43b
lib/protoparser: stop read when callback error ( #1380 )
2021-06-14 15:10:58 +03:00
Aliaksandr Valialkin
36d55bff66
lib/promscrape: show the number of samples collected during the last scrape at /targets and /api/v1/targets pages
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1377
2021-06-14 14:04:00 +03:00
Nikolay
729c4eeb9c
adds digital ocean sd ( #1376 )
...
* adds digital ocean sd config
* adds digital ocean sd
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1367
* typo fix
2021-06-14 13:15:04 +03:00
Aliaksandr Valialkin
06b8e7d148
lib/promscrape: increase the duration for reading the full response in stream parsing mode
...
Increase the duration from 10x to 30x of the configured `scrape_interval'.
This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365
2021-06-14 12:28:09 +03:00
Aliaksandr Valialkin
48210130ac
lib/protoparser: measure the duration for reading the whole block of data instead of a single read operation
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365
2021-06-14 12:25:52 +03:00
Aliaksandr Valialkin
3c4366806c
lib/protoparser/common: log the duration for reading a block of data in ReadLinesBlockExt on error
...
This may help debugging issues like https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365
2021-06-14 12:22:04 +03:00
Aliaksandr Valialkin
ed83558646
app/vmauth: properly handle http.ErrAbortHandler panic
...
This panic can be raised by the reverseProxy on aborted request to the backend.
So handle it (e.g. suppress) at reverseProxy.ServeHTTP call.
Do not suppress the panic at lib/httpserver generic HTTP handler,
since it may result in an inconsistent state left after the panicking handler.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1353
2021-06-11 12:50:25 +03:00
Aliaksandr Valialkin
c4f3fbfa5d
lib/storage: reset cache on disk during series deletion and during indexdb rotation
...
This should prevent from inconsistent behavior (aka partially missing data for some time series) after unclean shutdown.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347
2021-06-11 12:42:28 +03:00
Aliaksandr Valialkin
69b1482bdb
lib/storage: consistency renaming: getMaxRawRowsPerPartition -> getMaxRawRowsPerShard
2021-06-11 10:57:23 +03:00
Aliaksandr Valialkin
044ab46824
lib/storage: reduce the amounts of memory which can be occupied by rawRow items during data ingestion on a system with many CPU cores
2021-06-11 10:57:23 +03:00
Nikolay
6b29b955c0
disables panic for net/httpAbortHandler ( #1355 )
2021-06-09 12:08:58 +03:00
Aliaksandr Valialkin
96b691a0ab
lib/storage: properly account the number of loops spent when matching for or suffixes
...
This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-08 13:06:12 +03:00
Aliaksandr Valialkin
661d2668f8
lib/promrelabel: add tests for labelsToString() function
2021-06-04 20:42:46 +03:00
Aliaksandr Valialkin
78f83dc5ad
app/{vmagent,vminsert}: follow-up after 2fe045e2a4
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1343
2021-06-04 20:27:58 +03:00
jelmd
2fe045e2a4
new feature: debug relabeling ( #1344 )
...
* new feature: relabel logging
Use scrape_configs[x].relabel_debug = true to log metric names inkl.
labels before and after relabeling. After relabeling related metrics
get dropped, i.e. not submitted to servers.
* vminsert wants relabel logging, too.
2021-06-04 17:50:23 +03:00
Hason Chan
6f19bb23a1
fix eureka_sd_configs HTTPClientConfig incorrect parsing ( #1350 )
2021-06-04 11:47:17 +03:00
Aliaksandr Valialkin
2d8bd41f8a
lib/storage: reduce memory allocations when syncing dateMetricIDCache
2021-06-03 16:20:42 +03:00
Nikolay
ddc8022702
fixes solaris build ( #1345 )
2021-05-31 09:21:23 +03:00
Aliaksandr Valialkin
a52a20659a
lib/promscrape: fix tests after f0c21b6300
2021-05-28 01:32:50 +03:00
Aliaksandr Valialkin
d088923aef
Revert "lib/mergeset: remove a pool for inmemoryBlock structs"
...
This reverts commit 793fe39921
.
Reason to revert: production testing revealed possible slowdown when registering big number of new time series
2021-05-28 01:09:32 +03:00
Aliaksandr Valialkin
793fe39921
lib/mergeset: remove a pool for inmemoryBlock structs
...
The pool for inmemoryBlock struct doesn't give any performance gains in production workloads,
while it may result in excess memory usage for inmemoryBlock structs inside the pool during
background merge of indexdb.
2021-05-27 21:57:33 +03:00
Aliaksandr Valialkin
60341722d5
docs: document f0c21b6300
2021-05-27 15:03:30 +03:00
faceair
f0c21b6300
lib/promscrape: apply body size & sample limit to stream parse ( #1331 )
...
* lib/promscrape: apply body size limit to stream parse
Signed-off-by: faceair <git@faceair.me>
* lib/promscrape: apply sample limit to stream parse
Signed-off-by: faceair <git@faceair.me>
2021-05-27 14:52:44 +03:00
Aliaksandr Valialkin
2233d6ed8a
lib/uint64set: store pointers to bucket16 instead of bucket16 objects in bucket32
...
This speeds up bucket32.addBucketAtPos() when bucket32.buckets contains big number of items,
since the copying of bucket16 pointers is much faster than the copying of bucket16 objects.
This is a cpu profile for copying bucket16 objects:
10ms 13.43s (flat, cum) 32.01% of Total
10ms 120ms 650: b.b16his = append(b.b16his[:pos+1], b.b16his[pos:]...)
. . 651: b.b16his[pos] = hi
. 13.31s 652: b.buckets = append(b.buckets[:pos+1], b.buckets[pos:]...)
. . 653: b16 := &b.buckets[pos]
. . 654: *b16 = bucket16{}
. . 655: return b16
. . 656:}
This is a cpu profile for copying pointers to bucket16:
10ms 1.14s (flat, cum) 2.19% of Total
. 100ms 647: b.b16his = append(b.b16his[:pos+1], b.b16his[pos:]...)
. . 648: b.b16his[pos] = hi
10ms 700ms 649: b.buckets = append(b.buckets[:pos+1], b.buckets[pos:]...)
. 330ms 650: b16 := &bucket16{}
. . 651: b.buckets[pos] = b16
. . 652: return b16
. . 653:}
2021-05-25 14:20:52 +03:00
Aliaksandr Valialkin
39ef1e7a51
lib/storage: do not stop data ingestion on the first error in Storage.AddRows
...
Continue data ingestion for the rest of blocks.
2021-05-24 15:32:47 +03:00
Aliaksandr Valialkin
4b01c9fb2e
lib/storage: limit the number of rows per each block in Storage.AddRows()
...
This should reduce memory usage when ingesting big blocks or rows.
2021-05-24 15:24:07 +03:00
Aliaksandr Valialkin
a4ff4b8e65
lib/storage: allow filling all the rows up to their capacity in rawRowsShard.addRows
...
This should reduce memory usage a bit on data ingestion path
2021-05-24 15:22:59 +03:00
Aliaksandr Valialkin
a46551245c
lib/bloomfilter: fix TestLimiterConcurrent
2021-05-24 05:17:36 +03:00
Aliaksandr Valialkin
93d81b486d
lib/fs: do not pass done callback to tryRemoveAll() func
...
This improves code readability a bit.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313
2021-05-24 04:51:57 +03:00
Aliaksandr Valialkin
f54133b200
lib/storage: do not populate MetricID->MetricName cache during data ingestion
...
This cache isn't needed during data ingestion, so there is no need in spending RAM on it.
This reduces RAM usage on data ingestion path by 30%
2021-05-24 03:02:46 +03:00
Aliaksandr Valialkin
ec79abc382
lib/{mergeset,storage}: reduce the number of IFNO log messages like merged ... items across ... blocks in ... seconds
...
Log these messages if the merge takes more than 30 seconds instead of 10 seconds.
2021-05-23 14:03:21 +03:00
Aliaksandr Valialkin
78dddfb98f
lib/promauth: follow-up after 5b8176c68e
2021-05-22 18:01:11 +03:00
Nikolay
5b8176c68e
basic OAuth2 support for remoteWrite and scrape targets ( #1316 )
...
* adds OAuth2 support for remoteWrite and scrapping
* adds tests
changes init
2021-05-22 16:20:18 +03:00
Aliaksandr Valialkin
e05dd475f0
lib/fs: concurrently remove up to 1024 blocked NFS directories
...
Previously the blocked directories were removed sequentially by a single goroutine.
This can be not enough for highly loaded VictoriaMetrics that accepts millions of sample per second,
when big number of LSM parts are created and removed at high rate.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313
2021-05-21 17:57:46 +03:00
Aliaksandr Valialkin
8e2985b53d
lib/fs: wait for a while before giving up on NFS file removal if the removal queue is full
...
This should reduce the probability of the panic on a highly loaded VictoriaMetrics
accepting millions of samples per second.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313
2021-05-21 17:21:00 +03:00
Aliaksandr Valialkin
c54bb73867
all: do not skip SIGHUP signal during service initialization
...
This can lead to stale or incomplete configs like in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-21 16:34:06 +03:00
Aliaksandr Valialkin
4c7bb75fa2
Makefile: update golangci-lint from v1.29.0 to v1.40.1
2021-05-20 18:27:10 +03:00
Aliaksandr Valialkin
e394ff6466
app/vmagent/remotewrite: expose metrics with the current number of active series per day and per hour
...
These numbers are exposed via the following metrics:
- vmagent_hourly_series_limit_current_series
- vmagent_daily_series_limit_current_series
Expose also the limits via the following metrics:
- vmagent_hourly_series_limit_max_series
- vmagent_daily_series_limit_max_series
2021-05-20 15:28:09 +03:00
Aliaksandr Valialkin
ad73f226ff
app/vmstorage: add ability to limit series cardinality via -storage.maxHourlySeries
and -storage.maxDailySeries
command-line flags
2021-05-20 14:15:19 +03:00
Aliaksandr Valialkin
7e526effaa
app/vmagent: add ability to limit series cardinality on a per-hour and per-day basis
2021-05-20 13:13:40 +03:00
Aliaksandr Valialkin
22585531ad
lib/promscrape/discovery/kubernetes: make golangci-lint
happy by removing empty branches
2021-05-20 12:00:29 +03:00
Aliaksandr Valialkin
009e136d88
lib/storage: remove possible data race when logging dropped labels
2021-05-20 02:47:22 +03:00
Aliaksandr Valialkin
eb8093ca6b
lib/promscrape/discovery/kubernetes: reload objects on object parse error
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-18 23:25:48 +03:00
Aliaksandr Valialkin
f4719889da
lib/httpserver: typo fix in -http.shutdownDelay
command-line flag description: servier -> server
2021-05-18 16:26:16 +03:00
Aliaksandr Valialkin
0cf1fe19e6
lib/promscrape/discovery/kubernetes: simplify the reload logic for urlWatcher.objectsByKey
2021-05-18 15:40:46 +03:00
Aliaksandr Valialkin
bfb61de606
lib/promscrape/discovery/kubernetes: properly update vm_promscrape_discovery_kubernetes_scrape_works metric
...
Previously it wasn't descreased during config update.
2021-05-18 15:33:45 +03:00
Aliaksandr Valialkin
3166994244
lib/promscrape/discovery/kubernetes: log errors and stop service discovery when unexpected updates are received from Kubernetes API server
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-18 15:10:48 +03:00
Aliaksandr Valialkin
6c944b86d8
docs: dealay -> delay
...
Thanks to @jelmd . See 0b7e3510c8 (r50884991)
2021-05-18 01:07:52 +03:00
Aliaksandr Valialkin
ec6a284978
lib/promrelabel: add tests for conditional removal of label on another label match
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1294
2021-05-18 00:23:03 +03:00
Aliaksandr Valialkin
2eed410466
lib/promscrape/discovery/kubernetes: key ScrapeWork objects by urlWatcher instead of namespace
...
This makes the code less fragile if urlWatcher would depend on additional to namepsace properties.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-17 23:47:08 +03:00
Aliaksandr Valialkin
733706e6c6
lib/promscrape: reload auth tokens from files every second
...
Previously auth tokens were loaded at startup and couldn't be updated without vmagent restart.
Now there is no need in vmagent restart.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1297
2021-05-14 20:00:08 +03:00
Aliaksandr Valialkin
10a47af631
app/{vmalert,vmauth}: explicitly set MaxIdleConnsPerHost in net/http.Client.Transport
...
By default MaxIdleConnsPerHost is set to 2. This limits the possibility to re-use http keep-alive connections.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1300
2021-05-14 18:12:24 +03:00
Aliaksandr Valialkin
fc3519fa26
lib/promscrape: limit scrape_timeout
by scrape_interval
like Prometheus does
2021-05-13 16:09:45 +03:00
匠心零度
b4f5be8bd8
fix vagent imbalance problem ( #1292 )
...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=0 -promscrape.config=/path/to/config.yml ...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=1 -promscrape.config=/path/to/config.yml ...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=2 -promscrape.config=/path/to/config.yml ...
Co-authored-by: lirenzuo <lirenzuo@shein.com>
2021-05-13 11:14:51 +03:00
Aliaksandr Valialkin
e6fda03e8f
vendor: update github.com/VictoriaMetrics/fasthttp from v1.0.14 to v1.0.15
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1289
2021-05-13 10:44:14 +03:00
Aliaksandr Valialkin
f6a641de62
lib/promscrape: exponentially increase retry interval on unsuccesful requests to scrape targets or to service discovery services
...
This should reduce CPU load at vmagent and at remote side when the remote side doesn't accept HTTP requests.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1289
2021-05-13 10:38:50 +03:00
Aliaksandr Valialkin
c0ec541559
lib/cgroup: document the ability to detect cgroup v2 memory and cpu limits. This is follow-up for b50024812e
2021-05-13 09:26:20 +03:00
Aliaksandr Valialkin
d7be2753c0
lib/storage: substitute GetTSDBStatusForDate with GetTSDBStatusWithFiltersForDate with nil tfss
2021-05-13 09:02:33 +03:00
Nikolay
b50024812e
adds cgroupsv2 support ( #1283 )
...
* adds cgroupv2 limits support
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1269
* small fix
* changes Atoi to ParseUint
2021-05-13 09:02:13 +03:00
Aliaksandr Valialkin
a22a17dc66
lib/storage: merge getTSDBStatusForDate with getTSDBStatusWithFiltersForDate
...
These functions are non-trivial, while their code has minimal differences.
It is better from maintainability PoV to merge these functions into a single function.
2021-05-12 17:56:53 +03:00
Aliaksandr Valialkin
832651c6c2
app/vmselect: follow up after 8a0678678b
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1168
2021-05-12 17:18:30 +03:00
Nikolay
8a0678678b
Adds tsdb match filters ( #1282 )
...
* init work on filters
* init propose for status filters
* fixes tsdb status
adds test
* fix bug
* removes checks from test
2021-05-12 15:18:45 +03:00
Aliaksandr Valialkin
cfd6aa28e1
lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices scrape targets every 5 seconds, since they may depend on changed service and pod objects
...
This should make endpoints and endpointslices scrape targets eventually consistent with the maximum delay of 5 seconds after the related service or pod object changes.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-12 14:10:34 +03:00
Aliaksandr Valialkin
afec68ad13
lib/httpserver: add new X-Server-Hostname header instead of overwriting already exsiting header
...
This makes possible tracking origins of chained requests over multiple hops.
2021-05-11 23:48:59 +03:00
Aliaksandr Valialkin
f2d5c4e2d0
lib/httpserver: return X-Server-Hostname http header in all the responses for better debuggability
2021-05-11 22:03:48 +03:00
Aliaksandr Valialkin
33f7bacb01
lib/storage: properly apply time range when matching an empty filter
...
It must match all the time series on the given time range.
Previously it was matched to all the time series without the restriction on the given time range.
2021-05-11 01:12:05 +03:00
Aliaksandr Valialkin
75c2c813fc
lib/storage: remove dead code after the commit 3ccf7ea20c
2021-05-08 20:14:11 +03:00
Aliaksandr Valialkin
7aea5f58c4
lib/ingestserver: properly close incoming connections during graceful shutdown
2021-05-08 19:52:58 +03:00
Aliaksandr Valialkin
12d733dd5d
app/vminsert: add support for data ingestion via other vminsert nodes
2021-05-08 19:52:57 +03:00
Aliaksandr Valialkin
904bbffc7f
lib/promscrape/discovery/kubernetes: start watchers for pods and services before starting watchers for endpoints
...
This should eliminate possible race when an update on endpoints depends on pods and/or services, which are missing in the cache yet.
This could result in missing targets based on endpoints or endpointslices.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-05 12:23:50 +03:00
Aliaksandr Valialkin
76027093ca
lib/storage: use WARNING instead of INFO level for logging dropped labels
2021-05-03 13:56:43 +03:00
Aliaksandr Valialkin
ce9e163e94
lib/httpserver: stop the process on panics in request handlers
...
Panics may leave the process in inconsistent state. That's why it is better to stop the process after the panic
instead of recovering from the panic. Unfortunately, the standard net/http.Server recovers panics in request handlers.
See https://github.com/golang/go/issues/16542 . That's lib/httpserver must stop the process on itself after the panic.
2021-05-03 11:59:40 +03:00
Nikolay
477369b62f
adds stalePartsRemover ( #1261 )
...
for new created partitions
2021-05-03 11:34:00 +03:00
Aliaksandr Valialkin
9a930720b3
lib/promrelabel: add tests for removing the specified {label="value"} pair
2021-05-03 11:26:53 +03:00
Aliaksandr Valialkin
58d1e6eeea
lib/storage: log dropped labels if the number of labels in a metric exceeds -maxLabelsPerTimeseries
command-line flag value
...
This should improve debuggability for this case.
2021-05-01 09:27:55 +03:00
Aliaksandr Valialkin
6fa5981e68
app/vmagent: list user-visible endpoints at
http://vmagent:8429/
...
While at it, use common WriteAPIHelp function for the listing in vmagent, vmalert and victoria-metrics
2021-04-30 09:36:43 +03:00
Aliaksandr Valialkin
2ab1266593
lib/promscrape/discovery/kubernetes: remove a mutex at urlWatcher - use groupWatcher mutex for accessing all the urlWatcher children
...
This simplifies the code a bit and reduces the probability of improper mutex handling and deadlocks.
2021-04-29 10:14:26 +03:00
Nikolay
4e5a88114a
vmagent kubernetes_sd tests ( #1253 )
...
* first part of tests for kubernetes sd
* makes linter happy
* added more test cases
* adds pub/sub for tests
2021-04-29 10:10:24 +03:00
Aliaksandr Valialkin
87179c6839
lib/{storage,mergeset}: fix unaligned 64-bit atomic operation
panic for 32-bit architectures
...
The panic has been introduced in 56b6b893ce
2021-04-27 16:41:32 +03:00
Aliaksandr Valialkin
56b6b893ce
lib/mergeset: split rows ingestion among multiple shards
...
This improves rows ingestion on systems with many CPU cores by reducing lock contention.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
Thanks to @waldoweng for the original idea and draft implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1243
2021-04-27 15:36:34 +03:00
Aliaksandr Valialkin
2ec7d8b384
lib/promscrape/discovery/kubernetes: fix a deadlock introduced in eddba29664
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
Thanks to @f41gh7 for providing the initial idea for deadlock fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1248
2021-04-27 14:57:51 +03:00
Aliaksandr Valialkin
f89c1f7f49
lib/storage: typo fix in info message when deleting the part outside the configured retention
...
Previously the message was displaying incorrect retention time
2021-04-27 13:32:46 +03:00
Aliaksandr Valialkin
60947fb2d5
lib/persistentqueue: eliminate possible data race when obtaining vm_persistentqueue_bytes_pending metric value
2021-04-27 00:25:52 +03:00
Aliaksandr Valialkin
908e35affd
lib/promscrape: apply scrape_timeout
on receiving the first response byte for stream_parse: true
scrape targets
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1017#issuecomment-767235047
2021-04-23 21:53:35 +03:00
Aliaksandr Valialkin
eddba29664
lib/promscrape/discovery/kubernetes: refresh role: endpoints
targets on service object removal as Prometheus does
...
This is a follow-up for ae37cfd528
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:26:57 +03:00
Aliaksandr Valialkin
ae37cfd528
lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices targets on service object update like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:11:40 +03:00
Aliaksandr Valialkin
bbebdf9ba1
lib/{storage,mergeset}: remove empty directories on startup. Such directories can be left after unclean shutdown on NFS storage
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1142
2021-04-22 13:02:44 +03:00
Aliaksandr Valialkin
6bc52fe41a
all: rename https://victoriametrics.github.io to https://docs.victoriametrics.com
2021-04-20 20:16:17 +03:00
Aliaksandr Valialkin
9b1999a5ff
lib/uint64set: remove memory allocation in getOrCreateSmallPool()
...
This partially reverts fb82c4b9fa
It has been appeared that the additional memory allocation may result in higher GC pauses.
It is better to spend CPU time on copying bigger bucket16 structs instead of increasing query latencies due to higher GC pauses
2021-04-14 12:30:19 +03:00
Aliaksandr Valialkin
251747e253
lib/storage: code clarification: remove caching the found metricName in searchMetricName
2021-04-13 10:22:21 +03:00
Artem Navoiev
77be3e3a82
improve docs for cli flags ( #1202 )
...
* improve docs for cli flags
* improve docs for cli flags.2
2021-04-12 12:28:04 +03:00
Aliaksandr Valialkin
3f0bcbe067
lib/promscrape: create a single swosFunc
per scrape_config
2021-04-08 09:31:48 +03:00
Aliaksandr Valialkin
5a0938d807
lib/promscrape: do not spend CPU time on constructing scrapeWork key if clustering is disabled
2021-04-07 21:54:22 +03:00
Aliaksandr Valialkin
df32d2836c
lib/storage: properly handle big time ranges passed to /api/v1/labels
and /api/v1/label/<labelName>/values
...
It should be faster querying all the labels and/or all the values instead of querying per-day labels/values on time ranges exceeding maxDaysForPerDaySearch
2021-04-07 13:33:46 +03:00
Aliaksandr Valialkin
59f9960992
lib/promscrape/discovery: remove superflouos check in registerPendingAPIWatchers
...
The check `_, ok := uw.aws[aw]; !ok` isn't needed, since aw cannot exist in uw.aws
because of the check inside subscribeAPIWatcher
2021-04-07 13:07:39 +03:00
Aliaksandr Valialkin
3ec6639bbb
lib/promscrape/discovery/kubernetes: register pending apiWatchers in uw.aws
2021-04-06 11:12:13 +03:00
Aliaksandr Valialkin
5f593b0ed3
lib/promscrape/discovery/kubernetes: remove superflouos mustStart and mustStop functions
2021-04-05 22:44:12 +03:00
Lu Jiajing
b59164cf33
fix access to nil *url.URL ( #1180 )
...
* fix access to nil *url.URL
Signed-off-by: Megrez Lu <lujiajing1126@gmail.com>
* Update lib/promscrape/discovery/kubernetes/api_watcher.go
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-04-05 22:25:31 +03:00
Aliaksandr Valialkin
b46194472f
lib/promscrape/discovery/kubernetes: reduce CPU time spent on registering big number of Kubernetes objects shared among big number of scrape jobs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182
2021-04-05 22:04:30 +03:00
Aliaksandr Valialkin
a51d0ec6ec
lib/promscrape/discovery/kubernetes: load objects missing in local cache from api seriver in getObjectByRole()
...
This should fix possible race for `role: endpoints` and `role: endpointslices` service discovery,
when the referred `pod` and `service` objects aren't propagated to urlWatcher cache yet.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182#issuecomment-813353359 for details.
2021-04-05 20:31:17 +03:00
Aliaksandr Valialkin
95dbebf512
lib/persistentqueue: delete corrupted persistent queue instead of throwing a fatal error
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1030
2021-04-05 19:26:11 +03:00
Aliaksandr Valialkin
f010d773d6
lib/promscrape/discovery/kubernetes: synchronously load Kubernetes objects on first access
...
Remove async registration of apiWatchers, since it breaks discovering `role: endpoints` and `role: endpointslices` targets,
which depend on pod and service objects.
There is no need in reloading `endpoints` and `endpointslices` targets if the referenced `pod` or `service` objects change,
since in this case the corresponding `endpoints` and `endpointslices` objects should also change because they contain
ResourceVersion of the referenced `pod` or `service` objects, which is modified on object update.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182
2021-04-05 14:20:12 +03:00
Aliaksandr Valialkin
2c25b0322b
lib/proxy: typo fix after a5c5b54c22
2021-04-05 13:45:24 +03:00
Aliaksandr Valialkin
a5c5b54c22
lib/proxy: add support for socks5 over tls
proxy
2021-04-05 13:00:05 +03:00
Aliaksandr Valialkin
6742839fd6
lib/promscrape: pass X-Prometheus-Scrape-Timeout-Seconds
header to scrape targets as Prometheus does
2021-04-05 12:15:24 +03:00
Nikolay
a4c6a3b3e1
adds socks5 support for fasthttp client ( #1178 )
...
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1177
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-04-04 01:29:54 +03:00
Aliaksandr Valialkin
500e625e8c
lib/promscrape: properly send full url in GET
request via simple HTTP proxy
...
This is a follow-up for a0ae0f86666a75ec57b45eab2429da7ab4a7b250
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1179
2021-04-04 01:20:06 +03:00
Aliaksandr Valialkin
5153410ced
lib/promscrape: support for simple HTTP proxies without CONNECT
method support such as https://github.com/prometheus-community/PushProx
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1179
2021-04-04 00:40:40 +03:00
Aliaksandr Valialkin
4c56b1a6dd
lib/promscrape: add tests for authorization
config, which has been added in df148f48b7
2021-04-03 22:13:22 +03:00
Aliaksandr Valialkin
43f9842b6f
lib/proxy: log response body on non-200 response code
...
This should improve debuggability for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1179
2021-04-03 03:03:55 +03:00
Aliaksandr Valialkin
df148f48b7
lib/promscrape: add support for authorization
config in -promscrape.config
as Prometheus 2.26 does
...
See https://github.com/prometheus/prometheus/pull/8512
2021-04-02 21:17:45 +03:00
Aliaksandr Valialkin
7f9c68cdcb
lib/promscrape: add follow_redirect
option to scrape_configs
section like Prometheus does
...
See https://github.com/prometheus/prometheus/pull/8546
2021-04-02 19:56:40 +03:00
Aliaksandr Valialkin
5b08e6fb16
lib/promscrape/discovery/kubernetes: properly track objects with the same names in multiple namespaces
...
This is a follow-up for 12e4785fe8
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:45:32 +03:00
Aliaksandr Valialkin
12e4785fe8
lib/promscrape/discovery/kubernetes: properly discover targets in multiple namespaces
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:28:30 +03:00
Nikolay
fdb8995642
Adds aws ECS credentials support ( #1175 )
2021-04-02 11:56:40 +03:00
Aliaksandr Valialkin
dc9eafcd02
app/{vminsert,vmagent}: add -sortLabels
command-line option for sorting time series labels before ingesting them in the storage
...
This option can be useful when samples for the same time series are ingested with distinct order of labels.
For example, metric{k1="v1",k2="v2"} and metric{k2="v2",k1="v1"}.
2021-03-31 23:27:58 +03:00
Aliaksandr Valialkin
e1f699bb6c
lib/storage: reduce memory usage when ingesting samples for the same time series with distinct order of labels
2021-03-31 21:24:46 +03:00
Aliaksandr Valialkin
b75c2ce659
lib/uint64set: improve Set.Has() performance scalability on multi-CPU system
...
Do not update bucket32.hint on Set.Has() call, since it leads to memory ping-pong between CPU cores multi-CPU system
2021-03-29 12:33:47 +03:00
Aliaksandr Valialkin
2601cc0fb0
lib/storage: do not update b.nextIdx if no samples are removed because of retention
2021-03-29 12:00:21 +03:00
Aliaksandr Valialkin
f39c84b21f
lib/promscrape/discovery/kubernetes: typo fix in error message
2021-03-26 12:46:14 +02:00
Aliaksandr Valialkin
9761ffd161
lib/promscrape/discovery/kubernetes: properly handle too old resource version
error message from Kubernetes watch API
2021-03-26 12:28:10 +02:00
Aliaksandr Valialkin
aa81039b42
app/vmselect: log the metric which trigger rollup result cache reset
...
This should help finding the source of stale metrics
2021-03-25 21:31:39 +02:00
Aliaksandr Valialkin
6e855d4b82
lib/storage: tune loopsCountPerMetricNameMatch according to production workload
2021-03-25 13:27:47 +02:00
Aliaksandr Valialkin
d4aadba9fa
app/vmagent: add -promscrape.consul.waitTime
command-line flag for configuring Consul service discovery wait time
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1144
2021-03-23 19:33:25 +02:00
Aliaksandr Valialkin
3a3d2165f9
lib/storage: do not reload metricName for the same metricID in Search.NextMetricBlock
...
This should speed up Search.NextMetricBlock a bit
2021-03-23 17:56:49 +02:00
Nikolay
29f9ef9b7f
changes consul_service label value ( #1143 )
...
according to prometheus discovery.
It should mitigate issue with case sensetive services
https://github.com/hashicorp/consul/issues/5707
2021-03-23 15:35:01 +02:00
Aliaksandr Valialkin
3cfb3a3683
lib/storage: respect the deadline passed to Storage.SearchMetricNames
2021-03-22 23:03:17 +02:00
Aliaksandr Valialkin
8e2afdf568
lib/storage: improve Search.NextMetricBlock performance by using MetricID->MetricName cache
2021-03-22 22:49:18 +02:00
Aliaksandr Valialkin
910092ca4d
lib/storage: tune loopsCountPerMetricNameMatch
2021-03-22 12:53:17 +02:00
Aliaksandr Valialkin
726f6ad804
lib/storage: small code simplification after 6cee5338b2
2021-03-18 15:21:13 +02:00
Aliaksandr Valialkin
6cee5338b2
lib/storage: prevent from infinite loop if {__graphite__="..."}
filter matches a metric name with *
, [
or {
chars
...
The idea has been borrowed from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1137
2021-03-18 14:53:47 +02:00
Aliaksandr Valialkin
4fb049bcba
lib/fs: reduce the frequency of failed to remove directory ... due to NFS lock
log warnings
...
Log `failed to remove directory ... due to NFS lock` warning only if the directory cannot be removed in one second.
2021-03-18 13:24:46 +02:00
Aliaksandr Valialkin
45dabfac1b
lib/storage: faster move heavy filters to the end of list
2021-03-17 15:12:13 +02:00
Aliaksandr Valialkin
828669e4e1
all: make golint
happy
2021-03-17 00:49:28 +02:00
Aliaksandr Valialkin
ccfb0ae2d3
lib/storage: limit loops count in order to reduce max CPU usage during filter search
2021-03-17 00:49:26 +02:00
Aliaksandr Valialkin
576a80b3d9
lib/storage: do not modify filterLoopsCount stats with loopsCount stats
...
Such a modification can result in incorrect filter sorting later
2021-03-17 00:49:26 +02:00
Aliaksandr Valialkin
f104f3eb2a
all: make golangci-lint happy after the commit 6378205415
2021-03-17 00:24:40 +02:00
Aliaksandr Valialkin
6378205415
lib/netutil: enable IPv6 UDP listening if -enableTCP6
command-line flag is passed to VictoriaMetrics
...
This is a follow-up for 18cfc4be7b
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1131
2021-03-17 00:16:17 +02:00
Nikolay
18cfc4be7b
Adds udp6 support for ingest servers ( #1134 )
...
with flag -enableUDP6 https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1131
2021-03-17 00:03:06 +02:00
Aliaksandr Valialkin
35bb44d317
lib/uint64set: optimize bucket16.addMulti a bit
2021-03-16 21:09:07 +02:00
Aliaksandr Valialkin
fd86a7dc1d
lib/storage: time series search optimization according to production workload profiling
...
Do not pass filter metric ids to getMetricIDsForTagFilter, since it has been appeared that this slows down
the function by multiple times when it finds big number of metricIDs (tens of millions).
2021-03-16 20:01:43 +02:00
Aliaksandr Valialkin
e36fbfae5b
lib/storage: further tuning for time series search
2021-03-16 18:46:22 +02:00
Aliaksandr Valialkin
dd7e82c34f
app/vmstorage: add -logNewSeries
command-line flag for determining the source of series churn rate
2021-03-15 22:38:50 +02:00
Aliaksandr Valialkin
37323c57c9
lib/influxutils: return response compatible with InfluxDB 1.8.4
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1124
2021-03-15 22:19:59 +02:00
f41gh7
acf2a82d3c
typo fix
2021-03-15 23:03:52 +03:00
Aliaksandr Valialkin
85a95bf60c
all: various fixes in command-line flag descriptions
2021-03-15 21:59:25 +02:00
Aliaksandr Valialkin
7c37e9aea9
app/{vminsert,vmagent}: a follow-up for b1aa8c3d8f
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1124
2021-03-15 21:37:55 +02:00
Aliaksandr Valialkin
9c77e34ef9
lib/storage: further tuning for time series selector code
2021-03-15 20:31:34 +02:00
Aliaksandr Valialkin
fb82c4b9fa
lib/uint64set: reduce the size of bucket16 by storing smallPool by pointer.
...
This reduces CPU time spent on bucket16 copying inside bucket13.addBucketAtPos.
2021-03-15 17:23:31 +02:00
Aliaksandr Valialkin
eb103e1527
lib/uint64set: optimize Set.AddMulti for large sorted sets
2021-03-15 17:10:40 +02:00
Aliaksandr Valialkin
43504ebd14
lib/uint64set: optimize bucket16.add and bucket16.addMulti a bit
2021-03-15 16:58:12 +02:00
Aliaksandr Valialkin
3ccf7ea20c
lib/storage: tune per-day index search
2021-03-15 13:31:55 +02:00
Aliaksandr Valialkin
c14dafce43
lib/promscrape: an attempt to reduce memory usage when vmagent scrapes targets with varying number of metrics
...
Do not cache too big byte buffers and too big writeRequestCtx objects,
since it is cheaper to re-create them instead of wasting RAM for their caching.
This reverts 7f6f350ee1
2021-03-15 11:45:39 +02:00
Aliaksandr Valialkin
7f6f350ee1
lib/promscrape: return back the logic for flushing big buffers to storage from the commit 3fd8653b40
...
This should reduce memory usage when vmagent scrapes targets with big number of metrics and `-promscrape.streamParse` isn't enabled
2021-03-14 22:26:00 +02:00
Aliaksandr Valialkin
b88806ecbf
lib/promscrape/discovery/kubernetes: do not start object watcher until initial objects are loaded
2021-03-14 21:55:00 +02:00
Aliaksandr Valialkin
83edbb7cab
lib/promscrape: retry service discovery in a few seconds if it starts returning 0 targets
...
This should reduce recovery time from temporary issues during service discovery
2021-03-14 21:53:23 +02:00
Aliaksandr Valialkin
bf15d6a6a2
lib/promscrape: remove duplicate target
word in error message
2021-03-14 21:52:02 +02:00
Aliaksandr Valialkin
d409898515
lib/promscrape/discovery/kubernetes: further optimize kubernetes service discovery for the case with many scrape jobs
...
Do not re-calculate labels per each scrape job - reuse them instead for scrape jobs with identical Kubernetes role
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-14 21:14:53 +02:00
Aliaksandr Valialkin
7a16e8e3a2
lib/promscrape/discovery: fixes after 133b288681
...
- Removed a deadlock in addAPIWatcher
- Do not create unused ScrapeWork objects
- Do not spend CPU resources on creating objectByKey map in addAPIWatcher
This work is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1125
2021-03-13 15:18:51 +02:00
Aliaksandr Valialkin
def014eb75
lib/promscrape/discovery/kubernetes: remove debug lines left after the commit 133b288681
2021-03-12 11:22:33 +02:00
Aliaksandr Valialkin
ca4d5ce037
lib/proxy: there is no need in cloning tlsCfg, which has been created two lines above
2021-03-12 10:47:02 +02:00
Aliaksandr Valialkin
895d5d1355
lib/proxy: set proxy address in tls.Config.ServerName instead of the target address
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116
2021-03-12 10:41:33 +02:00
Aliaksandr Valialkin
a6a71ef861
lib/promscrape: add ability to configure proxy options via proxy_tls_config
, proxy_basic_auth
, proxy_bearer_token
and proxy_bearer_token_file
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116
2021-03-12 03:36:19 +02:00
Aliaksandr Valialkin
f669531506
lib/storage: further tune filters sorting logic
2021-03-12 00:53:04 +02:00
Aliaksandr Valialkin
133b288681
lib/promscrape/discovery/kubernetes: use a single watcher per apiURL
...
Previously multiple scrape jobs could create multiple watchers for the same apiURL. Now only a single watcher is used.
This should reduce load on Kubernetes API server when many scrape job configs use Kubernetes service discovery.
2021-03-11 16:43:04 +02:00
Aliaksandr Valialkin
c0ac740f93
lib/proxy: do not show inline basic auth passwords when logging errors related to proxy_url
2021-03-11 13:43:36 +02:00
Aliaksandr Valialkin
bebcb8130c
lib/promscrape/discovery/kubernetes: localize Bookmark parsing code
...
This is a follow-up for e772d1c920
2021-03-11 13:08:08 +02:00
Aliaksandr Valialkin
e772d1c920
lib/promscrape/discovery/kubernetes: reduce load on Kubernetes API server by using watch bookmarks
...
This allows continuing object watch from the last bookbark instead of reloading all the objects
on watch errors or timeouts.
See https://kubernetes.io/docs/reference/using-api/api-concepts/#watch-bookmarks
2021-03-10 15:06:35 +02:00
Aliaksandr Valialkin
bc5c4add89
lib/httpserver: export vm_available_memory_bytes
and vm_available_cpu_cores
metrics
...
These metrics are useful for tracking the available memory and CPU cores for VictoriaMetrics apps.
2021-03-10 12:02:42 +02:00
Aliaksandr Valialkin
787242d7b0
lib/proxy: pass proxy hostname in Host
header of the CONNECT
request
...
This should resolve the following issue when connecting to tls proxy:
cannot validate certificate for ... because it doesn't contain any IP SANs
2021-03-09 20:39:40 +02:00
Aliaksandr Valialkin
36fd007247
lib/proxy: set missing ServerName in TLS config for proxy_url
.
...
While at it, allow setting Proxy-Authorization for `proxy_url` via `basic_auth` and `bearer_token` configs.
2021-03-09 18:58:18 +02:00
Nikolay
ad34f42467
Changes tlsConfig init for proxy connections ( #1121 )
...
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116
2021-03-09 18:51:00 +02:00
Aliaksandr Valialkin
3fd8653b40
lib/promscrape: apply sample_limit
after metric relabeling is applied as Prometheus does
...
See the description for `sample_limit` option from Prometheus docs:
Per-scrape limit on number of scraped samples that will be accepted.
If more than this number of samples are present after metric relabeling
the entire scrape will be treated as failed. 0 means no limit.
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
2021-03-09 15:47:18 +02:00
Aliaksandr Valialkin
4b18a4f026
lib/promscrape/discovery/kubernetes: remove too verbose logs about starting and stopping the watchers
...
Log the number of objects loaded per each watch url
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-09 15:05:14 +02:00
John Belmonte
364fdf4a56
spelling fix: adjacent ( #1115 )
2021-03-09 09:18:19 +02:00
Aliaksandr Valialkin
14a399dd06
lib/promscrape: add scrape_offset
option to scrape_config
...
This option can be used for specifying the particular offset per each scrape interval for target scraping
2021-03-08 12:03:33 +02:00
Aliaksandr Valialkin
345980f78f
lib/storage: go fmt
2021-03-08 12:03:31 +02:00
Aliaksandr Valialkin
18fe0ff14b
lib/storage: tune loopsCount estimations in getMetricIDsForTagFilterSlow
...
The adjusted estmations give up to 2x lower median response times on 200qps /api/v1/query_range workload
2021-03-07 21:12:35 +02:00
Aliaksandr Valialkin
ab4f090c63
lib/promscrape/discovery/kubernetes: reduce memory usage further when big number of scrape jobs are configured for the same kubernetes_sd_config
role
...
Serialize reloading per-role objects, so they don't occupy too much memory when objects for many scrape jobs are simultaneously refreshed.
Do not reload per-role objects if they were already refreshed by concurrent goroutines. This should reduce load on Kubernetes API server
when big number of scrape jobs are configured for the same Kubernetes role.
This is a follow-up for 17b87725ed
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-07 19:51:03 +02:00
Aliaksandr Valialkin
1187ee5e16
lib/decimal: prevent exponent overflow when processing values close to zero
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1114
2021-03-05 18:52:47 +02:00
Aliaksandr Valialkin
17b87725ed
lib/promscrape/discovery/kubernetes: reduce memory usage when Kubernetes service discovery is configured on a big number of scrape jobs
...
Previously vmagent was creating a separate Kubernetes object cache per each scrape job.
This could result in increased memory usage when monitoring a Kubernetes cluster with big number of objects (pods / nodes / services, etc.)
as seen at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
Now it uses a shared map of scrape objects across multiple scrape jobs.
2021-03-05 17:29:55 +02:00
Aliaksandr Valialkin
c9a25e931b
lib/promscrape/discovery/kubernetes: move apiWatcher code to a separate file
2021-03-05 12:36:05 +02:00
Aliaksandr Valialkin
444fca89f8
lib/promscrape: make cluster membership calculations consistent across 32-bit and 64-bit architectures
2021-03-05 09:06:17 +02:00
Aliaksandr Valialkin
423cd981fb
lib/promscrape: add -promscrape.cluster.replicationFactor
command-line flag for replicating scrape targets among vmagent
instances in the cluster
2021-03-04 10:20:15 +02:00
Aliaksandr Valialkin
9a48c1b53d
lib/promscrape/discovery/kubernetes: fix tests after e154f4a644
2021-03-03 22:41:30 +02:00
Nikolay
e154f4a644
Fix ingress discovery api ( #1110 )
2021-03-03 10:43:39 +02:00
Aliaksandr Valialkin
7906316741
lib/promscrape/discovery/kubernetes: properly check for nil pointer inside interface
...
See https://mangatmodi.medium.com/go-check-nil-interface-the-right-way-d142776edef1
This fixes a panic when the ScrapeWork is filtered out in swcFunc.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1108
2021-03-03 10:33:17 +02:00
Aliaksandr Valialkin
03dceb700d
lib/promscrape: go fmt
2021-03-02 21:20:43 +02:00
Aliaksandr Valialkin
4de4da1e2a
lib/storage: typo fix: umarshal -> unmarshal
2021-03-02 20:47:59 +02:00
Aliaksandr Valialkin
062211c61c
lib/promscrape: pre-allocate space for a map in mergeLabels
...
This should reduce the number of memory allocations when discovering big number of targets
2021-03-02 18:41:58 +02:00
Aliaksandr Valialkin
d1d34664b5
lib/promscrape/discovery: properly track vm_promscrape_discovery_kubernetes_objects_removed_total metric
2021-03-02 18:32:54 +02:00
Aliaksandr Valialkin
a939667ce0
lib/promrelabel: remove unneded optimizations for labeldrop
and labelkeep
actions
...
These optimizations may slow down code execution by matching the same label against regexp two times instead of a single time
2021-03-02 17:55:43 +02:00
Aliaksandr Valialkin
6a7ef768ff
lib/promscrape/discovery/kubernetes: cache ScrapeWork objects as soon as the corresponding k8s objects are changed
...
This should reduce CPU usage and memory usage when Kubernetes contains tens of thousands of objects
2021-03-02 16:42:55 +02:00
Aliaksandr Valialkin
22b1941cfc
lib/promscrape/discovery/ec2: follow-up after f6114345de
2021-03-02 13:46:26 +02:00
Nikolay
f6114345de
Adds webIndentity token for aws ( #1099 )
...
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1080
2021-03-02 13:27:09 +02:00
Aliaksandr Valialkin
937f382938
lib/protoparser/prometheus: properly unescape label values in Prometheus exposition format
...
Unescape only `\n`, `\"` and `\\` sequences as Prometheus does. Other escape sequences shouldn't be unescaped.
2021-03-02 13:21:43 +02:00
Aliaksandr Valialkin
019d8e88d8
lib/protoparser/graphite: fix parsing of a Graphite line with empty tags such as foo; 1 2
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1100
2021-03-01 17:16:35 +02:00
Aliaksandr Valialkin
1a3689af9a
lib/promscrape/discovery/kubernetes: deflake tests; a follow-up for 05fb08713c
2021-03-01 14:32:12 +02:00
Aliaksandr Valialkin
62ebf5c88e
lib/promscrape: explicitly stop and cleanup service discovery routines when new config is read from -promscrape.config
...
This should reduce memory usage when `-promscrape.config` file frequently changes
2021-03-01 14:14:00 +02:00
Aliaksandr Valialkin
e32ad9e923
lib/promscrape: use target arg in ScrapeWork cache
2021-03-01 12:29:09 +02:00
Aliaksandr Valialkin
f5d77a7081
lib/promscrape: typo fix, which prevented from caching ScrapeWork entries
2021-03-01 12:12:56 +02:00
Aliaksandr Valialkin
e84153d5ca
lib/promscrape: add vm_promscrape_scrapework_cache_* metrics for tracking ScrapeWork cache effectiveness
2021-03-01 12:05:45 +02:00
Aliaksandr Valialkin
7f15cd7161
lib/httpserver: make make errcheck
happy after the commit 9fc7726d84
2021-03-01 00:34:43 +02:00
Aliaksandr Valialkin
530e9904af
lib/promscrape: reduce CPU usage an memory allocations when constructing scrapeWorkKey
2021-02-28 22:29:58 +02:00
Aliaksandr Valialkin
9fc7726d84
lib/httpserver: make sure the gzipResponseWriter.Write() is called on Flush() and Close() calls
...
This should fix the `http: superfluous response.WriteHeader call` issue
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1078
2021-02-28 19:22:50 +02:00
Aliaksandr Valialkin
e5ca8ac0db
lib/promscrape: add ability to spread scrape targets among multiple vmagent instances
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1084
2021-02-28 18:41:08 +02:00
Aliaksandr Valialkin
e02d1ef93c
lib/promscrape/discovery/kubernetes: properly account the number of objects when watcher is stopped
...
A follow-up for b21b110b7a
2021-02-28 17:06:02 +02:00
Aliaksandr Valialkin
b21b110b7a
lib/promscrape/discovery/kubernetes: add vm_promscrape_discovery_kubernetes_*
metrics for monitoring internal state of k8s service discovery
2021-02-28 16:57:40 +02:00
Aliaksandr Valialkin
c459600346
lib/promscrape/discovery/kubernetes: remove resourceVersionMatch=NotOlderThan
query arg when watching for k8s object changes, since it cannot be used when watch=1
query arg is passed
2021-02-28 16:07:14 +02:00
Aliaksandr Valialkin
59a31171e3
lib/promscrape: fix possible deadlock in parallel execution of target relabeling
2021-02-28 16:05:13 +02:00
Aliaksandr Valialkin
68a0f5ce12
lib/promscrape/discovery/kubernetes: fix deadlock in startWatcherForURL
...
reloadObjects must be called without holding aw.mu lock
2021-02-28 15:26:30 +02:00
Aliaksandr Valialkin
b523e0369c
lib/promscrape/discovery/kubernetes: typo fix after 241ffd1f3b
2021-02-28 15:12:17 +02:00
Aliaksandr Valialkin
241ffd1f3b
lib/promscrape/discovery/kubernetes: pre-populate labelsByKey in reloadObject()
2021-02-28 15:09:49 +02:00
Aliaksandr Valialkin
05fb08713c
lib/promscrape/discovery/kubernetes: compare sorted sets of labels in tests
...
This should deflake tests where the order of labels isn't stable
2021-02-28 14:10:19 +02:00
Aliaksandr Valialkin
af8b7e8391
lib/promscrape: add missing startWatchersForRole()
call at the beginning of apiWatcher.getLabelsForRole
2021-02-28 14:00:17 +02:00
Aliaksandr Valialkin
422b31de40
lib/promscrape/discovery/kubernetes: reload k8s resources on every error
...
This is needed for obtaining fresh resourceVersion
2021-02-27 01:47:27 +02:00
Aliaksandr Valialkin
7cc3d96a41
lib/fs: follow-up after f3a03c4164
2021-02-27 01:01:47 +02:00
Nikolay
f3a03c4164
Adds windows build ( #1040 )
...
* fixes windows compilation,
adds signal impl for windows,
adds free space usage for windows,
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1036
NOTE victoria metrics database still CANNOT work under windows system,
only vmagent is supported.
To completly port victoria metrics, you have to fix issues with separators,
parsing and posix file removall
* rollback separator
* Adds windows setInformation api,
it must behave like unix, need to test it.
changes procutil
* check for invlaid param
* Fixes posix delete semantic
* refactored a bit
* fixes openbsd build
* removed windows api call
* Fixes code after windows add
* Update lib/procutil/signal_windows.go
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-02-27 00:37:07 +02:00
Aliaksandr Valialkin
a78948ae8b
lib/promscrape: yet another typo fix after ed8441ec52
2021-02-26 23:35:47 +02:00
Aliaksandr Valialkin
8683ea85e6
lib/fs: properly handle stale NFS file handle
error during file deletion
...
This error can appear when -storageDataPath points to NFS volume and the given file has been already removed.
2021-02-26 23:25:14 +02:00
Aliaksandr Valialkin
9fa2632ac3
lib/promscrape: typo fix after ed8441ec52
2021-02-26 23:04:05 +02:00
Aliaksandr Valialkin
ed8441ec52
lib/promscrape: cache ScrapeWork
...
This should reduce the time needed for updating big number of scrape targets.
2021-02-26 21:43:22 +02:00
Aliaksandr Valialkin
815666e6a6
lib/promscrape/discovery/kubernetes: cache target labels
...
This should reduce CPU usage on repeated SDConfig.GetLabels() calls.
2021-02-26 20:23:28 +02:00
Aliaksandr Valialkin
19712fc2bd
lib/promscrape/discovery/kubernetes: errcheck fix
2021-02-26 17:00:08 +02:00
Aliaksandr Valialkin
c8f2f9b2e8
lib/promscrape: cleanup after 9b2246c29b
...
Main points:
* Revert changes outside lib/promscrape/discovery/kuberntes . These changes can be applied later in a separate commit
* Minimize changes in lib/promscrape/discovery/kubernetes compared to a93e644001
* Corner case fixes.
2021-02-26 16:54:05 +02:00
Nikolay
9b2246c29b
vmagent kubernetes watch stream discovery. ( #1082 )
...
* started work on sd for k8s
* continue work on watch sd
* fixes
* continue work
* continue work on sd k8s
* disable gzip
* fixes typos
* log errror
* minor fix
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-02-26 16:46:13 +02:00
Aliaksandr Valialkin
a93e644001
lib/promscrape: remove duplicate code a bit
2021-02-26 16:39:56 +02:00
Aliaksandr Valialkin
f7b242540b
lib/promscrape: reduce processing time for big number of discovered targets by processing them in parallel
2021-02-26 16:39:56 +02:00
Aliaksandr Valialkin
f7049e2af7
lib/promrelabel: optimize labeldrop
and labelkeep
relabeling for prefix.*
and prefix.+
regexps
2021-02-24 17:58:28 +02:00
Aliaksandr Valialkin
2c44178645
lib/storage: consistency renaming: durationsPerDateTagFilterCache -> loopsPerDateTagFilterCache
2021-02-23 15:47:19 +02:00
faceair
15d61c4879
lib/storage: correct tagfilter match cost ( #1079 )
2021-02-22 21:46:56 +02:00
Aliaksandr Valialkin
d136081040
lib/promrelabel: add more optimizations for relabeling for common cases
2021-02-22 16:33:55 +02:00
Aliaksandr Valialkin
dd1e53b119
lib/promrelabel: optimize relabeling performance for common cases
2021-02-22 00:51:13 +02:00
Aliaksandr Valialkin
ff5bbc4b88
lib/promscrape: export vm_promscrape_target_relabel_duration_seconds metric
2021-02-21 23:21:42 +02:00
Aliaksandr Valialkin
636c55b526
lib/mergeset: reduce memory usage for inmemoryBlock by using more compact items representation
...
This also should reduce CPU time spent by GC, since inmemoryBlock.items don't have pointers now,
so GC doesn't need visiting them.
2021-02-21 22:06:47 +02:00
Aliaksandr Valialkin
388cdb1980
lib/storage: do not re-calculate stats for heavy tag filters
...
This should reduce the number of slow queries when stats for heavy tag filters was recalculated.
2021-02-21 21:39:01 +02:00
Aliaksandr Valialkin
48656dcc38
lib/{mergeset,storage}: allow merging smaller number of small parts
...
While this may increase CPU and disk IO usage needed for background merge,
this also recudes CPU usage during queries in production. This is because
such queries tend to read recently added data and it is better to have lower number
of parts for such data in order to reduce CPU usage.
This partially reverts ebf8da3730
2021-02-21 21:28:36 +02:00
Aliaksandr Valialkin
cb311bb156
lib/{mergeset,storage}: do not use pools for indexBlock and inmemoryBlock during their caching, since this results in higher memory usage in production without any performance gains
2021-02-21 21:18:59 +02:00
Aliaksandr Valialkin
2cfb376945
lib/promscrape: typo fix after the commit f26162ec99
2021-02-19 00:33:37 +02:00
Aliaksandr Valialkin
c2678754e4
app/vmagent: properly perform graceful shutdown, which was broken in the commit 1d1ba889fe
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1065
2021-02-19 00:31:34 +02:00
Aliaksandr Valialkin
f26162ec99
lib/promscrape: add scrape_align_interval config option into scrape config
...
This option allows aligning scrapes to a particular intervals.
2021-02-18 23:53:44 +02:00
Aliaksandr Valialkin
f9084611bd
lib/storage: use composite index for a query with a name filter and negative filters
2021-02-18 18:57:23 +02:00
Aliaksandr Valialkin
a537c4f602
lib/storage: properly handle queries containing a filter on metric name plus any number of negative filters and zero non-negative filters
...
Example: `node_cpu_seconds_total{mode!="idle"}`
2021-02-18 18:46:36 +02:00
Aliaksandr Valialkin
e540c02014
lib/storage: prevent from running identical heavy tag filters in concurrent queries when measuring the number of loops for such tag filter.
...
This should reduce CPU usage spikes when measuring the number of loops needed for heavy tag filters
2021-02-18 13:58:18 +02:00
Aliaksandr Valialkin
711f8a5b8d
lib/storage: sort tag filters by the number of loops they need for the execution
...
This metric should work better than the filter execution duration, since it cannot be distorted
by concurrently running queries.
2021-02-18 12:47:38 +02:00
Aliaksandr Valialkin
ce99b48a9a
Revert "lib/mergeset: tune lifetime for entries inside block caches"
...
This reverts commit 458c89324d
.
Production testing revealed zero improvements for memory usage with reduced lifetime for entries in block caches.
2021-02-17 20:42:21 +02:00
Aliaksandr Valialkin
939d5ffc2b
lib/storage: move composite filters to the top during sorting
2021-02-17 20:26:51 +02:00
Aliaksandr Valialkin
faad6f84a4
lib/storage: return back filter arg to getMetricIDsForTagFilter function
...
The filter arg has been removed in the commit c7ee2fabb8
because it was preventing from caching the number of matching time series per each tf.
Now the cache contains duration for tf execution, so the filter shouldn't break such caching.
2021-02-17 19:33:22 +02:00
Aliaksandr Valialkin
d4849561ef
app/vmstorage: export vm_composite_filter_success_conversions_total and vm_composite_filter_missing_conversions_total metrics
2021-02-17 19:13:38 +02:00
Aliaksandr Valialkin
33806264ec
lib/storage: revert ecf132933e
, since negative filters require the same amount of work as non-negative filters
2021-02-17 18:55:04 +02:00
Aliaksandr Valialkin
63fc140624
lib/storage: tag filters sorting...
2021-02-17 17:55:29 +02:00
Aliaksandr Valialkin
74424b55ee
lib/storage: further tune tag filters sorting
2021-02-17 17:28:15 +02:00
Aliaksandr Valialkin
442fcfec5a
lib/storage: tune the logic for sorting tag filters according the their exeuction times
2021-02-17 15:00:08 +02:00
Aliaksandr Valialkin
4a07820048
lib/storage: make sure that nobody uses partitions when closing the table
2021-02-17 14:59:04 +02:00
Aliaksandr Valialkin
1256931aee
lib/httpserver: make errcheck happy
2021-02-16 22:05:32 +02:00
Aliaksandr Valialkin
d61f7b7279
lib/storage: more tuning for tag filters sorting according the time they take
2021-02-16 21:22:23 +02:00
Aliaksandr Valialkin
458c89324d
lib/mergeset: tune lifetime for entries inside block caches
...
This should reduce memory usage in general case without significant CPU usage increase
2021-02-16 18:11:51 +02:00
Aliaksandr Valialkin
2824856691
lib/mergeset: clarify comments in the code a bit
2021-02-16 18:02:57 +02:00
Aliaksandr Valialkin
1bf6cd814d
lib/uint64set: remove memory allocation in bucket16.appendTo when sorting smallPool
2021-02-16 15:31:49 +02:00
Aliaksandr Valialkin
8ec45ff335
lib/httpserver: cache /metrics
output for a second
...
This should reduce CPU load when `/metrics` output is scraped with a frequency exceeding a request per second
2021-02-16 14:56:36 +02:00
Aliaksandr Valialkin
b861a64510
lib/protoparser/influx: make sure that escaped whitespace can be put in measurement, tag names and field names
2021-02-16 13:59:18 +02:00
Aliaksandr Valialkin
7faa762021
lib/mergeset: remove unused code after a4140de9e6
2021-02-16 13:40:09 +02:00
Aliaksandr Valialkin
ca191696fe
lib/storage: tune sorting for tag filters
2021-02-16 13:04:49 +02:00
Aliaksandr Valialkin
ecf132933e
lib/storage: increase match cost for negative tag filters, since they need to scan all the label pairs
2021-02-15 16:34:23 +02:00
Aliaksandr Valialkin
4e39bf148c
vendor: update github.com/VictoriaMetrics/metrics from v1.13.1 to v1.14.0
...
The new version switches from log-linear histograms to log-based histograms,
which provide up to 3.6 times better accuracy.
2021-02-15 15:12:29 +02:00
Aliaksandr Valialkin
9f5ac603a7
lib/storage: reduce the minimum supported retention for inverted index from one month to one day
2021-02-15 15:12:29 +02:00
Aliaksandr Valialkin
2e30202dc7
lib/flagutil: prevent from integer overflow when parsing duration
2021-02-15 15:12:29 +02:00
Aliaksandr Valialkin
38d7e96602
lib/promscrape/discovery/kubernetes: add __meta_kubernetes_endpoints_label_*
and __meta_kuberntes_endpoints_annotation_*
labels to role: endpoints
...
This syncs kubernetes SD with Prometheus 2.25
See 617c56f55a
2021-02-15 02:51:16 +02:00
Aliaksandr Valialkin
74963f71c6
lib/logger: explicitly import "time/tzdata" package for embedding tzdata into the app
...
The approach with `timetzdata` build tag didn't work for GOARCH=arm and GOARCH=ppc64le
due to the issue https://github.com/golang/go/issues/44073#issuecomment-778854298
2021-02-15 01:00:01 +02:00
Aliaksandr Valialkin
71c417427c
lib/storage: sort tag filters by actual execution time instead of by the number of matching time series
...
This should improve query speed for queries with regexp filters matching small number of time series
on a label with big number of unique values.
2021-02-15 00:18:13 +02:00
Aliaksandr Valialkin
c727d2219b
lib/storage: properly hanle regexp tag filters with dots, which can be converted to full string match filters.
...
For example `{label=~"foo\.bar"}` should be converted to `{label="foo.bar"}`. Previously it has was mistakenly conveted to `{label="foo\.bar"}` .
This could result in missing time series for such tag filters.
2021-02-14 23:38:14 +02:00
Aliaksandr Valialkin
80dc74dbc1
lib/promscrape: remove vm_promscrape_scrapes_failed_per_url_total and vm_promscrape_scrapes_skipped_by_sample_limit_per_url_total metrics
...
These metrics may result in big number of time series when vmagent scrapes thousands of targets and these targets constantly changes.
* It is better using `up == 0` query for determining failing targets.
* It is better using the following query for determining targets with exceeded limit on the number of metrics:
scrape_samples_scraped > 0 if up == 0
2021-02-12 05:26:04 +02:00
Aliaksandr Valialkin
0e26b7168a
lib/storage: return back in-order applying of tag filters, since concurrently executing tag filters can result in CPU and RAM waste in common case
2021-02-10 22:41:04 +02:00
Aliaksandr Valialkin
b51c23dc5b
lib/storage: parallelize tag filters execution a bit
...
This should reduce execution time when a query contains multiple tag filters and each such filter matches big number of time series.
2021-02-10 18:12:25 +02:00
Aliaksandr Valialkin
c7ee2fabb8
lib/storage: remove filter arg from getMetricIDsForDateTagFilter function
...
The `filter` arg breaks the logic for sorting tag filters by the matching metrics,
which may result in non-optimal performance during time series search.
2021-02-10 18:12:20 +02:00
Aliaksandr Valialkin
57cac289e0
lib/storage: fix inconsistencies in error logs
2021-02-10 18:12:16 +02:00
Aliaksandr Valialkin
5d5f0b0627
lib/storage: load metadata before loading indexdb, since indexdb depends on the metadata
2021-02-10 17:55:40 +02:00
Aliaksandr Valialkin
cdecf83ce5
app/vmstorage: export vm_composite_index_min_timestamp metric
2021-02-10 17:14:08 +02:00
Aliaksandr Valialkin
553016ea99
lib/storage: disable composite index usage when querying old data
2021-02-10 14:57:50 +02:00
Aliaksandr Valialkin
fcb7655d1e
lib/storage: fix metric name match for composite filter
2021-02-10 01:27:45 +02:00
Aliaksandr Valialkin
c7dccebaef
lib/storage: optimize search by label filters matching big number of time series
2021-02-10 00:44:54 +02:00
Aliaksandr Valialkin
6b4e6c229c
lib/storage: reduce lock contention in dateMetricIDCache when registering new time series for the current day
...
This should help systems with multiple CPU cores
2021-02-10 00:01:13 +02:00
Aliaksandr Valialkin
31f6b9c977
lib/fs: remove the code for tracking whether the given memory region is in page cache
...
This code didn't give performance gains under production workload, so let's remove it in order to simplify the code.
2021-02-09 16:49:03 +02:00
Aliaksandr Valialkin
0a69122d81
lib/mergeset: remove dead code left after a4140de9e6
2021-02-09 16:33:52 +02:00
Aliaksandr Valialkin
d56390b925
optimize Storage.updatePerDateData()
2021-02-09 02:55:36 +02:00
Aliaksandr Valialkin
fda61e8e96
lib/storage: skip deduplication when creating inmemory data blocks
...
The deduplication will be performed later during merging such blocks.
2021-02-09 02:25:32 +02:00
Aliaksandr Valialkin
a4140de9e6
lib/mergeset: unconditionally cache indexdb blocks
...
Production workloads show that indexdb blocks must be cached unconditionally for reducing CPU usage.
This shouldn't increase memory usage too much, since unused blocks are removed from the cache every two minutes.
2021-02-09 00:47:50 +02:00
Aliaksandr Valialkin
cb96a1865b
app/vmstorage: export missing vm_cache_size_bytes
metrics for indexdb and data caches
2021-02-09 00:47:00 +02:00
Aliaksandr Valialkin
c5770600a2
lib/cgroup: follow-up after b9bf3cbe3e
2021-02-08 15:54:38 +02:00
Nikolay
b9bf3cbe3e
refactored cgroups limits, ( #1061 )
...
adds tests, remove os.Exec
2021-02-08 15:46:22 +02:00
Aliaksandr Valialkin
2242647a04
lib/storage: optimize data ingestion in the beginning of every hour
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1046
2021-02-08 12:01:12 +02:00
Aliaksandr Valialkin
8f28a578d3
lib/logger: exit the app if unsupported timezone value has been passed to -loggerTimezone
...
While at it, clarify descrption for `-loggerTimezone` command-line flag.
2021-02-07 23:35:37 +02:00
Aliaksandr Valialkin
83d3e582ab
lib/storage: check for prevHourMetricIDs cache before falling back to checking for (date, metricID) entries during data ingestion
...
This should reduce possible CPU usage spikes at the beginning of every hour.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1046
2021-02-04 18:48:13 +02:00
Aliaksandr Valialkin
9fb38569eb
lib/httpserver: expose process_open_fds
and process_max_fds
metrics
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/402
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1037
2021-02-04 16:40:50 +02:00
Nikolay
48c8c5093b
fixes dockerswarm ( #1053 )
...
fixes improper usage of host network services
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1028
2021-02-04 15:56:42 +02:00
Aliaksandr Valialkin
d16f22f3a1
app/vmselect,lib/storage: properly parse Graphite selectors with inner wildcards
...
Example: foo{bar{x,yz},a[b-c],*de}
2021-02-03 20:14:22 +02:00
Aliaksandr Valialkin
a5a1b9bd66
lib/storage: fix a bug, which breaks searching by Graphite wildcard filters
2021-02-03 20:14:22 +02:00
Aliaksandr Valialkin
6123aa3e75
sort orSuffixes in tagFilter.InitFromGraphiteQuery for faster seeks
2021-02-03 20:14:22 +02:00
Aliaksandr Valialkin
157c02622b
app/vmselect: add ability to set Graphite-compatible filter via {__graphite__="foo.*.bar"}
syntax
2021-02-03 01:21:54 +02:00
Aliaksandr Valialkin
4068f8d590
lib/promscrape: add vm_promscrape_service_discovery_duration_seconds metric
2021-02-02 16:15:25 +02:00
Aliaksandr Valialkin
bd11fd8f1d
lib/promscrape: add vm_promscrape_scrape_retries_total
, vm_promscrape_discovery_retries_total
and vm_promscrape_discovery_requests_total
metrics
2021-02-01 20:06:27 +02:00
Aliaksandr Valialkin
f0087f0dbb
lib/flagutil: typo fix in comment to ArrayInt.GetOptionalArgOrDefault() func
2021-02-01 14:35:39 +02:00
Aliaksandr Valialkin
b2aa80e74b
app/vmagent: add -remoteWrite.roundDigits command-line option for limiting the number of digits after the point for stored values
...
This commit also adds --vm-round-digits command-line option to vmctl tool.
2021-02-01 14:27:09 +02:00
Aliaksandr Valialkin
d6347a3e56
lib/logger: initialize timezone by UTC in order to fix failing tests
2021-01-27 00:59:12 +02:00
Aliaksandr Valialkin
fc5b26d856
lib/promscrape: export vm_promscrape_scrapes_failed_per_url_total
and vm_promscrape_scrapes_skipped_by_sample_limit_per_url_total
metrics
...
These metrics could be useful for determining imporperly working scrape targets.
Note that these metrics are exported only for failing scrape targets. They aren't exposed for normally working targets.
2021-01-27 00:39:26 +02:00
Aliaksandr Valialkin
de3c662e8a
all: consistently use timers from timerpool
2021-01-27 00:39:26 +02:00
Aliaksandr Valialkin
3149ac7a7e
lib/fs: properly initialize cleaner for pageCache bitmaps
...
Previously it wasnt working because the timer was fired only once
2021-01-27 00:39:26 +02:00
Aliaksandr Valialkin
3fe848cdd7
lib/logger: add -loggerTimezone
command-line flag for adjusting timezone for timestamps in log messages
2021-01-26 22:51:54 +02:00
Aliaksandr Valialkin
8cea3c3cc4
lib/promscrape: retry scrape and service discovery requests when the remote server closes http keep-alive connection
2021-01-22 13:22:33 +02:00
faceair
b638c1eed5
lib/mergeset: add missing shouldCacheBlock ( #1019 )
2021-01-15 11:46:01 +02:00
Aliaksandr Valialkin
bdd0a1cdb2
lib/backup: increase backup chunk size from 128MB to 1GB
...
This should reduce costs for object storage API calls by 8x. See https://cloud.google.com/storage/pricing#operations-pricing
2021-01-13 12:16:35 +02:00
Nikolay
7bf5d48315
bumps minimal tls version ( #1012 )
2021-01-13 00:35:47 +02:00
Aliaksandr Valialkin
31ec79eaf6
lib/storage: inline marshalTags function and remove the code for handling duplicate tags from here
...
This is a follow-up commit after c8ea697db8
2021-01-12 15:13:30 +02:00
Aliaksandr Valialkin
c8ea697db8
lib/storage: de-duplicate tags in MetricName.sortTags
...
Leave only the last tag among tags with duplicate keys. This is needed for reliable addition of extra_labels
during data ingestion. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1007 for details.
2021-01-12 15:03:42 +02:00
Nikolay
7976c22797
Fixes error handling for promscrape.streamParse ( #1009 )
...
properly return error if client cannot read data,
properly suppress scraper errors
2021-01-12 13:31:47 +02:00
Aliaksandr Valialkin
2c44f9989a
lib/promscrape: properly show scrape duration on /targets
page
...
Previously it has been shown as 0.000s for any scrape duration.
2021-01-11 21:14:46 +02:00
Aliaksandr Valialkin
24ffad74c1
all: use net.Dial
instead of fasthttp.Dial
, because fasthttp.Dial
limits the number of concurrent dials to 1000
2021-01-11 12:53:30 +02:00
Aliaksandr Valialkin
9dcb18e03d
app/vmstorage: disable final merge by default, since it may result in high disk IO and CPU usage without measurable benefits such as increased query performance and reduced disk space usage
2021-01-08 00:16:05 +02:00
Aliaksandr Valialkin
490c69c64e
lib/storage: wait for pending transactions before closing and dropping the partition
...
This deflakes `make test-full-386` test
2020-12-25 11:45:53 +02:00
Aliaksandr Valialkin
cab7e936a3
lib/storage: physically remove stale parts
...
Previously they were removed from partition struct, but the corresponding directories weren't removed.
This is a follow-up for 46dba00756
2020-12-24 16:51:36 +02:00
Nikolay
b21e16ad0c
fixes kubernetes_sd ( #983 )
...
* fixes kubernetes_sd,
adds missing service metadata for pod ports without endpoint
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/982
* fix test
2020-12-24 11:26:14 +02:00
Aliaksandr Valialkin
820669da69
lib/promscrape: code prettifying for 8dd03ecf19
2020-12-24 10:56:10 +02:00
Nikolay
8dd03ecf19
adds proxy_url support, ( #980 )
...
* adds proxy_url support,
adds proxy_url to the dockerswarm, eureka, kubernetes and consul service discovery,
adds proxy_url to the scrape_config for targets scrapping,
http based proxy is supported atm,
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/503
* fixes imports
2020-12-24 10:52:37 +02:00
Aliaksandr Valialkin
9e4ed5e591
lib/storage: do not remove parts outside the configured retention if they are currently merged
...
These parts are automatically removed after the merge is complete.
2020-12-24 08:51:28 +02:00
Aliaksandr Valialkin
46dba00756
lib/storage: remove stale parts as soon as they go outside the configured retention
...
Previously such parts could remain undeleted for long durations until they are merged with other parts.
This should help for `-retentionPeriod` values smaller than one month.
2020-12-22 19:54:31 +02:00
Aliaksandr Valialkin
d65c03c004
lib/storage: properly determine max rows for output part when merging small parts
2020-12-18 23:14:38 +02:00
Aliaksandr Valialkin
ebf8da3730
lib/{storage,mergeset}: tune background merge process in order to reduce CPU usage and disk IO usage
2020-12-18 20:01:08 +02:00
Aliaksandr Valialkin
2dfa746c91
lib/promscrape: remove ID
field from ScrapeWork
struct. Use a pointer to ScrapeWork as a key in targetStatusMap
...
This simplifies the code a bit.
2020-12-17 14:32:56 +02:00
Aliaksandr Valialkin
9abb2d6c74
lib/protoparser/prometheus: follow-up commit after 7d38627b9f6f212ae602aea6a72f469fe3c70ba2
...
Document the bugfix in docs/CHANGELOG.md and add a test for the bugfix.
2020-12-16 23:40:17 +02:00