Aliaksandr Valialkin
394a345ae0
lib/httpserver: expose /-/healthy
and /-/ready
endpoints as Prometheus does
...
This improves integration with third-party solutions, which rely on these endpoints.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1833
2021-12-02 14:36:58 +02:00
Aliaksandr Valialkin
90c542af12
app: use relative paths instead of absolute paths for the supported http handlers on the main page
...
This allows hiding VictoriaMetrics components behind proxies, which serve pages at different path prefixes
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1858
2021-12-02 13:52:39 +02:00
Aliaksandr Valialkin
03f5ad3060
lib/protoparser/graphite: allow multiple separators between metric name, value and timestamp
2021-12-02 13:43:49 +02:00
Aliaksandr Valialkin
49a18b8660
lib/protoparser/graphite: properly parse Graphite line with whitespace after the timestamp
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1865
2021-12-02 13:33:26 +02:00
Aliaksandr Valialkin
c0cbf0de2a
app/{vmbackup,vmrestore}: export internal metrics at /metrics
http handler
2021-12-02 11:55:58 +02:00
Aliaksandr Valialkin
7275ebf91a
app/vmstorage: export vm_cache_size_max_bytes metrics for determining capacity of various caches
...
The vm_cache_size_max_bytes metric can be used for determining caches which reach their capacity via the following query:
vm_cache_size_bytes / vm_cache_size_max_bytes > 0.9
2021-12-02 10:30:43 +02:00
Aliaksandr Valialkin
2f63dec2e3
lib/fs: add vm_filestream_read_duration_seconds_total
and vm_filestream_write_duration_seconds_total
metrics
...
These metrics help determining persistent disk saturation with `rate(vm_filestream_read_duration_seconds_total) > 0.9`
2021-12-02 10:30:42 +02:00
Aliaksandr Valialkin
2fb5a6ca78
lib/storage: do not take into account -storage.minFreeDiskSpaceBytes during background merges
2021-12-01 11:02:36 +02:00
Nikolay
06eff5a72c
removes FileSize from backup part key ( #1872 )
...
* removes FileSize from backup part key
it should fix download restoration for backups
* Update lib/backup/common/part.go
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-01 11:01:28 +02:00
Aliaksandr Valialkin
d666755159
lib/storage: take into account -storage.minFreeDiskSpaceBytes
when performing big merges
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
2021-11-30 12:56:35 +02:00
guidao
f05cddd2fc
fix #1830 ( #1861 )
...
Co-authored-by: wangfeng <wangfeng@zhihu.com>
2021-11-30 01:12:24 +02:00
Aliaksandr Valialkin
ba927d1c77
lib/protoparser/prometheus: follow-up for 8e338632a3
...
Do not spend CPU time on error message formatting if error logger is disabled
2021-11-30 00:50:11 +02:00
Nikolay
8e338632a3
Changes unmarshallRow logger to noop for getRowsDiff ( #1835 )
2021-11-30 00:48:13 +02:00
Aliaksandr Valialkin
d44c585ca4
lib/protoparser: do not log connection reset by peer
error when reading the data via InfluxDB, Graphite and OpenTSDB protocols over plain TCP connections
...
This error is expected, so there is no need in spamming the log with this error.
2021-11-29 21:47:56 +02:00
Aliaksandr Valialkin
b688960db0
lib/persistentqueue: add vm_persistentqueue_read_duration_seconds_total and vm_persistentqueue_write_duration_seconds_total metrics for determining disk usage saturation at vmagent
2021-11-17 16:41:35 +02:00
Lan
b72eed1f5e
Add flag of S3ForcePathStyle ( #1802 )
2021-11-17 01:03:03 +02:00
Aliaksandr Valialkin
e5ac9d8e57
all: consistently return application/json
content-type without charset=utf-8
...
The `application/json` content-type has utf-8 encoding by default.
See https://stackoverflow.com/questions/9254891/what-does-content-type-application-json-charset-utf-8-really-mean
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897
2021-11-09 18:04:44 +02:00
Aliaksandr Valialkin
fd596945e7
lib/promscrape: improve logging for scrape_config_files
parse errors
...
Log the actual file path, which led to the parse error.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1789
2021-11-08 13:34:12 +02:00
Aliaksandr Valialkin
cbfc7b7c92
app/{vminsert,vmagent}: hide passwords and auth tokens by default at /config
page
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1764
2021-11-05 14:41:16 +02:00
Aliaksandr Valialkin
e73a82f7a5
lib/promauth: do not show empty values in oauth2
config section at /config
page
2021-11-05 12:53:39 +02:00
Aliaksandr Valialkin
aa534c2582
lib/promscrape: add -promscrape.maxResponseHeadersSize
command-line flag for tuning the maximum http response headers size from Prometheus scrape targets
2021-11-03 22:26:56 +02:00
Aliaksandr Valialkin
d1eb87c831
app/{vmagent,vminsert}: add ability to restrict access to /config page with authKey query arg
...
The authKey can be configured via `-configAuthKey` command-line flag.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1764
2021-11-01 16:44:54 +02:00
Aliaksandr Valialkin
bb87949d5c
lib/protoparser/influx: automatically detect timestamp precision depending on the number of decimal digits in the timestamp
2021-10-28 12:47:22 +03:00
Aliaksandr Valialkin
d0e7c0535e
lib/logger: show only explicitly set command-line flags in logs
...
This reduces initial verbosity in logs
2021-10-28 11:00:52 +03:00
Aliaksandr Valialkin
74b8af9891
lib/promscrape: add collapse
and expand
buttons per each group of targets from the same scrape job
2021-10-27 20:03:24 +03:00
Aliaksandr Valialkin
6608705652
app/{vmalert,vmagent}: improve the distribution of scrape offsets among targets / rules
...
Previously only the lower part of 64-bit hash was used for calculating the offset.
This may give uneven distribution in some cases. So let's use all the available 64 bits from the hash
for calculating the offset.
2021-10-27 19:59:16 +03:00
Aliaksandr Valialkin
e3a91b186a
lib/protoparser/prometheus: optimize GetRowsDiff() function
...
This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1745 ,
since the provided profile shows that the majority of CPU and memory is spent in this function
during `streamParse` when `-promscrape.noStaleMarkers` wasn't set.
2021-10-27 18:54:45 +03:00
Aliaksandr Valialkin
95d44157fc
lib/protoparser/prometheus: add a benchmark for GetRowsDiff
2021-10-27 18:53:54 +03:00
Aliaksandr Valialkin
1952ab99aa
all: fix build issues and tests for Apple M1
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1653
2021-10-27 15:06:34 +03:00
Aliaksandr Valialkin
4821adfd95
lib/promscrape: properly show proxy_url
option value at /config
page
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1755
2021-10-26 21:23:54 +03:00
Aliaksandr Valialkin
7fa15f7f86
lib/promscrape: do not populate response body to memory in stream parsing mode if -promscrape.noStaleMarkers is set
...
The response body isn't used if -promscrape.noStaleMarkers is set after the commit 2876137c92
,
so there is no sense in pupulating it in memory. This should reduce memory usage when scraping big responses.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1728#issuecomment-949630694
2021-10-22 16:44:44 +03:00
Aliaksandr Valialkin
6106d4069d
lib/promscrape: do not sort original labels and do not intern label string for the original labels before the sharding code is executed
...
This should reduce CPU and memory usage in shard mode when service discovery finds big number of scrape targets with many long labels.
See https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets
This is a follow-up after 9882cda8b9
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1728
2021-10-22 13:54:30 +03:00
Aliaksandr Valialkin
2876137c92
lib/promscrape: reduce memory usage if -promscrape.noStaleMarkers
command-line flag is passed
...
Do not store in memory the response from the last scrape per each target if -promscrape.noStaleMarkers option is enabled.
This should reduce memory usage when the scraped targets return large responses.
2021-10-22 13:10:29 +03:00
Nikolay
a3684fe3de
adds tab as second separator for graphite text protocol ( #1733 )
...
* adds tab as second separator for graphite text protocol
* changes indexFunc for indexAny
* Update lib/protoparser/graphite/parser_test.go
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-10-22 12:23:45 +03:00
Aliaksandr Valialkin
8991c8b589
lib/flagutil: do not expose sensitive info (passwords, keys and urls) at /flags page
2021-10-20 00:51:26 +03:00
Aliaksandr Valialkin
8ad95f0db7
lib/httpserver: expose command-line flags at /flags
page
...
This should simplify debugging.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1695
2021-10-20 00:45:09 +03:00
Aliaksandr Valialkin
676ad70d9f
lib/envflag: use flag.Set for setting the flags from env vars
...
This should make visible the set flags at flag.Visit(), which is used later for logging
and exporting the `is_set` label for these flags at /metrics page
2021-10-20 00:41:08 +03:00
Aliaksandr Valialkin
53bb58ed2a
lib/storage: log a warning when the -storageDataPath has less than -storage.minFreeDiskSpaceBytes
...
This should improve the debuggability of the readonly feature.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1727
2021-10-19 23:59:13 +03:00
Aliaksandr Valialkin
3408a05d12
lib/promscrape/discovery/kubernetes: log a warning if role: endpoints
discovers more than 1000 targets per a single endpoint
...
In this case `role: endpointslice` must be used instead.
See the following references:
* https://kubernetes.io/docs/reference/labels-annotations-taints/#endpoints-kubernetes-io-over-capacity
* https://github.com/kubernetes/kubernetes/pull/99975
* https://github.com/prometheus/prometheus/issues/7572#issuecomment-934779398
2021-10-19 13:20:40 +03:00
Nikolay
cbcc622786
changes job source for /target api ( #1723 )
...
use jobNameOriginal instead of relabeled as prometheus does
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1707
2021-10-19 08:49:36 +03:00
Aliaksandr Valialkin
c37f285466
lib/promscrape: set honor_timestamps: true
by default if this option isnt set explicitly in scrape configs
...
This aligns the behavior to Prometheus - see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
2021-10-16 20:49:08 +03:00
Aliaksandr Valialkin
c055bc478c
lib/promscrape: expose promscrape_series_limit_max_series
and promscrape_series_limit_current_series
metrics per each scrape target with the enabled unique series limiter
2021-10-16 18:47:13 +03:00
Aliaksandr Valialkin
06b0982d6b
lib/promscrape: always initialize http client for stream parsing mode
...
Stream parsing mode can be automatically enabled when scraping targets with big response bodies
exceeding the -promscrape.minResponseSizeForStreamParse , so it must be always initialized.
2021-10-16 13:18:23 +03:00
Aliaksandr Valialkin
32793adbd9
lib/promscrape: store the last scraped response in compressed form if its size exceeds -promscrape.minResponseSizeForStreamParse
...
This should reduce memory usage when scraping targets with big response bodies.
2021-10-16 13:00:30 +03:00
Aliaksandr Valialkin
9866dd95c1
lib/promscrape: store the full response in stream parsing mode in scrapeWork.lastScrape byte slice
...
This allows sending staleness marks and properly calculate scrape_series_added metric in stream parsing mode
at the cost of the increased memory usage, since now the potentially big response is kept
in the lastScrape byte slice per each scrapeWork.
In practice the memory usage increase shouldn't be big, since the response size
is usually much smaller than the parsed metrics from this response after the relabeling,
which usually adds a big pile of target-specific labels per each metric.
2021-10-15 15:39:23 +03:00
Aliaksandr Valialkin
f6d33596ff
lib/promscrape/discovery/kubernetes: rename endpointslices.go -> endpointslice.go in order to be consistent with EndpointSlice struct name
...
This is a follow-up for 31b42b30b6
2021-10-15 12:27:12 +03:00
Aliaksandr Valialkin
bbd34fa15e
lib/promscrape: add -promscrape.minResponseSizeForStreamParse
command-line option for automatic switching to stream parsing mode when scraping targets with big responses
...
This should reduce memory usage when vmagent scrapes targets with non-uniform response sizes.
This is common case in Kubernetes monitoring.
2021-10-14 12:29:35 +03:00
Aliaksandr Valialkin
1a7287c408
lib/promscrape: return error if sample_limit
or series_limit
options are set when stream parsing mode is enabled
2021-10-14 12:11:23 +03:00
Aliaksandr Valialkin
e3c8304deb
lib/promscrape: add ability to show the original labels for discovered targets at /targets page
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1698
2021-10-13 15:59:58 +03:00
Roman Khavronenko
c0a932a55f
lib/promscrape: make errcheck happy ( #1703 )
2021-10-13 14:57:30 +03:00
Aliaksandr Valialkin
9882cda8b9
lib/promscrape: shard targets among cluster nodes after relabeling is applied
...
This guarantees that targets with the same set of labels go to the same vmagent node.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1687#issuecomment-940629495
2021-10-12 17:06:00 +03:00
Aliaksandr Valialkin
5a58c041c2
app/vmagent: expose -promscrape.config contents at /config page as Prometheus does
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1695
2021-10-12 16:25:37 +03:00
Aliaksandr Valialkin
873aac584e
lib/promscrape: use Prometheus format for target labels at /targets
page
...
This should simplify copy-pasting the labels to/from PromQL / MetricsQL
2021-10-11 12:41:37 +03:00
Aliaksandr Valialkin
001750c239
lib/storage: fix unaligned access on 32-bit architectures.
...
The bug has been introduced at a171916ef5
2021-10-08 19:43:03 +03:00
Aliaksandr Valialkin
cf5cbd1c70
app/{vminsert,vmstorage}: follow-up after a171916ef5
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
2021-10-08 14:35:49 +03:00
Nikolay
4290b46e8c
Adds read-only mode for vmstorage node ( #1680 )
...
* adds read-only mode for vmstorage
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
* changes order a bit
* moves isFreeDiskLimitReached var to storage struct
renames functions to be consistent
change protoparser api - with optional storage limit check for given openned storage
* renames freeSpaceLimit to ReadOnly
2021-10-08 14:35:48 +03:00
Ziqi Zhao
402c995d6d
fix some typos ( #1678 )
...
Co-authored-by: 柘远 <zzq237937@alibaba-inc.com>
2021-10-06 14:43:10 +03:00
Aliaksandr Valialkin
6ee66fb6b1
lib/promscrape: reduce memory allocations in mergeLabels() after 48e3e6c8df
2021-09-30 16:56:12 +03:00
Aliaksandr Valialkin
463a5bf76e
lib/protoparser: go fmt
2021-09-29 21:19:00 +03:00
Aliaksandr Valialkin
58964d52a5
lib/protoparser/prometheus: compare invalid Prometheus lines in full
2021-09-29 19:41:28 +03:00
Aliaksandr Valialkin
d80d72efec
app/{vmbackup,vmrestore}: switch from gcs://...
to gs://...
urls for backups to GCS
...
The `gs://` urls are commonly used, so prefer them instead of `gcs://` urls,
while leaving support for `gcs://` urls for backwards compatibility.
2021-09-29 12:10:29 +03:00
Nikolay
3d17112a7e
changes auth validation for openstack ( #1663 )
...
* changes auth validation for openstack
must fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1655
* Apply suggestions from code review
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-09-29 00:28:49 +03:00
Aliaksandr Valialkin
91b3c601bc
app/{vminsert,vmagent}: add ability to ingest data via DataDog "submit metrics" API
...
See https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/206
2021-09-29 00:13:08 +03:00
Aliaksandr Valialkin
718eca33ab
lib/storage: properly handle {__name__=~"prefix(suffix1|suffix2)",other_label="..."}
queries
...
They were broken in the commit 00cbb099b6
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1644
2021-09-23 21:48:51 +03:00
Aliaksandr Valialkin
a0313c046b
lib/promscrape: add vm_promscrape_max_scrape_size_exceeded_errors_total
metric for counting of the failed scrapes due to the exceeded response size
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1639
2021-09-23 14:47:54 +03:00
Aliaksandr Valialkin
9ca1cbced1
lib/httpserver: add -enterprise
and/or -cluster
suffixes to short_version
label of vm_app_version
metric
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1635
2021-09-21 23:12:42 +03:00
Aliaksandr Valialkin
207c5760ce
lib/promrelabel: fix parsing regex: true
in relabeling rules
2021-09-21 23:00:53 +03:00
Nikolay
ad08d9dfc0
changes protoparser apis for accepting reading from io.Reader ( #1624 )
...
adds InsertHandlerForReader apis to vmagent
2021-09-20 14:49:28 +03:00
Nikolay
0e09fdb8b0
makes filters optional for ec2 api requests ( #1627 )
...
filters can be applied only for DescribeInstances requests, like prometheus does.
related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1626
2021-09-17 18:00:37 +03:00
Aliaksandr Valialkin
8f685d81c6
lib/storage: follow up after 00cbb099b6
2021-09-14 14:16:25 +03:00
faceair
00cbb099b6
lib/storage: optimize convert multiple values regexp filter to composite tag filter ( #1610 )
...
* lib/storage: optimize convert multiple values regexp filter to composite tag filter
* Apply suggestions from code review
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-09-14 12:47:07 +03:00
Aliaksandr Valialkin
7f0a8d4bdb
docs: consistency renaming: Influx -> InfluxDB
2021-09-13 17:05:16 +03:00
Aliaksandr Valialkin
fb6ed0ce19
lib/promscrape/discovery/docker: support host networking mode
...
See https://github.com/prometheus/prometheus/issues/9116
2021-09-13 13:30:16 +03:00
Aliaksandr Valialkin
6295861acd
lib/promscrape/discovery/kubernetes: properly use https scheme for wildcard TLS certificates in ingress target discovery
2021-09-13 13:03:42 +03:00
Aliaksandr Valialkin
728c4c3841
lib/promscrape: generate scrape_timeout_seconds
metric per each scrape target in the same way as Prometheus 2.30 does
...
See https://github.com/prometheus/prometheus/pull/9247
2021-09-12 15:20:44 +03:00
Aliaksandr Valialkin
0b4eb0fa7d
lib/promscrape: make fmt
2021-09-12 13:34:15 +03:00
Aliaksandr Valialkin
48e3e6c8df
lib/promscrape: add ability to configure scrape_timeout and scrape_interval via relabeling
...
See https://github.com/prometheus/prometheus/pull/8911
2021-09-12 13:33:41 +03:00
Aliaksandr Valialkin
f3e89754a9
lib/promscrape: reduce CPU usage for common case when calculating scrape_series_added
metric
...
Also reduce CPU usage when applying `series_limit` to scrape targets with constant set of metrics.
The main idea is to perform the calculations on scrape_series_added and series_limit
only if the set of metrics exposed by the target has been changed.
Scrape targets rarely change the set of exposed metrics,
so this optimization should reduce CPU usage in general case.
2021-09-12 12:53:14 +03:00
Aliaksandr Valialkin
cebcb15ba4
lib/storage: verify that the tsidsFound contain the needed tsids in tests added at f4dead529f
2021-09-11 10:57:13 +03:00
Aliaksandr Valialkin
9286107e82
lib/promscrape: send stale markers for disappeared metrics like Prometheus does
2021-09-11 10:51:04 +03:00
Aliaksandr Valialkin
f4dead529f
lib/storage: properly search series by multiple tag filters matching empty labels such as foo{bar=~"baz|",x=~"y|"}
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1601
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/395
2021-09-09 21:09:21 +03:00
Aliaksandr Valialkin
4aeb8db83f
lib/promscrape: add ability to set series_limit
and stream_parse
options via relabeling
...
This allows managing these options on a per-target basis.
Typical use case: to manage these options for pods via Kubernetes annotations.
2021-09-09 18:49:39 +03:00
Aliaksandr Valialkin
468f941f7e
lib/promscrape: add the actual job name to the labels of promscrape_series_limit_rows_dropped_total metric
2021-09-09 17:37:37 +03:00
Aliaksandr Valialkin
086b5d0cf1
lib/promscrape: add scrape_
prefix to job
and target
labels exported by promscrape_series_limit_rows_dropped_total
metric
...
This is needed in order to prevent from possible clash with the corresponding (job, target) labels for the job, which scrapes this metric.
2021-09-09 17:29:21 +03:00
Aliaksandr Valialkin
d6bd956930
lib/promrelabel: add keep_metrics
and drop_metrics
actions to relabeling rules
...
These actions simlify metrics filtering. For example,
- action: keep_metrics
regex: 'foo|bar|baz'
would leave only metrics with `foo`, `bar` and `baz` names, while the rest of metrics will be deleted.
The commit also makes possible to split long regexps into multiple lines. For example, the following config is equivalent to the config above:
- action: keep_metrics
regex:
- foo
- bar
- baz
2021-09-09 16:18:21 +03:00
Aliaksandr Valialkin
f77dde837a
lib/promscrape: add the ability to limit the number of unique series per each scrape target
...
The number of series per target can be limited with the following options:
* Global limit with `-promscrape.maxSeriesPerTarget` command-line option.
* Per-target limit with `max_series: N` option in `scrape_config` section.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1561
2021-09-01 16:03:59 +03:00
Aliaksandr Valialkin
5c63d69454
lib/promscrape/discovery/kubernetes: return back support role: endpointslices
, since it is used by VictoriaMetrics operator
...
This is a follow up commit after 31b42b30b6
2021-08-29 12:37:03 +03:00
Aliaksandr Valialkin
db330232ac
lib/protoparser/opentsdb: follow-up after 8ee75ca45a
2021-08-29 11:49:21 +03:00
envzhu
8ee75ca45a
lib/protoparser/opentsdb: accept multiple spaces between fields in a row as a deliminator. ( #1575 )
2021-08-29 11:38:32 +03:00
Aliaksandr Valialkin
31b42b30b6
lib/promscrape/discovery/kubernetes: rename role: endpointslices
to role: endpointslice
to be consistent with Prometheus
...
See 2ec6c7dbb8/discovery/kubernetes/kubernetes.go (L99)
2021-08-29 11:23:08 +03:00
Aliaksandr Valialkin
2e001db4de
lib/promscrape/discovery/kubernetes: use v1 API instead of v1beta1 API for role: ingress
and role: endpointslices
...
This should fix service discovery for these roles in Kubernetes v1.22 and newer versions.
See https://kubernetes.io/docs/reference/using-api/deprecation-guide/#ingress-v122
The corresponding change in Prometheus - https://github.com/prometheus/prometheus/pull/9205
2021-08-29 11:16:59 +03:00
Aliaksandr Valialkin
10f960fa0c
lib/promscrape: add ability to load scrape configs from multiple files
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1559
2021-08-26 08:51:16 +03:00
Aliaksandr Valialkin
c27ee35c5c
lib/promscrape: expose promscrape_discovery_http_errors_total metric for tracking errors per each http_sd config
2021-08-25 13:05:49 +03:00
Aliaksandr Valialkin
ffc0ab1774
lib/{mergeset,storage}: improve the detection of the needed free space for background merge
...
This should prevent from possible out of disk space crashes during big merges.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1560
2021-08-25 09:35:44 +03:00
Aliaksandr Valialkin
d5622b32e2
lib/promscrape: reduce memory and CPU usage when Prometheus staleness tracking is enabled for metrics from deleted / disappeared scrape targets
...
Store the scraped response body instead of storing the parsed and relabeld metrics.
This should reduce memory usage, since the response body takes less memory than the parsed and relabeled metrics.
This is especially true for Kubernetes service discovery, which adds many long labels for all the scraped metrics.
This should also reduce CPU usage, since the marshaling of the parsed
and relabeld metrics has been substituted by response body copying.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
2021-08-21 21:17:26 +03:00
Aliaksandr Valialkin
f46a73dcdd
lib/promscrape: use scrapeTimestamp when storing stale markers for failed scrape
...
This will make timestamps for stale markers more consistent for timestamps for other samples
2021-08-19 14:18:05 +03:00
Aliaksandr Valialkin
c09446a9aa
lib/promscrape: send stale markers for the previously scraped metrics on failed scrapes like Prometheus does
2021-08-18 21:59:03 +03:00
Aliaksandr Valialkin
cdc372bb98
app/vmselect: add -search.noStaleMarkers
command-line flag for disabling stale markers handling in queries
...
This option allows reducing CPU usage a bit when VictoriaMetrics is used
for collecting and processing non-Prometheus data. For example, InfluxDB line protocol, Graphite, OpenTSDB, CSV, etc.
2021-08-18 13:59:02 +03:00
Aliaksandr Valialkin
226143f31b
lib/promscrape: add ability to disable sending Prometheus staleness markers with -promscrape.disableStaleMarkers command-line flag
...
This option can be useful when vmagent consumes too much additional memory
for staleness markers functionality and when staleness markers aren't needed.
2021-08-18 13:43:21 +03:00
Aliaksandr Valialkin
03c959f1df
lib/promscrape: stop scrapers for the removed targets before starting scrapers for the added targets
...
This should prevent from possible time series overlap when old target is substituted by new target (for example, during Kubernetes deployments).
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1530
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1509
2021-08-17 00:55:51 +03:00
Aliaksandr Valialkin
a0e18f06eb
lib/promscrape: restore red highlighting for DOWN targets at /targets page
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1461
2021-08-15 16:03:57 +03:00
Aliaksandr Valialkin
4401464c22
all: add support for Prometheus staleness markers
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1509
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1530
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845
2021-08-13 12:10:17 +03:00
Aliaksandr Valialkin
d375d9b878
lib/envflag: add a link to docs for -envflag.enable
2021-08-11 10:29:33 +03:00
Aliaksandr Valialkin
d826352688
app/vmagent: follow-up after fe445f753b
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1491
2021-08-05 09:52:32 +03:00
Omar Ghader
46e27d60a6
feature: Add multitenant for vmagent ( #1505 )
...
* feature: Add multitenant for vmagent
* Minor fix
* Fix rcs index out of range
* Minor fix
* Fix multi Init
* Fix multi Init
* Fix multi Init
* Add default multi
* Adjust naming
* Add TenantInserted metrics
* Add TenantInserted metrics
* fix: remove unused metrics for vmagent
* fix: remove unused metrics for vmagent
Co-authored-by: mghader <marc.ghader@ubisoft.com>
Co-authored-by: Sebastian YEPES <syepes@gmail.com>
2021-08-05 09:52:31 +03:00
Aliaksandr Valialkin
50663ba41f
lib/promscrape/discovery/gce: add __meta_gce_interface_ipv4_<name> labels as in Prometheus 2.29
...
See https://github.com/prometheus/prometheus/pull/8978
2021-08-03 16:11:49 +03:00
Aliaksandr Valialkin
3cad8b4564
lib/promscrape/discovery/ec2: add __meta_ec2_availability_zone_id
label as Prometheus 2.29 does
2021-08-03 16:11:49 +03:00
Aliaksandr Valialkin
d05cac6c98
li/storage: re-use the per-day inverted index search code for searching in global index
...
This allows removing a big pile of outdated code for global index search.
This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1486
2021-07-30 10:31:37 +03:00
Aliaksandr Valialkin
8ee8660ac4
app/vmselect: follow-up for 626073bca8
...
* Rename -search.maxMetricsPointSearch to -search.maxSamplesPerQuery, so it is more consistent with the existing -search.maxSamplesPerSeries
* Move the -search.maxSamplesPerQuery from vmstorage to vmselect, so it could effectively limit the number of raw samples obtained from all the vmstorage nodes
* Document the -search.maxSamplesPerQuery in docs/CHANGELOG.md
2021-07-28 18:00:23 +03:00
Nikolay
9d45b46f4c
adds check for region with custom s3 endpoint ( #1465 )
2021-07-27 12:35:38 +03:00
Aliaksandr Valialkin
c2deee9911
lib/storage: yet another attempt to properly determine disk space shortage, which prevents from optimal merges
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-27 12:04:50 +03:00
Aliaksandr Valialkin
bb31117555
lib/promrelabel: add tests for verifying that regex works as expected in single quotes and double quotes
2021-07-27 10:50:55 +03:00
Aliaksandr Valialkin
8b7917cd81
all: add go:build
lines for Go1.17
...
See https://tip.golang.org/doc/go1.17#gofmt for more details
2021-07-26 15:48:21 +03:00
Aliaksandr Valialkin
1318736ad1
lib/promscrape: add missing whitespace at /targets page before up
word
2021-07-26 12:22:59 +03:00
Aliaksandr Valialkin
4ba3fd9e6d
lib/workingsetcache: switch from split cache to full cache after the cache size exceeds 95% of split capacity
...
Previously the switch occurred when the cache size becomes 100% of its capacity. The cache size could never reach 100% capacity.
This could prevent from switching from the split cache to full cache, thus reducing the cache effectiveness.
2021-07-15 16:12:04 +03:00
Aliaksandr Valialkin
d472b03e34
lib/storage: make sure the second call to DeduplicateSamples and deduplicateSamplesDuringMerge doesnt change samples
2021-07-15 12:17:45 +03:00
Aliaksandr Valialkin
682662b2ae
lib/storage: remove cache directory if it contains reset_cache_on_startup file
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1447
2021-07-13 17:58:51 +03:00
Aliaksandr Valialkin
2df66dad7b
lib/httpserver: add is_set
label to flag
metrics
...
This label allows determining the set flags with the query `flag{is_set="true"}`
2021-07-13 15:10:13 +03:00
Aliaksandr Valialkin
f9de546139
lib/storage: reset perKeyMisses stats less frequently
...
This should reduce CPU usage for queries executed with intervals higher than 30 seconds
2021-07-12 14:33:42 +03:00
Aliaksandr Valialkin
4f80b2f230
lib/storage: properly limit the size of storage/date_metricID
cache
2021-07-12 14:25:44 +03:00
Aliaksandr Valialkin
8ca2799478
lib/storage: properly determine when the deduplication is needed in needsDedup
...
Previously needsDedup() could return true if the de-duplication wasn't needed for the following case:
d < interval
/ \
| v | v |
interval interval
Now it properly returns false for this case
2021-07-12 10:53:30 +03:00
Aliaksandr Valialkin
6e0553c92e
lib/mergeset: cache indexBlock items only on the second request
...
This should reduce the indexdb/indexBlocks cache size, since it won't contain one-time-wonders items.
2021-07-07 15:23:06 +03:00
Aliaksandr Valialkin
766edbc421
lib/httpserver: print full requestURI in httpserver.Errorf
...
This should simplify debugging.
2021-07-07 13:09:40 +03:00
Aliaksandr Valialkin
e843bd7bd7
lib/storage: do not cache inmemoryBlock entries requested only once (aka one-time-wonder items)
...
This should reduce the cache size and memory usage for the indexdb/dataBlocks cache
2021-07-07 10:58:51 +03:00
Aliaksandr Valialkin
8b262d4ba7
lib/storage: periodically reset prefetchedMetricIDs cache in order to limit its size under high churn rate
2021-07-07 10:58:51 +03:00
Aliaksandr Valialkin
a7694092b8
Revert "lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method"
...
This reverts commit 7c6d3981bf
.
Reason for revert: high contention at bucket16Pool on systems with big number of CPU cores.
This slows down query processing significantly.
2021-07-06 18:21:35 +03:00
Aliaksandr Valialkin
8aa9bba9bd
lib/{mergeset,storage}: switch from sync.Pool to chan-based pool for inmemoryPart objects
...
This should reduce memory usage on systems with big number of CPU cores,
since every inmemoryPart object occupies at least 64KB of memory and sync.Pool maintains
a separate pool inmemoryPart objects per each CPU core.
Though the new scheme for the pool worsens per-cpu cache locality, this should be amortized
by big sizes of inmemoryPart objects.
2021-07-06 16:28:41 +03:00
Aliaksandr Valialkin
7c6d3981bf
lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method
...
This reduces the load on memory allocator in Go runtime in production workload.
2021-07-06 15:35:03 +03:00
Aliaksandr Valialkin
78c9174682
lib/mergeset: increase pool capacity for inmemoryBlock according to collected profiles from production workload
...
CPU and memory profiles show that the pool capacity for inmemoryBlock objects is too small.
This results in the increased load on memory allocation code in Go runtime.
Increase the pool capacity in order to reduce the load on Go runtime.
2021-07-06 13:41:34 +03:00
Aliaksandr Valialkin
f71e4d1853
lib/mergeset: limit the frequency for flushCallback calls to once per 10 seconds
...
This should improve hit ratio for tagFiltersCache when big number of new time series are constantly registered
(aka high churn rate). This, in turn, should reduce CPU usage for queries over such time series.
2021-07-06 12:17:17 +03:00
Aliaksandr Valialkin
f3acf065c9
lib/storage: consistency renaming: tagCache -> tagFiltersCache
...
This improves code readability
2021-07-06 11:03:51 +03:00
Aliaksandr Valialkin
0020b9f904
lib/workingsetcache: properly update stats for requests and cache misses
...
Previously the stats for cache misses could be improperly counted, because it had inflated cache misses
if the entry was missing in the curr cache, but was existing in the prev cache.
The same applies to cache requests - they were inflated if the entry was missing in the curr cache.
2021-07-06 10:53:32 +03:00
Aliaksandr Valialkin
4cf47163c1
lib/workingsetcache: fix cache capacity calculations after 4f0003f182
2021-07-05 17:11:57 +03:00
Aliaksandr Valialkin
4f0003f182
lib/workingsetcache: typo fixes after d0c830039d
2021-07-05 15:35:37 +03:00
Aliaksandr Valialkin
d0c830039d
lib/storage: tune cache sizes according to production workload
2021-07-05 15:16:11 +03:00
Aliaksandr Valialkin
8f973e34fb
lib/workingsetcache: properly switch to whole
mode
...
Previously the switch from `split` to `whole` mode had been performed too early,
e.g. when the current cache size became bigger than 1/4 of the allowed cache size.
Now it is performed when the current cache size becomes bigger than 1/2 of the allowed cache size.
This change can reduce memory usage for data ingestion path when big number of active time series are ingested.
2021-07-05 15:16:11 +03:00
Aliaksandr Valialkin
43103be011
lib/{storage,mergeset}: increase cache timeout for data and index blocks from a minute to two minutes
...
One minute cache timeout result in slower queries in some production workloads where the interval
between query execution is in the range 1 minute - 2 minutes.
2021-07-05 15:16:11 +03:00
Aliaksandr Valialkin
54b9e1d3cb
lib/cgroup: set GOGC to 50 by default if it isn't set
...
This should reduce memory usage for typical VictoriaMetrics workloads by up to 50%
2021-07-05 15:16:11 +03:00
Aliaksandr Valialkin
9a83e9018d
lib/storage: properly detect free disk space shortage during data merge
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-02 17:40:54 +03:00
Aliaksandr Valialkin
7088f17494
lib/promscrape/discovery/consul: use case-insensitive comparison for service names
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1424
2021-07-02 14:49:27 +03:00
Aliaksandr Valialkin
6e406083f2
lib/promauth: cache the client TLS certificate for up to a second
...
This should reduce CPU usage when TLS connections are established at a high rate.
2021-07-02 13:21:51 +03:00
Aliaksandr Valialkin
158c50c0ee
lib/promauth: reload TLS certificates from disk on every mTLS connection as Prometheus does
...
This allows updating client certificates without the need to restart vmagent and/or single-node VictoriaMetrics.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1420
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/470
2021-07-01 15:27:47 +03:00
Aliaksandr Valialkin
c25b839078
lib/workingsetcache: reset the cache mode when the cache is reset
...
This should reduce memory usage if the working set is reduced after the cache reset.
2021-07-01 11:50:11 +03:00
Nikolay
ae485c2bfd
fixes /targets button style ( #1423 )
...
* fixes /targets button style
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1422
* updates boostrap version
2021-07-01 11:48:07 +03:00
Aliaksandr Valialkin
c93cee8de8
lib/{mergeset,storage}: reduce the maximum lifetime for cached indexdb and data blocks from 2 minutes to a minute
...
This should reduce memory usage on a system with high number of active time series and a high churn rate.
One minute is enough for caching the blocks needed for repeated queries (e.g. alerting rules, recording rules and dashboard refreshes).
2021-06-29 19:57:07 +03:00
Aliaksandr Valialkin
fc12484734
lib/mergeset: switch from sync.Pool to a channel for a pool for inmemoryBlock structs
...
This should reduce memory usage for the pool on systems with big number of CPU cores.
The sync.Pool maintains per-CPU pools, so the total number of objects in the pool
is proportional to the number of available CPU cores. The channel limits the number
of pooled objects by its own capacity. This means smaller number of pooled objects on average.
2021-06-29 19:56:59 +03:00
Aliaksandr Valialkin
9ce211a514
lib/promscrape/discovery/docker: fix golint warning: struct field Id should be ID
2021-06-29 13:12:28 +03:00
Aliaksandr Valialkin
5506cff76e
lib/storage: put indexDBName into the key for dateTagFilter cache and for uselessTagFilters cache
...
This should prevent from stats overwriting when the previous indexdb is queried.
2021-06-29 12:40:05 +03:00
Aliaksandr Valialkin
1b0501a09e
lib/promscrape: typo fix in /targets
output
...
The typo has been introduced in fb72a2133f
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1408
2021-06-28 21:26:37 +03:00
Aliaksandr Valialkin
cb5453953f
lib/promscrape: split docker and dockerswarm service discovery code bases, since they have very little in common
...
This is a follow up after c85a5b7fcb
2021-06-25 13:20:20 +03:00