Commit graph

2810 commits

Author SHA1 Message Date
Roman Khavronenko
b1e49bab52
Dashboards update (#1153)
* dashboard: update single node dashboard

* add number of new series created over last 24h;
* bump version requirements.

* dashboard: update vmagent dashboard

* add panel for open file descriptors;
* add panel for disk I/O;
* add panel for `vmagent_remotewrite_packets_dropped_total` metric;
* bump version requirements.
2021-03-29 12:37:17 +03:00
Aliaksandr Valialkin
b75c2ce659 lib/uint64set: improve Set.Has() performance scalability on multi-CPU system
Do not update bucket32.hint on Set.Has() call, since it leads to memory ping-pong between CPU cores multi-CPU system
2021-03-29 12:33:47 +03:00
Aliaksandr Valialkin
2601cc0fb0 lib/storage: do not update b.nextIdx if no samples are removed because of retention 2021-03-29 12:00:21 +03:00
hagen1778
0c403bfd29 deployment: fix typo in vmalert docker-compose definition 2021-03-29 11:06:32 +03:00
Aliaksandr Valialkin
78188decf9 docs: document that vmagent drops data blocks when remote storage replies with 400 and 409 http status codes
This is a follow up for 1b7dc1e5a5.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1149
2021-03-26 14:44:06 +02:00
Aliaksandr Valialkin
c54cb3e63c app/vmagent/remotewrite: remove superflouos code after 1b7dc1e5a5 2021-03-26 13:59:46 +02:00
Aliaksandr Valialkin
8fc8ef1aba vendor: update github.com/klauspost/compress from v1.11.12 to v1.11.13 2021-03-26 13:57:01 +02:00
Nikolay
1b7dc1e5a5
Adds blocks drop (#1151)
* adds blocks drop at 400 BadRequest status code
recieved from remote storage,
not expected that remote storage will be able to handle it on retry

* removes error logging for dropped blocks,
its expected error
2021-03-26 14:17:59 +03:00
Aliaksandr Valialkin
f39c84b21f lib/promscrape/discovery/kubernetes: typo fix in error message 2021-03-26 12:46:14 +02:00
Aliaksandr Valialkin
9761ffd161 lib/promscrape/discovery/kubernetes: properly handle too old resource version error message from Kubernetes watch API 2021-03-26 12:28:10 +02:00
Aliaksandr Valialkin
aa81039b42 app/vmselect: log the metric which trigger rollup result cache reset
This should help finding the source of stale metrics
2021-03-25 21:31:39 +02:00
Aliaksandr Valialkin
50f790b5d7 docs/Cluster-VictoriaMetrics.md: mention that vmselect doesnt serve partial responses from export API
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1148
2021-03-25 21:04:13 +02:00
Aliaksandr Valialkin
136fc6217c vendor: make vendor-update 2021-03-25 17:56:10 +02:00
Aliaksandr Valialkin
5ec9e49103 docs/vmagent.md: add an example for -remoteWrite.label 2021-03-25 17:54:55 +02:00
Aliaksandr Valialkin
88f6286df7 docs/Cluster-VictoriaMetrics.md: sync with upstream 2021-03-25 17:18:18 +02:00
Aliaksandr Valialkin
f6529f932a docs: add a link to the repository from build instruction for all the VictoriaMetrics components 2021-03-25 17:14:42 +02:00
Aliaksandr Valialkin
2065d11300 docs/vmagent.md: cosmetic fixes 2021-03-25 17:11:19 +02:00
Aliaksandr Valialkin
35fb9bdee1 docs/vmagent.md: cosmetic fixes 2021-03-25 16:54:03 +02:00
Aliaksandr Valialkin
a647144616 docs/vmagent.md: typo fix: tupically -> typically 2021-03-25 16:48:45 +02:00
Aliaksandr Valialkin
c3c3e51f17 docs/vmalert.md: remove misleading -evaluationInterval=3s from example config args
3s evaluation interval is too small for practical setups. It can result in increased load on datasource.
So it is better to remove it from example config args, which are usually copy-pasted by novice users.
2021-03-25 15:29:06 +02:00
Aliaksandr Valialkin
0b2a66db30 app/vmselect/promql: do not merge time series during requests to /api/v1/query
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1141
2021-03-25 13:56:07 +02:00
Aliaksandr Valialkin
6e855d4b82 lib/storage: tune loopsCountPerMetricNameMatch according to production workload 2021-03-25 13:27:47 +02:00
Aliaksandr Valialkin
d4aadba9fa app/vmagent: add -promscrape.consul.waitTime command-line flag for configuring Consul service discovery wait time
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1144
2021-03-23 19:33:25 +02:00
Aliaksandr Valialkin
edc0a94a3c docs/CHANGELOG.md: mention the feature from 44a6cc5eca 2021-03-23 19:00:18 +02:00
Aliaksandr Valialkin
3a3d2165f9 lib/storage: do not reload metricName for the same metricID in Search.NextMetricBlock
This should speed up Search.NextMetricBlock a bit
2021-03-23 17:56:49 +02:00
Aliaksandr Valialkin
9c566f7db9 app/vmagent: mention -remoteWrite.maxDiskUsagePerURL in the descriptio of -remoteWrite.tmpDataPath flag 2021-03-23 16:38:48 +02:00
Nikolay
29f9ef9b7f
changes consul_service label value (#1143)
according to prometheus discovery.
 It should mitigate issue with case sensetive services
https://github.com/hashicorp/consul/issues/5707
2021-03-23 15:35:01 +02:00
Aliaksandr Valialkin
331a6a2015 app/vmselect/graphite: accept and enforce extra_label in all the Graphite APIs 2021-03-23 15:29:16 +02:00
Aliaksandr Valialkin
b521d1d4f2 app/vmselect: move getEnforcedTagFiltersFromRequest to searchtuils, since it will be used in Graphite functions soon 2021-03-23 14:16:29 +02:00
Aliaksandr Valialkin
3cfb3a3683 lib/storage: respect the deadline passed to Storage.SearchMetricNames 2021-03-22 23:03:17 +02:00
Aliaksandr Valialkin
8e2afdf568 lib/storage: improve Search.NextMetricBlock performance by using MetricID->MetricName cache 2021-03-22 22:49:18 +02:00
Aliaksandr Valialkin
e17eb35147 docs/Single-server-VictoriaMetrics.md: sync with README.md 2021-03-22 17:51:43 +02:00
Aliaksandr Valialkin
65a61ff118 docs/Articles.md: add https://blog.cybozu.io/entry/2021/03/18/115743 2021-03-22 17:50:52 +02:00
Aliaksandr Valialkin
71b72304ae app/vmselect: improve description for -search.maxPointsPerTimeseries command-line flag 2021-03-22 16:45:34 +02:00
Aliaksandr Valialkin
44a6cc5eca app/{vminsert,vmagent}: use Influx field as metric name if measurement is empty and -influxSkipSingleField command-line is set
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1139
2021-03-22 13:53:53 +02:00
Aliaksandr Valialkin
910092ca4d lib/storage: tune loopsCountPerMetricNameMatch 2021-03-22 12:53:17 +02:00
Aliaksandr Valialkin
cef010d5f7 app/vmselect/promql: increment key prefix for faster reset for rollup result cache 2021-03-22 11:59:07 +02:00
Aliaksandr Valialkin
648d11b8e0 vendor: update github.com/VictoriaMetrics/metrics from v1.17.0 to v1.17.1 2021-03-18 18:53:07 +02:00
Aliaksandr Valialkin
fb83e97170 app/victoria-metrics: use flag.Parse instead of envflag.Parse for avoiding possible side effects of envflag 2021-03-18 18:20:22 +02:00
Aliaksandr Valialkin
b0c956a178 app/vmselect/graphite: follow-up after 529d7be26b 2021-03-18 16:30:20 +02:00
Nikolay
529d7be26b
changes metricsFind api (#1137)
it should be able mitigate crash if label value contains *,[ or { symbols
2021-03-18 16:12:02 +02:00
Aliaksandr Valialkin
726f6ad804 lib/storage: small code simplification after 6cee5338b2 2021-03-18 15:21:13 +02:00
Aliaksandr Valialkin
6cee5338b2 lib/storage: prevent from infinite loop if {__graphite__="..."} filter matches a metric name with *, [ or { chars
The idea has been borrowed from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1137
2021-03-18 14:53:47 +02:00
Aliaksandr Valialkin
e061a4fa19 docs/Single-server-VictoriaMetrics.md: remove outdated message about experimental mode for -pure builds 2021-03-18 13:49:17 +02:00
Aliaksandr Valialkin
4fb049bcba lib/fs: reduce the frequency of failed to remove directory ... due to NFS lock log warnings
Log `failed to remove directory ... due to NFS lock` warning only if the directory cannot be removed in one second.
2021-03-18 13:24:46 +02:00
Aliaksandr Valialkin
17d4a6e900 vendor: update github.com/VictoriaMetrics/metrics from v1.16.0 to v1.17.0 2021-03-17 23:23:04 +02:00
Aliaksandr Valialkin
904dababcc vendor: update github.com/VictoriaMetrics/metrics from v1.15.3 to v1.16.0
This adds the following new metrics for each VictoriaMetrics app:

* process_resident_memory_anonymous_bytes - the RSS share for memory allocated by the process itself.
  This share cannot be freed by the OS, so it must be taken into account by OOM killer.

* process_resident_memory_pagecache_bytes - the RSS share for page cache memory (aka memory-mapped files).
  This share can be freed by the OS at any time, so it must be ignored by OOM killer.
2021-03-17 17:59:40 +02:00
Aliaksandr Valialkin
45dabfac1b lib/storage: faster move heavy filters to the end of list 2021-03-17 15:12:13 +02:00
Aliaksandr Valialkin
b1713e3fcd app/vmselect/promql: typo fix after 9666834045 2021-03-17 15:12:11 +02:00
Aliaksandr Valialkin
9666834045 app/vmselect/promql: merge adjancent buckets with the smallest summary number of hits in buckets_limit() function
This should improve accuracy for the returned buckets
2021-03-17 14:31:41 +02:00