Commit graph

1137 commits

Author SHA1 Message Date
Roman Khavronenko
0157566fdb vmalert: cleanup and restructure of code to improve maintainability (#471)
The change introduces a new entity `manager` which replaces
`watchdog` and decouples requestHandler and groups. Manager
is supposed to control the life cycle of groups, rules and
config reloads.

Groups export an ID method which returns a hash
of the filename and group name. ID is supposed to be a unique
identifier across all loaded groups.

Some tests were added to improve coverage.

Fixed a bug with a wrong annotation value when $value is used in
templates after metrics have been restored.

Notifier interface was extended to accept context.

A new set of metrics was introduced for config reload.
2020-05-11 14:35:55 +03:00
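The group ID described in the commit above could look roughly like the following sketch. The `Group` type, its fields and the choice of FNV-1a are assumptions made for illustration; the commit message only says that the ID is a hash of the filename and the group name.

```go
package main

import (
	"hash/fnv"
)

// Group is a hypothetical, trimmed-down stand-in for a vmalert rule group.
type Group struct {
	File string // path of the config file the group was loaded from
	Name string // group name as written in that file
}

// ID returns a hash built from the file name and the group name.
// FNV-1a is an assumption; the commit does not name the hash function.
func (g Group) ID() uint64 {
	h := fnv.New64a()
	h.Write([]byte(g.File))
	// The separator keeps ("ab", "c") and ("a", "bc") from hashing the same bytes.
	h.Write([]byte("\xff"))
	h.Write([]byte(g.Name))
	return h.Sum64()
}
```

Two groups with the same name in different files get different IDs under this scheme, which is what makes the ID usable as a key across all loaded groups.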
Nikolay Khramchikhin
0e8c345ffb vmalert config reload
Added config hot reload for vmalert via SIGHUP and an API call
2020-05-11 14:35:50 +03:00
Aliaksandr Valialkin
6ce9f81d16 docs/CaseStudies.md: add CERN case study 2020-05-11 14:35:43 +03:00
Aliaksandr Valialkin
6c88e3523b docs/Single-server-VictoriaMetrics.md: small updates for Monitoring and How to start VictoriaMetrics sections 2020-05-08 20:35:31 +03:00
Aliaksandr Valialkin
6646b380ef docs/vmauth.md: fix a link to docker images 2020-05-08 14:11:10 +03:00
Aliaksandr Valialkin
0362bd220e docs/Articles.md: add a link to CERN article at https://indico.cern.ch/event/877333/contributions/3696707/attachments/1972189/3281133/CMS_mon_RD_for_opInt.pdf 2020-05-08 01:25:17 +03:00
Aliaksandr Valialkin
657c3e3fc5 Makefile: suppress false positives for golangci-lint on nil pointer dereference 2020-05-07 19:41:11 +03:00
Aliaksandr Valialkin
28ad350a31 app/vmagent: return 200 from /-/reload endpoint as Prometheus does 2020-05-07 19:29:48 +03:00
Aliaksandr Valialkin
2f28e945b8 lib/httpserver: add -http.shutdownDelay flag for a grace period before http server shutdown
The http server returns 503 non-OK error at `/health` page during grace period,
so load balancers in front of the http server could re-route incoming requests
to other servers.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/463
2020-05-07 15:25:51 +03:00
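A minimal sketch of the grace-period behaviour described in the commit above, using only the standard library. The `gracefulServer` type, its methods and the 5 second shutdown timeout are hypothetical; only the `-http.shutdownDelay` flag, the `/health` page and the 503 response come from the commit message.

```go
package main

import (
	"context"
	"net/http"
	"sync/atomic"
	"time"
)

// gracefulServer is a hypothetical wrapper; it is not taken from lib/httpserver.
type gracefulServer struct {
	srv          *http.Server
	shuttingDown int32 // set to 1 once the grace period has started
}

// healthStatus returns the status code /health should report right now.
func (g *gracefulServer) healthStatus() int {
	if atomic.LoadInt32(&g.shuttingDown) == 1 {
		return http.StatusServiceUnavailable
	}
	return http.StatusOK
}

// handleHealth serves /health using healthStatus.
func (g *gracefulServer) handleHealth(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(g.healthStatus())
}

// beginShutdown flips /health to 503 without closing any connection yet.
func (g *gracefulServer) beginShutdown() {
	atomic.StoreInt32(&g.shuttingDown, 1)
}

// stop starts the grace period, waits for shutdownDelay so load balancers
// can notice the 503, and only then shuts the http server down.
func (g *gracefulServer) stop(shutdownDelay time.Duration) error {
	g.beginShutdown()
	time.Sleep(shutdownDelay)
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) // assumed timeout
	defer cancel()
	return g.srv.Shutdown(ctx)
}
```

The server keeps answering regular requests during the delay; only the health check changes, so a load balancer has time to re-route traffic before connections are closed.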
Aliaksandr Valialkin
3052b479b7 lib/httpserver: reduce typical duration for http server graceful shutdown
Previously graceful shutdown of the http server could take more than a minute
because of improperly set timeouts in setNetworkTimeout.
Now the typical duration of graceful shutdown should be less than 5 seconds.
2020-05-07 14:16:38 +03:00
Aliaksandr Valialkin
dc04040781 docs/{vmagent,vmauth}: small clarifications in the docs 2020-05-07 12:55:06 +03:00
Aliaksandr Valialkin
2b403d3f42 app/vmauth: prevent attacks that use .. in the path to access resources outside the configured url_prefix 2020-05-07 12:55:04 +03:00
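One common way to guard against `..` in request paths, as the commit above describes, is to normalize the path before appending it to the prefix. The sketch below is an assumption about the general technique, not the code from app/vmauth; `resolveTarget` is a hypothetical name.

```go
package main

import (
	"path"
	"strings"
)

// resolveTarget appends requestPath to urlPrefix after normalizing it.
// Rooting the path with "/" before path.Clean means ".." segments can
// never climb above the root, so the result always stays under urlPrefix.
func resolveTarget(urlPrefix, requestPath string) string {
	cleaned := path.Clean("/" + requestPath)
	return strings.TrimSuffix(urlPrefix, "/") + cleaned
}
```

With this approach a request for `/../secret` resolves to `<url_prefix>/secret` instead of escaping to the parent of the prefix.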
Aliaksandr Valialkin
c43a265716 lib/flagutil: make errcheck happy by explicitly ignoring Array.Set result in tests 2020-05-06 22:37:28 +03:00
Aliaksandr Valialkin
15e3682b40 lib/flagutil: properly parse quoted flag values for flagutil.Array 2020-05-06 22:28:15 +03:00
Aliaksandr Valialkin
20538a2a5d app/vmagent: allow setting independent auth configs per each configured -remoteWrite.url 2020-05-06 16:52:32 +03:00
Aliaksandr Valialkin
12dbb9e22c app/vmagent: properly set client-side TLS certificates for -remoteWrite.url. Previously they were mistakenly set as server-side 2020-05-06 16:50:37 +03:00
Aliaksandr Valialkin
9f39e618ed lib/promscrape/discovery/gce: discover per-zone instances for gce_sd_config in parallel. This should reduce discovery latency 2020-05-06 15:00:23 +03:00
Aliaksandr Valialkin
8665c2edb1 docs/vmagent.md: small fixes 2020-05-06 14:49:25 +03:00
Aliaksandr Valialkin
8ab5e47b5c lib/promscrape: add Prometheus-compatible DNS-based service discovery aka dns_sd_configs 2020-05-06 00:02:41 +03:00
Aliaksandr Valialkin
42d563934b lib/promscrape: properly connect to TCP6 addresses if -enableTCP6 is set 2020-05-06 00:02:40 +03:00
Aliaksandr Valialkin
21b91599c2 docs/{vmauth,vmagent}: fix ports for profiling 2020-05-05 20:16:09 +03:00
Aliaksandr Valialkin
309700ab8c docs/vmauth.md: mention that we can help creating customized proxy 2020-05-05 12:34:08 +03:00
Aliaksandr Valialkin
20e958789a docs/{vmagent,vmauth}: add Profiling section 2020-05-05 11:45:29 +03:00
Aliaksandr Valialkin
1153f30fee docs: add vmauth.md 2020-05-05 11:17:45 +03:00
Aliaksandr Valialkin
782fb30cd0 app/vmauth: build fixes 2020-05-05 11:03:25 +03:00
Aliaksandr Valialkin
de31d16154 app/vmauth: add initial version of vmauth. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmauth/README.md for details 2020-05-05 10:56:20 +03:00
Aliaksandr Valialkin
61df59b9ea docs/vmagent.md: /targets page doesn't expose information about improperly configured scrape configs now. It is written to the error log instead 2020-05-05 10:56:18 +03:00
Aliaksandr Valialkin
1c8e97c8a0 lib/procutil: add NewSighupChan function, which returns a channel, which is triggered on every SIGHUP 2020-05-05 10:56:15 +03:00
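A channel that fires on every SIGHUP, as described in the commit above, can be built with `os/signal` from the standard library. The lower-case function name and the return type below are assumptions; the commit only names `NewSighupChan` and does not show its signature.

```go
package main

import (
	"os"
	"os/signal"
	"syscall"
)

// newSighupChan returns a channel that receives a value on every SIGHUP.
// The buffer of 1 lets one signal be remembered while the receiver is busy;
// further signals that arrive before it is read are dropped by signal.Notify.
func newSighupChan() <-chan os.Signal {
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, syscall.SIGHUP)
	return ch
}
```

A caller would typically range over the channel in a goroutine and reload its configuration on every received value.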
Aliaksandr Valialkin
dde92fccc5 docs/vmalert.md: sync with app/vmalert/README.md 2020-05-05 07:51:32 +03:00
Aliaksandr Valialkin
054457d1f4 lib/promscrape: allow explicitly setting empty token via token: "" in consul_sd_config 2020-05-05 07:49:54 +03:00
Aliaksandr Valialkin
fd739808f3 make vendor-update 2020-05-05 00:53:41 +03:00
Roman Khavronenko
abce2b092f app/vmalert: restore alerts state from datasource metrics (#461)
* app/vmalert: restore alerts state from datasource metrics

Vmalert will restore alerts state for rules that have `rule.For` > 0 from previously written timeseries via `remotewrite.url` flag.

* app/vmalert: mention remotewrite and remoteread configuration in README
2020-05-05 00:52:19 +03:00
Aliaksandr Valialkin
89aa6dbf56 lib/promscrape: add Prometheus-compatible service discovery for Consul aka consul_sd_configs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/330
2020-05-04 20:53:06 +03:00
Aliaksandr Valialkin
28e0e8fd88 lib/promauth: properly set up client certificate in tls.Config
Previously the client certificate has been mistakenly set up as a server certificate
2020-05-04 20:53:04 +03:00
Aliaksandr Valialkin
ed91fe1d9b lib/promscrape: move common code for discovery api config map handling into discoveryutils 2020-05-04 20:52:58 +03:00
Aliaksandr Valialkin
c50fd219dc lib/promscrape/discovery/kubernetes/: unify apiConfig creation 2020-05-04 20:52:53 +03:00
Aliaksandr Valialkin
54414fefef vendor: update github.com/valyala/quicktemplate from v1.4.1 to v1.5.0 2020-05-04 01:37:34 +03:00
Aliaksandr Valialkin
6606dff58d docs/Single-server-VictoriaMetrics.md: mention that it is recommended upgrading to the latest release before reporting issues 2020-05-04 00:42:33 +03:00
Aliaksandr Valialkin
e3a4b75e59 docs/Cluster-VictoriaMetrics.md: add Multitenancy chapter 2020-05-03 18:01:15 +03:00
Aliaksandr Valialkin
a5880f17af lib/promscrape: remove debug line left after the commit e4aac6ea40 2020-05-03 17:16:19 +03:00
Aliaksandr Valialkin
1f0e8fdc0d lib/promscrape: fix tests after the commit 658a8742ac
The original commit copies `__address__` label to `instance` label when generating per-target labels as Prometheus does.

See https://www.robustperception.io/life-of-a-label for details.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/453
2020-05-03 16:59:29 +03:00
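The label handling mentioned in the commit above can be sketched as follows. `withInstanceLabel` is a hypothetical helper, and representing labels as a plain map is an assumption. The copy happens only when `instance` is not already set, which matches how Prometheus treats the label.

```go
package main

// withInstanceLabel returns a copy of labels in which the instance label
// is filled from __address__ when instance is missing or empty.
// The input map is left untouched.
func withInstanceLabel(labels map[string]string) map[string]string {
	out := make(map[string]string, len(labels)+1)
	for k, v := range labels {
		out[k] = v
	}
	if out["instance"] == "" {
		out["instance"] = out["__address__"]
	}
	return out
}
```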
DexterZhang
317688f144 fix(vmagent): behavior differs from how Prometheus deals with labels. [Issue#453] (#454) 2020-05-03 16:59:28 +03:00
Aliaksandr Valialkin
ab1e6a76bb lib/promscrape: make consistent scrape time offsets across reloads for the same ScrapeURL and Labels
This should make consistent intervals between data points for scrape targets across reloads.
Previously these intervals were random.
2020-05-03 14:31:22 +03:00
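One way to get scrape offsets that stay the same across reloads, as the commit above describes, is to derive the offset from a hash of the target identity instead of a random number. The sketch below is an assumption about the technique; `scrapeOffset`, the FNV-1a hash and the map representation of labels are all hypothetical.

```go
package main

import (
	"hash/fnv"
	"sort"
	"time"
)

// scrapeOffset derives a deterministic offset in [0, interval) from the
// scrape URL and the target labels, so the same target always starts its
// scrapes at the same point inside the interval.
func scrapeOffset(scrapeURL string, labels map[string]string, interval time.Duration) time.Duration {
	if interval <= 0 {
		return 0
	}
	// Map iteration order is random in Go, so sort the keys first.
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := fnv.New64a()
	h.Write([]byte(scrapeURL))
	for _, k := range keys {
		h.Write([]byte{0}) // separator between fields
		h.Write([]byte(k))
		h.Write([]byte{0})
		h.Write([]byte(labels[k]))
	}
	return time.Duration(h.Sum64() % uint64(interval))
}
```

Because the offset depends only on the URL, the labels and the interval, a config reload that leaves a target unchanged also leaves the spacing of its data points unchanged.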
Aliaksandr Valialkin
f25416984b lib/promscrape: fix TestGetFileSDScrapeWorkSuccess after 3b234d82e5 2020-05-03 14:31:20 +03:00
Aliaksandr Valialkin
f422203e10 lib/promscrape: reload only modified scrapers on config changes
This should improve scrape stability when a big number of targets is scraped and these targets change frequently.

Thanks to @xbsura for the idea and initial implementation attempts at the following pull requests:

- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/449
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/458
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/459
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/460
2020-05-03 12:47:16 +03:00
Aliaksandr Valialkin
8f591b848a docs/MetricsQL.md: document first_over_time and last_over_time functions 2020-05-03 12:47:16 +03:00
Aleksey Shirokih
137e371219 Avoid ugly y-label for rows inserted (#457) 2020-05-02 19:06:37 +01:00
Aliaksandr Valialkin
bbaca16ce8 lib/httpserver: rename http.externalURL to http.pathPrefix and improve help message for this flag
The `http.externalURL` flag name was slightly misleading, so it has been renamed to `http.pathPrefix`.
2020-05-02 13:12:24 +03:00
DexterZhang
a0589f2ca5 feat(httpserver): add http.externalUrl config to http server, it adds prefix to http path automatically (#452) 2020-05-02 13:12:23 +03:00
Aliaksandr Valialkin
8e041f1911 docs/Single-server-VictoriaMetrics.md: hint that \n is a single newline char 2020-05-01 13:42:50 +03:00