1413 commits

Author SHA1 Message Date
Aliaksandr Valialkin
6daa5f7500 lib/storage: prioritize data ingestion over heavy queries
Heavy queries could result in the lack of CPU resources for processing the current data ingestion stream.
Prevent this by delaying queries' execution until free resources are available for data ingestion.

Expose `vm_search_delays_total` metric, which may be used for alerting when there are not enough CPU resources
for data ingestion and/or for executing heavy queries.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
2020-07-05 19:42:05 +03:00
Roman Khavronenko
703def4b2e
app/vmalert: add retries to remotewrite (#605)
* app/vmalert: add retries to remotewrite

The remotewrite package now performs a limited number of retries if a write request fails.
This is supposed to make vmalert state persistence more reliable.

New metrics were added to remotewrite in order to track rows/bytes sent/dropped.

defaultFlushInterval was increased from 1s to 5s for sanity reasons.

* fix

* wip

* wip

* wip

* fix bits alignment bug for 32-bit systems

* fix mistakenly dropped field
2020-07-05 18:46:52 +03:00
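The "limited number of retries" idea from the vmalert commit above can be sketched as follows. This is a minimal illustration and not the code from app/vmalert: the names `sendWithRetries` and `errTemporary`, the fixed delay between attempts, and the attempt count are all assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTemporary is a hypothetical error used to simulate a failed write request.
var errTemporary = errors.New("temporary failure")

// sendWithRetries is a hypothetical helper, not vmalert's actual code.
// It calls send up to attempts times (attempts must be at least 1),
// sleeping for delay between failed attempts.
// It returns nil on the first success or the last error wrapped with context.
func sendWithRetries(send func() error, attempts int, delay time.Duration) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = send(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	flaky := func() error {
		calls++
		if calls < 3 {
			return errTemporary
		}
		return nil
	}
	err := sendWithRetries(flaky, 5, 10*time.Millisecond)
	fmt.Println(err, calls)
}
```

Bounding the number of attempts keeps a permanently failing endpoint from blocking the sender forever; what happens to data after the final failure is outside this sketch.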
Aliaksandr Valialkin
de137aef98 app/victoria-metrics: fix tests after the commit acf828a759 2020-07-05 18:24:41 +03:00
Aliaksandr Valialkin
acf828a759 app/vmselect/prometheus: small fixes on top of 8bb762124a 2020-07-05 18:17:06 +03:00
faceair
8bb762124a
fix adjusting of the last points to avoid influencing earlier values (#606) 2020-07-05 17:56:54 +03:00
Aliaksandr Valialkin
ff6a0955eb lib/promscrape: use HostClient.DoDeadline instead of HostClient.Do in order to guarantee strict deadline across multiple scrape attempts 2020-07-03 21:33:22 +03:00
Aliaksandr Valialkin
8b133e40d5 lib/promscrape: prevent too big deadline misses on scrape retries
The maximum deadline miss duration is reduced to 2x scrape_interval in the worst case.
By default it is limited to scrape_interval configured for the given scrape target.
2020-07-03 20:41:36 +03:00
Aliaksandr Valialkin
44a54b8b3d lib/promscrape: check for nil error before checking for the returned status code when scraping targets 2020-07-03 18:37:14 +03:00
Ween
d59cdbe90c
[VMAlert] Fix error log when remoteWrite queue size is full (#602)
* Fix Auto metrics relabeled errors

* Finalize auto-generated labels

* Fix Test Errors

* fix error logs when queue is full

Co-authored-by: xinyulong <xinyulong@kuaishou.com>
2020-07-03 16:49:37 +03:00
Aliaksandr Valialkin
0b2086b7a5 app/vminsert: prevent adding and/or selecting labels with empty values
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/600
2020-07-02 23:14:11 +03:00
Aliaksandr Valialkin
8f628cd805 app/victoria-metrics: removed debug log message when -selfScrapeInterval is set 2020-07-02 20:39:41 +03:00
Aliaksandr Valialkin
91b3482894 app/vminsert: add ability to apply relabeling to all the incoming metrics if -relabelConfig command-line arg points to a file with a list of relabel_config entries
See https://victoriametrics.github.io/#relabeling
2020-07-02 20:39:28 +03:00
Aliaksandr Valialkin
e5500bfcf2 all: typo fix: exptected -> expected 2020-07-02 18:05:52 +03:00
Aliaksandr Valialkin
5d3db3ff7c app/vmselect: add interpolate function for filling gaps with linearly interpolated values
See https://stackoverflow.com/q/62565021/274937 for details
2020-07-02 14:54:21 +03:00
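Filling gaps with linearly interpolated values, as the `interpolate` commit above describes, might look roughly like this. The sketch assumes gaps are represented as NaN and that leading and trailing gaps stay untouched; `interpolateGaps` is a hypothetical name and not the MetricsQL implementation.

```go
package main

import (
	"fmt"
	"math"
)

// interpolateGaps is a hypothetical sketch, not the MetricsQL implementation.
// It replaces runs of NaN that are bounded by real values on both sides
// with values lying on the straight line between those two bounds.
// Leading and trailing NaNs are left untouched.
func interpolateGaps(values []float64) {
	prev := -1 // index of the last non-NaN value seen so far
	for i, v := range values {
		if math.IsNaN(v) {
			continue
		}
		if prev >= 0 && i-prev > 1 {
			step := (v - values[prev]) / float64(i-prev)
			for j := prev + 1; j < i; j++ {
				values[j] = values[prev] + step*float64(j-prev)
			}
		}
		prev = i
	}
}

func main() {
	nan := math.NaN()
	values := []float64{1, nan, nan, 4, nan, 6}
	interpolateGaps(values)
	fmt.Println(values)
}
```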
Aliaksandr Valialkin
4dd3de9286 lib/promscrape: add ability to set disable_compression and disable_keepalive options in scrape_config section of the config passed to -promscrape.config
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580
2020-07-02 14:19:14 +03:00
Aliaksandr Valialkin
8da3f773ae lib/promscrape: add -promscrape.disableKeepAlive command-line flag for disabling http keep-alive connections when scraping targets
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580
2020-07-01 02:20:20 +03:00
BigFish
9d5f5b6878
fix: spelling mistakes (#594)
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-07-01 01:35:26 +03:00
Aliaksandr Valialkin
9a2ba5b6d1 vendor: make vendor-update 2020-07-01 01:04:58 +03:00
Aliaksandr Valialkin
b277ba8121 lib/httpserver: add Unwrap method to ErrorWithStatusCode, so As and Is functions in standard errors package may properly unwrap the error inside ErrorWithStatusCode 2020-07-01 00:54:01 +03:00
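What an `Unwrap` method buys can be shown with a simplified stand-in type. The struct below only imitates the shape of `httpserver.ErrorWithStatusCode`; its fields, the sentinel errors, and `isNotFound` are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors used only in this example.
var (
	errNotFound = errors.New("not found")
	errTimeout  = errors.New("timeout")
)

// errorWithStatusCode is a simplified stand-in for httpserver.ErrorWithStatusCode.
type errorWithStatusCode struct {
	Err        error
	StatusCode int
}

// Error implements the error interface.
func (e *errorWithStatusCode) Error() string { return e.Err.Error() }

// Unwrap lets errors.Is and errors.As inspect the wrapped error.
func (e *errorWithStatusCode) Unwrap() error { return e.Err }

// isNotFound reports whether errNotFound is anywhere in err's chain.
func isNotFound(err error) bool { return errors.Is(err, errNotFound) }

func main() {
	err := &errorWithStatusCode{Err: errNotFound, StatusCode: 404}
	fmt.Println(isNotFound(err))
}
```

Without the `Unwrap` method, `errors.Is` would stop at the wrapper and never reach `errNotFound`.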
Aliaksandr Valialkin
84a37098ed app/vmstorage: add -denyQueriesOutsideRetention command-line flag for denying queries outside the configured retention
VictoriaMetrics returns `503 Service Unavailable` http error for requests with time ranges outside the configured retention
if `-denyQueriesOutsideRetention` command-line flag is set.
2020-07-01 00:21:44 +03:00
Aliaksandr Valialkin
56ccfa5218 all: use errors.As instead of type assertion for detecting net.Error 2020-07-01 00:15:34 +03:00
Aliaksandr Valialkin
7c2c8b2981 all: use errors.As for inspecting errors that implement httpserver.ErrorWithStatusCode 2020-07-01 00:04:34 +03:00
Aliaksandr Valialkin
d5dddb0953 all: use %w instead of %s for wrapping errors in fmt.Errorf
This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode.
See https://blog.golang.org/go1.13-errors for details.
2020-06-30 23:05:11 +03:00
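The difference between `%w` and `%s` in `fmt.Errorf` is easiest to see next to `errors.As`. In this sketch `statusError`, `statusCodeOf`, `wrapW` and `wrapS` are hypothetical names; only `fmt.Errorf` and `errors.As` come from the standard library.

```go
package main

import (
	"errors"
	"fmt"
)

// statusError is a hypothetical error type carrying an HTTP status code.
type statusError struct {
	StatusCode int
}

// Error implements the error interface.
func (e *statusError) Error() string { return fmt.Sprintf("status code %d", e.StatusCode) }

// statusCodeOf returns the status code found in err's chain, or 0 if there is none.
func statusCodeOf(err error) int {
	var se *statusError
	if errors.As(err, &se) {
		return se.StatusCode
	}
	return 0
}

// wrapW keeps the original error reachable via errors.As and errors.Is.
func wrapW(err error) error { return fmt.Errorf("cannot process request: %w", err) }

// wrapS flattens the original error into text, so its type is lost.
func wrapS(err error) error { return fmt.Errorf("cannot process request: %s", err) }

func main() {
	orig := &statusError{StatusCode: 503}
	fmt.Println(statusCodeOf(wrapW(orig)), statusCodeOf(wrapS(orig)))
}
```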
Aliaksandr Valialkin
586c5be404 lib/promscrape: add missing label sorting for autogenerated metrics
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/592
2020-06-29 22:36:12 +03:00
Ween
1cd01b5359
Fix Auto metrics relabeled errors (#593)
* Fix Auto metrics relabeled errors

* Finalize auto-generated labels

* Fix Test Errors

Co-authored-by: xinyulong <xinyulong@kuaishou.com>
2020-06-29 22:29:29 +03:00
Roman Khavronenko
88538df267
app/vmalert: support multiple notifier urls (#584) (#590)
* app/vmalert: support multiple notifier urls (#584)

Users can now set multiple notifier URLs in the same fashion
as for other vmutils (e.g. vmagent). The same applies to the
TLS settings for every configured URL. Alerts are sent
sequentially in order to respect the specified URL order.

* app/vmalert: add basicAuth support for notifier client (#585)

The change adds the ability to set basicAuth creds for the notifier
client in the same fashion as for remote write/read and datasource.
2020-06-29 22:21:03 +03:00
Aliaksandr Valialkin
63e5ee0d29 docs: sync with upstream 2020-06-29 22:09:03 +03:00
Roman Khavronenko
eba4e92994
deployment/docker: replace Prometheus with vmagent (#589)
vmagent replaces Prometheus to perform scrapes and writes
into VictoriaMetrics installation. Prometheus datasource was
dropped, but its config was reused to feed vmagent.

Change also contains simplification in dashboard propagation
to Grafana container by removing excessive json manipulation
steps.
2020-06-29 22:05:34 +03:00
Roman Khavronenko
82ecfa3b32
app/vmalert: move flags description and initialization into subpackages
The change adds no new functionality and aims to move flags definitions
to subpackages that are using them. This should improve readability
of the main function.
2020-06-28 12:26:22 +01:00
kreedom
dc4e3f0e0b
app/vmalert: properly set transport for HTTP clients
Fixes issue #586
2020-06-27 08:31:54 +01:00
Aliaksandr Valialkin
8f2e88234f docs: update the info that docker images are built on top of alpine image now
A follow-up after the commit ff624c9125
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/522
2020-06-26 13:54:10 +03:00
Aliaksandr Valialkin
423825695f vendor: make vendor-update 2020-06-25 23:45:14 +03:00
Aliaksandr Valialkin
5dc0bf6d3d vendor: update github.com/valyala/fastjson from v1.5.1 to v1.5.2 2020-06-25 23:35:03 +03:00
Aliaksandr Valialkin
7eb171182b lib/promrelabel: properly apply ^ and $ anchors to regex value in Prometheus relabeling rules 2020-06-25 17:19:19 +03:00
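The commit above does not say what exactly was wrong. One common pitfall when anchoring a user-supplied regex is that plain concatenation binds `^` and `$` only to the outer alternatives, so the expression should be wrapped in a non-capturing group first. The sketch below shows that pitfall under this assumption; `compileAnchored` is a hypothetical helper, not the lib/promrelabel code.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileAnchored is a hypothetical helper, not the lib/promrelabel code.
// It anchors expr on both sides. The non-capturing group makes the anchors
// apply to the whole expression instead of only its first and last alternatives.
func compileAnchored(expr string) (*regexp.Regexp, error) {
	return regexp.Compile("^(?:" + expr + ")$")
}

func main() {
	naive := regexp.MustCompile("^" + "foo|bar" + "$") // parsed as (^foo)|(bar$)
	anchored, err := compileAnchored("foo|bar")
	if err != nil {
		panic(err)
	}
	// "foobar" starts with foo, so the naive regex matches it,
	// while the properly anchored regex accepts only "foo" or "bar".
	fmt.Println(naive.MatchString("foobar"), anchored.MatchString("foobar"))
}
```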
Aliaksandr Valialkin
05d754d7bb app/vmselect/netstorage: reset big result values every 10 seconds instead of after processing every time series
This should reduce GC pressure when processing time series with a big number of rows
2020-06-24 19:38:39 +03:00
Aliaksandr Valialkin
8dec17470d deployment/docker/docker-compose.yml: update Prometheus from v2.18.1 to v2.19.1 and Grafana from v7.0.2 to v7.0.3 2020-06-24 18:09:33 +03:00
Aliaksandr Valialkin
5e35b87c3d docs/Cluster-VictoriaMetrics.md: move VictoriaMetrics logo below "Cluster version" heading, since it is needed for proper navigation at https://victoriametrics.github.io 2020-06-24 12:06:27 +03:00
Aliaksandr Valialkin
c85d926569 docs/SampleSizeCalculations.md: updates 2020-06-24 12:06:25 +03:00
Aliaksandr Valialkin
f0cef4761b docs/SampleSizeCalculations.md: add a doc with calculations for the "Lowest sample size" graph at https://victoriametrics.com/ 2020-06-24 12:00:22 +03:00
nicbaz
774f7ca1c1
vmselect: fix label_replace when mismatch (#579)
As per documentation on `label_replace` function: "If the regular
expression doesn't match then the timeseries is returned unchanged".

Currently this behavior is not enforced: if a regexp on an existing
tag doesn't match, then the tag value is copied as-is into the destination
tag. This fix first checks that the regular expression matches the
source tag before applying anything.

Given the current implementation, this fix also changes the behavior
of the **MetricsQL** `label_transform` function which does not
document this behavior at the moment.
2020-06-23 23:50:33 +03:00
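The "return the series unchanged on mismatch" rule from the `label_replace` commit above can be sketched on a plain label map. This is a simplified illustration and not the vmselect code: `labelReplace` is a hypothetical function, the regex is assumed to be fully anchored, and other details of the real function are left out.

```go
package main

import (
	"fmt"
	"regexp"
)

// labelReplace is a hypothetical sketch of the documented label_replace rule.
// If regex (anchored on both sides) matches the value of the src label,
// the dst label is set to the expanded replacement.
// If it does not match, labels are left unchanged.
func labelReplace(labels map[string]string, dst, replacement, src, regex string) error {
	re, err := regexp.Compile("^(?:" + regex + ")$")
	if err != nil {
		return err
	}
	value := labels[src]
	if !re.MatchString(value) {
		return nil // no match: do not touch the destination label
	}
	labels[dst] = re.ReplaceAllString(value, replacement)
	return nil
}

func main() {
	labels := map[string]string{"instance": "host1:9100"}
	if err := labelReplace(labels, "host", "$1", "instance", `(.+):\d+`); err != nil {
		panic(err)
	}
	if err := labelReplace(labels, "port", "$1", "instance", "nomatch"); err != nil {
		panic(err)
	}
	fmt.Println(labels)
}
```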
Aliaksandr Valialkin
a560b4788e lib/fs: go fmt 2020-06-23 23:02:39 +03:00
Aliaksandr Valialkin
8141541e61 lib/fs: fall back to cgo copy for copying the last 4KB of mmaped data
This probably should fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/581
2020-06-23 22:55:22 +03:00
Aliaksandr Valialkin
e65b4cb6b1 docs/vmalert.md: sync with app/vmalert/README.md 2020-06-23 22:49:38 +03:00
Aliaksandr Valialkin
7209d58fbd app/vmselect/netstorage: increase concurrency when processing a small number of time series with a big number of data points per time series
Previously VictoriaMetrics was processing up to 32 time series in a single goroutine.
This could be slow if each time series contains a big number of data points (10M or more), since only a single CPU core could be loaded with work,
while other CPU cores were idle. Fix this by launching GOMAXPROCS workers for time series processing.

This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/572
2020-06-23 22:46:15 +03:00
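Launching GOMAXPROCS workers that pull time series one at a time, as the netstorage commit above describes, could look like the sketch below. It is a generic worker pool and not the app/vmselect/netstorage code: `sumAllSeries` and the per-series "work" (a plain sum) are assumptions.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// sumAllSeries is a hypothetical example of per-series work spread across
// GOMAXPROCS workers. Each worker takes one series index at a time from workCh,
// so a few very long series are spread across CPU cores
// instead of being processed by a single goroutine.
func sumAllSeries(series [][]float64) []float64 {
	sums := make([]float64, len(series))
	workCh := make(chan int)
	var wg sync.WaitGroup
	workers := runtime.GOMAXPROCS(0) // 0 only queries the current setting
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range workCh {
				var sum float64
				for _, v := range series[i] {
					sum += v
				}
				sums[i] = sum // each index is written by exactly one worker
			}
		}()
	}
	for i := range series {
		workCh <- i
	}
	close(workCh)
	wg.Wait()
	return sums
}

func main() {
	series := [][]float64{{1, 2, 3}, {4, 5}, {}}
	fmt.Println(sumAllSeries(series))
}
```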
nicbaz
72c90bfd8b
vmalert: add support for TLS configuration (#578)
app/vmalert: add support for TLS configuration

Add support for TLS optional configuration in a similar fashion to what
is currently supported in other vmutils such as vmagent. TLS
configuration options are distinct for datasource, remoteRead,
remoteWrite as well as notifier.
2020-06-23 20:45:45 +01:00
Aliaksandr Valialkin
2a39ba639d lib/promrelabel: add support for keep_if_equal and drop_if_equal actions to relabel configs
These actions may be useful for filtering out unneeded targets and/or metrics if they contain equal label values.
For example, the following rule would leave the target only if __meta_kubernetes_annotation_prometheus_io_port
equals __meta_kubernetes_pod_container_port_number:

  - action: keep_if_equal
    source_labels: [__meta_kubernetes_annotation_prometheus_io_port, __meta_kubernetes_pod_container_port_number]
2020-06-23 17:29:03 +03:00
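The `keep_if_equal` rule in the commit above keeps a target only when all listed source labels carry the same value. A minimal sketch of that check follows. `keepIfEqual` is a hypothetical function, not the lib/promrelabel code, and treating a missing label as an empty value is an assumption.

```go
package main

import "fmt"

// keepIfEqual is a hypothetical sketch of the keep_if_equal check.
// It reports whether all sourceLabels have equal values in labels.
// A missing label is treated as an empty value (an assumption of this sketch).
func keepIfEqual(labels map[string]string, sourceLabels []string) bool {
	if len(sourceLabels) < 2 {
		return true
	}
	first := labels[sourceLabels[0]]
	for _, name := range sourceLabels[1:] {
		if labels[name] != first {
			return false
		}
	}
	return true
}

func main() {
	labels := map[string]string{
		"__meta_kubernetes_annotation_prometheus_io_port": "8080",
		"__meta_kubernetes_pod_container_port_number":     "8080",
	}
	keep := keepIfEqual(labels, []string{
		"__meta_kubernetes_annotation_prometheus_io_port",
		"__meta_kubernetes_pod_container_port_number",
	})
	fmt.Println(keep)
}
```

A `drop_if_equal` check would be the negation of the same comparison.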
Aliaksandr Valialkin
8f0bcec6cc lib/promscrape: preserve the previously discovered targets on discovery errors per each job_name
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/582
2020-06-23 15:40:40 +03:00
Aliaksandr Valialkin
a13cd60c6f vendor: update github.com/klauspost/compress from v1.10.9 to v1.10.10 2020-06-23 13:48:51 +03:00
Aliaksandr Valialkin
c970cb912c lib/fs: an attempt to fix SIGBUS error by rounding mmap'ed region to a multiple of 4KB pages
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/581
2020-06-23 13:39:49 +03:00
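Rounding a size up to a multiple of 4KB, as mentioned in the lib/fs commit above, is a one-line bit trick when the page size is a power of two. The sketch below hard-codes 4096 because the commit message names 4KB pages; `roundUpToPage` is a hypothetical name and not the lib/fs code.

```go
package main

import "fmt"

// pageSize is taken from the commit message (4KB pages).
// It must be a power of two for the bit trick below to work.
const pageSize = 4096

// roundUpToPage is a hypothetical helper, not the lib/fs code.
// It rounds a non-negative n up to the nearest multiple of pageSize
// by adding pageSize-1 and clearing the low bits.
func roundUpToPage(n int) int {
	return (n + pageSize - 1) &^ (pageSize - 1)
}

func main() {
	for _, n := range []int{0, 1, 4096, 4097} {
		fmt.Println(n, roundUpToPage(n))
	}
}
```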
Aliaksandr Valialkin
b5206ce33f lib/logger: add -loggerErrorsPerSecondLimit for limiting the rate of ERROR messages 2020-06-23 12:41:36 +03:00
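A per-second limit on ERROR messages, as introduced by the `-loggerErrorsPerSecondLimit` commit above, can be approximated with a counter that resets whenever the wall-clock second changes. This is only a sketch of the idea and not the lib/logger code: `errorLimiter` and its `allow` method are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// errorLimiter is a hypothetical limiter, not the lib/logger implementation.
// It allows at most limit messages per wall-clock second.
type errorLimiter struct {
	mu     sync.Mutex
	limit  int
	second int64 // unix second of the current window
	count  int   // messages allowed so far in the current window
}

// allow reports whether a message may be logged at unix time nowSec.
func (l *errorLimiter) allow(nowSec int64) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if nowSec != l.second {
		// A new second has started: open a fresh window.
		l.second = nowSec
		l.count = 0
	}
	if l.count >= l.limit {
		return false
	}
	l.count++
	return true
}

func main() {
	l := &errorLimiter{limit: 2}
	now := time.Now().Unix()
	for i := 0; i < 4; i++ {
		fmt.Println(i, l.allow(now))
	}
}
```

Passing the current second in as an argument keeps the limiter easy to test without sleeping.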