950 commits

Author SHA1 Message Date
Aliaksandr Valialkin
5328a102e0 app/vmselect: unconditionally deny partial responses from /api/v1/export*
It is expected that `/api/v1/export*` returns full data, so there is no sense in partial responses there.
2021-01-27 14:39:53 +02:00
Aliaksandr Valialkin
4b324da947 all: consistently use timers from timerpool 2021-01-27 00:40:39 +02:00
Aliaksandr Valialkin
29bf531f7d app/vmagent: add -remoteWrite.rateLimit command-line flag for limiting data rate to remote storage
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1035
2021-01-27 00:40:39 +02:00
weng zhao
2a8a34ea05 vmalert: add option datasource.queryStep to allow user to address the inconsistency between grafana dashboards(query_range with step 15s usually) and ALERTS (#1027)
Co-authored-by: zhao.weng <zhao.weng@shopee.com>
2021-01-26 16:38:20 +02:00
Aliaksandr Valialkin
44c74f1e79 app/vmselect/promql: improve documentation for -search.maxPointsPerTimeseries command-line flag
This should reduce incorrect usage and assumptions for this flag.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1020
2021-01-22 13:00:35 +02:00
Aliaksandr Valialkin
e55205220b app/vmselect: add -search.maxStepForPointsAdjustment command-line flag, which can be used for disabling adjustment for points returned from /api/v1/query_range handler if they have timestamps closer than -search.latencyOffset to the current time 2021-01-19 22:57:50 +02:00
Aliaksandr Valialkin
5856611291 app/vmselect/graphite: extract getCanonicalPath() function from loop body inside getCanonicalPaths() 2021-01-18 17:31:27 +02:00
Aliaksandr Valialkin
5640e6cbca docs/vmagent.md: follow-up for 184a659c5f 2021-01-13 13:54:28 +02:00
Aliaksandr Valialkin
c5bdab5a4c app/vmselect/promql: add ability to pass multiple labels to sort_by_label and sort_by_label_desc functions
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/992
2021-01-13 12:43:47 +02:00
Aliaksandr Valialkin
8cae98aa78 app/vmselect/promql: properly parse escaped multibyte utf8 code sequences in metric names and labels names
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/990
2021-01-13 10:59:32 +02:00
Nikolay
821492bc0b adds extra_label to all import apis (#1007)
* adds extra_label to all import apis,
changes priority for extra_label - now it has priority over original labels

* Update README.md

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>

* Update README.md

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>

* adds extra labels to vmagent import api
changes order for adding labels, now they are added after user values

* adds tests for extra_label

* import fix

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-01-13 01:07:24 +02:00
Aliaksandr Valialkin
df6e399f73 app/vmselect/promql: add tfirst_over_time(m[d]) and tlast_over_time(m[d]) MetricsQL functions for returning timestamps for the first and the last samples in m over d 2021-01-12 16:12:47 +02:00
Nikolay
9f0a4fd00e Fixes error handling for promscrape.streamParse (#1009)
properly return error if client cannot read data,
properly suppress scraper errors
2021-01-12 13:35:09 +02:00
Roman Khavronenko
304512b668 vmalert-989: return non-empty result in template func query stub to pass validation (#1002)
On templates validation stage vmalert does not actually send queries, so for complex
chained expression validation may fail. To avoid this, we add a blank sample in response
so validation can pass successfully. Later, during the rule execution, stub will be replaced
with real `query` function.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/989
2021-01-11 12:59:33 +02:00
Aliaksandr Valialkin
4ee53c3961 all: use net.Dial instead of fasthttp.Dial, because fasthttp.Dial limits the number of concurrent dials to 1000 2021-01-11 12:52:51 +02:00
Aliaksandr Valialkin
d5a2b120e9 app/vmstorage: disable final merge by default, since it may result in high disk IO and CPU usage without measurable benefits such as increased query performance and reduced disk space usage 2021-01-08 00:12:12 +02:00
Aliaksandr Valialkin
47872ada7e app/vmselect/promql: do not adjust offset value provided in the query
Previously it could be modified in order to improve response cache hit ratio.
This is unneeded, since cache hit ratio should remain good because the query time range
should be already aligned to multiple of `step` values.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/976
2020-12-27 14:10:15 +02:00
Aliaksandr Valialkin
5bbf200de2 app/vmselect: add per-tenant /api/v1/status/top_queries handler 2020-12-27 12:53:50 +02:00
Aliaksandr Valialkin
0e739efc88 app/vmselect/promql: simplify defer call for querystats.RegisterQuery 2020-12-27 12:07:56 +02:00
Aliaksandr Valialkin
44932098b5 app/vmselect/querystats: reduce the default number of last queries to track from 100K to 20K
This should reduce memory usage in constrained environments
2020-12-25 17:40:32 +02:00
Aliaksandr Valialkin
e6deb39064 app/vmselect: refactor /api/v1/stats/top_queries
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/907
2020-12-25 17:24:25 +02:00
Nikolay
76d092c091 Adds query stats handler (#945)
* Adds query stat handler,
for query and query_range api, victoriametrics tracks query execution time,
stats are exported at /api/v1/status/queries endpoint with topN param
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/907

* fixed query stats bugs

* improves queryStats tracker

* improves query stat

* small fix

* fix tests

* added more tests

* fixes 386 tests

* naming fixes

* adds drop for outdated records
2020-12-25 17:24:24 +02:00
Nikolay
14915071d6 adds escape for CRLF (#984)
at external.alert.source - \n and \r symbols were url encoded, instead of direct usage.
replacing "\n" with `\n` allows skipping url encoding.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/890
2020-12-25 11:06:47 +02:00
Aliaksandr Valialkin
b480585905 app/vmalert: typo fix in descriptions for notifier.basicAuth.username and notifier.basicAuth.password command-line flags 2020-12-24 12:49:40 +02:00
Nikolay
0b87f02602 fixes panic (#979)
* fixes panic
https://github.com/VictoriaMetrics/helm-charts/issues/89

* add fast-path

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-12-24 09:01:40 +02:00
Aliaksandr Valialkin
d8511b6651 docs: mention that it is possible to set multiple -notifier.tlsInsecureSkipVerify command-line flags for vmalert
See c3a92968343c2b3619f1ab935702d0e9b3a46733
2020-12-22 22:32:56 +02:00
Nikolay
67e470e598 changes vmalert notifier flag, (#978)
fixes issue with notifier insecure setting, now it's possible to set notifier.tlsInsecureSkipVerify multiple times.
2020-12-22 22:27:03 +02:00
Roman Khavronenko
9ce8b36d2a vmalert-974: fix order for labels templating (#975)
The change fixes bug caused by 3adf8c5a6f.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/974
2020-12-19 14:21:27 +02:00
Aliaksandr Valialkin
262cf81757 app/vmselect: properly parse negative combined offsets such as -1h2m3s
Previously such offsets were parsed as `-1h + 2m + 3s`. Now they are parsed as `-(1h + 2m + 3s)`.
2020-12-19 01:25:03 +02:00
Aliaksandr Valialkin
49e800ba55 app/vmagent: add vmagent_remotewrite_blocks_sent_total and vmagent_remotewrite_bytes_sent_total metrics per each -remoteWrite.url 2020-12-15 20:41:08 +02:00
Aliaksandr Valialkin
9ab7ca1133 docs/vmagent.md: typo fix: pearsed->parsed 2020-12-15 19:03:35 +02:00
Aliaksandr Valialkin
11674a9b76 docs/vmagent.md: mention that sample_limit option has no sense when stream parsing is enabled 2020-12-15 18:44:19 +02:00
Aliaksandr Valialkin
8d1031c29a app/vmselect/promql: return expected increase() result for the first point on the graph with value not exceeding 100 2020-12-15 14:10:50 +02:00
Nikolay
7064c4eb8e adds new Array Flags (#965)
* adds ArrayDuration and ArrayBool flags,
makes sendTimeout and tlsInsecure configurable per remoteWrite url

* added backward compatibility testcases for ArrayDuration and ArrayBool

* fixes bool flag

* fixes test cases
2020-12-15 12:59:33 +02:00
Roman Khavronenko
9f578e389c vmalert: add function "query", "first" and "value" to alert templates functions (#960)
The commit adds a support for template function `query`,
`first` and `value`. The function `query` executes
a MetricsQL query for active alerts. In vmalert we
update templates on every evaluation for active alerts
to keep them up to date. With `query` func it may become
a perf issue since it will fire a query on every execution.
We should keep it in mind for now.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/539
2020-12-14 20:12:16 +02:00
Aliaksandr Valialkin
a2eb451de4 app/{vmagent,vminsert}: follow-up for ce8c2dd1f1: return /targets page in HTML when requested via web browser 2020-12-14 14:13:01 +02:00
Nikolay
324e3aa1a5 Changes targets api (#961)
* changes /targets api
adds html response if requester accepts text/html,
adds quick template for /targets api,
fixes pathPrefix for / requests

* changes namings

* renamed targets file

* Update app/victoria-metrics/main.go

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>

* adds trimspace to qtpl,
moves content-type for targets response closer to writer

* fixes bug with prefix

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-12-14 14:13:00 +02:00
Aliaksandr Valialkin
fc82c22e50 docs: consistently use links to https://victoriametrics.github.io for documentation references 2020-12-11 21:09:17 +02:00
Aliaksandr Valialkin
d6f9bf2d19 app/vmselect/graphite: properly handle wildcards and charsets inside curly braces
For example, `foo{bar*,[a-f]a*b}` should match `foobar`, `foobar123`, `foofab`, etc.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/952
2020-12-11 17:26:32 +02:00
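The matching behaviour from d6f9bf2d19 can be approximated with a short Go sketch. This is an illustration under assumptions, not the code from app/vmselect/graphite: `expandBraces` and `matchGraphite` are hypothetical names, nested braces are not supported, and the standard `path.Match` stands in for Graphite wildcard and charset matching.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// expandBraces expands {a,b,...} groups into separate patterns.
// Hypothetical helper: nested braces are not handled.
func expandBraces(pattern string) []string {
	open := strings.IndexByte(pattern, '{')
	if open < 0 {
		return []string{pattern}
	}
	end := strings.IndexByte(pattern[open:], '}')
	if end < 0 {
		return []string{pattern}
	}
	end += open
	prefix, suffix := pattern[:open], pattern[end+1:]
	var out []string
	for _, alt := range strings.Split(pattern[open+1:end], ",") {
		// Recurse so that later brace groups in the suffix are expanded too.
		out = append(out, expandBraces(prefix+alt+suffix)...)
	}
	return out
}

// matchGraphite reports whether name matches any expanded alternative.
// Wildcards and charsets inside each alternative are left to path.Match.
func matchGraphite(pattern, name string) (bool, error) {
	for _, p := range expandBraces(pattern) {
		ok, err := path.Match(p, name)
		if err != nil {
			return false, err
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	for _, name := range []string{"foobar", "foobar123", "foofab", "foo"} {
		ok, err := matchGraphite("foo{bar*,[a-f]a*b}", name)
		if err != nil {
			panic(err)
		}
		fmt.Println(name, ok)
	}
}
```

Expanding the braces first means `*` and `[a-f]` inside each alternative are matched like any other part of the pattern, which is the case the commit message describes.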
Aliaksandr Valialkin
9e79fc27c8 app/vminsert/netstorage: properly update vm_rpc_rerouted_rows_processed_total metric
Previously this metric wasn't updated because of improper defer call.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/955

Thanks to @xemxx for spotting the bug.
2020-12-11 13:07:05 +02:00
Aliaksandr Valialkin
1a237c6903 all: properly handle CPU limits set on the host system/container
This can reduce memory usage on systems with enabled CPU limits.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/946
2020-12-08 21:07:03 +02:00
Aliaksandr Valialkin
bd8eef2528 app/vmselect/promql: do not reduce lookbehind window for any_rollup_func(m) to -search.maxStalenessInterval. It should equal the step value passed to /api/v1/query_range as most users expect 2020-12-08 15:17:05 +02:00
Aliaksandr Valialkin
7bdf07883b app/{vmalert,vmagent}: skip empty values in -remoteWrite.label and -label lists 2020-12-08 14:54:02 +02:00
Aliaksandr Valialkin
9660774fd1 app/vmselect/graphite: remove duplicate name tag from /tags/autoComplete/tags handler
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/942
2020-12-07 01:10:02 +02:00
Aliaksandr Valialkin
d242c2f2bd app/vmselect/promql: add count_eq_over_time(m[d], N) and count_ne_over_time(m[d], N) for calculating the number of samples in m over d that are equal / not equal to N 2020-12-05 12:31:01 +02:00
Aliaksandr Valialkin
bdac2171f1 all: do not print usage info for all the flags when incorrect command-line flag is passed
This should improve usability for VictoriaMetrics apps that have big number of command-line flags,
i.e. all the apps.
2020-12-03 21:46:19 +02:00
Aliaksandr Valialkin
8cf76d8747 app/vmselect/promql: add label_uppercase(q, label1, ... labelN) and label_lowercase(q, label1, ... labelN) functions
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/936
2020-12-03 21:46:18 +02:00
Aliaksandr Valialkin
11bbb3552d app/vmselect/promql: make fmt 2020-12-02 21:34:15 +02:00
Aliaksandr Valialkin
9e98a8f3d3 app/vmselect/promql: return nan from minute(m) when m equals to nan
This aligns VictoriaMetrics behaviour with Prometheus behaviour.

The issue has been spotted in https://promlabs.com/promql-compliance-test-results/2020-12-01/victoriametrics/
2020-12-02 20:16:40 +02:00
Aliaksandr Valialkin
def513355e app/vmselect/promql: do not return 0 value from sum_over_time(m[d]) when there are no samples on the given d window.
This aligns the behaviour of `sum_over_time()` with other `_over_time()` functions and with Prometheus behavior.
2020-12-02 13:12:33 +02:00
Aliaksandr Valialkin
490c70a958 app/vmselect: return metric values from time() cmp_op metric query when cmp_op comparison is true
This aligns MetricsQL behavior to Prometheus' one.

The issue has been identified at https://promlabs.com/promql-compliance-test-results/2020-12-01/victoriametrics/
2020-12-02 12:09:40 +02:00
Aliaksandr Valialkin
4ef7158e89 app/vmselect/promql: return nan from a >bool b if a is nan in the same way as Prometheus does 2020-12-02 00:28:56 +02:00
Aliaksandr Valialkin
adf45b730c app/vmselect/searchutils: return elapsed time in Deadline.String() output
This should improve debuggability for error messages containing Deadline.String() output
2020-12-01 00:14:36 +02:00
Aliaksandr Valialkin
1dce37b2fa app/vmbackup/snapshot: add missing status code check for the returned response when working with snapshot API
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/929
2020-11-30 14:49:29 +02:00
Aliaksandr Valialkin
8b5a38376d app/vmbackup/snapshot: log url and response body on failed JSON response parsing
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/929
2020-11-29 12:16:08 +02:00
Nikolay
e4e33cb757 fixes checksum calculation (#928)
* fixes checksum calculation,
'for' rule param wasn't marshaled properly during checksum calculation

* fixes error
2020-11-29 09:50:57 +02:00
Aliaksandr Valialkin
3f52e59efe app/{vmagent,victoria-metrics}: add -dryRun option and make more clear handling for -promscrape.config.dryRun 2020-11-25 23:01:39 +02:00
Aliaksandr Valialkin
ed06990609 app/vmagent: do not enable -promscrape.config.strictParse when -dryRun command-line flag is set
Users can specify -promscrape.config.strictParse if -promscrape.config shouldn't contain unknown config entries
2020-11-25 22:27:41 +02:00
BigFish
3159b41689 Update main.go (#922)
fix spelling mistake
2020-11-24 12:36:47 +02:00
Aliaksandr Valialkin
2cc288c023 app/vmbackup: cosmetic fixes 2020-11-23 17:10:13 +02:00
Aliaksandr Valialkin
e1297c0b78 app/vmselect: add /tags/delSeries handler from Graphite Tags API
See https://graphite.readthedocs.io/en/stable/tags.html#removing-series-from-the-tagdb
2020-11-23 15:32:14 +02:00
Aliaksandr Valialkin
3d2ce31cad app/vmselect/netstorage: code readability improvement: rename *RequestErrors to *Errors 2020-11-23 15:00:15 +02:00
Aliaksandr Valialkin
433ae806ac app/vmselect: implement /tags/tagSeries and /tags/tagMultiSeries in order to be consistent with single-node VictoriaMetrics 2020-11-23 14:57:08 +02:00
Aliaksandr Valialkin
7987129baa app/vmselect/netstorage: move common code for requests execution on all the storage nodes to startStorageNodesRequest func 2020-11-23 10:51:48 +02:00
Aliaksandr Valialkin
25a57ced6c app/vmselect/netstorage: prevent from data races in ProcessSearchQuery and in Export funcs when -replicationFactor > 1
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711
2020-11-23 10:25:51 +02:00
Aliaksandr Valialkin
f4fd917e4f lib/fs: replace fs.OpenReaderAt with fs.MustOpenReaderAt
All the callers for fs.OpenReaderAt expect that the file will be opened.
So it is better to log fatal error inside fs.MustOpenReaderAt instead of leaving this to the caller.
2020-11-23 09:57:30 +02:00
Aliaksandr Valialkin
1dcb438c3b app/vmselect/netstorage: typo fix after 990eb29a9b 2020-11-23 01:09:43 +02:00
Aliaksandr Valialkin
85eecf5801 app/vmselect/netstorage: add -replicationFactor command-line flag for reducing query duration when a part of vmstorage nodes are temporarily slow and/or temporarily unavailable
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711
2020-11-23 00:39:53 +02:00
Aliaksandr Valialkin
990eb29a9b app/vmselect/netstorage: move common code for collecting query results from vmstorage nodes to collectResults function 2020-11-23 00:16:02 +02:00
Nikolay
bb2bcb9725 Adds eureka service discovery (#913)
* Adds eureka service discovery
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/851
Netflix service discovery for AWS

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-11-20 14:02:13 +02:00
Aliaksandr Valialkin
e72ccc9239 app/vmselect: add remoteAddr to slow query log in order to improve debuggability
This will simplify identifying the client that sends slow queries to VictoriaMetrics.
2020-11-18 20:40:02 +02:00
Aliaksandr Valialkin
ea4afb201b app/vmselect/netstorage: typo fix in a comment inside SearchMetricNames func 2020-11-18 01:35:37 +02:00
Aliaksandr Valialkin
c6adcafedb app/vminsert: export vm_rpc_vmstorage_is_reachable metric, which can be used for monitoring reachability of vmstorage nodes from vminsert nodes 2020-11-17 22:13:26 +02:00
Aliaksandr Valialkin
7d76fdedcc app/vmselect: use storage.NewSearchQuery() instead of constructing storage.SearchQuery in-place
This should prevent from bugs when AccountID and ProjectID aren't set in storage.SearchQuery.
2020-11-16 18:04:33 +02:00
Aliaksandr Valialkin
911c6d3bcd app/vmselect: add missing graphite prefix to /tags/autoComplete/{tags,values} 2020-11-16 18:04:24 +02:00
Aliaksandr Valialkin
f7f866d83b app/vmselect/netstorage: typo fix 2020-11-16 15:54:45 +02:00
Aliaksandr Valialkin
59fb75717e app/vmselect/netstorage: apply Graphite filter after substituting __name__ with name 2020-11-16 15:50:53 +02:00
Aliaksandr Valialkin
eb763bcb9d app/vmselect/graphite: add /tags/autoComplete/values handler from Graphite Tags API 2020-11-16 15:29:29 +02:00
Aliaksandr Valialkin
f2f16d8e79 app/vmselect/graphite: add /tags/autoComplete/tags handler from Graphite Tags API
See https://graphite.readthedocs.io/en/stable/tags.html#auto-complete-support
2020-11-16 14:58:10 +02:00
Aliaksandr Valialkin
2f4421b86c app/vmselect/prometheus: return __name__ label if match[] query to /api/v1/labels matches at least a single time series 2020-11-16 13:54:50 +02:00
Aliaksandr Valialkin
852aed62f7 app/vmselect/prometheus: improve performance for /api/v1/labels and /api/v1/label/<labelName>/values on time ranges exceeding one day when match[] query arg is set 2020-11-16 13:46:51 +02:00
Aliaksandr Valialkin
e969346e3e app/vmselect/prometheus: fix deadlock in /api/v1/series on a time range exceeding one day 2020-11-16 13:30:57 +02:00
Aliaksandr Valialkin
eea1be0d5c app/vmselect/graphite: add /tags/findSeries handler from Graphite Tags API
See https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags
2020-11-16 12:52:23 +02:00
Aliaksandr Valialkin
97100b1d42 app/vmselect/graphite: apply filter then limit 2020-11-16 12:52:18 +02:00
Aliaksandr Valialkin
5889273920 app/vmselect/graphite: add /tags/<tag_name> handler for Graphite Tags API 2020-11-16 03:41:41 +02:00
Aliaksandr Valialkin
99cb1a70cf app/vmselect/graphite: add /tags handler from Graphite Tags API
See https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags
2020-11-16 02:57:20 +02:00
Aliaksandr Valialkin
2ac5f00d98 app/vmselect: propagate errors from vmstorage to response to the client if -search.denyPartialResponse command-line flag is set
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/891

This commit also adds `"isPartial":{true|false}` field to `/api/v1/*` responses. `"isPartial":true` is set when the response
is based on a partial data because some of vmstorage nodes weren't available during query processing.
2020-11-14 13:20:10 +02:00
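A client could check the `isPartial` field from 2ac5f00d98 roughly as in the Go sketch below. The sample body and the `apiResponse` and `isPartialResponse` names are assumptions made for illustration; only the `"isPartial"` field itself comes from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// apiResponse models only the fields this sketch needs.
// The surrounding response shape is an assumption.
type apiResponse struct {
	Status    string `json:"status"`
	IsPartial bool   `json:"isPartial"`
}

// isPartialResponse is a hypothetical helper that reports whether
// the decoded response is marked as based on partial data.
func isPartialResponse(body []byte) (bool, error) {
	var r apiResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return false, err
	}
	return r.IsPartial, nil
}

func main() {
	// Hypothetical response body, not captured from a real server.
	body := []byte(`{"status":"success","isPartial":true,"data":{}}`)
	partial, err := isPartialResponse(body)
	if err != nil {
		panic(err)
	}
	fmt.Println("partial:", partial)
}
```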
Aliaksandr Valialkin
882e2e2099 app/vminsert/netstorage: return 503 status code to client when all the vmstorage nodes are unavailable
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/896
2020-11-14 00:44:41 +02:00
Aliaksandr Valialkin
8f42e59e05 app/vmselect/promql: remove spikes from increase() and delta() results on time series with sparse irregular data points
Do not take into account a sparse data point value if the next point is located too far from the current point.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/894
2020-11-13 15:23:37 +02:00
Aliaksandr Valialkin
da6d82a8dd app/vmselect/promql: assume that time series value doesn't change during gaps when calculating increase() and delta()
This should remove unexpected spikes at the end of gaps.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/894
2020-11-13 14:59:32 +02:00
Aliaksandr Valialkin
7ceaf4ba8f all: consistently return text-based HTTP responses with charset=utf-8
This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897
2020-11-13 10:30:21 +02:00
faceair
64e99744f1 add charset on targets response (#897) 2020-11-13 10:18:13 +02:00
Aliaksandr Valialkin
f7a6ae3d11 docs/vmagent.md: added a link to https://valyala.medium.com/how-to-use-relabeling-in-prometheus-and-victoriametrics-8b90fc22c4b2 into Relabeling section 2020-11-12 12:27:13 +02:00
Aliaksandr Valialkin
069979c367 docs/vmagent.md: typo fix 2020-11-11 16:05:04 +02:00
Aliaksandr Valialkin
7a0094adae docs/vmagent.md: add Configuration update section 2020-11-11 16:01:21 +02:00
immerrr again
1ec1a9f27f app/vmstorage: add "/internal/force_flush" endpoint (#893) 2020-11-11 14:46:37 +02:00
Aliaksandr Valialkin
4f2c5877db app/vmselect: add -search.treatDotsAsIsInRegexps command-line flag for automatic escaping of dots in regexp label filters 2020-11-11 12:40:28 +02:00
Aliaksandr Valialkin
a78bf34ff3 app/vmselect: do not return isPartialResponse=true when all the storageNodes return errors 2020-11-10 18:48:57 +02:00
Aliaksandr Valialkin
8f3339fa81 app/vmselect/promql: do not return data points in the end of the selected time range for time series ending in the middle of the selected time range
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/887
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845
2020-11-10 14:51:55 +02:00
Aliaksandr Valialkin
6385432611 app/vmselect: typo fix in a description for -search.minStalenessInterval: mimimum->minimum 2020-11-10 01:18:59 +02:00
Roman Khavronenko
4fd2b6cd16 vmalert: explicitly set extra labels to alert entities (#886)
The previous implementation treated extra labels (global and rule labels) as
separate label set to returned time series labels. Hence, time series always contained
only original labels and alert ID was generated from sorted labels key-values.
Extra labels didn't affect the generated ID and were applied on the following actions:
- templating for Summary and Annotations;
- persisting state via remote write;
- restoring state via remote read.

Such behaviour caused difficulties on restore procedure because extra labels had to be dropped
before checking the alert ID, but that not always worked. Consider the case when expression
returns the following time series `up{job="foo"}` and rule has extra label `job=bar`.
This would mean that restored alert ID will be always different to the real time series because
of collision.

To solve the situation extra labels are now always applied beforehand and `vmalert` doesn't
store original labels anymore. However, this could result into a new error situation.
Consider the case when expression returns two time series `up{job="foo"}` and `up{job="baz"}`,
while rule has extra label `job=bar`. In such case, applying extra labels will result into
two identical time series and `vmalert` will return error:
 `result contains metrics with the same labelset after applying rule labels`

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/870
2020-11-10 00:27:56 +02:00
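The collision described in 4fd2b6cd16 can be reproduced with a small Go sketch. It is not the vmalert implementation: `applyExtraLabels`, `labelsKey` and `checkUnique` are hypothetical helpers, and the way extra labels override series labels is an assumption based on the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// applyExtraLabels copies the series labels and then sets the extra
// labels on top, so an extra label replaces a series label of the same name.
func applyExtraLabels(series, extra map[string]string) map[string]string {
	out := make(map[string]string, len(series)+len(extra))
	for k, v := range series {
		out[k] = v
	}
	for k, v := range extra {
		out[k] = v
	}
	return out
}

// labelsKey builds a deterministic key from sorted label pairs.
func labelsKey(labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var sb strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&sb, "%s=%q,", k, labels[k])
	}
	return sb.String()
}

// checkUnique returns an error if two series become identical
// after the extra labels are applied.
func checkUnique(seriesList []map[string]string, extra map[string]string) error {
	seen := make(map[string]struct{}, len(seriesList))
	for _, s := range seriesList {
		key := labelsKey(applyExtraLabels(s, extra))
		if _, ok := seen[key]; ok {
			return errors.New("result contains metrics with the same labelset after applying rule labels")
		}
		seen[key] = struct{}{}
	}
	return nil
}

func main() {
	series := []map[string]string{
		{"__name__": "up", "job": "foo"},
		{"__name__": "up", "job": "baz"},
	}
	extra := map[string]string{"job": "bar"}
	fmt.Println(checkUnique(series, extra))
}
```

With the extra label `job=bar`, both `up{job="foo"}` and `up{job="baz"}` turn into `up{job="bar"}`, so the check reports the duplicate labelset.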
Aliaksandr Valialkin
a8562d643b lib/promscrape: add -promscrape.dropOriginalLabels command-line flag for reducing memory usage when discovering big number of scrape targets
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/878
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-11-10 00:20:49 +02:00
Aliaksandr Valialkin
b8083b7659 lib/promscrape: clean references to label name and label value strings after applying per-target relabeling
This should reduce memory usage when per-target relabeling creates big number of temporary labels
with long names and/or values.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-11-07 16:19:52 +02:00
Aliaksandr Valialkin
efebc3b6fb app/vmselect/promql: code cleanup after 43823addea 2020-11-06 01:31:33 +02:00
n4mine
3127aa92b5 app/vmselect/promql: fix the case when the values passed to maxValue() and minValue() start with NaN, which caused {top,bottom}k_{max,min} to return inappropriate results (#883) 2020-11-06 01:31:31 +02:00
Aliaksandr Valialkin
767231f41f app/vmstorage/transport: properly handle request to labelValuesOnTimeRange 2020-11-05 02:08:04 +02:00
Aliaksandr Valialkin
72011bcc45 app/vmselect: properly handle errors in GetLabelsOnTimeRange and GetLabelValuesOnTimeRange 2020-11-05 01:36:34 +02:00
Aliaksandr Valialkin
c5e6c5f5a6 app/vmselect: optimize querying for /api/v1/labels and /api/v1/label/<name>/values when start and end args are set 2020-11-05 01:19:29 +02:00
Aliaksandr Valialkin
1336e47c86 docs/vmagent.md: update after 4c808d58bf 2020-11-04 20:33:49 +02:00
Nikolay
5b235b902b Adds ready probe (#874)
* adds leading forward slash check for scrapeURL path
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/835

* adds ready probe for scrape config initialization,
it should prevent metrics loss during vmagent rolling update,
/ready api will return 425 http code, if some scrape config still waits for initialization.

* updates docs

* Update app/vmagent/README.md

* renames var

* Update app/vmagent/README.md

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-11-04 20:33:48 +02:00
Nikolay
d0a9b24c5a reduces memory usage for vmagent, (#880)
* reduces memory usage for vmagent,
limits count of droppedTarget, that can be stored for /api/v1/targets page up to 999 items,
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/878

* Update app/vmagent/README.md

* Update app/vmagent/README.md

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-11-04 17:13:33 +02:00
Aliaksandr Valialkin
407a46c11e Revert "docs/vmagent.md: mention about -promscrape.dropOriginalLabels"
This reverts commit 1a80acc712.
2020-11-04 11:45:35 +02:00
Aliaksandr Valialkin
1a80acc712 docs/vmagent.md: mention about -promscrape.dropOriginalLabels 2020-11-04 11:16:16 +02:00
Aliaksandr Valialkin
887a3c317f app/vmagent/remotewrite: drop packets only on 409 status code, since there are other valid 4xx status codes, which shouldn't result in packet drop 2020-11-03 14:24:57 +02:00
Aliaksandr Valialkin
66de02fbb4 app/vmselect/promql: allow dropping trailing sample only for default_rollup function
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/850
2020-11-02 02:11:06 +02:00
Aliaksandr Valialkin
ca2e0f1e04 app/vmagent/remotewrite: drop packets if remote storage returns 4xx status code
This makes the behaviour consistent with Prometheus.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/873
2020-11-02 00:45:01 +02:00
Aliaksandr Valialkin
6b623eba02 app/vmselect/promql: go fmt 2020-11-02 00:18:24 +02:00
Aliaksandr Valialkin
7c0b658865 app/vmselect/promql: do not drop trailing datapoints for instant queries
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
2020-11-02 00:12:53 +02:00
Roman Khavronenko
333675875f vmalert: skip automatically added labels on alerts restore (#871)
Label `alertgroup` was introduced in #611 and automatically added to generated
time series. By mistake, this new label wasn't correctly purged on restore event
and affected alert's ID uniqueness. This commit removes `alertgroup` label
in restore function.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/870
2020-11-01 23:26:00 +02:00
kreedom
40172c0721 vmbackup fix panic when no origin fs given (#859)
* use fsnil when no origin fs
2020-11-01 23:17:01 +02:00
Aliaksandr Valialkin
ed724d25ba lib/promscrape: add stream parse mode for efficient scraping of targets that expose millions of metrics 2020-11-01 23:12:26 +02:00
Aliaksandr Valialkin
abdf22e0bb app/vmagent: expose /api/v1/targets page according to https://prometheus.io/docs/prometheus/latest/querying/api/#targets
This page is exposed by vmagent and by a single-node VictoriaMetrics

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/643
2020-10-20 21:55:14 +03:00
Aliaksandr Valialkin
c4464594b7 app/vmselect/promql: allow passing optional third argument to topk_* and bottomk_* functions in order to obtain sum of time series outside top/bottom K 2020-10-20 20:09:55 +03:00
Aliaksandr Valialkin
9c5cd5a6c5 lib/storage: code cleanup after 5bfd4e6218 2020-10-20 16:10:53 +03:00
Aliaksandr Valialkin
0db7c2b500 app/vmstorage: support for -retentionPeriod smaller than one month
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/173
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/17
2020-10-20 14:42:46 +03:00
kreedom
4526cf92d3 vmalert - add dryRun (#842)
vmalert: add `dryRun` flag for rules validation without running the service
2020-10-20 10:49:22 +03:00
Seva Poliakov
e6bf9eaac7 Fix typo in vmrestore readme 2020-10-20 10:49:22 +03:00
Aliaksandr Valialkin
ee2902ddaf app/vmselect/promql: an attempt to improve heuristics for dropping trailing data points in time series
Now trailing data points are additionally dropped for time series with a single raw sample

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
2020-10-17 10:44:26 +03:00
Roman Khavronenko
d6155a3f33 vmalert: update docs to highlight the state restore requirements; (#833)
Address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/830
2020-10-13 18:34:00 +03:00
Aliaksandr Valialkin
b9a4601c97 app/vmselect/promql: return a single time series at max from absent() function like Prometheus does 2020-10-13 15:56:10 +03:00
Aliaksandr Valialkin
217c192c88 app/vmselect/promql: improve time series staleness detection
This should prevent from double counting for time series at the time when it changes label.
The most common case is in K8S, which changes pod uid label with each new deployment.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
2020-10-13 12:20:08 +03:00
Aliaksandr Valialkin
f877e703c8 app/vmselect/promql: fix mode_over_time calculations
Previously `mode_over_time` could return garbage due to improper shuffling of input data points.
2020-10-13 11:58:30 +03:00
Aliaksandr Valialkin
d884ab13dc app/vmselect/prometheus: fix golangci-lint warning 2020-10-13 09:36:18 +03:00
Aliaksandr Valialkin
0867dea5fc app/vmselect: add ability to export data in CSV format via /api/v1/export/csv 2020-10-12 20:08:08 +03:00
Aliaksandr Valialkin
938b3b7ed1 lib/promscrape: code prettifying after 9bd9f67718 2020-10-12 16:13:59 +03:00
Nikolay Khramchikhin
7f96712b38 Adds dockerswarm sd (#818)
* adds dockerswarm service discovery

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/656

 Following roles supported: services, tasks and nodes.
 Basic, token and tls auth supported.
 Added tests for labels generation.

* added unix socket support to discovery utils

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-10-12 16:13:58 +03:00
Aliaksandr Valialkin
2d03d0e2dd app/vmselect/promql: keep metric name after applying more functions, which don't change time series meaning
Functions are:

* keep_last_value
* keep_next_value
* interpolate
* running_min
* running_max
* running_avg
* range_min
* range_max
* range_avg
* range_first
* range_last
* range_quantile
* smooth_exponential

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/674
2020-10-12 11:48:38 +03:00
Aliaksandr Valialkin
3881c84afe Revert "app/vmselect/promql: remove metric name after applying ceil, floor and round functions in order to be more consistent with Prometheus"
This reverts commit ac45082216.

Reason for revert: the previous behavior for VictoriaMetrics is easier to understand and use by users -
functions, which don't change the meaning of the time series shouldn't drop metric name.

Now the following functions do not drop metric names:

* ceil
* floor
* round

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/674
2020-10-12 11:48:38 +03:00
Aliaksandr Valialkin
79d70480b7 Revert "app/vmselect/promql: remove metric name after applying clamp_min and clamp_max functions in order to be consistent with Prometheus"
This reverts commit bb61a4769b.

Reason for revert: the previous behavior for VictoriaMetrics is easier to understand and use by users -
functions, which don't change the meaning of the time series shouldn't drop metric name.

Now the following functions do not drop metric name:

* clamp_min
* clamp_max

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/674
2020-10-12 11:48:38 +03:00
Aliaksandr Valialkin
8c37b63ea9 Revert "app/vmselect/promql: remove metric name from results of certain rollup functions in order to be consistent with Prometheus"
This reverts commit e5202a4eae.

Reason for revert: the previous behavior for VictoriaMetrics is easier to understand and use by users -
functions, which don't change the meaning of the time series shouldn't drop metric name.

Now the following functions do not drop metric name:

* max_over_time
* min_over_time
* avg_over_time
* quantile_over_time
* geomean_over_time
* mode_over_time
* holt_winters
* predict_linear

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/674
2020-10-12 11:48:38 +03:00
Aliaksandr Valialkin
de1c07b937 lib/backup: add MustStop() method for all remote filesystems 2020-10-09 15:32:13 +03:00
Aliaksandr Valialkin
bf6d523bef lib/backup/fslocal: add FS.MustStop() method for stopping bandwidth limiter 2020-10-09 15:11:55 +03:00
Aliaksandr Valialkin
9b7ce5d004 app/{vminsert,vmagent}: take into account all the inserted rows before relabeling in vm_rows_inserted_total and vmagent_rows_inserted_total metrics 2020-10-09 13:38:49 +03:00
Aliaksandr Valialkin
d2e917d1cb app/vmstorage: add vm_rows_added_to_storage_total metric, which shows the total number of rows added to storage since app start 2020-10-09 13:36:17 +03:00
Aliaksandr Valialkin
4b1c401790 app/vmalert: accept days, weeks and years in for: part of config like Prometheus does
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/817
2020-10-08 20:13:20 +03:00
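As a rough sketch of what accepting days, weeks and years can look like, the Go function below extends `time.ParseDuration` with `d`, `w` and `y` suffixes. The name `parseExtendedDuration` is hypothetical and this is not the vmalert code. It assumes the Prometheus conventions of 1d = 24h, 1w = 7d and 1y = 365d, accepts only a single integer value with one of these suffixes, and does not check for overflow.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// parseExtendedDuration parses durations such as "2d", "1w" or "1y" and
// falls back to time.ParseDuration for the units Go already understands.
func parseExtendedDuration(s string) (time.Duration, error) {
	// Assumed unit sizes: 1d = 24h, 1w = 7d, 1y = 365d.
	units := []struct {
		suffix string
		size   time.Duration
	}{
		{"d", 24 * time.Hour},
		{"w", 7 * 24 * time.Hour},
		{"y", 365 * 24 * time.Hour},
	}
	for _, u := range units {
		if !strings.HasSuffix(s, u.suffix) {
			continue
		}
		n, err := strconv.ParseInt(strings.TrimSuffix(s, u.suffix), 10, 64)
		if err != nil {
			return 0, fmt.Errorf("cannot parse duration %q: %w", s, err)
		}
		return time.Duration(n) * u.size, nil
	}
	// None of Go's own unit suffixes (ns, us, ms, s, m, h) end in d, w or y,
	// so anything else can be handed to the standard parser unchanged.
	return time.ParseDuration(s)
}
```

With this sketch, `parseExtendedDuration("2d")` yields 48 hours, while a value such as `90m` is still handled by the standard library.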
Aliaksandr Valialkin
35b8ffaa17 docs/vmagent.md: clarify -promscrape.suppressDuplicateScrapeTargetErrors command-line flag usage 2020-10-08 19:24:05 +03:00
Aliaksandr Valialkin
0d44e371f3 lib/promscrape: add -promscrape.suppressDuplicateScrapeTargetErrors command-line flag in order to suppress duplicate scrape target errors
Also show the original labels for duplicate targets in the error message in order to simplify debugging.

Now `/targets` endpoint accepts optional `show_original_labels=1` query arg, which shows original labels for each target.
This may simplify debugging for target relabeling.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651
2020-10-08 18:59:25 +03:00
Aliaksandr Valialkin
f9f8e4a39c app/vmalert: do not print description for all the flags on config errors
The description is too long for a human to read and it just distracts from the error.
2020-10-08 13:35:46 +03:00
Aliaksandr Valialkin
f6ee6efc34 app/vmselect/promql: add missing label filters to binary operands before query execution
This implements the optimization described at https://utcc.utoronto.ca/~cks/space/blog/sysadmin/PrometheusLabelNonOptimization

See also https://github.com/cortexproject/cortex/issues/3253
2020-10-07 21:17:11 +03:00
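The idea behind this optimization can be sketched in Go as follows. This is a simplified illustration, not the code from the commit: the function name `pushDownFilters` is hypothetical, filters are modeled as plain equality matchers in a map, and the sketch assumes a binary operation where both sides must match on all labels (no `on()` or `ignoring()` modifiers, and not a set operation such as `or`). The metric name is skipped because binary operations do not match on it.

```go
package main

// pushDownFilters copies label filters missing on one operand from the other
// operand, so that each side selects fewer time series before the join.
// For example, foo{job="a"} + bar{instance="x"} becomes
// foo{job="a",instance="x"} + bar{instance="x",job="a"}.
func pushDownFilters(left, right map[string]string) (map[string]string, map[string]string) {
	merge := func(own, other map[string]string) map[string]string {
		// Build a new map, so the inputs stay unmodified.
		out := make(map[string]string, len(own)+len(other))
		for k, v := range own {
			out[k] = v
		}
		for k, v := range other {
			if k == "__name__" {
				// The metric name is not part of the matching labels.
				continue
			}
			if _, ok := out[k]; !ok {
				out[k] = v
			}
		}
		return out
	}
	return merge(left, right), merge(right, left)
}
```

When both operands already filter on the same label, each keeps its own filter; if the values differ, the original query matches nothing and so does the rewritten one.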
Dmitry Shihovtsev
aec863e70b Fix typos in the vmalert datasource (#814)
* Fix typos in the vmalert datasource

* Fix typo in the vmalert datasource test
2020-10-07 18:00:29 +03:00