Commit graph

670 commits

Author SHA1 Message Date
Aliaksandr Valialkin
6f9c1bc078 app/vmagent: log unsuccessful attempt number when sending data to -remoteWrite.url 2020-08-30 21:40:15 +03:00
Aliaksandr Valialkin
3b1ecac04b app/vmagent: apply sane limits to -remoteWrite.queues
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/707
2020-08-30 21:25:51 +03:00
Roman Khavronenko
08b76cb26f vmalert: update -rule flag description to enforce quotes using (#709)
Description for `-rule` flag uses as example specific chars like asterisks
which could be interpreted wrong by different shells. To avoid this, description
now contains quoted flag values.

See also #708
2020-08-28 09:46:35 +03:00
Roman Khavronenko
4b89da9463 lib/decimal: rename significant decimal digits to significant figures (#698)
The previous notion was inconsistent with what `decimal.Round` does.
According to [wiki](https://en.wikipedia.org/wiki/Significant_figures) rounding
applied to all significant figures, not just decimal ones.
2020-08-16 17:22:40 +03:00
Aliaksandr Valialkin
6aab2f4989 all: allow using KB, MB, GB, KiB, MiB and GiB suffixes in command-line flag values related to byte sizes or byte rates 2020-08-16 17:08:28 +03:00
Aliaksandr Valialkin
285665e93b app/vmselect/promql: allow passing multiple args to aggregate functions such as avg(q1, q2, q3) 2020-08-15 01:15:16 +03:00
Aliaksandr Valialkin
a2021d0dde docs/vmagent.md: mention that gaps in remote storage may appear if vmagent cannot keep up with data ingestion 2020-08-14 20:48:17 +03:00
Aliaksandr Valialkin
e7c0b2ca56 docs: update docs 2020-08-14 19:14:46 +03:00
Aliaksandr Valialkin
b996280c65 app/{vminsert,vmagent}: improve documentation for -influxListenAddr command-line flag 2020-08-14 18:03:08 +03:00
Aliaksandr Valialkin
60c7397be5 all: support %{ENV_VAR} placeholders in yaml configs in all the vm* components
Such placeholders are substituted by the corresponding environment variable values.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/583
2020-08-13 17:17:06 +03:00
Aliaksandr Valialkin
6721e47ae9 app: respect CPU limits set via cgroups
Update GOMAXPROCS to limits set via cgroups. This should reduce CPU trashing and reduce memory usage
for cases when VictoriaMetrics components run in containers with CPU limits.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685
2020-08-11 23:01:03 +03:00
Aliaksandr Valialkin
62b6e54622 app/vmselect: reduce memory usage when exporting time series with big number of samples via /api/v1/export if max_rows_per_line is set to non-zero value
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685
2020-08-10 20:57:43 +03:00
Aliaksandr Valialkin
c9f5c5623f app/vmselect/netstorage: vary batch size for data unpacking depending on the available CPU cores
This should reduce contention on the channel with unpack work for systems with high number of CPU cores
2020-08-10 15:16:48 +03:00
Aliaksandr Valialkin
b3d4ff7ee2 app/vmstorage: improve error logging when the request times out 2020-08-10 13:17:24 +03:00
Roman Khavronenko
78afc61896 app/vmalert: extend metrics set exported by vmalert #573 (#654)
* app/vmalert: extend metrics set exported by `vmalert` #573

New metrics were added to improve observability:
+ vmalert_alerts_pending{alertname, group} - number of pending alerts per group
per alert;
+ vmalert_alerts_acitve{alertname, group} - number of active alerts per group
per alert;
+ vmalert_alerts_error{alertname, group} - is 1 if alertname ended up with error
during prev execution, is 0 if no errors happened;
+ vmalert_recording_rules_error{recording, group} - is 1 if recording rule
 ended up with error during prev execution, is 0 if no errors happened;
* vmalert_iteration_total{group, file} - now contains group and file name labels.
This should improve control over specific groups;
* vmalert_iteration_duration_seconds{group, file} - now contains group and file name labels. This should improve control over specific groups;

Some collisions for alerts and recording rules are possible, because neither
group name nor alert/recording rule name are unique for compatibility reasons.

Commit contains list of TODOs for Unregistering metrics since groups and rules
are ephemeral and could be removed without application restart. In order to
unlock Unregistering feature corresponding PR was filed - https://github.com/VictoriaMetrics/metrics/pull/13

* app/vmalert: extend metrics set exported by `vmalert` #573

The changes are following:
* add an ID label to rules metrics, since `name` collisions within one group is
a common case - see the k8s example alerts;
* supports metrics unregistering on rule updates. Consider the case when one rule
was added or removed from the group, or the whole group was added or removed.

The change depends on https://github.com/VictoriaMetrics/metrics/pull/16
where race condition for Unregister method was fixed.
2020-08-09 09:42:05 +03:00
ofen
3fea7c39be 401 Unauthorize HTTP error added (#681)
401 Unauthorize HTTP error added to trigger browser credentials pop-up promt [RFC 7235 https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication]
2020-08-09 09:39:37 +03:00
Aliaksandr Valialkin
67cacb22ac lib/httpserver: add -tls, -tlsCertFile and -tlsKeyFile command-line flags in every vm binary
This makes such binaries compatible with binaries from `master` branch (aka single-node version)

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/677
2020-08-07 10:57:32 +03:00
Aliaksandr Valialkin
95a8c492ef app/vmselect/promql: properly handle -n^m like Prometheus does
`-n^m` must be handled as `-(n^m)` instead of `(-n)^m`.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/675
2020-08-07 07:42:42 +03:00
Aliaksandr Valialkin
7f93d61a56 app/vmselect/promql: remove metric name after applying clamp_min and clamp_max functions in order to be consistent with Prometheus
This improves VictoriaMetrics score at https://promlabs.com/promql-compliance-test-results-victoriametrics/
2020-08-06 23:42:55 +03:00
Aliaksandr Valialkin
01000505a0 app/vmselect/promql: remove metric name after applying ceil, floor and round functions in order to be more consistent with Prometheus
This improves VictoriaMetrics score at https://promlabs.com/promql-compliance-test-results-victoriametrics/
2020-08-06 23:34:03 +03:00
Aliaksandr Valialkin
75bff1a567 app/vmselect/promql: remove metric name from results of certain rollup functions in order to be consistent with Prometheus
Rollup functions:

  - avg_over_time
  - min_over_time
  - max_over_time
  - quantile_over_time

This improves VictoriaMetrics results at https://promlabs.com/promql-compliance-test-results-victoriametrics/
2020-08-06 23:29:18 +03:00
Aliaksandr Valialkin
8835004a4c app/vmselect: properly handle PromQL queries like scalar1 < metric < scalar2 like Prometheus does
This fixes some cases from https://promlabs.com/promql-compliance-test-results-victoriametrics/
2020-08-06 23:21:14 +03:00
Aliaksandr Valialkin
14ddb8a34e app/vmselect/netstorage: reduce CPU contention when upacking time series blocks by unpacking batches of such blocks instead of a single block
This should improve query performance on systems with big number of CPU cores (16 and more)
2020-08-06 17:50:13 +03:00
Aliaksandr Valialkin
46c98cd97a app/vmselect/netstorage: reduce contention on unpackworkCh and timeseriesWorkCh for multi-CPU system by providing more capacity for these chans 2020-08-06 17:22:39 +03:00
Aliaksandr Valialkin
a455930ab4 app/vmstorage: rename vm_cache_size_entries{type="storage/prefetchedMetricIDs"} to vm_cache_entries{type="storage/prefetchedMetricIDs"} to be consistent with other vm_cache_entries metrics 2020-08-06 16:34:18 +03:00
Aliaksandr Valialkin
a3e91c593b lib/storage: limit the number of concurrent calls to storage.searchTSIDs to GOMAXPROCS*2
This should limit the maximum memory usage and reduce CPU trashing on vmstorage
when multiple heavy queries are executed.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2020-08-05 18:27:21 +03:00
Aliaksandr Valialkin
a930460236 app/vmagent: tune http client for sending data to remote storage in order to disable closing keep-alive connections
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/663
2020-08-04 21:01:40 +03:00
Aliaksandr Valialkin
a04f4a3d9a app/vmselect: use warning level instead of info level for logging slow queries that take longer than -search.logSlowQueryDuration 2020-08-04 20:24:38 +03:00
Aliaksandr Valialkin
bdb881c43b app/vmselect/promql: add zscore-related functions: zscore_over_time(m[d]) and zscore(q) by (...) 2020-08-03 21:52:15 +03:00
Aliaksandr Valialkin
94471a1273 app: remove duplicate *-pure makefile rules 2020-07-31 20:01:30 +03:00
Aliaksandr Valialkin
a2aa3a60eb app/vmselect: show X-Forwarded-For contents on /api/v1/status/active_queries page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/659
2020-07-31 20:01:09 +03:00
Aliaksandr Valialkin
106e302d7a all: add mssing APP_NAME to vm*-GOARCH builds 2020-07-31 13:45:32 +03:00
Aliaksandr Valialkin
945645f38f docs/{vmagent,vmalert}: add instruction on how to build for ARM 2020-07-31 09:25:41 +03:00
Aliaksandr Valialkin
0c00fe70cf app/vmselect: do not adjust start and end query args passed to /api/v1/query_range when -search.disableCache command-line flag is set
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/563
2020-07-30 23:14:56 +03:00
Aliaksandr Valialkin
29bbab0ec9 lib/storage: remove prioritizing of merging small parts over merging big parts, since it doesn't work as expected
The prioritizing could lead to big merge starvation, which could end up in too big number of parts that must be merged into big parts.

Multiple big merges may be initiated after the migration from v1.39.0 or v1.39.1. It is OK - these merges should be finished soon,
which should return CPU and disk IO usage to normal levels.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618
2020-07-30 20:02:22 +03:00
Aliaksandr Valialkin
338ee47d60 app/vmselect/promql: return non-empty value from rate_over_sum(m[d]) even if a single data point is located in the given [d] window
Just divide the data point value by the window duration in this case.
2020-07-29 12:37:34 +03:00
Aliaksandr Valialkin
717c554fb0 app/vmselect/promql: remove rollupFuncArg.realPrevValue handling, since the corner case in increase() is handled in another way now
See e00cfc854d for the approach used now.
2020-07-29 12:37:34 +03:00
Aliaksandr Valialkin
d9037b3970 app/vmselect/promql: fill gaps with 0 in rate_over_sum response when the last value before the selected time window isnt empty 2020-07-29 12:37:34 +03:00
Aliaksandr Valialkin
f6d4275087 app/{vmagent,vminsert}: properly preserve db tag from query string passed to Influx line protocol query
Previously `db` tag from the query string wasn't added to metrics after encountering `db` tag in the Influx line

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/653
2020-07-28 21:25:49 +03:00
Aliaksandr Valialkin
baebe86844 app/vmagent/remotewrite: add missing resp.Body.Close() after pushing data to remote storage
Missing body close could disable HTTP keep-alive connections.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/653
2020-07-28 21:00:25 +03:00
Aliaksandr Valialkin
0f6f0d30d3 app/vmselect: show query origin (aka remote_addr or client address) on the /api/v1/status/active_queries page for every query 2020-07-28 15:14:40 +03:00
Roman Khavronenko
ec6ed467c6 app/vmalert: support external.label to specify global labelset for all rules #622 (#652)
`external.label` flag supposed to help to distinguish alert or recording rules
source in situations when more than one `vmalert` runs for the same datasource
or AlertManager.
2020-07-28 14:23:04 +03:00
Aliaksandr Valialkin
9dccedc599 app/vmselect/promql: return empty values from group() if all the time series have no values at the given timestamp
This aligns `group()` behaviour to Prometheus
2020-07-28 13:41:04 +03:00
Aliaksandr Valialkin
d5057f6d04 app/vmagent/remotewrite: create new request on failure to send a block of data to remote storage
Previously the request body was already consumed before the retry, so this led to the following error:

    http: ContentLength=... with Body length 0
2020-07-27 17:33:05 +03:00
Aliaksandr Valialkin
b191e425b3 app/vmselect/promql: improve further the accuracy of buckets_limit() function
The accuracy is increased by mergin the smallest bucket with the smallest adjacent bucket.
2020-07-26 12:10:56 +03:00
Aliaksandr Valialkin
43871e79c6 app/vmselect/promql: avoid dropping inf bucket in buckets_limit
The `le="inf"` bucket must be preserved in order to maintain the maximum level of accuracy.
2020-07-25 17:00:25 +03:00
Aliaksandr Valialkin
978c1e930e app/vmselect/promql: optimize buckets_limit(k, buckets) for big number of buckets 2020-07-25 13:24:33 +03:00
Aliaksandr Valialkin
51cbf27077 app/vmselect/promql: improve the accuracy of buckets_limit(k, buckets) function
Now it properly merges the bucket with the previous bucket after deletion.
2020-07-24 17:07:30 +03:00
Aliaksandr Valialkin
cf69b1ea6f app/vmselect/promql: add buckets_limit(k, buckets) function, which limits the number of buckets per time series to k
This function works with both Prometheus-style and VictoriaMetrics-style buckets.
The function removes buckets with the lowest values in order to reserve the highest precision.
The function is useful for building heatmaps in Grafana from too big number of buckets.
2020-07-24 16:14:12 +03:00
Aliaksandr Valialkin
45334f61de app/vmselect: fix tests for rate_over_sum 2020-07-24 02:35:09 +03:00
Aliaksandr Valialkin
3526e8768a app/vmselect/promql: typo fix after 3e557c9861 2020-07-24 02:15:23 +03:00
Aliaksandr Valialkin
8d1721d128 app/vmselect/promql: add rate_over_sum(m[d]) function to MetricsQL, which returns rate over sum of m values over d duration
Something like `sum_over_time(m[d]) / d`, but more accurate.
2020-07-24 01:17:15 +03:00
Aliaksandr Valialkin
88e8bed0c9 app/vmselect/promql: allow setting [d] window smaller than the interval between raw points for avg_over_time
This makes `avg_over_time` behavior consistent with `sum_over_time` and `count_over_time` behaviors.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/636
2020-07-23 22:25:33 +03:00
Aliaksandr Valialkin
fb3d1380ac lib/storage: respect -search.maxQueryDuration when searching for time series in inverted index
Previously the time spent on inverted index search could exceed the configured `-search.maxQueryDuration`.
This commit stops searching in inverted index on query timeout.
2020-07-23 21:22:05 +03:00
Aliaksandr Valialkin
dbf3038637 lib/storage: add more fine-grained pace limiting for search 2020-07-23 19:21:49 +03:00
Aliaksandr Valialkin
16a4b1b20c app/vmselect/netstorage: protect from too smart compiler, which may break memory usage optimization in tmpBlocksFileWrapper.WriteBlocks 2020-07-23 17:57:24 +03:00
Aliaksandr Valialkin
0750d2cec1 app/vminsert: export vm_relabel_metrics_dropped_total metric that shows the number of metrics dropped due to relabeling 2020-07-23 14:58:02 +03:00
Aliaksandr Valialkin
55ed07add1 app/vmselect: typo fix after 0168e21fe32776e2f7f003f88e0e6e490eb2dcb0g 2020-07-23 14:11:15 +03:00
Aliaksandr Valialkin
7aa5b48508 app/vmselect: reduce memory usage when querying big number of time series with long labels
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/646
2020-07-23 13:48:58 +03:00
Aliaksandr Valialkin
49a0011837 app/vminsert: do not call ApplyRelabeling function if relabeling is disabled
This should reduce CPU usage a bit when `-relabelConfig` isn't set
2020-07-23 13:35:36 +03:00
Aliaksandr Valialkin
c91ccce50c app/vminsert: fix relabeling for metrics ingested via Influx line protocol
Previously the enabled relabeling with `-relabelConfig` command-line flag could result in missing labels
if a single Influx line protocol message contains multiple field values.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/638
2020-07-23 13:25:37 +03:00
Aliaksandr Valialkin
b8303afcd8 lib/storage: improve prioritizing of data ingestion over querying
Prioritize also small merges over big merges.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2020-07-23 01:40:38 +03:00
Aliaksandr Valialkin
20d0c41ac5 app/vmselect/prometheus: support d, w and y suffixes for durations passed to step in /api/v1/query_range like Prometheus does
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/641
2020-07-22 16:27:27 +03:00
Aliaksandr Valialkin
bd4299fafe app/vmselect/netstorage: reduce memory allocations when unpacking time series data by using a pool for unpackWork entries
This should slightly reduce load on GC when processing queries that touch big number of time series.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/646 according to the provided memory profile
2020-07-22 15:04:42 +03:00
Aliaksandr Valialkin
a3f48e395e app/vmagent: add -remoteWrite.decimalPlaces command-line flag, which may be used for reducing disk space usage on the remote storage 2020-07-21 21:55:42 +03:00
Aliaksandr Valialkin
5bb4fe1ba4 app/vmselect: take into account the time spent in wait queue before query execution as time spent on the query 2020-07-21 19:00:00 +03:00
Aliaksandr Valialkin
0755cb3b50 app/vmselect/promql: skip the first value in time series passed to increase() if it exceeds by more than 10x the delta between the next value and the first value
This should prvent from inflated `increase()` results for time series that start from big initial values.
Such cases may occur when a label value changes in a metric without counter reset.
2020-07-21 17:24:28 +03:00
Aliaksandr Valialkin
71eba8dcf5 app/vmselect: log the total available memory for concurrent requests on not enough memory errors
This should simplify root cause analysis
2020-07-20 19:51:58 +03:00
Aliaksandr Valialkin
3b246aa569 app/vmagent: add -remoteWrite.proxyURL command-line option
This option allows writing data to `-remoteWrite.url` via http, https or socks5 proxy.
This is similar to `proxy_url` option in `remote_write` section of Prometheus.
See https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write
2020-07-20 19:31:08 +03:00
Aliaksandr Valialkin
8bee3ef91b docs/vmagent.md: sync with app/vmagent/README.md 2020-07-20 17:09:30 +03:00
Roman Khavronenko
8949ec961d app/vmagent: mention grafana dashboard in README (#639) 2020-07-20 17:09:27 +03:00
Aliaksandr Valialkin
86b54f3768 app/vmagent/remotewrite: allow passing empty -remoteWrite.urlRelabelConfig entries 2020-07-20 15:49:13 +03:00
Aliaksandr Valialkin
141e84b5a4 app/vmselect/prometheus: do not return time series with empty list of datapoints from /api/v1/query_range
This matches Prometheus behaviour.

This should fix https://github.com/jacksontj/promxy/issues/329
2020-07-20 15:30:13 +03:00
Aliaksandr Valialkin
4d2011a87d app/vmselect/promql: add mode() aggregate function 2020-07-20 15:30:11 +03:00
Aliaksandr Valialkin
31ef39e8da lib/httpserver: log remote address in error message from httpserver.Errorf
This should improve detection of the root cause of errors.
Thanks to Anant for the idea.
2020-07-20 14:06:29 +03:00
Aliaksandr Valialkin
427fa43ce2 app/vmselect/promql: add mode_over_time(m[d]) function
See https://en.wikipedia.org/wiki/Mode_(statistics) and https://stackoverflow.com/questions/61134078/promql-query-to-return-the-value-from-a-range-vector-which-occurs-maximum-no-of
2020-07-17 18:29:10 +03:00
Aliaksandr Valialkin
eb402a17bd app/vmselect/promql: optimize group(rollup(m)) calculations 2020-07-17 16:47:30 +03:00
Aliaksandr Valialkin
ea8dc85ba8 app/vmselect/promql: check that any() doesn't touch metric name 2020-07-17 16:23:11 +03:00
Aliaksandr Valialkin
fc8fe38a82 app/vmselect/promql: add group() aggregate function to MetricsQL
This function has been added in Prometheus 2.20. See https://github.com/prometheus/prometheus/pull/7480
2020-07-17 15:17:38 +03:00
Aliaksandr Valialkin
c64914a7e4 app/vmselect/promql: keep all labels for time series from any() call 2020-07-17 15:17:37 +03:00
Aliaksandr Valialkin
f9b38f7f2d app/vminsert/influx: properly handle the case when certain labels with empty values are removed by ApplyRelabeling() call
Previously this could lead to `out of range` panic
2020-07-17 00:05:24 +03:00
Aliaksandr Valialkin
14dc426b45 app/vmselect: fix nil pointer dereference panic when unsuccessfully querying vmstorage 2020-07-16 19:15:18 +03:00
Aliaksandr Valialkin
ce381b3868 app/vmalert: consistently use "%w" instead of "%s" in fmt.Errorf when wrapping errors 2020-07-15 13:55:13 +03:00
Aliaksandr Valialkin
e6d96bb0bd docs/vmagent.md: make filtering rules for init container pods less confusing 2020-07-14 20:33:19 +03:00
Aliaksandr Valialkin
c2b4b9138d app/vmagent/remotewrite: return proper value from tssRelabelPool.New
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/599
2020-07-14 14:28:14 +03:00
Aliaksandr Valialkin
86044f6561 app/{vminsert,vmagent}: add -influxSkipMeasurement command-line flag for using field name as metric name
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/626
2020-07-14 14:18:40 +03:00
Aliaksandr Valialkin
0e7b2008b2 app/vmselect/prometheus: do not adjust last points in time series with timestamps exceeding the current time
Such timestamps usually mean that the query contains `offset`.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/625
2020-07-14 12:56:21 +03:00
Aliaksandr Valialkin
3898cc0285 app/vmselect/prometheus: minimize the diff for the change 1033dc7e2a over 619b0a25c9 2020-07-13 21:41:17 +03:00
faceair
bf39e67ade fix empty response template (#617) 2020-07-13 21:41:15 +03:00
Aliaksandr Valialkin
b6a5c29549 docs/vmagent.md: sync with app/vmagent/README.md 2020-07-13 21:26:00 +03:00
ofen
9ffa688846 Update README.md (#621)
Troubleshooting section updated to help out with duplicate targets detection
2020-07-13 21:25:59 +03:00
Aliaksandr Valialkin
4353ff7ef1 app/vmagent: fix data race when multiple -remoteWrite.urlRelabelConfig options are set
Previously multiple goroutines could access remoteWriteCtx.tss concurrently, which could lead to data race
and improper relabeling. Now each goroutine has its own copy of tss during relabeling.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/599
2020-07-10 15:17:23 +03:00
Aliaksandr Valialkin
805a90f642 app/vmagent/remotewrite: typo fix in -remoteWrite.showURL help message 2020-07-10 14:07:14 +03:00
Aliaksandr Valialkin
6373d377ef app/{vminsert,vmagent}: add ability to import data in Prometheus exposition format via /api/v1/import/prometheus 2020-07-10 12:13:28 +03:00
Aliaksandr Valialkin
d449d0a0e1 app/vmselect/promql: add missing tests for ifnot binary operation 2020-07-09 13:24:12 +03:00
Aliaksandr Valialkin
7e706eea13 app/vmselect/promql: refactor implementations for and and unless binary operations, so they are closer to or implementation 2020-07-09 13:06:01 +03:00
Aliaksandr Valialkin
6c1a47b5e0 app/vmselect/promql/active_queries.go: simplify code a bit by inlining getNextActiveQueryID function 2020-07-09 11:18:53 +03:00
Aliaksandr Valialkin
fb86071552 app/vmselect: add /api/v1/status/active_queries page with the list of currently running queries
This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/575

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/528
2020-07-08 19:09:31 +03:00
DexterZhang
9930ce1fa9
Feat/query list vmselect (#575)
* feat(vmselect): add support for listing current running queries and canceling specific query

* fix(vmselect): change current queries' pid from int64 counter to uuid

* feat(vmselect): add auth to internal operations like `/resetRollupResultCache`, `/query/list` and `/query/kill`. add flag `internalAuthKey` for these auth

* fix(vmselect): add more info to current queries

* review: delete some unnecessary code and use function instead of init

* review: returen *queriesMap in newQueriesMap

* review: delete unused var in struct queriesMap, add comments to exported functions

* review: add return if error occurs

* feat(vmselect): truncate query string in current running query list API since the size of query string might be large;
                use query string's pointer in struct `query` for the same reason;
		add query info API to get full access of query's info;
2020-07-08 19:04:29 +03:00
Aliaksandr Valialkin
0bff96fe4b lib/storage: prioritize data ingestion over heavy queries
Heavy queries could result in the lack of CPU resources for processing the current data ingestion stream.
Prevent this by delaying queries' execution until free resources are available for data ingestion.

Expose `vm_search_delays_total` metric, which may be used in for alerting when there is no enough CPU resources
for data ingestion and/or for executing heavy queries.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
2020-07-05 19:44:04 +03:00