Aliaksandr Valialkin
86044f6561
app/{vminsert,vmagent}: add -influxSkipMeasurement
command-line flag for using field name as metric name
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/626
2020-07-14 14:18:40 +03:00
Aliaksandr Valialkin
0e7b2008b2
app/vmselect/prometheus: do not adjust last points in time series with timestamps exceeding the current time
...
Such timestamps usually mean that the query contains `offset`.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/625
2020-07-14 12:56:21 +03:00
Aliaksandr Valialkin
3898cc0285
app/vmselect/prometheus: minimize the diff for the change 1033dc7e2a
over 619b0a25c9
2020-07-13 21:41:17 +03:00
faceair
bf39e67ade
fix empty response template ( #617 )
2020-07-13 21:41:15 +03:00
Aliaksandr Valialkin
b6a5c29549
docs/vmagent.md: sync with app/vmagent/README.md
2020-07-13 21:26:00 +03:00
ofen
9ffa688846
Update README.md ( #621 )
...
Troubleshooting section updated to help out with duplicate targets detection
2020-07-13 21:25:59 +03:00
Aliaksandr Valialkin
4353ff7ef1
app/vmagent: fix data race when multiple -remoteWrite.urlRelabelConfig
options are set
...
Previously multiple goroutines could access remoteWriteCtx.tss concurrently, which could lead to data race
and improper relabeling. Now each goroutine has its own copy of tss during relabeling.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/599
2020-07-10 15:17:23 +03:00
Aliaksandr Valialkin
805a90f642
app/vmagent/remotewrite: typo fix in -remoteWrite.showURL
help message
2020-07-10 14:07:14 +03:00
Aliaksandr Valialkin
6373d377ef
app/{vminsert,vmagent}: add ability to import data in Prometheus exposition format via /api/v1/import/prometheus
2020-07-10 12:13:28 +03:00
Aliaksandr Valialkin
d449d0a0e1
app/vmselect/promql: add missing tests for ifnot
binary operation
2020-07-09 13:24:12 +03:00
Aliaksandr Valialkin
7e706eea13
app/vmselect/promql: refactor implementations for and
and unless
binary operations, so they are closer to or
implementation
2020-07-09 13:06:01 +03:00
Aliaksandr Valialkin
6c1a47b5e0
app/vmselect/promql/active_queries.go: simplify code a bit by inlining getNextActiveQueryID function
2020-07-09 11:18:53 +03:00
Aliaksandr Valialkin
fb86071552
app/vmselect: add /api/v1/status/active_queries
page with the list of currently running queries
...
This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/575
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/528
2020-07-08 19:09:31 +03:00
DexterZhang
9930ce1fa9
Feat/query list vmselect ( #575 )
...
* feat(vmselect): add support for listing current running queries and canceling specific query
* fix(vmselect): change current queries' pid from int64 counter to uuid
* feat(vmselect): add auth to internal operations like `/resetRollupResultCache`, `/query/list` and `/query/kill`. add flag `internalAuthKey` for these auth
* fix(vmselect): add more info to current queries
* review: delete some unnecessary code and use function instead of init
* review: returen *queriesMap in newQueriesMap
* review: delete unused var in struct queriesMap, add comments to exported functions
* review: add return if error occurs
* feat(vmselect): truncate query string in current running query list API since the size of query string might be large;
use query string's pointer in struct `query` for the same reason;
add query info API to get full access of query's info;
2020-07-08 19:04:29 +03:00
Aliaksandr Valialkin
0bff96fe4b
lib/storage: prioritize data ingestion over heavy queries
...
Heavy queries could result in the lack of CPU resources for processing the current data ingestion stream.
Prevent this by delaying queries' execution until free resources are available for data ingestion.
Expose `vm_search_delays_total` metric, which may be used in for alerting when there is no enough CPU resources
for data ingestion and/or for executing heavy queries.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
2020-07-05 19:44:04 +03:00
Roman Khavronenko
9afd19d375
app/vmalert: add retries to remotewrite ( #605 )
...
* app/vmalert: add retries to remotewrite
Remotewrite pkg now does limited number of retries if write request failed.
This suppose to make vmalert state persisting more reliable.
New metrics were added to remotewrite in order to track rows/bytes sent/dropped.
defaultFlushInterval was increased from 1s to 5s for sanity reasons.
* fix
* wip
* wip
* wip
* fix bits alignment bug for 32-bit systems
* fix mistakenly dropped field
2020-07-05 18:47:38 +03:00
Aliaksandr Valialkin
82871fb7a5
app/vmselect/prometheus: small fixes on top of 8bb762124a
2020-07-05 18:17:53 +03:00
faceair
17f175ff5a
fix adjust last points avoid influence earlier value ( #606 )
2020-07-05 18:17:52 +03:00
Ween
d28fb0baf9
[VMAlert] Fix error log when remoteWrite queue size is full ( #602 )
...
* Fix Auto metrics relabeled errors
* Finalize auto-genenated Labels
* Fix Test Errors
* fix error logs when queue is full
Co-authored-by: xinyulong <xinyulong@kuaishou.com>
2020-07-03 16:50:43 +03:00
Aliaksandr Valialkin
8bb3622e9d
app/vminsert: prevent from adding and/or selecting labels with empty values
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/600
2020-07-02 23:17:12 +03:00
Aliaksandr Valialkin
6ebac3ab63
app/vminsert: add ability to apply relabeling to all the incoming metrics if -relabelConfig
command-line arg points to a file with a list of relabel_config
entries
...
See https://victoriametrics.github.io/#relabeling
2020-07-02 20:36:33 +03:00
Aliaksandr Valialkin
a45856570b
all: typo fix: exptected -> expected
2020-07-02 18:06:21 +03:00
Aliaksandr Valialkin
f10e8809c0
app/vmselect: add interpolate
function for filling gaps with linearly interpolated values
...
See https://stackoverflow.com/q/62565021/274937 for details
2020-07-02 14:54:46 +03:00
Aliaksandr Valialkin
2361ad8ab4
lib/promscrape: add ability to set disable_compression
and disable_keepalive
options in scrape_config
section of the config passed to -promscrape.config
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580
2020-07-02 14:19:34 +03:00
BigFish
aa26b94f33
fix: spelling mistakes ( #594 )
...
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-07-01 01:36:40 +03:00
Aliaksandr Valialkin
4cb3e7595c
app/vmstorage: add -denyQueriesOutsideRetention
command-line flag for denying queries outside the configured retention
2020-07-01 00:58:42 +03:00
Aliaksandr Valialkin
0c4e8aeb2b
all: use errors.As
for inspecting errors that implement httpserver.ErrorWithStatusCode
2020-07-01 00:03:11 +03:00
Aliaksandr Valialkin
d962568e93
all: use %w instead of %s for wrapping errors in fmt.Errorf
...
This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode .
See https://blog.golang.org/go1.13-errors for details.
2020-06-30 23:33:46 +03:00
Roman Khavronenko
156c83d112
app/vmalert: support multiple notifier urls ( #584 ) ( #590 )
...
* app/vmalert: support multiple notifier urls (#584 )
User now can set multiple notifier URLs in the same fashion
as for other vmutils (e.g. vmagent). The same is correct for
TLS setting for every configured URL. Alerts sending is done
in sequential way for respecting the specified URLs order.
* app/vmalert: add basicAuth support for notifier client (#585 )
The change adds possibility to set basicAuth creds for notifier
client in the same fasion as for remote write/read and datasource.
2020-06-29 22:21:56 +03:00
Roman Khavronenko
bbeab70de6
app/vmalert: move flags description and initialization into subpackages
...
The change adds no new functionality and aims to move flags definitions
to subpackages that are using them. This should improve readability
of the main function.
2020-06-29 22:18:29 +03:00
kreedom
63c36e2e69
app/vmalert: properly set transport for HTTP clients
...
Fixes issue #586
2020-06-29 22:18:25 +03:00
Aliaksandr Valialkin
2b504f17de
docs: update the info that docker images are built on top of alpine
image now
...
A follow-up after the commit ff624c9125
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/522
2020-06-26 13:52:25 +03:00
Aliaksandr Valialkin
a586b8b6d4
app/vminsert/netstorage: do not re-route every time series to more than two vmstorage nodes when certain vmstorage nodes are temporarily slower than the rest of them
...
Previously vminsert may spread data for a single time series across all the available vmstorage nodes
when vmstorage nodes couldn't handle the given ingestion rate. This could lead to increased usage
of CPU and memory on every vmstorage node, since every vmstorage node had to register all the time
series seen in the cluster. Now a time series may spread to maximum two vmstorage nodes under heavy load.
Every time series is routed to a single vmstorage node under normal load.
2020-06-25 16:42:37 +03:00
Aliaksandr Valialkin
12b87b2088
app/vmselect/netstorage: reset big result values every 10 seconds instead of after processing every time series
...
This should reduce GC pressure when processing time series with big number of rows
2020-06-24 19:37:35 +03:00
nicbaz
46c5c0772c
vmselect: fix label_replace when mismatch ( #579 )
...
As per documentation on `label_replace` function: "If the regular
expression doesn't match then the timeseries is returned unchanged".
Currently this behavior is not enforced, if a regexp on an existing
tag doesn't match then the tag value is copied as-is in the destination
tag. This fix first checks that the regular expression matches the
source tag before applying anything.
Given the current implementation, this fix also changes the behavior
of the **MetricsQL** `label_transform` function which does not
document this behavior at the moment.
2020-06-23 23:54:29 +03:00
nicbaz
ea2ed4b7e8
vmalert: add support for TLS configuration ( #578 )
...
app/vmalert: add support for TLS configuration
Add support for TLS optional configuration in a similar fashion to what
is currently supported in other vmutils such as vmagent. TLS
configuration options are distinct for datasource, remoteRead,
remoteWrite as well as notifier.
2020-06-23 22:47:23 +03:00
Aliaksandr Valialkin
0fdbe5de25
app/vmselect/netstorage: increase concurrency when processing small number of time series with big number of data points per each time series
...
Previously VictoriaMetrics was processing up to 32 time series in a single goroutine.
This could be slow if each time series contains big number of data points (10M+ or more), since only a single CPU core could be loaded with work,
while other CPU cores were idle. Fix this by launching GOMAXPROCS workers for time series processing.
This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/572
2020-06-23 22:45:57 +03:00
Aliaksandr Valialkin
3a444bb7bb
lib/promrelabel: add support for keep_if_equal
and drop_if_equal
actions to relabel configs
...
These actions may be useful for filtering out unneeded targets and/or metrics if they contain equal label values.
For example, the following rule would leave the target only if __meta_kubernetes_annotation_prometheus_io_port
equals __meta_kubernetes_pod_container_port_number:
- action: keep_if_equal
source_labels: [__meta_kubernetes_annotation_prometheus_io_port, __meta_kubernetes_pod_container_port_number]
2020-06-23 17:29:19 +03:00
kreedom
f227799c87
Support of custom URL path for alert ( #560 )
...
app/vmalert: Support custom URL for alerts source
Add flag `external.alert.source` for configuring custom URL
for alert's source. This may be handy to re-point default source
URL to other systems like Grafana.
Updates #517
2020-06-21 16:33:58 +03:00
Aliaksandr Valialkin
70bf8218bb
app/vmselect/promql: properly override label values from group_left
and group_right
lists like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/577
2020-06-21 16:32:27 +03:00
Aliaksandr Valialkin
2fc2679a3f
app/vminsert/netstorage: remove possible race condition when broken connection may be recovered before acquiring storageNode.bcLock
2020-06-20 16:38:08 +03:00
Aliaksandr Valialkin
9409a31c07
docs/vmauth.md: mention that we can provide custom integration with SAML
2020-06-19 13:13:53 +03:00
Aliaksandr Valialkin
4400700832
app/vminsert: properly replicate data for the last RF-1
storage nodes for -replicationFactor=RF
...
Previously the data for the last `RF-1` storage noes has been incorrectly replicated to the first storage node.
2020-06-19 12:40:22 +03:00
Aliaksandr Valialkin
4f673a5201
app/vminsert: export metrics for determining ingested rows with dropped or truncated labels
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/565
2020-06-19 01:12:44 +03:00
Aliaksandr Valialkin
6939e36fdd
app/vmselect/promql: fill gaps on right side with values from left side of or
operator in the same way as Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/552
2020-06-18 23:05:23 +03:00
Aliaksandr Valialkin
85c1ccb8b8
app/vminsert/netstorage: add missing return
in storageNode.checkHealth on connection failure
2020-06-18 20:51:51 +03:00
Aliaksandr Valialkin
464682f380
app/vminsert/netstorage: periodically check for each -storageNode
health, so it could be marked as healthy when it is ready to accept data
...
This fixes uneven data routing in cluster version when `-replicationFactor` is set to 1 (default value),
i.e. when the replication is disabled.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/546
2020-06-18 20:42:43 +03:00
Roman Khavronenko
1a01fe2cf2
vmalert-537: allow name duplication for rules within one group. ( #559 )
...
Uniqueness of rule is now defined by combination of its name, expression and
labels. The hash of the combination is now used as rule ID and identifies rule within the group.
Set of rules from coreos/kube-prometheus was added for testing purposes to
verify compatibility. The check also showed that `vmalert` doesn't support
`query` template function that was mentioned as limitation in README.
2020-06-18 18:54:35 +03:00
Aliaksandr Valialkin
87151e825e
docs/vmbackup.md: mention that backups from single-node and cluster versions are incompatible
2020-06-18 18:54:34 +03:00
Aliaksandr Valialkin
cc2225cc49
app/vmselect: fix the error after 936f35920a
2020-06-12 22:00:45 +03:00
Aliaksandr Valialkin
936f35920a
app/vmselect/prometheus: allow returning partial response from /api/v1/export
if -search.denyPartialResponse=false
...
This makes `/api/v1/export` behaviour consistent with other `/api/v1/*` handlers.
2020-06-12 21:11:48 +03:00
Clémence Saussez
0b53e380cf
app/vmalert: fix link to testdata ( #547 )
...
Fix broken link to vmalert test data
Signed-off-by: Clemence Saussez <clemence@zen.ly>
2020-06-10 19:37:21 +03:00
Roman Khavronenko
d71b6e6584
vmalert-491: allow to configure concurrent rules execution per group. ( #542 )
...
The feature allows to speed up group rules execution by
executing them concurrently.
Change also contains README changes to reflect configuration
details.
2020-06-09 15:22:11 +03:00
Roman Khavronenko
5c049bf4dd
vmalert-521: allow to disable rules expression validation. ( #536 )
...
This feature may be useful for using `vmalert` with PromQL
compatible datasources like Loki.
2020-06-09 15:19:25 +03:00
Aliaksandr Valialkin
c1be462d42
app/vmauth: disable automatic response compression/uncompression, since it may work improperly in some cases
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/535
2020-06-05 20:14:07 +03:00
Aliaksandr Valialkin
7680b7155d
app/vmauth: emit fatal errors instead of panics when incorrect command-line flags are set
2020-06-05 20:14:05 +03:00
Aliaksandr Valialkin
01719f4949
app/vmstorage/transport: simplify setupTfss in order to prevent the possibility of nil tfs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/534
2020-06-05 13:17:26 +03:00
Aliaksandr Valialkin
e4cef1b678
app/vmstorage: prevent from serving conns from vminsert and vmselect after the server is closed
...
Previously it was possible that the connection is served after the server is closed if the following
steps are performed:
1) Server accepts new connection.
2) Server.MustClose() is called and successfully finished.
3) Server starts processing the connection accepted at step 1. There could be various crashes
like in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/534 since the storage may be already closed.
Now the server closes the connection at step 3 without processing it.
2020-06-05 11:55:48 +03:00
Aliaksandr Valialkin
58069f5a6a
app/vmalert: print brief usage info for vmalert -help
2020-06-05 10:43:24 +03:00
Aliaksandr Valialkin
3848ea3a4a
app/vmauth: print brief usage info for vmauth -help
2020-06-05 10:40:11 +03:00
Aliaksandr Valialkin
8ad8ca350a
app/vmagent: print brief usage info for vmagent -help
2020-06-05 10:40:10 +03:00
Aliaksandr Valialkin
3d0a0b3785
lib/fs: optimize MustGetFreeSpace performance by caching the results for up to 2 seconds
2020-06-04 13:14:04 +03:00
DexterZhang
fa103875a0
feat(vmselect): add tmp block dir size metrics vm_tmp_blocks_files_size_total
( #527 )
...
* feat(vmselect): add tmp block dir size metrics `vm_tmp_blocks_files_size_total`
* refactor(vmselect): use free space instead of used space in tmp block file metrics
* fix: add `bytes` suffix to tmp dir free space metric
2020-06-04 13:05:50 +03:00
Aliaksandr Valialkin
faea804b88
app/vmauth: log when -auth.config is reloaded in SIGHUP
2020-06-03 23:22:20 +03:00
Aliaksandr Valialkin
045b87c662
app/vmalert: fix comment for UpdateWith exported methods
2020-06-01 14:35:03 +03:00
Aliaksandr Valialkin
43b14b9569
app/vminsert/netstorage: free up unused memory in buffer after memory usage spikes
2020-06-01 14:33:35 +03:00
Roman Khavronenko
44c51c627f
vmalert: Add recording rules support. ( #519 )
...
* vmalert: Add recording rules support.
Recording rules support required additional service refactoring since
it wasn't planned to support them from the very beginning. The list
of changes is following:
* new entity RecordingRule was added for writing results of MetricsQL
expressions into remote storage;
* interface Rule now unites both recording and alerting rules;
* configuration parser was moved to separate package and now performs
more strict validation;
* new endpoint for listing all groups and rules in json format was added;
* evaluation interval may be set to every particular group;
* vmalert: uncomment tests
* vmalert: rm outdated TODO
* vmalert: fix typos in README
2020-06-01 13:53:46 +03:00
Aliaksandr Valialkin
37aa4fe282
app/vmagent: reload -remoteWrite.relabelConfig and -remoteWrite.urlRelabelConfig on SIGHUP and on /-/reload
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/518
2020-05-30 14:37:02 +03:00
Aliaksandr Valialkin
a646131a33
app/vmagent: log fatal errors instead of panics when improper command-line flags are passed to vmagent
2020-05-30 14:22:38 +03:00
Aliaksandr Valialkin
f41a01332a
app/vminsert/netstorage: evenly distribute rerouted rows among all the availalbe storage nodes
...
Previously such rows were distributed to the original storage node or to the next storage node.
This may result to uneven load among the remaining storage nodes.
2020-05-30 13:51:09 +03:00
Aliaksandr Valialkin
02b2064d8e
app/vminsert/netstorage: do not increment vm_rpc_rows_lost_total when all the vmstorage nodes are unavailable, since vminsert retries sending the data instead of dropping it
2020-05-28 22:36:56 +03:00
Aliaksandr Valialkin
7a61357b5d
app/vminsert/netstorage: make sure that the the data is always replicated among -replicationFactor vmstorage nodes
...
Previously vminsert could write multiple copies of the data to a single vmstorage node when the ingestion rate
exceeds the maximum throughput for connections to vmstorage nodes.
2020-05-28 19:59:07 +03:00
Aliaksandr Valialkin
77e5165e7b
app/vminsert: add -replicationFactor
command-line flag for enabling data replication among available -storageNode instances
2020-05-27 17:29:44 +03:00
Aliaksandr Valialkin
b4e3bffe4b
app/vminsert/netstorage: emit warnings instead of errors when re-routing data to healthy storage nodes
2020-05-27 16:31:41 +03:00
Aliaksandr Valialkin
75f2f3b09d
app/vminsert/netstorage: improve ingestion performance when a single vmstorage node is slower than other vmstorage nodes
...
Previously the ingestion performance has been limited by the slowest vmstorage node.
Now vminsert should re-route data from the slowest vmstorage node to the remaining nodes.
2020-05-27 15:08:22 +03:00
Aliaksandr Valialkin
9844845d79
app/vminsert: tune the maximum summary buffer size for pending data to 1/4 of available RAM, since 1/2 of RAM is too big considering GOGC overhead
2020-05-25 02:00:37 +03:00
Aliaksandr Valialkin
4a82631e44
app/vminsert: limit the summary buffer sizes for all the storage nodes to a half of the allowed memory
2020-05-25 01:39:33 +03:00
Aliaksandr Valialkin
4bd3d4b148
app/vminsert/netstorage: do not return error from storageNode.flushBufLocked when the buffer has been successfully re-routed to healthy nodes
...
This should reduce the number of false errors in the log and the number of falsely lost rows
2020-05-22 18:29:43 +03:00
Aliaksandr Valialkin
6edc33d9bb
app/vminsert/netstorage: capture the first error instead of the last error when sending data to vmstorage
...
The first error has more chances to point to the real root cause of the issue.
2020-05-22 17:49:33 +03:00
Aliaksandr Valialkin
bb4a2bf1aa
app/vmauth: fix make run-vmauth
command
2020-05-22 16:45:19 +03:00
Aliaksandr Valialkin
dcbdc009f5
app/vmagent: check for error returned from flag.Set
2020-05-21 16:30:48 +03:00
Aliaksandr Valialkin
b59e089ac7
app/vmagent: add -dryRun
option for checking all the configs mentioned in command-line flags without running vmagent
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/362
2020-05-21 15:23:18 +03:00
Aliaksandr Valialkin
901093279e
app/vmstorage/transport: update stale comment - vmstorage now sends small ack
packets to vminsert
2020-05-21 14:04:52 +03:00
kreedom
2752d6cb26
vmalert add quotes escape function ( #510 )
...
* vmalert add quotes escape function
Co-authored-by: kreedom
2020-05-21 12:10:35 +03:00
Aaron France
b26245c48b
Update README.md
2020-05-21 12:10:33 +03:00
Aliaksandr Valialkin
d83c68ca03
app/vmselect/promql: add ascent_over_time(m[d])
and descent_over_time(m[d])
functions
...
These functions could be useful in GPS tracking apps for calculating the summary for height gain/loss
over the given duration `d`.
2020-05-21 12:06:34 +03:00
Aliaksandr Valialkin
8ff28f5b91
app/vmselect/promql: update numbers after the upgrade of github.com/VictoriaMetrics/metrics from v1.11.2 to v1.11.3
2020-05-20 03:07:07 +03:00
Aliaksandr Valialkin
ddc9e69bd6
docs/vmagent.md: mention an alternative to refresh_interval
option in scrape configs
2020-05-19 23:10:16 +03:00
Aliaksandr Valialkin
7d46dd452a
app/vmselect/promql: move common code from aggrFuncOutliersK and newAggrFuncRangeTopK into getRangeTopKTimeseries
2020-05-19 16:11:03 +03:00
Aliaksandr Valialkin
37068064dd
app/vmselect/promql: fix outilersk
calculations
2020-05-19 14:45:10 +03:00
Aliaksandr Valialkin
fc81ea38d4
app/vmselect/promql: add outliersk(N, m)
aggregate function for anomaly detection across groups of similar time series
2020-05-19 13:52:44 +03:00
Aliaksandr Valialkin
9ca781b8f0
app/vmalert/notifier: go fmt
2020-05-19 13:00:18 +03:00
kreedom
27911ae179
vmalert - add expr to variables, add escape functions ( #495 )
...
* vmalert - add expr to variables, add escape functions
Co-authored-by: kreedom
2020-05-19 11:55:03 +03:00
Roman Khavronenko
c7f3e58032
vmalert: avoid sending resolves for pending alerts ( #498 )
...
Before the change we were sending notifications to notifier
if following conditions are met:
* alert is in Fire state
* alert is in Inactive state
We were sending Inactive notifications to resolve alert ASAP.
Unfortunately, we were sending resolves for Pending alerts that become
Inactive, which is wrong.
In this change we delete alert from the active list if
it was Pending and become Inactive. In this way we now
have Inactive alerts only if they were in state Fire before.
See test change for example.
2020-05-19 11:55:00 +03:00
Roman Khavronenko
e5f5342e18
vmalert: fix potential race during configuration reloads ( #497 )
...
Configuration reload and rules evaluation can't be executed
in same time now. This may make reload time longer but
prevents from potential races.
2020-05-19 11:54:55 +03:00
Aliaksandr Valialkin
b99d03a956
app/vmalert: run make quicktemplate-gen
from the root dir of the repository
2020-05-16 22:45:45 +03:00
Aliaksandr Valialkin
2784015a4d
all: print --help
output to stdout instead of stderr
...
This is easier to grep and pipe
2020-05-16 12:03:06 +03:00
Aliaksandr Valialkin
dbf8048134
app/vmrestore: document better that vmrestore
works like rsync --delete
, i.e. it deletes files in -storageDataPath
, which are missing in the backup
2020-05-16 09:02:46 +03:00
Aliaksandr Valialkin
e544155a82
app/vmagent/Makefile: fix make run-vmagent
rule
2020-05-15 19:35:16 +03:00
Aliaksandr Valialkin
6c43ba1cb1
app/vmagent/remotewrite: remove unused import after the commit 93267f143f
2020-05-15 17:42:31 +03:00
Aliaksandr Valialkin
1d71253653
app/vmagent/remotewrite: allow ingesting time series with multiple samples at once
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/481
2020-05-15 17:37:27 +03:00
Aliaksandr Valialkin
a853869e75
app/vmstorage/transport: prevent from uncontrolled memory usage growth when vminsert
sends big packets with too long labels
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/490
2020-05-15 15:42:54 +03:00
Aliaksandr Valialkin
1e5c1d7eaa
app/vmstorage: add vm_slow_metric_name_loads_total
metric, which could be used as an indicator when more RAM is needed for improving query performance
2020-05-15 14:12:24 +03:00
Aliaksandr Valialkin
d6b9a49481
app/vmstorage: add vm_slow_row_inserts_total
and vm_slow_per_day_index_inserts_total
metrics for determining whether VictoriaMetrics required more RAM for the current number of active time series
2020-05-15 13:46:57 +03:00
Roman Khavronenko
e850bf0eff
vmalert: fix the access to rules slice element by wrong index ( #486 )
...
During group's update rules deletion was causing slice
mutations while slice index was assumed to be unchanged.
This caused "slice bounds out of range" errors when multiple
rules were deleted sequentially.
2020-05-15 13:26:06 +03:00
hagen1778
d369450f27
vmalert: update README
2020-05-15 13:26:04 +03:00
Aliaksandr Valialkin
3845420a8f
lib: extract common code for returning fast unix timestamp into lib/fasttime
2020-05-14 23:06:50 +03:00
Roman Khavronenko
e208e76222
vmalert: check if remoteRead object was initied before calling Restore ( #473 )
...
The check for non-nil remoteRead was mistakenly dropped
during refactoring which caused panics when `vmalert`
wasn't configured with `remoteRead` flag.
2020-05-13 22:57:26 +03:00
Roman Khavronenko
1523890742
vmalert: fix flag names and description in README ( #475 )
...
Change also adds the recommendation for `remotewrite`
queue error.
2020-05-13 22:57:20 +03:00
肖贝贝
8c3e9adf7f
Feat/vmalert add max queue size ( #472 )
...
* feat: add remoteWrite.maxQueueSize to reduce queue full
* rename remote(write|read) flags to remote(Write|Read) for the sake of consistency
Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>
2020-05-13 22:57:16 +03:00
Aliaksandr Valialkin
bac9a684e8
docs/vmbackup.md: add a link to vmbackuper tool
2020-05-13 22:57:11 +03:00
Aliaksandr Valialkin
f3d9a5b0ec
app/vmselect/promql: suppress "SA4006: this value of dstValues
is never used" error in golangci-lint
2020-05-13 11:46:05 +03:00
Aliaksandr Valialkin
3b0f66a227
app/vmagent: fix a bug with improper relabeling when multiple -remoteWrite.urlRelableConfig
args are set
...
This bug could result in incorrect relabeling and metrics' drop.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/467
2020-05-12 22:03:45 +03:00
Aliaksandr Valialkin
18a0caee43
app/vmselect/promql: fix any(..)
calculations - return all the data points instead of the first one
2020-05-12 20:36:49 +03:00
Aliaksandr Valialkin
3d3f41b961
app/vmstorage/transport: fix panic during server stop on 32-bit arches
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
2020-05-12 20:21:40 +03:00
Aliaksandr Valialkin
81b8811cf4
app/vmselect/promql: remove -search.maxPointsPerTimeseries
command-line flag
...
Limit the estimated time series count after aggregation with grouping by the number of source time series.
2020-05-12 19:54:44 +03:00
Aliaksandr Valialkin
408ade27a9
app/vmselect/promql: add any(x) by (y)
aggregate function, which returns any time series from q
for each group y
2020-05-12 19:50:29 +03:00
Aliaksandr Valialkin
21c2982ac8
app/vmselect/promql: support for sum(x) by (y) limit N
syntax in order to limit the number of output time series after aggregation
2020-05-12 19:50:12 +03:00
Aliaksandr Valialkin
f341c6fcc4
Revert "app/vmselect: add -search.estimatedSeriesCountAfterAggregation
command-line flag for tuning the probability of OOMs or false-positive not enough memory
errors"
...
This reverts commit fbb7986dd2380fce2fc8633b7eda8b67f419e74c.
Reason for revert: this commit has been removed from single-node version
2020-05-12 19:50:08 +03:00
Aliaksandr Valialkin
d54a93fc81
app/vmagent: fix scraping mTLS targets, which has been broken in v1.35.1
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/470
2020-05-12 17:23:43 +03:00
Aliaksandr Valialkin
405cf44aed
app/vmagent,lib/promscrape: do not set HostClient.DialDualStack, since it isnt used if HostClient.Dial is set
2020-05-12 15:24:53 +03:00
Aliaksandr Valialkin
da6a84e147
app/vmagent/remotewrite: properly dial TCP6 addresses set via -remoteWrite.url
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/469
2020-05-12 15:24:50 +03:00
Aliaksandr Valialkin
4e237b4670
app/vminsert/influx: support passing AccountID and ProjectID via plain TCP and UDP
...
Now `vminsert` accepts AccountID and ProjectID via `VictoriaMetrics_AccountID` and `VictoriaMetrics_ProjectID` tags
when reading Influx line protocol data via plain TCP or UDP (i.e. when `-influxListenAddr` is set).
2020-05-12 13:13:04 +03:00
Aliaksandr Valialkin
f7753b1469
lib/storage: gradually pre-populate per-day inverted index for the next day
...
This should prevent from CPU usage spikes at 00:00 UTC every day when
inverted index for new day must be quickly created for all the active time series.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/430
2020-05-12 12:13:32 +03:00
Roman Khavronenko
0157566fdb
vmalert: cleanup and restructure of code to improve maintainability ( #471 )
...
The change introduces new entity `manager` which replaces
`watchdog`, decouples requestHandler and groups. Manager
supposed to control life cycle of groups, rules and
config reloads.
Groups export an ID method which returns a hash
from filename and group name. ID supposed to be unique
identifier across all loaded groups.
Some tests were added to improve coverage.
Bug with wrong annotation value if $value is used in
templates after metrics being restored fixed.
Notifier interface was extended to accept context.
New set of metrics was introduced for config reload.
2020-05-11 14:35:55 +03:00
Nikolay Khramchikhin
0e8c345ffb
vmalert config reload
...
added config hot reload for vmalert with sighup and api call
2020-05-11 14:35:50 +03:00
Aliaksandr Valialkin
6646b380ef
docs/vmauth.md: fix a link to docker images
2020-05-08 14:11:10 +03:00
Aliaksandr Valialkin
28ad350a31
app/vmagent: return 200 from /-/reload
endpoint as Prometheus does
2020-05-07 19:29:48 +03:00
Aliaksandr Valialkin
3052b479b7
lib/httpserver: reduce typical duration for http server graceful shutdown
...
Previously the duration for graceful shutdown for http server could take more than a minute
because of imporperly set timeouts in setNetworkTimeout.
Now typical duration for graceful shutdown should be reduced to less than 5 seconds.
2020-05-07 14:16:38 +03:00
Aliaksandr Valialkin
dc04040781
docs/{vmagent,vmauth}: small clarifications in the docs
2020-05-07 12:55:06 +03:00
Aliaksandr Valialkin
2b403d3f42
app/vmauth: prevent from attacks with ..
in path for accessing resources outside the configured url_prefix
2020-05-07 12:55:04 +03:00
Aliaksandr Valialkin
20538a2a5d
app/vmagent: allow setting independent auth configs per each configured -remoteWrite.url
2020-05-06 16:52:32 +03:00
Aliaksandr Valialkin
12dbb9e22c
app/vmagent: properly set client-side TLS certificates for -remoteWrite.url
. Previously they were mistakenly set as server-side
2020-05-06 16:50:37 +03:00
Aliaksandr Valialkin
8665c2edb1
docs/vmagent.md: small fixes
2020-05-06 14:49:25 +03:00
Aliaksandr Valialkin
8ab5e47b5c
lib/promscrape: add Prometheus-compatible DNS-based service discovery aka dns_sd_configs
2020-05-06 00:02:41 +03:00
Aliaksandr Valialkin
21b91599c2
docs/{vmauth,vmagent}: fix ports for profiling
2020-05-05 20:16:09 +03:00
Aliaksandr Valialkin
309700ab8c
docs/vmauth.md: mention that we can help creating customized proxy
2020-05-05 12:34:08 +03:00
Aliaksandr Valialkin
20e958789a
docs/{vmagent,vmauth}: add Profiling
section
2020-05-05 11:45:29 +03:00
Aliaksandr Valialkin
1153f30fee
docs: add vmauth.md
2020-05-05 11:17:45 +03:00
Aliaksandr Valialkin
782fb30cd0
app/vmauth: build fixes
2020-05-05 11:03:25 +03:00
Aliaksandr Valialkin
de31d16154
app/vmauth: add initial version of vmauth. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmauth/README.md for details
2020-05-05 10:56:20 +03:00
Aliaksandr Valialkin
61df59b9ea
docs/vmagent.md: /targets
page doesnt expose infomration about imporperly configured scrape configs now. It is written in error log instead
2020-05-05 10:56:18 +03:00
Roman Khavronenko
abce2b092f
app/vmalert: restore alerts state from datasource metrics ( #461 )
...
* app/vmalert: restore alerts state from datasource metrics
Vmalert will restore alerts state for rules that have `rule.For` > 0 from previously written timeseries via `remotewrite.url` flag.
* app/vmalert: mention remotewerite and remoteread configuration in README
2020-05-05 00:52:19 +03:00
Aliaksandr Valialkin
89aa6dbf56
lib/promscrape: add Prometheus-compatible service discovery for Consul aka consul_sd_configs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/330
2020-05-04 20:53:06 +03:00
Aliaksandr Valialkin
b21b73115a
app/vminsert: add /-/reload
handler in the same way as for vmagent
2020-04-30 02:18:08 +03:00
DexterZhang
ae215e5538
feat(vmagent): add promscrap config reload suppport via http ( #450 )
...
* feat(vmagent): add promscrap config reload suppport via http endpoint `/-/reload`
* fix: typo fix
2020-04-30 02:18:01 +03:00
Artem Navoiev
121f7e1d56
Update README.md
2020-04-29 17:41:04 +03:00
Aliaksandr Valialkin
b6d88bac04
vendor: use github.com/VictoriaMetrics/fasthttp instead of github.com/fasthttp/fasthttp
...
The upstream fasthttp may contain issues like 996610f021
,
plus a code that isn't used by VictoriaMetrics. So let's use a private copy under our control instead.
2020-04-29 16:43:09 +03:00
Aliaksandr Valialkin
9ed4951ec8
lib/metricsql: move it to a separate repository - github.com/VictoriaMetrics/metrics
2020-04-28 15:30:06 +03:00
Aliaksandr Valialkin
cd1145e5f4
app/vmselect: add -search.estimatedSeriesCountAfterAggregation
command-line flag for tuning the probability of OOMs or false-positive not enough memory
errors
2020-04-28 12:51:48 +03:00
Aliaksandr Valialkin
a858b7e393
app/vmalert: added missing comments for public entities
2020-04-28 11:19:48 +03:00
Aliaksandr Valialkin
716bbe79d4
app/vminsert/netstorage: increase timeout for waiting for ack
message after sending big data block to vmstorage
2020-04-28 11:19:46 +03:00
Aliaksandr Valialkin
50af16baf2
app/vmalert: fix build
2020-04-28 00:34:01 +03:00
Aliaksandr Valialkin
e3db2c73a6
app/vmalert: sync with master branch
2020-04-28 00:19:42 +03:00
Aliaksandr Valialkin
7644f40763
app/vmalert: include it into the next release
2020-04-28 00:11:41 +03:00
Aliaksandr Valialkin
86a1d9cb0c
lib/promscrape: add initial support for Prometheus-compatible service discovery for Amazon EC2 aka ec2_sd_configs
2020-04-27 19:29:22 +03:00
Aliaksandr Valialkin
0daa37fa02
lib/promscrape/discovery/gce: allow empty project and zone for gce_sd_config
2020-04-27 11:45:45 +03:00
Aliaksandr Valialkin
989d84cf3f
app/{vminsert,vmstorage}: wait for ack
from vmstorage
after each packet sent to it from vminsert
...
This should protect from possible data loss when `vmstorage` is stopped while the packet is sent from `vminsert`.
This commit switches to new protocol between vminsert and vmstorage, which is incompatible
with the previous protocol. So it is required that both vminsert and vmstorage nodes are updated.
2020-04-27 09:53:26 +03:00
Aliaksandr Valialkin
e933cbac16
lib/storage: postpone reading data from blocks during search
...
This eliminates the need for storing block data into temporary files on a single-node VictoriaMetrics
during heavy queries, which touch big number of time series over long time ranges.
This improves single-node VM performance on heavy queries by up to 2x.
2020-04-27 08:44:01 +03:00
Aliaksandr Valialkin
23a310cc68
app/vmselect/netstorage: substitute sorting packedTimeseries with the natural order of the fetched blocks
...
This should minimize the number of disk seeks when reading data from temporary file.
2020-04-26 16:46:17 +03:00
Aliaksandr Valialkin
31861c5b8e
lib/promscrape/discovery/gce: allow empty zone
arg in gce_sd_config
- in this case zones for the given project are automatically discovered
2020-04-26 14:37:38 +03:00
Aliaksandr Valialkin
d9bdda408c
docs/{vmbackup,vmrestore}.md: update -help
output
2020-04-24 22:44:45 +03:00
Jason Gardner
7a6b2839b4
app/vmbackup: added ability to create and delete snapshots during backup ( #428 )
...
* app/vmbackup: added ability to create and delete snapshots during backup
Resolves: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/422
* Add snapshot create and delete url flags
* Fixed errcheck warnings in build
2020-04-24 22:35:50 +03:00
Aliaksandr Valialkin
32b3f959fc
app/vmselect: fix description for -search.resetCacheAuthKey
2020-04-24 19:44:35 +03:00
Aliaksandr Valialkin
069690e3bd
lib/promscrape: initial implementation for gce_sd_configs
aga Prometheus-compatible service discovery for Google Compute Engine
2020-04-24 17:53:43 +03:00
Aliaksandr Valialkin
f9526809e5
app/vmselect: add /api/v1/status/tsdb
page with useful stats for locating root cause for high cardinality issues
...
See https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-stats
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/425
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/268
2020-04-22 22:03:23 +03:00
Aliaksandr Valialkin
b59f1f1504
app/vmselect: add -search.minStalenessInterval
command-line flag for removing gaps on graphs built from time series with irregular duration between samples
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/426
2020-04-20 19:42:41 +03:00
Aliaksandr Valialkin
603d4c9217
app/vmselect: merge -search.maxLookback
and -search.maxStalenessInterval
flags, since it has been appeared they have identical purpose :(
...
Leave both flags for backwards compatibility reasons.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/209
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/426
2020-04-20 19:28:28 +03:00
Aliaksandr Valialkin
db5fe03170
deployment/docker: allow building docker images on top of any base image set via ROOT_IMAGE environment var
...
For example, the following command will build VictoriaMetrics docker image on top of alpine image:
ROOT_IMAGE=alpine make package-victoria-metrics
2020-04-20 01:16:21 +03:00
Aliaksandr Valialkin
1b911f6965
app/vmagent/remotewrite: retry sending data if the server closes keep-alive connection
...
This should fix the following error when sending data to remote storage:
couldn't send a block with size XX bytes to "YYY": the server closed connection before returning the first response byte. Make sure the server returns 'Connection: close' response header before closing the connection
2020-04-17 15:53:17 +03:00
Aliaksandr Valialkin
9105f72f17
docs/vmagent.md: typo fix: unvailable -> unavailable
2020-04-17 13:12:13 +03:00
Aliaksandr Valialkin
d46311fd93
app/vmagent/README.md: mention about prodmscrape.suppressScrapeErrors
2020-04-17 13:09:08 +03:00
Aliaksandr Valialkin
b9b5641c2f
app/vmselect: properly apply -search.maxLookback
to queries sent to /api/v1/query
2020-04-17 12:31:18 +03:00
Aliaksandr Valialkin
d4bc60d63c
lib/logger: add WARN level for logging expected errors such as invalid user queries
2020-04-15 20:50:45 +03:00
Aliaksandr Valialkin
a873b553cf
app/vmselect: handle timestamp(metric offset X)
the same way as Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/415
2020-04-15 12:01:05 +03:00
Aliaksandr Valialkin
6ec582acb9
lib/promscrape: show information on improperly configured scrape targets at the bottom of /targets
page
...
This is a common error whith improperly configured target autodiscovery and/or relabeling.
This error leads to duplicate scraping of the same targets with the same set of labels, which leads
to duplicate samples in time series.
2020-04-14 14:55:13 +03:00
Aliaksandr Valialkin
755f649c72
docs/vmagent.md: mention that vmagent supports kubernetes_sd_configs
now
2020-04-13 21:07:00 +03:00
Aliaksandr Valialkin
38256bd66d
docs: update minimum supported Go version from 1.12 to 1.13
2020-04-07 13:39:15 +03:00
Aliaksandr Valialkin
2b4d3effad
app/vmagent/remotewrite: add "X-Prometheus-Remote-Write-Version: 0.1.0" http header to remote_write request
...
This header is required by Cortex (and, probably, other remote storage systems).
See 9c1f44d090/docs/apis.md (remote-api)
.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/399
2020-04-04 16:24:47 +03:00
Aliaksandr Valialkin
87da127fbf
app/victoria-metrics: remove accidentally added testdata for single-node VM
2020-04-04 16:09:08 +03:00
Aliaksandr Valialkin
a012f6fe70
app/vmselect/promql: keep metric name after applying first_over_time
and last_over_time
functions
2020-04-04 14:54:02 +03:00
Aliaksandr Valialkin
a53e332a93
app/vmstorage: add missing shutdown for http server on graceful shutdown
...
This could result in the following panic during graceful shutdown when `/metrics` page is requested:
http: panic serving 10.101.66.5:57366: runtime error: invalid memory address or nil pointer dereference
goroutine 2050 [running]:
net/http.(*conn).serve.func1(0xc00ef22000)
net/http/server.go:1772 +0x139
panic(0xa0fc00, 0xe91d80)
runtime/panic.go:973 +0x3e3
github.com/VictoriaMetrics/VictoriaMetrics/lib/workingsetcache.(*Cache).UpdateStats(0x0, 0xc0000516c8)
github.com/VictoriaMetrics/VictoriaMetrics/lib/workingsetcache/cache.go:224 +0x37
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexDB).UpdateMetrics(0xc00b931d00, 0xc02c41acf8)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:258 +0x9f
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*Storage).UpdateMetrics(0xc0000bc7e0, 0xc02c41ac00)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/storage.go:413 +0x4c5
main.registerStorageMetrics.func1(0x0)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/main.go:186 +0xd9
main.registerStorageMetrics.func3(0xc00008c380)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/main.go:196 +0x26
main.registerStorageMetrics.func7(0xc)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/main.go:211 +0x26
github.com/VictoriaMetrics/metrics.(*Gauge).marshalTo(0xc000010148, 0xaa407d, 0x20, 0xb50d60, 0xc005319890)
github.com/VictoriaMetrics/metrics@v1.11.2/gauge.go:38 +0x3f
github.com/VictoriaMetrics/metrics.(*Set).WritePrometheus(0xc000084300, 0x7fd56809c940, 0xc005319860)
github.com/VictoriaMetrics/metrics@v1.11.2/set.go:51 +0x1e1
github.com/VictoriaMetrics/metrics.WritePrometheus(0x7fd56809c940, 0xc005319860, 0xa16f01)
github.com/VictoriaMetrics/metrics@v1.11.2/metrics.go:42 +0x41
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver.writePrometheusMetrics(0x7fd56809c940, 0xc005319860)
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver/metrics.go:16 +0x44
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver.handlerWrapper(0xb5a120, 0xc005319860, 0xc005018f00, 0xc00002cc90)
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver/httpserver.go:154 +0x58d
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver.gzipHandler.func1(0xb5a120, 0xc005319860, 0xc005018f00)
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver/httpserver.go:119 +0x8e
net/http.HandlerFunc.ServeHTTP(0xc00002d110, 0xb5a660, 0xc0044141c0, 0xc005018f00)
net/http/server.go:2012 +0x44
net/http.serverHandler.ServeHTTP(0xc004414000, 0xb5a660, 0xc0044141c0, 0xc005018f00)
net/http/server.go:2807 +0xa3
net/http.(*conn).serve(0xc00ef22000, 0xb5bf60, 0xc010532080)
net/http/server.go:1895 +0x86c
created by net/http.(*Server).Serve
net/http/server.go:2933 +0x35c
2020-04-02 21:09:55 +03:00
Aliaksandr Valialkin
3b744f3c32
app/vmstorage: typo fix
2020-04-01 23:43:09 +03:00
Aliaksandr Valialkin
f838cdc86e
app/vmstorage: add vm_free_disk_space_bytes
metric for monitoring the remaining disk space at -storageDataPath
2020-04-01 23:10:44 +03:00
Aliaksandr Valialkin
5270b7a097
app/victoria-metrics/testdata: add a test for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/395
2020-03-31 12:51:46 +03:00
Aliaksandr Valialkin
d450249955
lib/storage: properly handle {label=~"foo|"}
filters as Prometheus does
...
Such filters must match all the time series with `label="foo"` plus all the time series without `label`
Previously only time series with `label="foo"` were matched.
2020-03-30 20:21:47 +03:00
Aliaksandr Valialkin
c6cbc0bd19
app/vmselect/prometheus: allow passing relative time to start
, end
and time
args of /api/v1/*
queries
2020-03-29 21:56:52 +03:00
Aliaksandr Valialkin
cb8696699a
app/vmselect/prometheus: code simplification: (d.Seconds()/1e3) -> d.Milliseconds()
2020-03-29 21:50:35 +03:00
Aliaksandr Valialkin
ceb6d1459f
docs/vmagent.md: add prometheus remote_write proxy
use case
2020-03-28 23:17:41 +02:00
Dmitry Naumov
b84071fc25
Rootless docker images by default ( #358 )
...
* Rootless docker images by default
* Migrate to rootless base image
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-03-27 21:18:32 +02:00
Aliaksandr Valialkin
58cb7fc476
app/vmselect: adjust label_map()
handling for corner cases
...
The following corner cases now supported:
* label_map(q, "label", "", "foo") - adds `label="foo"` to series with missing `label`
* label_map(q, "label", "foo", "") - removes `label="foo"` from series
All the unmatched labels are kept unchanged.
2020-03-13 18:41:52 +02:00
Aliaksandr Valialkin
0e7a71a245
app/vmselect: add label_map(q, label, srcValue1, dstValue1, ... srcValueN, dstValueN)
function to MetricsQL
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/369
2020-03-12 19:13:56 +02:00
Aliaksandr Valialkin
50555d89d3
app/vmselect: add -search.maxStalenessInterval
for tuning Prometheus data model closer to Influx-style data model
2020-03-11 16:44:03 +02:00
Aliaksandr Valialkin
b46af9678e
app/vmagent: mention that vmagent can filter data
2020-03-11 16:23:10 +02:00
Aliaksandr Valialkin
8939c19281
app/vmstorage: return 500 status code instead of 200 status code on internal errors inside /snapshot/*
handlers
2020-03-10 23:54:27 +02:00
Aliaksandr Valialkin
f6410ff2bf
app/vmselect: add optional max_rows_per_line
query arg to /api/v1/export
...
This arg allows limiting the number of data points that may be exported on a single line.
2020-03-10 21:47:43 +02:00
Aliaksandr Valialkin
2f0a36044c
app/{vmagent,vminsert}: add support for importing csv data via /api/v1/import/csv
2020-03-10 21:17:40 +02:00
Aliaksandr Valialkin
3fc6599aa2
app/vmagent: properly apply -remoteWrite.sendTimeout
to fasthttp.HostClient
2020-03-09 13:31:22 +02:00
Aliaksandr Valialkin
47e986c26f
app/vmagent: properly add labels set via -remoteWrite.label
to metrics before sending them to -remoteWrite.url
2020-03-06 19:28:14 +02:00
Aliaksandr Valialkin
0d893eff36
Makefile: add build and test rules with enabled race detector. These rules have -race
suffix
...
Fix also `unsafe pointer conversion` errors detected by Go1.14. See https://golang.org/doc/go1.14#compiler .
2020-03-05 12:05:16 +02:00
Aliaksandr Valialkin
0176fc4206
app/vmagent/README.md: small fixes
2020-03-04 18:15:24 +02:00
Aliaksandr Valialkin
ac03be5a2c
app/vmagent/README.md: typo fix
2020-03-04 18:05:43 +02:00
Aliaksandr Valialkin
9354b9177a
app/vmagent/README.md: clarification
2020-03-04 18:04:06 +02:00
Aliaksandr Valialkin
4302555228
app/vmagent/README.md: add iot and edge monitoring
use case
2020-03-04 18:01:40 +02:00
Aliaksandr Valialkin
ea5904fd76
app/vmagent/README.md: add use cases
section
2020-03-04 17:42:57 +02:00
Aliaksandr Valialkin
f01d1bf4a8
app/vmagent: add -remoteWrite.maxDiskUsagePerURL
for limiting the maximum disk usage for each -remoteWrite.url
buffer
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/352
2020-03-03 19:49:20 +02:00
Aliaksandr Valialkin
808c17e250
app/vmagent/remotewrite: do not reset empty relabelCtx
2020-03-03 15:01:21 +02:00
Aliaksandr Valialkin
af19ca2483
app/vmagent: add -remoteWrite.urlRelabelConfig
for applying individual relabeling for each -remoteWrite.url
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/320
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/308
2020-03-03 13:13:06 +02:00
Aliaksandr Valialkin
d23df53ba2
app/vmselect/prometheus: do not add __name__!=
filter when searching for all the matching metric names via /api/v1/label/__name__/values
with non-empty label filter
...
This should reduce query time.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/343
2020-02-28 23:36:38 +02:00
Aliaksandr Valialkin
0eed71c7f4
app/vmagent/remotewrite: yet another typo fix
2020-02-28 20:07:00 +02:00
Aliaksandr Valialkin
6cdc97a53f
app/vmagent/remotewrite: typo fix
2020-02-28 19:05:11 +02:00
Aliaksandr Valialkin
cc39c9d74b
app/vmagent/remotewrite: limit memory usage when big scrape blocks are pushed to remote storage
2020-02-28 18:58:13 +02:00
Aliaksandr Valialkin
45d21d18a8
docs: add a doc for vmagent
2020-02-28 12:23:44 +02:00
Aliaksandr Valialkin
8fa1cd24d8
app/vmselect/prometheus: properly pass filter for labelName=__name__
in labelValuesWithMatches
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/343
2020-02-28 12:17:30 +02:00
Aliaksandr Valialkin
cf9aee4ec3
all: properly split vm_deduplicated_samples_total
among cluster components
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/345
2020-02-27 23:47:51 +02:00
Aliaksandr Valialkin
1286cead75
app/vminsert: properly initialize InsertCtx
...
This should prevent from panic described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/339
2020-02-26 21:21:02 +02:00
Aliaksandr Valialkin
0597f1e39a
app/vmagent: allow setting -httpListenAddr
to empty string in order to disable listening for http requests
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/340
2020-02-26 20:58:26 +02:00
Aliaksandr Valialkin
266101feb4
app/vmagent/README.md: list service discovery mechanisms, which will be added soon
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/334
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/330
2020-02-26 19:28:19 +02:00
Aliaksandr Valialkin
c6c7843e93
app/vmagent: add -remoteWrite.maxBlockSize
command-line flag for limiting the maximum size of unpacked block to send to remote storage
2020-02-25 19:58:11 +02:00
Aliaksandr Valialkin
c4194020ef
app/vmagent: do not allow sending unpacked requests with sizes exceeding -maxInsertRequestSize
2020-02-25 19:35:43 +02:00
Aliaksandr Valialkin
2471340e0d
app/vmagent: add ability to accept Influx line protocol data via TCP and UDP
...
Just set `-influxListenAddr` command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/333
2020-02-25 19:18:01 +02:00
Aliaksandr Valialkin
f96fb93ca5
app/vmagent: add a counter for /targets
handler calls
2020-02-25 18:17:25 +02:00
Aliaksandr Valialkin
25c570dae7
app/vmagent/README.md: mention that vmagent
exposes target statuses at /targets
page
2020-02-25 18:16:08 +02:00
Aliaksandr Valialkin
ca28a3e805
app/vmagent: logo fix
2020-02-25 00:09:55 +02:00
Aliaksandr Valialkin
777a39f7a1
app/vmagent: update docs
2020-02-25 00:09:53 +02:00
Aliaksandr Valialkin
61e67b8922
app/vmagent/README.md: small fixes
2020-02-24 21:26:12 +02:00
Aliaksandr Valialkin
6ca1e58d98
app/vmselect/promql: properly take into account the first datapoint when calculating rollup_candlestick
2020-02-24 13:25:07 +02:00
Aliaksandr Valialkin
b58e3fc8a9
app/vmselect/promql: do not take into account values outside the current window in rollup_candlestick
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-23 18:06:26 +02:00
Yaroslav
c69d4b01f0
fix rollupOpen(), rollupHigh(), rollupLow() functions ( #328 )
2020-02-23 18:06:25 +02:00
Aliaksandr Valialkin
7ee7614e90
app/vmagent: initial implementation for vmagent
2020-02-23 17:31:54 +02:00
Aliaksandr Valialkin
f22aefdb16
app/vmselect/promql: log when rollupResult cache is cleared
2020-02-21 20:06:53 +02:00
Aliaksandr Valialkin
d5c2a0ce64
app/vmselect: add -search.cacheTimestampOffset
command-line flag
...
This flag can be used for removing gaps on graphs if the difference between the current time
and the timestamps from the ingested data exceeds 5 minutes.
This is the case when the time between data sources and VictoriaMetrics is improperly synchronized.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/312
2020-02-21 14:02:15 +02:00
Aliaksandr Valialkin
c70822db50
app/vmselect: add /internl/resetRollupResultCache
handler for resetting response cache
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/312
2020-02-21 14:02:12 +02:00
Aliaksandr Valialkin
afecb34491
app/vmstorage: limit the maximum error message size before sending it to client
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/315
2020-02-13 17:33:12 +02:00
Aliaksandr Valialkin
846d7fa7e9
app/vmselect: add sort_by_label(q, label)
and sort_by_label_desc(q, label)
functions
...
This is implementation of https://github.com/prometheus/prometheus/pull/1533 for VictoriaMetrics.
2020-02-13 17:01:50 +02:00
Aliaksandr Valialkin
347aaba79d
lib/{storage,mergeset}: use time.Ticker instead of time.Timer where appropriate
...
It has been appeared that time.Timer was used in places where time.Ticker must be used instead.
This could result in blocked goroutines as in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/316 .
2020-02-13 13:21:48 +02:00
Aliaksandr Valialkin
6e0013ca39
app/vmselect/prometheus: typo fix in -latencyOffset
description
2020-02-12 14:00:38 +02:00
Edouard Hur
e8f92a4ee8
Cluster - prometheus metrics fix ( #314 )
...
* add missing '/{}' in prom query range requests
* fix missing leading '/' on prom lavelValuesErrors path
2020-02-10 22:15:21 +02:00
Aliaksandr Valialkin
1010a57882
all: allow setting flags via environment vars
...
Now flags can be set via environment vars with the same names as flags.
Command-line flags override flags set via env vars.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/311
2020-02-10 13:31:21 +02:00
Aliaksandr Valialkin
ea66212c93
lib/storage: move -dedup.minScrapeInterval
flag outside lib/storage, so it doesnt show up in vminsert
in cluster version
2020-02-10 13:07:25 +02:00
Aliaksandr Valialkin
e6d9ea3094
app/vmselect/promql: do not add step to range end, since this hack became obsolete since commit 9e1119dab8
2020-02-05 21:23:44 +02:00
Aliaksandr Valialkin
4a1de7fee9
app/vmselect/promql: properly adjust time range for data to select
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-05 21:23:43 +02:00
Aliaksandr Valialkin
8e77b54846
app/vmselect: unconditionally offset -step
to rollup_candlestick
. This makes results more consistent
2020-02-04 23:31:47 +02:00
Aliaksandr Valialkin
ce38b176bc
app/vmselect/promql: automatically apply offset -step
to rollup_candlestick
function in order to obtain the expected OHLC results
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-04 23:24:04 +02:00
Aliaksandr Valialkin
4f7116d1ee
app/vmselect/promql: adjust rollup_candlestick calculations to the exepcted results
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-04 22:43:28 +02:00
Aliaksandr Valialkin
ccd3aa4f15
app/vmselect: take into account the time the requests wait in the queue if -search.maxConcurrentRequests
is exceeded
...
This will prevent from excess CPU usage for timed out queries.
2020-02-04 16:20:48 +02:00
Aliaksandr Valialkin
e6bf88a4d4
app/vmselect: add a placeholder for /api/v1/metadata
, which could be requested by Grafana
...
See https://prometheus.io/docs/prometheus/latest/querying/api/#querying-metric-metadata
VictoriaMetrics doesn't collect any metadata for metrics, so just return empty response.
2020-02-04 15:56:01 +02:00
Aliaksandr Valialkin
7cde594696
all: do not clash flag description with back-quoted flag types
...
See https://golang.org/pkg/flag/#PrintDefaults for more details.
2020-02-04 15:56:01 +02:00
Aliaksandr Valialkin
45bc6c62f2
app/vmselect/promql: adjust and
and unless
binary operator handling to be consistent with Prometheus
2020-01-31 18:52:51 +02:00
Aliaksandr Valialkin
e3adc095bd
all: add -dedup.minScrapeInterval
command-line flag for data de-duplication
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/86
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/278
2020-01-31 01:18:54 +02:00
Aliaksandr Valialkin
cb5c39ee70
lib/fs: optimize small reads for ReaderAt.MustReadAt
by reading from memory-mapped space instead of reading from file descriptor
...
This should improve performance when reading many small blocks.
2020-01-30 15:16:16 +02:00
Aliaksandr Valialkin
4ed5e9a7ce
lib/storage: pre-fetch metricNames for the found metricIDs in Search.Init
...
This should speed up Search.NextMetricBlock loop for big number of found time series.
2020-01-30 15:16:16 +02:00
Aliaksandr Valialkin
170c1c3a4e
app/vmselect/promql: add keep_next_value(q)
for filling gaps with the next non-empty value
2020-01-29 00:48:14 +02:00
Aliaksandr Valialkin
a9c1d5b351
app/vminsert: moved -maxInsertRequestSize
command-line flag out of lib/prompb
in order to prevent its inclusion in vmselect
and vmstorage
apps
2020-01-28 22:53:50 +02:00
Aliaksandr Valialkin
b28c9a3944
app/vmselect/promql: return expected results from increase()
over the beginning of time series, which start from big value
...
Examples for such counters: OS-level counters for network or cpu stats.
2020-01-28 16:31:05 +02:00
Aliaksandr Valialkin
3e304890a6
app/vmselect/promql: fix panic on a single zero vmrange bucket in prometheus_buckets() function
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/296
2020-01-27 18:05:12 +02:00
Aliaksandr Valialkin
4d70a81e18
app/vminsert: do not drop pending rows if all the vmstorage backends are unavailable
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/294
2020-01-24 22:10:10 +02:00
Aliaksandr Valialkin
0cda6afa8e
app/vminsert: move ingestion protocol parsers to lib/protoparser, so they could be re-used in the upcoming vmagent
2020-01-24 16:55:18 +02:00
Aliaksandr Valialkin
ea53a21b02
all: consistently log durations in seconds with millisecond precision
...
This should improve logs readability
2020-01-22 18:35:24 +02:00
Aliaksandr Valialkin
e1a264173a
app/vmselect: mention the original query and time range in error messages
...
This should simplify debugging invalid or heavy queries.
2020-01-22 17:34:35 +02:00
Aliaksandr Valialkin
e127173984
app/vmselect: mention command-line flag, which could be used for adjusting query timeouts, in timeout errors
2020-01-22 15:53:42 +02:00
Aliaksandr Valialkin
f3b9f8b823
app/vmselect/prometheus: increase default value -maxExportDuration
to 30 days, since 10 minutes beat users exporting bit amounts of data
2020-01-22 15:53:41 +02:00
Aliaksandr Valialkin
40e564eb9c
app/vmselect/promql: add range_over_time(m[d])
function for calculating value range for m
over d
2020-01-21 19:05:29 +02:00
Aliaksandr Valialkin
ecddba30fe
app/vminsert/netstorage: increase timeout for pushing data from vminsert to vmstorage by 3x
...
Our clients report that the previous timeout could lead to frequent errors when
vmstorage starts background merge for big parts on slow HDD.
2020-01-21 18:21:49 +02:00
Aliaksandr Valialkin
9eaa2ab871
app/vmselect/promql: add label_match(q, label, regexp)
and label_mismatch(q, label, regexp)
functions for filtering out time series with labels matching the given regexp
2020-01-21 15:00:35 +02:00
Aliaksandr Valialkin
cbd0452317
app/vminsert: increase default value for -insert.maxQueueDuration
from 30s to 60s
...
This should help catching up with high ingestion rate after VictoriaMetrics restart.
2020-01-18 14:39:30 +02:00
Aliaksandr Valialkin
2084921e64
all: use github.com/klauspost/compress/gzip
instead of compress/gzip
...
`github.com/klauspost/compress/gzip` is more optimized than `compress/gzip`.
This gives better gzip compression and decompression speeds.
2020-01-17 23:59:17 +02:00
Aliaksandr Valialkin
54db08a60f
app/vmselect/netstorage: make fmt
2020-01-17 17:46:20 +02:00
Aliaksandr Valialkin
d21cc2d16a
app/vmselect/netstorage: limit the maximum size for in-memory buffer for temporary blocks file
...
This should reduce memory usage on systems with more than 8GB RAM.
2020-01-17 16:28:28 +02:00
Aliaksandr Valialkin
b05f6cf11c
app/vmselect: limit the default value for -search.maxConcurrentRequests
, so it plays well on systems with more than 16 vCPUs
...
A single heavy request can saturate all the available CPUs, so let's limit the number of concurrent requests to lower value.
This will give more chances for executing insert path.
2020-01-17 16:16:43 +02:00
Aliaksandr Valialkin
a9f683423c
app/{vminsert,vmselect}: improve error messages when VictoriaMetrics cannot handle too high number of concurrent inserts / selects
2020-01-17 16:16:43 +02:00
Aliaksandr Valialkin
4b16b7fd11
all: mention command-line flags used for limiting the incoming request size in error messages
...
This should improve error logs usability.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/287
2020-01-16 13:06:43 +02:00
Aliaksandr Valialkin
ce0b602405
app/vmselect/promql: fix panic on sum(aggr_over_time(...))
with incorrect number of args
2020-01-15 16:26:16 +02:00
Aliaksandr Valialkin
bcd3f0c5bd
app/vmselect/promql: add hoeffding_bound_upper(phi, m[d])
and hoeffding_bound_lower(phi, m[d])
functions
...
These functions can be used for calculating Hoeffding bounds
for `m` over `d` time range and for the given `phi` in the range `[0..1]`.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/283
2020-01-11 14:47:13 +02:00
Aliaksandr Valialkin
fc01b11ddc
app/vmselect/promql: return continuous values for min_over_time
and max_over_time
when step
is smaller than scrape_interval
2020-01-11 12:47:57 +02:00
Aliaksandr Valialkin
16fb128bbc
app/vmselect/promql: do not take into account the previous point before time window in square brackets for min_over_time
, max_over_time
, rollup_first
and rollup_last
functions
...
This makes the behaviour for these functions similar to Prometheus when processing broken time series with irregular data points
like `gitlab_runner_jobs`. See https://gitlab.com/gitlab-org/gitlab-exporter/issues/50 for details.
2020-01-11 00:26:11 +02:00
Aliaksandr Valialkin
1c445bf7eb
vendor: update github.com/valyala/fastjson from v1.4.2 to v1.4.5
...
This should fix parsing Inf values in `/api/v1/import`. The previous attempt to fix this in VictoriaMetrics v1.32.1 was unsuccessful.
2020-01-10 23:14:34 +02:00
Aliaksandr Valialkin
adc36d00b7
app/vmselect/promql: properly handle aggr(aggr_over_time(...))
2020-01-10 21:57:11 +02:00
Aliaksandr Valialkin
87a106702b
app/vmselect/promql: add aggr_over_time(("aggr_func1", "aggr_func2", ...), m[d])
function
...
This function can be used for simultaneous calculating of multiple `aggr_func*` functions
that accept range vector. For example, `aggr_over_time(("min_over_time", "max_over_time"), m[d])`
would calculate `min_over_time` and `max_over_time` for `m[d]`.
2020-01-10 21:18:12 +02:00
Aliaksandr Valialkin
c314d9a219
app/vmselect/promql: add tmin_over_time(m[d])
and tmax_over_time(m[d])
functions
...
These functions return timestamp in seconds for the minimum and maximum value for `m` over time range `d`
2020-01-10 19:39:34 +02:00
Aliaksandr Valialkin
705af61587
app/{vmbackup,vmrestore}: add backup complete
file to backup when it is complete and check for this file before restoring from backup
...
This should prevent from restoring from incomplete backups.
Add `-skipBackupCompleteCheck` command-line flag to `vmrestore` in order to be able restoring from old backups without `backup complete` file.
2020-01-09 15:35:45 +02:00
Aliaksandr Valialkin
7c6df1e51d
app/vmselect/promql: skip rate
calculation for the first point on time series
2020-01-08 14:42:44 +02:00
Aliaksandr Valialkin
76707b2ab9
app/vmselect/promql: fix calculations for histogram_share
2020-01-04 14:44:15 +02:00
Aliaksandr Valialkin
accad01b3e
app/vmselect/promql: add missing MetricName into netstorage.Result in tests
2020-01-04 12:53:14 +02:00
Aliaksandr Valialkin
6f29d37cb5
app/vmselect/promql: add histogram_share(le, buckets)
function
2020-01-04 12:53:08 +02:00
Aliaksandr Valialkin
2290503140
app/vmselect/promql: add absent_over_time(m[d])
func similar to the function in Prometheus 2.16
...
See https://github.com/prometheus/prometheus/issues/2882
2020-01-04 12:53:01 +02:00
Aliaksandr Valialkin
67f94bbe12
app/vmselect/promql: add histogram_over_time(m[d])
rollup function
2020-01-04 12:52:56 +02:00
Aliaksandr Valialkin
9a1f6848ca
app/vmselect/promql: fix results caching for multi-arg rollup functions such as quantile_over_time
...
Previosly only a single arg was taken into account, so caching didn't work properly for multi-arg rollup funcs.
2020-01-03 20:44:54 +02:00
Aliaksandr Valialkin
3d0c7b095a
app/vmselect/promql: use scrapeInterval instead of window in denominator when calculating rate
for the first point on the time series
...
This should provide better estimation for `rate` in the beginning of time series.
2020-01-03 19:03:32 +02:00
Aliaksandr Valialkin
6ea7f23446
app/vmselect/promql: increase the estimated number of time series returned by aggr() by (something)
from 100 to 1K, since 100 may result in OOM for high number of time series
2020-01-03 01:02:30 +02:00
Aliaksandr Valialkin
e0abf45d45
app/vmselect/promql: add share_le_over_time
and share_gt_over_time
functions for SLI and SLO calculations
2020-01-03 00:41:36 +02:00
Aliaksandr Valialkin
eb1a66c577
lib/metricsq: add ExpandWithExprs
2019-12-25 22:20:21 +02:00
Aliaksandr Valialkin
453d71d082
Rename lib/promql to lib/metricsql and apply small fixes
2019-12-25 22:09:09 +02:00
Mike Poindexter
009d1559db
Split Extended PromQL parsing to a separate library
2019-12-25 22:09:07 +02:00
Aliaksandr Valialkin
ff18101d30
app/vmselect/promql: make sure AdjustStartEnd returns time range covering the same number of points as the initial time range
...
This should prevent from the following panic at app/vmselect/promql/binary_op.go:255:
BUG: len(leftVaues) must match len(rightValues) and len(dstValues)
2019-12-24 22:45:49 +02:00
Aliaksandr Valialkin
e24ee43109
app/vmselect/promql: adjust calculations for rate
and increase
for the first value
...
These calculations should trigger alerts on `/api/v1/query` for counters starting from values greater than 0.
2019-12-24 19:41:03 +02:00
Aliaksandr Valialkin
9a2554691c
app/vmselect/promql: properly calculate rate
on the first data point
...
It is calculated as `value / scrape_interval`, since the value was missing on the previous scrape,
i.e. we can assume its value was 0 at this time.
2019-12-24 15:55:15 +02:00
Aliaksandr Valialkin
97de50dd4c
app/vmselect/netstorage: improve error message when reading data size in readBytes
2019-12-24 14:40:14 +02:00
Aliaksandr Valialkin
29d2ce54cb
all: use gozstd instead of pure Go zstd for GOARCH=amd64
2019-12-24 12:43:59 +02:00
Aliaksandr Valialkin
6358cf3d47
app/vmselect/netstorage: move MustAdviseSequentialRead to lib/fs
2019-12-23 23:16:26 +02:00
Aliaksandr Valialkin
cc8a1bae0e
app/vmselect: add -search.maxExportDuration
command-line flag for limiting /api/v1/export
duration
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/275
2019-12-20 11:37:18 +02:00
Aliaksandr Valialkin
8b56b849e9
app/vminsert: return StatusNoContent http response for /api/v1/import
to be consistent with other insert handlers
2019-12-19 01:22:01 +02:00
Aliaksandr Valialkin
6a185b7809
app/vmselect: add ability to pass match[]
, start
and end
to /api/v1/labels
...
This makes the `/api/v1/labels` handler consistent with already existing functionality for `/api/v1/label/.../values`.
See https://github.com/prometheus/prometheus/issues/6178 for more details.
2019-12-15 00:20:43 +02:00
Aliaksandr Valialkin
a7bf8e77af
app/vminsert: simultaneously accept telnet put
and HTTP /api/put
OpenTSDB metrics at -opentsdbListenAddr
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/266
2019-12-14 00:42:18 +02:00
Aliaksandr Valialkin
b238997a84
all: rename Extended PromQL
to PromQL extensions
2019-12-12 19:29:59 +02:00
Aliaksandr Valialkin
cffaeda0f1
all: publish Docker images for the following GOARCH: amd64, arm, arm64, ppc64le and 386
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/258
2019-12-11 23:33:11 +02:00
Aliaksandr Valialkin
c25b97829f
app/vmselect/promql: return lower
and upper
bounds for the estimated percentile from histogram_quantile
if third arg is passed
...
Updates https://github.com/prometheus/prometheus/issues/5706
2019-12-11 14:00:18 +02:00
Aliaksandr Valialkin
f79b61e2a1
app/vmselect/promql: return matrix instead of vector on subqueries to /api/v1/query
like Prometheus does
2019-12-11 00:57:54 +02:00
Aliaksandr Valialkin
5d2ff573aa
app/vmselect/promql: allow negative offsets
...
Updates https://github.com/prometheus/prometheus/issues/6282
2019-12-11 00:57:51 +02:00
Aliaksandr Valialkin
c81a89a8ed
app/vminsert: add /api/v1/import
handler
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6
2019-12-09 22:37:49 +02:00
Aliaksandr Valialkin
0c304439d4
app/vminsert: consistency renaming for counters
2019-12-09 16:44:26 +02:00
Aliaksandr Valialkin
c217a53c35
make vendor-update
2019-12-07 23:11:26 +02:00
Aliaksandr Valialkin
3534e71c96
app/vminsert/influx: add a test case from https://community.librenms.org/t/integration-with-victoriametrics/9689
2019-12-07 23:00:54 +02:00
Aliaksandr Valialkin
d39bba3547
app/vmselect/promql: add {topk|bottomk}_{min|max|avg|median}
aggregate functions for returning the exact k time series on the given time range
...
The full list of functions added:
- `topk_min(k, q)` - returns top K time series with the max minimums on the given time range
- `topk_max(k, q)` - returns top K time series with the max maximums on the given time range
- `topk_avg(k, q)` - returns top K time series with the max averages on the given time range
- `topk_median(k, q)` - returns top K time series with the max medians on the given time range
- `bottomk_min(k, q)` - returns bottom K time series with the min minimums on the given time range
- `bottomk_max(k, q)` - returns bottom K time series with the min maximums on the given time range
- `bottomk_avg(k, q)` - returns bottom K time series with the min averages on the given time range
- `bottomk_median(k, q)` - returns bottom K time series with the min medians on the given time range
2019-12-05 19:27:45 +02:00
Aliaksandr Valialkin
e0f43e1f66
app/vmselect: add placeholders for /api/v1/rules
and /api/v1/alerts
2019-12-03 19:38:09 +02:00
Aliaksandr Valialkin
4e22b521c2
lib/storage: remove metricID with missing metricID->metricName entry
...
The metricID->metricName entry can be missing in the indexdb after unclean shutdown
when only a part of entries for new time series is written into indexdb.
Recover from such a situation by removing the broken metricID. New metricID
will be automatically created for time series with the given metricName
when new data point will arive to it.
2019-12-02 20:52:13 +02:00
Aliaksandr Valialkin
819bb36852
app/vmselect/promql: estimate per-series scrape interval as 0.6 quantile for the first 100 intervals
...
This should improve scrape interval estimation for tiem series with gaps.
2019-12-02 13:43:04 +02:00
Aliaksandr Valialkin
1595dcd3d9
app/vminsert/influx: allow empty measurement in Influx line protocol
...
In this case metric names are mapped directly from field names without any prefixes.
2019-11-30 21:59:07 +02:00
Aliaksandr Valialkin
1e2019b1b6
app/vmselect/promql: fix corner case for increase
over time series with gaps
...
In this case `increase` could return invalid high value for the first point after the gap.
2019-11-30 01:34:18 +02:00
Aliaksandr Valialkin
4c63caa37c
deployment/docker/certs: update TLS certs source from alpine:3.9 to alpine:3.10
2019-11-29 19:55:36 +02:00
Aliaksandr Valialkin
7e734433a3
lib/backup: cosmetic fixes after #243
2019-11-29 18:07:41 +02:00
glebsam
4a192cb832
Add option to provide custom endpoint for S3, add option to specify S3 config profile ( #243 )
...
* Add option to provide custom endpoint for S3 for use with s3-compatible storages, add option to specify S3 config profile
* make fmt
2019-11-29 18:07:39 +02:00
Aliaksandr Valialkin
def9ccd360
app/vmselect/prometheus: consistently apply nocache
arg to /api/v1/query
the same way ast to /api/v1/query_range
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/241
2019-11-26 22:55:50 +02:00
Aliaksandr Valialkin
e0ac068112
app/vmselect/prometheus: fix content-type for /api/v1/export
responses
...
The correct Content-Type should be `application/stream+json` instead of `application/json`
Thanks to Joshua Ryder for pointing to this.
2019-11-26 17:44:27 +02:00
Aliaksandr Valialkin
28cc4c09b5
app/vmselect/promql: remove zero timeseries from prometheus_buckets
output
2019-11-25 19:10:13 +02:00
Aliaksandr Valialkin
8811bec14e
app/vmselect/prometheus: reduce default value for -search.latencyOffset
from 60s to 30s
...
30 seconds should be enough for almost all the cases
2019-11-25 16:33:36 +02:00
Aliaksandr Valialkin
f7da9b2db2
app/vmselect/promql: allow nested parens
2019-11-25 16:13:33 +02:00
Aliaksandr Valialkin
d2619d6dce
vendor: update github.com/VictoriaMetrics/metrics from v1.9.0 to v1.9.1
2019-11-25 15:22:50 +02:00
Aliaksandr Valialkin
f46fb6c740
app/vmselect/promql: re-use metrics.Histogram when calculating histogram function for each point on the graph
...
This should reduce the amounts memory allocations
2019-11-25 14:24:30 +02:00
Aliaksandr Valialkin
0f184affa7
app/vmselect/promql: optimize binary search over big number of samples during rollup calculations
2019-11-25 14:01:54 +02:00
Aliaksandr Valialkin
dbd07041ae
app/vmselect/promql: adjust tests after the upgrade of github.com/VictoriaMetrics/metrics from v1.8.3 to v1.9.0
2019-11-25 13:44:08 +02:00
Aliaksandr Valialkin
8bb254d960
app/vmselect/promql: add histogram
aggregate function, which is useful for building heatmaps from multiple time series
2019-11-24 00:04:15 +02:00
Aliaksandr Valialkin
414259f47b
app/vmselect/promql: do not take into account buckets with negative counters in prometheus_buckets
2019-11-23 14:19:19 +02:00
Aliaksandr Valialkin
193d553f6d
app/vmselect/promql: properly handle histogram_quantile(0, ...)
with zero buckets
2019-11-23 14:02:25 +02:00
Aliaksandr Valialkin
f8298c7f13
app/vmselect: add vm_per_query_{rows,series}_processed_count
histograms
2019-11-23 13:23:03 +02:00
Aliaksandr Valialkin
4d76977745
app/vmselect/promql: transparently apply prometheus_buckets
in histogram_quantile
2019-11-23 11:49:16 +02:00
Aliaksandr Valialkin
5f6f03c692
app/vmselect/promql: add prometheus_buckets
function for converting the upcoming histogram buckets from github.com/VictoriaMetrics/metrics
to Prometheus-compatible buckets
2019-11-23 00:21:56 +02:00
Aliaksandr Valialkin
17d08c1fe0
app/vmselect: adjust end
arg instead of adjusting start
arg if start > end
...
`start` arg has higher chances to be set properly comparing to `end` arg,
so it is expected that the `end` arg could be adjusted if it was set incorrectly.
2019-11-22 16:12:53 +02:00
Aliaksandr Valialkin
216a260ced
app/{vmbackup,vmrestore}: add -maxBytesPerSecond
command-line flag for limiting the used network bandwidth during backup / restore
2019-11-19 20:32:43 +02:00
Aliaksandr Valialkin
5ae47e8940
app/vmselect/prometheus: properly adjust too big time time
on /api/v1/query
...
Too big `time` must be adjusted to `now()-queryOffset`.
2019-11-19 00:42:07 +02:00
Aliaksandr Valialkin
77bb66a5be
app/vmselect/promql: properly calculate integrate(q[d])
2019-11-13 21:11:03 +02:00
Aliaksandr Valialkin
c33640664a
app/vmselect/promql: use universal approach for determining maxByteSliceLen on 32-bit and 64-bit archs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/235
2019-11-13 20:26:07 +02:00
Aliaksandr Valialkin
d297b65089
lib/storage: add vm_cache_size_bytes{type="storage/hour_metric_ids"}
metric
2019-11-13 20:26:05 +02:00
Aliaksandr Valialkin
494ad0fdb3
lib/storage: remove inmemory index for recent hour, since it uses too much memory
...
Production workload shows that the index requires ~4Kb of RAM per active time series.
This is too much for high number of active time series, so let's delete this index.
Now the queries should fall back to the index for the current day instead of the index
for the recent hour. The query performance for the current day index should be good enough
given the 100M rows/sec scan speed per CPU core.
2019-11-13 18:08:58 +02:00
Aliaksandr Valialkin
633dd81bb5
lib/storage: add -disableRecentHourIndex
flag for disabling inmemory index for recent hour
...
This may be useful for saving RAM on high number of time series aka high cardinality
2019-11-13 15:10:12 +02:00
Aliaksandr Valialkin
f1620ba7c0
lib/storage: fix inmemory inverted index issues found in v1.29
...
Issues fixed:
- Slow startup times. Now the index is loaded from cache during start.
- High memory usage related to superflouos index copies every 10 seconds.
2019-11-13 13:35:38 +02:00
Aliaksandr Valialkin
87b39222be
Revert "lib/fs: do not postpone directory removal on NFS error"
...
This reverts commit 21aeb02b46649ac9906cb37733f7b155a77a0db9.
2019-11-12 16:29:50 +02:00
Oleg Kovalov
74ba42d111
fix misspelled words ( #229 )
2019-11-12 00:18:24 +02:00
Aliaksandr Valialkin
c48e39eea9
lib/storage: add tests for dateMetricIDCache
2019-11-11 13:21:05 +02:00
Aliaksandr Valialkin
5f52eb7653
lib/fs: do not postpone directory removal on NFS error
...
Continue trying to remove NFS directory on temporary errors for up to a minute.
The previous async removal process breaks in the following case during VictoriaMetrics start
- VictoriaMetrics opens index, finds incomplete merge transactions and starts replaying them.
- The transaction instructs removing old directories for parts, which were already merged into bigger part.
- VictoriaMetrics removes these directories, but their removal is delayed due to NFS errors.
- VictoriaMetrics scans partition directory after all the incomplete merge transactions are finished
and finds directories, which should be removed, but weren't still removed due to NFS errors.
- VictoriaMetrics panics when it finds unexpected empty directory.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-11-10 13:27:16 +02:00