Commit graph

382 commits

Author SHA1 Message Date
Aliaksandr Valialkin
4302555228 app/vmagent/README.md: add iot and edge monitoring use case 2020-03-04 18:01:40 +02:00
Aliaksandr Valialkin
ea5904fd76 app/vmagent/README.md: add use cases section 2020-03-04 17:42:57 +02:00
Aliaksandr Valialkin
f01d1bf4a8 app/vmagent: add -remoteWrite.maxDiskUsagePerURL for limiting the maximum disk usage for each -remoteWrite.url buffer
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/352
2020-03-03 19:49:20 +02:00
Aliaksandr Valialkin
808c17e250 app/vmagent/remotewrite: do not reset empty relabelCtx 2020-03-03 15:01:21 +02:00
Aliaksandr Valialkin
af19ca2483 app/vmagent: add -remoteWrite.urlRelabelConfig for applying individual relabeling for each -remoteWrite.url
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/320
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/308
2020-03-03 13:13:06 +02:00
Aliaksandr Valialkin
d23df53ba2 app/vmselect/prometheus: do not add __name__!= filter when searching for all the matching metric names via /api/v1/label/__name__/values with non-empty label filter
This should reduce query time.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/343
2020-02-28 23:36:38 +02:00
Aliaksandr Valialkin
0eed71c7f4 app/vmagent/remotewrite: yet another typo fix 2020-02-28 20:07:00 +02:00
Aliaksandr Valialkin
6cdc97a53f app/vmagent/remotewrite: typo fix 2020-02-28 19:05:11 +02:00
Aliaksandr Valialkin
cc39c9d74b app/vmagent/remotewrite: limit memory usage when big scrape blocks are pushed to remote storage 2020-02-28 18:58:13 +02:00
Aliaksandr Valialkin
45d21d18a8 docs: add a doc for vmagent 2020-02-28 12:23:44 +02:00
Aliaksandr Valialkin
8fa1cd24d8 app/vmselect/prometheus: properly pass filter for labelName=__name__ in labelValuesWithMatches
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/343
2020-02-28 12:17:30 +02:00
Aliaksandr Valialkin
cf9aee4ec3 all: properly split vm_deduplicated_samples_total among cluster components
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/345
2020-02-27 23:47:51 +02:00
Aliaksandr Valialkin
1286cead75 app/vminsert: properly initialize InsertCtx
This should prevent from panic described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/339
2020-02-26 21:21:02 +02:00
Aliaksandr Valialkin
0597f1e39a app/vmagent: allow setting -httpListenAddr to empty string in order to disable listening for http requests
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/340
2020-02-26 20:58:26 +02:00
Aliaksandr Valialkin
266101feb4 app/vmagent/README.md: list service discovery mechanisms, which will be added soon
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/334
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/330
2020-02-26 19:28:19 +02:00
Aliaksandr Valialkin
c6c7843e93 app/vmagent: add -remoteWrite.maxBlockSize command-line flag for limiting the maximum size of unpacked block to send to remote storage 2020-02-25 19:58:11 +02:00
Aliaksandr Valialkin
c4194020ef app/vmagent: do not allow sending unpacked requests with sizes exceeding -maxInsertRequestSize 2020-02-25 19:35:43 +02:00
Aliaksandr Valialkin
2471340e0d app/vmagent: add ability to accept Influx line protocol data via TCP and UDP
Just set `-influxListenAddr` command-line flag

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/333
2020-02-25 19:18:01 +02:00
Aliaksandr Valialkin
f96fb93ca5 app/vmagent: add a counter for /targets handler calls 2020-02-25 18:17:25 +02:00
Aliaksandr Valialkin
25c570dae7 app/vmagent/README.md: mention that vmagent exposes target statuses at /targets page 2020-02-25 18:16:08 +02:00
Aliaksandr Valialkin
ca28a3e805 app/vmagent: logo fix 2020-02-25 00:09:55 +02:00
Aliaksandr Valialkin
777a39f7a1 app/vmagent: update docs 2020-02-25 00:09:53 +02:00
Aliaksandr Valialkin
61e67b8922 app/vmagent/README.md: small fixes 2020-02-24 21:26:12 +02:00
Aliaksandr Valialkin
6ca1e58d98 app/vmselect/promql: properly take into account the first datapoint when calculating rollup_candlestick 2020-02-24 13:25:07 +02:00
Aliaksandr Valialkin
b58e3fc8a9 app/vmselect/promql: do not take into account values outside the current window in rollup_candlestick
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-23 18:06:26 +02:00
Yaroslav
c69d4b01f0 fix rollupOpen(), rollupHigh(), rollupLow() functions (#328) 2020-02-23 18:06:25 +02:00
Aliaksandr Valialkin
7ee7614e90 app/vmagent: initial implementation for vmagent 2020-02-23 17:31:54 +02:00
Aliaksandr Valialkin
f22aefdb16 app/vmselect/promql: log when rollupResult cache is cleared 2020-02-21 20:06:53 +02:00
Aliaksandr Valialkin
d5c2a0ce64 app/vmselect: add -search.cacheTimestampOffset command-line flag
This flag can be used for removing gaps on graphs if the difference between the current time
and the timestamps from the ingested data exceeds 5 minutes.

This is the case when the time between data sources and VictoriaMetrics is improperly synchronized.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/312
2020-02-21 14:02:15 +02:00
Aliaksandr Valialkin
c70822db50 app/vmselect: add /internl/resetRollupResultCache handler for resetting response cache
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/312
2020-02-21 14:02:12 +02:00
Aliaksandr Valialkin
afecb34491 app/vmstorage: limit the maximum error message size before sending it to client
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/315
2020-02-13 17:33:12 +02:00
Aliaksandr Valialkin
846d7fa7e9 app/vmselect: add sort_by_label(q, label) and sort_by_label_desc(q, label) functions
This is implementation of https://github.com/prometheus/prometheus/pull/1533 for VictoriaMetrics.
2020-02-13 17:01:50 +02:00
Aliaksandr Valialkin
347aaba79d lib/{storage,mergeset}: use time.Ticker instead of time.Timer where appropriate
It has been appeared that time.Timer was used in places where time.Ticker must be used instead.
This could result in blocked goroutines as in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/316 .
2020-02-13 13:21:48 +02:00
Aliaksandr Valialkin
6e0013ca39
app/vmselect/prometheus: typo fix in -latencyOffset description 2020-02-12 14:00:38 +02:00
Edouard Hur
e8f92a4ee8
Cluster - prometheus metrics fix (#314)
* add missing '/{}' in prom query range requests

* fix missing leading '/' on prom lavelValuesErrors path
2020-02-10 22:15:21 +02:00
Aliaksandr Valialkin
1010a57882 all: allow setting flags via environment vars
Now flags can be set via environment vars with the same names as flags.
Command-line flags override flags set via env vars.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/311
2020-02-10 13:31:21 +02:00
Aliaksandr Valialkin
ea66212c93 lib/storage: move -dedup.minScrapeInterval flag outside lib/storage, so it doesnt show up in vminsert in cluster version 2020-02-10 13:07:25 +02:00
Aliaksandr Valialkin
e6d9ea3094 app/vmselect/promql: do not add step to range end, since this hack became obsolete since commit 9e1119dab8 2020-02-05 21:23:44 +02:00
Aliaksandr Valialkin
4a1de7fee9 app/vmselect/promql: properly adjust time range for data to select
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-05 21:23:43 +02:00
Aliaksandr Valialkin
8e77b54846 app/vmselect: unconditionally offset -step to rollup_candlestick. This makes results more consistent 2020-02-04 23:31:47 +02:00
Aliaksandr Valialkin
ce38b176bc app/vmselect/promql: automatically apply offset -step to rollup_candlestick function in order to obtain the expected OHLC results
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-04 23:24:04 +02:00
Aliaksandr Valialkin
4f7116d1ee app/vmselect/promql: adjust rollup_candlestick calculations to the exepcted results
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/309
2020-02-04 22:43:28 +02:00
Aliaksandr Valialkin
ccd3aa4f15 app/vmselect: take into account the time the requests wait in the queue if -search.maxConcurrentRequests is exceeded
This will prevent from excess CPU usage for timed out queries.
2020-02-04 16:20:48 +02:00
Aliaksandr Valialkin
e6bf88a4d4 app/vmselect: add a placeholder for /api/v1/metadata, which could be requested by Grafana
See https://prometheus.io/docs/prometheus/latest/querying/api/#querying-metric-metadata

VictoriaMetrics doesn't collect any metadata for metrics, so just return empty response.
2020-02-04 15:56:01 +02:00
Aliaksandr Valialkin
7cde594696 all: do not clash flag description with back-quoted flag types
See https://golang.org/pkg/flag/#PrintDefaults for more details.
2020-02-04 15:56:01 +02:00
Aliaksandr Valialkin
45bc6c62f2 app/vmselect/promql: adjust and and unless binary operator handling to be consistent with Prometheus 2020-01-31 18:52:51 +02:00
Aliaksandr Valialkin
e3adc095bd all: add -dedup.minScrapeInterval command-line flag for data de-duplication
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/86
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/278
2020-01-31 01:18:54 +02:00
Aliaksandr Valialkin
cb5c39ee70 lib/fs: optimize small reads for ReaderAt.MustReadAt by reading from memory-mapped space instead of reading from file descriptor
This should improve performance when reading many small blocks.
2020-01-30 15:16:16 +02:00
Aliaksandr Valialkin
4ed5e9a7ce lib/storage: pre-fetch metricNames for the found metricIDs in Search.Init
This should speed up Search.NextMetricBlock loop for big number of found time series.
2020-01-30 15:16:16 +02:00
Aliaksandr Valialkin
170c1c3a4e app/vmselect/promql: add keep_next_value(q) for filling gaps with the next non-empty value 2020-01-29 00:48:14 +02:00
Aliaksandr Valialkin
a9c1d5b351 app/vminsert: moved -maxInsertRequestSize command-line flag out of lib/prompb in order to prevent its inclusion in vmselect and vmstorage apps 2020-01-28 22:53:50 +02:00
Aliaksandr Valialkin
b28c9a3944 app/vmselect/promql: return expected results from increase() over the beginning of time series, which start from big value
Examples for such counters: OS-level counters for network or cpu stats.
2020-01-28 16:31:05 +02:00
Aliaksandr Valialkin
3e304890a6 app/vmselect/promql: fix panic on a single zero vmrange bucket in prometheus_buckets() function
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/296
2020-01-27 18:05:12 +02:00
Aliaksandr Valialkin
4d70a81e18 app/vminsert: do not drop pending rows if all the vmstorage backends are unavailable
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/294
2020-01-24 22:10:10 +02:00
Aliaksandr Valialkin
0cda6afa8e app/vminsert: move ingestion protocol parsers to lib/protoparser, so they could be re-used in the upcoming vmagent 2020-01-24 16:55:18 +02:00
Aliaksandr Valialkin
ea53a21b02 all: consistently log durations in seconds with millisecond precision
This should improve logs readability
2020-01-22 18:35:24 +02:00
Aliaksandr Valialkin
e1a264173a app/vmselect: mention the original query and time range in error messages
This should simplify debugging invalid or heavy queries.
2020-01-22 17:34:35 +02:00
Aliaksandr Valialkin
e127173984 app/vmselect: mention command-line flag, which could be used for adjusting query timeouts, in timeout errors 2020-01-22 15:53:42 +02:00
Aliaksandr Valialkin
f3b9f8b823 app/vmselect/prometheus: increase default value -maxExportDuration to 30 days, since 10 minutes beat users exporting bit amounts of data 2020-01-22 15:53:41 +02:00
Aliaksandr Valialkin
40e564eb9c app/vmselect/promql: add range_over_time(m[d]) function for calculating value range for m over d 2020-01-21 19:05:29 +02:00
Aliaksandr Valialkin
ecddba30fe app/vminsert/netstorage: increase timeout for pushing data from vminsert to vmstorage by 3x
Our clients report that the previous timeout could lead to frequent errors when
vmstorage starts background merge for big parts on slow HDD.
2020-01-21 18:21:49 +02:00
Aliaksandr Valialkin
9eaa2ab871 app/vmselect/promql: add label_match(q, label, regexp) and label_mismatch(q, label, regexp) functions for filtering out time series with labels matching the given regexp 2020-01-21 15:00:35 +02:00
Aliaksandr Valialkin
cbd0452317 app/vminsert: increase default value for -insert.maxQueueDuration from 30s to 60s
This should help catching up with high ingestion rate after VictoriaMetrics restart.
2020-01-18 14:39:30 +02:00
Aliaksandr Valialkin
2084921e64 all: use github.com/klauspost/compress/gzip instead of compress/gzip
`github.com/klauspost/compress/gzip` is more optimized than `compress/gzip`.
This gives better gzip compression and decompression speeds.
2020-01-17 23:59:17 +02:00
Aliaksandr Valialkin
54db08a60f app/vmselect/netstorage: make fmt 2020-01-17 17:46:20 +02:00
Aliaksandr Valialkin
d21cc2d16a app/vmselect/netstorage: limit the maximum size for in-memory buffer for temporary blocks file
This should reduce memory usage on systems with more than 8GB RAM.
2020-01-17 16:28:28 +02:00
Aliaksandr Valialkin
b05f6cf11c app/vmselect: limit the default value for -search.maxConcurrentRequests, so it plays well on systems with more than 16 vCPUs
A single heavy request can saturate all the available CPUs, so let's limit the number of concurrent requests to lower value.
This will give more chances for executing insert path.
2020-01-17 16:16:43 +02:00
Aliaksandr Valialkin
a9f683423c app/{vminsert,vmselect}: improve error messages when VictoriaMetrics cannot handle too high number of concurrent inserts / selects 2020-01-17 16:16:43 +02:00
Aliaksandr Valialkin
4b16b7fd11 all: mention command-line flags used for limiting the incoming request size in error messages
This should improve error logs usability.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/287
2020-01-16 13:06:43 +02:00
Aliaksandr Valialkin
ce0b602405 app/vmselect/promql: fix panic on sum(aggr_over_time(...)) with incorrect number of args 2020-01-15 16:26:16 +02:00
Aliaksandr Valialkin
bcd3f0c5bd app/vmselect/promql: add hoeffding_bound_upper(phi, m[d]) and hoeffding_bound_lower(phi, m[d]) functions
These functions can be used for calculating Hoeffding bounds
for `m` over `d` time range and for the given `phi` in the range `[0..1]`.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/283
2020-01-11 14:47:13 +02:00
Aliaksandr Valialkin
fc01b11ddc app/vmselect/promql: return continuous values for min_over_time and max_over_time when step is smaller than scrape_interval 2020-01-11 12:47:57 +02:00
Aliaksandr Valialkin
16fb128bbc app/vmselect/promql: do not take into account the previous point before time window in square brackets for min_over_time, max_over_time, rollup_first and rollup_last functions
This makes the behaviour for these functions similar to Prometheus when processing broken time series with irregular data points
like `gitlab_runner_jobs`. See https://gitlab.com/gitlab-org/gitlab-exporter/issues/50 for details.
2020-01-11 00:26:11 +02:00
Aliaksandr Valialkin
1c445bf7eb vendor: update github.com/valyala/fastjson from v1.4.2 to v1.4.5
This should fix parsing Inf values in `/api/v1/import`. The previous attempt to fix this in VictoriaMetrics v1.32.1 was unsuccessful.
2020-01-10 23:14:34 +02:00
Aliaksandr Valialkin
adc36d00b7 app/vmselect/promql: properly handle aggr(aggr_over_time(...)) 2020-01-10 21:57:11 +02:00
Aliaksandr Valialkin
87a106702b app/vmselect/promql: add aggr_over_time(("aggr_func1", "aggr_func2", ...), m[d]) function
This function can be used for simultaneous calculating of multiple `aggr_func*` functions
that accept range vector. For example, `aggr_over_time(("min_over_time", "max_over_time"), m[d])`
would calculate `min_over_time` and `max_over_time` for `m[d]`.
2020-01-10 21:18:12 +02:00
Aliaksandr Valialkin
c314d9a219 app/vmselect/promql: add tmin_over_time(m[d]) and tmax_over_time(m[d]) functions
These functions return timestamp in seconds for the minimum and maximum value for `m` over time range `d`
2020-01-10 19:39:34 +02:00
Aliaksandr Valialkin
705af61587 app/{vmbackup,vmrestore}: add backup complete file to backup when it is complete and check for this file before restoring from backup
This should prevent from restoring from incomplete backups.

Add `-skipBackupCompleteCheck` command-line flag to `vmrestore` in order to be able restoring from old backups without `backup complete` file.
2020-01-09 15:35:45 +02:00
Aliaksandr Valialkin
7c6df1e51d app/vmselect/promql: skip rate calculation for the first point on time series 2020-01-08 14:42:44 +02:00
Aliaksandr Valialkin
76707b2ab9 app/vmselect/promql: fix calculations for histogram_share 2020-01-04 14:44:15 +02:00
Aliaksandr Valialkin
accad01b3e app/vmselect/promql: add missing MetricName into netstorage.Result in tests 2020-01-04 12:53:14 +02:00
Aliaksandr Valialkin
6f29d37cb5 app/vmselect/promql: add histogram_share(le, buckets) function 2020-01-04 12:53:08 +02:00
Aliaksandr Valialkin
2290503140 app/vmselect/promql: add absent_over_time(m[d]) func similar to the function in Prometheus 2.16
See https://github.com/prometheus/prometheus/issues/2882
2020-01-04 12:53:01 +02:00
Aliaksandr Valialkin
67f94bbe12 app/vmselect/promql: add histogram_over_time(m[d]) rollup function 2020-01-04 12:52:56 +02:00
Aliaksandr Valialkin
9a1f6848ca app/vmselect/promql: fix results caching for multi-arg rollup functions such as quantile_over_time
Previosly only a single arg was taken into account, so caching didn't work properly for multi-arg rollup funcs.
2020-01-03 20:44:54 +02:00
Aliaksandr Valialkin
3d0c7b095a app/vmselect/promql: use scrapeInterval instead of window in denominator when calculating rate for the first point on the time series
This should provide better estimation for `rate` in the beginning of time series.
2020-01-03 19:03:32 +02:00
Aliaksandr Valialkin
6ea7f23446 app/vmselect/promql: increase the estimated number of time series returned by aggr() by (something) from 100 to 1K, since 100 may result in OOM for high number of time series 2020-01-03 01:02:30 +02:00
Aliaksandr Valialkin
e0abf45d45 app/vmselect/promql: add share_le_over_time and share_gt_over_time functions for SLI and SLO calculations 2020-01-03 00:41:36 +02:00
Aliaksandr Valialkin
eb1a66c577 lib/metricsq: add ExpandWithExprs 2019-12-25 22:20:21 +02:00
Aliaksandr Valialkin
453d71d082 Rename lib/promql to lib/metricsql and apply small fixes 2019-12-25 22:09:09 +02:00
Mike Poindexter
009d1559db Split Extended PromQL parsing to a separate library 2019-12-25 22:09:07 +02:00
Aliaksandr Valialkin
ff18101d30 app/vmselect/promql: make sure AdjustStartEnd returns time range covering the same number of points as the initial time range
This should prevent from the following panic at app/vmselect/promql/binary_op.go:255:

    BUG: len(leftVaues) must match len(rightValues) and len(dstValues)
2019-12-24 22:45:49 +02:00
Aliaksandr Valialkin
e24ee43109 app/vmselect/promql: adjust calculations for rate and increase for the first value
These calculations should trigger alerts on `/api/v1/query` for counters starting from values greater than 0.
2019-12-24 19:41:03 +02:00
Aliaksandr Valialkin
9a2554691c app/vmselect/promql: properly calculate rate on the first data point
It is calculated as `value / scrape_interval`, since the value was missing on the previous scrape,
i.e. we can assume its value was 0 at this time.
2019-12-24 15:55:15 +02:00
Aliaksandr Valialkin
97de50dd4c app/vmselect/netstorage: improve error message when reading data size in readBytes 2019-12-24 14:40:14 +02:00
Aliaksandr Valialkin
29d2ce54cb all: use gozstd instead of pure Go zstd for GOARCH=amd64 2019-12-24 12:43:59 +02:00
Aliaksandr Valialkin
6358cf3d47 app/vmselect/netstorage: move MustAdviseSequentialRead to lib/fs 2019-12-23 23:16:26 +02:00
Aliaksandr Valialkin
cc8a1bae0e app/vmselect: add -search.maxExportDuration command-line flag for limiting /api/v1/export duration
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/275
2019-12-20 11:37:18 +02:00
Aliaksandr Valialkin
8b56b849e9 app/vminsert: return StatusNoContent http response for /api/v1/import to be consistent with other insert handlers 2019-12-19 01:22:01 +02:00
Aliaksandr Valialkin
6a185b7809 app/vmselect: add ability to pass match[], start and end to /api/v1/labels
This makes the `/api/v1/labels` handler consistent with already existing functionality for `/api/v1/label/.../values`.

See https://github.com/prometheus/prometheus/issues/6178 for more details.
2019-12-15 00:20:43 +02:00