Commit graph

1455 commits

Author SHA1 Message Date
Aliaksandr Valialkin
a4c96d9e6d lib/protoparser: properly update vm_protoparser_rows_read_total{type="promscrape"} metric 2020-07-14 12:15:56 +03:00
Seva Poliakov
a5e713b6e0
add vm_protoparser_rows_read_total metrics to promscrape (#624)
* add vm_protoparser_rows_read_total metrics to promscrape

move vm_protoparser_rows_read_total for promscrape to better place

move vm_protoparser_rows_read_total for promscrape to better place

* remove possibility of infinity loop at prometheus parser
2020-07-14 12:02:25 +03:00
Roman Khavronenko
207e93b50d lib/flagutil: specify additional description for all Array type flags (#620)
Array type flag is now defined as `value` type in flag description when printed.
This change adds additional description to every Array type flag so it would be
clear what exact type is used:
```
  -remoteWrite.urlRelabelConfig array
        Optional path to relabel config for the corresponding -remoteWrite.url
        Supports array of values separated by comma or specified via multiple flags.
```
2020-07-13 22:00:03 +03:00
Roman Khavronenko
605711bde5 lib/persistentqueue: add vm_persistentqueue_bytes_pending metric (#619)
Metric `vm_persistentqueue_bytes_pending` is a gauge that shows current amount
of bytes in persistentqueue flushed on disk as a difference between write and read
offsets. This metric is very similar to `vmagent_remotewrite_pending_data_bytes`
except of accounting for bytes in-memory.
2020-07-13 21:54:54 +03:00
Roman Khavronenko
a02097e657 Extend metric vm_promscrape_targets with status label (#615)
The change to `vm_promscrape_targets` metric suppose to improve observability
for `vmagent` so it will be possible to track how many targets are up or down
for every specific scrape group:
```
vm_promscrape_targets{type="static_configs", status="down"} 1
vm_promscrape_targets{type="static_configs", status="up"} 2
```
2020-07-13 21:54:53 +03:00
Aliaksandr Valialkin
3898cc0285 app/vmselect/prometheus: minimize the diff for the change 1033dc7e2a over 619b0a25c9 2020-07-13 21:41:17 +03:00
faceair
bf39e67ade fix empty response template (#617) 2020-07-13 21:41:15 +03:00
Aliaksandr Valialkin
b6a5c29549 docs/vmagent.md: sync with app/vmagent/README.md 2020-07-13 21:26:00 +03:00
ofen
9ffa688846 Update README.md (#621)
Troubleshooting section updated to help out with duplicate targets detection
2020-07-13 21:25:59 +03:00
Aliaksandr Valialkin
4353ff7ef1 app/vmagent: fix data race when multiple -remoteWrite.urlRelabelConfig options are set
Previously multiple goroutines could access remoteWriteCtx.tss concurrently, which could lead to data race
and improper relabeling. Now each goroutine has its own copy of tss during relabeling.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/599
2020-07-10 15:17:23 +03:00
Aliaksandr Valialkin
805a90f642 app/vmagent/remotewrite: typo fix in -remoteWrite.showURL help message 2020-07-10 14:07:14 +03:00
Aliaksandr Valialkin
5910207d61 vendor: update github.com/valyala/quicktemplate from v1.5.0 to v1.5.1
This should fix incorrect encoding for json strings with char codes below 0x20

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/613
2020-07-10 12:58:40 +03:00
Aliaksandr Valialkin
5d21c79af9 docs/Single-server-VictoriaMetrics.md: sync with the original README.md 2020-07-10 12:16:16 +03:00
Aliaksandr Valialkin
6373d377ef app/{vminsert,vmagent}: add ability to import data in Prometheus exposition format via /api/v1/import/prometheus 2020-07-10 12:13:28 +03:00
Aliaksandr Valialkin
2012e294d1 properly calculate readCalls 2020-07-10 12:01:05 +03:00
Aliaksandr Valialkin
d449d0a0e1 app/vmselect/promql: add missing tests for ifnot binary operation 2020-07-09 13:24:12 +03:00
Aliaksandr Valialkin
7e706eea13 app/vmselect/promql: refactor implementations for and and unless binary operations, so they are closer to or implementation 2020-07-09 13:06:01 +03:00
Aliaksandr Valialkin
6c1a47b5e0 app/vmselect/promql/active_queries.go: simplify code a bit by inlining getNextActiveQueryID function 2020-07-09 11:18:53 +03:00
Aliaksandr Valialkin
418f0e46cb docs: add a link to the The CMS monitoring infrastructure and applications publication from CERN 2020-07-08 20:16:31 +03:00
Aliaksandr Valialkin
87f8c728bf lib/promscrape: send Accept header similar to Prometheus when scraping targets
This should fix scraping Spring Boot servers, which return incorrect response
unless `Accept: text/plain` request header is set.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/608
2020-07-08 19:50:06 +03:00
Aliaksandr Valialkin
fd4d593c75 vendor: make vendor-update 2020-07-08 19:24:59 +03:00
Aliaksandr Valialkin
cd58e4356d docs/Cluster-VictoriaMetrics.md: mention about api/v1/status/active_queries page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/528
2020-07-08 19:15:38 +03:00
Aliaksandr Valialkin
fb86071552 app/vmselect: add /api/v1/status/active_queries page with the list of currently running queries
This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/575

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/528
2020-07-08 19:09:31 +03:00
DexterZhang
9930ce1fa9
Feat/query list vmselect (#575)
* feat(vmselect): add support for listing current running queries and canceling specific query

* fix(vmselect): change current queries' pid from int64 counter to uuid

* feat(vmselect): add auth to internal operations like `/resetRollupResultCache`, `/query/list` and `/query/kill`. add flag `internalAuthKey` for these auth

* fix(vmselect): add more info to current queries

* review: delete some unnecessary code and use function instead of init

* review: returen *queriesMap in newQueriesMap

* review: delete unused var in struct queriesMap, add comments to exported functions

* review: add return if error occurs

* feat(vmselect): truncate query string in current running query list API since the size of query string might be large;
                use query string's pointer in struct `query` for the same reason;
		add query info API to get full access of query's info;
2020-07-08 19:04:29 +03:00
Aliaksandr Valialkin
7335743d57 lib/storage: limit the maximum concurrency for data ingestion to GOMAXPROCS
Previously the concurrency has been limited to GOMAXPROCS*2. This had little sense,
since every call to Storage.AddRows is bound to CPU, so the maximum ingestion bandwidth
is achieved when the number of concurrent calls to Storage.AddRows is limited to the number of CPUs,
i.e. to GOMAXPROCS.
2020-07-08 17:34:27 +03:00
Roman Khavronenko
929ad74de6 lib/protoparser: fix metric name of unmarshal errors in promremotewrite (#607)
The change fixes the typo in metric name `vm_protoparser_unmarshal_errors` to
respect the naming standard.
2020-07-08 14:19:27 +03:00
Aliaksandr Valialkin
e401b8d527 lib/protoparser/graphite: go fmt 2020-07-08 14:13:06 +03:00
Aliaksandr Valialkin
50ecf09042 lib/protoparser/graphite: add more tests after eb45185eef
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/610
2020-07-08 14:13:03 +03:00
Seva Poliakov
1ae0334e17 Fix graphite minus one timestamp (#609)
* fix graphite -1 timestamp

* format the graphite fix -1 timestamp
2020-07-08 14:13:01 +03:00
Aliaksandr Valialkin
fad008df7e lib/storage: clarify out of retention period error message by mentioning -retentionPeriod command-line flag 2020-07-08 13:54:13 +03:00
Aliaksandr Valialkin
fe58462bef lib/storage: reset MetricName->TSID cache after deleting time series
This should prevent from adding new data points to deleted time series
without the need to check for the deleted time series.

This improves ingestion performance a bit when the `deleted time series ids` aka `dmis` set
contains big number of time series.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/596

Based on the idea from @n4mine at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/604
2020-07-06 22:01:24 +03:00
Aliaksandr Valialkin
77bb0e6595 lib/fs: clarify description for -fs.disableMmap command-line flag 2020-07-06 14:28:57 +03:00
Aliaksandr Valialkin
0bff96fe4b lib/storage: prioritize data ingestion over heavy queries
Heavy queries could result in the lack of CPU resources for processing the current data ingestion stream.
Prevent this by delaying queries' execution until free resources are available for data ingestion.

Expose `vm_search_delays_total` metric, which may be used in for alerting when there is no enough CPU resources
for data ingestion and/or for executing heavy queries.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
2020-07-05 19:44:04 +03:00
Roman Khavronenko
9afd19d375 app/vmalert: add retries to remotewrite (#605)
* app/vmalert: add retries to remotewrite

Remotewrite pkg now does limited number of retries if write request failed.
This suppose to make vmalert state persisting more reliable.

New metrics were added to remotewrite in order to track rows/bytes sent/dropped.

defaultFlushInterval was increased from 1s to 5s for sanity reasons.

* fix

* wip

* wip

* wip

* fix bits alignment bug for 32-bit systems

* fix mistakenly dropped field
2020-07-05 18:47:38 +03:00
Aliaksandr Valialkin
82871fb7a5 app/vmselect/prometheus: small fixes on top of 8bb762124a 2020-07-05 18:17:53 +03:00
faceair
17f175ff5a fix adjust last points avoid influence earlier value (#606) 2020-07-05 18:17:52 +03:00
Aliaksandr Valialkin
6f1d926698 lib/promscrape: use HostClient.DoDeadline instead of HostClient.Do in order to guarantee strict deadline across multiple scrape attempts 2020-07-03 21:33:48 +03:00
Aliaksandr Valialkin
ee03b4ccbd lib/promscrape: prevent from too big deadline misses on scrape retries
The maximum deadline miss duration is reduced to 2x scrape_interval in the worst case.
By default it is limited to scrape_interval configured for the given scrape target.
2020-07-03 20:42:09 +03:00
Aliaksandr Valialkin
dfa83a4a35 lib/promscrape: check for nil error before checking for the returned status code when scraping targets 2020-07-03 18:37:25 +03:00
Ween
d28fb0baf9 [VMAlert] Fix error log when remoteWrite queue size is full (#602)
* Fix Auto metrics relabeled errors

* Finalize auto-genenated  Labels

* Fix Test Errors

* fix error logs when queue is full

Co-authored-by: xinyulong <xinyulong@kuaishou.com>
2020-07-03 16:50:43 +03:00
Aliaksandr Valialkin
8bb3622e9d app/vminsert: prevent from adding and/or selecting labels with empty values
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/600
2020-07-02 23:17:12 +03:00
Aliaksandr Valialkin
6ebac3ab63 app/vminsert: add ability to apply relabeling to all the incoming metrics if -relabelConfig command-line arg points to a file with a list of relabel_config entries
See https://victoriametrics.github.io/#relabeling
2020-07-02 20:36:33 +03:00
Aliaksandr Valialkin
a45856570b all: typo fix: exptected -> expected 2020-07-02 18:06:21 +03:00
Aliaksandr Valialkin
f10e8809c0 app/vmselect: add interpolate function for filling gaps with linearly interpolated values
See https://stackoverflow.com/q/62565021/274937 for details
2020-07-02 14:54:46 +03:00
Aliaksandr Valialkin
2361ad8ab4 lib/promscrape: add ability to set disable_compression and disable_keepalive options in scrape_config section of the config passed to -promscrape.config
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580
2020-07-02 14:19:34 +03:00
Aliaksandr Valialkin
0f754bea49 lib/promscrape: add -promscrape.disableKeepAlive command-line flag for disabling http keep-alive connections when scraping targets
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580
2020-07-01 02:20:46 +03:00
BigFish
aa26b94f33 fix: spelling mistakes (#594)
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-07-01 01:36:40 +03:00
Aliaksandr Valialkin
618bcc818c vendor: make vendor-update 2020-07-01 01:04:05 +03:00
Aliaksandr Valialkin
4cb3e7595c app/vmstorage: add -denyQueriesOutsideRetention command-line flag for denying queries outside the configured retention 2020-07-01 00:58:42 +03:00
Aliaksandr Valialkin
81e3d4305f lib/httpserver: add Unwrap method to ErrorWithStatusCode, so As and Is functions in standard errors package may properly unwrap the error inside ErrorWithStatusCode 2020-07-01 00:53:49 +03:00