Aliaksandr Valialkin
1c8e97c8a0
lib/procutil: add NewSighupChan function, which returns a channel, which is triggered on every SIGHUP
2020-05-05 10:56:15 +03:00
Aliaksandr Valialkin
054457d1f4
lib/promscrape: allow explicitly setting empty token via token: ""
in consul_sd_config
2020-05-05 07:49:54 +03:00
Aliaksandr Valialkin
89aa6dbf56
lib/promscrape: add Prometheus-compatible service discovery for Consul aka consul_sd_configs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/330
2020-05-04 20:53:06 +03:00
Aliaksandr Valialkin
28e0e8fd88
lib/promauth: properly set up client certificate in tls.Config
...
Previously the client certificate has been mistakenly set up as a server certificate
2020-05-04 20:53:04 +03:00
Aliaksandr Valialkin
ed91fe1d9b
lib/promscrape: move common code for discovery api config map handling into discoveryutils
2020-05-04 20:52:58 +03:00
Aliaksandr Valialkin
c50fd219dc
lib/promscrape/discovery/kubernetes/: unify apiConfig creation
2020-05-04 20:52:53 +03:00
Aliaksandr Valialkin
a5880f17af
lib/promscrape: remove debug line left after the commit e4aac6ea40
2020-05-03 17:16:19 +03:00
Aliaksandr Valialkin
1f0e8fdc0d
lib/promscrape: fix tests after the commit 658a8742ac
...
The original commit copies `__address__` label to `instance` label when generating per-target labels as Prometheus does.
See https://www.robustperception.io/life-of-a-label for details.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/453
2020-05-03 16:59:29 +03:00
DexterZhang
317688f144
fix(vmagent): different behavior as how prometheus deal with labels. [Issue#453] ( #454 )
2020-05-03 16:59:28 +03:00
Aliaksandr Valialkin
ab1e6a76bb
lib/promscrape: make consistent scrape time offsets across reloads for the same ScrapeURL and Labels
...
This should make consistent intervals between data points for scrape targets across reloads.
Previously these intervals were random.
2020-05-03 14:31:22 +03:00
Aliaksandr Valialkin
f25416984b
lib/promscrape: fix TestGetFileSDScrapeWorkSuccess after 3b234d82e5
2020-05-03 14:31:20 +03:00
Aliaksandr Valialkin
f422203e10
lib/promscrape: reload only modified scrapers on config changes
...
This should improve scrape stability when big number of targets are scraped and these targets are frequently changed.
Thanks to @xbsura for the idea and initial implementation attempts at the following pull requests:
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/449
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/458
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/459
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/460
2020-05-03 12:47:16 +03:00
Aliaksandr Valialkin
bbaca16ce8
lib/httpserver: rename http.externalURL
to http.pathPrefix
and improve help message for this flag
...
The `http.externalURL` flag name was slightly misleading, so it has been renamed to `http.pathPrefix`.
2020-05-02 13:12:24 +03:00
DexterZhang
a0589f2ca5
feat(httpserver): add http.externalUrl
config to http server, it adds prefix to http path automatically ( #452 )
2020-05-02 13:12:23 +03:00
Aliaksandr Valialkin
b21b73115a
app/vminsert: add /-/reload
handler in the same way as for vmagent
2020-04-30 02:18:08 +03:00
Aliaksandr Valialkin
a970705d8e
lib/procutil: prevent from app termination on SIGHUP signal, since this signal is frequently used for config reload
2020-04-30 02:18:06 +03:00
Aliaksandr Valialkin
d99f48aa48
lib/httpserver: mention that -http.maxGracefulShutdownDuration
command-line flag value can be increased on shutdown timeout
2020-04-30 01:37:02 +03:00
Aliaksandr Valialkin
de5f923476
lib/promscrape: set 30 seconds timeout for discovery api requests
...
Previously such requests could hang for long time. This could make debugging harder.
2020-04-29 17:29:03 +03:00
Aliaksandr Valialkin
b6d88bac04
vendor: use github.com/VictoriaMetrics/fasthttp instead of github.com/fasthttp/fasthttp
...
The upstream fasthttp may contain issues like 996610f021
,
plus a code that isn't used by VictoriaMetrics. So let's use a private copy under our control instead.
2020-04-29 16:43:09 +03:00
Aliaksandr Valialkin
9ed4951ec8
lib/metricsql: move it to a separate repository - github.com/VictoriaMetrics/metrics
2020-04-28 15:30:06 +03:00
Aliaksandr Valialkin
d78ed50edd
lib/storage: recover when metricID->metricName entry is missing in the inverted index after unclean shutdown
...
Newly added index entries can be missing after unclean shutdown, since they didn't flush to persistent storage yet.
Log about this and delete the corresponding metricID, so it could be re-created next time.
2020-04-28 12:01:32 +03:00
Aliaksandr Valialkin
53740d0026
lib/promscrape: handle connection reset when targets responds with http redirect
2020-04-28 02:14:32 +03:00
肖贝贝
3e6f29f462
fix: vmagent not follow 301/302 redirect bug ( #445 )
...
Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>
2020-04-28 02:14:31 +03:00
Aliaksandr Valialkin
424068f804
lib/promscrape: handle connection reset when targets responds with http redirect
2020-04-28 02:14:26 +03:00
肖贝贝
7d045bf2ca
fix: vmagent not follow 301/302 redirect bug ( #445 )
...
Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>
2020-04-28 02:14:25 +03:00
Aliaksandr Valialkin
2aecf7c37c
lib/{encoding,decimal}: typo fixes in tests: epxecting->expecting
2020-04-28 00:02:19 +03:00
Aliaksandr Valialkin
806dc73d8a
lib/encoding: reduce possibility of failure in TestMarshalInt64ArraySize
2020-04-28 00:02:18 +03:00
Aliaksandr Valialkin
a603a15757
lib/promscrape/discovery/gce: make golangci-lint happy
2020-04-27 19:29:42 +03:00
Aliaksandr Valialkin
86a1d9cb0c
lib/promscrape: add initial support for Prometheus-compatible service discovery for Amazon EC2 aka ec2_sd_configs
2020-04-27 19:29:22 +03:00
Aliaksandr Valialkin
1acb6eb25a
lib/promscrape/discovery/gce: properly set filter
query arg in api url
2020-04-27 16:01:53 +03:00
Aliaksandr Valialkin
0daa37fa02
lib/promscrape/discovery/gce: allow empty project and zone for gce_sd_config
2020-04-27 11:45:45 +03:00
Aliaksandr Valialkin
989d84cf3f
app/{vminsert,vmstorage}: wait for ack
from vmstorage
after each packet sent to it from vminsert
...
This should protect from possible data loss when `vmstorage` is stopped while the packet is sent from `vminsert`.
This commit switches to new protocol between vminsert and vmstorage, which is incompatible
with the previous protocol. So it is required that both vminsert and vmstorage nodes are updated.
2020-04-27 09:53:26 +03:00
Aliaksandr Valialkin
e933cbac16
lib/storage: postpone reading data from blocks during search
...
This eliminates the need for storing block data into temporary files on a single-node VictoriaMetrics
during heavy queries, which touch big number of time series over long time ranges.
This improves single-node VM performance on heavy queries by up to 2x.
2020-04-27 08:44:01 +03:00
Aliaksandr Valialkin
31861c5b8e
lib/promscrape/discovery/gce: allow empty zone
arg in gce_sd_config
- in this case zones for the given project are automatically discovered
2020-04-26 14:37:38 +03:00
Aliaksandr Valialkin
b16e19c053
lib/storage/dedup.go: go fmt
2020-04-26 14:37:36 +03:00
Aliaksandr Valialkin
a0000c3a6e
lib/storage: improve deduplication algorithm
...
Now it leaves only the first data point on each `-dedup.minScrapeInterval` interval.
Previously it may leave two data points on the interval. This could lead to unexpected results
for `histogram_quantile(phi, sum(rate(buckets)) by (le))` query.
2020-04-26 13:10:18 +03:00
Aliaksandr Valialkin
13b4069c59
lib/storage: postpone label filters matching too many time series instead of giving up with error
...
This should reduce the frequency of the following errors:
cannot find tag filter matching less than N time series; either increase -search.maxUniqueTimeseries or use more specific tag filters
more than N time series found on the time range [...]; either increase -search.maxUniqueTimeseries or shrink the time range
2020-04-24 21:18:52 +03:00
Aliaksandr Valialkin
7c74efd640
lib/promscrape/discovery/gce: make golint happy by ignoring resp.Body.Close() result
2020-04-24 18:13:26 +03:00
Aliaksandr Valialkin
069690e3bd
lib/promscrape: initial implementation for gce_sd_configs
aga Prometheus-compatible service discovery for Google Compute Engine
2020-04-24 17:53:43 +03:00
Aliaksandr Valialkin
de991551f5
lib/promscrape: query /api/v1/namespaces/*
for the configured namespaces in kubernetes_sd_config
...
This should fix authroization issues described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/432
2020-04-24 14:42:02 +03:00
Aliaksandr Valialkin
387a21c96d
lib/promscrape: add -promscrape.configCheckInterval
command-line flag for automating config checking
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/431
2020-04-23 23:41:26 +03:00
Aliaksandr Valialkin
83e4c8427e
lib/promscrape: access Config entries by reference, so they can be compared by addresses
2020-04-23 14:38:29 +03:00
Aliaksandr Valialkin
e220f3eeb6
lib/promscrape: move KubernetesSDConfig to lib/promscrape/discovery/kubernetes
2020-04-23 11:34:30 +03:00
Aliaksandr Valialkin
1187494c8f
lib/promscrape/discovery/kubernetes: hide role switch logic behind GetLabels function
2020-04-22 22:16:18 +03:00
Aliaksandr Valialkin
f9526809e5
app/vmselect: add /api/v1/status/tsdb
page with useful stats for locating root cause for high cardinality issues
...
See https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-stats
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/425
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/268
2020-04-22 22:03:23 +03:00
Aliaksandr Valialkin
f3e5722257
lib/writeconcurrencylimiter: improve docs for -maxConcurrentInserts command-line flag
2020-04-20 21:03:09 +03:00
Aliaksandr Valialkin
81481abaa9
lib/promscrape/discovery/kubernetes: reuse a client for empty api_server
inside different jobs
2020-04-20 17:07:37 +03:00
Aliaksandr Valialkin
6764efde39
lib/promscrape/discovery/kubernetes: update stale comments
2020-04-17 14:06:26 +03:00
Aliaksandr Valialkin
d86640d609
lib/promscrape: suppress scrape errors if -promscrape.suppressScrapeErrors
flag is set
2020-04-16 23:41:52 +03:00
Aliaksandr Valialkin
70104f3fb1
lib/promscrape: print all the labels for the target on error message for failed scrape
...
This should improve debuggability.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/420
2020-04-16 23:35:10 +03:00
Aliaksandr Valialkin
266bbec52d
lib/promscrape: retry target scraping when the target closes previously established keep-alive connection to it
...
This should fix the following error:
the server closed connection before returning the first response byte. Make sure the server returns 'Connection: close' response header before closing the connection
2020-04-16 23:25:34 +03:00
Aliaksandr Valialkin
b2d009c8db
lib/logger: typo fix
2020-04-16 00:20:02 +03:00
Aliaksandr Valialkin
d4bc60d63c
lib/logger: add WARN level for logging expected errors such as invalid user queries
2020-04-15 20:50:45 +03:00
Aliaksandr Valialkin
a873b553cf
app/vmselect: handle timestamp(metric offset X)
the same way as Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/415
2020-04-15 12:01:05 +03:00
Aliaksandr Valialkin
99f0cb1f5f
lib/promscrape: code cleanup in runScraper func
2020-04-15 11:36:35 +03:00
Aliaksandr Valialkin
e9d9638627
lib/storage: skip metricID if the corresponding metricID->metricName is missing in inverted index during search
...
This case is possible when the corresponding metricID->metricName entry didn't propagate to inverted index yet.
This should fix the following error:
error when searching tsids for tfss [...]: cannot find metricName by metricID 1582417212213420669: EOF
2020-04-15 00:10:11 +03:00
Aliaksandr Valialkin
6ec582acb9
lib/promscrape: show information on improperly configured scrape targets at the bottom of /targets
page
...
This is a common error whith improperly configured target autodiscovery and/or relabeling.
This error leads to duplicate scraping of the same targets with the same set of labels, which leads
to duplicate samples in time series.
2020-04-14 14:55:13 +03:00
Aliaksandr Valialkin
391fb0903e
lib/promscrape/discovery/kubernetes: remove only unused client for API server during cleaning
2020-04-14 14:19:26 +03:00
Aliaksandr Valialkin
636e1578de
lib/promscrape: add promrelabel.GetLabelValueByName helper function
2020-04-14 14:12:15 +03:00
Aliaksandr Valialkin
3945bf9dec
lib/promscrape: mention job name in error messages when target cannot be scraped
...
This should improve debuggability
2020-04-14 13:33:18 +03:00
Aliaksandr Valialkin
66da177fe9
lib/promscrape: reset ScrapeWork.ID in tests
2020-04-14 13:31:37 +03:00
Aliaksandr Valialkin
88366cad15
lib/promscrape: properly expose statuses for targets with duplicate scrape urls at /targets
page
...
Previously targets with duplicate scrape urls were merged into a single line on the page.
Now each target with duplicate scrape url is displayed on a separate line.
2020-04-14 13:10:06 +03:00
Aliaksandr Valialkin
09f796e2ab
lib/promscrape: remove labels starting with __meta_
after applying relabel_configs
as Prometheus does
...
This should reduce CPU load during scraping when target discovery generates
big number of `__meta_*` labels (for instance, k8s discovery).
See https://www.robustperception.io/life-of-a-label for details.
2020-04-14 12:23:30 +03:00
Aliaksandr Valialkin
f58d15f27c
lib/promscrape: rename 'scrape_config->scrape_limit' to 'scrape_config->sample_limit'
...
`scrape_config` block from Prometheus config contains `sample_limit` field,
while in `vmagent` this field was mistakenly named as `scrape_limit`.
2020-04-14 12:00:03 +03:00
Aliaksandr Valialkin
7c4fb038e3
lib/promscrape: add initial support for kubernetes_sd_config
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/334
2020-04-13 21:03:53 +03:00
Aliaksandr Valialkin
4017163393
lib/promscrape: add -promscrape.config.strictParse
flag for detecting errors in -promscrape.config
file
2020-04-13 13:15:52 +03:00
Aliaksandr Valialkin
7fbfef2aee
lib/promscrape: extract common auth code to lib/promauth
2020-04-13 12:59:22 +03:00
Aliaksandr Valialkin
e0c6da8e2a
lib/storage: disable deduplication after dedup tests are complete
...
The rest of tests expect that the de-duplication is disabled.
2020-04-10 17:33:38 +03:00
Aliaksandr Valialkin
8ed0d5471a
lib/storage: correctly handle -dedup.minScrapeInterval
values smaller than 8ms
...
Such small values may be used for removing samples with duplicate timestamps.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/409 for details.
2020-04-10 16:40:41 +03:00
Aliaksandr Valialkin
0b2f678d8e
lib/{storage,mergeset}: make sure that requests
and misses
cache counters never go down
2020-04-10 14:44:52 +03:00
Aliaksandr Valialkin
661cfb03e2
lib/protoparser: add -*TrimTimstamp
command-line flags for Influx, Graphite, OpenTSDB and CSV data
...
These flags can be used for reducing disk space usage for timestamps data ingested over the given protocols
2020-04-10 12:44:46 +03:00
Aliaksandr Valialkin
f0b08dbd9e
lib/workingsetcache: accumulate stat counters on cache rotation
...
This should prevent from cache stats counters going down after cache rotation,
which may corrupt `cache hit ratio` graph on the official Grafan dasbhoards
when using the following query:
1 - (sum(rate(vm_cache_misses_total[5m])) by (type) / sum(rate(vm_cache_requests_total[5m])) by (type))
2020-04-10 11:51:47 +03:00
Aliaksandr Valialkin
28c65b58a2
lib/memory: add more details to -memory.allowedPercent
help message
2020-04-09 15:34:21 +03:00
Aliaksandr Valialkin
84fa146792
lib/httpserver: remove unnecessary http.HandlerFunc
wrapper in gzipHandler
2020-04-01 18:14:47 +03:00
Aliaksandr Valialkin
0ad7aaf535
lib/storage: add missing reset for tagFilter.matchesEmptyValue on tagFilter.Init
2020-04-01 17:40:27 +03:00
Aliaksandr Valialkin
c189104be7
lib/promscrape: reduce timestamp jitter when scraping targets
...
This should improve compression for timestamps
2020-04-01 16:13:01 +03:00
Aliaksandr Valialkin
4c56acbafa
lib/storage: remove duplicate data points on 7/8*minScrapeInterval interval instead of 1/2*minScrapeInterval
...
This should reduce storage usage and should improve deduplication accuracy
2020-04-01 15:47:04 +03:00
Aliaksandr Valialkin
504ea876f2
lib/storage: handle errors returned from TagFilters.Add
when cloning TagFilters with negative filter
2020-03-31 16:18:34 +03:00
Aliaksandr Valialkin
ef714e01c1
lib/storage: add fast path for the previous indexdb search if it doesn't contain per-day inverted index yet
2020-03-31 12:35:15 +03:00
Aliaksandr Valialkin
7e755b4bac
lib/storage: optimize per-day inverted index search for tag filters matching big number of time series
...
- Sort tag filters in the ascending number of matching time series
in order to apply the most specific filters first.
- Fall back to metricName search for filters matching big number of time series
(usually this are negative filters or regexp filters).
2020-03-31 00:53:29 +03:00
Aliaksandr Valialkin
d450249955
lib/storage: properly handle {label=~"foo|"}
filters as Prometheus does
...
Such filters must match all the time series with `label="foo"` plus all the time series without `label`
Previously only time series with `label="foo"` were matched.
2020-03-30 20:21:47 +03:00
Aliaksandr Valialkin
b47444e69d
lib/envflag: add -envflag.prefix
for setting optional prefix for environment vars
2020-03-30 15:51:44 +03:00
kreedom
f058efb3d1
[vmalert] config parser ( #393 )
...
* [vmalert] config parser
* make linter be happy
* fix test
* fix sprintf add test for rule validation
2020-03-29 01:49:40 +02:00
Aliaksandr Valialkin
42c290ce9f
lib/httpserver: add -http.maxGracefulShutdownDuration
command-line flag for tuning the maximum duration required for graceful shutdown of http server
2020-03-27 20:10:05 +02:00
Aliaksandr Valialkin
8fa80a2dbc
lib/uint64set: remove zero buckets after Set.Intersect
2020-03-27 01:16:34 +02:00
Aliaksandr Valialkin
7a35447031
lib/uint64set: small code cleanup and perf tuning
...
* Remember the last accessed bucket on Has() call.
* Inline fast paths inside Add() and Has() calls.
* Remove fragile code with maxUnsortedBuckets inside bucket32.
2020-03-25 15:29:59 +02:00
Aliaksandr Valialkin
cce936de5b
lib/uint64set: go fmt
2020-03-24 22:28:09 +02:00
Aliaksandr Valialkin
7cdac6634c
lib/storage: serialize snapshot creation process with mutex
...
This guarantees that the snapshot contains all the recently added data
from inmemory buffers when multiple concurrent calls to Storage.CreateSnapshot are performed.
2020-03-24 22:27:28 +02:00
Aliaksandr Valialkin
c31b956355
lib/uint64set: added more tests
2020-03-24 22:27:26 +02:00
Aliaksandr Valialkin
31a533656e
lib/storage: remove obsolete code
2020-03-13 22:42:42 +02:00
Aliaksandr Valialkin
bf1869d33d
lib/promscrape: allow overriding external_labels as Prometheus does
...
Prometheus docs at https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config say:
> In communication with external systems, they are always applied only
> when a time series does not have a given label yet and are ignored otherwise.
Though this may result in consistency chaos when scrape targets override `external_labels`,
let's stick with Prometheus behavior for the sake of backwards compatibility.
There is last resort in vmagent with `-remoteWrite.label`, which consistently
sets the configured labels to all the metrics before sending them to remote storage.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/366
2020-03-12 20:25:27 +02:00
Aliaksandr Valialkin
0e7a71a245
app/vmselect: add label_map(q, label, srcValue1, dstValue1, ... srcValueN, dstValueN)
function to MetricsQL
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/369
2020-03-12 19:13:56 +02:00
Aliaksandr Valialkin
fa7910fba1
lib/protoparser/csvimport: add missing metric vm_rows_invalid_total{type="csvimport"}
2020-03-12 15:28:10 +02:00
Aliaksandr Valialkin
375d5483fa
lib/promscrape: remove possible races when registering and de-registering scrape workers for /targets
page
2020-03-11 16:30:43 +02:00
Aliaksandr Valialkin
187fd89c70
lib/promscrape: consistently update /targets
page after SIGHUP
2020-03-11 03:20:38 +02:00
Aliaksandr Valialkin
2f0a36044c
app/{vmagent,vminsert}: add support for importing csv data via /api/v1/import/csv
2020-03-10 21:17:40 +02:00
Aliaksandr Valialkin
7545784a49
all: fix golangci-lint
issues
2020-03-10 19:40:03 +02:00
Aliaksandr Valialkin
d39dd8aa69
lib/promscrape: do not retry idempotent requests when scraping targets
...
This should prevent from the following unexpected side-effects of idempotent request retries:
- increased actual timeout when scraping the target comparing to the configured scrape_timeout
- increased load on the target
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/357
2020-03-09 13:31:20 +02:00
Aliaksandr Valialkin
12789a4621
app/vmagent: do not allow non-supported fields in -remoteWrite.relabelConfig
and file_sd_configs
...
This should reduce possible confusion like in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/363
2020-03-06 20:19:34 +02:00
Aliaksandr Valialkin
0d893eff36
Makefile: add build and test rules with enabled race detector. These rules have -race
suffix
...
Fix also `unsafe pointer conversion` errors detected by Go1.14. See https://golang.org/doc/go1.14#compiler .
2020-03-05 12:05:16 +02:00
Aliaksandr Valialkin
31a76a7b3a
lib/promscrape: consistency renaming: stopCh -> globalStopCh
2020-03-03 20:08:22 +02:00
Aliaksandr Valialkin
f01d1bf4a8
app/vmagent: add -remoteWrite.maxDiskUsagePerURL
for limiting the maximum disk usage for each -remoteWrite.url
buffer
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/352
2020-03-03 19:49:20 +02:00
Aliaksandr Valialkin
c3b239eb1a
lib/protoparser/prometheus: allow trailing comma in tags list
...
The trailing comma is generated by cloudwatch exporter.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/350
2020-03-02 22:23:28 +02:00
Aliaksandr Valialkin
6a1aab88fd
lib/protoparser: metrics renaming: vm_protoparser_<type>_*
-> vm_protoparser_*{type="<type>"}
...
This should improve composability of these metrics in PromQL queries
2020-02-28 20:19:59 +02:00
Aliaksandr Valialkin
c2e602286c
lib/persistentqueue: reset chunk file when the persistent queue is empty
2020-02-28 20:06:59 +02:00
Aliaksandr Valialkin
cf9aee4ec3
all: properly split vm_deduplicated_samples_total
among cluster components
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/345
2020-02-27 23:47:51 +02:00
Aliaksandr Valialkin
5e7b4795bd
lib/envflag: typo fix in docs to -envflag.enable
: envoronment->environment
2020-02-27 21:56:46 +02:00
Aliaksandr Valialkin
e6a481ab11
lib/promscrape: properly reload new configs on SIGHUP
2020-02-26 13:54:24 +02:00
Aliaksandr Valialkin
fa6815712f
lib/promscrape: go fmt
2020-02-26 13:24:40 +02:00
Aliaksandr Valialkin
f2a6948a14
lib/promscrape: do not add missing port to __address__ label in order to be consistent with Prometheus behavior
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/331
2020-02-25 20:50:18 +02:00
Aliaksandr Valialkin
c4194020ef
app/vmagent: do not allow sending unpacked requests with sizes exceeding -maxInsertRequestSize
2020-02-25 19:35:43 +02:00
Aliaksandr Valialkin
2471340e0d
app/vmagent: add ability to accept Influx line protocol data via TCP and UDP
...
Just set `-influxListenAddr` command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/333
2020-02-25 19:18:01 +02:00
Aliaksandr Valialkin
7a045125cc
lib/fs: typo fix: read blocks bigger than 8KB via pread()
call instead of using mmap
2020-02-25 18:04:06 +02:00
Aliaksandr Valialkin
13ee8271d0
lib/envflag: substitute dots with underscores in env var names if -envflag.enable is set
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/311
2020-02-24 21:15:11 +02:00
Aliaksandr Valialkin
7ee7614e90
app/vmagent: initial implementation for vmagent
2020-02-23 17:31:54 +02:00
Aliaksandr Valialkin
110cce24d9
lib/storage: add vm_ prefix to deduplicated_samples_total
metric
2020-02-21 19:33:36 +02:00
Aliaksandr Valialkin
9d279e26a7
lib/protoparser/prometheus: skip leading whitespace from tag names
2020-02-16 19:06:23 +02:00
Aliaksandr Valialkin
a2b81b71b9
lib/storage: typo fix
2020-02-16 15:53:48 +02:00
Aliaksandr Valialkin
ad4cb9f3ca
lib/storage: prevent from clobbering nin-nil lastError in Storage.add
2020-02-16 15:51:35 +02:00
Aliaksandr Valialkin
846d7fa7e9
app/vmselect: add sort_by_label(q, label)
and sort_by_label_desc(q, label)
functions
...
This is implementation of https://github.com/prometheus/prometheus/pull/1533 for VictoriaMetrics.
2020-02-13 17:01:50 +02:00
Aliaksandr Valialkin
e3b18ca1ab
lib/mergeset: skip createing temporary part objects when merging source inmemory parts
...
This should reduce CPU usage when adding new entries to inverted index.
This should alos prevent from creating stalled cleaner goroutines for the created temporary parts,
since they were never closed.
This should fix the following issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/316 .
2020-02-13 14:09:13 +02:00
Aliaksandr Valialkin
347aaba79d
lib/{storage,mergeset}: use time.Ticker instead of time.Timer where appropriate
...
It has been appeared that time.Timer was used in places where time.Ticker must be used instead.
This could result in blocked goroutines as in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/316 .
2020-02-13 13:21:48 +02:00
Aliaksandr Valialkin
e7d1037210
docs: migrate ExtendedPromQL->MetricsQL in order to be more consistent
2020-02-10 23:03:31 +02:00
Aliaksandr Valialkin
fcdd95a6ef
lib/envflag: check for incorrect flag values read from environment vars
2020-02-10 16:09:03 +02:00
Aliaksandr Valialkin
9c5db9400c
lib/envflag: add -envflag.enable
command-line flag for enabling reading flags from environment vars
...
By default flags are read only from command line. They can be read from environment vars if `-envflag.enable` is set.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/311
2020-02-10 16:09:01 +02:00
Aliaksandr Valialkin
1010a57882
all: allow setting flags via environment vars
...
Now flags can be set via environment vars with the same names as flags.
Command-line flags override flags set via env vars.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/311
2020-02-10 13:31:21 +02:00
Aliaksandr Valialkin
ea66212c93
lib/storage: move -dedup.minScrapeInterval
flag outside lib/storage, so it doesnt show up in vminsert
in cluster version
2020-02-10 13:07:25 +02:00
Aliaksandr Valialkin
8b360a25e9
lib/logger: initialize output to os.Stderr by default
2020-02-04 22:43:26 +02:00
Aliaksandr Valialkin
1f271a9815
lib/logger: add -loggerOutput
command-line flag
...
This flag allows changing log output from `stderr` to `stdout` if `-loggerOutput=stdout` is set.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/306
2020-02-04 21:48:24 +02:00
Aliaksandr Valialkin
49ab3fa076
lib/logger: do not clutter -loggerFormat=json
output with stack trace
...
This should improve json parsing
2020-02-04 21:40:20 +02:00
Aliaksandr Valialkin
56d6b8ed0a
lib/storage: do not deduplicate blocks with less than 32 samples during merge
...
This should improve deduplication accuracy for blocks with higher number of samples.
2020-02-04 18:41:37 +02:00
Aliaksandr Valialkin
7cde594696
all: do not clash flag description with back-quoted flag types
...
See https://golang.org/pkg/flag/#PrintDefaults for more details.
2020-02-04 15:56:01 +02:00
Aliaksandr Valialkin
9b25a2fb67
lib/fs: remove unused readerAt
interface
2020-01-31 15:13:00 +02:00
Aliaksandr Valialkin
e3adc095bd
all: add -dedup.minScrapeInterval
command-line flag for data de-duplication
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/86
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/278
2020-01-31 01:18:54 +02:00
Aliaksandr Valialkin
a45f25699c
lib/storage: re-use indexSearch inside Storage.prefetchMetricNames
2020-01-31 01:18:53 +02:00
Aliaksandr Valialkin
cb5c39ee70
lib/fs: optimize small reads for ReaderAt.MustReadAt
by reading from memory-mapped space instead of reading from file descriptor
...
This should improve performance when reading many small blocks.
2020-01-30 15:16:16 +02:00
Aliaksandr Valialkin
da19fffa08
all: rename ReadAt* to MustReadAt* in order to dont clash with io.ReaderAt
2020-01-30 15:16:16 +02:00
Aliaksandr Valialkin
1332ddc15e
lib/storage: pass missing AccountID and ProjectID to searchMetricName
2020-01-30 15:16:16 +02:00
Aliaksandr Valialkin
4ed5e9a7ce
lib/storage: pre-fetch metricNames for the found metricIDs in Search.Init
...
This should speed up Search.NextMetricBlock loop for big number of found time series.
2020-01-30 15:16:16 +02:00
Aliaksandr Valialkin
cb2a2f281f
lib/mergeset: properly update lastAccesstime
in indexBlockCache entries
...
This is a follow-up for 6665f10e7b
2020-01-29 21:21:01 +02:00
Aliaksandr Valialkin
170c1c3a4e
app/vmselect/promql: add keep_next_value(q)
for filling gaps with the next non-empty value
2020-01-29 00:48:14 +02:00
Aliaksandr Valialkin
a9c1d5b351
app/vminsert: moved -maxInsertRequestSize
command-line flag out of lib/prompb
in order to prevent its inclusion in vmselect
and vmstorage
apps
2020-01-28 22:53:50 +02:00
Aliaksandr Valialkin
81ba371eaf
lib/logger: fix improperly set skipframes for all the logging functions
2020-01-26 18:34:58 +02:00
Aliaksandr Valialkin
9f595cb2b1
lib/httpserver: log the caller of httpserver.Errorf
...
Previously log message contained `httpserver.Errorf`, not it contains the caller of `httpserver.Errorf`, which is more useful.
2020-01-25 20:18:06 +02:00
Aliaksandr Valialkin
36a1a21d6e
lib/protoparser: add parser for Prometheus exposition text format
...
This parser will be used by vmagent
2020-01-24 20:11:19 +02:00
Aliaksandr Valialkin
0cda6afa8e
app/vminsert: move ingestion protocol parsers to lib/protoparser, so they could be re-used in the upcoming vmagent
2020-01-24 16:55:18 +02:00
Aliaksandr Valialkin
ea53a21b02
all: consistently log durations in seconds with millisecond precision
...
This should improve logs readability
2020-01-22 18:35:24 +02:00
Aliaksandr Valialkin
40e564eb9c
app/vmselect/promql: add range_over_time(m[d])
function for calculating value range for m
over d
2020-01-21 19:05:29 +02:00
Aliaksandr Valialkin
9eaa2ab871
app/vmselect/promql: add label_match(q, label, regexp)
and label_mismatch(q, label, regexp)
functions for filtering out time series with labels matching the given regexp
2020-01-21 15:00:35 +02:00
Aliaksandr Valialkin
62b041e90a
lib/{mergeset,storage}: properly update lastAccessTime
in index and data block cache entries
2020-01-20 15:00:10 +02:00
Aliaksandr Valialkin
607d4418b8
lib/uint64set: add missing bucket32.b16his values
2020-01-18 14:26:23 +02:00
Aliaksandr Valialkin
e3379537cd
lib/uint64set: optimize Set.Union
...
This should improve performance for queries over big number of time series
2020-01-18 13:47:34 +02:00
Aliaksandr Valialkin
5077efd3f7
lib/uint64set: add benchmarks for Set.Union
2020-01-18 13:47:33 +02:00
Aliaksandr Valialkin
a851c75703
lib/storage: skip recovering timestamps order for lossless compression (PrecisionBits=64)
2020-01-17 23:59:19 +02:00
Aliaksandr Valialkin
2084921e64
all: use github.com/klauspost/compress/gzip
instead of compress/gzip
...
`github.com/klauspost/compress/gzip` is more optimized than `compress/gzip`.
This gives better gzip compression and decompression speeds.
2020-01-17 23:59:17 +02:00
Aliaksandr Valialkin
ab4d5d72eb
lib/uint64set: reduce memory allocations in Set.AppendTo
2020-01-17 22:33:00 +02:00
Aliaksandr Valialkin
476c7fb109
lib/storage: reduce memory allocations when merging metricID sets
2020-01-17 22:10:56 +02:00
Aliaksandr Valialkin
29d21259f0
lib/uint64set: typo fix in Set.Intersect
2020-01-17 18:11:46 +02:00
Aliaksandr Valialkin
ed1d259b10
lib/uint64set: optimize Intersect, Subtract and Union functions
...
This should improve performance for queries over big number of time series.
2020-01-17 16:16:43 +02:00
Aliaksandr Valialkin
68d35357b1
lib/uint64set: improve benchmark for Set.Intersect
2020-01-17 16:16:43 +02:00
Aliaksandr Valialkin
ffe352ad31
lib/uint64set: add benchmark for Set.Intersect
2020-01-17 16:16:43 +02:00
Aliaksandr Valialkin
4b16b7fd11
all: mention command-line flags used for limiting the incoming request size in error messages
...
This should improve error logs usability.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/287
2020-01-16 13:06:43 +02:00
Aliaksandr Valialkin
7d429e2806
lib/uint64set: reduce memory usage in Union, Intersect and Subtract methods
...
Iterate items with newly added Set.ForEach method instead of allocating `[]uint64`
slice for all the items before the iteration.
2020-01-15 12:15:48 +02:00
Aliaksandr Valialkin
caffb0cd01
lib/{mergeset,storage}: fix uint64 counters alignment for 32-bit architectures (GOARCH=386, GOARCH=arm)
2020-01-14 22:47:42 +02:00
Aliaksandr Valialkin
b03ccbf6f7
lib/{storage,mergeset}: gradually remove stale entries from block cache and index caches
...
This should reduce memory usage in the long run when old blocks and indexes
aren't accessed anymore.
2020-01-14 21:38:29 +02:00
Aliaksandr Valialkin
bcd3f0c5bd
app/vmselect/promql: add hoeffding_bound_upper(phi, m[d])
and hoeffding_bound_lower(phi, m[d])
functions
...
These functions can be used for calculating Hoeffding bounds
for `m` over `d` time range and for the given `phi` in the range `[0..1]`.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/283
2020-01-11 14:47:13 +02:00
Aliaksandr Valialkin
87a106702b
app/vmselect/promql: add aggr_over_time(("aggr_func1", "aggr_func2", ...), m[d])
function
...
This function can be used for simultaneous calculating of multiple `aggr_func*` functions
that accept range vector. For example, `aggr_over_time(("min_over_time", "max_over_time"), m[d])`
would calculate `min_over_time` and `max_over_time` for `m[d]`.
2020-01-10 21:18:12 +02:00
Aliaksandr Valialkin
c314d9a219
app/vmselect/promql: add tmin_over_time(m[d])
and tmax_over_time(m[d])
functions
...
These functions return timestamp in seconds for the minimum and maximum value for `m` over time range `d`
2020-01-10 19:39:34 +02:00
Aliaksandr Valialkin
1029b6ab34
lib/backup/s3remote: check whether the file exists before deleting it
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/284
2020-01-09 23:20:51 +02:00
Aliaksandr Valialkin
705af61587
app/{vmbackup,vmrestore}: add backup complete
file to backup when it is complete and check for this file before restoring from backup
...
This should prevent from restoring from incomplete backups.
Add `-skipBackupCompleteCheck` command-line flag to `vmrestore` in order to be able restoring from old backups without `backup complete` file.
2020-01-09 15:35:45 +02:00
Aliaksandr Valialkin
53e176ed67
lib/storage: limit maxRaRowsPerPartition by 500K for any number of rawRowsShardsPerPartition
...
This should reduce write amplification for high ingestion rate on multi-CPU systems
2020-01-04 23:58:23 +02:00
Aliaksandr Valialkin
89b551201c
lib/metricsql: export IsRollupFunc and IsTransformFunc, since they can be used by package users
2020-01-04 13:25:13 +02:00
Aliaksandr Valialkin
6f29d37cb5
app/vmselect/promql: add histogram_share(le, buckets)
function
2020-01-04 12:53:08 +02:00
Aliaksandr Valialkin
2290503140
app/vmselect/promql: add absent_over_time(m[d])
func similar to the function in Prometheus 2.16
...
See https://github.com/prometheus/prometheus/issues/2882
2020-01-04 12:53:01 +02:00
Aliaksandr Valialkin
67f94bbe12
app/vmselect/promql: add histogram_over_time(m[d])
rollup function
2020-01-04 12:52:56 +02:00
Aliaksandr Valialkin
588531dd76
lib/uint64set: reduce memory usage when storing big number of sparse metric_id values
2020-01-03 18:17:17 +02:00
Aliaksandr Valialkin
e0abf45d45
app/vmselect/promql: add share_le_over_time
and share_gt_over_time
functions for SLI and SLO calculations
2020-01-03 00:41:36 +02:00
Aliaksandr Valialkin
19962e2732
docs: refer to standalone MetricsQL package
2020-01-02 23:43:43 +02:00
Aliaksandr Valialkin
0d2e83e9d7
lib/metricsql: add example for ExpandWithExprs
2019-12-26 21:31:15 +02:00
Aliaksandr Valialkin
eb1a66c577
lib/metricsq: add ExpandWithExprs
2019-12-25 22:20:21 +02:00
Aliaksandr Valialkin
453d71d082
Rename lib/promql to lib/metricsql and apply small fixes
2019-12-25 22:09:09 +02:00
Mike Poindexter
009d1559db
Split Extended PromQL parsing to a separate library
2019-12-25 22:09:07 +02:00
Aliaksandr Valialkin
f22c9dbb0f
lib/fs: typo fix in fadvise_unix.go
2019-12-24 21:00:04 +02:00
Aliaksandr Valialkin
d3c185f0ca
lib/encoding: log the compressed block contents if it cannot be decompressed or unmarshaled
...
This should help detecting the root cause of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/281
2019-12-24 20:48:25 +02:00
Aliaksandr Valialkin
091e35cf0c
lib/encoding: mention src contents in error message returned from unmarshalInt64NearestDelta*
...
This should simplify detecting the root cause of the issue at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/281
2019-12-24 20:41:38 +02:00
Aliaksandr Valialkin
0e51058a0d
lib/encoding: mention unpacked block size in the error message if unparsed tail left
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/281
2019-12-24 20:35:20 +02:00
Aliaksandr Valialkin
afa8b34d27
Revert "lib/logger: prevent from blocking when log output isn't consumed in timely manner"
...
This reverts commit 9f50232e70
.
Reason to revert: this leaves incomplete logs on app shutdown.
2019-12-24 12:20:45 +02:00
Aliaksandr Valialkin
6358cf3d47
app/vmselect/netstorage: move MustAdviseSequentialRead to lib/fs
2019-12-23 23:16:26 +02:00
Aliaksandr Valialkin
44f886cc9c
lib/encoding/zstd: typo fix
2019-12-23 18:37:20 +02:00
Aliaksandr Valialkin
108a60d69e
lib/encoding/zstd: call zstd.Decoder.Close instead of zstd.Decoder.Reset in order to free up occupied goroutines
...
This should fix goroutine leak for https://github.com/klauspost/compress/issues/195
2019-12-23 18:32:28 +02:00
Aliaksandr Valialkin
335bd0ac0a
lib/encoding/zstd: prevent from possible encoder leak when concurrent goroutines create encoders for the same compressionLevel
...
Thanks to @klauspost for the pointer to this issue. See https://github.com/klauspost/compress/issues/195 for details.
2019-12-23 18:06:02 +02:00
Aliaksandr Valialkin
9f50232e70
lib/logger: prevent from blocking when log output isn't consumed in timely manner
...
Drop log messages instead of blocking and increment `vm_log_messages_dropped_total` metric.
2019-12-20 11:49:42 +02:00
Aliaksandr Valialkin
a37a006f11
lib/storage: scale ingestion performance by sharding rawRows on systems with more than 8 CPU cores
2019-12-19 18:17:05 +02:00
Aliaksandr Valialkin
8d79412b26
lib/storage: optimize bulk import performance when multiple data points are inserted for the same time series
...
This should speed up `/api/v1/import` and make it more scalable on multi-core systems.
2019-12-19 15:13:36 +02:00
Aliaksandr Valialkin
05ec8afb3a
lib/httpserver: sync the code with master branch
2019-12-18 23:08:32 +02:00
Aliaksandr Valialkin
a7bf8e77af
app/vminsert: simultaneously accept telnet put
and HTTP /api/put
OpenTSDB metrics at -opentsdbListenAddr
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/266
2019-12-14 00:42:18 +02:00
Aliaksandr Valialkin
bc3984a5b3
lib/logger: add -loggerFormat
for choosing log message formats
...
Supported formats: default, json
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/265
2019-12-13 15:09:18 +02:00
Aliaksandr Valialkin
5d2ff573aa
app/vmselect/promql: allow negative offsets
...
Updates https://github.com/prometheus/prometheus/issues/6282
2019-12-11 00:57:51 +02:00
Aliaksandr Valialkin
3694efd005
lib/{mergeset,storage}: log info message when both source and destination part paths from txn are missing during startup
...
This is expected condition after unclean shutdown (OOM, hard reset, `kill -9`) on NFS disk.
2019-12-09 15:45:23 +02:00
Aliaksandr Valialkin
639967db59
lib/{mergeset,storage}: make sure pending transaction deletions are finished before and after runTransactions
call.
...
`runTransactions` call issues async deletions for transaction files. The previously issued transaction deletions
can race with the next call to `runTransactions`. Prevent this by waiting until all the pending transaction
deletions are funished in the beginning of `runTransactions`. Also make sure that all the pending transaction
deletions are finished before returning from `runTransactions`.
2019-12-04 21:40:52 +02:00
Aliaksandr Valialkin
7c0dd85a7c
lib/httpserver: add /ping
handler for compatibility with Influx agents
...
Certain Influx agents check for `/ping` endpoint before starting
to send Influx line protocol data. See https://docs.influxdata.com/influxdb/v1.7/tools/api/#ping-http-endpoint
2019-12-04 19:18:18 +02:00
Aliaksandr Valialkin
534da0a8c3
lib/storage: fall back to global inverted index if a filter match too many time series in per-day index
...
Previously this resulted to error message. The query may succeed via search in global index.
2019-12-03 14:48:08 +02:00
Aliaksandr Valialkin
6eb698d1cc
lib/storage: fix printing tag filters in TagFilters.String
2019-12-03 14:25:20 +02:00
Aliaksandr Valialkin
c04f60db35
lib/storage: print __name__
instead of empty string in user-visible tag filters
2019-12-03 14:18:18 +02:00
Aliaksandr Valialkin
625f6ca761
lib/storage: optimize regexp filter search
2019-12-03 00:33:53 +02:00
Aliaksandr Valialkin
b9616c017f
lib/{mergeset,storage}: remove transaction files only after the mentioned dirs are really removed
...
This should fix the issue on NFS when incompletely removed dirs may be left
after unclean shutdown (OOM, kill -9, hard reset, etc.), while the corresponding transaction
files are already removed.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-12-02 21:34:37 +02:00
Aliaksandr Valialkin
4e22b521c2
lib/storage: remove metricID with missing metricID->metricName entry
...
The metricID->metricName entry can be missing in the indexdb after unclean shutdown
when only a part of entries for new time series is written into indexdb.
Recover from such a situation by removing the broken metricID. New metricID
will be automatically created for time series with the given metricName
when new data point will arive to it.
2019-12-02 20:52:13 +02:00
Aliaksandr Valialkin
5a62415bec
lib/storage: protect from time drift during indexdb rotation
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/248
2019-12-02 14:43:11 +02:00
Aliaksandr Valialkin
cf85c567d1
lib/logger: merge file
and line
labels into location="file:line"
...
This should improve the usability for `vm_log_messages_total` metric during practical queries
2019-12-02 14:43:09 +02:00
Aliaksandr Valialkin
f055dbefda
lib/storage: generate more human-friendly result in TagFilters.String
2019-12-02 13:56:40 +02:00
Aliaksandr Valialkin
29f39f866e
lib/logger: consistency renaming from vm_log_messages_count
to vm_log_messages_total
, since this is a counter
2019-12-02 00:47:12 +02:00
Aliaksandr Valialkin
15eaff1745
lib/logger: track the number of log messages by (level, file, line)
in the vm_log_messages_count
metric
2019-12-01 18:38:30 +02:00
Aliaksandr Valialkin
d456ec7589
lib/netutil: use IPv6 for both listening and dialing if -enabledTCP6
is set
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/244
2019-12-01 02:52:53 +02:00
Aliaksandr Valialkin
7e734433a3
lib/backup: cosmetic fixes after #243
2019-11-29 18:07:41 +02:00
glebsam
4a192cb832
Add option to provide custom endpoint for S3, add option to specify S3 config profile ( #243 )
...
* Add option to provide custom endpoint for S3 for use with s3-compatible storages, add option to specify S3 config profile
* make fmt
2019-11-29 18:07:39 +02:00
Aliaksandr Valialkin
4810f1dde6
lib/netutil: add -enableTCP6
command-line flag for enabling listening for IPv6 additionally to IPv4 TCP ports
2019-11-29 17:33:07 +02:00
Aliaksandr Valialkin
409c939621
lib/backup: remove flock.lock
file in empty dirs
...
This fixes an issue when VictoriaMetrics doesn't see the restored data after the following operations:
1. Stop VictoriaMetrics.
2. Delete `<-storageDataPath>` dir.
3. Start VictoriaMetrics, then stop it.
4. Restore data from backup with `vmrestore`.
5. Start VictoriaMetrics.
`vmrestore` didn't delete properly empty dirs in `<-storageDataPath>/indexdb` because of the remaining `flock.lock` files in these dirs.
2019-11-28 13:39:28 +02:00
Aliaksandr Valialkin
0f184affa7
app/vmselect/promql: optimize binary search over big number of samples during rollup calculations
2019-11-25 14:01:54 +02:00
Aliaksandr Valialkin
d24fc87a6f
lib/decimal: calculate ln2/ln10 constant during compile time
2019-11-23 15:52:39 +02:00
Aliaksandr Valialkin
2af7ca1122
vendor: update github.com/VictoriaMetrics/metrics from v1.7.2 to v1.8.0. This version supports histograms
2019-11-23 00:21:57 +02:00
Aliaksandr Valialkin
b9e53490b9
lib/storage: move non-matching tag filters to the top at matchTagFilters
...
This should reduce the amount of useless work needed for matching the next metricNames.
2019-11-21 21:40:36 +02:00
Aliaksandr Valialkin
33d9d63393
lib/storage: speed up time series search for queries with multiple filters
...
Use optimized specialized binary search for uint64 metricIDs instead of generic sort.Search.
2019-11-21 18:43:40 +02:00
Aliaksandr Valialkin
a02a57fbe9
lib/storage: verify the number of returned metricIDs in BenchmarkHeadPostingForMatchers
2019-11-20 15:40:03 +02:00
Aliaksandr Valialkin
3d1f4408cf
lib/decimal: increase decimal->float speed conversion for integer numbers
2019-11-20 14:09:10 +02:00
Aliaksandr Valialkin
f1f2eff08f
lib/decimal: reduce rounding error when converting from decimal to float with negative exponent
...
While at it, slightly increase the conversion performance by moving fast path to the top of the loop.
2019-11-19 23:34:41 +02:00
Aliaksandr Valialkin
17eca31989
lib/backup: retrieve only the required metadata when reading GCS objects
2019-11-19 21:30:51 +02:00
Aliaksandr Valialkin
216a260ced
app/{vmbackup,vmrestore}: add -maxBytesPerSecond
command-line flag for limiting the used network bandwidth during backup / restore
2019-11-19 20:32:43 +02:00
Aliaksandr Valialkin
9d1ee1e2ae
lib/backup: prevent from restoring to directory which is in use by VictoriaMetrics during the restore
2019-11-19 18:35:59 +02:00
Aliaksandr Valialkin
6ca4b94511
lib/storage: increase the number of created time series in BenchmarkHeadPostingForMatchers in order to be on par with Promethues
...
The previous commit was accidentally creating 10x smaller number of time series than Prometheus
and this led to invalid benchmark results.
The updated benchmark results:
benchmark old ns/op new ns/op delta
BenchmarkHeadPostingForMatchers/n="1" 272756688 6194893 -97.73%
BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 10781372 -92.19%
BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 10632834 -92.11%
BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 10679975 -94.55%
BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 100118510 -98.74%
BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 154955671 -97.96%
BenchmarkHeadPostingForMatchers/i=~"" 1142371741 258003769 -77.42%
BenchmarkHeadPostingForMatchers/i!="" 9964150263 159783895 -98.40%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 10937895 -94.96%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 10990027 -94.57%
BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 87004349 -82.11%
BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 53342793 -84.79%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 54256156 -85.76%
BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 21823279 -75.62%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 46671359 -87.70%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 53915842 -87.30%
VictoriaMetrics uses 1GB of RAM during the benchmark (vs 3.5GB of RAM for Prometheus)
2019-11-18 19:48:27 +02:00
Aliaksandr Valialkin
6f61fd367a
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus
...
See the corresponding benchmark in Prometheus - 23c0299d85/tsdb/head_bench_test.go (L52)
The benchmark allows performing apples-to-apples comparison of time series search
in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness -
contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix it.
Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots:
- Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers
- VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers
Benchmark results:
benchmark old ns/op new ns/op delta
BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87%
BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14%
BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15%
BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41%
BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89%
BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84%
BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59%
BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46%
BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36%
BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62%
BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32%
BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27%
BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41%
The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics.
Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 18:47:02 +02:00
Aliaksandr Valialkin
d297b65089
lib/storage: add vm_cache_size_bytes{type="storage/hour_metric_ids"}
metric
2019-11-13 20:26:05 +02:00
Aliaksandr Valialkin
494ad0fdb3
lib/storage: remove inmemory index for recent hour, since it uses too much memory
...
Production workload shows that the index requires ~4Kb of RAM per active time series.
This is too much for high number of active time series, so let's delete this index.
Now the queries should fall back to the index for the current day instead of the index
for the recent hour. The query performance for the current day index should be good enough
given the 100M rows/sec scan speed per CPU core.
2019-11-13 18:08:58 +02:00
Aliaksandr Valialkin
633dd81bb5
lib/storage: add -disableRecentHourIndex
flag for disabling inmemory index for recent hour
...
This may be useful for saving RAM on high number of time series aka high cardinality
2019-11-13 15:10:12 +02:00
Aliaksandr Valialkin
f1620ba7c0
lib/storage: fix inmemory inverted index issues found in v1.29
...
Issues fixed:
- Slow startup times. Now the index is loaded from cache during start.
- High memory usage related to superflouos index copies every 10 seconds.
2019-11-13 13:35:38 +02:00
Aliaksandr Valialkin
87b39222be
Revert "lib/fs: do not postpone directory removal on NFS error"
...
This reverts commit 21aeb02b46649ac9906cb37733f7b155a77a0db9.
2019-11-12 16:29:50 +02:00
Mike Poindexter
955a592106
Add test for invalid caching of tsids ( #232 )
...
* Add test for invalid caching of tsids
* Clean up error handling
2019-11-12 15:52:46 +02:00
Oleg Kovalov
74ba42d111
fix misspelled words ( #229 )
2019-11-12 00:18:24 +02:00
Aliaksandr Valialkin
c48e39eea9
lib/storage: add tests for dateMetricIDCache
2019-11-11 13:21:05 +02:00
Aliaksandr Valialkin
6bdde0d6d4
lib/storage: eliminate data race when updating lastSyncTime in dateMetricIDCache.Has
2019-11-10 22:04:23 +02:00
Aliaksandr Valialkin
5f52eb7653
lib/fs: do not postpone directory removal on NFS error
...
Continue trying to remove NFS directory on temporary errors for up to a minute.
The previous async removal process breaks in the following case during VictoriaMetrics start
- VictoriaMetrics opens index, finds incomplete merge transactions and starts replaying them.
- The transaction instructs removing old directories for parts, which were already merged into bigger part.
- VictoriaMetrics removes these directories, but their removal is delayed due to NFS errors.
- VictoriaMetrics scans partition directory after all the incomplete merge transactions are finished
and finds directories, which should be removed, but weren't still removed due to NFS errors.
- VictoriaMetrics panics when it finds unexpected empty directory.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/162
2019-11-10 13:27:16 +02:00
Aliaksandr Valialkin
9ea2bd822e
lib/storage: implement per-day inverted index
2019-11-10 00:20:32 +02:00
Aliaksandr Valialkin
dea2f3efed
lib/storage: use specialized cache for (date, metricID) entries
...
This improves ingestion performance.
2019-11-09 23:09:18 +02:00
Aliaksandr Valialkin
9a43902bd8
lib/storage: remove unused code from getMetricIDsForTimeRange: it is expected that time range is always non-zero
2019-11-09 19:03:51 +02:00
Aliaksandr Valialkin
c16e17dede
lib/storage: properly set time range when deleting time series
2019-11-09 18:50:02 +02:00
Aliaksandr Valialkin
8126007c15
lib/storage: obtain all the time series ids from (tag->metricIDs) rows instead of (metricID->TSID) rows, since this much faster
2019-11-09 18:04:26 +02:00
Aliaksandr Valialkin
50773348d3
lib/storage: small code prettifying
2019-11-09 14:01:24 +02:00
Aliaksandr Valialkin
44fa8226df
lib/uint64set: remove superflouos check for item existence before deleting it in Set.Subtract
2019-11-09 14:01:24 +02:00
Aliaksandr Valialkin
0bc54c23ce
lib/storage: inmemoryInvertedIndex prettifying
2019-11-09 14:01:24 +02:00
Aliaksandr Valialkin
46e67bb78c
lib/storage: export vm_new_timeseries_created_total
metric for determining time series churn rate
2019-11-08 19:58:21 +02:00
Aliaksandr Valialkin
0063c857f5
lib/storage: add inmemory inverted index for the last hour
...
It should improve performance for `last N hours` dashboards with update intervals smaller than 1 hour.
2019-11-08 19:37:46 +02:00