Aliaksandr Valialkin
825a2dd554
lib/procutil: prevent app termination on SIGHUP signal, since this signal is frequently used for config reload
2020-04-30 02:09:27 +03:00
Aliaksandr Valialkin
01c17092e1
lib/httpserver: mention that -http.maxGracefulShutdownDuration command-line flag value can be increased on shutdown timeout
2020-04-30 01:38:06 +03:00
Aliaksandr Valialkin
5ec036439d
lib/promscrape: set 30 seconds timeout for discovery api requests
...
Previously such requests could hang for a long time. This could make debugging harder.
2020-04-29 17:33:34 +03:00
Aliaksandr Valialkin
43c39dc36c
vendor: use github.com/VictoriaMetrics/fasthttp instead of github.com/fasthttp/fasthttp
...
The upstream fasthttp may contain issues like 996610f021, plus code that isn't used by VictoriaMetrics. So let's use a private copy under our control instead.
2020-04-29 17:33:34 +03:00
Aliaksandr Valialkin
4e4f57b121
lib/metricsql: move it to a separate repository - github.com/VictoriaMetrics/metricsql
2020-04-28 15:28:22 +03:00
Aliaksandr Valialkin
83aca79137
lib/storage: recover when metricID->metricName entry is missing in the inverted index after unclean shutdown
...
Newly added index entries can be missing after unclean shutdown, since they haven't been flushed to persistent storage yet.
Log this and delete the corresponding metricID, so it can be re-created next time.
2020-04-28 12:00:33 +03:00
Aliaksandr Valialkin
521df0e2fc
lib/promscrape: handle connection reset when a target responds with an HTTP redirect
2020-04-28 02:13:02 +03:00
肖贝贝
2b16c188e8
fix: vmagent does not follow 301/302 redirects (#445)
...
Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>
2020-04-28 01:29:37 +03:00
Aliaksandr Valialkin
303905cd84
lib/{encoding,decimal}: typo fixes in tests: epxecting->expecting
2020-04-28 00:01:55 +03:00
Aliaksandr Valialkin
36fa3078c2
lib/encoding: reduce possibility of failure in TestMarshalInt64ArraySize
2020-04-28 00:01:54 +03:00
Aliaksandr Valialkin
95942f1ac6
lib/promscrape/discovery/gce: make golangci-lint happy
2020-04-27 19:28:10 +03:00
Aliaksandr Valialkin
b768bc9a6a
lib/promscrape: add initial support for Prometheus-compatible service discovery for Amazon EC2 aka ec2_sd_configs
2020-04-27 19:25:53 +03:00
Aliaksandr Valialkin
de59703a16
lib/promscrape/discovery/gce: properly set filter query arg in api url
2020-04-27 16:01:17 +03:00
Aliaksandr Valialkin
b4afe562c1
lib/storage: postpone reading data from blocks during search
...
This eliminates the need for storing block data into temporary files on a single-node VictoriaMetrics
during heavy queries, which touch a big number of time series over long time ranges.
This improves single-node VM performance on heavy queries by up to 2x.
2020-04-27 11:45:24 +03:00
Aliaksandr Valialkin
0224071ebe
lib/promscrape/discovery/gce: allow empty project and zone for gce_sd_config
2020-04-27 11:45:02 +03:00
Aliaksandr Valialkin
6954d0edb7
lib/promscrape/discovery/gce: allow empty zone arg in gce_sd_config - in this case zones for the given project are automatically discovered
2020-04-26 14:34:11 +03:00
kreedom
fb967ae6c8
happy fmt
2020-04-26 14:16:32 +03:00
Aliaksandr Valialkin
d7c1ff8b0c
lib/storage: improve deduplication algorithm
...
Now it leaves only the first data point on each `-dedup.minScrapeInterval` interval.
Previously it could leave two data points on the interval. This could lead to unexpected results
for `histogram_quantile(phi, sum(rate(buckets)) by (le))` query.
2020-04-26 13:10:02 +03:00
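The deduplication rule described in the commit above (keep only the first data point on each `-dedup.minScrapeInterval` interval) can be sketched as follows. This is an illustration, not the lib/storage code: the function name `dedupFirstPerInterval` is hypothetical, and aligning intervals to multiples of the interval is an assumption made here for simplicity.

```go
package main

import "fmt"

// dedupFirstPerInterval keeps only the first sample in each interval-wide
// window. Timestamps must be non-negative, sorted in ascending order and
// expressed in the same unit as interval.
// Assumption: windows are aligned to multiples of interval.
func dedupFirstPerInterval(timestamps []int64, values []float64, interval int64) ([]int64, []float64) {
	if interval <= 0 || len(timestamps) == 0 {
		return timestamps, values
	}
	outT := make([]int64, 0, len(timestamps))
	outV := make([]float64, 0, len(values))
	lastWindow := int64(-1)
	for i, ts := range timestamps {
		window := ts / interval
		if window == lastWindow {
			// A sample from this window has already been kept.
			continue
		}
		lastWindow = window
		outT = append(outT, ts)
		outV = append(outV, values[i])
	}
	return outT, outV
}

func main() {
	ts := []int64{0, 400, 1000, 1500, 2100}
	vs := []float64{1, 2, 3, 4, 5}
	outT, outV := dedupFirstPerInterval(ts, vs, 1000)
	fmt.Println(outT, outV) // prints: [0 1000 2100] [1 3 5]
}
```

With exactly one sample per interval, `rate()` over histogram buckets sees the same number of points for every bucket, which is why the two-points-per-interval case could skew `histogram_quantile` results.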
Aliaksandr Valialkin
491b31b369
lib/storage: postpone label filters matching too many time series instead of giving up with error
...
This should reduce the frequency of the following errors:
cannot find tag filter matching less than N time series; either increase -search.maxUniqueTimeseries or use more specific tag filters
more than N time series found on the time range [...]; either increase -search.maxUniqueTimeseries or shrink the time range
2020-04-24 21:13:50 +03:00
Aliaksandr Valialkin
7b8008e0bd
lib/promscrape/discovery/gce: make golint happy by ignoring resp.Body.Close() result
2020-04-24 18:13:09 +03:00
Aliaksandr Valialkin
9ef5935552
lib/promscrape: initial implementation for gce_sd_configs aka Prometheus-compatible service discovery for Google Compute Engine
2020-04-24 17:51:22 +03:00
Aliaksandr Valialkin
24461153bf
lib/promscrape: query /api/v1/namespaces/* for the configured namespaces in kubernetes_sd_config
...
This should fix authorization issues described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/432
2020-04-24 14:33:50 +03:00
Aliaksandr Valialkin
00e897119f
lib/promscrape: add -promscrape.configCheckInterval command-line flag for automating config checking
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/431
2020-04-23 23:41:08 +03:00
Aliaksandr Valialkin
a9a7a7175e
lib/promscrape: access Config entries by reference, so they can be compared by addresses
2020-04-23 14:38:20 +03:00
Aliaksandr Valialkin
1c5d14a2eb
lib/promscrape: move KubernetesSDConfig to lib/promscrape/discovery/kubernetes
2020-04-23 11:34:22 +03:00
Aliaksandr Valialkin
a714568374
lib/promscrape/discovery/kubernetes: hide role switch logic behind GetLabels function
2020-04-22 22:16:11 +03:00
Aliaksandr Valialkin
364db13c9c
app/vmselect: add /api/v1/status/tsdb page with useful stats for locating root cause for high cardinality issues
...
See https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-stats
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/425
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/268
2020-04-22 22:03:43 +03:00
Aliaksandr Valialkin
2dc5593b75
lib/writeconcurrencylimiter: improve docs for -maxConcurrentInserts command-line flag
2020-04-20 21:03:00 +03:00
Aliaksandr Valialkin
5454b518a6
lib/promscrape/discovery/kubernetes: reuse a client for empty api_server inside different jobs
2020-04-20 17:07:11 +03:00
Aliaksandr Valialkin
43375df923
lib/promscrape/discovery/kubernetes: update stale comments
2020-04-17 14:06:20 +03:00
Aliaksandr Valialkin
5d1537a395
lib/promscrape: suppress scrape errors if -promscrape.suppressScrapeErrors flag is set
2020-04-16 23:41:30 +03:00
Aliaksandr Valialkin
600490131f
lib/promscrape: print all the labels for the target on error message for failed scrape
...
This should improve debuggability.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/420
2020-04-16 23:35:05 +03:00
Aliaksandr Valialkin
bd4c6d21dd
lib/promscrape: retry target scraping when the target closes previously established keep-alive connection to it
...
This should fix the following error:
the server closed connection before returning the first response byte. Make sure the server returns 'Connection: close' response header before closing the connection
2020-04-16 23:25:29 +03:00
Aliaksandr Valialkin
2fd2dec5eb
lib/logger: typo fix
2020-04-16 00:19:10 +03:00
Aliaksandr Valialkin
071fdf5518
lib/logger: add WARN level for logging expected errors such as invalid user queries
2020-04-15 20:50:26 +03:00
Aliaksandr Valialkin
6f7f64f757
app/vmselect: handle timestamp(metric offset X) the same way as Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/415
2020-04-15 12:01:00 +03:00
Aliaksandr Valialkin
426a0567c4
lib/promscrape: code cleanup in runScraper func
2020-04-15 11:36:24 +03:00
Aliaksandr Valialkin
c1de3f67b4
lib/storage: skip metricID if the corresponding metricID->metricName is missing in inverted index during search
...
This case is possible when the corresponding metricID->metricName entry didn't propagate to inverted index yet.
This should fix the following error:
error when searching tsids for tfss [...]: cannot find metricName by metricID 1582417212213420669: EOF
2020-04-15 00:06:43 +03:00
Aliaksandr Valialkin
067c7afebc
lib/promscrape: show information on improperly configured scrape targets at the bottom of /targets page
...
This is a common error with improperly configured target autodiscovery and/or relabeling.
This error leads to duplicate scraping of the same targets with the same set of labels, which leads
to duplicate samples in time series.
2020-04-14 14:55:05 +03:00
Aliaksandr Valialkin
ac35635b71
lib/promscrape/discovery/kubernetes: remove only unused client for API server during cleaning
2020-04-14 14:19:21 +03:00
Aliaksandr Valialkin
78863d7066
lib/promscrape: add promrelabel.GetLabelValueByName helper function
2020-04-14 14:12:01 +03:00
Aliaksandr Valialkin
c64f003cfb
lib/promscrape: mention job name in error messages when target cannot be scraped
...
This should improve debuggability
2020-04-14 13:33:13 +03:00
Aliaksandr Valialkin
4718a5d951
lib/promscrape: reset ScrapeWork.ID in tests
2020-04-14 13:31:31 +03:00
Aliaksandr Valialkin
257521a634
lib/promscrape: properly expose statuses for targets with duplicate scrape urls at /targets page
...
Previously targets with duplicate scrape urls were merged into a single line on the page.
Now each target with duplicate scrape url is displayed on a separate line.
2020-04-14 13:10:01 +03:00
Aliaksandr Valialkin
6a75c95194
lib/promscrape: remove labels starting with __meta_ after applying relabel_configs as Prometheus does
...
This should reduce CPU load during scraping when target discovery generates
a big number of `__meta_*` labels (for instance, k8s discovery).
See https://www.robustperception.io/life-of-a-label for details.
2020-04-14 12:23:22 +03:00
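A minimal sketch of the cleanup step described in the commit above: once relabel_configs have been applied, labels whose names start with `__meta_` are dropped. The `Label` type and the `dropMetaLabels` function are hypothetical names used for illustration only and are not taken from lib/promscrape.

```go
package main

import (
	"fmt"
	"strings"
)

// Label is a hypothetical name/value pair attached to a scrape target.
type Label struct {
	Name  string
	Value string
}

// dropMetaLabels returns a copy of labels without the entries whose names
// start with "__meta_". It is meant to run after relabeling, when the
// discovery metadata is no longer needed.
func dropMetaLabels(labels []Label) []Label {
	out := make([]Label, 0, len(labels))
	for _, l := range labels {
		if strings.HasPrefix(l.Name, "__meta_") {
			continue
		}
		out = append(out, l)
	}
	return out
}

func main() {
	labels := []Label{
		{Name: "__address__", Value: "10.0.0.1:8080"},
		{Name: "__meta_kubernetes_pod_name", Value: "app-1"},
		{Name: "job", Value: "app"},
	}
	fmt.Println(dropMetaLabels(labels))
}
```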
Aliaksandr Valialkin
01d7d799dc
lib/promscrape: rename 'scrape_config->scrape_limit' to 'scrape_config->sample_limit'
...
`scrape_config` block from Prometheus config contains `sample_limit` field,
while in `vmagent` this field was mistakenly named as `scrape_limit`.
2020-04-14 11:59:57 +03:00
Aliaksandr Valialkin
2e4e202c2b
lib/promscrape: add initial support for kubernetes_sd_config
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/334
2020-04-13 21:03:28 +03:00
Aliaksandr Valialkin
2814b1490f
lib/promscrape: add -promscrape.config.strictParse flag for detecting errors in -promscrape.config file
2020-04-13 13:15:44 +03:00
Aliaksandr Valialkin
90b4a6dd12
lib/promscrape: extract common auth code to lib/promauth
2020-04-13 12:59:10 +03:00
Aliaksandr Valialkin
4de6c6bbf0
lib/storage: disable deduplication after dedup tests are complete
...
The rest of tests expect that the de-duplication is disabled.
2020-04-10 17:28:31 +03:00
Aliaksandr Valialkin
ded0c0d3c7
lib/storage: correctly handle -dedup.minScrapeInterval values smaller than 8ms
...
Such small values may be used for removing samples with duplicate timestamps.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/409 for details.
2020-04-10 16:36:41 +03:00
Aliaksandr Valialkin
7d73623c69
lib/{storage,mergeset}: make sure that requests and misses cache counters never go down
2020-04-10 14:45:01 +03:00
Aliaksandr Valialkin
e62afc7366
lib/protoparser: add -*TrimTimestamp command-line flags for Influx, Graphite, OpenTSDB and CSV data
...
These flags can be used for reducing disk space usage for timestamps data ingested over the given protocols
2020-04-10 12:44:39 +03:00
Aliaksandr Valialkin
0681b4c27a
lib/workingsetcache: accumulate stat counters on cache rotation
...
This should prevent cache stats counters from going down after cache rotation,
which may corrupt `cache hit ratio` graph on the official Grafana dashboards
when using the following query:
1 - (sum(rate(vm_cache_misses_total[5m])) by (type) / sum(rate(vm_cache_requests_total[5m])) by (type))
2020-04-10 11:51:40 +03:00
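The counter accumulation described in the commit above can be sketched as follows: when the cache is rotated, the counters of the dropped generation are added to running totals, so the exported values never decrease and `rate()` in the query above stays meaningful. The types `cacheGen` and `rotatingCache` are hypothetical and do not mirror the lib/workingsetcache implementation.

```go
package main

import "fmt"

// cacheGen holds the counters of a single cache generation.
type cacheGen struct {
	requests uint64
	misses   uint64
}

// rotatingCache is a hypothetical cache that periodically drops its current
// generation while keeping the exported counters monotonic.
type rotatingCache struct {
	curr        cacheGen
	accRequests uint64 // requests accumulated from dropped generations
	accMisses   uint64 // misses accumulated from dropped generations
}

// get records a cache lookup; hit tells whether the entry was found.
func (c *rotatingCache) get(hit bool) {
	c.curr.requests++
	if !hit {
		c.curr.misses++
	}
}

// rotate drops the current generation and preserves its counters.
func (c *rotatingCache) rotate() {
	c.accRequests += c.curr.requests
	c.accMisses += c.curr.misses
	c.curr = cacheGen{}
}

// stats returns counters that never go down across rotations.
func (c *rotatingCache) stats() (requests, misses uint64) {
	return c.accRequests + c.curr.requests, c.accMisses + c.curr.misses
}

func main() {
	var c rotatingCache
	c.get(false) // miss
	c.get(true)  // hit
	before, _ := c.stats()
	c.rotate()
	after, _ := c.stats()
	fmt.Println(before, after) // prints: 2 2
}
```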
Aliaksandr Valialkin
f86947d55c
lib/memory: add more details to -memory.allowedPercent help message
2020-04-09 15:28:53 +03:00
kreedom
298eb0a0f8
[vmalert] improve external url handling
2020-04-01 22:29:11 +03:00
Aliaksandr Valialkin
cdf0a4cf8f
lib/httpserver: remove unnecessary http.HandlerFunc wrapper in gzipHandler
2020-04-01 18:14:17 +03:00
Aliaksandr Valialkin
e0d0348f36
lib/storage: add missing reset for tagFilter.matchesEmptyValue on tagFilter.Init
2020-04-01 17:42:44 +03:00
Aliaksandr Valialkin
3e55c7e069
lib/promscrape: reduce timestamp jitter when scraping targets
...
This should improve compression for timestamps
2020-04-01 16:11:35 +03:00
Aliaksandr Valialkin
c4acd20d2a
lib/storage: remove duplicate data points on 7/8*minScrapeInterval interval instead of 1/2*minScrapeInterval
...
This should reduce storage usage and should improve deduplication accuracy
2020-04-01 15:48:48 +03:00
Aliaksandr Valialkin
b699c46046
lib/storage: handle errors returned from TagFilters.Add when cloning TagFilters with negative filter
2020-03-31 16:18:02 +03:00
Aliaksandr Valialkin
972713bd79
lib/storage: add fast path for the previous indexdb search if it doesn't contain per-day inverted index yet
2020-03-31 12:51:21 +03:00
Aliaksandr Valialkin
5d99ca6cfc
lib/storage: optimize per-day inverted index search for tag filters matching big number of time series
...
- Sort tag filters in the ascending number of matching time series
in order to apply the most specific filters first.
- Fall back to metricName search for filters matching big number of time series
(usually these are negative filters or regexp filters).
2020-03-31 00:48:35 +03:00
Aliaksandr Valialkin
318326c309
lib/storage: properly handle {label=~"foo|"} filters as Prometheus does
...
Such filters must match all the time series with `label="foo"` plus all the time series without `label`.
Previously only time series with `label="foo"` were matched.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/395
2020-03-31 00:48:18 +03:00
Aliaksandr Valialkin
b98ca56d94
lib/envflag: add -envflag.prefix for setting optional prefix for environment vars
2020-03-30 15:51:19 +03:00
kreedom
bf6c24d0f4
[vmalert] config parser ( #393 )
...
* [vmalert] config parser
* make linter be happy
* fix test
* fix sprintf add test for rule validation
2020-03-29 01:48:30 +02:00
Aliaksandr Valialkin
149f365f74
lib/httpserver: add -http.maxGracefulShutdownDuration command-line flag for tuning the maximum duration required for graceful shutdown of http server
2020-03-27 21:23:30 +02:00
Aliaksandr Valialkin
047849e855
lib/uint64set: remove zero buckets after Set.Intersect
2020-03-27 01:15:58 +02:00
Aliaksandr Valialkin
f3ec424e7d
lib/uint64set: small code cleanup and perf tuning
...
* Remember the last accessed bucket on Has() call.
* Inline fast paths inside Add() and Has() calls.
* Remove fragile code with maxUnsortedBuckets inside bucket32.
2020-03-25 15:30:25 +02:00
Aliaksandr Valialkin
dde4a97534
lib/uint64set: go fmt
2020-03-24 22:28:43 +02:00
Aliaksandr Valialkin
f3e0c55ea1
lib/storage: serialize snapshot creation process with mutex
...
This guarantees that the snapshot contains all the recently added data
from inmemory buffers when multiple concurrent calls to Storage.CreateSnapshot are performed.
2020-03-24 22:27:05 +02:00
Aliaksandr Valialkin
97fb0edd07
lib/uint64set: added more tests
2020-03-24 22:27:04 +02:00
Aliaksandr Valialkin
df91d2d91f
lib/storage: remove obsolete code
2020-03-13 22:48:17 +02:00
Aliaksandr Valialkin
499594f421
lib/promscrape: allow overriding external_labels as Prometheus does
...
Prometheus docs at https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config say:
> In communication with external systems, they are always applied only
> when a time series does not have a given label yet and are ignored otherwise.
Though this may result in consistency chaos when scrape targets override `external_labels`,
let's stick with Prometheus behavior for the sake of backwards compatibility.
There is a last resort in vmagent with `-remoteWrite.label`, which consistently
sets the configured labels to all the metrics before sending them to remote storage.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/366
2020-03-12 20:24:42 +02:00
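The Prometheus behavior quoted in the commit above (external labels are applied only when a time series does not have the given label yet) can be sketched as follows. The function `applyExternalLabels` is a hypothetical name, and the map-based label representation is an assumption for illustration.

```go
package main

import "fmt"

// applyExternalLabels returns a copy of series with the external labels
// added. Labels already present on the series win over external labels.
func applyExternalLabels(series, external map[string]string) map[string]string {
	out := make(map[string]string, len(series)+len(external))
	for name, value := range series {
		out[name] = value
	}
	for name, value := range external {
		if _, ok := out[name]; ok {
			// The series overrides the external label.
			continue
		}
		out[name] = value
	}
	return out
}

func main() {
	series := map[string]string{"__name__": "up", "datacenter": "dc2"}
	external := map[string]string{"datacenter": "dc1", "env": "prod"}
	fmt.Println(applyExternalLabels(series, external))
}
```

In this sketch the series keeps `datacenter="dc2"` and only gains `env="prod"`, which is the kind of inconsistency the commit message warns about.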
Aliaksandr Valialkin
fdc2a9d1d7
app/vmselect: add label_map(q, label, srcValue1, dstValue1, ... srcValueN, dstValueN) function to MetricsQL
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/369
2020-03-12 19:13:47 +02:00
Aliaksandr Valialkin
c8dc1cd218
lib/protoparser/csvimport: add missing metric vm_rows_invalid_total{type="csvimport"}
2020-03-12 15:27:45 +02:00
Aliaksandr Valialkin
d4beb17ebe
lib/promscrape: remove possible races when registering and de-registering scrape workers for /targets page
2020-03-11 16:30:21 +02:00
Aliaksandr Valialkin
cdf70b7944
lib/promscrape: consistently update /targets page after SIGHUP
2020-03-11 03:20:03 +02:00
Aliaksandr Valialkin
1fe66fb3cc
app/{vmagent,vminsert}: add support for importing csv data via /api/v1/import/csv
2020-03-10 21:15:35 +02:00
Aliaksandr Valialkin
49d7cb1a3f
all: fix golangci-lint issues
2020-03-10 19:41:46 +02:00
Aliaksandr Valialkin
7c432da788
lib/promscrape: do not retry idempotent requests when scraping targets
...
This should prevent the following unexpected side-effects of idempotent request retries:
- increased actual timeout when scraping the target compared to the configured scrape_timeout
- increased load on the target
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/357
2020-03-09 13:31:52 +02:00
Aliaksandr Valialkin
986dba5ab3
app/vmagent: do not allow non-supported fields in -remoteWrite.relabelConfig and file_sd_configs
...
This should reduce possible confusion like in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/363
2020-03-06 20:19:13 +02:00
Aliaksandr Valialkin
c5f894b361
Makefile: add build and test rules with enabled race detector. These rules have -race suffix
...
Fix also `unsafe pointer conversion` errors detected by Go1.14. See https://golang.org/doc/go1.14#compiler .
2020-03-05 12:03:38 +02:00
Aliaksandr Valialkin
9a944fd169
lib/promscrape: consistency renaming: stopCh -> globalStopCh
2020-03-03 20:08:08 +02:00
Aliaksandr Valialkin
76036c1897
app/vmagent: add -remoteWrite.maxDiskUsagePerURL for limiting the maximum disk usage for each -remoteWrite.url buffer
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/352
2020-03-03 19:49:07 +02:00
Aliaksandr Valialkin
1d7ab78b55
lib/protoparser/prometheus: allow trailing comma in tags list
...
The trailing comma is generated by cloudwatch exporter.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/350
2020-03-02 22:22:09 +02:00
Aliaksandr Valialkin
b785429ddb
lib/protoparser: metrics renaming: vm_protoparser_<type>_* -> vm_protoparser_*{type="<type>"}
...
This should improve composability of these metrics in PromQL queries
2020-02-28 20:20:10 +02:00
Aliaksandr Valialkin
e22fdc1073
lib/persistentqueue: reset chunk file when the persistent queue is empty
2020-02-28 20:05:53 +02:00
Aliaksandr Valialkin
18af31a4c2
all: properly split vm_deduplicated_samples_total among cluster components
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/345
2020-02-27 23:48:07 +02:00
Aliaksandr Valialkin
6819db5686
lib/envflag: typo fix in docs to -envflag.enable: envoronment->environment
2020-02-27 21:47:58 +02:00
Aliaksandr Valialkin
b63e4464f4
lib/promscrape: properly reload new configs on SIGHUP
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/335
2020-02-26 13:54:00 +02:00
Aliaksandr Valialkin
6739c2749d
lib/promscrape: go fmt
2020-02-25 20:56:44 +02:00
Aliaksandr Valialkin
7a33da8fea
lib/promscrape: do not add missing port to __address__ label in order to be consistent with Prometheus behavior
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/331
2020-02-25 20:49:50 +02:00
Aliaksandr Valialkin
4e24839a2c
app/vmagent: do not allow sending unpacked requests with sizes exceeding -maxInsertRequestSize
2020-02-25 19:34:41 +02:00
Aliaksandr Valialkin
6386aeb1e0
app/vmagent: add ability to accept Influx line protocol data via TCP and UDP
...
Just set `-influxListenAddr` command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/333
2020-02-25 19:12:49 +02:00
Aliaksandr Valialkin
7ef7c9368e
lib/fs: typo fix: read blocks bigger than 8KB via pread() call instead of using mmap
2020-02-25 18:05:06 +02:00
Aliaksandr Valialkin
fed2959658
lib/envflag: substitute dots with underscores in env var names if -envflag.enable is set
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/311
2020-02-24 21:14:44 +02:00
Aliaksandr Valialkin
04762344c6
app/vmagent: initial implementation for vmagent
2020-02-23 13:36:03 +02:00
Aliaksandr Valialkin
d21cb43e48
lib/storage: add vm_ prefix to deduplicated_samples_total metric to be consistent with other metrics
2020-02-21 19:33:59 +02:00
Aliaksandr Valialkin
71a52f5f90
lib/protoparser/prometheus: skip leading whitespace from tag names
2020-02-16 19:06:33 +02:00