Aliaksandr Valialkin
4fa97430d7
app/{vminsert,vmagent}: allow adding extra labels when importing data via Prometheus, CSV and JSON line formats
...
Extra labels may be added to the imported data by passing `extra_label=name=value` query args.
Multiple query args may be passed in order to add multiple extra labels.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/719
2020-09-02 19:47:02 +03:00
Aliaksandr Valialkin
95ce89e7d7
lib/promscrape: use the number of parsed rows as a basis for writeRequestCtxPool leveling
...
The previous basis on `cap(sw.labels)` doesn't work anymore after 7785869ccc
,
because `sw.labels` may be reset multiple times when processing big number of rows.
2020-09-02 18:46:55 +03:00
Aliaksandr Valialkin
bc1ca4b20b
lib/httpserver: add -http.idleConnTimeout
command-line flag for tuning the timeout for incoming idle http connections
2020-09-01 15:33:31 +03:00
Aliaksandr Valialkin
a01c56104a
lib/promscrape: fix applying sample_limit when scraping targets with big number of metrics
...
This has been broken at 7785869ccc
2020-09-01 11:09:25 +03:00
Aliaksandr Valialkin
deff8d419a
lib/promscrape: reduce memory usage when scraping targets with millions of metrics
...
This should help when scraping /federate endpoints from Prometheus instances,
which scrape millions of metrics. See https://prometheus.io/docs/prometheus/latest/federation/
2020-09-01 10:55:24 +03:00
Aliaksandr Valialkin
5f2277624a
lib/{promscrape,leveledbytebufferpool}: rename getPoolIdAndCapacity to getPoolIDAndCapacity in order to make golint happy
2020-08-28 09:49:22 +03:00
Aliaksandr Valialkin
45e770ed20
lib/cgroup: limit the maximum GOMAXPROCS value to the number of available CPU cores
...
There is no sense in setting GOMAXPROCS to value higher than the number of available CPU cores.
2020-08-28 09:49:22 +03:00
Roman Khavronenko
34ef10fbcc
lib/flagutil: avoid int overflow for arch 386 ( #710 )
...
Arch 386 is a 32-bit architecture and interprets int type for numbers as an explicit int32,
whereas on most modern CPUs int is implicitly an int64. This makes tests to fail with
`int overflow` error.
2020-08-28 09:46:35 +03:00
Aliaksandr Valialkin
9a77ae9d1c
lib/promscrape: reduce memory usage when scraping targets with big number of metrics alongside targets with small number of labels
...
Previously targets with big number of metrics and/or labels could generated too big buffers,
which then could be re-used when scraping targets with small number of metrics.
This resulted in memory waste.
Now big buffers are used only for targets with big number of metrics / labels,
while small buffers are used for targets with small number of metrics / labels.
2020-08-16 22:30:34 +03:00
Aliaksandr Valialkin
3ea6444219
lib/leveledbytebufferpool: allocate byte buffers with capacity rounded to the upper boundary for the given bucket
...
This should reduce the number of resizings for the returned byte buffers.
2020-08-16 22:13:38 +03:00
Roman Khavronenko
4b89da9463
lib/decimal: rename significant decimal digits
to significant figures
( #698 )
...
The previous notion was inconsistent with what `decimal.Round` does.
According to [wiki](https://en.wikipedia.org/wiki/Significant_figures ) rounding
applied to all significant figures, not just decimal ones.
2020-08-16 17:22:40 +03:00
Aliaksandr Valialkin
6aab2f4989
all: allow using KB
, MB
, GB
, KiB
, MiB
and GiB
suffixes in command-line flag values related to byte sizes or byte rates
2020-08-16 17:08:28 +03:00
Aliaksandr Valialkin
1b5467f7fd
lib/memory: improve log message about the memory allowed to use by VictoriaMetrics
2020-08-16 17:08:27 +03:00
Aliaksandr Valialkin
d9f7ea1c6e
lib/protoparser: removed unnecessary call to SetReadDeadline when reading a stream of data
...
The OS should return any buffered data in the stream without the need to set the read timeout.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/696
2020-08-15 15:38:35 +03:00
Aliaksandr Valialkin
edf3ee2a9b
lib: dump compressed block contents on error during decompression
...
This should improve detecting root cause for https://github.com/facebook/zstd/issues/2222
2020-08-15 14:51:14 +03:00
Aliaksandr Valialkin
6e863376f7
lib/leveledbytebufferpool: pre-allocate byte slice with the given capacity if the pool is empty
...
This should reduce memory allocations and copying when the byte slice is growing.
2020-08-15 01:41:59 +03:00
Aliaksandr Valialkin
3efa4e4e1c
lib/protoparser: move common code for detecting timeouts to ReadLinesBlockExt
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/696
2020-08-14 20:39:51 +03:00
Aliaksandr Valialkin
c6b0547847
lib/protoparser: prevent from busy loop on repeated timeout errors when reading streams of ingested data
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/696
2020-08-14 20:13:37 +03:00
Aliaksandr Valialkin
9d79a3a99d
lib/memory: add -memory.allowedBytes
command-line flag for setting absolute memory limit for VictoriaMetrics caches
2020-08-14 19:19:10 +03:00
Aliaksandr Valialkin
b996280c65
app/{vminsert,vmagent}: improve documentation for -influxListenAddr
command-line flag
2020-08-14 18:03:08 +03:00
Aliaksandr Valialkin
c82a485cf6
lib/protoparser/prometheus: typo fix in error message
2020-08-14 11:04:15 +03:00
Aliaksandr Valialkin
9e67343756
lib/promscrape: use a hint on body length instead of body capacity
...
This should reduce memory usage for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/689
2020-08-14 01:17:46 +03:00
Aliaksandr Valialkin
b4119bb51e
lib/promscrape: reduce memory usage when scraping big number of targets
...
Thanks to @dxtrzhang for the original idea at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/688
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/689
2020-08-14 01:05:04 +03:00
Aliaksandr Valialkin
1724cc241e
lib/promscrape: properly retry requests on the server closed connection before returning the first response byte
error during service discover API calls and target scrapes
2020-08-13 22:32:29 +03:00
Aliaksandr Valialkin
60c7397be5
all: support %{ENV_VAR}
placeholders in yaml configs in all the vm* components
...
Such placeholders are substituted by the corresponding environment variable values.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/583
2020-08-13 17:17:06 +03:00
Aliaksandr Valialkin
6721e47ae9
app: respect CPU limits set via cgroups
...
Update GOMAXPROCS to limits set via cgroups. This should reduce CPU trashing and reduce memory usage
for cases when VictoriaMetrics components run in containers with CPU limits.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685
2020-08-11 23:01:03 +03:00
Aliaksandr Valialkin
59d95961b8
lib/protoparser: clarify that the string passed to Unmarshal()
function must remain available when the parsed rows are in use
2020-08-11 17:05:21 +03:00
Aliaksandr Valialkin
890cfe5b61
lib/protoparser/influx: accept precision=us
and precision=µ
according to https://docs.influxdata.com/influxdb/v1.8/tools/api/#write-http-endpoint
2020-08-10 20:23:20 +03:00
Aliaksandr Valialkin
e3439a6cd0
lib/promscrape: optimize per-metric hash calculations
...
This increases vmagent performance by up to 10% when scraping big number of metrics
2020-08-10 19:47:50 +03:00
Aliaksandr Valialkin
4ce1368e4b
lib/storage: mention time range used in the query that led to error message
...
This should improve detecting slow queries with too big time ranges
2020-08-10 13:46:29 +03:00
Aliaksandr Valialkin
f92255e803
lib/storage: mention tag filters used in the query that led to error message
...
This should improve detecting invalid or heavy queries that lead to errors.
2020-08-10 13:36:54 +03:00
Aliaksandr Valialkin
b3d4ff7ee2
app/vmstorage: improve error logging when the request times out
2020-08-10 13:17:24 +03:00
Aliaksandr Valialkin
e3999ac010
lib/promscrape: show real timestamp and real duration for the scape on /targets
page
...
Previously the scrape duration may be negative when calculated scrape timestamp drifts away from the real scrape timestamp
2020-08-10 12:40:49 +03:00
Aliaksandr Valialkin
c09c881264
lib/promscrape: make errcheck happy
2020-08-09 13:17:30 +03:00
Aliaksandr Valialkin
2dfb42a8b4
lib/promscrape: export scrape_samples_added
per-target metric like Prometheus does
...
This metric may be useful for detecting targets with high churn rate for the exported metrics.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/683
2020-08-09 12:45:30 +03:00
Aliaksandr Valialkin
fd9f1463df
lib/fs: use WARN instead of ERROR log level for the message when NFS diretory removal temporarily fails
...
this is expected condition, so it is better to use WARN log level for it
2020-08-09 12:07:35 +03:00
Aliaksandr Valialkin
d4be3efc60
lib/promscrape: add a test for scrape config for blackbox exporter
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/684
2020-08-09 12:03:51 +03:00
Aliaksandr Valialkin
67cacb22ac
lib/httpserver: add -tls
, -tlsCertFile
and -tlsKeyFile
command-line flags in every vm binary
...
This makes such binaries compatible with binaries from `master` branch (aka single-node version)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/677
2020-08-07 10:57:32 +03:00
Aliaksandr Valialkin
307281e922
lib/storage: slow down concurrent searches when the number of concurrent inserts reaches the limit
...
This should improve data ingestion performance when heavy searches are executed
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618
2020-08-07 08:49:13 +03:00
Aliaksandr Valialkin
dd1d59f57a
lib/storage: properly check timeouts and pace limits
...
Previously they were checked on every iteration for small number of iterations
2020-08-07 08:40:56 +03:00
Aliaksandr Valialkin
a2039b3bbc
app/vmselect: return the upper bound on the number of found time series from storage.Search.Init
...
This is used by a single-node version in order to reduce memory allocations during search.
See bc8381613d
for details.
2020-08-06 19:20:31 +03:00
Aliaksandr Valialkin
b690eeff53
lib/storage: reduce the frequency (and overhead) for timeout and pace limiter checks by 4x
2020-08-06 18:45:47 +03:00
Aliaksandr Valialkin
6c0a92a1ee
lib/pacelimiter: increase scalability for multi-CPU system
2020-08-06 18:33:07 +03:00
Aliaksandr Valialkin
13f8644f8e
lib/storage: optimize prefetching metric names for the given metricIDs
2020-08-06 16:52:58 +03:00
Aliaksandr Valialkin
f789e0fa44
lib/fs: export vm_nfs_pending_dirs_to_remove
metric for monitoring the number of pending directories that couldn't be removed due to NFS lock
2020-08-06 15:31:50 +03:00
Aliaksandr Valialkin
a3e91c593b
lib/storage: limit the number of concurrent calls to storage.searchTSIDs to GOMAXPROCS*2
...
This should limit the maximum memory usage and reduce CPU trashing on vmstorage
when multiple heavy queries are executed.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2020-08-05 18:27:21 +03:00
Aliaksandr Valialkin
76064ba9e7
Perform conversion from string to []byte according to rule #6 at https://golang.org/pkg/unsafe/#Pointer
2020-08-05 11:55:12 +03:00
Aliaksandr Valialkin
8cc2e01386
lib/backup: allow using ~/.aws/config
without region
...
Use us-west-2 for determining bucket region.
2020-08-04 13:08:05 +03:00
Aliaksandr Valialkin
a2aa3a60eb
app/vmselect: show X-Forwarded-For
contents on /api/v1/status/active_queries
page
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/659
2020-07-31 20:01:09 +03:00
Aliaksandr Valialkin
3149af624d
lib/storage: reduce the maximum number of concurrent merge workers to GOMAXPROCS/2
...
Previously the limit has been raised to GOMAXPROCS, but it has been appeared that this
increases query latencies since more CPUs are busy with merges.
While at it, substitute `*MergeConcurrencyLimitCh` channels with simple integer limits.
2020-07-31 17:53:13 +03:00
Aliaksandr Valialkin
29bbab0ec9
lib/storage: remove prioritizing of merging small parts over merging big parts, since it doesn't work as expected
...
The prioritizing could lead to big merge starvation, which could end up in too big number of parts that must be merged into big parts.
Multiple big merges may be initiated after the migration from v1.39.0 or v1.39.1. It is OK - these merges should be finished soon,
which should return CPU and disk IO usage to normal levels.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618
2020-07-30 20:02:22 +03:00
Aliaksandr Valialkin
96039dcb40
lib/storage: properly update vm_slow_row_inserts_total
metric when importing multiple data points per time series at once
...
Previously the `vm_slow_row_inserts_total` metric may be incremented multiple times for different data points per a single time series,
while only a single increment is needed when inserting the first data point for this time series.
2020-07-30 16:17:19 +03:00
Aliaksandr Valialkin
1e067401ba
lib/httpserver: emit X-Forwarded-For additionally to remoteAddr in error logs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/659
2020-07-29 13:12:35 +03:00
Sasasu
96bc476e53
lib/storage: metaindexRow use memroy more efficiently ( #655 )
...
due to memory align the metaindexRow structure use 64-byte pre object.
this commit changes the order of field, make metaindexRow use 56-byte pre
object.
Signed-off-by: Sasasu <su@sasasu.me>
2020-07-27 23:23:25 +03:00
Aliaksandr Valialkin
f26ef58137
lib/protoparser/prometheus: add a test for cassandra-exporter
...
Thanks to Seva
2020-07-27 18:37:46 +03:00
Aliaksandr Valialkin
94cc677b0c
lib/storage: slightly reduce code difference between single-node and cluster versions
2020-07-24 01:18:05 +03:00
Aliaksandr Valialkin
fb3d1380ac
lib/storage: respect -search.maxQueryDuration
when searching for time series in inverted index
...
Previously the time spent on inverted index search could exceed the configured `-search.maxQueryDuration`.
This commit stops searching in inverted index on query timeout.
2020-07-23 21:22:05 +03:00
Aliaksandr Valialkin
dbf3038637
lib/storage: add more fine-grained pace limiting for search
2020-07-23 19:21:49 +03:00
Aliaksandr Valialkin
b8303afcd8
lib/storage: improve prioritizing of data ingestion over querying
...
Prioritize also small merges over big merges.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2020-07-23 01:40:38 +03:00
Aliaksandr Valialkin
7d0743422b
lib/storage: properly calculate global metrics in UpdateStats()
2020-07-23 00:35:31 +03:00
Aliaksandr Valialkin
6afdcf8a20
lib/mergeset: properly calculate global metrics in UpdateStats()
...
Previously these metrics could be calculated multiple times for multiple mergeset.Table instances.
2020-07-23 00:35:29 +03:00
Aliaksandr Valialkin
23fa44e56e
lib/storage: reorder mergeBlockStreams() args in order to make them more consistent
2020-07-22 21:58:25 +03:00
Aliaksandr Valialkin
754eac676d
lib/storage: prevent possible race condition when all the goroutines exit Storage.AddRows, before goroutines other goroutines are blocked on searchTSIDsCond inside Storage.searchTSIDs
...
This condition may occur after the following sequence of events:
1) A goroutine enters the loop body when len(addRowsConcurrencyCh) == cap(addRowsConcurrencyCh) inside Storage.searchTSIDs.
2) All the goroutines return from Storage.AddRows.
3) The goroutine from step 1 blocks on searchTSIDsCond.Wait() inside the loop body.
The goroutine remains blocked until the next call to Storage.AddRows, which calls searchTSIDsCond.Signal().
This may take indefinite time.
2020-07-22 21:52:42 +03:00
Aliaksandr Valialkin
a3f48e395e
app/vmagent: add -remoteWrite.decimalPlaces
command-line flag, which may be used for reducing disk space usage on the remote storage
2020-07-21 21:55:42 +03:00
Aliaksandr Valialkin
67be79a0bc
lib/uint64set: optimize adding items to the set via Set.AddMulti
2020-07-21 20:57:05 +03:00
Aliaksandr Valialkin
31ef39e8da
lib/httpserver: log remote address in error message from httpserver.Errorf
...
This should improve detection of the root cause of errors.
Thanks to Anant for the idea.
2020-07-20 14:06:29 +03:00
Aliaksandr Valialkin
be0ab4fbfe
lib/storage: reset MetricName->TSID
cache after marking metricIDs as deleted
...
This is a follow-up commit after 12b16077c4
,
which didn't reset the `tsidCache` in all the required places.
This could result in indefinite errors like:
missing metricName by metricID ...; this could be the case after unclean shutdown; deleting the metricID, so it could be re-created next time
Fix this by resetting the cache inside deleteMetricIDs function.
2020-07-14 14:05:19 +03:00
Aliaksandr Valialkin
a4c96d9e6d
lib/protoparser: properly update vm_protoparser_rows_read_total{type="promscrape"}
metric
2020-07-14 12:15:56 +03:00
Seva Poliakov
a5e713b6e0
add vm_protoparser_rows_read_total metrics to promscrape ( #624 )
...
* add vm_protoparser_rows_read_total metrics to promscrape
move vm_protoparser_rows_read_total for promscrape to better place
move vm_protoparser_rows_read_total for promscrape to better place
* remove possibility of infinity loop at prometheus parser
2020-07-14 12:02:25 +03:00
Roman Khavronenko
207e93b50d
lib/flagutil: specify additional description for all Array type flags ( #620 )
...
Array type flag is now defined as `value` type in flag description when printed.
This change adds additional description to every Array type flag so it would be
clear what exact type is used:
```
-remoteWrite.urlRelabelConfig array
Optional path to relabel config for the corresponding -remoteWrite.url
Supports array of values separated by comma or specified via multiple flags.
```
2020-07-13 22:00:03 +03:00
Roman Khavronenko
605711bde5
lib/persistentqueue: add vm_persistentqueue_bytes_pending
metric ( #619 )
...
Metric `vm_persistentqueue_bytes_pending` is a gauge that shows current amount
of bytes in persistentqueue flushed on disk as a difference between write and read
offsets. This metric is very similar to `vmagent_remotewrite_pending_data_bytes`
except of accounting for bytes in-memory.
2020-07-13 21:54:54 +03:00
Roman Khavronenko
a02097e657
Extend metric vm_promscrape_targets
with status
label ( #615 )
...
The change to `vm_promscrape_targets` metric suppose to improve observability
for `vmagent` so it will be possible to track how many targets are up or down
for every specific scrape group:
```
vm_promscrape_targets{type="static_configs", status="down"} 1
vm_promscrape_targets{type="static_configs", status="up"} 2
```
2020-07-13 21:54:53 +03:00
Aliaksandr Valialkin
6373d377ef
app/{vminsert,vmagent}: add ability to import data in Prometheus exposition format via /api/v1/import/prometheus
2020-07-10 12:13:28 +03:00
Aliaksandr Valialkin
2012e294d1
properly calculate readCalls
2020-07-10 12:01:05 +03:00
Aliaksandr Valialkin
87f8c728bf
lib/promscrape: send Accept
header similar to Prometheus when scraping targets
...
This should fix scraping Spring Boot servers, which return incorrect response
unless `Accept: text/plain` request header is set.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/608
2020-07-08 19:50:06 +03:00
Aliaksandr Valialkin
7335743d57
lib/storage: limit the maximum concurrency for data ingestion to GOMAXPROCS
...
Previously the concurrency has been limited to GOMAXPROCS*2. This had little sense,
since every call to Storage.AddRows is bound to CPU, so the maximum ingestion bandwidth
is achieved when the number of concurrent calls to Storage.AddRows is limited to the number of CPUs,
i.e. to GOMAXPROCS.
2020-07-08 17:34:27 +03:00
Roman Khavronenko
929ad74de6
lib/protoparser: fix metric name of unmarshal errors in promremotewrite ( #607 )
...
The change fixes the typo in metric name `vm_protoparser_unmarshal_errors` to
respect the naming standard.
2020-07-08 14:19:27 +03:00
Aliaksandr Valialkin
e401b8d527
lib/protoparser/graphite: go fmt
2020-07-08 14:13:06 +03:00
Aliaksandr Valialkin
50ecf09042
lib/protoparser/graphite: add more tests after eb45185eef
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/610
2020-07-08 14:13:03 +03:00
Seva Poliakov
1ae0334e17
Fix graphite minus one timestamp ( #609 )
...
* fix graphite -1 timestamp
* format the graphite fix -1 timestamp
2020-07-08 14:13:01 +03:00
Aliaksandr Valialkin
fad008df7e
lib/storage: clarify out of retention period
error message by mentioning -retentionPeriod
command-line flag
2020-07-08 13:54:13 +03:00
Aliaksandr Valialkin
fe58462bef
lib/storage: reset MetricName->TSID cache after deleting time series
...
This should prevent from adding new data points to deleted time series
without the need to check for the deleted time series.
This improves ingestion performance a bit when the `deleted time series ids` aka `dmis` set
contains big number of time series.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/596
Based on the idea from @n4mine at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/604
2020-07-06 22:01:24 +03:00
Aliaksandr Valialkin
77bb0e6595
lib/fs: clarify description for -fs.disableMmap
command-line flag
2020-07-06 14:28:57 +03:00
Aliaksandr Valialkin
0bff96fe4b
lib/storage: prioritize data ingestion over heavy queries
...
Heavy queries could result in the lack of CPU resources for processing the current data ingestion stream.
Prevent this by delaying queries' execution until free resources are available for data ingestion.
Expose `vm_search_delays_total` metric, which may be used in for alerting when there is no enough CPU resources
for data ingestion and/or for executing heavy queries.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
2020-07-05 19:44:04 +03:00
Aliaksandr Valialkin
6f1d926698
lib/promscrape: use HostClient.DoDeadline instead of HostClient.Do in order to guarantee strict deadline across multiple scrape attempts
2020-07-03 21:33:48 +03:00
Aliaksandr Valialkin
ee03b4ccbd
lib/promscrape: prevent from too big deadline misses on scrape retries
...
The maximum deadline miss duration is reduced to 2x scrape_interval in the worst case.
By default it is limited to scrape_interval configured for the given scrape target.
2020-07-03 20:42:09 +03:00
Aliaksandr Valialkin
dfa83a4a35
lib/promscrape: check for nil error before checking for the returned status code when scraping targets
2020-07-03 18:37:25 +03:00
Aliaksandr Valialkin
8bb3622e9d
app/vminsert: prevent from adding and/or selecting labels with empty values
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/600
2020-07-02 23:17:12 +03:00
Aliaksandr Valialkin
6ebac3ab63
app/vminsert: add ability to apply relabeling to all the incoming metrics if -relabelConfig
command-line arg points to a file with a list of relabel_config
entries
...
See https://victoriametrics.github.io/#relabeling
2020-07-02 20:36:33 +03:00
Aliaksandr Valialkin
2361ad8ab4
lib/promscrape: add ability to set disable_compression
and disable_keepalive
options in scrape_config
section of the config passed to -promscrape.config
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580
2020-07-02 14:19:34 +03:00
Aliaksandr Valialkin
0f754bea49
lib/promscrape: add -promscrape.disableKeepAlive
command-line flag for disabling http keep-alive connections when scraping targets
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580
2020-07-01 02:20:46 +03:00
Aliaksandr Valialkin
4cb3e7595c
app/vmstorage: add -denyQueriesOutsideRetention
command-line flag for denying queries outside the configured retention
2020-07-01 00:58:42 +03:00
Aliaksandr Valialkin
81e3d4305f
lib/httpserver: add Unwrap method to ErrorWithStatusCode, so As
and Is
functions in standard errors
package may properly unwrap the error inside ErrorWithStatusCode
2020-07-01 00:53:49 +03:00
Aliaksandr Valialkin
fe77d661b3
all: use errors.As
instead of type assertion for detecting net.Error
2020-07-01 00:16:13 +03:00
Aliaksandr Valialkin
0c4e8aeb2b
all: use errors.As
for inspecting errors that implement httpserver.ErrorWithStatusCode
2020-07-01 00:03:11 +03:00
Aliaksandr Valialkin
d962568e93
all: use %w instead of %s for wrapping errors in fmt.Errorf
...
This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode .
See https://blog.golang.org/go1.13-errors for details.
2020-06-30 23:33:46 +03:00
Aliaksandr Valialkin
5a43842bd3
lib/promscrape: add missing label sorting for autogenerated metrics
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/592
2020-06-29 22:39:40 +03:00
Ween
b42cf33c4d
Fix Auto metrics relabeled errors ( #593 )
...
* Fix Auto metrics relabeled errors
* Finalize auto-genenated Labels
* Fix Test Errors
Co-authored-by: xinyulong <xinyulong@kuaishou.com>
2020-06-29 22:39:39 +03:00
Aliaksandr Valialkin
aad38c8283
lib/promrelabel: properly apply ^
and $
anchors to regex
value in Prometheus relabeling rules
2020-06-25 17:19:02 +03:00
Aliaksandr Valialkin
fd7a3d880e
lib/fs: go fmt
2020-06-23 23:03:08 +03:00
Aliaksandr Valialkin
08edb90814
lib/fs: fall back to cgo copy for copying the last 4KB of mmaped data
...
This probably should fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/581
2020-06-23 22:55:56 +03:00
Aliaksandr Valialkin
3a444bb7bb
lib/promrelabel: add support for keep_if_equal
and drop_if_equal
actions to relabel configs
...
These actions may be useful for filtering out unneeded targets and/or metrics if they contain equal label values.
For example, the following rule would leave the target only if __meta_kubernetes_annotation_prometheus_io_port
equals __meta_kubernetes_pod_container_port_number:
- action: keep_if_equal
source_labels: [__meta_kubernetes_annotation_prometheus_io_port, __meta_kubernetes_pod_container_port_number]
2020-06-23 17:29:19 +03:00
Aliaksandr Valialkin
de7e585ac8
lib/promscrape: preserve the previously discovered targets on discovery errors per each job_name
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/582
2020-06-23 15:42:46 +03:00
Aliaksandr Valialkin
521c657f8d
lib/fs: an attempt to fix SIGBUS error by rounding mmap`ed region to multiple of 4KB pages
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/581
2020-06-23 13:40:20 +03:00
Aliaksandr Valialkin
5fb60dd647
lib/logger: add -loggerErrorsPerSecondLimit
for limiting the rate of ERROR messages
2020-06-23 12:42:59 +03:00
Aliaksandr Valialkin
a80e852aab
lib/promscrape: retry performing the request to the server for up to 3 times before giving up when it closes keep-alive connections
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580
2020-06-23 12:34:12 +03:00
Aliaksandr Valialkin
70bf8218bb
app/vmselect/promql: properly override label values from group_left
and group_right
lists like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/577
2020-06-21 16:32:27 +03:00
Aliaksandr Valialkin
50aa34bcbe
lib/promscrape/discovery/consul: reduce load on Consul when discovering big number of targets by using background caching
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574
2020-06-20 18:20:07 +03:00
Aliaksandr Valialkin
62e1908986
lib/promscrape: reduce default value for -promscrape.discovery.concurrency
from 500 to 100
...
This should reduce load on Kubernetes API server and Consul when big number of targets are discovered
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574
2020-06-20 17:53:48 +03:00
Aliaksandr Valialkin
1f2826bae2
lib/promscrape/discovery/ec2: expose __meta_ec2_ami
like the next Prometheus release will do
...
See b5d61fb66c
for details
2020-06-20 17:45:30 +03:00
Tristan Su
c254b683fd
lib/storage: set big/small merge concurrency ( #568 )
...
fixed #567
Co-authored-by: Tristan Su <suqing.sq@alibaba-inc.com>
2020-06-19 02:21:55 +03:00
Aliaksandr Valialkin
2e5b6220a4
lib/promrelabel: allows regex capture groups in target_label like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/569
2020-06-19 02:20:58 +03:00
Aliaksandr Valialkin
4f673a5201
app/vminsert: export metrics for determining ingested rows with dropped or truncated labels
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/565
2020-06-19 01:12:44 +03:00
Aliaksandr Valialkin
5f3a895c23
lib/storage: add key!=".+"
filter additionally to negative filter matching empty value such as key!~"|foo"
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/546
2020-06-18 20:05:45 +03:00
Aliaksandr Valialkin
c40f29f783
lib/storage: properly match {tag!="|foo"}
filters
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/546
2020-06-10 19:34:37 +03:00
Aliaksandr Valialkin
9f55dea162
lib/httpserver: do not flush and do not close gzip writer if response compression is disabled
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/535
2020-06-05 21:37:46 +03:00
Aliaksandr Valialkin
cf91a94daf
lib/backup: properly create missing parent directories in fs.CreateFile
2020-06-05 19:28:25 +03:00
Aliaksandr Valialkin
ba1f764b29
lib/fs: optimize queries that read recent samples for big number of time series
...
Use standard copy() func instead of mmap-aware copy func for reading recently touched mmap-ed data.
This improves read performance by up to 4x.
2020-06-05 19:10:22 +03:00
Aliaksandr Valialkin
2358d9e41d
lib/fs: add a benchmark for ReaderAt.MustReadAt
2020-06-05 19:10:21 +03:00
Aliaksandr Valialkin
8b0d9df51d
lib/backup/fsremote: create all the parent directories before creating file in CreateFile
2020-06-05 10:25:28 +03:00
Aliaksandr Valialkin
3d0a0b3785
lib/fs: optimize MustGetFreeSpace performance by caching the results for up to 2 seconds
2020-06-04 13:14:04 +03:00
Vyacheslav Mitrofanov
89a922fb19
allow to use values lower than 10 with the flag -memory.allowedPercent ( #531 )
...
Co-authored-by: Vyacheslav Mitrofanov <vmitrofanov@mfms.ru>
2020-06-03 23:40:13 +03:00
Aliaksandr Valialkin
304f9499cf
lib/bytesutil: prevent from garbage collecting s before returning from ToUnsafeBytes
2020-06-03 00:23:27 +03:00
Aliaksandr Valialkin
eca1afdc20
lib/storage: fix Graphite wildcard matching, which has been broken in v1.36.0
2020-05-28 11:58:47 +03:00
Aliaksandr Valialkin
b0131c79b6
lib/storage: improve search speed for time series matching Graphite whildcards such as foo.*.bar.baz
...
Add index for reverse Graphite-like metric names with dots. Use this index during search for filters
like `__name__=~"foo\\.[^.]*\\.bar\\.baz"` which end with non-empty suffix with dots, i.e. `.bar.baz` in this case.
This change may "hide" historical time series during queries. The workaround is to add `[.]*` to the end of regexp label filter,
i.e. "foo\\.[^.]*\\.bar\\.baz" should be substituted with "foo\\.[^.]*\\.bar\\.baz[.]*".
2020-05-27 21:48:08 +03:00
Aliaksandr Valialkin
301838e7b1
lib/httpserver: properly set status code for empty response
2020-05-24 23:55:55 +03:00
Aliaksandr Valialkin
64bec11c91
lib/httpserver: fix compression for static files
2020-05-24 22:16:51 +03:00
Aliaksandr Valialkin
b747362936
lib/promscrape: mention about -promscrape.maxScrapeSize in the error message when target returns too big response
2020-05-24 14:41:24 +03:00
Aliaksandr Valialkin
be7253c084
lib/httpserver: do not recompress already compressed response
...
This shoud help with vmauth issue - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/514
2020-05-22 16:45:20 +03:00
Aliaksandr Valialkin
b59e089ac7
app/vmagent: add -dryRun
option for checking all the configs mentioned in command-line flags without running vmagent
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/362
2020-05-21 15:23:18 +03:00
Aliaksandr Valialkin
482bae8466
lib/promscrape: add -promscrape.config.dryRun
flag for checking -promscrape.config
for errors or unsupported options
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/508
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/362
2020-05-21 14:54:32 +03:00
Aliaksandr Valialkin
73ec5cf460
lib/promscrape: add -promscrape.discovery.concurrency
and -promscrape.discovery.concurrentWaitTime
flags for tuning the number of concurrent requests to autodiscovery API servers at Consul or Kubernetes
2020-05-19 17:35:59 +03:00
Aliaksandr Valialkin
2a8f1e6931
lib/storage: do not increment vm_slow_metric_name_loads_total
counter for metric_ids which shouldnt be prefetched, since this may mislead users
2020-05-16 10:23:39 +03:00
Aliaksandr Valialkin
dc16cdd1ca
lib/persistentqueue: a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/484
2020-05-16 09:32:30 +03:00
肖贝贝
c154a92d29
fix: fix vmagent multi queue may become one because sync bug ( #484 )
...
Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>
2020-05-16 09:32:29 +03:00
Aliaksandr Valialkin
0f3d46810b
lib/backup: remove misleading -dst
mention in error message
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/482
2020-05-15 17:13:27 +03:00
Aliaksandr Valialkin
e72518e8c6
lib/backup: donload only the remaining parts for partially downloaded files after vmrestore
restart
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/487
2020-05-15 17:03:25 +03:00
Aliaksandr Valialkin
1e5c1d7eaa
app/vmstorage: add vm_slow_metric_name_loads_total
metric, which could be used as an indicator when more RAM is needed for improving query performance
2020-05-15 14:12:24 +03:00
Aliaksandr Valialkin
d6b9a49481
app/vmstorage: add vm_slow_row_inserts_total
and vm_slow_per_day_index_inserts_total
metrics for determining whether VictoriaMetrics required more RAM for the current number of active time series
2020-05-15 13:46:57 +03:00
Aliaksandr Valialkin
a72f18e821
lib/{storage,mergeset}: further tuning of compression levels depending on block size
...
This should improve performance for querying newly added data, since it can be unpacked faster.
2020-05-15 13:12:28 +03:00
Aliaksandr Valialkin
2cf2e9955b
lib/storage: wait for all the goroutines to finish in TestSearch in order to prevent racy behavior on test finish
2020-05-15 12:12:20 +03:00
Aliaksandr Valialkin
67e331ac62
lib/storage: optimize ingestion pefrormance for new time series
2020-05-15 12:12:19 +03:00
Aliaksandr Valialkin
6838fa876c
lib/mergeset: tune compression levels in order to improve ingestion performance a bit
2020-05-15 12:12:15 +03:00
Aliaksandr Valialkin
1b5d272e07
lib/storage: reduce indentation in Storage.add
2020-05-14 23:23:56 +03:00
Aliaksandr Valialkin
71d29a8fa1
lib/storage: return the first error instead of the last error, since the first error usually points to the root cause
2020-05-14 23:18:59 +03:00
Aliaksandr Valialkin
3845420a8f
lib: extract common code for returning fast unix timestamp into lib/fasttime
2020-05-14 23:06:50 +03:00
Aliaksandr Valialkin
7e831741f9
lib/{storage,mergeset}: return dst on error from unmarshalBlockHeaders, so it could be reused
2020-05-14 15:32:23 +03:00
Aliaksandr Valialkin
2f42b85e0e
lib/storage: document that getnerateUniqueMetricID should return dense ids
2020-05-14 14:08:59 +03:00
Aliaksandr Valialkin
f442d81648
lib/{storage,mergeset}: cleanup: remove unused partSearch.indexBlockReuse
2020-05-14 14:03:15 +03:00
Aliaksandr Valialkin
8bb44a5d09
lib/storage: optimize label matching for regexp ending with literal suffix
...
For example, `{label=~"foo.*bar.+baz"}` contains literal suffix `baz`,
so it should work faster now.
2020-05-13 11:39:05 +03:00
Aliaksandr Valialkin
3b0f66a227
app/vmagent: fix a bug with improper relabeling when multiple -remoteWrite.urlRelableConfig
args are set
...
This bug could result in incorrect relabeling and metrics' drop.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/467
2020-05-12 22:03:45 +03:00
Aliaksandr Valialkin
c9ab6dc532
lib/fs: do not use mmap for 32-bit arches by default, since they cannot map files bigger than 4GB in RAM
2020-05-12 20:21:39 +03:00
Aliaksandr Valialkin
d54a93fc81
app/vmagent: fix scraping mTLS targets, which has been broken in v1.35.1
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/470
2020-05-12 17:23:43 +03:00
Aliaksandr Valialkin
405cf44aed
app/vmagent,lib/promscrape: do not set HostClient.DialDualStack, since it isnt used if HostClient.Dial is set
2020-05-12 15:24:53 +03:00
Aliaksandr Valialkin
bd5f4e0344
lib/storage: properly initialize part struct before trying to close it on error
...
This should prevent from nil pointer dereference bug at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/468 .
2020-05-12 14:54:16 +03:00
Aliaksandr Valialkin
f7753b1469
lib/storage: gradually pre-populate per-day inverted index for the next day
...
This should prevent from CPU usage spikes at 00:00 UTC every day when
inverted index for new day must be quickly created for all the active time series.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/430
2020-05-12 12:13:32 +03:00
Aliaksandr Valialkin
8c77cb436a
lib/storage: typo fixes in error messages: or -> of
2020-05-12 12:12:33 +03:00
Aliaksandr Valialkin
bbf06a4248
lib/storage: speed up matching for common regexps in label filters
...
The following regexps have been optimized:
* 'foo.+bar'
* 'foo.+bar.+baz'
This should improve performance for matching Graphite-like metrics.
2020-05-11 22:49:01 +03:00
Aliaksandr Valialkin
37254a139a
lib/storage: add a benchmark for Graphite-like regexps for metric names
2020-05-11 22:49:00 +03:00
Aliaksandr Valialkin
2f28e945b8
lib/httpserver: add -http.shutdownDelay
flag for a grace period before http server shutdown
...
The http server returns 503 non-OK error at `/health` page during grace period,
so load balancers in front of the http server could re-route incoming requests
to other servers.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/463
2020-05-07 15:25:51 +03:00
Aliaksandr Valialkin
3052b479b7
lib/httpserver: reduce typical duration for http server graceful shutdown
...
Previously the duration for graceful shutdown for http server could take more than a minute
because of imporperly set timeouts in setNetworkTimeout.
Now typical duration for graceful shutdown should be reduced to less than 5 seconds.
2020-05-07 14:16:38 +03:00
Aliaksandr Valialkin
c43a265716
lib/flagutil: make errcheck happy by explicitly ignoring Array.Set result in tests
2020-05-06 22:37:28 +03:00
Aliaksandr Valialkin
15e3682b40
lib/flagutil: properly parse quoted flag values for flagutil.Array
2020-05-06 22:28:15 +03:00
Aliaksandr Valialkin
20538a2a5d
app/vmagent: allow setting independent auth configs per each configured -remoteWrite.url
2020-05-06 16:52:32 +03:00
Aliaksandr Valialkin
9f39e618ed
lib/promscrape/discovery/gce: discover per-zone instances for gce_sd_config
in parallel. This should reduce discovery latency
2020-05-06 15:00:23 +03:00
Aliaksandr Valialkin
8ab5e47b5c
lib/promscrape: add Prometheus-compatible DNS-based service discovery aka dns_sd_configs
2020-05-06 00:02:41 +03:00
Aliaksandr Valialkin
42d563934b
lib/promscrape: properly connect to TCP6 addresses if -enableTCP6
is set
2020-05-06 00:02:40 +03:00
Aliaksandr Valialkin
1c8e97c8a0
lib/procutil: add NewSighupChan function, which returns a channel, which is triggered on every SIGHUP
2020-05-05 10:56:15 +03:00
Aliaksandr Valialkin
054457d1f4
lib/promscrape: allow explicitly setting empty token via token: ""
in consul_sd_config
2020-05-05 07:49:54 +03:00
Aliaksandr Valialkin
89aa6dbf56
lib/promscrape: add Prometheus-compatible service discovery for Consul aka consul_sd_configs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/330
2020-05-04 20:53:06 +03:00
Aliaksandr Valialkin
28e0e8fd88
lib/promauth: properly set up client certificate in tls.Config
...
Previously the client certificate has been mistakenly set up as a server certificate
2020-05-04 20:53:04 +03:00
Aliaksandr Valialkin
ed91fe1d9b
lib/promscrape: move common code for discovery api config map handling into discoveryutils
2020-05-04 20:52:58 +03:00
Aliaksandr Valialkin
c50fd219dc
lib/promscrape/discovery/kubernetes/: unify apiConfig creation
2020-05-04 20:52:53 +03:00
Aliaksandr Valialkin
a5880f17af
lib/promscrape: remove debug line left after the commit e4aac6ea40
2020-05-03 17:16:19 +03:00
Aliaksandr Valialkin
1f0e8fdc0d
lib/promscrape: fix tests after the commit 658a8742ac
...
The original commit copies `__address__` label to `instance` label when generating per-target labels as Prometheus does.
See https://www.robustperception.io/life-of-a-label for details.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/453
2020-05-03 16:59:29 +03:00
DexterZhang
317688f144
fix(vmagent): different behavior as how prometheus deal with labels. [Issue#453] ( #454 )
2020-05-03 16:59:28 +03:00
Aliaksandr Valialkin
ab1e6a76bb
lib/promscrape: make consistent scrape time offsets across reloads for the same ScrapeURL and Labels
...
This should make consistent intervals between data points for scrape targets across reloads.
Previously these intervals were random.
2020-05-03 14:31:22 +03:00
Aliaksandr Valialkin
f25416984b
lib/promscrape: fix TestGetFileSDScrapeWorkSuccess after 3b234d82e5
2020-05-03 14:31:20 +03:00
Aliaksandr Valialkin
f422203e10
lib/promscrape: reload only modified scrapers on config changes
...
This should improve scrape stability when big number of targets are scraped and these targets are frequently changed.
Thanks to @xbsura for the idea and initial implementation attempts at the following pull requests:
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/449
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/458
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/459
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/460
2020-05-03 12:47:16 +03:00
Aliaksandr Valialkin
bbaca16ce8
lib/httpserver: rename http.externalURL
to http.pathPrefix
and improve help message for this flag
...
The `http.externalURL` flag name was slightly misleading, so it has been renamed to `http.pathPrefix`.
2020-05-02 13:12:24 +03:00
DexterZhang
a0589f2ca5
feat(httpserver): add http.externalUrl
config to http server, it adds prefix to http path automatically ( #452 )
2020-05-02 13:12:23 +03:00
Aliaksandr Valialkin
b21b73115a
app/vminsert: add /-/reload
handler in the same way as for vmagent
2020-04-30 02:18:08 +03:00
Aliaksandr Valialkin
a970705d8e
lib/procutil: prevent from app termination on SIGHUP signal, since this signal is frequently used for config reload
2020-04-30 02:18:06 +03:00
Aliaksandr Valialkin
d99f48aa48
lib/httpserver: mention that -http.maxGracefulShutdownDuration
command-line flag value can be increased on shutdown timeout
2020-04-30 01:37:02 +03:00
Aliaksandr Valialkin
de5f923476
lib/promscrape: set 30 seconds timeout for discovery api requests
...
Previously such requests could hang for long time. This could make debugging harder.
2020-04-29 17:29:03 +03:00
Aliaksandr Valialkin
b6d88bac04
vendor: use github.com/VictoriaMetrics/fasthttp instead of github.com/fasthttp/fasthttp
...
The upstream fasthttp may contain issues like 996610f021
,
plus a code that isn't used by VictoriaMetrics. So let's use a private copy under our control instead.
2020-04-29 16:43:09 +03:00
Aliaksandr Valialkin
9ed4951ec8
lib/metricsql: move it to a separate repository - github.com/VictoriaMetrics/metrics
2020-04-28 15:30:06 +03:00
Aliaksandr Valialkin
d78ed50edd
lib/storage: recover when metricID->metricName entry is missing in the inverted index after unclean shutdown
...
Newly added index entries can be missing after unclean shutdown, since they didn't flush to persistent storage yet.
Log about this and delete the corresponding metricID, so it could be re-created next time.
2020-04-28 12:01:32 +03:00
Aliaksandr Valialkin
53740d0026
lib/promscrape: handle connection reset when targets responds with http redirect
2020-04-28 02:14:32 +03:00
肖贝贝
3e6f29f462
fix: vmagent not follow 301/302 redirect bug ( #445 )
...
Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>
2020-04-28 02:14:31 +03:00
Aliaksandr Valialkin
424068f804
lib/promscrape: handle connection reset when targets responds with http redirect
2020-04-28 02:14:26 +03:00
肖贝贝
7d045bf2ca
fix: vmagent not follow 301/302 redirect bug ( #445 )
...
Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>
2020-04-28 02:14:25 +03:00
Aliaksandr Valialkin
2aecf7c37c
lib/{encoding,decimal}: typo fixes in tests: epxecting->expecting
2020-04-28 00:02:19 +03:00
Aliaksandr Valialkin
806dc73d8a
lib/encoding: reduce possibility of failure in TestMarshalInt64ArraySize
2020-04-28 00:02:18 +03:00
Aliaksandr Valialkin
a603a15757
lib/promscrape/discovery/gce: make golangci-lint happy
2020-04-27 19:29:42 +03:00
Aliaksandr Valialkin
86a1d9cb0c
lib/promscrape: add initial support for Prometheus-compatible service discovery for Amazon EC2 aka ec2_sd_configs
2020-04-27 19:29:22 +03:00
Aliaksandr Valialkin
1acb6eb25a
lib/promscrape/discovery/gce: properly set filter
query arg in api url
2020-04-27 16:01:53 +03:00
Aliaksandr Valialkin
0daa37fa02
lib/promscrape/discovery/gce: allow empty project and zone for gce_sd_config
2020-04-27 11:45:45 +03:00
Aliaksandr Valialkin
989d84cf3f
app/{vminsert,vmstorage}: wait for ack
from vmstorage
after each packet sent to it from vminsert
...
This should protect from possible data loss when `vmstorage` is stopped while the packet is sent from `vminsert`.
This commit switches to new protocol between vminsert and vmstorage, which is incompatible
with the previous protocol. So it is required that both vminsert and vmstorage nodes are updated.
2020-04-27 09:53:26 +03:00
Aliaksandr Valialkin
e933cbac16
lib/storage: postpone reading data from blocks during search
...
This eliminates the need for storing block data into temporary files on a single-node VictoriaMetrics
during heavy queries, which touch big number of time series over long time ranges.
This improves single-node VM performance on heavy queries by up to 2x.
2020-04-27 08:44:01 +03:00
Aliaksandr Valialkin
31861c5b8e
lib/promscrape/discovery/gce: allow empty zone
arg in gce_sd_config
- in this case zones for the given project are automatically discovered
2020-04-26 14:37:38 +03:00
Aliaksandr Valialkin
b16e19c053
lib/storage/dedup.go: go fmt
2020-04-26 14:37:36 +03:00
Aliaksandr Valialkin
a0000c3a6e
lib/storage: improve deduplication algorithm
...
Now it leaves only the first data point on each `-dedup.minScrapeInterval` interval.
Previously it may leave two data points on the interval. This could lead to unexpected results
for `histogram_quantile(phi, sum(rate(buckets)) by (le))` query.
2020-04-26 13:10:18 +03:00
Aliaksandr Valialkin
13b4069c59
lib/storage: postpone label filters matching too many time series instead of giving up with error
...
This should reduce the frequency of the following errors:
cannot find tag filter matching less than N time series; either increase -search.maxUniqueTimeseries or use more specific tag filters
more than N time series found on the time range [...]; either increase -search.maxUniqueTimeseries or shrink the time range
2020-04-24 21:18:52 +03:00
Aliaksandr Valialkin
7c74efd640
lib/promscrape/discovery/gce: make golint happy by ignoring resp.Body.Close() result
2020-04-24 18:13:26 +03:00
Aliaksandr Valialkin
069690e3bd
lib/promscrape: initial implementation for gce_sd_configs
aga Prometheus-compatible service discovery for Google Compute Engine
2020-04-24 17:53:43 +03:00
Aliaksandr Valialkin
de991551f5
lib/promscrape: query /api/v1/namespaces/*
for the configured namespaces in kubernetes_sd_config
...
This should fix authroization issues described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/432
2020-04-24 14:42:02 +03:00
Aliaksandr Valialkin
387a21c96d
lib/promscrape: add -promscrape.configCheckInterval
command-line flag for automating config checking
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/431
2020-04-23 23:41:26 +03:00
Aliaksandr Valialkin
83e4c8427e
lib/promscrape: access Config entries by reference, so they can be compared by addresses
2020-04-23 14:38:29 +03:00
Aliaksandr Valialkin
e220f3eeb6
lib/promscrape: move KubernetesSDConfig to lib/promscrape/discovery/kubernetes
2020-04-23 11:34:30 +03:00
Aliaksandr Valialkin
1187494c8f
lib/promscrape/discovery/kubernetes: hide role switch logic behind GetLabels function
2020-04-22 22:16:18 +03:00
Aliaksandr Valialkin
f9526809e5
app/vmselect: add /api/v1/status/tsdb
page with useful stats for locating root cause for high cardinality issues
...
See https://prometheus.io/docs/prometheus/latest/querying/api/#tsdb-stats
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/425
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/268
2020-04-22 22:03:23 +03:00
Aliaksandr Valialkin
f3e5722257
lib/writeconcurrencylimiter: improve docs for -maxConcurrentInserts command-line flag
2020-04-20 21:03:09 +03:00
Aliaksandr Valialkin
81481abaa9
lib/promscrape/discovery/kubernetes: reuse a client for empty api_server
inside different jobs
2020-04-20 17:07:37 +03:00
Aliaksandr Valialkin
6764efde39
lib/promscrape/discovery/kubernetes: update stale comments
2020-04-17 14:06:26 +03:00
Aliaksandr Valialkin
d86640d609
lib/promscrape: suppress scrape errors if -promscrape.suppressScrapeErrors
flag is set
2020-04-16 23:41:52 +03:00
Aliaksandr Valialkin
70104f3fb1
lib/promscrape: print all the labels for the target on error message for failed scrape
...
This should improve debuggability.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/420
2020-04-16 23:35:10 +03:00
Aliaksandr Valialkin
266bbec52d
lib/promscrape: retry target scraping when the target closes previously established keep-alive connection to it
...
This should fix the following error:
the server closed connection before returning the first response byte. Make sure the server returns 'Connection: close' response header before closing the connection
2020-04-16 23:25:34 +03:00
Aliaksandr Valialkin
b2d009c8db
lib/logger: typo fix
2020-04-16 00:20:02 +03:00
Aliaksandr Valialkin
d4bc60d63c
lib/logger: add WARN level for logging expected errors such as invalid user queries
2020-04-15 20:50:45 +03:00
Aliaksandr Valialkin
a873b553cf
app/vmselect: handle timestamp(metric offset X)
the same way as Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/415
2020-04-15 12:01:05 +03:00
Aliaksandr Valialkin
99f0cb1f5f
lib/promscrape: code cleanup in runScraper func
2020-04-15 11:36:35 +03:00
Aliaksandr Valialkin
e9d9638627
lib/storage: skip metricID if the corresponding metricID->metricName is missing in inverted index during search
...
This case is possible when the corresponding metricID->metricName entry didn't propagate to inverted index yet.
This should fix the following error:
error when searching tsids for tfss [...]: cannot find metricName by metricID 1582417212213420669: EOF
2020-04-15 00:10:11 +03:00
Aliaksandr Valialkin
6ec582acb9
lib/promscrape: show information on improperly configured scrape targets at the bottom of /targets
page
...
This is a common error whith improperly configured target autodiscovery and/or relabeling.
This error leads to duplicate scraping of the same targets with the same set of labels, which leads
to duplicate samples in time series.
2020-04-14 14:55:13 +03:00
Aliaksandr Valialkin
391fb0903e
lib/promscrape/discovery/kubernetes: remove only unused client for API server during cleaning
2020-04-14 14:19:26 +03:00
Aliaksandr Valialkin
636e1578de
lib/promscrape: add promrelabel.GetLabelValueByName helper function
2020-04-14 14:12:15 +03:00
Aliaksandr Valialkin
3945bf9dec
lib/promscrape: mention job name in error messages when target cannot be scraped
...
This should improve debuggability
2020-04-14 13:33:18 +03:00
Aliaksandr Valialkin
66da177fe9
lib/promscrape: reset ScrapeWork.ID in tests
2020-04-14 13:31:37 +03:00
Aliaksandr Valialkin
88366cad15
lib/promscrape: properly expose statuses for targets with duplicate scrape urls at /targets
page
...
Previously targets with duplicate scrape urls were merged into a single line on the page.
Now each target with duplicate scrape url is displayed on a separate line.
2020-04-14 13:10:06 +03:00
Aliaksandr Valialkin
09f796e2ab
lib/promscrape: remove labels starting with __meta_
after applying relabel_configs
as Prometheus does
...
This should reduce CPU load during scraping when target discovery generates
big number of `__meta_*` labels (for instance, k8s discovery).
See https://www.robustperception.io/life-of-a-label for details.
2020-04-14 12:23:30 +03:00
Aliaksandr Valialkin
f58d15f27c
lib/promscrape: rename 'scrape_config->scrape_limit' to 'scrape_config->sample_limit'
...
`scrape_config` block from Prometheus config contains `sample_limit` field,
while in `vmagent` this field was mistakenly named as `scrape_limit`.
2020-04-14 12:00:03 +03:00
Aliaksandr Valialkin
7c4fb038e3
lib/promscrape: add initial support for kubernetes_sd_config
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/334
2020-04-13 21:03:53 +03:00
Aliaksandr Valialkin
4017163393
lib/promscrape: add -promscrape.config.strictParse
flag for detecting errors in -promscrape.config
file
2020-04-13 13:15:52 +03:00
Aliaksandr Valialkin
7fbfef2aee
lib/promscrape: extract common auth code to lib/promauth
2020-04-13 12:59:22 +03:00
Aliaksandr Valialkin
e0c6da8e2a
lib/storage: disable deduplication after dedup tests are complete
...
The rest of tests expect that the de-duplication is disabled.
2020-04-10 17:33:38 +03:00
Aliaksandr Valialkin
8ed0d5471a
lib/storage: correctly handle -dedup.minScrapeInterval
values smaller than 8ms
...
Such small values may be used for removing samples with duplicate timestamps.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/409 for details.
2020-04-10 16:40:41 +03:00
Aliaksandr Valialkin
0b2f678d8e
lib/{storage,mergeset}: make sure that requests
and misses
cache counters never go down
2020-04-10 14:44:52 +03:00
Aliaksandr Valialkin
661cfb03e2
lib/protoparser: add -*TrimTimstamp
command-line flags for Influx, Graphite, OpenTSDB and CSV data
...
These flags can be used for reducing disk space usage for timestamps data ingested over the given protocols
2020-04-10 12:44:46 +03:00
Aliaksandr Valialkin
f0b08dbd9e
lib/workingsetcache: accumulate stat counters on cache rotation
...
This should prevent from cache stats counters going down after cache rotation,
which may corrupt `cache hit ratio` graph on the official Grafan dasbhoards
when using the following query:
1 - (sum(rate(vm_cache_misses_total[5m])) by (type) / sum(rate(vm_cache_requests_total[5m])) by (type))
2020-04-10 11:51:47 +03:00
Aliaksandr Valialkin
28c65b58a2
lib/memory: add more details to -memory.allowedPercent
help message
2020-04-09 15:34:21 +03:00
Aliaksandr Valialkin
84fa146792
lib/httpserver: remove unnecessary http.HandlerFunc
wrapper in gzipHandler
2020-04-01 18:14:47 +03:00
Aliaksandr Valialkin
0ad7aaf535
lib/storage: add missing reset for tagFilter.matchesEmptyValue on tagFilter.Init
2020-04-01 17:40:27 +03:00
Aliaksandr Valialkin
c189104be7
lib/promscrape: reduce timestamp jitter when scraping targets
...
This should improve compression for timestamps
2020-04-01 16:13:01 +03:00
Aliaksandr Valialkin
4c56acbafa
lib/storage: remove duplicate data points on 7/8*minScrapeInterval interval instead of 1/2*minScrapeInterval
...
This should reduce storage usage and should improve deduplication accuracy
2020-04-01 15:47:04 +03:00
Aliaksandr Valialkin
504ea876f2
lib/storage: handle errors returned from TagFilters.Add
when cloning TagFilters with negative filter
2020-03-31 16:18:34 +03:00
Aliaksandr Valialkin
ef714e01c1
lib/storage: add fast path for the previous indexdb search if it doesn't contain per-day inverted index yet
2020-03-31 12:35:15 +03:00
Aliaksandr Valialkin
7e755b4bac
lib/storage: optimize per-day inverted index search for tag filters matching big number of time series
...
- Sort tag filters in the ascending number of matching time series
in order to apply the most specific filters first.
- Fall back to metricName search for filters matching big number of time series
(usually this are negative filters or regexp filters).
2020-03-31 00:53:29 +03:00
Aliaksandr Valialkin
d450249955
lib/storage: properly handle {label=~"foo|"}
filters as Prometheus does
...
Such filters must match all the time series with `label="foo"` plus all the time series without `label`
Previously only time series with `label="foo"` were matched.
2020-03-30 20:21:47 +03:00
Aliaksandr Valialkin
b47444e69d
lib/envflag: add -envflag.prefix
for setting optional prefix for environment vars
2020-03-30 15:51:44 +03:00
kreedom
f058efb3d1
[vmalert] config parser ( #393 )
...
* [vmalert] config parser
* make linter be happy
* fix test
* fix sprintf add test for rule validation
2020-03-29 01:49:40 +02:00