Aliaksandr Valialkin
e15f3f4f2a
lib/proxy: pass proxy hostname in Host
header of the CONNECT
request
...
This should resolve the following issue when connecting to tls proxy:
cannot validate certificate for ... because it doesn't contain any IP SANs
2021-03-09 20:41:18 +02:00
Aliaksandr Valialkin
9d8223eafb
lib/proxy: set missing ServerName in TLS config for proxy_url
.
...
While at it, allow setting Proxy-Authorization for `proxy_url` via `basic_auth` and `bearer_token` configs.
2021-03-09 19:01:14 +02:00
Nikolay
1310f84122
Changes tlsConfig init for proxy connections ( #1121 )
...
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116
2021-03-09 19:01:13 +02:00
Aliaksandr Valialkin
0554430d7e
lib/promscrape: apply sample_limit
after metric relabeling is applied as Prometheus does
...
See the description for `sample_limit` option from Prometheus docs:
Per-scrape limit on number of scraped samples that will be accepted.
If more than this number of samples are present after metric relabeling
the entire scrape will be treated as failed. 0 means no limit.
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
2021-03-09 15:52:41 +02:00
Aliaksandr Valialkin
7b66c8cbf8
lib/promscrape/discovery/kubernetes: remove too verbose logs about starting and stopping the watchers
...
Log the number of objects loaded per each watch url
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-09 15:07:12 +02:00
John Belmonte
edf39aa225
spelling fix: adjacent ( #1115 )
2021-03-09 09:19:16 +02:00
Aliaksandr Valialkin
502fab797a
lib/promscrape: add scrape_offset
option to scrape_config
...
This option can be used for specifying the particular offset per each scrape interval for target scraping
2021-03-08 11:59:32 +02:00
Aliaksandr Valialkin
c4a0bd5eac
lib/storage: go fmt
2021-03-08 11:59:31 +02:00
Aliaksandr Valialkin
c76a904bb0
lib/storage: tune loopsCount estimations in getMetricIDsForTagFilterSlow
...
The adjusted estmations give up to 2x lower median response times on 200qps /api/v1/query_range workload
2021-03-07 21:17:48 +02:00
Aliaksandr Valialkin
c04505e585
lib/promscrape/discovery/kubernetes: reduce memory usage further when big number of scrape jobs are configured for the same kubernetes_sd_config
role
...
Serialize reloading per-role objects, so they don't occupy too much memory when objects for many scrape jobs are simultaneously refreshed.
Do not reload per-role objects if they were already refreshed by concurrent goroutines. This should reduce load on Kubernetes API server
when big number of scrape jobs are configured for the same Kubernetes role.
This is a follow-up for 17b87725ed
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-07 20:03:22 +02:00
Aliaksandr Valialkin
175466bb41
lib/decimal: prevent exponent overflow when processing values close to zero
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1114
2021-03-05 18:53:41 +02:00
Aliaksandr Valialkin
5807ff57f3
lib/promscrape/discovery/kubernetes: reduce memory usage when Kubernetes service discovery is configured on a big number of scrape jobs
...
Previously vmagent was creating a separate Kubernetes object cache per each scrape job.
This could result in increased memory usage when monitoring a Kubernetes cluster with big number of objects (pods / nodes / services, etc.)
as seen at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
Now it uses a shared map of scrape objects across multiple scrape jobs.
2021-03-05 17:32:33 +02:00
Aliaksandr Valialkin
92ddb8f197
lib/promscrape/discovery/kubernetes: move apiWatcher code to a separate file
2021-03-05 17:32:32 +02:00
Aliaksandr Valialkin
02c0959380
lib/promscrape: make cluster membership calculations consistent across 32-bit and 64-bit architectures
2021-03-05 09:06:08 +02:00
Aliaksandr Valialkin
133fb9fc00
lib/promscrape: add -promscrape.cluster.replicationFactor
command-line flag for replicating scrape targets among vmagent
instances in the cluster
2021-03-04 10:21:27 +02:00
Aliaksandr Valialkin
bae7a1b47a
lib/promscrape/discovery/kubernetes: fix tests after e154f4a644
2021-03-03 22:42:04 +02:00
Nikolay
7d92ef3acd
Fix ingress discovery api ( #1110 )
2021-03-03 10:45:50 +02:00
Aliaksandr Valialkin
25f453ce1a
lib/promscrape/discovery/kubernetes: properly check for nil pointer inside interface
...
See https://mangatmodi.medium.com/go-check-nil-interface-the-right-way-d142776edef1
This fixes a panic when the ScrapeWork is filtered out in swcFunc.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1108
2021-03-03 10:42:54 +02:00
Aliaksandr Valialkin
fea27b9845
lib/promscrape: go fmt
2021-03-02 21:20:23 +02:00
Aliaksandr Valialkin
c67a07b469
lib/handshake: log read/write operation duration on connection errors
...
This improve debuggability of network errors
2021-03-02 21:20:20 +02:00
Aliaksandr Valialkin
c8dde1fd6b
lib/storage: typo fix: umarshal -> unmarshal
2021-03-02 20:48:44 +02:00
Aliaksandr Valialkin
3fbe2bf1c8
lib/promscrape: pre-allocate space for a map in mergeLabels
...
This should reduce the number of memory allocations when discovering big number of targets
2021-03-02 18:42:44 +02:00
Aliaksandr Valialkin
ac5c47a9f5
lib/promscrape/discovery: properly track vm_promscrape_discovery_kubernetes_objects_removed_total metric
2021-03-02 18:33:29 +02:00
Aliaksandr Valialkin
62d7e07ff7
lib/promrelabel: remove unneded optimizations for labeldrop
and labelkeep
actions
...
These optimizations may slow down code execution by matching the same label against regexp two times instead of a single time
2021-03-02 18:01:08 +02:00
Aliaksandr Valialkin
f9c1fe3852
lib/promscrape/discovery/kubernetes: cache ScrapeWork objects as soon as the corresponding k8s objects are changed
...
This should reduce CPU usage and memory usage when Kubernetes contains tens of thousands of objects
2021-03-02 16:44:19 +02:00
Aliaksandr Valialkin
f686174329
lib/promscrape/discovery/ec2: follow-up after f6114345de
2021-03-02 13:47:35 +02:00
Nikolay
fd8ca7df50
Adds webIndentity token for aws ( #1099 )
...
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1080
2021-03-02 13:47:34 +02:00
Aliaksandr Valialkin
e45c399467
lib/protoparser/prometheus: properly unescape label values in Prometheus exposition format
...
Unescape only `\n`, `\"` and `\\` sequences as Prometheus does. Other escape sequences shouldn't be unescaped.
2021-03-02 13:22:10 +02:00
Aliaksandr Valialkin
f4969a624d
lib/protoparser/graphite: fix parsing of a Graphite line with empty tags such as foo; 1 2
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1100
2021-03-01 17:17:01 +02:00
Aliaksandr Valialkin
b89a4fac2f
lib/promscrape/discovery/kubernetes: deflake tests; a follow-up for 05fb08713c
2021-03-01 14:31:44 +02:00
Aliaksandr Valialkin
c3bf72992f
lib/promscrape: explicitly stop and cleanup service discovery routines when new config is read from -promscrape.config
...
This should reduce memory usage when `-promscrape.config` file frequently changes
2021-03-01 14:15:16 +02:00
Aliaksandr Valialkin
8af9370bf2
lib/promscrape: use target arg in ScrapeWork cache
2021-03-01 12:31:11 +02:00
Aliaksandr Valialkin
5328506c3e
lib/promscrape: typo fix, which prevented from caching ScrapeWork entries
2021-03-01 12:13:13 +02:00
Aliaksandr Valialkin
d5058caccc
lib/promscrape: add vm_promscrape_scrapework_cache_* metrics for tracking ScrapeWork cache effectiveness
2021-03-01 12:05:58 +02:00
Aliaksandr Valialkin
ee5d26a546
lib/httpserver: make make errcheck
happy after the commit 9fc7726d84
2021-03-01 00:35:30 +02:00
Aliaksandr Valialkin
109bfaadad
lib/promscrape: reduce CPU usage an memory allocations when constructing scrapeWorkKey
2021-02-28 22:30:36 +02:00
Aliaksandr Valialkin
9e644ef111
lib/httpserver: make sure the gzipResponseWriter.Write() is called on Flush() and Close() calls
...
This should fix the `http: superfluous response.WriteHeader call` issue
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1078
2021-02-28 19:23:26 +02:00
Aliaksandr Valialkin
9a2bf65134
lib/promscrape: add ability to spread scrape targets among multiple vmagent instances
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1084
2021-02-28 18:40:42 +02:00
Aliaksandr Valialkin
3e44d9947e
lib/promscrape/discovery/kubernetes: properly account the number of objects when watcher is stopped
...
A follow-up for b21b110b7a
2021-02-28 17:06:49 +02:00
Aliaksandr Valialkin
0ef7a94056
lib/promscrape/discovery/kubernetes: add vm_promscrape_discovery_kubernetes_*
metrics for monitoring internal state of k8s service discovery
2021-02-28 16:58:45 +02:00
Aliaksandr Valialkin
f52bdbe2a3
lib/promscrape/discovery/kubernetes: remove resourceVersionMatch=NotOlderThan
query arg when watching for k8s object changes, since it cannot be used when watch=1
query arg is passed
2021-02-28 16:08:43 +02:00
Aliaksandr Valialkin
6f8866a40a
lib/promscrape: fix possible deadlock in parallel execution of target relabeling
2021-02-28 16:05:32 +02:00
Aliaksandr Valialkin
5c9e657808
lib/promscrape/discovery/kubernetes: fix deadlock in startWatcherForURL
...
reloadObjects must be called without holding aw.mu lock
2021-02-28 15:25:33 +02:00
Aliaksandr Valialkin
e77f2f8630
lib/promscrape/discovery/kubernetes: typo fix after 241ffd1f3b
2021-02-28 15:15:27 +02:00
Aliaksandr Valialkin
82441537ff
lib/promscrape/discovery/kubernetes: pre-populate labelsByKey in reloadObject()
2021-02-28 15:09:43 +02:00
Aliaksandr Valialkin
e003453941
lib/promscrape/discovery/kubernetes: compare sorted sets of labels in tests
...
This should deflake tests where the order of labels isn't stable
2021-02-28 14:12:32 +02:00
Aliaksandr Valialkin
6a21ef87b7
lib/promscrape: add missing startWatchersForRole()
call at the beginning of apiWatcher.getLabelsForRole
2021-02-28 14:00:00 +02:00
Aliaksandr Valialkin
6d0e7fb8b0
lib/promscrape/discovery/kubernetes: reload k8s resources on every error
...
This is needed for obtaining fresh resourceVersion
2021-02-27 01:46:59 +02:00
Aliaksandr Valialkin
7f1302688f
lib/fs: follow-up after f3a03c4164
2021-02-27 01:09:37 +02:00
Nikolay
d88fa5ebe4
Adds windows build ( #1040 )
...
* fixes windows compilation,
adds signal impl for windows,
adds free space usage for windows,
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1036
NOTE victoria metrics database still CANNOT work under windows system,
only vmagent is supported.
To completly port victoria metrics, you have to fix issues with separators,
parsing and posix file removall
* rollback separator
* Adds windows setInformation api,
it must behave like unix, need to test it.
changes procutil
* check for invlaid param
* Fixes posix delete semantic
* refactored a bit
* fixes openbsd build
* removed windows api call
* Fixes code after windows add
* Update lib/procutil/signal_windows.go
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-02-27 01:06:22 +02:00
Aliaksandr Valialkin
44975e28fe
lib/promscrape: yet another typo fix after ed8441ec52
2021-02-26 23:36:34 +02:00
Aliaksandr Valialkin
c1b8729bd8
lib/fs: properly handle stale NFS file handle
error during file deletion
...
This error can appear when -storageDataPath points to NFS volume and the given file has been already removed.
2021-02-26 23:24:46 +02:00
Aliaksandr Valialkin
50e74c439c
lib/promscrape: typo fix after ed8441ec52
2021-02-26 23:04:40 +02:00
Aliaksandr Valialkin
fa3ce450fb
lib/promscrape: cache ScrapeWork
...
This should reduce the time needed for updating big number of scrape targets.
2021-02-26 21:43:41 +02:00
Aliaksandr Valialkin
efcdf613c2
lib/promscrape/discovery/kubernetes: cache target labels
...
This should reduce CPU usage on repeated SDConfig.GetLabels() calls.
2021-02-26 20:24:29 +02:00
Aliaksandr Valialkin
22822feea3
lib/promscrape/discovery/kubernetes: errcheck fix
2021-02-26 19:09:12 +02:00
Aliaksandr Valialkin
dc8c045378
lib/promscrape: cleanup after 9b2246c29b
...
Main points:
* Revert changes outside lib/promscrape/discovery/kuberntes . These changes can be applied later in a separate commit
* Minimize changes in lib/promscrape/discovery/kubernetes compared to a93e644001
* Corner case fixes.
2021-02-26 19:09:12 +02:00
Nikolay
cf9262b01f
vmagent kubernetes watch stream discovery. ( #1082 )
...
* started work on sd for k8s
* continue work on watch sd
* fixes
* continue work
* continue work on sd k8s
* disable gzip
* fixes typos
* log errror
* minor fix
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-02-26 19:09:12 +02:00
Aliaksandr Valialkin
4cfac70cde
lib/promscrape: remove duplicate code a bit
2021-02-26 15:54:07 +02:00
Aliaksandr Valialkin
55eb983193
lib/promscrape: reduce processing time for big number of discovered targets by processing them in parallel
2021-02-26 12:48:48 +02:00
Aliaksandr Valialkin
f8baf1a76d
lib/promrelabel: optimize labeldrop
and labelkeep
relabeling for prefix.*
and prefix.+
regexps
2021-02-24 17:58:11 +02:00
Aliaksandr Valialkin
06676c8feb
lib/storage: consistency renaming: durationsPerDateTagFilterCache -> loopsPerDateTagFilterCache
2021-02-23 15:50:08 +02:00
faceair
b1409a7413
lib/storage: correct tagfilter match cost ( #1079 )
2021-02-22 21:54:37 +02:00
Aliaksandr Valialkin
197ecca426
lib/promrelabel: add more optimizations for relabeling for common cases
2021-02-22 16:36:54 +02:00
Aliaksandr Valialkin
63c16c3fdf
lib/promrelabel: optimize relabeling performance for common cases
2021-02-22 00:51:07 +02:00
Aliaksandr Valialkin
8b87398333
lib/promscrape: export vm_promscrape_target_relabel_duration_seconds metric
2021-02-21 23:21:35 +02:00
Aliaksandr Valialkin
587132555f
lib/mergeset: reduce memory usage for inmemoryBlock by using more compact items representation
...
This also should reduce CPU time spent by GC, since inmemoryBlock.items don't have pointers now,
so GC doesn't need visiting them.
2021-02-21 22:09:10 +02:00
Aliaksandr Valialkin
bd3bcdc43c
lib/storage: do not re-calculate stats for heavy tag filters
...
This should reduce the number of slow queries when stats for heavy tag filters was recalculated.
2021-02-21 21:43:37 +02:00
Aliaksandr Valialkin
b8a5ee2e93
lib/{mergeset,storage}: allow merging smaller number of small parts
...
While this may increase CPU and disk IO usage needed for background merge,
this also recudes CPU usage during queries in production. This is because
such queries tend to read recently added data and it is better to have lower number
of parts for such data in order to reduce CPU usage.
This partially reverts ebf8da3730
2021-02-21 21:43:37 +02:00
Aliaksandr Valialkin
34195218e1
lib/{mergeset,storage}: do not use pools for indexBlock and inmemoryBlock during their caching, since this results in higher memory usage in production without any performance gains
2021-02-21 21:43:37 +02:00
Aliaksandr Valialkin
f09b46b2b1
lib/promscrape: typo fix after the commit f26162ec99
2021-02-19 00:34:18 +02:00
Aliaksandr Valialkin
72eef964d9
app/vmagent: properly perform graceful shutdown, which was broken in the commit 1d1ba889fe
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1065
2021-02-19 00:34:17 +02:00
Aliaksandr Valialkin
502d0e2524
lib/promscrape: add scrape_align_interval config option into scrape config
...
This option allows aligning scrapes to a particular intervals.
2021-02-18 23:53:04 +02:00
Aliaksandr Valialkin
10ccb92e4d
lib/storage: use composite index for a query with a name filter and negative filters
2021-02-18 18:57:45 +02:00
Aliaksandr Valialkin
418de71509
lib/storage: properly handle queries containing a filter on metric name plus any number of negative filters and zero non-negative filters
...
Example: `node_cpu_seconds_total{mode!="idle"}`
2021-02-18 18:33:05 +02:00
Aliaksandr Valialkin
628b8eb55e
lib/storage: prevent from running identical heavy tag filters in concurrent queries when measuring the number of loops for such tag filter.
...
This should reduce CPU usage spikes when measuring the number of loops needed for heavy tag filters
2021-02-18 14:01:18 +02:00
Aliaksandr Valialkin
fd41f070db
lib/storage: sort tag filters by the number of loops they need for the execution
...
This metric should work better than the filter execution duration, since it cannot be distorted
by concurrently running queries.
2021-02-18 12:52:29 +02:00
Aliaksandr Valialkin
9566015a36
Revert "lib/mergeset: tune lifetime for entries inside block caches"
...
This reverts commit 458c89324d
.
Production testing revealed zero improvements for memory usage with reduced lifetime for entries in block caches.
2021-02-17 20:42:15 +02:00
Aliaksandr Valialkin
b4c7d5992b
lib/storage: move composite filters to the top during sorting
2021-02-17 20:26:32 +02:00
Aliaksandr Valialkin
2005d8212a
lib/storage: return back filter arg to getMetricIDsForTagFilter function
...
The filter arg has been removed in the commit c7ee2fabb8
because it was preventing from caching the number of matching time series per each tf.
Now the cache contains duration for tf execution, so the filter shouldn't break such caching.
2021-02-17 19:33:15 +02:00
Aliaksandr Valialkin
83da939947
app/vmstorage: export vm_composite_filter_success_conversions_total and vm_composite_filter_missing_conversions_total metrics
2021-02-17 19:13:49 +02:00
Aliaksandr Valialkin
0c5bb2a397
lib/storage: revert ecf132933e
, since negative filters require the same amount of work as non-negative filters
2021-02-17 18:56:13 +02:00
Aliaksandr Valialkin
bbc287ea6a
lib/storage: tag filters sorting...
2021-02-17 18:56:12 +02:00
Aliaksandr Valialkin
f5e841c1e9
lib/storage: further tune tag filters sorting
2021-02-17 17:28:36 +02:00
Aliaksandr Valialkin
9d1f14d94c
lib/storage: tune the logic for sorting tag filters according the their exeuction times
2021-02-17 15:02:19 +02:00
Aliaksandr Valialkin
1a19702d92
lib/storage: make sure that nobody uses partitions when closing the table
2021-02-17 15:02:18 +02:00
Aliaksandr Valialkin
3062ff0fdb
app/vmselect: export per-tenant stats on the number of requests and the cumulative request duration
...
The metrics are:
- vm_vmselect_http_requests_total{accountID="...",projectID="..."} - the total number of select requests per each tenant
- vm_vmselect_http_duration_ms_total{accountID="...",projectID="..."} - the total duration in milliseconds for per-tenant select requests
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/932
2021-02-16 23:30:29 +02:00
Aliaksandr Valialkin
3b8596bfec
lib/httpserver: make errcheck happy
2021-02-16 22:05:59 +02:00
Aliaksandr Valialkin
1ab7a8dfd5
lib/storage: more tuning for tag filters sorting according the time they take
2021-02-16 21:27:27 +02:00
Aliaksandr Valialkin
35a23234ca
lib/mergeset: tune lifetime for entries inside block caches
...
This should reduce memory usage in general case without significant CPU usage increase
2021-02-16 18:12:32 +02:00
Aliaksandr Valialkin
500acb958c
lib/mergeset: clarify comments in the code a bit
2021-02-16 18:03:11 +02:00
Aliaksandr Valialkin
da05c10544
lib/uint64set: remove memory allocation in bucket16.appendTo when sorting smallPool
2021-02-16 15:31:59 +02:00
Aliaksandr Valialkin
18305caadb
lib/httpserver: cache /metrics
output for a second
...
This should reduce CPU load when `/metrics` output is scraped with a frequency exceeding a request per second
2021-02-16 14:57:13 +02:00
Aliaksandr Valialkin
7869d38043
lib/protoparser/influx: make sure that escaped whitespace can be put in measurement, tag names and field names
2021-02-16 13:59:11 +02:00
Aliaksandr Valialkin
d788876a7a
lib/mergeset: remove unused code after a4140de9e6
2021-02-16 13:40:29 +02:00
Aliaksandr Valialkin
55952f8f2e
lib/storage: tune sorting for tag filters
2021-02-16 13:07:42 +02:00
Aliaksandr Valialkin
3eae03a337
lib/storage: increase match cost for negative tag filters, since they need to scan all the label pairs
2021-02-15 16:39:52 +02:00
Aliaksandr Valialkin
46e98ed490
vendor: update github.com/VictoriaMetrics/metrics from v1.13.1 to v1.14.0
...
The new version switches from log-linear histograms to log-based histograms,
which provide up to 3.6 times better accuracy.
2021-02-15 15:11:15 +02:00
Aliaksandr Valialkin
93ff866e91
lib/storage: reduce the minimum supported retention for inverted index from one month to one day
2021-02-15 15:11:15 +02:00
Aliaksandr Valialkin
5d5d310dcc
lib/flagutil: prevent from integer overflow when parsing duration
2021-02-15 15:11:15 +02:00
Aliaksandr Valialkin
fccb481de2
lib/promscrape/discovery/kubernetes: add __meta_kubernetes_endpoints_label_*
and __meta_kuberntes_endpoints_annotation_*
labels to role: endpoints
...
This syncs kubernetes SD with Prometheus 2.25
See 617c56f55a
2021-02-15 02:51:36 +02:00
Aliaksandr Valialkin
54a09de037
lib/logger: explicitly import "time/tzdata" package for embedding tzdata into the app
...
The approach with `timetzdata` build tag didn't work for GOARCH=arm and GOARCH=ppc64le
due to the issue https://github.com/golang/go/issues/44073#issuecomment-778854298
2021-02-15 01:00:30 +02:00
Aliaksandr Valialkin
6f3bbf21b8
lib/storage: sort tag filters by actual execution time instead of by the number of matching time series
...
This should improve query speed for queries with regexp filters matching small number of time series
on a label with big number of unique values.
2021-02-15 00:19:46 +02:00
Aliaksandr Valialkin
9e3993c585
lib/storage: properly hanle regexp tag filters with dots, which can be converted to full string match filters.
...
For example `{label=~"foo\.bar"}` should be converted to `{label="foo.bar"}`. Previously it has was mistakenly conveted to `{label="foo\.bar"}` .
This could result in missing time series for such tag filters.
2021-02-14 23:39:19 +02:00
Aliaksandr Valialkin
18c2075159
lib/promscrape: remove vm_promscrape_scrapes_failed_per_url_total and vm_promscrape_scrapes_skipped_by_sample_limit_per_url_total metrics
...
These metrics may result in big number of time series when vmagent scrapes thousands of targets and these targets constantly changes.
* It is better using `up == 0` query for determining failing targets.
* It is better using the following query for determining targets with exceeded limit on the number of metrics:
scrape_samples_scraped > 0 if up == 0
2021-02-12 05:23:27 +02:00
Aliaksandr Valialkin
4e645a5fd3
lib/storage: return back in-order applying of tag filters, since concurrently executing tag filters can result in CPU and RAM waste in common case
2021-02-10 22:43:07 +02:00
Aliaksandr Valialkin
eeb92eb7fc
lib/storage: load metadata before loading indexdb, since indexdb depends on the metadata
2021-02-10 17:55:51 +02:00
Aliaksandr Valialkin
08f21d8761
app/vmstorage: export vm_composite_index_min_timestamp metric
2021-02-10 17:14:00 +02:00
Aliaksandr Valialkin
b27288f1b0
lib/storage: parallelize tag filters execution a bit
...
This should reduce execution time when a query contains multiple tag filters and each such filter matches big number of time series.
2021-02-10 16:32:27 +02:00
Aliaksandr Valialkin
4262c2f7c2
lib/storage: remove filter arg from getMetricIDsForDateTagFilter function
...
The `filter` arg breaks the logic for sorting tag filters by the matching metrics,
which may result in non-optimal performance during time series search.
2021-02-10 16:32:26 +02:00
Aliaksandr Valialkin
681dfb7485
lib/storage: fix inconsistencies in error logs
2021-02-10 16:32:21 +02:00
Aliaksandr Valialkin
148422bcba
lib/storage: disable composite index usage when querying old data
2021-02-10 14:57:58 +02:00
Aliaksandr Valialkin
17d5a03f6e
lib/storage: fix metric name match for composite filter
2021-02-10 01:27:34 +02:00
Aliaksandr Valialkin
fa0ef143b1
lib/storage: optimize search by label filters matching big number of time series
2021-02-10 00:46:17 +02:00
Aliaksandr Valialkin
5c9715a89a
lib/storage: reduce lock contention in dateMetricIDCache when registering new time series for the current day
...
This should help systems with multiple CPU cores
2021-02-10 00:04:19 +02:00
Aliaksandr Valialkin
082eabf51e
lib/fs: remove the code for tracking whether the given memory region is in page cache
...
This code didn't give performance gains under production workload, so let's remove it in order to simplify the code.
2021-02-09 16:51:11 +02:00
Aliaksandr Valialkin
afa9cf9c57
lib/mergeset: remove dead code left after a4140de9e6
2021-02-09 16:51:09 +02:00
Aliaksandr Valialkin
9ed7789fef
optimize Storage.updatePerDateData()
2021-02-09 02:59:53 +02:00
Aliaksandr Valialkin
ea328b7391
lib/storage: skip deduplication when creating inmemory data blocks
...
The deduplication will be performed later during merging such blocks.
2021-02-09 02:26:16 +02:00
Aliaksandr Valialkin
7b7963a77f
lib/mergeset: unconditionally cache indexdb blocks
...
Production workloads show that indexdb blocks must be cached unconditionally for reducing CPU usage.
This shouldn't increase memory usage too much, since unused blocks are removed from the cache every two minutes.
2021-02-09 00:49:59 +02:00
Aliaksandr Valialkin
e8ee9fa7fe
app/vmstorage: export missing vm_cache_size_bytes
metrics for indexdb and data caches
2021-02-09 00:49:58 +02:00
Aliaksandr Valialkin
2b36eb3d82
lib/cgroup: follow-up after b9bf3cbe3e
2021-02-08 16:01:26 +02:00
Nikolay
32603f57cc
refactored cgroups limits, ( #1061 )
...
adds tests, remove os.Exec
2021-02-08 16:01:26 +02:00
Aliaksandr Valialkin
2dbb12563b
lib/storage: optimize data ingestion in the beginning of every hour
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1046
2021-02-08 12:04:51 +02:00
Aliaksandr Valialkin
1a6eb0c3cf
lib/logger: exit the app if unsupported timezone value has been passed to -loggerTimezone
...
While at it, clarify descrption for `-loggerTimezone` command-line flag.
2021-02-07 23:33:35 +02:00
Aliaksandr Valialkin
c6a7288109
lib/storage: check for prevHourMetricIDs cache before falling back to checking for (date, metricID) entries during data ingestion
...
This should reduce possible CPU usage spikes at the beginning of every hour.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1046
2021-02-04 18:46:23 +02:00
Aliaksandr Valialkin
b5eba70595
lib/httpserver: expose process_open_fds
and process_max_fds
metrics
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/402
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1037
2021-02-04 16:42:32 +02:00
Nikolay
6fc2a8e544
fixes dockerswarm ( #1053 )
...
fixes improper usage of host network services
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1028
2021-02-04 15:57:41 +02:00
Aliaksandr Valialkin
8249f13104
app/vmselect,lib/storage: properly parse Graphite selectors with inner wildcards
...
Example: foo{bar{x,yz},a[b-c],*de}
2021-02-03 20:16:28 +02:00
Aliaksandr Valialkin
2976ec89b8
lib/storage: fix a bug, which breaks searching by Graphite wildcard filters
2021-02-03 20:15:50 +02:00
Aliaksandr Valialkin
45a63a1da9
sort orSuffixes in tagFilter.InitFromGraphiteQuery for faster seeks
2021-02-03 20:15:37 +02:00
Aliaksandr Valialkin
4b930b9ffe
app/vmselect: add ability to set Graphite-compatible filter via {__graphite__="foo.*.bar"}
syntax
2021-02-03 01:17:19 +02:00
Aliaksandr Valialkin
755b0998ce
lib/promscrape: add vm_promscrape_service_discovery_duration_seconds metric
2021-02-02 16:16:51 +02:00
Aliaksandr Valialkin
4c59dbc127
lib/promscrape: add vm_promscrape_scrape_retries_total
, vm_promscrape_discovery_retries_total
and vm_promscrape_discovery_requests_total
metrics
2021-02-01 20:06:16 +02:00
Aliaksandr Valialkin
e79799a784
lib/flagutil: typo fix in comment to ArrayInt.GetOptionalArgOrDefault() func
2021-02-01 14:42:15 +02:00
Aliaksandr Valialkin
fdf9de98f8
app/vmagent: add -remoteWrite.roundDigits command-line option for limiting the number of digits after the point for stored values
...
This commit also adds --vm-round-digits command-line option to vmctl tool.
2021-02-01 14:42:15 +02:00
Aliaksandr Valialkin
2c14c70903
lib/logger: initialize timezone by UTC in order to fix failing tests
2021-01-27 01:00:57 +02:00
Aliaksandr Valialkin
ffec5131ae
lib/promscrape: export vm_promscrape_scrapes_failed_per_url_total
and vm_promscrape_scrapes_skipped_by_sample_limit_per_url_total
metrics
...
These metrics could be useful for determining imporperly working scrape targets.
Note that these metrics are exported only for failing scrape targets. They aren't exposed for normally working targets.
2021-01-27 00:40:39 +02:00
Aliaksandr Valialkin
4b324da947
all: consistently use timers from timerpool
2021-01-27 00:40:39 +02:00
Aliaksandr Valialkin
588090765c
lib/fs: properly initialize cleaner for pageCache bitmaps
...
Previously it wasnt working because the timer was fired only once
2021-01-27 00:40:39 +02:00
Aliaksandr Valialkin
68a66be811
lib/logger: add -loggerTimezone
command-line flag for adjusting timezone for timestamps in log messages
2021-01-26 22:53:25 +02:00
Aliaksandr Valialkin
fdced59278
lib/promscrape: retry scrape and service discovery requests when the remote server closes http keep-alive connection
2021-01-22 13:22:59 +02:00
faceair
cf76d8dd79
lib/mergeset: add missing shouldCacheBlock ( #1019 )
2021-01-15 11:47:25 +02:00
Aliaksandr Valialkin
81789da731
lib/backup: increase backup chunk size from 128MB to 1GB
...
This should reduce costs for object storage API calls by 8x. See https://cloud.google.com/storage/pricing#operations-pricing
2021-01-13 12:16:39 +02:00
Nikolay
095be83f2f
bumps minimal tls version ( #1012 )
2021-01-13 00:39:09 +02:00
Aliaksandr Valialkin
3d79471fb3
lib/storage: inline marshalTags function and remove the code for handling duplicate tags from here
...
This is a follow-up commit after c8ea697db8
2021-01-12 15:20:22 +02:00
Aliaksandr Valialkin
719ad49adf
lib/storage: de-duplicate tags in MetricName.sortTags
...
Leave only the last tag among tags with duplicate keys. This is needed for reliable addition of extra_labels
during data ingestion. See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1007 for details.
2021-01-12 15:03:22 +02:00
Nikolay
9f0a4fd00e
Fixes error handling for promscrape.streamParse ( #1009 )
...
properly return error if client cannot read data,
properly suppress scraper errors
2021-01-12 13:35:09 +02:00
Aliaksandr Valialkin
c97681b45c
lib/promscrape: properly show scrape duration on /targets
page
...
Previously it has been shown as 0.000s for any scrape duration.
2021-01-11 21:15:50 +02:00
Aliaksandr Valialkin
4ee53c3961
all: use net.Dial
instead of fasthttp.Dial
, because fasthttp.Dial
limits the number of concurrent dials to 1000
2021-01-11 12:52:51 +02:00
Aliaksandr Valialkin
d5a2b120e9
app/vmstorage: disable final merge by default, since it may result in high disk IO and CPU usage without measurable benefits such as increased query performance and reduced disk space usage
2021-01-08 00:12:12 +02:00
Aliaksandr Valialkin
ca8919e8e1
lib/storage: wait for pending transactions before closing and dropping the partition
...
This deflakes `make test-full-386` test
2020-12-25 11:46:47 +02:00
Aliaksandr Valialkin
c0511144e3
lib/storage: physically remove stale parts
...
Previously they were removed from partition struct, but the corresponding directories weren't removed.
This is a follow-up for 46dba00756
2020-12-24 16:56:09 +02:00
Nikolay
dd2f815204
fixes kubernetes_sd ( #983 )
...
* fixes kubernetes_sd,
adds missing service metadata for pod ports without endpoint
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/982
* fix test
2020-12-24 11:34:12 +02:00
Aliaksandr Valialkin
367fc17933
lib/promscrape: code prettifying for 8dd03ecf19
2020-12-24 10:57:20 +02:00
Nikolay
b00f7816e2
adds proxy_url support, ( #980 )
...
* adds proxy_url support,
adds proxy_url to the dockerswarm, eureka, kubernetes and consul service discovery,
adds proxy_url to the scrape_config for targets scrapping,
http based proxy is supported atm,
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/503
* fixes imports
2020-12-24 10:57:19 +02:00
Aliaksandr Valialkin
66f8fbbb32
lib/storage: do not remove parts outside the configured retention if they are currently merged
...
These parts are automatically removed after the merge is complete.
2020-12-24 09:02:12 +02:00
Aliaksandr Valialkin
fa3bcf220f
lib/storage: remove stale parts as soon as they go outside the configured retention
...
Previously such parts could remain undeleted for long durations until they are merged with other parts.
This should help for `-retentionPeriod` values smaller than one month.
2020-12-22 19:55:07 +02:00
Aliaksandr Valialkin
6859737329
lib/storage: properly determine max rows for output part when merging small parts
2020-12-18 23:26:28 +02:00
Aliaksandr Valialkin
edbe35509e
lib/{storage,mergeset}: tune background merge process in order to reduce CPU usage and disk IO usage
2020-12-18 20:01:20 +02:00
Aliaksandr Valialkin
1ee5a234dc
lib/promscrape: remove ID
field from ScrapeWork
struct. Use a pointer to ScrapeWork as a key in targetStatusMap
...
This simplifies the code a bit.
2020-12-17 14:31:55 +02:00
Aliaksandr Valialkin
4fd2973e7c
lib/protoparser/prometheus: follow-up commit after 7d38627b9f6f212ae602aea6a72f469fe3c70ba2
...
Document the bugfix in docs/CHANGELOG.md and add a test for the bugfix.
2020-12-16 23:42:17 +02:00
BigFish
60dd48c9eb
lib/protoparser/prometheus/parser.go ( #970 )
...
fix parse timestamp error if there are some whitespaces after timestamp
2020-12-16 23:42:16 +02:00
Aliaksandr Valialkin
c4ec977934
lib/promscrape: properly remove deleted target from /targets
page
...
Previously `sw` variable wasn't captured correctly by the started goroutine.
2020-12-15 20:58:37 +02:00
Aliaksandr Valialkin
eddc2bd017
lib/promscrape: properly handle scrape errors when stream parsing is enabled
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/967
2020-12-15 14:10:52 +02:00
Nikolay
7064c4eb8e
adds new Array Flags ( #965 )
...
* adds ArrayDuration and ArrayBool flags,
makes sendTimeout and tlsInsecure configurable per remoteWrite url
* added backward compatibility testcases for ArrayDuration and ArrayBool
* fixes bool flag
* fixes test cases
2020-12-15 12:59:33 +02:00
Aliaksandr Valialkin
104aac170e
lib/promscrape: add bootstrap styles to /targets
html page
2020-12-15 12:38:29 +02:00
Aliaksandr Valialkin
ad961fe1f1
lib/promscrape: formatting fixes for /tarets
page
2020-12-15 11:59:28 +02:00
Aliaksandr Valialkin
38145cfbb8
lib/promscrape: formatting fixes for /targets
page
2020-12-15 11:27:22 +02:00
Aliaksandr Valialkin
d98a2f217b
lib/persistentqueue: verify that ReaderOffset doesnt exceed WriterOffset when opening the persistent queue
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/964
2020-12-14 19:25:53 +02:00
Aliaksandr Valialkin
b904ae3722
lib/promscrape: add missing whitespace between duration and ago
word at /targets
page
2020-12-14 14:20:30 +02:00
Aliaksandr Valialkin
a2eb451de4
app/{vmagent,vminsert}: follow-up for ce8c2dd1f1
: return /targets
page in HTML when requested via web browser
2020-12-14 14:13:01 +02:00
Nikolay
324e3aa1a5
Changes targets api ( #961 )
...
* changes /targets api
adds html response if requester accepts text/html,
adds quick template for /targets api,
fixes pathPrefix for / requests
* changes namings
* renamed targets file
* Update app/victoria-metrics/main.go
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
* adds trimspace to qtpl,
moves content-type for targets response closer to writer
* fixes bug with prefix
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-12-14 14:13:00 +02:00
Aliaksandr Valialkin
c80d38f00c
lib/promscrape/discovery/consul: reduce load on Consul API server by increasing timeout for blocking requests from 50 seconds to 9 minutes
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574
2020-12-11 17:26:34 +02:00
Aliaksandr Valialkin
a84467958a
lib/promscrape/discovery/consul: properly pass Datacenter filter to Consul API server
...
Previously it has been passed as `sdc` query arg, while it should be passed as `dc` query arg.
See https://www.consul.io/api-docs/health#list-nodes-for-service for details.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574#issuecomment-740454170
2020-12-08 21:53:23 +02:00
Aliaksandr Valialkin
1a237c6903
all: properly handle CPU limits set on the host system/container
...
This can reduce memory usage on systems with enabled CPU limits.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/946
2020-12-08 21:07:03 +02:00
Aliaksandr Valialkin
38188e1d6b
lib/promscrape: store ScrapeWork items by pointer in the slice returned from get*ScrapeWork()
...
This should prevent from possible 'memory leaks' when a pointer to ScrapeWork item stored in the slice
could prevent from releasing memory occupied by all the ScrapeWork items stored in the slice when they
are no longer used.
See the related commit e205975716
and the related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-12-08 17:55:21 +02:00
Aliaksandr Valialkin
d5faad0240
lib/promscrape: re-use strings for labels stored in ScrapeWork
...
This should reduce memory usage when working with big number of scrape targets.
2020-12-08 12:23:44 +02:00
Aliaksandr Valialkin
06091cfdf8
lib/promscrape: export vm_promscrape_scrapers_{started|stopped}_total
metrics for monitoring target churn rate
2020-12-08 11:58:44 +02:00
Aliaksandr Valialkin
affcee199c
lib/promscrape: store targetStatus entries in targetStatusMap by pointer instead of by value
...
This guarantees that GC frees memory occupied by targetStatus after it is unregistered from targetStatusMap.
2020-12-08 11:52:20 +02:00
Aliaksandr Valialkin
56a0b058c1
lib/promscrape: export vm_promscrape_active_scrapers{type="<sd_type>"}
metric for tracking the number of active scrapers per each service discovery type
2020-12-08 01:54:44 +02:00
Aliaksandr Valialkin
b5b32c65b0
lib/promscrape: do not enable strict config parsing when -promscrape.config.dryRun
command-line flag is passed
...
Strict parsing for -promscrape.config can be enabled by passing `-promscrape.config.strictParse` command-line flag.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/944
2020-12-07 13:18:16 +02:00
Aliaksandr Valialkin
9d79d11a6c
lib/promscrape: mention in scrape error message that scrape errors can be disabled by -promscrape.suppressScrapeErrors
command-line flag
2020-12-06 23:27:07 +02:00
Aliaksandr Valialkin
f6d32f99d7
lib/promscrape: clarify error message on failed connection to scrape target when -enableTCP6 command-line flag isn't set
2020-12-06 13:19:32 +02:00
Aliaksandr Valialkin
3d00613076
lib/protoparser/influx: allow multiple whitespace chars between measurement, fields and timestamp in Influx line protocol
2020-12-06 12:00:28 +02:00
Aliaksandr Valialkin
1430bbcf33
lib/promscrape/discoveryutils: remove limit on the number of concurrently running blocking queries
...
Too low limit could result in unexpected errors when performing big number of blocking queries.
2020-12-05 12:15:47 +02:00
Aliaksandr Valialkin
528587deef
lib/flagutil: make golangci-lint happy by using strings.TrimPrefix instead of manual prefix removal via strings.HasPrefix
2020-12-03 22:07:26 +02:00
Aliaksandr Valialkin
bdac2171f1
all: do not print usage info for all the flags when incorrect command-line flag is passed
...
This should improve usability for VictoriaMetrics apps that have big number of command-line flags,
i.e. all the apps.
2020-12-03 21:46:19 +02:00
Aliaksandr Valialkin
96190f9d45
lib/promscrape/discovery/consul: log the time needed for stoppig Consul service watcher
2020-12-03 20:14:48 +02:00
Aliaksandr Valialkin
4e4a93c586
lib/promscrape/discovery/consul: make sure that block response contains X-Consul-Index header
2020-12-03 20:05:54 +02:00
Aliaksandr Valialkin
7a889f6850
lib/promscrape: code cleanup after c6dee6c52d
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574
2020-12-03 19:52:09 +02:00
Nikolay
0b302d33cb
Changes consul discovery api ( #921 )
...
* adds consul watch api,
it must reduce load on consul service with blocking wait requests,
changed discoveryClient api with fetchResponseMeta callback.
* small fix
* fix after master merge
* adds watch client at discovery utils
* fixes consul watcher,
changes namings,
fixes data race
* small typo fix
* sanity fix
* fix naming and service node update
2020-12-03 19:52:08 +02:00
Aliaksandr Valialkin
9eca96596f
lib/storage: add missing (AccountID, ProjectID) in MetricName.String() test
2020-11-29 01:25:50 +02:00
Aliaksandr Valialkin
2385ac11c0
lib/promscrape: fix failing tests after a906b3862f
2020-11-29 01:25:49 +02:00
Aliaksandr Valialkin
2ed721e457
lib/protoparser/prometheus: properly parse OpenMetrics timestamps
...
OpenMetrics timestamps are floating-point numbers, that represent Unix timestamp in seconds.
This differs from Prometheus exposition format, where timestamps are integer numbers representing Unix timestamp in milliseconds.
2020-11-27 14:54:36 +02:00
Aliaksandr Valialkin
3bb9bf33d6
lib/promscrape: reduce memory allocations when unpacking gzipped responses received from scrape targets
2020-11-26 18:32:16 +02:00
Aliaksandr Valialkin
af667c59c1
all: typo fix: thouthand->thousand
2020-11-26 13:34:05 +02:00
Aliaksandr Valialkin
3dd2282ed9
lib/promscrape: release http response non-200 status code
2020-11-26 13:25:25 +02:00
Aliaksandr Valialkin
3f52e59efe
app/{vmagent,victoria-metrics}: add -dryRun
option and make more clear handling for -promscrape.config.dryRun
2020-11-25 23:01:39 +02:00
Aliaksandr Valialkin
8f8727cb65
lib/mergeset: tune the number of rawItemsBlocks to merge at once
...
512 blocks give higher ingestion performance and slightly lower memory usage
2020-11-25 21:53:15 +02:00
Aliaksandr Valialkin
8fcd87a6a5
lib/mergeset: help GC by removing refereces to slices in inmemoryBlock.Reset
2020-11-25 21:20:02 +02:00
Aliaksandr Valialkin
03002f1fe1
lib/storage: log metric name plus all its labels when the metric timestamp is outside the configured retention
...
This should simplify debugging when the source of the metric with unexpected timestamp must be found.
2020-11-25 14:44:29 +02:00
Aliaksandr Valialkin
4848a05924
lib/storage: typo fix in error message: allowd->allowed
2020-11-25 14:15:54 +02:00
Aliaksandr Valialkin
26e699c440
lib/protoparser/prometheus: properly parse "infinity" values in OpenMetrics format
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/924
2020-11-24 19:02:50 +02:00
Aliaksandr Valialkin
a8b5a6e517
lib/logger: disable rate limiting for error and warn logs by default
2020-11-24 12:42:07 +02:00
Aliaksandr Valialkin
7f3e884a31
all: spelling fix: superflouos->superfluous. This is a follow-up for 0acdab3ab9
2020-11-24 12:42:04 +02:00
Aliaksandr Valialkin
dad8b76a0e
lib/protoparser/prometheus: properly parse metrics with exemplars
...
Examplars have been introduced in OpenMetrics - see https://github.com/OpenObservability/OpenMetrics/blob/master/OpenMetrics.md#exemplars-1
Previously VictoriaMetrics couldn't parse the following metric
foo{bar="baz"} 123 # exemplar here
This commit fixes this. Note that VictoriaMetrics ignores the exemplar as for now.
2020-11-24 12:30:34 +02:00
Aliaksandr Valialkin
8b82f9d8b8
lib/promscrape: expose __meta_ec2_ipv6_addresses
label for ec2_sd_config
like Prometheus will do in the next release
2020-11-23 16:57:03 +02:00
Aliaksandr Valialkin
c2186279b7
lib/promscrape: add filters
option to dockerswarm_sd_config
like Prometheus did in v2.23.0
2020-11-23 16:27:33 +02:00
Aliaksandr Valialkin
f4fd917e4f
lib/fs: replace fs.OpenReaderAt with fs.MustOpenReaderAt
...
All the callers for fs.OpenReaderAt expect that the file will be opened.
So it is better to log fatal error inside fs.MustOpenReaderAt instead of leaving this to the caller.
2020-11-23 09:57:30 +02:00
Aliaksandr Valialkin
d892d63204
lib/promscrape: hint that -enableTCP6 command-line flag can be used for connecting to IPv6 addresses
2020-11-21 14:39:05 +02:00
Aliaksandr Valialkin
8608e956dd
lib/promscrape/discovery/eureka: follow-up after eec76718e9
2020-11-20 14:02:14 +02:00
Nikolay
bb2bcb9725
Adds eureka service discovery ( #913 )
...
* Adds eureka service discovery
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/851
Netflix service discovery for AWS
* Apply suggestions from code review
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-11-20 14:02:13 +02:00
Aliaksandr Valialkin
8e1f657ef9
lib/logger: follow-up for 09105ff49c
2020-11-19 12:37:05 +02:00
Nikolay
f54a5f3868
Adds log suppression per caller ( #908 )
...
* Adds log suppression per caller
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/905
* fixes style and report message
2020-11-19 12:19:20 +02:00
Aliaksandr Valialkin
0895b7f411
lib/logger: add -loggerWarnsPerSecondLimit
command-line flag for rate limiting of WARN log messages
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/905
2020-11-18 03:43:17 +02:00
Aliaksandr Valialkin
7d76fdedcc
app/vmselect: use storage.NewSearchQuery() instead of constructing storage.SearchQuery in-place
...
This should prevent from bugs when AccountID and ProjectID aren't set in storage.SearchQuery.
2020-11-16 18:04:33 +02:00
Aliaksandr Valialkin
a9287cf564
lib/storage: do not pass (accountID, projectID) to SearchTagNames(), since they are already passed via tfss
2020-11-16 18:04:30 +02:00
Aliaksandr Valialkin
ac7460abdd
lib/storage: add a test for Storage.SearchMetricNames
2020-11-16 13:18:48 +02:00
Aliaksandr Valialkin
eea1be0d5c
app/vmselect/graphite: add /tags/findSeries handler from Graphite Tags API
...
See https://graphite.readthedocs.io/en/stable/tags.html#exploring-tags
2020-11-16 12:52:23 +02:00
Aliaksandr Valialkin
4be5b5733a
app/vminsert: add /tags/tagSeries
and /tags/tagMultiSeries
handlers from Graphite Tags API
...
See https://graphite.readthedocs.io/en/stable/tags.html#adding-series-to-the-tagdb
2020-11-16 02:40:04 +02:00
Aliaksandr Valialkin
9ec964bff8
lib/storage: do not show artifically created label for reverse Graphite labels at /api/v1/labels page
2020-11-16 00:44:54 +02:00
Aliaksandr Valialkin
f80d6473e1
lib/protoparser/promremotewrite: log the time spent on unsuccessful data read from the network
...
This should help with debugging `connection timed out` errors.
2020-11-13 17:49:21 +02:00
Vasily
8ba168f3be
Add omitempty for DisableCompression and DisableKeepAlive fields in ScrapeConfig ( #796 )
...
* Add omitempty for DisableCompression and DisableKeepAlive fields in ScrapeConfig
* Add omitempty annotation to all the default/optional values
* Fix annotations after review
2020-11-13 16:17:03 +02:00
Aliaksandr Valialkin
d21b1606a1
lib/protoparser/opentsdbhttp: increment errors counter on unmarshal errors
...
This is a follow-up for 149c0c4a6d
2020-11-13 13:23:11 +02:00
Aliaksandr Valialkin
22c1e29284
lib/protoparser: propagate callback error to the caller of ParseStream for every supported data ingestion protocols
...
The caller of ParseStream then can generate HTTP 503 responses for non-nil errors occured in callbacks when processing incoming requests.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/896
2020-11-13 13:05:34 +02:00
Aliaksandr Valialkin
9dfe00c962
lib/protoparser/promremotewrite: synchronously process Prometheus remote_write requests
...
There is no reason in processing these requests asynchronously in the face of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/896
Synchronous processing code is easier to read and understand than the previous async code
2020-11-13 12:17:32 +02:00
Aliaksandr Valialkin
739b88c1e4
lib/protoparser/promremotewrite: forward errors, which can occur during data ingestion, to the caller of ParseStream, so it could properly return HTTP 503 status code on non-nil error
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/896
2020-11-13 11:00:41 +02:00
Aliaksandr Valialkin
7ceaf4ba8f
all: consistently return text-based HTTP responses with charset=utf-8
...
This is a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897
2020-11-13 10:30:21 +02:00
immerrr again
1ec1a9f27f
app/vmstorage: add "/internal/force_flush" endpoint ( #893 )
2020-11-11 14:46:37 +02:00
Aliaksandr Valialkin
697fd44158
lib/promscrape: make a copy of ScrapeWork from discovered []ScrapeWork slice instead of referring to an item in this slice
...
This should prevent from holding previously discovered []ScrapeWork slices when a part of discovered targets changes over time.
This should reduce memory usage for the case when big number of discovered scrape targets changes over time.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-11-10 16:13:31 +02:00
Aliaksandr Valialkin
2ec02b7bdb
lib/promscrape: pre-allocate slice for discovered targets based on previously discovered targets
...
This should reduce load on GC a bit when discovering big number of scrape targets
2020-11-10 15:57:43 +02:00
Aliaksandr Valialkin
a8562d643b
lib/promscrape: add -promscrape.dropOriginalLabels
command-line flag for reducing memory usage when discovering big number of scrape targets
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/878
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-11-10 00:20:49 +02:00
Aliaksandr Valialkin
aa3e46a02d
lib/promscrape: further reduce memory usage for per-scrape target labels by making a copy of actually used labels
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-11-09 10:55:03 +02:00
Aliaksandr Valialkin
b8083b7659
lib/promscrape: clean references to label name and label value strings after applying per-target relabeling
...
This should reduce memory usage when per-target relabeling creates big number of temporary labels
with long names and/or values.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-11-07 16:19:52 +02:00
Aliaksandr Valialkin
b4efe626d7
lib/promscrape/discovery/kubernetes: go fmt
2020-11-07 13:04:09 +02:00
Aliaksandr Valialkin
92bc1afcee
lib/promscrape/discovery/kubernetes: reduce memory usage for labels when discovering big number of scrape targets by using string concatenation instead of fmt.Sprintf
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825
2020-11-07 13:03:01 +02:00
Aliaksandr Valialkin
535fea3d11
lib/promscrape: eliminate data race in stream parse
mode
...
Previously `-promscrape.streamParse` mode could result in garbage labels for the scraped metrics because of data race.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/825#issuecomment-723198247
2020-11-07 12:45:52 +02:00
Aliaksandr Valialkin
72011bcc45
app/vmselect: properly handle errors in GetLabelsOnTimeRange and GetLabelValuesOnTimeRange
2020-11-05 01:36:34 +02:00
Aliaksandr Valialkin
f2bff64933
lib/storage: remove data race when updating rowsDeleted
2020-11-05 01:19:30 +02:00
Aliaksandr Valialkin
c5e6c5f5a6
app/vmselect: optimize querying for /api/v1/labels
and /api/v1/label/<name>/values
when start
and end
args are set
2020-11-05 01:19:29 +02:00
Nikolay
5b235b902b
Adds ready probe ( #874 )
...
* adds leading forward slash check for scrapeURL path
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/835
* adds ready probe for scrape config initialization,
it should prevent metrics loss during vmagent rolling update,
/ready api will return 425 http code, if some scrape config still waits for initialization.
* updates docs
* Update app/vmagent/README.md
* renames var
* Update app/vmagent/README.md
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-11-04 20:33:48 +02:00
Aliaksandr Valialkin
2cd86d0220
lib/promscrape: docs update after e4182dd896
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/878
2020-11-04 17:13:34 +02:00
Nikolay
d0a9b24c5a
reduces memory usage for vmagent, ( #880 )
...
* reduces memory usage for vmagent,
limits count of droppedTarget, that can be stored for /api/v1/targets page up to 999 items,
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/878
* Update app/vmagent/README.md
* Update app/vmagent/README.md
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-11-04 17:13:33 +02:00
Aliaksandr Valialkin
c736339843
lib/{storage,mergeset}: clean cached index blocks and inmemory blocks more aggressively
...
Previously such blocks were cleaned after they weren't accessed during 10 minutes.
Now they are cleaned after one minute of missing access. This should reduce memory usage in general case.
2020-11-04 16:44:15 +02:00
Aliaksandr Valialkin
f35aafb6a5
Revert "lib/promscrape: add -promscrape.dropOriginalLabels
command-line flag for reducing memory usage when discovering big number of scrape targets"
...
This reverts commit b08c6f5144
.
2020-11-04 11:45:38 +02:00
Aliaksandr Valialkin
b08c6f5144
lib/promscrape: add -promscrape.dropOriginalLabels
command-line flag for reducing memory usage when discovering big number of scrape targets
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/878
2020-11-04 11:09:05 +02:00
Aliaksandr Valialkin
c046735571
lib/promscrape: reduce memory allocations in promLabelsString() function
...
This should help with reducing memory usage in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/878
2020-11-04 10:38:59 +02:00
Aliaksandr Valialkin
c0bd208c77
lib/storage: do not report about the need of free disk space if parts cannot be merged due to too big write amplification
2020-11-03 15:32:09 +02:00
Aliaksandr Valialkin
1b9778a756
lib/storage: remove unneeded fmt.Sprintf
2020-11-03 14:21:04 +02:00
John Belmonte
8653e2658e
add short_version label to vm_app_version metric ( #877 )
...
* add short_version label to vm_app_version metric
use case: Version panel of Grafana dashboard should use a live query, but currently it uses a template query which becomes stale. Grafana is not able to preform regex substitution on labels.
* Update metrics.go
* fix compile
2020-11-03 14:12:42 +02:00
Aliaksandr Valialkin
f3a7e6f6e3
lib/storage: remove obsolete code
2020-11-02 19:17:30 +02:00
Aliaksandr Valialkin
3ed9f1d5a9
lib/promscrape: properly handle response body after 301 redirect
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/869
2020-11-02 01:09:59 +02:00
Nikolay
e8fe618bbb
fixes panic at scrape error body formating, ( #868 )
...
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/864
regression after body reuse improvements
2020-11-01 23:25:23 +02:00
Nikolay
058f49de57
adds leading forward slash check for scrapeURL path ( #855 )
...
* fixes in-consistency with prometheus behaviour for scrape targets url path.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/835
2020-11-01 23:22:16 +02:00
Aliaksandr Valialkin
ed724d25ba
lib/promscrape: add stream parse
mode for efficient scraping of targets that expose millions of metrics
2020-11-01 23:12:26 +02:00
Aliaksandr Valialkin
901514be88
lib/storage: drop more samples outside the given retention during background merge
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/17
2020-10-31 20:44:47 +02:00
Aliaksandr Valialkin
abdf22e0bb
app/vmagent: expose /api/v1/targets
page according to https://prometheus.io/docs/prometheus/latest/querying/api/#targets
...
This page is exposed by vmagent and by a single-node VictoriaMetrics
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/643
2020-10-20 21:55:14 +03:00
Aliaksandr Valialkin
7599e5c835
lib/storage: properly handle the case when key="__name__" is passed to MetricName.AddTag*
2020-10-20 20:09:52 +03:00
Aliaksandr Valialkin
9c5cd5a6c5
lib/storage: code cleanup after 5bfd4e6218
2020-10-20 16:10:53 +03:00
Aliaksandr Valialkin
0db7c2b500
app/vmstorage: support for -retentionPeriod
smaller than one month
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/173
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/17
2020-10-20 14:42:46 +03:00
Aliaksandr Valialkin
5518b11720
lib/memory: do not print trailing zeroes in logs for -memory.allowedPercent
command-line flag
2020-10-20 14:42:37 +03:00
faceair
0093ee3cd9
disable response compression on websocket ( #841 )
2020-10-17 13:33:37 +03:00
Aliaksandr Valialkin
efb1989193
lib/storage: small code adjustements after d2960a20e0
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/781
2020-10-17 01:17:12 +03:00
faceair
8ddf089deb
evaluate the execution cost of all tag filters ( #824 )
...
* evaluate the execution cost of all tag filters
* fix suffixes typo
2020-10-17 01:13:20 +03:00
Nikolay Khramchikhin
a5d842caf8
fixes openstack api endpoint with suffix trim adds openstack ( #840 )
...
api v2.0 check
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/728
2020-10-16 23:01:45 +03:00
Aliaksandr Valialkin
938b3b7ed1
lib/promscrape: code prettifying after 9bd9f67718
2020-10-12 16:13:59 +03:00
Nikolay Khramchikhin
7f96712b38
Adds dockerswarm sd ( #818 )
...
* adds dockerswarm service discovery
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/656
Following roles supported: services, tasks and nodes.
Basic, token and tls auth supported.
Added tests for labels generation.
* added unix socket support to discovery utils
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-10-12 16:13:58 +03:00
Aliaksandr Valialkin
de1c07b937
lib/backup: add MustStop() method for all remote filesystems
2020-10-09 15:32:13 +03:00
Aliaksandr Valialkin
bf6d523bef
lib/backup/fslocal: add FS.MustStop() method for stopping bandwidth limiter
2020-10-09 15:11:55 +03:00
Aliaksandr Valialkin
d2e917d1cb
app/vmstorage: add vm_rows_added_to_storage_total
metric, which shows the total number of rows added to storage since app start
2020-10-09 13:36:17 +03:00
Aliaksandr Valialkin
a93c62cd60
lib/promscrape: fix tests after 71ea4935de
2020-10-08 19:32:48 +03:00
Aliaksandr Valialkin
0d44e371f3
lib/promscrape: add -promscrape.suppressDuplicateScrapeTargetErrors
command-line flag in order to suppress duplicate scrape target
errors
...
Show also original labels for duplicate targets in error message in order to simplify debugging the issue.
Now `/targets` endpoint accepts optional `show_original_labels=1` query arg, which shows original labels for each target.
This may simplify debugging for target relabeling.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/651
2020-10-08 18:59:25 +03:00
Aliaksandr Valialkin
ad0950f630
lib/backup/actions: improve logging to be more clear to humans
2020-10-08 14:23:00 +03:00
Aliaksandr Valialkin
b51fa16177
app/vmstorage: add -finalMergeDelay
command-line flag for configuring the delay before final merge for per-month partitions after no new data is ingested to it
2020-10-07 17:42:31 +03:00
Aliaksandr Valialkin
54ff78c6c9
lib/protoparser/graphite: support parsing floating-point timestamp like Graphite does
...
Such timestamps are rounded to seconds like Carbon does.
See b0ba62a62d/lib/carbon/protocols.py (L197)
2020-10-06 11:38:35 +03:00
Aliaksandr Valialkin
1da41177a8
lib/promscrape/discovery/openstack: show expiration time for refreshed OpenStack token in seconds - this is easier to interpret by human
2020-10-06 11:20:36 +03:00
Aliaksandr Valialkin
97b836a6f4
lib/fs: fix GOOS=openbsd build by adding fadviseSequentialRead
implementation.
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/785
2020-10-05 23:32:28 +03:00
Aliaksandr Valialkin
50316070d6
lib/promscrape/discovery/openstack: code prettifying after cbe3cf683b
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/728
2020-10-05 18:12:31 +03:00
Nikolay Khramchikhin
b4c77fc6d2
Adds openstack sd ( #811 )
...
* adds openstack service discovery
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/728
implemented hypervisors and instance discovery with openstack v3 api.
Added tests for labeling and data parsing.
Added token refresh.
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-10-05 16:48:54 +03:00
Aliaksandr Valialkin
7505e4f390
lib/promrelabel: make a copy of label with new name for action: labelmap
in the same way as Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/812
2020-10-05 16:20:22 +03:00
Aliaksandr Valialkin
9a7c863bd8
lib/protoparser/influx: add -influx.maxLineSize
command-line flag for configuring the maximum size for a single Influx line during parsing
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/807
2020-10-05 15:19:11 +03:00
Aliaksandr Valialkin
11ddb79aeb
lib/decimal: add tests for negative values passed to maxUpExponent
2020-10-05 14:57:03 +03:00
Aliaksandr Valialkin
d448e07546
lib/decimal: properly calibrate scale for blocks with Inf values
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/805
2020-10-05 14:52:59 +03:00
Aliaksandr Valialkin
fd7dd5064a
lib/storage: code cleanup after 10f2eedee0
...
Remove the code that uses metricIDs caches for the current and the previous hour during metricIDs search,
since this code became unused after implementing per-day inverted index almost a year ago.
While at it, fix a bug, which could prevent from finding time series with names containing dots (aka Graphite-like names
such as `foo.bar.baz`).
2020-10-01 19:12:04 +03:00
Aliaksandr Valialkin
3ad7566a87
lib/storage: imrpove cache effectiveness for time series ids matching the given filters
...
Previously the maximum cache lifetime has been limited by 10 seconds. Now it is extended up to a day.
This should reduce CPU usage in the following cases:
* when querying recently added data with small churn rate for time series
* when querying historical data
2020-10-01 14:39:46 +03:00
Aliaksandr Valialkin
7c2e4e267a
lib/storage: allow set values higher than 1 for vm_merge_need_free_disk_space
if there are multiple partitions with deferred merges due to disk space shortage
2020-09-29 22:53:34 +03:00
Aliaksandr Valialkin
097a4c10dd
app/vmstorage: add metrics for determining whether background merges need additional disk space to complete
...
These metrics are:
* vm_small_merge_need_free_disk_space
* vm_big_merge_need_free_disk_space
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686
2020-09-29 21:47:47 +03:00
Aliaksandr Valialkin
5dca7bbe85
app/vmagent/remotewrite: do not show -remoteWrite.url
in logs if -remoteWrite.showURL
isn't set
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/773
2020-09-29 19:49:19 +03:00
Aliaksandr Valialkin
5e998e597a
lib/cgroup: do not adjust the number of detected CPU cores via /sys/devices/system/cpu/online
...
The adjustement increases the resulting GOMAXPROC by 1, which looks confusing to users
as outlined at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685#issuecomment-698595309
2020-09-29 13:55:53 +03:00
Aliaksandr Valialkin
338a53ccf9
lib/storage: fix tests for 32-bit arches such as GOARCH=386 and GOARCH=arm
2020-09-29 13:10:37 +03:00
Aliaksandr Valialkin
ef416c72c2
lib/storage: fix 32-bit builds for GOARH=386 or GOARCH=arm
2020-09-29 12:42:25 +03:00
Aliaksandr Valialkin
0b0259c42c
lib/protoparser/prometheus: sort rows before comparing them in TestParseStream, since the order for callback calls is non-deterministic
2020-09-29 12:29:50 +03:00
Aliaksandr Valialkin
8f25206d8c
lib/protoparser/prometheus: fix TestParseStream after 124f78857b
2020-09-29 12:12:33 +03:00
Aliaksandr Valialkin
81cdf2fa14
lib/{fs,filestream}: small consistency-related updates after cc90a548b1
2020-09-29 00:43:20 +03:00
Nikolay Khramchikhin
658a05ef0f
added openbsd implementations ( #790 )
...
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/785
removed fadvise for openbsd, added freespace implemenation for openbsd
2020-09-29 00:43:19 +03:00
Aliaksandr Valialkin
1481d6d8ff
lib/protoparser: avoid copying of buffer read from the network to unmarshal buffer
2020-09-28 17:19:04 +03:00
Aliaksandr Valialkin
8df33bd5c1
app/{vminsert,vmagent}: improve data ingestion speed over a single connection
...
Process data obtianed from a single connection on all the available CPU cores.
2020-09-28 04:14:51 +03:00
Aliaksandr Valialkin
7072db75cb
lib/protoparser: use 64KB read buffer instead of default 4KB buffer provided by net/http.Server
...
This should reduce syscall overhead when reading big amounts of data
2020-09-28 02:07:19 +03:00
Aliaksandr Valialkin
6d8c23fdbd
app/{vminsert,vmselect}: skip accountID and projectID when marshaling/unmarshaling MetricName in /api/v1/export/native and /api/v1/import/native
...
This is needed in order to be able to migrate native data from/to single-node VictoriaMetrics
2020-09-28 00:58:58 +03:00
Aliaksandr Valialkin
aadbd014ff
all: add native format for data export and import
...
The data can be exported via [/api/v1/export/native](https://victoriametrics.github.io/#how-to-export-data-in-native-format ) handler
and imported via [/api/v1/import/native](https://victoriametrics.github.io/#how-to-import-data-in-native-format ) handler.
2020-09-27 17:36:38 +03:00
Aliaksandr Valialkin
00ec2b7189
lib/protoparser: use all the available CPU cores for processing ingested data from a single /api/v1/import stream
...
Previously a single data ingestion stream to /api/v1/import could load only a single CPU core.
2020-09-26 04:22:06 +03:00
Aliaksandr Valialkin
533bf76a12
lib/storage: correctly use maxBlockSize in various checks
...
Previously `maxBlockSize` has been multiplied by 8 in certain checks. This is unnecessary.
2020-09-24 18:13:15 +03:00
Aliaksandr Valialkin
543f3aea97
all: consistently use "%w" formatting in fmt.Errorf for wrapped errors
2020-09-23 22:48:21 +03:00
Aliaksandr Valialkin
90d2549428
lib/persistentqueue: protect from multiple concurrent opening for the same persistent queue
2020-09-23 02:17:53 +03:00
Aliaksandr Valialkin
89d652b583
lib/cgroup: attempt to obtain available CPU cores via /sys/devices/system/cpu/online
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685#issuecomment-674423728
2020-09-22 23:27:26 +03:00
Aliaksandr Valialkin
31e341371b
lib/storage: code prettifying after be5e1222f3
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/781
2020-09-22 00:42:20 +03:00
faceair
ad41e39350
add filter to getMetricIDs ( #783 )
...
* add getMetricIDs filter
* check nil filter before use
2020-09-22 00:42:19 +03:00
Aliaksandr Valialkin
d32c3f747c
lib/logger: add -loggerDisableTimestamps
command-line flag for disabling timestamps in logs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/778
2020-09-21 19:25:50 +03:00
Aliaksandr Valialkin
f961838290
lib/promscrape/discovery/ec2: code prettifying after 312fead9a2
2020-09-21 18:44:04 +03:00
Nikolay Khramchikhin
0069353d5e
Add improvements to ec2_sd_discovery ( #775 )
...
* Add improvements to ec2 discovery
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/771
role_arn support with aws sts
instance iam_role support
refreshing temporary tokens
* Apply suggestions from code review
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
* changed implementation, removed tests, clean up code
* moved endpoint builder into getEC2APIResponse
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2020-09-21 16:05:01 +03:00
Aliaksandr Valialkin
a9321f6a60
lib/storage: reduce CPU load for idle VictoriaMetrics by reducing the frequency for the need for background merges
2020-09-21 15:51:26 +03:00
Aliaksandr Valialkin
604e8f6114
lib/decimal: optimize maxUpExponent() by eliminating division from hot path
2020-09-19 13:49:43 +03:00
Aliaksandr Valialkin
df547bf345
lib/persistentqueue: sync data to file inside filestream.Writer.MustFlush
2020-09-19 12:51:46 +03:00
Aliaksandr Valialkin
778ea183ca
lib/decimal: properly store Inf values
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/752
2020-09-18 19:08:53 +03:00
Aliaksandr Valialkin
adb4d3b75c
lib/persistentqueue: flush data to disk every second
...
Previously small amounts of data may be left unflushed for extended periods of time if vmagent collects small amounts of data.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/687
2020-09-18 13:06:03 +03:00
Aliaksandr Valialkin
8fd791d399
lib/promscrape: avoid copying response body when scraping targets.
...
This should reduce memory usage when scraping targets with millions of metrics.
2020-09-18 13:06:00 +03:00
Aliaksandr Valialkin
d96858b921
lib/storage: add /internal/force_merge
handler for running forced compactions on historical per-month partitions
...
This may be useful for freeing up storage space after time series deletion.
See https://victoriametrics.github.io/#force-merge for more details.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/686
2020-09-17 12:20:56 +03:00
Aliaksandr Valialkin
3abbb38254
lib/{mergeset,storage}: compare errors with errors.Is()
2020-09-17 03:03:10 +03:00
Aliaksandr Valialkin
ddb3519e17
lib/{mergeset,storage}: code prettifying
2020-09-17 02:06:37 +03:00
Aliaksandr Valialkin
bf826dd828
lib/storage: removed duplicate checks for empty parts during merge - another check is in the beginning of mergeParts functions
2020-09-17 01:49:08 +03:00
Aliaksandr Valialkin
406f4fe445
app/vmagent: substitute -remoteWrite.url
with secret-url
value in logs, since it may contain sensitive info such as passwords or auth tokens
...
Pass `-remoteWrite.showURL` command-line flag in order to see real `-remoteWrite.url` values in logs and at `/metrics` page.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/773
2020-09-16 22:36:18 +03:00
Aliaksandr Valialkin
9705ac5d7a
lib/persistentqueue: code simplification after d455764a6f
2020-09-16 21:14:01 +03:00
Aliaksandr Valialkin
eee6f1e56d
lib/persistentqueue: make the persistent queue more durable against unclean shutdown (kill -9, OOM, hard reset)
...
The strategy is:
- Periodical flushing of inmemory blocks to files, so they aren't lost on unclean shutdown.
- Periodical syncing of metadata for persisted queues, so the metadata remains in sync with the persisted data.
- Automatic adjusting of too big chunk size when opening the queue. The chunk size may be bigger than the writer offset after unclean shutdown.
- Skipping of broken chunk file if it cannot be read.
- Fsyncing finalized chunk files.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/687
2020-09-16 18:13:24 +03:00
Aliaksandr Valialkin
6ce52e3702
lib/protoparser/vmimport: add more testcases for invalid timestamps and values
...
Updates https://github.com/VictoriaMetrics/vmctl/issues/25
2020-09-16 02:21:53 +03:00
Aliaksandr Valialkin
cd87ca303f
lib/protoparser: report more errors for incorrect timestamps and/or values
...
Previously certain errors in timestamps and/or values could be silently skipped,
which could lead to samples with zero values stored in the database.
Updates https://github.com/VictoriaMetrics/vmctl/issues/25
2020-09-16 02:16:15 +03:00
Aliaksandr Valialkin
5c4e111b43
lib/protoparser/graphite: return error when value or timestamp cannot be properly parsed
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/99
2020-09-16 02:16:13 +03:00
Aliaksandr Valialkin
0c1c1b79ba
lib/promscrape: add a link to troubleshooting docs to error message when duplicate scrape target with identical labels is skipped
2020-09-15 14:16:20 +03:00
Aliaksandr Valialkin
ca08161b54
lib/promscrape: typo fix
2020-09-12 00:14:15 +03:00
Aliaksandr Valialkin
e53235ac5c
lib/promscrape: do not reset the remaining rows when pushing a part of data to remote storage during big scrapes
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/753
Thanks to @PerGon and @clmssz for help with debugging.
2020-09-11 23:38:17 +03:00
Aliaksandr Valialkin
114cf24b43
lib/promscrape/discovery/dns: add __meta_dns_srv_record_target
and __meta_dns_srv_record_port
labels
...
This syncs dns service discovery with Prometheus 2.21 - see https://github.com/prometheus/prometheus/releases
and https://github.com/prometheus/prometheus/pull/7678 .
2020-09-11 21:35:39 +03:00
Aliaksandr Valialkin
5cf5a0e8c4
lib/protoparser/common: do not read request body when parsing timestamp
query arg
...
This was preventing from reading data via /api/v1/prometheus/import .
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/750
2020-09-11 14:45:21 +03:00
Aliaksandr Valialkin
81c05f669b
lib/storage: do not store inf values, since they may lead to significant precision loss for previously stored values
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/752
2020-09-11 14:45:20 +03:00
Aliaksandr Valialkin
8bd5aa3516
lib/protoparser: accept timestamp in milliseconds instead of seconds at /api/v1/import/prometheus
...
This improves consistency with timestamps in Prometheus text exposition format
2020-09-11 14:05:24 +03:00
Aliaksandr Valialkin
58d3b82ae5
app/{vminsert,vmagent}: allow passing timestamp via timestamp
query arg when ingesting data to /api/v1/import/prometheus
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/750
2020-09-11 13:28:31 +03:00
Nikolay Khramchikhin
af994562c8
Added endpointslices discovery to k8s api ( #760 )
...
This is similar to https://github.com/prometheus/prometheus/pull/6838 , which will be added in Prometheus v2.21.
See https://github.com/prometheus/prometheus/releases/tag/v2.21.0-rc.1
* Added endpointslices discovery to k8s api
Started from 1.17 k8s version endpointslices is beta,
it allows to query k8s api for endpoints more efficient.
It presents at scrape_config.yaml as separate role for kubernetes_sd_config.
kubernetes_sd_config:
- role: endpointslices
* fixed typos, changed EndpointConditions signature - with values instead of pointers
2020-09-11 12:24:50 +03:00
Aliaksandr Valialkin
f307e6f432
app/vmselect: initial implementation of Graphite Metrics API
...
See https://graphite-api.readthedocs.io/en/latest/api.html#the-metrics-api
2020-09-11 00:30:20 +03:00
Aliaksandr Valialkin
f5cb213ef9
lib/storage: reuse timestamp blocks for adjancent metric blocks with identical timestamps
...
This should reduce disk space usage when scraping targets containing metrics with identical names
such as `node_cpu_seconds_total`, histograms, quantiles, etc.
Expose `vm_timestamps_blocks_merged_total` and `vm_timestamps_bytes_saved_total` metrics for monitoring
the effectiveness of timestamp blocks merging.
2020-09-09 23:59:21 +03:00
Aliaksandr Valialkin
e5c8377212
lib/httpserver: add a jitter to connection timeouts in order to protect from Thundering herd problem
2020-09-08 19:57:20 +03:00
Nikolay Khramchikhin
fb356c434b
Changed s3 configProfile flag default, ( #749 )
...
aws sdk has complicated logic for chosing profile name and we shouldn't set
it to `default` value. It leads to bugs and improper configuration.
Set it to empty value by default is safe. It will be automatically set to `default` by sdk.
2020-09-07 21:55:45 +03:00
Aliaksandr Valialkin
d2f6f96e4a
lib/memory: fall back to reading hierarchical memory limit in cgroups when the default limit isn't set
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/699
2020-09-04 00:04:57 +03:00
Aliaksandr Valialkin
281d715060
lib/httpserver: add -http.connTimeout
command-line flag for limiting the lifetime for incoming http connections
...
This can be useful for balancing incoming connections among multiple services.
2020-09-03 22:23:55 +03:00
Aliaksandr Valialkin
4fa97430d7
app/{vminsert,vmagent}: allow adding extra labels when importing data via Prometheus, CSV and JSON line formats
...
Extra labels may be added to the imported data by passing `extra_label=name=value` query args.
Multiple query args may be passed in order to add multiple extra labels.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/719
2020-09-02 19:47:02 +03:00
Aliaksandr Valialkin
95ce89e7d7
lib/promscrape: use the number of parsed rows as a basis for writeRequestCtxPool leveling
...
The previous basis on `cap(sw.labels)` doesn't work anymore after 7785869ccc
,
because `sw.labels` may be reset multiple times when processing big number of rows.
2020-09-02 18:46:55 +03:00
Aliaksandr Valialkin
bc1ca4b20b
lib/httpserver: add -http.idleConnTimeout
command-line flag for tuning the timeout for incoming idle http connections
2020-09-01 15:33:31 +03:00
Aliaksandr Valialkin
a01c56104a
lib/promscrape: fix applying sample_limit when scraping targets with big number of metrics
...
This has been broken at 7785869ccc
2020-09-01 11:09:25 +03:00
Aliaksandr Valialkin
deff8d419a
lib/promscrape: reduce memory usage when scraping targets with millions of metrics
...
This should help when scraping /federate endpoints from Prometheus instances,
which scrape millions of metrics. See https://prometheus.io/docs/prometheus/latest/federation/
2020-09-01 10:55:24 +03:00
Aliaksandr Valialkin
5f2277624a
lib/{promscrape,leveledbytebufferpool}: rename getPoolIdAndCapacity to getPoolIDAndCapacity in order to make golint happy
2020-08-28 09:49:22 +03:00
Aliaksandr Valialkin
45e770ed20
lib/cgroup: limit the maximum GOMAXPROCS value to the number of available CPU cores
...
There is no sense in setting GOMAXPROCS to value higher than the number of available CPU cores.
2020-08-28 09:49:22 +03:00
Roman Khavronenko
34ef10fbcc
lib/flagutil: avoid int overflow for arch 386 ( #710 )
...
Arch 386 is a 32-bit architecture and interprets int type for numbers as an explicit int32,
whereas on most modern CPUs int is implicitly an int64. This makes tests to fail with
`int overflow` error.
2020-08-28 09:46:35 +03:00
Aliaksandr Valialkin
9a77ae9d1c
lib/promscrape: reduce memory usage when scraping targets with big number of metrics alongside targets with small number of labels
...
Previously targets with big number of metrics and/or labels could generated too big buffers,
which then could be re-used when scraping targets with small number of metrics.
This resulted in memory waste.
Now big buffers are used only for targets with big number of metrics / labels,
while small buffers are used for targets with small number of metrics / labels.
2020-08-16 22:30:34 +03:00
Aliaksandr Valialkin
3ea6444219
lib/leveledbytebufferpool: allocate byte buffers with capacity rounded to the upper boundary for the given bucket
...
This should reduce the number of resizings for the returned byte buffers.
2020-08-16 22:13:38 +03:00
Roman Khavronenko
4b89da9463
lib/decimal: rename significant decimal digits
to significant figures
( #698 )
...
The previous notion was inconsistent with what `decimal.Round` does.
According to [wiki](https://en.wikipedia.org/wiki/Significant_figures ) rounding
applied to all significant figures, not just decimal ones.
2020-08-16 17:22:40 +03:00
Aliaksandr Valialkin
6aab2f4989
all: allow using KB
, MB
, GB
, KiB
, MiB
and GiB
suffixes in command-line flag values related to byte sizes or byte rates
2020-08-16 17:08:28 +03:00
Aliaksandr Valialkin
1b5467f7fd
lib/memory: improve log message about the memory allowed to use by VictoriaMetrics
2020-08-16 17:08:27 +03:00
Aliaksandr Valialkin
d9f7ea1c6e
lib/protoparser: removed unnecessary call to SetReadDeadline when reading a stream of data
...
The OS should return any buffered data in the stream without the need to set the read timeout.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/696
2020-08-15 15:38:35 +03:00
Aliaksandr Valialkin
edf3ee2a9b
lib: dump compressed block contents on error during decompression
...
This should improve detecting root cause for https://github.com/facebook/zstd/issues/2222
2020-08-15 14:51:14 +03:00
Aliaksandr Valialkin
6e863376f7
lib/leveledbytebufferpool: pre-allocate byte slice with the given capacity if the pool is empty
...
This should reduce memory allocations and copying when the byte slice is growing.
2020-08-15 01:41:59 +03:00
Aliaksandr Valialkin
3efa4e4e1c
lib/protoparser: move common code for detecting timeouts to ReadLinesBlockExt
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/696
2020-08-14 20:39:51 +03:00
Aliaksandr Valialkin
c6b0547847
lib/protoparser: prevent from busy loop on repeated timeout errors when reading streams of ingested data
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/696
2020-08-14 20:13:37 +03:00
Aliaksandr Valialkin
9d79a3a99d
lib/memory: add -memory.allowedBytes
command-line flag for setting absolute memory limit for VictoriaMetrics caches
2020-08-14 19:19:10 +03:00
Aliaksandr Valialkin
b996280c65
app/{vminsert,vmagent}: improve documentation for -influxListenAddr
command-line flag
2020-08-14 18:03:08 +03:00
Aliaksandr Valialkin
c82a485cf6
lib/protoparser/prometheus: typo fix in error message
2020-08-14 11:04:15 +03:00
Aliaksandr Valialkin
9e67343756
lib/promscrape: use a hint on body length instead of body capacity
...
This should reduce memory usage for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/689
2020-08-14 01:17:46 +03:00
Aliaksandr Valialkin
b4119bb51e
lib/promscrape: reduce memory usage when scraping big number of targets
...
Thanks to @dxtrzhang for the original idea at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/688
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/689
2020-08-14 01:05:04 +03:00
Aliaksandr Valialkin
1724cc241e
lib/promscrape: properly retry requests on the server closed connection before returning the first response byte
error during service discover API calls and target scrapes
2020-08-13 22:32:29 +03:00
Aliaksandr Valialkin
60c7397be5
all: support %{ENV_VAR}
placeholders in yaml configs in all the vm* components
...
Such placeholders are substituted by the corresponding environment variable values.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/583
2020-08-13 17:17:06 +03:00
Aliaksandr Valialkin
6721e47ae9
app: respect CPU limits set via cgroups
...
Update GOMAXPROCS to limits set via cgroups. This should reduce CPU trashing and reduce memory usage
for cases when VictoriaMetrics components run in containers with CPU limits.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685
2020-08-11 23:01:03 +03:00
Aliaksandr Valialkin
59d95961b8
lib/protoparser: clarify that the string passed to Unmarshal()
function must remain available when the parsed rows are in use
2020-08-11 17:05:21 +03:00
Aliaksandr Valialkin
890cfe5b61
lib/protoparser/influx: accept precision=us
and precision=µ
according to https://docs.influxdata.com/influxdb/v1.8/tools/api/#write-http-endpoint
2020-08-10 20:23:20 +03:00
Aliaksandr Valialkin
e3439a6cd0
lib/promscrape: optimize per-metric hash calculations
...
This increases vmagent performance by up to 10% when scraping big number of metrics
2020-08-10 19:47:50 +03:00
Aliaksandr Valialkin
4ce1368e4b
lib/storage: mention time range used in the query that led to error message
...
This should improve detecting slow queries with too big time ranges
2020-08-10 13:46:29 +03:00
Aliaksandr Valialkin
f92255e803
lib/storage: mention tag filters used in the query that led to error message
...
This should improve detecting invalid or heavy queries that lead to errors.
2020-08-10 13:36:54 +03:00
Aliaksandr Valialkin
b3d4ff7ee2
app/vmstorage: improve error logging when the request times out
2020-08-10 13:17:24 +03:00
Aliaksandr Valialkin
e3999ac010
lib/promscrape: show real timestamp and real duration for the scape on /targets
page
...
Previously the scrape duration may be negative when calculated scrape timestamp drifts away from the real scrape timestamp
2020-08-10 12:40:49 +03:00
Aliaksandr Valialkin
c09c881264
lib/promscrape: make errcheck happy
2020-08-09 13:17:30 +03:00
Aliaksandr Valialkin
2dfb42a8b4
lib/promscrape: export scrape_samples_added
per-target metric like Prometheus does
...
This metric may be useful for detecting targets with high churn rate for the exported metrics.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/683
2020-08-09 12:45:30 +03:00
Aliaksandr Valialkin
fd9f1463df
lib/fs: use WARN instead of ERROR log level for the message when NFS diretory removal temporarily fails
...
this is expected condition, so it is better to use WARN log level for it
2020-08-09 12:07:35 +03:00
Aliaksandr Valialkin
d4be3efc60
lib/promscrape: add a test for scrape config for blackbox exporter
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/684
2020-08-09 12:03:51 +03:00
Aliaksandr Valialkin
67cacb22ac
lib/httpserver: add -tls
, -tlsCertFile
and -tlsKeyFile
command-line flags in every vm binary
...
This makes such binaries compatible with binaries from `master` branch (aka single-node version)
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/677
2020-08-07 10:57:32 +03:00
Aliaksandr Valialkin
307281e922
lib/storage: slow down concurrent searches when the number of concurrent inserts reaches the limit
...
This should improve data ingestion performance when heavy searches are executed
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618
2020-08-07 08:49:13 +03:00
Aliaksandr Valialkin
dd1d59f57a
lib/storage: properly check timeouts and pace limits
...
Previously they were checked on every iteration for small number of iterations
2020-08-07 08:40:56 +03:00
Aliaksandr Valialkin
a2039b3bbc
app/vmselect: return the upper bound on the number of found time series from storage.Search.Init
...
This is used by a single-node version in order to reduce memory allocations during search.
See bc8381613d
for details.
2020-08-06 19:20:31 +03:00
Aliaksandr Valialkin
b690eeff53
lib/storage: reduce the frequency (and overhead) for timeout and pace limiter checks by 4x
2020-08-06 18:45:47 +03:00
Aliaksandr Valialkin
6c0a92a1ee
lib/pacelimiter: increase scalability for multi-CPU system
2020-08-06 18:33:07 +03:00
Aliaksandr Valialkin
13f8644f8e
lib/storage: optimize prefetching metric names for the given metricIDs
2020-08-06 16:52:58 +03:00
Aliaksandr Valialkin
f789e0fa44
lib/fs: export vm_nfs_pending_dirs_to_remove
metric for monitoring the number of pending directories that couldn't be removed due to NFS lock
2020-08-06 15:31:50 +03:00
Aliaksandr Valialkin
a3e91c593b
lib/storage: limit the number of concurrent calls to storage.searchTSIDs to GOMAXPROCS*2
...
This should limit the maximum memory usage and reduce CPU trashing on vmstorage
when multiple heavy queries are executed.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2020-08-05 18:27:21 +03:00
Aliaksandr Valialkin
76064ba9e7
Perform conversion from string to []byte according to rule #6 at https://golang.org/pkg/unsafe/#Pointer
2020-08-05 11:55:12 +03:00
Aliaksandr Valialkin
8cc2e01386
lib/backup: allow using ~/.aws/config
without region
...
Use us-west-2 for determining bucket region.
2020-08-04 13:08:05 +03:00
Aliaksandr Valialkin
a2aa3a60eb
app/vmselect: show X-Forwarded-For
contents on /api/v1/status/active_queries
page
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/659
2020-07-31 20:01:09 +03:00
Aliaksandr Valialkin
3149af624d
lib/storage: reduce the maximum number of concurrent merge workers to GOMAXPROCS/2
...
Previously the limit has been raised to GOMAXPROCS, but it has been appeared that this
increases query latencies since more CPUs are busy with merges.
While at it, substitute `*MergeConcurrencyLimitCh` channels with simple integer limits.
2020-07-31 17:53:13 +03:00
Aliaksandr Valialkin
29bbab0ec9
lib/storage: remove prioritizing of merging small parts over merging big parts, since it doesn't work as expected
...
The prioritizing could lead to big merge starvation, which could end up in too big number of parts that must be merged into big parts.
Multiple big merges may be initiated after the migration from v1.39.0 or v1.39.1. It is OK - these merges should be finished soon,
which should return CPU and disk IO usage to normal levels.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/618
2020-07-30 20:02:22 +03:00
Aliaksandr Valialkin
96039dcb40
lib/storage: properly update vm_slow_row_inserts_total
metric when importing multiple data points per time series at once
...
Previously the `vm_slow_row_inserts_total` metric may be incremented multiple times for different data points per a single time series,
while only a single increment is needed when inserting the first data point for this time series.
2020-07-30 16:17:19 +03:00
Aliaksandr Valialkin
1e067401ba
lib/httpserver: emit X-Forwarded-For additionally to remoteAddr in error logs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/659
2020-07-29 13:12:35 +03:00
Sasasu
96bc476e53
lib/storage: metaindexRow use memroy more efficiently ( #655 )
...
due to memory align the metaindexRow structure use 64-byte pre object.
this commit changes the order of field, make metaindexRow use 56-byte pre
object.
Signed-off-by: Sasasu <su@sasasu.me>
2020-07-27 23:23:25 +03:00
Aliaksandr Valialkin
f26ef58137
lib/protoparser/prometheus: add a test for cassandra-exporter
...
Thanks to Seva
2020-07-27 18:37:46 +03:00
Aliaksandr Valialkin
94cc677b0c
lib/storage: slightly reduce code difference between single-node and cluster versions
2020-07-24 01:18:05 +03:00
Aliaksandr Valialkin
fb3d1380ac
lib/storage: respect -search.maxQueryDuration
when searching for time series in inverted index
...
Previously the time spent on inverted index search could exceed the configured `-search.maxQueryDuration`.
This commit stops searching in inverted index on query timeout.
2020-07-23 21:22:05 +03:00
Aliaksandr Valialkin
dbf3038637
lib/storage: add more fine-grained pace limiting for search
2020-07-23 19:21:49 +03:00
Aliaksandr Valialkin
b8303afcd8
lib/storage: improve prioritizing of data ingestion over querying
...
Prioritize also small merges over big merges.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/648
2020-07-23 01:40:38 +03:00
Aliaksandr Valialkin
7d0743422b
lib/storage: properly calculate global metrics in UpdateStats()
2020-07-23 00:35:31 +03:00
Aliaksandr Valialkin
6afdcf8a20
lib/mergeset: properly calculate global metrics in UpdateStats()
...
Previously these metrics could be calculated multiple times for multiple mergeset.Table instances.
2020-07-23 00:35:29 +03:00
Aliaksandr Valialkin
23fa44e56e
lib/storage: reorder mergeBlockStreams() args in order to make them more consistent
2020-07-22 21:58:25 +03:00
Aliaksandr Valialkin
754eac676d
lib/storage: prevent possible race condition when all the goroutines exit Storage.AddRows, before goroutines other goroutines are blocked on searchTSIDsCond inside Storage.searchTSIDs
...
This condition may occur after the following sequence of events:
1) A goroutine enters the loop body when len(addRowsConcurrencyCh) == cap(addRowsConcurrencyCh) inside Storage.searchTSIDs.
2) All the goroutines return from Storage.AddRows.
3) The goroutine from step 1 blocks on searchTSIDsCond.Wait() inside the loop body.
The goroutine remains blocked until the next call to Storage.AddRows, which calls searchTSIDsCond.Signal().
This may take indefinite time.
2020-07-22 21:52:42 +03:00
Aliaksandr Valialkin
a3f48e395e
app/vmagent: add -remoteWrite.decimalPlaces
command-line flag, which may be used for reducing disk space usage on the remote storage
2020-07-21 21:55:42 +03:00
Aliaksandr Valialkin
67be79a0bc
lib/uint64set: optimize adding items to the set via Set.AddMulti
2020-07-21 20:57:05 +03:00
Aliaksandr Valialkin
31ef39e8da
lib/httpserver: log remote address in error message from httpserver.Errorf
...
This should improve detection of the root cause of errors.
Thanks to Anant for the idea.
2020-07-20 14:06:29 +03:00
Aliaksandr Valialkin
be0ab4fbfe
lib/storage: reset MetricName->TSID
cache after marking metricIDs as deleted
...
This is a follow-up commit after 12b16077c4
,
which didn't reset the `tsidCache` in all the required places.
This could result in indefinite errors like:
missing metricName by metricID ...; this could be the case after unclean shutdown; deleting the metricID, so it could be re-created next time
Fix this by resetting the cache inside deleteMetricIDs function.
2020-07-14 14:05:19 +03:00
Aliaksandr Valialkin
a4c96d9e6d
lib/protoparser: properly update vm_protoparser_rows_read_total{type="promscrape"}
metric
2020-07-14 12:15:56 +03:00
Seva Poliakov
a5e713b6e0
add vm_protoparser_rows_read_total metrics to promscrape ( #624 )
...
* add vm_protoparser_rows_read_total metrics to promscrape
move vm_protoparser_rows_read_total for promscrape to better place
move vm_protoparser_rows_read_total for promscrape to better place
* remove possibility of infinity loop at prometheus parser
2020-07-14 12:02:25 +03:00
Roman Khavronenko
207e93b50d
lib/flagutil: specify additional description for all Array type flags ( #620 )
...
Array type flag is now defined as `value` type in flag description when printed.
This change adds additional description to every Array type flag so it would be
clear what exact type is used:
```
-remoteWrite.urlRelabelConfig array
Optional path to relabel config for the corresponding -remoteWrite.url
Supports array of values separated by comma or specified via multiple flags.
```
2020-07-13 22:00:03 +03:00
Roman Khavronenko
605711bde5
lib/persistentqueue: add vm_persistentqueue_bytes_pending
metric ( #619 )
...
Metric `vm_persistentqueue_bytes_pending` is a gauge that shows current amount
of bytes in persistentqueue flushed on disk as a difference between write and read
offsets. This metric is very similar to `vmagent_remotewrite_pending_data_bytes`
except of accounting for bytes in-memory.
2020-07-13 21:54:54 +03:00
Roman Khavronenko
a02097e657
Extend metric vm_promscrape_targets
with status
label ( #615 )
...
The change to `vm_promscrape_targets` metric suppose to improve observability
for `vmagent` so it will be possible to track how many targets are up or down
for every specific scrape group:
```
vm_promscrape_targets{type="static_configs", status="down"} 1
vm_promscrape_targets{type="static_configs", status="up"} 2
```
2020-07-13 21:54:53 +03:00
Aliaksandr Valialkin
6373d377ef
app/{vminsert,vmagent}: add ability to import data in Prometheus exposition format via /api/v1/import/prometheus
2020-07-10 12:13:28 +03:00
Aliaksandr Valialkin
2012e294d1
properly calculate readCalls
2020-07-10 12:01:05 +03:00
Aliaksandr Valialkin
87f8c728bf
lib/promscrape: send Accept
header similar to Prometheus when scraping targets
...
This should fix scraping Spring Boot servers, which return incorrect response
unless `Accept: text/plain` request header is set.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/608
2020-07-08 19:50:06 +03:00
Aliaksandr Valialkin
7335743d57
lib/storage: limit the maximum concurrency for data ingestion to GOMAXPROCS
...
Previously the concurrency has been limited to GOMAXPROCS*2. This had little sense,
since every call to Storage.AddRows is bound to CPU, so the maximum ingestion bandwidth
is achieved when the number of concurrent calls to Storage.AddRows is limited to the number of CPUs,
i.e. to GOMAXPROCS.
2020-07-08 17:34:27 +03:00
Roman Khavronenko
929ad74de6
lib/protoparser: fix metric name of unmarshal errors in promremotewrite ( #607 )
...
The change fixes the typo in metric name `vm_protoparser_unmarshal_errors` to
respect the naming standard.
2020-07-08 14:19:27 +03:00
Aliaksandr Valialkin
e401b8d527
lib/protoparser/graphite: go fmt
2020-07-08 14:13:06 +03:00
Aliaksandr Valialkin
50ecf09042
lib/protoparser/graphite: add more tests after eb45185eef
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/610
2020-07-08 14:13:03 +03:00
Seva Poliakov
1ae0334e17
Fix graphite minus one timestamp ( #609 )
...
* fix graphite -1 timestamp
* format the graphite fix -1 timestamp
2020-07-08 14:13:01 +03:00
Aliaksandr Valialkin
fad008df7e
lib/storage: clarify out of retention period
error message by mentioning -retentionPeriod
command-line flag
2020-07-08 13:54:13 +03:00
Aliaksandr Valialkin
fe58462bef
lib/storage: reset MetricName->TSID cache after deleting time series
...
This should prevent from adding new data points to deleted time series
without the need to check for the deleted time series.
This improves ingestion performance a bit when the `deleted time series ids` aka `dmis` set
contains big number of time series.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/596
Based on the idea from @n4mine at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/604
2020-07-06 22:01:24 +03:00
Aliaksandr Valialkin
77bb0e6595
lib/fs: clarify description for -fs.disableMmap
command-line flag
2020-07-06 14:28:57 +03:00
Aliaksandr Valialkin
0bff96fe4b
lib/storage: prioritize data ingestion over heavy queries
...
Heavy queries could result in the lack of CPU resources for processing the current data ingestion stream.
Prevent this by delaying queries' execution until free resources are available for data ingestion.
Expose `vm_search_delays_total` metric, which may be used in for alerting when there is no enough CPU resources
for data ingestion and/or for executing heavy queries.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/291
2020-07-05 19:44:04 +03:00
Aliaksandr Valialkin
6f1d926698
lib/promscrape: use HostClient.DoDeadline instead of HostClient.Do in order to guarantee strict deadline across multiple scrape attempts
2020-07-03 21:33:48 +03:00
Aliaksandr Valialkin
ee03b4ccbd
lib/promscrape: prevent from too big deadline misses on scrape retries
...
The maximum deadline miss duration is reduced to 2x scrape_interval in the worst case.
By default it is limited to scrape_interval configured for the given scrape target.
2020-07-03 20:42:09 +03:00
Aliaksandr Valialkin
dfa83a4a35
lib/promscrape: check for nil error before checking for the returned status code when scraping targets
2020-07-03 18:37:25 +03:00
Aliaksandr Valialkin
8bb3622e9d
app/vminsert: prevent from adding and/or selecting labels with empty values
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/600
2020-07-02 23:17:12 +03:00
Aliaksandr Valialkin
6ebac3ab63
app/vminsert: add ability to apply relabeling to all the incoming metrics if -relabelConfig
command-line arg points to a file with a list of relabel_config
entries
...
See https://victoriametrics.github.io/#relabeling
2020-07-02 20:36:33 +03:00
Aliaksandr Valialkin
2361ad8ab4
lib/promscrape: add ability to set disable_compression
and disable_keepalive
options in scrape_config
section of the config passed to -promscrape.config
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580
2020-07-02 14:19:34 +03:00
Aliaksandr Valialkin
0f754bea49
lib/promscrape: add -promscrape.disableKeepAlive
command-line flag for disabling http keep-alive connections when scraping targets
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580
2020-07-01 02:20:46 +03:00
Aliaksandr Valialkin
4cb3e7595c
app/vmstorage: add -denyQueriesOutsideRetention
command-line flag for denying queries outside the configured retention
2020-07-01 00:58:42 +03:00
Aliaksandr Valialkin
81e3d4305f
lib/httpserver: add Unwrap method to ErrorWithStatusCode, so As
and Is
functions in standard errors
package may properly unwrap the error inside ErrorWithStatusCode
2020-07-01 00:53:49 +03:00
Aliaksandr Valialkin
fe77d661b3
all: use errors.As
instead of type assertion for detecting net.Error
2020-07-01 00:16:13 +03:00
Aliaksandr Valialkin
0c4e8aeb2b
all: use errors.As
for inspecting errors that implement httpserver.ErrorWithStatusCode
2020-07-01 00:03:11 +03:00
Aliaksandr Valialkin
d962568e93
all: use %w instead of %s for wrapping errors in fmt.Errorf
...
This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode .
See https://blog.golang.org/go1.13-errors for details.
2020-06-30 23:33:46 +03:00
Aliaksandr Valialkin
5a43842bd3
lib/promscrape: add missing label sorting for autogenerated metrics
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/592
2020-06-29 22:39:40 +03:00
Ween
b42cf33c4d
Fix Auto metrics relabeled errors ( #593 )
...
* Fix Auto metrics relabeled errors
* Finalize auto-genenated Labels
* Fix Test Errors
Co-authored-by: xinyulong <xinyulong@kuaishou.com>
2020-06-29 22:39:39 +03:00
Aliaksandr Valialkin
aad38c8283
lib/promrelabel: properly apply ^
and $
anchors to regex
value in Prometheus relabeling rules
2020-06-25 17:19:02 +03:00
Aliaksandr Valialkin
fd7a3d880e
lib/fs: go fmt
2020-06-23 23:03:08 +03:00
Aliaksandr Valialkin
08edb90814
lib/fs: fall back to cgo copy for copying the last 4KB of mmaped data
...
This probably should fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/581
2020-06-23 22:55:56 +03:00
Aliaksandr Valialkin
3a444bb7bb
lib/promrelabel: add support for keep_if_equal
and drop_if_equal
actions to relabel configs
...
These actions may be useful for filtering out unneeded targets and/or metrics if they contain equal label values.
For example, the following rule would leave the target only if __meta_kubernetes_annotation_prometheus_io_port
equals __meta_kubernetes_pod_container_port_number:
- action: keep_if_equal
source_labels: [__meta_kubernetes_annotation_prometheus_io_port, __meta_kubernetes_pod_container_port_number]
2020-06-23 17:29:19 +03:00
Aliaksandr Valialkin
de7e585ac8
lib/promscrape: preserve the previously discovered targets on discovery errors per each job_name
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/582
2020-06-23 15:42:46 +03:00
Aliaksandr Valialkin
521c657f8d
lib/fs: an attempt to fix SIGBUS error by rounding mmap`ed region to multiple of 4KB pages
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/581
2020-06-23 13:40:20 +03:00
Aliaksandr Valialkin
5fb60dd647
lib/logger: add -loggerErrorsPerSecondLimit
for limiting the rate of ERROR messages
2020-06-23 12:42:59 +03:00
Aliaksandr Valialkin
a80e852aab
lib/promscrape: retry performing the request to the server for up to 3 times before giving up when it closes keep-alive connections
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/580
2020-06-23 12:34:12 +03:00
Aliaksandr Valialkin
70bf8218bb
app/vmselect/promql: properly override label values from group_left
and group_right
lists like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/577
2020-06-21 16:32:27 +03:00
Aliaksandr Valialkin
50aa34bcbe
lib/promscrape/discovery/consul: reduce load on Consul when discovering big number of targets by using background caching
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574
2020-06-20 18:20:07 +03:00
Aliaksandr Valialkin
62e1908986
lib/promscrape: reduce default value for -promscrape.discovery.concurrency
from 500 to 100
...
This should reduce load on Kubernetes API server and Consul when big number of targets are discovered
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/574
2020-06-20 17:53:48 +03:00
Aliaksandr Valialkin
1f2826bae2
lib/promscrape/discovery/ec2: expose __meta_ec2_ami
like the next Prometheus release will do
...
See b5d61fb66c
for details
2020-06-20 17:45:30 +03:00
Tristan Su
c254b683fd
lib/storage: set big/small merge concurrency ( #568 )
...
fixed #567
Co-authored-by: Tristan Su <suqing.sq@alibaba-inc.com>
2020-06-19 02:21:55 +03:00
Aliaksandr Valialkin
2e5b6220a4
lib/promrelabel: allows regex capture groups in target_label like Prometheus does
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/569
2020-06-19 02:20:58 +03:00
Aliaksandr Valialkin
4f673a5201
app/vminsert: export metrics for determining ingested rows with dropped or truncated labels
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/565
2020-06-19 01:12:44 +03:00
Aliaksandr Valialkin
5f3a895c23
lib/storage: add key!=".+"
filter additionally to negative filter matching empty value such as key!~"|foo"
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/546
2020-06-18 20:05:45 +03:00
Aliaksandr Valialkin
c40f29f783
lib/storage: properly match {tag!="|foo"}
filters
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/546
2020-06-10 19:34:37 +03:00
Aliaksandr Valialkin
9f55dea162
lib/httpserver: do not flush and do not close gzip writer if response compression is disabled
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/535
2020-06-05 21:37:46 +03:00
Aliaksandr Valialkin
cf91a94daf
lib/backup: properly create missing parent directories in fs.CreateFile
2020-06-05 19:28:25 +03:00
Aliaksandr Valialkin
ba1f764b29
lib/fs: optimize queries that read recent samples for big number of time series
...
Use standard copy() func instead of mmap-aware copy func for reading recently touched mmap-ed data.
This improves read performance by up to 4x.
2020-06-05 19:10:22 +03:00
Aliaksandr Valialkin
2358d9e41d
lib/fs: add a benchmark for ReaderAt.MustReadAt
2020-06-05 19:10:21 +03:00
Aliaksandr Valialkin
8b0d9df51d
lib/backup/fsremote: create all the parent directories before creating file in CreateFile
2020-06-05 10:25:28 +03:00
Aliaksandr Valialkin
3d0a0b3785
lib/fs: optimize MustGetFreeSpace performance by caching the results for up to 2 seconds
2020-06-04 13:14:04 +03:00
Vyacheslav Mitrofanov
89a922fb19
allow to use values lower than 10 with the flag -memory.allowedPercent ( #531 )
...
Co-authored-by: Vyacheslav Mitrofanov <vmitrofanov@mfms.ru>
2020-06-03 23:40:13 +03:00
Aliaksandr Valialkin
304f9499cf
lib/bytesutil: prevent from garbage collecting s before returning from ToUnsafeBytes
2020-06-03 00:23:27 +03:00
Aliaksandr Valialkin
eca1afdc20
lib/storage: fix Graphite wildcard matching, which has been broken in v1.36.0
2020-05-28 11:58:47 +03:00
Aliaksandr Valialkin
b0131c79b6
lib/storage: improve search speed for time series matching Graphite whildcards such as foo.*.bar.baz
...
Add index for reverse Graphite-like metric names with dots. Use this index during search for filters
like `__name__=~"foo\\.[^.]*\\.bar\\.baz"` which end with non-empty suffix with dots, i.e. `.bar.baz` in this case.
This change may "hide" historical time series during queries. The workaround is to add `[.]*` to the end of regexp label filter,
i.e. "foo\\.[^.]*\\.bar\\.baz" should be substituted with "foo\\.[^.]*\\.bar\\.baz[.]*".
2020-05-27 21:48:08 +03:00
Aliaksandr Valialkin
301838e7b1
lib/httpserver: properly set status code for empty response
2020-05-24 23:55:55 +03:00
Aliaksandr Valialkin
64bec11c91
lib/httpserver: fix compression for static files
2020-05-24 22:16:51 +03:00
Aliaksandr Valialkin
b747362936
lib/promscrape: mention about -promscrape.maxScrapeSize in the error message when target returns too big response
2020-05-24 14:41:24 +03:00
Aliaksandr Valialkin
be7253c084
lib/httpserver: do not recompress already compressed response
...
This shoud help with vmauth issue - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/514
2020-05-22 16:45:20 +03:00
Aliaksandr Valialkin
b59e089ac7
app/vmagent: add -dryRun
option for checking all the configs mentioned in command-line flags without running vmagent
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/362
2020-05-21 15:23:18 +03:00
Aliaksandr Valialkin
482bae8466
lib/promscrape: add -promscrape.config.dryRun
flag for checking -promscrape.config
for errors or unsupported options
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/508
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/362
2020-05-21 14:54:32 +03:00
Aliaksandr Valialkin
73ec5cf460
lib/promscrape: add -promscrape.discovery.concurrency
and -promscrape.discovery.concurrentWaitTime
flags for tuning the number of concurrent requests to autodiscovery API servers at Consul or Kubernetes
2020-05-19 17:35:59 +03:00
Aliaksandr Valialkin
2a8f1e6931
lib/storage: do not increment vm_slow_metric_name_loads_total
counter for metric_ids which shouldnt be prefetched, since this may mislead users
2020-05-16 10:23:39 +03:00
Aliaksandr Valialkin
dc16cdd1ca
lib/persistentqueue: a follow-up for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/484
2020-05-16 09:32:30 +03:00
肖贝贝
c154a92d29
fix: fix vmagent multi queue may become one because sync bug ( #484 )
...
Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>
2020-05-16 09:32:29 +03:00
Aliaksandr Valialkin
0f3d46810b
lib/backup: remove misleading -dst
mention in error message
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/482
2020-05-15 17:13:27 +03:00
Aliaksandr Valialkin
e72518e8c6
lib/backup: donload only the remaining parts for partially downloaded files after vmrestore
restart
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/487
2020-05-15 17:03:25 +03:00
Aliaksandr Valialkin
1e5c1d7eaa
app/vmstorage: add vm_slow_metric_name_loads_total
metric, which could be used as an indicator when more RAM is needed for improving query performance
2020-05-15 14:12:24 +03:00
Aliaksandr Valialkin
d6b9a49481
app/vmstorage: add vm_slow_row_inserts_total
and vm_slow_per_day_index_inserts_total
metrics for determining whether VictoriaMetrics required more RAM for the current number of active time series
2020-05-15 13:46:57 +03:00
Aliaksandr Valialkin
a72f18e821
lib/{storage,mergeset}: further tuning of compression levels depending on block size
...
This should improve performance for querying newly added data, since it can be unpacked faster.
2020-05-15 13:12:28 +03:00
Aliaksandr Valialkin
2cf2e9955b
lib/storage: wait for all the goroutines to finish in TestSearch in order to prevent racy behavior on test finish
2020-05-15 12:12:20 +03:00
Aliaksandr Valialkin
67e331ac62
lib/storage: optimize ingestion pefrormance for new time series
2020-05-15 12:12:19 +03:00
Aliaksandr Valialkin
6838fa876c
lib/mergeset: tune compression levels in order to improve ingestion performance a bit
2020-05-15 12:12:15 +03:00
Aliaksandr Valialkin
1b5d272e07
lib/storage: reduce indentation in Storage.add
2020-05-14 23:23:56 +03:00
Aliaksandr Valialkin
71d29a8fa1
lib/storage: return the first error instead of the last error, since the first error usually points to the root cause
2020-05-14 23:18:59 +03:00
Aliaksandr Valialkin
3845420a8f
lib: extract common code for returning fast unix timestamp into lib/fasttime
2020-05-14 23:06:50 +03:00
Aliaksandr Valialkin
7e831741f9
lib/{storage,mergeset}: return dst on error from unmarshalBlockHeaders, so it could be reused
2020-05-14 15:32:23 +03:00
Aliaksandr Valialkin
2f42b85e0e
lib/storage: document that getnerateUniqueMetricID should return dense ids
2020-05-14 14:08:59 +03:00
Aliaksandr Valialkin
f442d81648
lib/{storage,mergeset}: cleanup: remove unused partSearch.indexBlockReuse
2020-05-14 14:03:15 +03:00
Aliaksandr Valialkin
8bb44a5d09
lib/storage: optimize label matching for regexp ending with literal suffix
...
For example, `{label=~"foo.*bar.+baz"}` contains literal suffix `baz`,
so it should work faster now.
2020-05-13 11:39:05 +03:00
Aliaksandr Valialkin
3b0f66a227
app/vmagent: fix a bug with improper relabeling when multiple -remoteWrite.urlRelableConfig
args are set
...
This bug could result in incorrect relabeling and metrics' drop.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/467
2020-05-12 22:03:45 +03:00
Aliaksandr Valialkin
c9ab6dc532
lib/fs: do not use mmap for 32-bit arches by default, since they cannot map files bigger than 4GB in RAM
2020-05-12 20:21:39 +03:00
Aliaksandr Valialkin
d54a93fc81
app/vmagent: fix scraping mTLS targets, which has been broken in v1.35.1
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/470
2020-05-12 17:23:43 +03:00
Aliaksandr Valialkin
405cf44aed
app/vmagent,lib/promscrape: do not set HostClient.DialDualStack, since it isnt used if HostClient.Dial is set
2020-05-12 15:24:53 +03:00
Aliaksandr Valialkin
bd5f4e0344
lib/storage: properly initialize part struct before trying to close it on error
...
This should prevent from nil pointer dereference bug at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/468 .
2020-05-12 14:54:16 +03:00
Aliaksandr Valialkin
f7753b1469
lib/storage: gradually pre-populate per-day inverted index for the next day
...
This should prevent from CPU usage spikes at 00:00 UTC every day when
inverted index for new day must be quickly created for all the active time series.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/430
2020-05-12 12:13:32 +03:00
Aliaksandr Valialkin
8c77cb436a
lib/storage: typo fixes in error messages: or -> of
2020-05-12 12:12:33 +03:00
Aliaksandr Valialkin
bbf06a4248
lib/storage: speed up matching for common regexps in label filters
...
The following regexps have been optimized:
* 'foo.+bar'
* 'foo.+bar.+baz'
This should improve performance for matching Graphite-like metrics.
2020-05-11 22:49:01 +03:00
Aliaksandr Valialkin
37254a139a
lib/storage: add a benchmark for Graphite-like regexps for metric names
2020-05-11 22:49:00 +03:00
Aliaksandr Valialkin
2f28e945b8
lib/httpserver: add -http.shutdownDelay
flag for a grace period before http server shutdown
...
The http server returns 503 non-OK error at `/health` page during grace period,
so load balancers in front of the http server could re-route incoming requests
to other servers.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/463
2020-05-07 15:25:51 +03:00
Aliaksandr Valialkin
3052b479b7
lib/httpserver: reduce typical duration for http server graceful shutdown
...
Previously the duration for graceful shutdown for http server could take more than a minute
because of imporperly set timeouts in setNetworkTimeout.
Now typical duration for graceful shutdown should be reduced to less than 5 seconds.
2020-05-07 14:16:38 +03:00
Aliaksandr Valialkin
c43a265716
lib/flagutil: make errcheck happy by explicitly ignoring Array.Set result in tests
2020-05-06 22:37:28 +03:00
Aliaksandr Valialkin
15e3682b40
lib/flagutil: properly parse quoted flag values for flagutil.Array
2020-05-06 22:28:15 +03:00
Aliaksandr Valialkin
20538a2a5d
app/vmagent: allow setting independent auth configs per each configured -remoteWrite.url
2020-05-06 16:52:32 +03:00
Aliaksandr Valialkin
9f39e618ed
lib/promscrape/discovery/gce: discover per-zone instances for gce_sd_config
in parallel. This should reduce discovery latency
2020-05-06 15:00:23 +03:00
Aliaksandr Valialkin
8ab5e47b5c
lib/promscrape: add Prometheus-compatible DNS-based service discovery aka dns_sd_configs
2020-05-06 00:02:41 +03:00
Aliaksandr Valialkin
42d563934b
lib/promscrape: properly connect to TCP6 addresses if -enableTCP6
is set
2020-05-06 00:02:40 +03:00
Aliaksandr Valialkin
1c8e97c8a0
lib/procutil: add NewSighupChan function, which returns a channel, which is triggered on every SIGHUP
2020-05-05 10:56:15 +03:00
Aliaksandr Valialkin
054457d1f4
lib/promscrape: allow explicitly setting empty token via token: ""
in consul_sd_config
2020-05-05 07:49:54 +03:00
Aliaksandr Valialkin
89aa6dbf56
lib/promscrape: add Prometheus-compatible service discovery for Consul aka consul_sd_configs
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/330
2020-05-04 20:53:06 +03:00
Aliaksandr Valialkin
28e0e8fd88
lib/promauth: properly set up client certificate in tls.Config
...
Previously the client certificate has been mistakenly set up as a server certificate
2020-05-04 20:53:04 +03:00
Aliaksandr Valialkin
ed91fe1d9b
lib/promscrape: move common code for discovery api config map handling into discoveryutils
2020-05-04 20:52:58 +03:00
Aliaksandr Valialkin
c50fd219dc
lib/promscrape/discovery/kubernetes/: unify apiConfig creation
2020-05-04 20:52:53 +03:00
Aliaksandr Valialkin
a5880f17af
lib/promscrape: remove debug line left after the commit e4aac6ea40
2020-05-03 17:16:19 +03:00
Aliaksandr Valialkin
1f0e8fdc0d
lib/promscrape: fix tests after the commit 658a8742ac
...
The original commit copies `__address__` label to `instance` label when generating per-target labels as Prometheus does.
See https://www.robustperception.io/life-of-a-label for details.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/453
2020-05-03 16:59:29 +03:00
DexterZhang
317688f144
fix(vmagent): different behavior as how prometheus deal with labels. [Issue#453] ( #454 )
2020-05-03 16:59:28 +03:00
Aliaksandr Valialkin
ab1e6a76bb
lib/promscrape: make consistent scrape time offsets across reloads for the same ScrapeURL and Labels
...
This should make consistent intervals between data points for scrape targets across reloads.
Previously these intervals were random.
2020-05-03 14:31:22 +03:00
Aliaksandr Valialkin
f25416984b
lib/promscrape: fix TestGetFileSDScrapeWorkSuccess after 3b234d82e5
2020-05-03 14:31:20 +03:00
Aliaksandr Valialkin
f422203e10
lib/promscrape: reload only modified scrapers on config changes
...
This should improve scrape stability when big number of targets are scraped and these targets are frequently changed.
Thanks to @xbsura for the idea and initial implementation attempts at the following pull requests:
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/449
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/458
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/459
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/460
2020-05-03 12:47:16 +03:00
Aliaksandr Valialkin
bbaca16ce8
lib/httpserver: rename http.externalURL
to http.pathPrefix
and improve help message for this flag
...
The `http.externalURL` flag name was slightly misleading, so it has been renamed to `http.pathPrefix`.
2020-05-02 13:12:24 +03:00
DexterZhang
a0589f2ca5
feat(httpserver): add http.externalUrl
config to http server, it adds prefix to http path automatically ( #452 )
2020-05-02 13:12:23 +03:00
Aliaksandr Valialkin
b21b73115a
app/vminsert: add /-/reload
handler in the same way as for vmagent
2020-04-30 02:18:08 +03:00
Aliaksandr Valialkin
a970705d8e
lib/procutil: prevent from app termination on SIGHUP signal, since this signal is frequently used for config reload
2020-04-30 02:18:06 +03:00
Aliaksandr Valialkin
d99f48aa48
lib/httpserver: mention that -http.maxGracefulShutdownDuration
command-line flag value can be increased on shutdown timeout
2020-04-30 01:37:02 +03:00
Aliaksandr Valialkin
de5f923476
lib/promscrape: set 30 seconds timeout for discovery api requests
...
Previously such requests could hang for long time. This could make debugging harder.
2020-04-29 17:29:03 +03:00
Aliaksandr Valialkin
b6d88bac04
vendor: use github.com/VictoriaMetrics/fasthttp instead of github.com/fasthttp/fasthttp
...
The upstream fasthttp may contain issues like 996610f021
,
plus a code that isn't used by VictoriaMetrics. So let's use a private copy under our control instead.
2020-04-29 16:43:09 +03:00
Aliaksandr Valialkin
9ed4951ec8
lib/metricsql: move it to a separate repository - github.com/VictoriaMetrics/metrics
2020-04-28 15:30:06 +03:00
Aliaksandr Valialkin
d78ed50edd
lib/storage: recover when metricID->metricName entry is missing in the inverted index after unclean shutdown
...
Newly added index entries can be missing after unclean shutdown, since they didn't flush to persistent storage yet.
Log about this and delete the corresponding metricID, so it could be re-created next time.
2020-04-28 12:01:32 +03:00
Aliaksandr Valialkin
53740d0026
lib/promscrape: handle connection reset when targets responds with http redirect
2020-04-28 02:14:32 +03:00
肖贝贝
3e6f29f462
fix: vmagent not follow 301/302 redirect bug ( #445 )
...
Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>
2020-04-28 02:14:31 +03:00
Aliaksandr Valialkin
424068f804
lib/promscrape: handle connection reset when targets responds with http redirect
2020-04-28 02:14:26 +03:00
肖贝贝
7d045bf2ca
fix: vmagent not follow 301/302 redirect bug ( #445 )
...
Co-authored-by: xiaobeibei <xiaobeibei@bigo.sg>
2020-04-28 02:14:25 +03:00
Aliaksandr Valialkin
2aecf7c37c
lib/{encoding,decimal}: typo fixes in tests: epxecting->expecting
2020-04-28 00:02:19 +03:00
Aliaksandr Valialkin
806dc73d8a
lib/encoding: reduce possibility of failure in TestMarshalInt64ArraySize
2020-04-28 00:02:18 +03:00
Aliaksandr Valialkin
a603a15757
lib/promscrape/discovery/gce: make golangci-lint happy
2020-04-27 19:29:42 +03:00
Aliaksandr Valialkin
86a1d9cb0c
lib/promscrape: add initial support for Prometheus-compatible service discovery for Amazon EC2 aka ec2_sd_configs
2020-04-27 19:29:22 +03:00
Aliaksandr Valialkin
1acb6eb25a
lib/promscrape/discovery/gce: properly set filter
query arg in api url
2020-04-27 16:01:53 +03:00
Aliaksandr Valialkin
0daa37fa02
lib/promscrape/discovery/gce: allow empty project and zone for gce_sd_config
2020-04-27 11:45:45 +03:00
Aliaksandr Valialkin
989d84cf3f
app/{vminsert,vmstorage}: wait for ack
from vmstorage
after each packet sent to it from vminsert
...
This should protect from possible data loss when `vmstorage` is stopped while the packet is sent from `vminsert`.
This commit switches to new protocol between vminsert and vmstorage, which is incompatible
with the previous protocol. So it is required that both vminsert and vmstorage nodes are updated.
2020-04-27 09:53:26 +03:00
Aliaksandr Valialkin
e933cbac16
lib/storage: postpone reading data from blocks during search
...
This eliminates the need for storing block data into temporary files on a single-node VictoriaMetrics
during heavy queries, which touch big number of time series over long time ranges.
This improves single-node VM performance on heavy queries by up to 2x.
2020-04-27 08:44:01 +03:00
Aliaksandr Valialkin
31861c5b8e
lib/promscrape/discovery/gce: allow empty zone
arg in gce_sd_config
- in this case zones for the given project are automatically discovered
2020-04-26 14:37:38 +03:00
Aliaksandr Valialkin
b16e19c053
lib/storage/dedup.go: go fmt
2020-04-26 14:37:36 +03:00
Aliaksandr Valialkin
a0000c3a6e
lib/storage: improve deduplication algorithm
...
Now it leaves only the first data point on each `-dedup.minScrapeInterval` interval.
Previously it may leave two data points on the interval. This could lead to unexpected results
for `histogram_quantile(phi, sum(rate(buckets)) by (le))` query.
2020-04-26 13:10:18 +03:00
Aliaksandr Valialkin
13b4069c59
lib/storage: postpone label filters matching too many time series instead of giving up with error
...
This should reduce the frequency of the following errors:
cannot find tag filter matching less than N time series; either increase -search.maxUniqueTimeseries or use more specific tag filters
more than N time series found on the time range [...]; either increase -search.maxUniqueTimeseries or shrink the time range
2020-04-24 21:18:52 +03:00
Aliaksandr Valialkin
7c74efd640
lib/promscrape/discovery/gce: make golint happy by ignoring resp.Body.Close() result
2020-04-24 18:13:26 +03:00
Aliaksandr Valialkin
069690e3bd
lib/promscrape: initial implementation for gce_sd_configs
aga Prometheus-compatible service discovery for Google Compute Engine
2020-04-24 17:53:43 +03:00
Aliaksandr Valialkin
de991551f5
lib/promscrape: query /api/v1/namespaces/*
for the configured namespaces in kubernetes_sd_config
...
This should fix authroization issues described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/432
2020-04-24 14:42:02 +03:00
Aliaksandr Valialkin
387a21c96d
lib/promscrape: add -promscrape.configCheckInterval
command-line flag for automating config checking
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/431
2020-04-23 23:41:26 +03:00