Commit graph

1915 commits

Author SHA1 Message Date
Roman Khavronenko
10601bc652
vmalert: update -rule flag description to enforce quotes using (#709)
Description for `-rule` flag uses as example specific chars like asterisks
which could be interpreted wrong by different shells. To avoid this, description
now contains quoted flag values.

See also #708
2020-08-20 22:36:38 +01:00
Roman Khavronenko
f2c004d1ae
lib/flagutil: avoid int overflow for arch 386 (#710)
Arch 386 is a 32-bit architecture and interprets int type for numbers as an explicit int32,
whereas on most modern CPUs int is implicitly an int64. This makes tests to fail with
`int overflow` error.
2020-08-20 22:27:37 +01:00
Aliaksandr Valialkin
efc730863b lib/promscrape: reduce memory usage when scraping targets with big number of metrics alongside targets with small number of labels
Previously targets with big number of metrics and/or labels could generated too big buffers,
which then could be re-used when scraping targets with small number of metrics.
This resulted in memory waste.

Now big buffers are used only for targets with big number of metrics / labels,
while small buffers are used for targets with small number of metrics / labels.
2020-08-16 22:29:51 +03:00
Aliaksandr Valialkin
d6967319b6 lib/leveledbytebufferpool: allocate byte buffers with capacity rounded to the upper boundary for the given bucket
This should reduce the number of resizings for the returned byte buffers.
2020-08-16 22:13:30 +03:00
Roman Khavronenko
f5f59896ec
lib/decimal: rename significant decimal digits to significant figures (#698)
The previous notion was inconsistent with what `decimal.Round` does.
According to [wiki](https://en.wikipedia.org/wiki/Significant_figures) rounding
applied to all significant figures, not just decimal ones.
2020-08-16 17:21:35 +03:00
Aliaksandr Valialkin
147c35ebd4 all: allow using KB, MB, GB, KiB, MiB and GiB suffixes in command-line flag values related to byte sizes or byte rates 2020-08-16 17:05:52 +03:00
Aliaksandr Valialkin
7c0d6a8b88 lib/memory: improve log message about the memory allowed to use by VictoriaMetrics 2020-08-16 16:04:11 +03:00
Aliaksandr Valialkin
ed00eb3f33 lib/protoparser: removed unnecessary call to SetReadDeadline when reading a stream of data
The OS should return any buffered data in the stream without the need to set the read timeout.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/696
2020-08-15 15:38:08 +03:00
Aliaksandr Valialkin
7615a3ab8d vendor: upgrade github.com/valyala/gozstd from v1.7.1 to v1.8.3 2020-08-15 15:11:56 +03:00
Aliaksandr Valialkin
7be9bedaf9 vendor: downgrade github.com/valyala/gozstd from v1.8.1 to v1.7.1 until https://github.com/facebook/zstd/issues/2222 is fixed 2020-08-15 14:46:32 +03:00
Aliaksandr Valialkin
00b1659dde lib: dump compressed block contents on error during decompression
This should improve detecting root cause for https://github.com/facebook/zstd/issues/2222
2020-08-15 14:44:33 +03:00
Aliaksandr Valialkin
528e25bdde vendor: update github.com/valyala/gozstd from v1.7.0 to v1.8.1 2020-08-15 13:46:43 +03:00
Aliaksandr Valialkin
b3849a90fd lib/leveledbytebufferpool: pre-allocate byte slice with the given capacity if the pool is empty
This should reduce memory allocations and copying when the byte slice is growing.
2020-08-15 01:40:54 +03:00
Aliaksandr Valialkin
7d89fafe1a app/vmselect/promql: allow passing multiple args to aggregate functions such as avg(q1, q2, q3) 2020-08-15 01:15:09 +03:00
Aliaksandr Valialkin
cd96248480 docs/vmagent.md: mention that gaps in remote storage may appear if vmagent cannot keep up with data ingestion 2020-08-14 20:47:57 +03:00
Aliaksandr Valialkin
7554be172d lib/protoparser: move common code for detecting timeouts to ReadLinesBlockExt
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/696
2020-08-14 20:40:15 +03:00
Aliaksandr Valialkin
4beab7ad39 lib/protoparser: prevent from busy loop on repeated timeout errors when reading streams of ingested data
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/696
2020-08-14 20:14:11 +03:00
Aliaksandr Valialkin
41d23f84ed docs/Cluster-VictoriaMetrics.md: sync with upstream 2020-08-14 19:15:29 +03:00
Aliaksandr Valialkin
184670fb9b docs: update docs 2020-08-14 19:13:42 +03:00
Aliaksandr Valialkin
52791fd1c0 lib/memory: add -memory.allowedBytes command-line flag for setting absolute memory limit for VictoriaMetrics caches 2020-08-14 19:13:38 +03:00
Aliaksandr Valialkin
576da0fe46 app/{vminsert,vmagent}: improve documentation for -influxListenAddr command-line flag 2020-08-14 18:04:44 +03:00
Aliaksandr Valialkin
215967437d lib/protoparser/prometheus: typo fix in error message 2020-08-14 11:04:23 +03:00
Aliaksandr Valialkin
d1ad3adcbe vendor: make vendor-update 2020-08-14 02:29:02 +03:00
Aliaksandr Valialkin
42960feff4 vendor: update github.com/VictoriaMetrics/fasthttp from v1.0.4 to v1.0.5 2020-08-14 02:19:36 +03:00
Aliaksandr Valialkin
07246bc31c vendor: update github.com/klauspost/compress from v1.10.10 to v1.10.11 2020-08-14 02:17:07 +03:00
Aliaksandr Valialkin
e646674b23 lib/promscrape: use a hint on body length instead of body capacity
This should reduce memory usage for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/689
2020-08-14 01:17:52 +03:00
Aliaksandr Valialkin
4628deecd1 lib/promscrape: reduce memory usage when scraping big number of targets
Thanks to @dxtrzhang for the original idea at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/688

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/689
2020-08-14 01:04:53 +03:00
Aliaksandr Valialkin
eead3ee8ec lib/promscrape: properly retry requests on the server closed connection before returning the first response byte error during service discover API calls and target scrapes 2020-08-13 22:31:52 +03:00
Aliaksandr Valialkin
c402265e88 all: support %{ENV_VAR} placeholders in yaml configs in all the vm* components
Such placeholders are substituted by the corresponding environment variable values.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/583
2020-08-13 17:15:25 +03:00
Aliaksandr Valialkin
ff495a74f6 deployment/docker: update Go builder from Go1.14.7 to Go1.15.0 2020-08-13 15:53:32 +03:00
Aliaksandr Valialkin
45962fb8c2 docs/Cluster-VictoriaMetrics.md: mention about Kubernetes operator 2020-08-12 21:15:34 +03:00
Aliaksandr Valialkin
fd6c690276 docs/Single-server-VictoriaMetrics.md: mention helm charts, k8s operator and vmctl tool in Integrations chapter 2020-08-12 21:12:23 +03:00
Aliaksandr Valialkin
e730788477 docs/Articles.md: added https://medium.com/@romanhavronenko/victoriametrics-how-to-migrate-data-from-prometheus-d44a6728f043 2020-08-12 21:03:05 +03:00
Aliaksandr Valialkin
ef7e2af8f5 app: respect CPU limits set via cgroups
Update GOMAXPROCS to limits set via cgroups. This should reduce CPU trashing and reduce memory usage
for cases when VictoriaMetrics components run in containers with CPU limits.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685
2020-08-11 22:59:19 +03:00
Aliaksandr Valialkin
15aa6142ef lib/protoparser: clarify that the string passed to Unmarshal() function must remain available when the parsed rows are in use 2020-08-11 17:04:39 +03:00
Aliaksandr Valialkin
5492edcc6c docs/Single-server-VictoriaMetrics.md: mention that it is safe to skip multiple versions during the upgrade 2020-08-11 14:21:37 +03:00
Aliaksandr Valialkin
e969ef2639 app/vmselect: reduce memory usage when exporting time series with big number of samples via /api/v1/export if max_rows_per_line is set to non-zero value
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685
2020-08-10 20:57:36 +03:00
Aliaksandr Valialkin
c098988a18 lib/protoparser/influx: accept precision=us and precision=µ according to https://docs.influxdata.com/influxdb/v1.8/tools/api/#write-http-endpoint 2020-08-10 20:23:26 +03:00
Aliaksandr Valialkin
1bdfa29ef7 lib/promscrape: optimize per-metric hash calculations
This increases vmagent performance by up to 10% when scraping big number of metrics
2020-08-10 19:49:03 +03:00
Aliaksandr Valialkin
8adba82c02 app/vmselect/netstorage: vary batch size for data unpacking depending on the available CPU cores
This should reduce contention on the channel with unpack work for systems with high number of CPU cores
2020-08-10 15:16:42 +03:00
Aliaksandr Valialkin
8d9eb5f808 lib/storage: mention time range used in the query that led to error message
This should improve detecting slow queries with too big time ranges
2020-08-10 13:46:36 +03:00
Aliaksandr Valialkin
582c74cd93 lib/storage: mention tag filters used in the query that led to error message
This should improve detecting invalid or heavy queries that lead to errors.
2020-08-10 13:36:49 +03:00
Aliaksandr Valialkin
f3d33e23c9 app/vmstorage: improve error logging when the request times out 2020-08-10 13:23:26 +03:00
Aliaksandr Valialkin
455bf50a91 lib/promscrape: show real timestamp and real duration for the scape on /targets page
Previously the scrape duration may be negative when calculated scrape timestamp drifts away from the real scrape timestamp
2020-08-10 12:40:25 +03:00
Aliaksandr Valialkin
2791008e19 vendor: make vendor-update 2020-08-09 15:13:55 +03:00
Aliaksandr Valialkin
a499de45cc lib/promscrape: make errcheck happy 2020-08-09 13:17:18 +03:00
Aliaksandr Valialkin
23c9e6b727 lib/promscrape: export scrape_samples_added per-target metric like Prometheus does
This metric may be useful for detecting targets with high churn rate for the exported metrics.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/683
2020-08-09 12:45:39 +03:00
Aliaksandr Valialkin
9d32fb1d9e lib/fs: use WARN instead of ERROR log level for the message when NFS diretory removal temporarily fails
this is expected condition, so it is better to use WARN log level for it
2020-08-09 12:07:30 +03:00
Aliaksandr Valialkin
d4b6d22987 lib/promscrape: add a test for scrape config for blackbox exporter
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/684
2020-08-09 12:02:48 +03:00
Roman Khavronenko
0be5b09fb4
app/vmalert: extend metrics set exported by vmalert #573 (#654)
* app/vmalert: extend metrics set exported by `vmalert` #573

New metrics were added to improve observability:
+ vmalert_alerts_pending{alertname, group} - number of pending alerts per group
per alert;
+ vmalert_alerts_acitve{alertname, group} - number of active alerts per group
per alert;
+ vmalert_alerts_error{alertname, group} - is 1 if alertname ended up with error
during prev execution, is 0 if no errors happened;
+ vmalert_recording_rules_error{recording, group} - is 1 if recording rule
 ended up with error during prev execution, is 0 if no errors happened;
* vmalert_iteration_total{group, file} - now contains group and file name labels.
This should improve control over specific groups;
* vmalert_iteration_duration_seconds{group, file} - now contains group and file name labels. This should improve control over specific groups;

Some collisions for alerts and recording rules are possible, because neither
group name nor alert/recording rule name are unique for compatibility reasons.

Commit contains list of TODOs for Unregistering metrics since groups and rules
are ephemeral and could be removed without application restart. In order to
unlock Unregistering feature corresponding PR was filed - https://github.com/VictoriaMetrics/metrics/pull/13

* app/vmalert: extend metrics set exported by `vmalert` #573

The changes are following:
* add an ID label to rules metrics, since `name` collisions within one group is
a common case - see the k8s example alerts;
* supports metrics unregistering on rule updates. Consider the case when one rule
was added or removed from the group, or the whole group was added or removed.

The change depends on https://github.com/VictoriaMetrics/metrics/pull/16
where race condition for Unregister method was fixed.
2020-08-09 09:41:29 +03:00