github-mirrors/VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2024-11-21 14:44:00 +00:00

Author	SHA1	Message	Date
Aliaksandr Valialkin	21f049e211	lib/streamaggr: follow-up for `9c3d44c8c9` - Consistently enumerate stream aggregation outputs in alphabetical order across the source code and docs. This should simplify future maintenance of the corresponding code and docs. - Fix the link to `rate_sum()` at `see also` section of `rate_avg()` docs. - Make more clear the docs for `rate_sum()` and `rate_avg()` outputs. - Encapsulate output metric suffix inside rateAggrState. This eliminates possible bugs related to incorrect suffix passing to newRateAggrState(). - Rename rateAggrState.total field to less misleading rateAggrState.increase name, since it calculates counter increase in the current aggregation window. - Set rateLastValueState.prevTimestamp on the first sample in time series instead of the second sample. This makes more clear the code logic. - Move the code for removing outdated entries at rateAggrState into removeOldEntries() function. This makes the code logic inside rateAggrState.flushState() more clear. - Do not write output sample with zero value if there are no input series, which could be used for calculating the rate, e.g. if only a single sample is registered for every input series. - Do not take into account input series with a single registered sample when calculating rate_avg(), since this leads to incorrect results. - Move {rate,total}AggrState.flushState() function to the end of rate.go and total.go files, so they look more similar. This should simplify future maintenance. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6243	2024-07-15 08:44:48 +02:00
Hui Wang	f3cbd62823	vmagent: fix `vm_streamaggr_flushed_samples_total` counter (#6604) We use `vm_streamaggr_flushed_samples_total` to show the number of samples produced by an aggregation rule. Previously it was overcounted and didn't account for `output_relabel_configs`. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6462 --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `2eb1bc4f81`)	2024-07-12 14:19:17 +02:00
Arkadii Yakovets	8645b2cc8e	docs: add spellcheck command (#6562 ) ### Describe Your Changes Implement spellcheck command: - add cspell configuration files - dockerize spellchecking process - add Makefile targets This PR adds a standalone `make spellcheck` target to check `docs/.md` files for spelling errors. The target process is dockerized to be run in a separate npm environment. Some `docs/` typo fixes also included. ### Checklist The following checks are mandatory*: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Arkadii Yakovets <ark@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `fabf0b928e`)	2024-07-11 12:40:24 +02:00
omahs	efc6b00b2c	docs: fix typos (#6600 ) ### Describe Your Changes docs: fix typos ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). (cherry picked from commit `8786a08d27`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-07-09 10:52:47 +02:00
Aliaksandr Valialkin	172ae1adf7	Revert `c6c5a5a186` and `b2765c45d0` Reason for revert: Many statsd servers exist: - https://github.com/statsd/statsd - classical statsd server - https://docs.datadoghq.com/developers/dogstatsd/ - statsd server from DataDog built into DataDog Agent ( https://docs.datadoghq.com/agent/ ) - https://github.com/avito-tech/bioyino - high-performance statsd server - https://github.com/atlassian/gostatsd - statsd server in Go - https://github.com/prometheus/statsd_exporter - statsd server, which exposes the aggregated data as Prometheus metrics These servers can be used for efficient aggregating of statsd data and sending it to VictoriaMetrics according to https://docs.victoriametrics.com/#how-to-send-data-from-graphite-compatible-agents-such-as-statsd ( the https://github.com/prometheus/statsd_exporter can be scraped as usual Prometheus target according to https://docs.victoriametrics.com/#how-to-scrape-prometheus-exporters-such-as-node-exporter ). Adding support for statsd data ingestion protocol into VictoriaMetrics makes sense only if it provides significant advantages over the existing statsd servers, while having no significant drawbacks compared to existing statsd servers. The main advantage of statsd server built into VictoriaMetrics and vmagent - getting rid of additional statsd server. The main drawback is non-trivial and inconvenient streaming aggregation configs, which must be used for the ingested statsd metrics ( see https://docs.victoriametrics.com/stream-aggregation/ ). These configs are incompatible with the configs for standalone statsd servers. So you need to manually translate configs of the used statsd server to stream aggregation configs when migrating from standalone statsd server to statsd server built into VictoriaMetrics (or vmagent). Another important drawback is that it is very easy to shoot yourself in the foot when using built-in statsd server with the -statsd.disableAggregationEnforcement command-line flag or with improperly configured streaming aggregation. In this case the ingested statsd metrics will be stored to VictoriaMetrics as is without any aggregation. This may result in high CPU usage during data ingestion, high disk space usage for storing all the unaggregated statsd metrics and high CPU usage during querying, since all the unaggregated metrics must be read, unpacked and processed during querying. P.S. Built-in statsd server can be added to VictoriaMetrics and vmagent after figuring out more ergonomic specialized configuration for aggregating of statsd metrics. The main requirements for this configuration: - easy to write, read and update (ideally it should work out of the box for most cases without additional configuration) - hard to misconfigure (e.g. hard to shoot yourself in the foot) It would be great if this configuration were compatible with the configuration of the most widely used statsd server. In the meantime it is recommended to continue using external statsd server. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6265 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5053 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5052 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/206 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4600	2024-07-03 23:57:49 +02:00
Aliaksandr Valialkin	f8779d1ed2	lib/streamaggr: follow-up for the commit `c0e4ccb7b5` - Clarify docs for `Ignore aggregation intervals on start` feature. - Make more clear the code dealing with ignoreFirstIntervals at aggregator.runFlusher() functions. It is better from readability and maintainability PoV using distinct a.flush() calls for distinct cases instead of merging them into a single a.flush() call. - Take into account the first incomplete interval when tracking the number of skipped aggregation intervals, since this behaviour is easier to understand by the end users. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6137	2024-07-02 21:34:48 +02:00
hagen1778	69625aa8a1	docs: mark `without` optional in stream aggr docs Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `f8eea0f2c9`)	2024-07-01 14:56:46 +02:00
Arkadii Yakovets	a6655322b1	docs: fix docs/ and README.md spelling errors (#6362 ) Fixes `docs/` and `README.md` typos and errors. Signed-off-by: Arkadii Yakovets <ark@victoriametrics.com> (cherry picked from commit `c740a8042e`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-06-03 11:53:33 +02:00
yumeiyin	95b8cf76f8	chore: remove redundant words (#6348 ) (cherry picked from commit `9289c7512d`)	2024-05-29 14:37:04 +02:00
Andrii Chubatiuk	fe332c3419	app/vmagent: add global aggregator (#6268 ) Add global stream aggregation for VMAgent https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5467 (cherry picked from commit `f153f54d11`)	2024-05-17 14:01:31 +02:00
Andrii Chubatiuk	d9cddf1ad8	lib/streamaggr: added rate and rate_avg output (#6243 ) Added `rate` and `rate_avg` output Resource usage is the same as for increase output, tested on a benchmark --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `9c3d44c8c9`)	2024-05-13 16:49:39 +02:00
Oleg	76af930e4a	Statsd protocol compatibility (#5053 ) In this PR I added compatibility with [statsd protocol](https://github.com/b/statsd_spec) with tags to be able to send metrics directly from statsd clients to vmagent or directly to VM. For example its compatible with [statsd-instrument](https://github.com/Shopify/statsd-instrument) and [dogstatsd-ruby](https://github.com/DataDog/dogstatsd-ruby) gems Related issues: #5052, #206, #4600 (cherry picked from commit `c6c5a5a186`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-05-10 14:27:31 +02:00
Hui Wang	abd29c15ab	docs: update vmalert and vmagent docs (#6207 ) * restore and actualize doc section explaining duplicated labels error * rm misleading comment about post-aggregation in stream aggregation (cherry picked from commit `e3c226cf92`)	2024-04-30 10:30:19 +02:00
hagen1778	342290275e	app/streamaggr: follow-up after `c0e4ccb7b5` * rm vmagent mentions from vminsert flags * improve documentation wording, add links to related sections * mention `ignore_first_intervals` in the stream aggr options * update flags description * add basic test for config parsing validation Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `bae3874e6a`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-04-22 14:39:23 +02:00
Andrii Chubatiuk	131367fb59	lib/streamaggr: add option to ignore first N aggregation intervals (#6137 ) Stream aggregation may yield inaccurate results if it processes incomplete data. This issue can arise when data is sourced from clients that maintain a queue of unsent data, such as Prometheus or vmagent. If the queue isn't fully cleared within the aggregation interval, only a portion of the time series may be included in that period, leading to distorted calculations. To mitigate this we add an option to ignore first N aggregation intervals. It is expected, that client queues will be cleared during the time while aggregation ignores first N intervals and all subsequent aggregations will be correct. (cherry picked from commit `c0e4ccb7b5`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-04-22 14:34:36 +02:00
Aliaksandr Valialkin	b933001d2d	all: replace old https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html url with the new one - https://docs.victoriametrics.com/single-server-victoriametrics/	2024-04-18 03:11:49 +02:00
Aliaksandr Valialkin	baf5c8d6d0	all: replace old https://docs.victoriametrics.com/keyConcepts.html url with the new one - https://docs.victoriametrics.com/keyconcepts/	2024-04-18 02:34:09 +02:00
Aliaksandr Valialkin	728bcf0585	all: replace old https://docs.victoriametrics.com/stream-aggregation.html url with the new one - https://docs.victoriametrics.com/stream-aggregation/	2024-04-18 02:20:00 +02:00
Aliaksandr Valialkin	64938732e3	all: replace old https://docs.victoriametrics.com/MetricsQL.html url with the new one - https://docs.victoriametrics.com/metricsql/	2024-04-18 02:15:33 +02:00
Aliaksandr Valialkin	a99005eff6	all: replace old https://docs.victoriametrics.com/vmalert.html url with the new one - https://docs.victoriametrics.com/vmalert/	2024-04-18 01:44:54 +02:00
Aliaksandr Valialkin	0211a04a52	all: replace the outdated url https://docs.victoriametrics.com/vmagent.html with the new one - https://docs.victoriametrics.com/vmagent/	2024-04-18 01:32:57 +02:00
Aliaksandr Valialkin	cd222d6502	lib/streamaggr: ignore out of order samples for `last` output This is a follow-up for `6a465f6e29` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5931	2024-03-18 01:03:58 +02:00
Aliaksandr Valialkin	111d0aa2bf	app/{vmagent,vminsert}: add an ability to ignore input samples outside the current aggregation interval for stream aggregation See https://docs.victoriametrics.com/stream-aggregation.html#ignoring-old-samples	2024-03-17 23:30:46 +02:00
Aliaksandr Valialkin	1f753a049a	lib/streamaggr: follow-up for `15e33d56f1` - Properly set pushSample.timestamp when flushing de-duplicated samples to stream aggregation This is needed for https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5931 - Re-classify this change as feature instead of bugfix at docs/CHANGELOG.md - Verify de-duplication logic for samples with different timestamps Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5643 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5939	2024-03-17 23:23:57 +02:00
hagen1778	41a1efbea8	docs: follow-up `15e33d56f1` Update documentation according to changes in deduplication logic. Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-03-17 23:07:51 +02:00
Aliaksandr Valialkin	abf347ec0f	docs: replace `speed up` with more clear `accelerate` wording	2024-03-12 03:03:15 +02:00
Aliaksandr Valialkin	27b9e8ed3e	app/{vmagent,vminsert}: add `-streamAggr.dropInputSamples` command-line flag for dropping the specified labels from input samples before deduplication and streaming aggregation	2024-03-05 02:27:27 +02:00
Aliaksandr Valialkin	c38c45d71f	app/{vminsert,vmagent}: allow using -streamAggr.dedupInterval without -streamAggr.config This allows performing online de-duplication of incoming samples	2024-03-05 00:47:23 +02:00
Aliaksandr Valialkin	48a425898a	lib/streamaggr: enable time alignment for aggregate flushed to multiples of interval For example, if `interval: 1m`, then data flush occurs at the end of every minute, while `interval: 1h` leads to data flush at the end of every hour. Add `no_align_flush_to_interval` option, which can be used for disabling the alignment.	2024-03-04 06:23:35 +02:00
Aliaksandr Valialkin	3ba9b2225e	docs/stream-aggregation.md: add `troubleshooting` section with solutions for common problems in streaming aggregation	2024-03-04 03:04:59 +02:00
Aliaksandr Valialkin	d0e6541f35	docs/stream-aggregation.md: typo fixes	2024-03-02 04:35:37 +02:00
Aliaksandr Valialkin	b912a45220	docs/stream-aggregation.md: remove superfluous output_relabel_configs from the config example for histogram aggregation	2024-03-02 03:36:08 +02:00
Aliaksandr Valialkin	0d5d46f9db	lib/streamaggr: huge pile of changes - Reduce memory usage by up to 5x when de-duplicating samples across big number of time series. - Reduce memory usage by up to 5x when aggregating across big number of output time series. - Add lib/promutils.LabelsCompressor, which is going to be used by other VictoriaMetrics components for reducing memory usage for marshaled []prompbmarshal.Label. - Add `dedup_interval` option at aggregation config, which allows setting individual deduplication intervals per each aggregation. - Add `keep_metric_names` option at aggregation config, which allows keeping the original metric names in the output samples. - Add `unique_samples` output, which counts the number of unique sample values. - Add `increase_prometheus` and `total_prometheus` outputs, which ignore the first sample per each newly encountered time series. - Use 64-bit hashes instead of marshaled labels as map keys when calculating `count_series` output. This makes obsolete https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5579 - Expose various metrics, which may help debugging stream aggregation: - vm_streamaggr_dedup_state_size_bytes - the size of data structures responsible for deduplication - vm_streamaggr_dedup_state_items_count - the number of items in the deduplication data structures - vm_streamaggr_labels_compressor_size_bytes - the size of labels compressor data structures - vm_streamaggr_labels_compressor_items_count - the number of entries in the labels compressor - vm_streamaggr_flush_duration_seconds - a histogram, which shows the duration of stream aggregation flushes - vm_streamaggr_dedup_flush_duration_seconds - a histogram, which shows the duration of deduplication flushes - vm_streamaggr_flush_timeouts_total - counter for timed out stream aggregation flushes, which took longer than the configured interval - vm_streamaggr_dedup_flush_timeouts_total - counter for timed out deduplication flushes, which took longer than the configured dedup_interval - Actualize docs/stream-aggregation.md The memory usage reduction increases CPU usage during stream aggregation by up to 30%. This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5850 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5898	2024-03-02 03:15:43 +02:00
hagen1778	216f268c1a	docs: follow-up after `491287ed15` * port un-synced changes from docs/readme to readme * consistently use `sh` instead of `console` highlight, as it looks like a more appropriate syntax highlight * consistently use `sh` instead of `bash`, as it is shorter * consistently use `yaml` instead of `yml` See syntax codes here https://gohugo.io/content-management/syntax-highlighting/ Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-29 17:06:26 +01:00
Roman Khavronenko	9e9f170fe7	lib/streamaggr: skip unfinished aggregation state on shutdown by default (#5689) Sending unfinished aggregate states tends to produce unexpected anomalies with lower values than expected. The old behavior can be restored by specifying `flush_on_shutdown: true` setting in streaming aggregation config Signed-off-by: hagen1778 <roman@victoriametrics.com>	2024-01-26 22:45:45 +01:00
Aliaksandr Valialkin	e6e5b97e1e	lib/streamaggr: expand `%{ENV}` placeholders in stream aggregation configs	2024-01-24 12:31:42 +02:00
Aliaksandr Valialkin	db6dadf1f7	docs: convert png images to webp in all the docs except of docs/operator/* This reduces the size of docs/* folder from 33MB to 18MB Images inside docs/operator/* must be converted at the https://github.com/VictoriaMetrics/operator/tree/master/docs and then the updated images must be automatically propagated to the docs/operator/* This is a follow-up for `d3f919df3e` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5206	2023-11-22 19:29:47 +02:00
Aliaksandr Valialkin	de3d5943eb	docs/stream-aggregation.md: clarify that stream aggregation is applied after all the configured relabeling This is a follow-up after `68d2cb203d`	2023-11-15 15:54:57 +01:00
hagen1778	14df5af660	docs/stream-aggr: specify the relabeling order during aggregation Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `68d2cb203d`)	2023-11-15 14:29:29 +01:00
Artem Navoiev	b72dc10bb3	docs: fix formatting in stream aggregation more Signed-off-by: Artem Navoiev <tenmozes@gmail.com>	2023-11-13 09:20:23 +01:00
Artem Navoiev	55df212a76	docs: fix formatting in stream aggregation Signed-off-by: Artem Navoiev <tenmozes@gmail.com>	2023-11-13 09:19:49 +01:00
Haleygo	130e0ea5f0	vmalert-tool: implement unittest (#4789 ) 1. split package rule under /app/vmalert, expose needed objects 2. add vmalert-tool with unittest subcmd https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2945	2023-10-16 14:12:06 +02:00
hagen1778	740725b7a3	docs: mention that `quantiles` can't be used in sharded mode https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4942 Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `40c94b26dd`)	2023-09-07 10:59:19 +02:00
Aliaksandr Valialkin	baea9da66b	docs/stream-aggregation.md: use `5m` instead of `300` in the example query for rate() calculation from "increase" results This makes the query easier to read and understand Follow-up for `0df506de54`	2023-08-28 09:37:34 +02:00
Aliaksandr Valialkin	18f8e90bd8	docs/stream-aggregation.md: typo fix after `54f522ac25`	2023-08-24 22:04:24 +02:00
hagen1778	92f158a2f5	docs: mention `increase` as alternative to `rate` Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `0df506de54`)	2023-08-23 13:29:32 +02:00
Aliaksandr Valialkin	07a3030856	docs/stream-aggregation.md: clarify the usage of `-remoteWrite.label` after the fix at `a27c2f3773` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4247	2023-08-17 15:19:04 +02:00
Alexander Marshalov	58cf862b05	fixed applying `remoteWrite.label` for pushed metrics (#4247 ) (#4824 ) vmagent: properly add extra labels before sending data to remote storage labels from `remoteWrite.label` are now added to sent metrics just before they are pushed to `remoteWrite.url` after all relabelings, including stream aggregation relabelings (#4247) https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4247 Signed-off-by: Alexander Marshalov <_@marshalov.org> Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> (cherry picked from commit `a27c2f3773`)	2023-08-15 13:48:19 +02:00
zhaojinxin409	aca4f69023	Update stream-aggregation.md for `speed` misspelling (#4752 )	2023-08-11 03:16:30 -07:00
Aliaksandr Valialkin	1b7d97787a	docs: use `1.` instead of `N.` in numbered bullets, so they are automatically adjusted by Github Markdown engine See https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#lists	2023-07-26 14:40:06 -07:00


68 commits