Commit graph

8 commits

Author SHA1 Message Date
Aliaksandr Valialkin
bc1f92d7f5
app/vmagent/remotewrite: follow-up for 87fd400dfc
- Drop samples and return true from remotewrite.TryPush() at fast path when all the remote storage
  systems are configured with the disabled on-disk queue, every in-memory queue is full
  and -remoteWrite.dropSamplesOnOverload is set to true. This case is quite common,
  so it should be optimized. Previously additional CPU time was spent on per-remoteWriteCtx
  relabeling and other processing in this case.

- Properly count the number of dropped samples inside remoteWriteCtx.pushInternalTrackDropped().
  Previously dropped samples were counted only if -remoteWrite.dropSamplesOnOverload flag is set.
  In reality, the samples are dropped when they couldn't be sent to the queue because in-memory queue is full
  and on-disk queue is disabled.
  The remoteWriteCtx.pushInternalTrackDropped() function is called by streaming aggregation for pushing
  the aggregated data to the remote storage. Streaming aggregation cannot wait until the remote storage
  processes pending data, so it drops aggregated samples in this case.

- Clarify the description for -remoteWrite.disableOnDiskQueue command-line flag at -help output,
  so it is clear that this flag can be set individually per each -remoteWrite.url.

- Make the -remoteWrite.dropSamplesOnOverload flag global. If some of the remote storage systems
  are configured with the disabled on-disk queue, then there is no sense in keeping samples
  on some of these systems, while dropping samples on the remaining systems, since this
  will result in global stall on the remote storage system with the disabled on-disk queue
  and with the -remoteWrite.dropSamplesOnOverload=false flag. vmagent will always return false
  from remotewrite.TryPush() in this case. This will result in infinite duplicate samples
  written to the remaining remote storage systems. That's why the -remoteWrite.dropSamplesOnOverload
  is forcibly set to true if more than one -remoteWrite.disableOnDiskQueue flag is set.
  This allows proceeding with newly scraped / pushed samples by sending them to the remaining
  remote storage systems, while dropping them on overloaded systems with the -remoteWrite.disableOnDiskQueue flag set.

- Verify that the remoteWriteCtx.TryPush() returns true in the TestRemoteWriteContext_TryPush_ImmutableTimeseries test.

- Mention in vmagent docs that the -remoteWrite.disableOnDiskQueue command-line flag can be set individually per each -remoteWrite.url.
  See https://docs.victoriametrics.com/vmagent/#disabling-on-disk-persistence

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6248
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6065
2024-07-13 02:30:10 +02:00
Aliaksandr Valialkin
a5d60ad78e
app/vmagent/remotewrite,lib/streamaggr: re-use common code in tests after 879771808b
- Export streamaggr.LoadFromData() function, so it could be used in tests outside the lib/streamaggr package.
  This allows removing a hack with creation of temporary files at TestRemoteWriteContext_TryPush_ImmutableTimeseries.

- Move common code for mustParsePromMetrics() function into lib/prompbmarshal package,
  so it could be used in tests for building []prompbmarshal.TimeSeries from string.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6205
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6206
2024-07-03 15:22:51 +02:00
Andrii Chubatiuk
937ae2ca90
lib/streamaggr: added stale samples metric, added metrics labels (#6462)
### Describe Your Changes

- added stale metrics counters for input and output samples
- added labels for aggregator metrics =>
`name="{rwctx}:{aggrId}:{aggrSuffix}"`
   - rwctx - global or number starting from 1
   - aggrid - aggregator id starting from 1
   - aggrSuffix - <interval>_(by|without)_label1_label2_labeln
   e.g: `name="global:1:1m_without_instance_pod"`

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 861852f262)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-07-01 15:01:49 +02:00
Roman Khavronenko
0bed453737
Feature allow configuring disableOnDiskQueue and dropSamplesOnOverload per url (#6248)
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html):
allow configuring `-remoteWrite.disableOnDiskQueue` and
`-remoteWrite.dropSamplesOnOverload` cmd-line flags per each
`-remoteWrite.url`. See this [pull
request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6065).
Thanks to @rbizos for implementaion!
* FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add
labels `path` and `url` to metrics
`vmagent_remotewrite_push_failures_total` and
`vmagent_remotewrite_samples_dropped_total`. Now number of failed pushes
and dropped samples can be tracked per `-remoteWrite.url`.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Raphael Bizos <r.bizos@criteo.com>
(cherry picked from commit 87fd400dfc)
2024-05-10 14:32:23 +02:00
Andrii Chubatiuk
e26b55db1e
app/vmagent/remotewrite: do not cleanup timeseries which are used in multiple remote write contexts (#6206)
When at least one remote write has deduplication configured it cleans up
timeseries while they can be in use by another remote write without
deduplication

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6205
---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 879771808b)
2024-05-06 12:10:45 +02:00
Aliaksandr Valialkin
5b29be1f4d
app/vmagent/remotewrite: add support for replication additionally to sharding when both -remoteWrite.shardByURL and -remoteWrite.shardByURLReplicas=RF command-line flags are set
This allows setting up data replication among failure domains if the replication factor is smaller than the number of failure domains.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6054

See https://docs.victoriametrics.com/vmagent/#sharding-among-remote-storages
2024-04-19 11:37:04 +02:00
Aliaksandr Valialkin
3f79e54a51
app/vmagent/remotewrite: follow-up for 166b97b8d0 and b6bd9a97a3
- Make the configuration more clear by accepting the list of ignored labels during sharding
  via a dedicated command-line flag - -remoteWrite.shardByURL.ignoreLabels.
  This prevents from overloading the meaning of -remoteWrite.shardByURL.labels command-line flag.

- Removed superfluous memory allocation per each processed sample if sharding by remote storage is enabled.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5938
2024-04-03 00:50:48 +03:00
hagen1778
36c1ca9949
app/vmagent: follow-up 166b97b8d0
* add tests for sharding function
* update flags description
* add changelog note

166b97b8d0
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit b6bd9a97a3)
2024-03-29 14:10:17 +01:00