lib/protoparser/promremotewrite: fall back to Snappy decoding if zstd decoding fails

This case is possible after the following steps:

1. vmagent tries to perform handshake with the -remoteWrite.url in order to determine whether
   the remote storage supports zstd-compressed data.
2. The remote storage is unavailable during the handshake. In this case vmagent falls back to Snappy compression
   for the data sent to the remote storage.
3. vmagent compresses the collected data into blocks with Snappy and puts these blocks to persistent queue on disk.
4. The remote storage becomes available.
5. vmagent restarts, performs the handshake with the remote storage and detects that it supports zstd-compressed data.
6. vmagent starts sending Snappy-compressed data from persistent queue to the remote storage,
   while falsely advertizing it sends zstd-compressed data.
7. The remote storage receives Snappy-compressed data and fails unpacking it with zstd.

The solution is to just fall back to Snappy decompression if zstd decompression fails.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5301
This commit is contained in:
Aliaksandr Valialkin 2023-11-13 21:19:06 +01:00
parent 356deada8c
commit 12cd32fd75
No known key found for this signature in database
GPG key ID: A72BEC6CD3D0DED1
2 changed files with 9 additions and 1 deletions

View file

@ -105,6 +105,7 @@ The sandbox cluster installation is running under the constant load generated by
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): properly drop samples if `-streamAggr.dropInput` command-line flag is set and `-remoteWrite.streamAggr.config` contains an empty file. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5207).
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): do not print redundant error logs when failed to scrape consul or nomad target. See [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5239).
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): generate proper link to the main page and to `favicon.ico` at http pages served by `vmagent` such as `/targets` or `/service-discovery` when `vmagent` sits behind an http proxy with custom http path prefixes. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5306).
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): properly decode Snappy-encoded data blocks received via [VictoriaMetrics remote_write protocol](https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol). See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5301).
* BUGFIX: [vmstorage](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html): prevent deleted series to be searchable via `/api/v1/series` API if they were re-ingested with staleness markers. This situation could happen if user deletes the series from the target and from VM, and then vmagent sends stale markers for absent series. Thanks to @ilyatrefilov for the [issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5069) and [pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5174).
* BUGFIX: [vmstorage](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html): log warning about switching to ReadOnly mode only on state change. Before, vmstorage would log this warning every 1s. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5159) for details.
* BUGFIX: [vmauth](https://docs.victoriametrics.com/vmauth.html): show browser authorization window for unauthorized requests to unsupported paths if the `unauthorized_user` section is specified. This allows properly authorizing the user. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5236) for details.

View file

@ -42,7 +42,14 @@ func Parse(r io.Reader, isVMRemoteWrite bool, callback func(tss []prompb.TimeSer
if isVMRemoteWrite {
bb.B, err = zstd.Decompress(bb.B[:0], ctx.reqBuf.B)
if err != nil {
return fmt.Errorf("cannot decompress zstd-encoded request with length %d: %w", len(ctx.reqBuf.B), err)
// Fall back to Snappy decompression, since vmagent may send snappy-encoded messages
// with 'Content-Encoding: zstd' header if they were put into persistent queue before vmagent restart.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5301
zstdErr := err
bb.B, err = snappy.Decode(bb.B[:cap(bb.B)], ctx.reqBuf.B)
if err != nil {
return fmt.Errorf("cannot decompress zstd-encoded request with length %d: %w", len(ctx.reqBuf.B), zstdErr)
}
}
} else {
bb.B, err = snappy.Decode(bb.B[:cap(bb.B)], ctx.reqBuf.B)