fix: vmselect multi-level setup panic (#3738)

* app/vmselect/netstorage: fix panic for multi-level cluster setup when `replicationFactor` was set and request contained `trace` parameter (#3734)

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vmselect/netstorage: use correct context for retry

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
This commit is contained in:
Zakhar Bessarab 2023-02-01 20:56:36 +04:00 committed by Aliaksandr Valialkin
parent 840f4e3383
commit 1a8f6d98c7
No known key found for this signature in database
GPG key ID: A72BEC6CD3D0DED1
2 changed files with 5 additions and 4 deletions

View file

@ -1650,7 +1650,7 @@ func (sn *storageNode) processSearchQuery(qt *querytracer.Tracer, requestData []
func (sn *storageNode) execOnConnWithPossibleRetry(qt *querytracer.Tracer, funcName string, f func(bc *handshake.BufferedConn) error, deadline searchutils.Deadline) error {
qtChild := qt.NewChild("rpc call %s()", funcName)
err := sn.execOnConn(qtChild, funcName, f, deadline)
qtChild.Done()
defer qtChild.Done()
if err == nil {
return nil
}
@ -1661,9 +1661,9 @@ func (sn *storageNode) execOnConnWithPossibleRetry(qt *querytracer.Tracer, funcN
return err
}
// Repeat the query in the hope the error was temporary.
qtChild = qt.NewChild("retry rpc call %s() after error", funcName)
err = sn.execOnConn(qtChild, funcName, f, deadline)
qtChild.Done()
qtRetry := qtChild.NewChild("retry rpc call %s() after error", funcName)
err = sn.execOnConn(qtRetry, funcName, f, deadline)
qtRetry.Done()
return err
}

View file

@ -17,6 +17,7 @@ The following tip changes can be tested by building VictoriaMetrics components f
* BUGFIX: fix a bug, which could prevent background merges for the previous partitions until restart if the storage didn't have enough disk space for final deduplication and down-sampling.
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): update API version for [ec2_sd_configs](https://docs.victoriametrics.com/sd_configs.html#ec2_sd_configs) to fix [the issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3700) with missing `__meta_ec2_availability_zone_id` attribute.
* BUGFIX: [VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html): fix panic on top-level vmselect nodes of [multi-level setup](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multi-level-cluster-setup) when the `-replicationFactor` flag is set and request contains `trace` query parameter. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3734).
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): [dockerswarm_sd_configs](https://docs.victoriametrics.com/sd_configs.html#dockerswarm_sd_configs): apply `filters` only to objects of the specified `role`. Previously filters were applied to all the objects, which could cause errors when different types of objects were used with filters that were not compatible with them. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3579).
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): suppress all the scrape errors when `-promscrape.suppressScrapeErrors` is enabled. Previously some scrape errors were logged even if `-promscrape.suppressScrapeErrors` flag was set.
* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent.html): consistently put the scrape url with scrape target labels to all error logs for failed scrapes. Previously some failed scrapes were logged without this information.