app/vmagent: automatically detect whether the remote storage supports VictoriaMetrics remote write protocol

Replace the -remoteWrite.useVMProto command-line flag with -remoteWrite.forcePromProto,
which forces the Prometheus remote write protocol even when the remote storage
supports the VictoriaMetrics remote write protocol.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3847
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1225
Aliaksandr Valialkin 2023-02-23 17:36:52 -08:00
parent e688121de8
commit c080443fef
7 changed files with 109 additions and 61 deletions

@@ -181,30 +181,25 @@ There is also support for multitenant writes. See [these docs](#multitenancy).
 ## VictoriaMetrics remote write protocol
-By default `vmagent` uses Prometheus remote_write protocol for sending the data to the configured `-remoteWrite.url`.
-This allows sending data to [any Prometheus-compatible remote storage](https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage).
+`vmagent` supports sending data to the configured `-remoteWrite.url` either via Prometheus remote write protocol
+or via VictoriaMetrics remote write protocol.
-The Prometheus remote_write protocol may require big amounts of network bandwidth under high load.
-This may result in high network egress costs when the configured remote storage is located in remote datacenter or availability zone.
-This also may result in the increased disk IO at `vmagent` when it writes to disk the pending data, which must be sent to remote storage.
-In this case the `vmagent` can be instructed to use VictoriaMetrics remote write protocol.
-This allows reducing egress network bandwidth costs while reducing disk read/write IO at `vmagent` side under high load.
-The `-remoteWrite.useVMProto=true` command-line flag instructs `vmagent` to send the data to the corresponding `-remoteWrite.url`
-via VictoriaMetrics remote write protocol.
+VictoriaMetrics remote write protocol provides the following benefits comparing to Prometheus remote write protocol:
-While all the [recently released](https://docs.victoriametrics.com/CHANGELOG.html) VictoriaMetrics components support
-the VictoriaMetrics remote write protocol, third-party systems and old versions of VictoriaMetrics components may miss the support of this protocol.
+- Reduced network bandwidth usage by 2x-5x. This allows saving network bandwidth usage costs when `vmagent` and
+the configured remote storage systems are located in different datacenters, availability zones or regions.
-The `-remoteWrite.useVMProto` command-line flag can be set independently per each configured `-remoteWrite.url`.
-For example, the following command instructs `vmagent` to send the data to `https://victoriametrics/api/v1/write` via VictoriaMetrics remote write protocol,
-while sending the data to `https://prom-compatible-storage/write` via Prometheus remote write protocol:
+- Reduced disk read/write IO and disk space usage at `vmagent` when the remote storage is temporarily unavailable.
+In this case `vmagent` buffers the incoming data to disk using the VictoriaMetrics remote write format.
+This reduces disk read/write IO and disk space usage by 2x-5x comparing to Prometheus remote write format.
-```
-./vmagent -remoteWrite.url=https://victoriametrics/api/v1/write \
--remoteWrite.useVMProto=true \
--remoteWrite.url=https://prom-compatible-storage/write \
--remoteWrite.useVMProto=false
-```
+`vmagent` automatically uses VictoriaMetrics remote write protocol when it sends data to VictoriaMetrics components such as other `vmagent` instances,
+[single-node VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html)
+or `vminsert` at [cluster version](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html).
+`vmagent` automatically switches to Prometheus remote write protocol when it sends data to old versions of VictoriaMetrics components
+or to other Prometheus-compatible remote storage systems. It is possible to force switch to Prometheus remote write protocol
+by specifying `-remoteWrite.forcePromProto` command-line flag for the corresponding `-remoteWrite.url`.
 ## Multitenancy
@@ -1453,6 +1448,9 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
      Supports an array of values separated by comma or specified via multiple flags.
   -remoteWrite.flushInterval duration
      Interval for flushing the data to remote storage. This option takes effect only when less than 10K data points per second are pushed to -remoteWrite.url (default 1s)
+  -remoteWrite.forcePromProto array
+     Whether to force Prometheus remote write protocol for sending data to the corresponding -remoteWrite.url . See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol
+     Supports array of values separated by comma or specified via multiple flags.
   -remoteWrite.headers array
      Optional HTTP headers to send with each request to the corresponding -remoteWrite.url. For example, -remoteWrite.headers='My-Auth:foobar' would send 'My-Auth: foobar' HTTP header with every request to the corresponding -remoteWrite.url. Multiple headers must be delimited by '^^': -remoteWrite.headers='header1:value1^^header2:value2'
      Supports an array of values separated by comma or specified via multiple flags.
@@ -1538,14 +1536,11 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
   -remoteWrite.tmpDataPath string
      Path to directory where temporary data for remote write component is stored. See also -remoteWrite.maxDiskUsagePerURL (default "vmagent-remotewrite-data")
   -remoteWrite.url array
-     Remote storage URL to write data to. It must support Prometheus remote_write protocol. Example url: http://<victoriametrics-host>:8428/api/v1/write . It is recommended setting -remoteWrite.useVMProto command-line option when VictoriaMetrics is used as a remote storage in order to save network bandwidth. See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol . Pass multiple -remoteWrite.url options in order to replicate the collected data to multiple remote storage systems. See also -remoteWrite.multitenantURL
+     Remote storage URL to write data to. It must support either VictoriaMetrics remote write protocol or Prometheus remote_write protocol. Example url: http://<victoriametrics-host>:8428/api/v1/write . Pass multiple -remoteWrite.url options in order to replicate the collected data to multiple remote storage systems. See also -remoteWrite.multitenantURL
      Supports an array of values separated by comma or specified via multiple flags.
   -remoteWrite.urlRelabelConfig array
      Optional path to relabel configs for the corresponding -remoteWrite.url. See also -remoteWrite.relabelConfig. The path can point either to local file or to http url. See https://docs.victoriametrics.com/vmagent.html#relabeling
      Supports an array of values separated by comma or specified via multiple flags.
-  -remoteWrite.useVMProto array
-     Whether to use VictoriaMetrics protocol for sending the data to the given -remoteWrite.url in order to reduce network bandwidth usage and disk read/write IO under high load. See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol
-     Supports array of values separated by comma or specified via multiple flags.
   -sortLabels
      Whether to sort labels for incoming samples before writing them to all the configured remote storage systems. This may be needed for reducing memory usage at remote storage when the order of labels in incoming samples is random. For example, if m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}Enabled sorting for labels can slow down ingestion performance a bit
   -tls

@@ -253,6 +253,9 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
 	}
 	switch path {
 	case "/prometheus/api/v1/write", "/api/v1/write":
+		if common.HandleVMProtoServerHandshake(w, r) {
+			return true
+		}
 		prometheusWriteRequests.Inc()
 		if err := promremotewrite.InsertHandler(nil, r); err != nil {
 			prometheusWriteErrors.Inc()

@@ -123,6 +123,10 @@ func newHTTPClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persiste
 		}
 		tr.Proxy = http.ProxyURL(pu)
 	}
+	hc := &http.Client{
+		Transport: tr,
+		Timeout: sendTimeout.GetOptionalArgOrDefault(argIdx, time.Minute),
+	}
 	c := &client{
 		sanitizedURL: sanitizedURL,
 		remoteWriteURL: remoteWriteURL,
@@ -130,11 +134,8 @@ func newHTTPClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persiste
 		authCfg: authCfg,
 		awsCfg: awsCfg,
 		fq: fq,
-		hc: &http.Client{
-			Transport: tr,
-			Timeout: sendTimeout.GetOptionalArgOrDefault(argIdx, time.Minute),
-		},
-		stopCh: make(chan struct{}),
+		hc: hc,
+		stopCh: make(chan struct{}),
 	}
 	c.sendBlock = c.sendBlockHTTP
 	return c

@@ -21,6 +21,7 @@ import (
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
+	"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/streamaggr"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
 	"github.com/VictoriaMetrics/metrics"
@@ -28,17 +29,14 @@ import (
 )
 var (
-	remoteWriteURLs = flagutil.NewArrayString("remoteWrite.url", "Remote storage URL to write data to. It must support Prometheus remote_write protocol. "+
-		"Example url: http://<victoriametrics-host>:8428/api/v1/write . "+
-		"It is recommended setting -remoteWrite.useVMProto command-line option when VictoriaMetrics is used as a remote storage in order to save network bandwidth. "+
-		"See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol . "+
+	remoteWriteURLs = flagutil.NewArrayString("remoteWrite.url", "Remote storage URL to write data to. It must support either VictoriaMetrics remote write protocol "+
+		"or Prometheus remote_write protocol. Example url: http://<victoriametrics-host>:8428/api/v1/write . "+
 		"Pass multiple -remoteWrite.url options in order to replicate the collected data to multiple remote storage systems. See also -remoteWrite.multitenantURL")
 	remoteWriteMultitenantURLs = flagutil.NewArrayString("remoteWrite.multitenantURL", "Base path for multitenant remote storage URL to write data to. "+
 		"See https://docs.victoriametrics.com/vmagent.html#multitenancy for details. Example url: http://<vminsert>:8480 . "+
 		"Pass multiple -remoteWrite.multitenantURL flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.url")
-	useVMProto = flagutil.NewArrayBool("remoteWrite.useVMProto", "Whether to use VictoriaMetrics protocol for sending the data to the given -remoteWrite.url "+
-		"in order to reduce network bandwidth usage and disk read/write IO under high load. "+
-		"See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol")
+	forcePromProto = flagutil.NewArrayBool("remoteWrite.forcePromProto", "Whether to force Prometheus remote write protocol for sending data "+
+		"to the corresponding -remoteWrite.url . See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol")
 	tmpDataPath = flag.String("remoteWrite.tmpDataPath", "vmagent-remotewrite-data", "Path to directory where temporary data for remote write component is stored. "+
 		"See also -remoteWrite.maxDiskUsagePerURL")
 	queues = flag.Int("remoteWrite.queues", cgroup.AvailableCPUs()*2, "The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues "+
@@ -480,7 +478,18 @@ func newRemoteWriteCtx(argIdx int, at *auth.Token, remoteWriteURL *url.URL, maxI
 	_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_inmemory_blocks{path=%q, url=%q}`, queuePath, sanitizedURL), func() float64 {
 		return float64(fq.GetInmemoryQueueLen())
 	})
-	isVMRemoteWrite := useVMProto.GetOptionalArg(argIdx)
+	// Auto-detect whether the remote storage supports VictoriaMetrics remote write protocol.
+	isVMRemoteWrite := false
+	usePromProto := forcePromProto.GetOptionalArg(argIdx)
+	if !usePromProto {
+		isVMRemoteWrite = common.HandleVMProtoClientHandshake(remoteWriteURL)
+		if !isVMRemoteWrite {
+			logger.Infof("the remote storage at %q doesn't support VictoriaMetrics remote write protocol. Switching to Prometheus remote write protocol. "+
+				"See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol", sanitizedURL)
+		}
+	}
 	var c *client
 	switch remoteWriteURL.Scheme {
 	case "http", "https":
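
The protocol selection in the hunk above can be illustrated in isolation. The following is a minimal sketch and not the code from this commit: `optionalBool` and `selectProtocol` are hypothetical helpers, the probe is passed in as a plain function standing in for `common.HandleVMProtoClientHandshake`, and a missing per-URL flag value is assumed to mean `false`.

```go
package main

import "fmt"

// optionalBool is a hypothetical stand-in for a per-URL boolean flag lookup.
// Assumption: a URL index without an explicit value is treated as false.
func optionalBool(values []bool, idx int) bool {
	if idx < len(values) {
		return values[idx]
	}
	return false
}

// selectProtocol returns the protocol name chosen for the URL at idx.
// The probe is only called when the Prometheus protocol is not forced.
func selectProtocol(forceProm []bool, idx int, probe func() bool) string {
	if optionalBool(forceProm, idx) {
		// Forced by the operator: skip auto-detection entirely.
		return "prometheus"
	}
	if probe() {
		return "victoriametrics"
	}
	// The remote storage did not confirm support, so fall back.
	return "prometheus"
}

func main() {
	forceProm := []bool{false, true}
	supportsVM := func() bool { return true }
	noVMSupport := func() bool { return false }

	fmt.Println(selectProtocol(forceProm, 0, supportsVM))  // victoriametrics
	fmt.Println(selectProtocol(forceProm, 1, supportsVM))  // prometheus
	fmt.Println(selectProtocol(forceProm, 2, noVMSupport)) // prometheus
}
```

With the flag set only for the second URL, the first URL is probed and may use the VictoriaMetrics protocol, while the second one skips the probe even though the storage would support it.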

@@ -157,6 +157,9 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
 	}
 	switch path {
 	case "/prometheus/api/v1/write", "/api/v1/write":
+		if common.HandleVMProtoServerHandshake(w, r) {
+			return true
+		}
 		prometheusWriteRequests.Inc()
 		if err := promremotewrite.InsertHandler(r); err != nil {
 			prometheusWriteErrors.Inc()

@@ -185,30 +185,25 @@ There is also support for multitenant writes. See [these docs](#multitenancy).
 ## VictoriaMetrics remote write protocol
-By default `vmagent` uses Prometheus remote_write protocol for sending the data to the configured `-remoteWrite.url`.
-This allows sending data to [any Prometheus-compatible remote storage](https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage).
+`vmagent` supports sending data to the configured `-remoteWrite.url` either via Prometheus remote write protocol
+or via VictoriaMetrics remote write protocol.
-The Prometheus remote_write protocol may require big amounts of network bandwidth under high load.
-This may result in high network egress costs when the configured remote storage is located in remote datacenter or availability zone.
-This also may result in the increased disk IO at `vmagent` when it writes to disk the pending data, which must be sent to remote storage.
-In this case the `vmagent` can be instructed to use VictoriaMetrics remote write protocol.
-This allows reducing egress network bandwidth costs while reducing disk read/write IO at `vmagent` side under high load.
-The `-remoteWrite.useVMProto=true` command-line flag instructs `vmagent` to send the data to the corresponding `-remoteWrite.url`
-via VictoriaMetrics remote write protocol.
+VictoriaMetrics remote write protocol provides the following benefits comparing to Prometheus remote write protocol:
-While all the [recently released](https://docs.victoriametrics.com/CHANGELOG.html) VictoriaMetrics components support
-the VictoriaMetrics remote write protocol, third-party systems and old versions of VictoriaMetrics components may miss the support of this protocol.
+- Reduced network bandwidth usage by 2x-5x. This allows saving network bandwidth usage costs when `vmagent` and
+the configured remote storage systems are located in different datacenters, availability zones or regions.
-The `-remoteWrite.useVMProto` command-line flag can be set independently per each configured `-remoteWrite.url`.
-For example, the following command instructs `vmagent` to send the data to `https://victoriametrics/api/v1/write` via VictoriaMetrics remote write protocol,
-while sending the data to `https://prom-compatible-storage/write` via Prometheus remote write protocol:
+- Reduced disk read/write IO and disk space usage at `vmagent` when the remote storage is temporarily unavailable.
+In this case `vmagent` buffers the incoming data to disk using the VictoriaMetrics remote write format.
+This reduces disk read/write IO and disk space usage by 2x-5x comparing to Prometheus remote write format.
-```
-./vmagent -remoteWrite.url=https://victoriametrics/api/v1/write \
--remoteWrite.useVMProto=true \
--remoteWrite.url=https://prom-compatible-storage/write \
--remoteWrite.useVMProto=false
-```
+`vmagent` automatically uses VictoriaMetrics remote write protocol when it sends data to VictoriaMetrics components such as other `vmagent` instances,
+[single-node VictoriaMetrics](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html)
+or `vminsert` at [cluster version](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html).
+`vmagent` automatically switches to Prometheus remote write protocol when it sends data to old versions of VictoriaMetrics components
+or to other Prometheus-compatible remote storage systems. It is possible to force switch to Prometheus remote write protocol
+by specifying `-remoteWrite.forcePromProto` command-line flag for the corresponding `-remoteWrite.url`.
 ## Multitenancy
@@ -1457,6 +1452,9 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
      Supports an array of values separated by comma or specified via multiple flags.
   -remoteWrite.flushInterval duration
      Interval for flushing the data to remote storage. This option takes effect only when less than 10K data points per second are pushed to -remoteWrite.url (default 1s)
+  -remoteWrite.forcePromProto array
+     Whether to force Prometheus remote write protocol for sending data to the corresponding -remoteWrite.url . See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol
+     Supports array of values separated by comma or specified via multiple flags.
   -remoteWrite.headers array
      Optional HTTP headers to send with each request to the corresponding -remoteWrite.url. For example, -remoteWrite.headers='My-Auth:foobar' would send 'My-Auth: foobar' HTTP header with every request to the corresponding -remoteWrite.url. Multiple headers must be delimited by '^^': -remoteWrite.headers='header1:value1^^header2:value2'
      Supports an array of values separated by comma or specified via multiple flags.
@@ -1542,14 +1540,11 @@ See the docs at https://docs.victoriametrics.com/vmagent.html .
   -remoteWrite.tmpDataPath string
      Path to directory where temporary data for remote write component is stored. See also -remoteWrite.maxDiskUsagePerURL (default "vmagent-remotewrite-data")
   -remoteWrite.url array
-     Remote storage URL to write data to. It must support Prometheus remote_write protocol. Example url: http://<victoriametrics-host>:8428/api/v1/write . It is recommended setting -remoteWrite.useVMProto command-line option when VictoriaMetrics is used as a remote storage in order to save network bandwidth. See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol . Pass multiple -remoteWrite.url options in order to replicate the collected data to multiple remote storage systems. See also -remoteWrite.multitenantURL
+     Remote storage URL to write data to. It must support either VictoriaMetrics remote write protocol or Prometheus remote_write protocol. Example url: http://<victoriametrics-host>:8428/api/v1/write . Pass multiple -remoteWrite.url options in order to replicate the collected data to multiple remote storage systems. See also -remoteWrite.multitenantURL
      Supports an array of values separated by comma or specified via multiple flags.
   -remoteWrite.urlRelabelConfig array
      Optional path to relabel configs for the corresponding -remoteWrite.url. See also -remoteWrite.relabelConfig. The path can point either to local file or to http url. See https://docs.victoriametrics.com/vmagent.html#relabeling
      Supports an array of values separated by comma or specified via multiple flags.
-  -remoteWrite.useVMProto array
-     Whether to use VictoriaMetrics protocol for sending the data to the given -remoteWrite.url in order to reduce network bandwidth usage and disk read/write IO under high load. See https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol
-     Supports array of values separated by comma or specified via multiple flags.
   -sortLabels
      Whether to sort labels for incoming samples before writing them to all the configured remote storage systems. This may be needed for reducing memory usage at remote storage when the order of labels in incoming samples is random. For example, if m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}Enabled sorting for labels can slow down ingestion performance a bit
   -tls

@@ -0,0 +1,42 @@
+package common
+import (
+	"io"
+	"net/http"
+	"net/url"
+	"strconv"
+)
+func HandleVMProtoClientHandshake(remoteWriteURL *url.URL) bool {
+	u := *remoteWriteURL
+	q := u.Query()
+	q.Set("get_vm_proto_version", "1")
+	u.RawQuery = q.Encode()
+	resp, err := http.Get(u.String())
+	if err != nil {
+		return false
+	}
+	data, err := io.ReadAll(resp.Body)
+	_ = resp.Body.Close()
+	if err != nil {
+		return false
+	}
+	if resp.StatusCode != http.StatusOK {
+		return false
+	}
+	version, err := strconv.Atoi(string(data))
+	if err != nil {
+		return false
+	}
+	return version >= 1
+}
+// HandleVMProtoServerHandshake returns true if r contains handshake request for determining the supported protocol version.
+func HandleVMProtoServerHandshake(w http.ResponseWriter, r *http.Request) bool {
+	q := r.URL.Query()
+	if q.Get("get_vm_proto_version") != "" {
+		io.WriteString(w, "1")
+		return true
+	}
+	return false
+}
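
The handshake added in this file can be exercised end to end against a throwaway HTTP server. The sketch below is an illustration under stated assumptions rather than the commit's code: `serverHandshake`, `probeVMProto`, `startServer` and `detect` are hypothetical names, the client uses an explicit 5-second timeout (the commit calls `http.Get`, which sets none), and a storage without handshake support is simulated by an HTTP 400 reply.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strconv"
	"time"
)

// serverHandshake answers the version probe with "1" and reports whether
// the request was a handshake request.
func serverHandshake(w http.ResponseWriter, r *http.Request) bool {
	if r.URL.Query().Get("get_vm_proto_version") == "" {
		return false
	}
	_, _ = io.WriteString(w, "1")
	return true
}

// probeVMProto sends the probe and returns true only for a 200 response
// whose body parses as an integer version >= 1.
func probeVMProto(hc *http.Client, remoteWriteURL *url.URL) bool {
	u := *remoteWriteURL // copy, so the caller's URL is left untouched
	q := u.Query()
	q.Set("get_vm_proto_version", "1")
	u.RawQuery = q.Encode()
	resp, err := hc.Get(u.String())
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false
	}
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return false
	}
	version, err := strconv.Atoi(string(data))
	return err == nil && version >= 1
}

// startServer starts a local test server. When vmCompatible is false,
// the server ignores the probe and rejects the request.
func startServer(vmCompatible bool) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if vmCompatible && serverHandshake(w, r) {
			return
		}
		http.Error(w, "cannot parse remote write request", http.StatusBadRequest)
	}))
}

// detect runs one probe against a freshly started test server.
func detect(vmCompatible bool) bool {
	srv := startServer(vmCompatible)
	defer srv.Close()
	u, err := url.Parse(srv.URL + "/api/v1/write")
	if err != nil {
		return false
	}
	hc := &http.Client{Timeout: 5 * time.Second}
	return probeVMProto(hc, u)
}

func main() {
	fmt.Println("storage with handshake support:", detect(true))
	fmt.Println("storage without handshake support:", detect(false))
}
```

Any failure along the way (connection error, non-200 status, non-numeric body) is treated as "not supported", which matches the fallback to the Prometheus remote write protocol described in the commit message.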