* vmalert: deprecate alert's status link
Deprecate alert's status link `/api/v1/<groupID>/<alertID>/status` in favour of
`api/v1/alerts?group_id=<group_id>&alert_id=<alert_id>"`.
The change was needed for simplifying logic in vmselect for proxying vmalert's requests.
The old alert's status link will be still supported for a few versions but will be removed in the future.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2825
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* vmalert: fix review comments
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Previously the time series could be put into dateMetricIDCache without
registering in the per-day inverted index if GetOrCreateTSIDByName
finds TSID entry in the global index. This could lead to missing
series in query results.
The issue has been introduced in the commit 55e7afae3a,
which has been included in VictoriaMetrics v1.78.0
* docs: warn about potential issue with read queries for 1.78.0
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* docs: warn about potential issue with read queries for 1.78.0
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Previously SearchMetricNames was returning unmarshaled metric names.
This wasn't great for vmstorage, which should spend additional CPU time
for marshaling the metric names before sending them to vmselect.
While at it, remove possible duplicate metric names, which could occur when
multiple samples for new time series are ingested via concurrent requests.
Also sort the metric names before returning them to the client.
This simplifies debugging of the returned metric names across repeated requests to /api/v1/series
Previously the cache could store 10K unique regexps. When every regexp is huge (e.g. hundreds of kilobytes),
then the total cache size could grow to multiples of gigabytes. Now the cache size is limited by the total length
of all cached regexps. So huge regexps won't result in high memory usage for the cache.
If the previous dial attempt was unsuccessful, then all the new dial attempts are skipped
until the background goroutine determines that the given address can be successfully dialed.
This reduces query latency when some of vmstorage nodes are unavailable and dialing them is slow.
This should help with https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711
This commit is based on ideas from the https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2756
The main differences are:
- The check for healthy/unhealthy storage nodes is moved one level lower from app/vmselect/netstorage to lib/netutil.ConnPool.
This makes possible re-using this feature everywhere lib/netutil.ConnPool is used.
- The check doesn't take into account handshake errors for already established connections.
Handshake errors usually mean improperly configured VictoriaMetrics cluster, so they shouldn't be ignored.
* vmselect: limit `end` param max value by 2d in future
The change is applied only to service handlers like `/labels` or `/series`
and limits the `end` param by max value <= now() + 2 days. The same limit
is applied for the ingested data, so no reason to allow to request data
in future far than that.
The change is also needed for corner cases like https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2669
where too high `end` value triggers inefficient global index search.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* docs/CHANGELOG.md: document the bugfix
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>