The new rule for vmalert is supposed to detect groups that miss their
evaluations due to slow queries.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 9866974a53)
Using `min_over_time` should reduce the number of false positives when
a component is running in a near-the-threshold state. Now it should trigger
only if all collected samples were above the threshold over the 10m interval.
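A minimal sketch of the difference, assuming a plain list of samples collected over the 10m window. The function names and the sample values are hypothetical and only illustrate the idea; this is not the rule expression itself.

```python
def fires_on_last_sample(samples, threshold):
    """Condition on the most recent sample only: a single spike is enough."""
    return bool(samples) and samples[-1] > threshold


def fires_on_min_over_time(samples, threshold):
    """Condition on the window minimum: every sample must be above the threshold."""
    return bool(samples) and min(samples) > threshold


# A component hovering around a 0.9 threshold (hypothetical values).
near_threshold = [0.92, 0.88, 0.95]
# A component that stayed above the threshold for the whole window.
sustained = [0.92, 0.95, 0.91]
```

With `near_threshold` the last-sample check fires while the `min_over_time`-style check does not; with `sustained` both fire.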
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 003ef3a518)
* lib/promscrape: add metric `vm_promscrape_scrapes_skipped_total`
Add metric `vm_promscrape_scrapes_skipped_total` to show whether vmagent skips scrapes.
This can happen if vmagent is overloaded or the target responds too slowly for the configured `scrape_interval`.
A follow-up commit should add a corresponding alerting rule and a panel to the vmagent dashboard.
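A rough model of how skips can accumulate, assuming a scrape may start only on a scheduled tick and only when the previous scrape has finished. This is an illustrative simplification with hypothetical names, not vmagent's actual scheduling code.

```python
def count_skipped_scrapes(scrape_duration, scrape_interval, num_ticks):
    """Count scheduled ticks at which no scrape could start.

    Ticks happen at 0, scrape_interval, 2*scrape_interval, ...
    A tick is skipped when the previous scrape is still running.
    """
    skipped = 0
    busy_until = 0  # time at which the currently running scrape finishes
    for tick in range(num_ticks):
        now = tick * scrape_interval
        if now < busy_until:
            skipped += 1
            continue
        busy_until = now + scrape_duration
    return skipped
```

In this model a target that needs 25s to respond with a 10s `scrape_interval` loses two out of every three scheduled scrapes, while a target that responds within the interval loses none.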
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* deployment/docker: add `TooManyScrapeSkips` alerting rule for vmagent
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* dashboards: add panels `Scrape duration 0.99 quantile` and `Skipped scrapes` to vmagent dashboard
Signed-off-by: hagen1778 <roman@victoriametrics.com>
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
* docker-compose: add vmauth to cluster env
vmauth acts as a balancer and is used as an example of how to interconnect
VM components via vmauth.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
* deployment/docker: add VictoriaLogs configuration
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
* deployment/docker/victorialogs: remove outdated comment
The comment indicated that VictoriaLogs had to be built manually before starting it, since no public release was available at the time.
A public tag now exists, so building from sources is no longer required.
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
* deployment/docker/victorialogs/fluentbit: include log path in stream configuration
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
* deployment/docker: add reference to monitoring setup for VictoriaLogs
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
* deployment/docker: disable provenance in buildx
This should fix an issue with multi-platform manifest generation.
Backward compatibility was broken in buildx >= 0.10: the generated image cannot be used with Docker systems that don't support OCI.
Disabling provenance attestation temporarily fixes it.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4907
https://docs.docker.com/build/attestations/slsa-provenance/
* Update docs/CHANGELOG.md
---------
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
The `ConcurrentFlushesHitTheLimit` alert can be relevant to components like
vminsert, vmstorage, vm-single-node and vmagent. Moving this alert
to the `health` section of alerts will be beneficial for all components
and will remove the duplicates from single/cluster alerts.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
- Parse protobuf if Content-Type isn't set to `application/json` - this behavior is documented at https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki
- Properly handle gzip'ped JSON requests. The `gzip` value must be read from the `Content-Encoding` header instead of the `Content-Type` header
- Properly flush all the parsed logs with the explicit call to vlstorage.MustAddRows() at the end of query handler
- Check JSON field types more strictly.
- Allow parsing Loki timestamp as floating-point number. Such a timestamp can be generated by some clients,
which store timestamps in float64 instead of int64.
- Optimize parsing of Loki labels in Prometheus text exposition format.
- Simplify tests.
- Remove lib/slicesutil, since there are no more users for it.
- Update docs with missing info and fix various typos. For example, it should be enough to have `instance` and `job` labels
as stream fields in most Loki setups.
- Allow empty or missing timestamps in the ingested logs.
The current timestamp on the VictoriaLogs side is then used for the ingested logs.
This simplifies debugging and testing of the provided HTTP-based data ingestion APIs.
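The header and timestamp handling described in the list above can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the VictoriaLogs code: the function names are hypothetical, headers are assumed to arrive as a dict with lowercase keys, and timestamps are assumed to be in nanoseconds.

```python
import gzip
import time


def pick_format(headers):
    """Use the JSON parser only for `application/json`; otherwise fall back to protobuf."""
    if headers.get("content-type", "") == "application/json":
        return "json"
    return "protobuf"


def decode_body(headers, body):
    """Decompress the body when `Content-Encoding` (not `Content-Type`) says gzip."""
    if headers.get("content-encoding", "") == "gzip":
        return gzip.decompress(body)
    return body


def parse_timestamp_ns(raw, now_ns=None):
    """Parse a timestamp given as an integer or floating-point value.

    An empty or missing timestamp is replaced with the current time.
    `now_ns` exists only to make the fallback testable.
    """
    if raw is None or raw == "":
        return now_ns if now_ns is not None else time.time_ns()
    try:
        return int(raw)
    except ValueError:
        # Some clients store timestamps in float64, e.g. "1.5e18".
        return int(float(raw))
```

Note that a float64 cannot represent every nanosecond value exactly, so timestamps sent as floats may lose precision before they reach the server.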
The remaining MAJOR issue, which needs to be addressed: victoria-logs binary size increased from 13MB to 22MB
after adding support for Loki data ingestion protocol at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4482 .
This is because of shitty protobuf dependencies. They must be replaced with another protobuf implementation
similar to the one used at lib/prompb or lib/prompbmarshal .