140 commits

Author SHA1 Message Date
Aliaksandr Valialkin
7bfb5efaef
deployment/docker: upgrade Go builder for production builds from v1.17.7 to v1.18.0
See https://tip.golang.org/doc/go1.18
2022-03-16 14:07:43 +02:00
Denys Holius
e93e168bdc
Added missing runbook for updating k8s VM Cluster in DO (#2219)
* added missing runbook for updating k8s VM Cluster in DO

* Update deployment/marketplace/digitialocean/one-click-droplet/RELEASE_GUIDE.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-18 16:29:04 +02:00
Roman Khavronenko
7cd371f08f
alerts: lower the threshold for TooHighSlowInsertsRate (#2210)
Lowering the threshold from 50% to 5% is more effective
for detecting an unhealthy system state. It also brings the alert
in sync with its definition in the cluster branch.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-02-18 13:42:24 +02:00
Roman Khavronenko
e29b2b8444
Monitoring single (#2190)
* dashboards: plot cpu limits for vmagent, vmalert and vm-single dashboards

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* alerts: add `TooHighCPUUsage` alert for all VM components

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards: bump components version requirements

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-02-15 11:54:28 +02:00
Nikolay
75e84144c7
adds release build for macos darwin amd64 and arm64 (#2185)
* adds release build for macos darwin amd64 and arm64

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1896
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1851

* wip

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-14 17:28:56 +02:00
Aliaksandr Valialkin
93c2db5546
deployment/docker/docker-compose.yml: update Grafana from v8.3.4 to v8.3.5
See https://grafana.com/blog/2022/02/08/grafana-7.5.15-and-8.3.5-released-with-moderate-severity-security-fixes
2022-02-14 13:22:25 +02:00
Aliaksandr Valialkin
e08b74fcd6
deployment/docker: update Go builder from v1.17.6 to v1.17.7
See https://github.com/golang/go/issues?q=milestone%3AGo1.17.7+label%3ACherryPickApproved
2022-02-12 01:13:05 +02:00
Nikolay
a8acad7453
adds CGO build for arm64 (#2102)
* adds CGO build for arm64
it should improve performance for arm64-based deployments of vmstorage and
vmsingle by 15-20%

it depends on gozstd package update for correct musl gozstd vendoring

* typo fixes

* docs/CHANGELOG.md: document the change

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-02-08 16:25:59 +02:00
Aliaksandr Valialkin
bce7d7ac60
deployment/docker: update Grafana from v8.3.2 to v8.3.4 2022-01-18 22:42:15 +02:00
Aliaksandr Valialkin
e47385d34a
deployment/docker: update Go builder from v1.17.5 to v1.17.6
See https://github.com/golang/go/issues?q=milestone%3AGo1.17.6+label%3ACherryPickApproved
2022-01-07 13:34:32 +02:00
Denis Golius
dd1b789c15 removed unneeded directory 2022-01-06 12:17:53 +03:00
Denys Holius
d44cc14c6b
added packer build for DigitalOcean Droplets (#1917)
* added packer build for DigitalOcean Droplets

* fixed typo

* added packer RELEASE_GUIDE.md, Makefile

* Apply suggestions from code review

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* added corrections and improvements

* added packer link & templating for sed version

* fixed typo

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2021-12-21 12:09:14 +02:00
Roman Khavronenko
bc79bdf68a
Dashboards vmagent updates (#1973)
* dashboards/vmagent: shuffle panels for better visibility

More important error/dropped panels were moved higher on the main row.
Network usage panel moved to Resource usage row.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmagent: add Troubleshooting row to show top 5 instances/jobs by churn rate

The new panels show the top 5 jobs or targets that generate most of the
churn rate. They were placed into a new row "Troubleshooting".

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmagent: add panels for showing persistent queue saturation

New panels were added to the Troubleshooting row to show the persistent queue
saturation. The corresponding alerts were added and linked to these
panels as well.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmagent: add alert "RejectedRemoteWriteDataBlocksAreDropped"

The new alert is supposed to send a notification when vmagent starts to drop
data blocks rejected by the configured remote write destination.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-20 12:16:53 +02:00
Aliaksandr Valialkin
496b6e4d3d
deployment/docker/docker-compose.yml: update Grafana version from 8.2.2 to 8.3.2
See https://grafana.com/blog/2021/12/10/grafana-8.3.2-and-7.5.12-released-with-moderate-severity-security-fix/
2021-12-14 15:09:49 +02:00
Aliaksandr Valialkin
a8ad870bd0
deployment/docker: update Go builder from v1.17.4 to v1.17.5
See https://github.com/golang/go/issues?q=milestone%3AGo1.17.5+label%3ACherryPickApproved
2021-12-12 18:17:49 +02:00
Aliaksandr Valialkin
a2e0275f14
deployment/docker: update Go builder from v1.17.3 to v1.17.4
See https://github.com/golang/go/issues?q=milestone%3AGo1.17.4+label%3ACherryPickApproved
2021-12-09 18:51:58 +02:00
Thomas Danielsson
77e19b3f87
Fix vmsingle dashboard link (#1894) 2021-12-02 14:43:30 +02:00
Denis Golius
37faf1f426 Bumped Alpine linux version to 3.15.0 2021-11-28 20:53:48 +02:00
Aliaksandr Valialkin
129b0d2b22
deployment/docker: allow using / chars in ROOT_IMAGE when running make package-*
This fixes the following command:

ROOT_IMAGE=gcr.io/distroless/static make package-victoria-metrics
2021-11-14 13:55:05 +02:00
Denys Holius
49ee952e9a
Bumped Alpine linux version to the latest (#1811)
See https://alpinelinux.org/posts/Alpine-3.14.3-released.html
2021-11-14 12:59:27 +03:00
Aliaksandr Valialkin
3db1f2d550
deployment/dm: update Go builder from Go1.17.2 to Go1.17.3
See https://github.com/golang/go/issues?q=milestone%3AGo1.17.3+label%3ACherryPickApproved
2021-11-05 11:51:38 +02:00
Aliaksandr Valialkin
b76db7c772
deployment/docker: update Grafana from v8.2.0 to v8.2.2 2021-10-22 19:33:22 +03:00
Aliaksandr Valialkin
ec40affb59
deployment/docker/alerts.yml: formatting fixes after 865a60f13e 2021-10-19 08:53:03 +03:00
Yurii Kravets
865a60f13e
Update alerts.yml
Added Series Limit day\hour alerts
2021-10-18 18:14:49 +03:00
Aliaksandr Valialkin
83a2a9f2f7
deployment/docker/docker-compose.yml: upgrade Grafana from v8.1.2 to v8.2.0 2021-10-08 20:37:40 +03:00
Aliaksandr Valialkin
00fe5230e9
deployment/docker: update Go builder version from Go1.17.1 to Go1.17.2
See https://github.com/golang/go/issues?q=milestone%3AGo1.17.2+label%3ACherryPickApproved
2021-10-08 17:42:57 +03:00
Nikolay
3f1e6da1d7
moves prod images build into alpine container with musl (#1640)
adds gcc and musl-dev to builder container
2021-09-24 00:14:11 +03:00
Aliaksandr Valialkin
2394b5018b deployment/docker: update Go builder from v1.17.0 to v1.17.1
See https://github.com/golang/go/issues?q=milestone%3AGo1.17.1+label%3ACherryPickApproved
2021-09-12 15:23:53 +03:00
Denys Holius
abba6e8370
Bump alpine linux to latest (#1607) 2021-09-09 16:29:15 +03:00
Roman Khavronenko
0f4bcc00b2
Single dashboards upd (#1593)
* dashboard: replace `null` datasources

null datasource value may confuse Grafana and make it drop panel query in some
versions.

* docker: bump grafana image version

* dashboards: add URL variable selector to vmagent dashboard

* dashboards: add new panel `Remote write connection saturation` to vmagent dashboard

* alerts: add new alert for `Remote write connection saturation` panel of vmagent dashboard

* dashboards: add "Logging rate" panel to vmagent dashboard
2021-09-01 11:46:22 +03:00
Roman Khavronenko
2ed2878a57 docs: fix the link for cluster docker compose 2021-09-01 09:21:45 +03:00
Roman Khavronenko
0d6735106b docs: update docker env description 2021-09-01 09:18:56 +03:00
Roman Khavronenko
eff940aa76
Vmalert metrics update (#1580)
* vmalert: remove `vmalert_execution_duration_seconds` metric

The summary for `vmalert_execution_duration_seconds` metric gives no additional
value compared to the `vmalert_iteration_duration_seconds` metric.

* vmalert: update config reload success metric properly

Previously, if an attempt to reload the config was unsuccessful and was followed
by a rollback to the previous version, the metric remained set to 0.

* vmalert: add Grafana dashboard to overview application metrics

* docker: include vmalert target into list for scraping

* vmalert: extend notifier metrics with addr label

The change adds an `addr` label to metrics for alerts_sent and alerts_send_errors
to identify which exact address is having issues.
The corresponding change was made to the vmalert dashboard.

* vmalert: update documentation and docker environment for vmalert's dashboard

Mention Grafana's dashboard in vmalert's README in a new section #Monitoring.

Update docker-compose env to automatically add vmalert's dashboard.
Update docker-compose README with additional info about services.
2021-08-31 12:28:02 +03:00
Aliaksandr Valialkin
69c291353b deployment/docker: update Go builder from Go1.16.0 to Go1.17.0
This improves data ingestion and query performance by up to 5% according to benchmarks.

See https://go.dev/blog/go1.17
2021-08-21 22:20:49 +03:00
Aliaksandr Valialkin
06bf21c21b deployment/docker: upgrade Alpine base docker image from v3.14.0 to v3.14.1
See https://www.alpinelinux.org/posts/Alpine-3.14.1-released.html

This fixes https://vuldb.com/?source_cve.180051
See also https://vuldb.com/?id.180051 and https://snyk.io/vuln/SNYK-ALPINE314-APKTOOLS-1533752
2021-08-18 11:04:11 +03:00
Aliaksandr Valialkin
5716af4636 deployment/dm: update Go builder from Go1.16.6 to Go1.16.7
See https://github.com/golang/go/issues?q=milestone%3AGo1.16.7+label%3ACherryPickApproved
2021-08-06 12:12:03 +03:00
Roman Khavronenko
408ba43092
Alerts single update (#1510)
* alerts: move `ProcessNearFDLimits` to `vm-health` group since it is relevant for all services

* alerts: add new `TooHighMemoryUsage` alerting rule
2021-08-02 15:51:24 +03:00
Aliaksandr Valialkin
244d0fe5d7 deployment/docker: update Go builder from v1.16.5 to v1.16.6
This Go release has the following bugfixes: https://github.com/golang/go/issues?q=milestone%3AGo1.16.6+label%3ACherryPickApproved
2021-07-13 14:25:41 +03:00
Aliaksandr Valialkin
8c764e88f0 app/vmui: move source code from https://github.com/VictoriaMetrics/vmui to app/vmui
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1413
2021-07-09 17:15:23 +03:00
Aliaksandr Valialkin
c5f0b454f0 app/vmselect: follow-up after aa11ef6d3b
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1413
2021-07-07 17:43:35 +03:00
tony
e9e35a7d6a add vmui for vmselect component (#1431)
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-07-07 17:33:02 +03:00
Roman Khavronenko
2f54559c89
alerts: sync alert expression for DiskRunsOutOfSpaceIn3Days with dashboard (#1436) 2021-07-07 10:31:09 +03:00
Aliaksandr Valialkin
00bbe1608b deployment/docker: upgrade alpine image from v3.13.5 to v3.14.0 2021-07-01 10:56:56 +03:00
Roman Khavronenko
5e9f3777bf
alerts: add new alert LabelsLimitExceededOnIngestion (#1359) 2021-06-09 12:15:36 +03:00
Aliaksandr Valialkin
28c44ef065 deployment/docker/docker-compose.yml: update Grafana from v7.5.2 to v8.0.0
See https://github.com/grafana/grafana/releases/tag/v8.0.0
2021-06-09 02:25:24 +03:00
k1rk
668165f53d
rename serviceHealth group name to vm-health (#1360)
this causes conflicts in `victoria-metrics-k8s-stack` chart =)
2021-06-08 23:34:38 +03:00
Aliaksandr Valialkin
8a7e6ad5cc deployment/docker: update Go builder from v1.16.4 to v1.16.5
See the fixed issues at https://github.com/golang/go/issues?q=milestone%3AGo1.16.5+label%3ACherryPickApproved
2021-06-08 15:45:44 +03:00
Aliaksandr Valialkin
f8d50e9641 deployment/dm: update Go builder from v1.16.3 to v1.16.4
See https://github.com/golang/go/issues?q=milestone%3AGo1.16.4+label%3ACherryPickApproved for details
2021-05-08 20:04:05 +03:00
Aliaksandr Valialkin
0969b446b3 deployment/docker: update base docker image from alpine:3.13.2 to alpine:3.13.5 2021-05-01 10:50:09 +03:00
Roman Khavronenko
162681e60d
add new alerts (#1195)
* alerts: backport `DiskRunsOutOfSpace` alert and some other tweaks from cluster branch

* alerts: add `ServiceDown` alert to detect "dead" services
2021-04-08 18:24:25 +03:00