Aliaksandr Valialkin
6721e47ae9
app: respect CPU limits set via cgroups
...
Update GOMAXPROCS to limits set via cgroups. This should reduce CPU trashing and reduce memory usage
for cases when VictoriaMetrics components run in containers with CPU limits.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/685
2020-08-11 23:01:03 +03:00
Aliaksandr Valialkin
94471a1273
app: remove duplicate *-pure makefile rules
2020-07-31 20:01:30 +03:00
Aliaksandr Valialkin
106e302d7a
all: add mssing APP_NAME to vm*-GOARCH builds
2020-07-31 13:45:32 +03:00
Aliaksandr Valialkin
f6d4275087
app/{vmagent,vminsert}: properly preserve db
tag from query string passed to Influx line protocol query
...
Previously `db` tag from the query string wasn't added to metrics after encountering `db` tag in the Influx line
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/653
2020-07-28 21:25:49 +03:00
Aliaksandr Valialkin
0750d2cec1
app/vminsert: export vm_relabel_metrics_dropped_total
metric that shows the number of metrics dropped due to relabeling
2020-07-23 14:58:02 +03:00
Aliaksandr Valialkin
49a0011837
app/vminsert: do not call ApplyRelabeling function if relabeling is disabled
...
This should reduce CPU usage a bit when `-relabelConfig` isn't set
2020-07-23 13:35:36 +03:00
Aliaksandr Valialkin
c91ccce50c
app/vminsert: fix relabeling for metrics ingested via Influx line protocol
...
Previously the enabled relabeling with `-relabelConfig` command-line flag could result in missing labels
if a single Influx line protocol message contains multiple field values.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/638
2020-07-23 13:25:37 +03:00
Aliaksandr Valialkin
31ef39e8da
lib/httpserver: log remote address in error message from httpserver.Errorf
...
This should improve detection of the root cause of errors.
Thanks to Anant for the idea.
2020-07-20 14:06:29 +03:00
Aliaksandr Valialkin
f9b38f7f2d
app/vminsert/influx: properly handle the case when certain labels with empty values are removed by ApplyRelabeling() call
...
Previously this could lead to `out of range` panic
2020-07-17 00:05:24 +03:00
Aliaksandr Valialkin
86044f6561
app/{vminsert,vmagent}: add -influxSkipMeasurement
command-line flag for using field name as metric name
...
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/626
2020-07-14 14:18:40 +03:00
Aliaksandr Valialkin
6373d377ef
app/{vminsert,vmagent}: add ability to import data in Prometheus exposition format via /api/v1/import/prometheus
2020-07-10 12:13:28 +03:00
Aliaksandr Valialkin
8bb3622e9d
app/vminsert: prevent from adding and/or selecting labels with empty values
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/600
2020-07-02 23:17:12 +03:00
Aliaksandr Valialkin
6ebac3ab63
app/vminsert: add ability to apply relabeling to all the incoming metrics if -relabelConfig
command-line arg points to a file with a list of relabel_config
entries
...
See https://victoriametrics.github.io/#relabeling
2020-07-02 20:36:33 +03:00
Aliaksandr Valialkin
d962568e93
all: use %w instead of %s for wrapping errors in fmt.Errorf
...
This will simplify examining the returned errors such as httpserver.ErrorWithStatusCode .
See https://blog.golang.org/go1.13-errors for details.
2020-06-30 23:33:46 +03:00
Aliaksandr Valialkin
a586b8b6d4
app/vminsert/netstorage: do not re-route every time series to more than two vmstorage nodes when certain vmstorage nodes are temporarily slower than the rest of them
...
Previously vminsert may spread data for a single time series across all the available vmstorage nodes
when vmstorage nodes couldn't handle the given ingestion rate. This could lead to increased usage
of CPU and memory on every vmstorage node, since every vmstorage node had to register all the time
series seen in the cluster. Now a time series may spread to maximum two vmstorage nodes under heavy load.
Every time series is routed to a single vmstorage node under normal load.
2020-06-25 16:42:37 +03:00
Aliaksandr Valialkin
2fc2679a3f
app/vminsert/netstorage: remove possible race condition when broken connection may be recovered before acquiring storageNode.bcLock
2020-06-20 16:38:08 +03:00
Aliaksandr Valialkin
4400700832
app/vminsert: properly replicate data for the last RF-1
storage nodes for -replicationFactor=RF
...
Previously the data for the last `RF-1` storage noes has been incorrectly replicated to the first storage node.
2020-06-19 12:40:22 +03:00
Aliaksandr Valialkin
4f673a5201
app/vminsert: export metrics for determining ingested rows with dropped or truncated labels
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/565
2020-06-19 01:12:44 +03:00
Aliaksandr Valialkin
85c1ccb8b8
app/vminsert/netstorage: add missing return
in storageNode.checkHealth on connection failure
2020-06-18 20:51:51 +03:00
Aliaksandr Valialkin
464682f380
app/vminsert/netstorage: periodically check for each -storageNode
health, so it could be marked as healthy when it is ready to accept data
...
This fixes uneven data routing in cluster version when `-replicationFactor` is set to 1 (default value),
i.e. when the replication is disabled.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/546
2020-06-18 20:42:43 +03:00
Aliaksandr Valialkin
43b14b9569
app/vminsert/netstorage: free up unused memory in buffer after memory usage spikes
2020-06-01 14:33:35 +03:00
Aliaksandr Valialkin
f41a01332a
app/vminsert/netstorage: evenly distribute rerouted rows among all the availalbe storage nodes
...
Previously such rows were distributed to the original storage node or to the next storage node.
This may result to uneven load among the remaining storage nodes.
2020-05-30 13:51:09 +03:00
Aliaksandr Valialkin
02b2064d8e
app/vminsert/netstorage: do not increment vm_rpc_rows_lost_total when all the vmstorage nodes are unavailable, since vminsert retries sending the data instead of dropping it
2020-05-28 22:36:56 +03:00
Aliaksandr Valialkin
7a61357b5d
app/vminsert/netstorage: make sure that the the data is always replicated among -replicationFactor vmstorage nodes
...
Previously vminsert could write multiple copies of the data to a single vmstorage node when the ingestion rate
exceeds the maximum throughput for connections to vmstorage nodes.
2020-05-28 19:59:07 +03:00
Aliaksandr Valialkin
77e5165e7b
app/vminsert: add -replicationFactor
command-line flag for enabling data replication among available -storageNode instances
2020-05-27 17:29:44 +03:00
Aliaksandr Valialkin
b4e3bffe4b
app/vminsert/netstorage: emit warnings instead of errors when re-routing data to healthy storage nodes
2020-05-27 16:31:41 +03:00
Aliaksandr Valialkin
75f2f3b09d
app/vminsert/netstorage: improve ingestion performance when a single vmstorage node is slower than other vmstorage nodes
...
Previously the ingestion performance has been limited by the slowest vmstorage node.
Now vminsert should re-route data from the slowest vmstorage node to the remaining nodes.
2020-05-27 15:08:22 +03:00
Aliaksandr Valialkin
9844845d79
app/vminsert: tune the maximum summary buffer size for pending data to 1/4 of available RAM, since 1/2 of RAM is too big considering GOGC overhead
2020-05-25 02:00:37 +03:00
Aliaksandr Valialkin
4a82631e44
app/vminsert: limit the summary buffer sizes for all the storage nodes to a half of the allowed memory
2020-05-25 01:39:33 +03:00
Aliaksandr Valialkin
4bd3d4b148
app/vminsert/netstorage: do not return error from storageNode.flushBufLocked when the buffer has been successfully re-routed to healthy nodes
...
This should reduce the number of false errors in the log and the number of falsely lost rows
2020-05-22 18:29:43 +03:00
Aliaksandr Valialkin
6edc33d9bb
app/vminsert/netstorage: capture the first error instead of the last error when sending data to vmstorage
...
The first error has more chances to point to the real root cause of the issue.
2020-05-22 17:49:33 +03:00
Aliaksandr Valialkin
2784015a4d
all: print --help
output to stdout instead of stderr
...
This is easier to grep and pipe
2020-05-16 12:03:06 +03:00
Aliaksandr Valialkin
4e237b4670
app/vminsert/influx: support passing AccountID and ProjectID via plain TCP and UDP
...
Now `vminsert` accepts AccountID and ProjectID via `VictoriaMetrics_AccountID` and `VictoriaMetrics_ProjectID` tags
when reading Influx line protocol data via plain TCP or UDP (i.e. when `-influxListenAddr` is set).
2020-05-12 13:13:04 +03:00
Aliaksandr Valialkin
716bbe79d4
app/vminsert/netstorage: increase timeout for waiting for ack
message after sending big data block to vmstorage
2020-04-28 11:19:46 +03:00
Aliaksandr Valialkin
989d84cf3f
app/{vminsert,vmstorage}: wait for ack
from vmstorage
after each packet sent to it from vminsert
...
This should protect from possible data loss when `vmstorage` is stopped while the packet is sent from `vminsert`.
This commit switches to new protocol between vminsert and vmstorage, which is incompatible
with the previous protocol. So it is required that both vminsert and vmstorage nodes are updated.
2020-04-27 09:53:26 +03:00
Aliaksandr Valialkin
a53e332a93
app/vmstorage: add missing shutdown for http server on graceful shutdown
...
This could result in the following panic during graceful shutdown when `/metrics` page is requested:
http: panic serving 10.101.66.5:57366: runtime error: invalid memory address or nil pointer dereference
goroutine 2050 [running]:
net/http.(*conn).serve.func1(0xc00ef22000)
net/http/server.go:1772 +0x139
panic(0xa0fc00, 0xe91d80)
runtime/panic.go:973 +0x3e3
github.com/VictoriaMetrics/VictoriaMetrics/lib/workingsetcache.(*Cache).UpdateStats(0x0, 0xc0000516c8)
github.com/VictoriaMetrics/VictoriaMetrics/lib/workingsetcache/cache.go:224 +0x37
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*indexDB).UpdateMetrics(0xc00b931d00, 0xc02c41acf8)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/index_db.go:258 +0x9f
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage.(*Storage).UpdateMetrics(0xc0000bc7e0, 0xc02c41ac00)
github.com/VictoriaMetrics/VictoriaMetrics/lib/storage/storage.go:413 +0x4c5
main.registerStorageMetrics.func1(0x0)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/main.go:186 +0xd9
main.registerStorageMetrics.func3(0xc00008c380)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/main.go:196 +0x26
main.registerStorageMetrics.func7(0xc)
github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage/main.go:211 +0x26
github.com/VictoriaMetrics/metrics.(*Gauge).marshalTo(0xc000010148, 0xaa407d, 0x20, 0xb50d60, 0xc005319890)
github.com/VictoriaMetrics/metrics@v1.11.2/gauge.go:38 +0x3f
github.com/VictoriaMetrics/metrics.(*Set).WritePrometheus(0xc000084300, 0x7fd56809c940, 0xc005319860)
github.com/VictoriaMetrics/metrics@v1.11.2/set.go:51 +0x1e1
github.com/VictoriaMetrics/metrics.WritePrometheus(0x7fd56809c940, 0xc005319860, 0xa16f01)
github.com/VictoriaMetrics/metrics@v1.11.2/metrics.go:42 +0x41
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver.writePrometheusMetrics(0x7fd56809c940, 0xc005319860)
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver/metrics.go:16 +0x44
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver.handlerWrapper(0xb5a120, 0xc005319860, 0xc005018f00, 0xc00002cc90)
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver/httpserver.go:154 +0x58d
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver.gzipHandler.func1(0xb5a120, 0xc005319860, 0xc005018f00)
github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver/httpserver.go:119 +0x8e
net/http.HandlerFunc.ServeHTTP(0xc00002d110, 0xb5a660, 0xc0044141c0, 0xc005018f00)
net/http/server.go:2012 +0x44
net/http.serverHandler.ServeHTTP(0xc004414000, 0xb5a660, 0xc0044141c0, 0xc005018f00)
net/http/server.go:2807 +0xa3
net/http.(*conn).serve(0xc00ef22000, 0xb5bf60, 0xc010532080)
net/http/server.go:1895 +0x86c
created by net/http.(*Server).Serve
net/http/server.go:2933 +0x35c
2020-04-02 21:09:55 +03:00
Dmitry Naumov
b84071fc25
Rootless docker images by default ( #358 )
...
* Rootless docker images by default
* Migrate to rootless base image
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2020-03-27 21:18:32 +02:00
Aliaksandr Valialkin
2f0a36044c
app/{vmagent,vminsert}: add support for importing csv data via /api/v1/import/csv
2020-03-10 21:17:40 +02:00
Aliaksandr Valialkin
1286cead75
app/vminsert: properly initialize InsertCtx
...
This should prevent from panic described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/339
2020-02-26 21:21:02 +02:00
Aliaksandr Valialkin
2471340e0d
app/vmagent: add ability to accept Influx line protocol data via TCP and UDP
...
Just set `-influxListenAddr` command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/333
2020-02-25 19:18:01 +02:00
Aliaksandr Valialkin
7ee7614e90
app/vmagent: initial implementation for vmagent
2020-02-23 17:31:54 +02:00
Aliaksandr Valialkin
347aaba79d
lib/{storage,mergeset}: use time.Ticker instead of time.Timer where appropriate
...
It has been appeared that time.Timer was used in places where time.Ticker must be used instead.
This could result in blocked goroutines as in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/316 .
2020-02-13 13:21:48 +02:00
Aliaksandr Valialkin
1010a57882
all: allow setting flags via environment vars
...
Now flags can be set via environment vars with the same names as flags.
Command-line flags override flags set via env vars.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/311
2020-02-10 13:31:21 +02:00
Aliaksandr Valialkin
7cde594696
all: do not clash flag description with back-quoted flag types
...
See https://golang.org/pkg/flag/#PrintDefaults for more details.
2020-02-04 15:56:01 +02:00
Aliaksandr Valialkin
a9c1d5b351
app/vminsert: moved -maxInsertRequestSize
command-line flag out of lib/prompb
in order to prevent its inclusion in vmselect
and vmstorage
apps
2020-01-28 22:53:50 +02:00
Aliaksandr Valialkin
4d70a81e18
app/vminsert: do not drop pending rows if all the vmstorage backends are unavailable
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/294
2020-01-24 22:10:10 +02:00
Aliaksandr Valialkin
0cda6afa8e
app/vminsert: move ingestion protocol parsers to lib/protoparser, so they could be re-used in the upcoming vmagent
2020-01-24 16:55:18 +02:00
Aliaksandr Valialkin
ea53a21b02
all: consistently log durations in seconds with millisecond precision
...
This should improve logs readability
2020-01-22 18:35:24 +02:00
Aliaksandr Valialkin
ecddba30fe
app/vminsert/netstorage: increase timeout for pushing data from vminsert to vmstorage by 3x
...
Our clients report that the previous timeout could lead to frequent errors when
vmstorage starts background merge for big parts on slow HDD.
2020-01-21 18:21:49 +02:00
Aliaksandr Valialkin
cbd0452317
app/vminsert: increase default value for -insert.maxQueueDuration
from 30s to 60s
...
This should help catching up with high ingestion rate after VictoriaMetrics restart.
2020-01-18 14:39:30 +02:00