Aliaksandr Valialkin
65576ebb5b
docs/CaseStudies.md: add Sensedia case study
2021-06-24 14:35:47 +03:00
Aliaksandr Valialkin
1b6850bab3
docs/CHANGELOG.md: document the bugfix in increase_pure() function from the commit fb4f758715
2021-06-24 12:06:15 +03:00
Aliaksandr Valialkin
b84aea1e6e
lib/protoparser/clusternative: do not pool unmarshalWork structs, since they can occupy big amounts of memory (more than 100MB per each struct)
...
This should reduce memory usage for vmstorage under high ingestion rate when the vmstorage runs on a system with big number of CPU cores
2021-06-23 15:45:08 +03:00
Aliaksandr Valialkin
856aecae05
app/vmselect/promql: return the last timestamp for the max / min value from tmax_over_time()
and tmin_over_time()
function as most users expect
2021-06-23 14:18:37 +03:00
Aliaksandr Valialkin
dffa2afefe
docs/CHANGELOG.md: document the bugfix for incorrect stats collection for concurrently executed tag filter
...
Follow up for c22114c6f0
2021-06-23 14:06:33 +03:00
Aliaksandr Valialkin
c18017a9c3
app/vminsert/netstorage: sort the -storageNode
list passed to vminsert
nodes
...
This should reduce resource usage (CPU, RAM, disk IO) at vmstorage nodes
if the addresses of vmstorage nodes are passed in random order to vminsert nodes.
2021-06-23 14:00:08 +03:00
Aliaksandr Valialkin
a22f37599b
lib/storage: tune tag filters search logic
...
Tune the logic according to the logs provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-864293624
The previous logic had a race when multiple concurrent queries execute the same tag filter without prior stats.
This could result in incorrectly stored stats for such tag filter, which then could result in non-optimal sorting of tag filters
for further queries.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-23 13:30:36 +03:00
Aliaksandr Valialkin
f10fa0d1d7
lib/promscrape/discovery/consul: properly pass namespace to Consul watcher
...
Follow-up for 58a2989fe7
2021-06-22 17:43:20 +03:00
Aliaksandr Valialkin
4adf6c9766
lib/promscrape/discovery/http: follow up after e307bbb29a
2021-06-22 13:42:10 +03:00
Nikolay
e03a3d3a36
adds http_sd ( #1399 )
...
* adds http_sd
* adds X-Prometheus-Refresh-Interval-Seconds header
* Update lib/promscrape/discovery/http/api.go
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-06-22 13:42:09 +03:00
Aliaksandr Valialkin
3ab3902f17
lib/promscrape/discovery: support generic auth configs in Consul service discovery in the same way as Prometheus 2.28 does
2021-06-22 13:18:51 +03:00
Aliaksandr Valialkin
61fc9b98e5
docs/CHANGELOG.md: document the support for Consul namepsace
...
See 58a2989fe7
2021-06-22 12:56:12 +03:00
Nikolay
827a2396d2
adds consul enterprise namespace support ( #1400 )
...
* adds consul enterprise namespace support
* Update lib/promscrape/discovery/consul/consul.go
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-06-22 12:56:11 +03:00
Aliaksandr Valialkin
a575882ca2
docs/PerTenantStatistic.md: document that the per-tenant statistic is a part of cluster version of VictoriaMetrics
2021-06-22 12:43:30 +03:00
Roman Khavronenko
79474baf99
vmctl: add more context to flags description in vm-native mode ( #1395 )
2021-06-18 19:20:52 +03:00
Aliaksandr Valialkin
23fcafd437
docs/CHANGELOG.md: typo fixes
2021-06-18 19:15:29 +03:00
Aliaksandr Valialkin
b92d110cad
app/vmselect: log slow requests to all the /api/v1/*
handlers if their execution time exceeds -search.logSlowQueryDuration
2021-06-18 19:07:03 +03:00
Aliaksandr Valialkin
d29e130181
docs/Single-server-VictoriaMetrics.md: mention that it is recommended to use a single scrape_interval across all the scrape targets
2021-06-18 15:41:35 +03:00
Aliaksandr Valialkin
4acc4602b3
app/vmctl: limit JSON line size by 10K samples ( #1394 )
...
This should reduce the maximum memory usage at VictoriaMetrics when importing time series with big number of samples.
2021-06-18 15:41:34 +03:00
Aliaksandr Valialkin
3ec3705943
docs/Cluster-VictoriaMetrics.md: clarify docs about VictoriaMetrics cluster architecture
2021-06-18 14:35:55 +03:00
Aliaksandr Valialkin
cd697b88c5
docs/CHANGELOG.md: document the reduced disk write IO usage
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-18 14:03:42 +03:00
Aliaksandr Valialkin
f9069ba32a
lib/promscrape: show jobs with empty scrape targets on /targets page
2021-06-18 10:54:12 +03:00
Roman Khavronenko
5cb378f5b5
dasbhoard: display tweaks ( #1387 )
...
* rm cumulative visualisation for panel `Disk space used`.
It uses % threshold and cumulative display breaks it.
* remove area filling for resource usage row;
* add job name for panels in resource usage row.
2021-06-18 10:48:40 +03:00
Nikolay
9ea1dca3dd
fixes DO service discovery labels ( #1389 )
...
adds test for digitalocean sd
2021-06-17 17:21:10 +03:00
Aliaksandr Valialkin
60bc35f550
docs/{vmgateway,vmbackupmanager}: explicitly mention that these components are a part of an enterprise package
2021-06-17 17:19:13 +03:00
Aliaksandr Valialkin
a207be3ffb
lib/storage: fix infinite loop introduced in aa9b56a046
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 14:27:30 +03:00
Aliaksandr Valialkin
0efd37cec1
lib/{mergeset,storage}: reduce the number of fsync calls on data ingestion path on systems with many cpu cores
...
VictoriaMetrics maintains a buffer per CPU core for the ingested data. These buffers are flushed to disk every second.
These buffers are flushed to disk in parallel starting from the commit 56b6b893ce
.
This resulted in increased write disk IO usage on systems with many cpu cores
as described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-863046999 .
This commit merges the per-CPU buffers into bigger in-memory buffers before flushing them to disk.
This should reduce the rate of fsync syscalls and, consequently, the write disk IO on systems with many CPU cores.
This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 13:51:42 +03:00
Aliaksandr Valialkin
51fc469642
app/vmagent/remotewrite: go fmt
after 0a796f7c3a
2021-06-17 13:51:40 +03:00
Aliaksandr Valialkin
90c3606269
docs/vmagent.md: sync with app/vmagent/README.md via make docs-sync
2021-06-16 12:37:55 +03:00
Aliaksandr Valialkin
644102b03b
docs/CHANGELOG.md: document the changed -remoteWrite.queues
value
...
This is a follow-up for 0a796f7c3a
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1385
2021-06-16 12:37:55 +03:00
Zongyang
cf506e300d
Change default value of '-remoteWrite.queues' to cgroup.AvailableCPUs * 2 ( #1385 )
...
* Change default value of '-remoteWrite.queues' to cgroup.AvailableCPUS() * 2 to reduce scrape interval
Default value of vmagent option '-remotewrite.queues' is 4 and default
size of vmagent ScheudleUnmarshalWorkers is number of CPUs, when available
CPUs is much greater than 4, e.g 32, worker are competing push queues
which will increase scrape interval and may cause scrape timeout.
* Update README and flag description
Co-authored-by: xiaozy <xiaozy01@fenbi.com>
2021-06-16 12:37:55 +03:00
Roman Khavronenko
a15c947045
promql: fix increase_pure
calculation for cases with stale series ( #1381 )
...
Due to staleness handling, increase_pure were using incorrect previous value
during calculation in cases where series disappears for period longer
than staleness period and then returns back. The fix suppose to account
for a real datapoint value before staleness takes place. The fix should
remove unexpected spikes while using `increase_pure` for staled series.
2021-06-15 17:37:51 +03:00
Aliaksandr Valialkin
234152d66e
docs/Single-server-VictoriaMetrics.md: mention that VictoriaMetrics works great with APM workloads (aka Application Performance Monitoring)
2021-06-15 17:33:41 +03:00
Aliaksandr Valialkin
b133de1e37
lib/storage: move deletedMetricIDs set from indexDB to Storage
...
This makes consitent the list of deleted metricIDs when it is used from both the current indexDB and the previous indexDB (aka extDB).
This should fix the issue, which could lead to storing new samples under deleted metricIDs after indexDB rotation.
See more details at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347#issuecomment-861232136 .
Thanks to @tangqipengleoo for the initial analysis and the pull request - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .
This commit resolves the issue in more generic way compared to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .
The downside of the commit is the deletedMetricIDs set isn't cleaned from the metricIDs outside the retention. It needs app restart.
This should be OK in most cases.
2021-06-15 15:07:54 +03:00
Aliaksandr Valialkin
ebaf68bcb0
lib/protoparser: stop reading the input stream as soon as the callback provided by the caller returns error
...
This is a follow-up for af90c3c43b
2021-06-14 15:20:38 +03:00
faceair
2ea187e801
lib/protoparser: stop read when callback error ( #1380 )
2021-06-14 15:20:37 +03:00
Aliaksandr Valialkin
5f91a701fa
lib/promscrape: show the number of samples collected during the last scrape at /targets and /api/v1/targets pages
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1377
2021-06-14 14:04:35 +03:00
Aliaksandr Valialkin
fb8114ad9c
vendor: update github.com/klauspost/compress from v1.13.0 to v1.13.1
2021-06-14 13:42:05 +03:00
Roman Khavronenko
db39c4a7d1
dashboard: bump version requirements ( #1379 )
2021-06-14 13:32:32 +03:00
Aliaksandr Valialkin
5cd50d840f
docs/CHANGELOG.md: document the addition of DigitalOcean service discovery
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1367
2021-06-14 13:19:31 +03:00
Nikolay
e42da47608
adds digital ocean sd ( #1376 )
...
* adds digital ocean sd config
* adds digital ocean sd
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1367
* typo fix
2021-06-14 13:19:29 +03:00
Roman Khavronenko
1053d3e5a9
Dashboard cluster ( #1375 )
...
* dashboard: update vmagent dash
The update contains the following changes:
* display anonymous memory usage metric. This metric suppose to reflect
memory usage of the process which can't be freed by OS;
* add legends to all panels. This is important for cases when users share
the screenshots;
* modify panels for Grafana v8.0.0
* dashboard: update cluster dash
The update contains the following changes:
* move stats panels to Configuration row, so it can be collapsed;
* display anonymous memory usage metric. This metric suppose to reflect
memory usage of the process which can't be freed by OS;
* add legends to all panels. This is important for cases when users share
the screenshots;
* modify panels for Grafana v8.0.0
2021-06-14 13:03:54 +03:00
Aliaksandr Valialkin
df057177a0
lib/promscrape: increase the duration for reading the full response in stream parsing mode
...
Increase the duration from 10x to 30x of the configured `scrape_interval'.
This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365
2021-06-14 12:29:46 +03:00
Aliaksandr Valialkin
074b11fa69
lib/protoparser: measure the duration for reading the whole block of data instead of a single read operation
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365
2021-06-14 12:29:45 +03:00
Aliaksandr Valialkin
357dbe092c
docs/Cluster-VictoriaMetrics.md: add lists for command-line flags for cluster components
2021-06-14 12:21:22 +03:00
Aliaksandr Valialkin
87d221f78a
lib/protoparser/common: log the duration for reading a block of data in ReadLinesBlockExt on error
...
This may help debugging issues like https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365
2021-06-14 12:21:21 +03:00
Aliaksandr Valialkin
52efd5a05c
docs/vmalert.md: follow-up after 6d5a8c28cd
2021-06-14 11:43:02 +03:00
Roman Khavronenko
c5f493db8e
Vmalert docs ( #1372 )
...
* vmalert: mention what happens if `for` is set to 0 or omitted
* vmalert: add more context to docs
2021-06-14 11:43:01 +03:00
Aliaksandr Valialkin
541429a9af
docs/CHANGELOG.md: cut v1.61.1
2021-06-11 13:02:04 +03:00
Aliaksandr Valialkin
0672cfffa2
app/vmauth: properly handle http.ErrAbortHandler panic
...
This panic can be raised by the reverseProxy on aborted request to the backend.
So handle it (e.g. suppress) at reverseProxy.ServeHTTP call.
Do not suppress the panic at lib/httpserver generic HTTP handler,
since it may result in an inconsistent state left after the panicking handler.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1353
2021-06-11 12:54:37 +03:00