Roman Khavronenko
c6cf821600
dashboard: several minor fixes ( #1418 )
...
* move panel `Disk writes/reads` to `Resource usage` row
* rename row `storage` to `vmstorage`
* remove cumulative display for `Storage ETA` panel
2021-07-01 05:45:35 +03:00
Aliaksandr Valialkin
bced9ee666
lib/{mergeset,storage}: reduce the maximum lifetime for cached indexdb and data blocks from 2 minutes to a minute
...
This should reduce memory usage on a system with high number of active time series and a high churn rate.
One minute is enough for caching the blocks needed for repeated queries (e.g. alerting rules, recording rules and dashboard refreshes).
2021-06-29 19:57:53 +03:00
Aliaksandr Valialkin
b7c0b3dde3
lib/mergeset: switch from sync.Pool to a channel for a pool for inmemoryBlock structs
...
This should reduce memory usage for the pool on systems with big number of CPU cores.
The sync.Pool maintains per-CPU pools, so the total number of objects in the pool
is proportional to the number of available CPU cores. The channel limits the number
of pooled objects by its own capacity. This means smaller number of pooled objects on average.
2021-06-29 19:57:52 +03:00
Aliaksandr Valialkin
2edfea8c36
lib/promscrape/discovery/docker: fix golint warning: struct field Id should be ID
2021-06-29 13:11:33 +03:00
Aliaksandr Valialkin
609ad6d9bf
lib/storage: put indexDBName into the key for dateTagFilter cache and for uselessTagFilters cache
...
This should prevent from stats overwriting when the previous indexdb is queried.
2021-06-29 13:11:32 +03:00
Aliaksandr Valialkin
0c4c630839
lib/promscrape: typo fix in /targets
output
...
The typo has been introduced in fb72a2133f
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1408
2021-06-28 21:27:22 +03:00
Aliaksandr Valialkin
4d8ab5d9fa
docs/vmagent.md: mention about docker_sd_config support
2021-06-25 20:53:09 +03:00
Aliaksandr Valialkin
db66ea9a5f
docs/CHANGELOG.md: cut v1.62.0
2021-06-25 13:29:48 +03:00
Aliaksandr Valialkin
97d1ccfc8e
lib/promscrape: split docker and dockerswarm service discovery code bases, since they have very little in common
...
This is a follow up after c85a5b7fcb
2021-06-25 13:22:16 +03:00
Aliaksandr Valialkin
4461e20e7d
lib/promscrape: consistently sort service discovery routines
...
This should simplify further maintenance of the code
2021-06-25 13:22:16 +03:00
Lu Jiajing
12b4cbb68f
Support Docker ServiceDiscovery ( #1402 )
...
* add docker discovery
* add test
* add labels test and add scrape work
* remove TODO
* refactor to merge apiConfig and sdConfig
* apply suggestion
2021-06-25 13:22:16 +03:00
Nikolay
501429c3ff
adds missing MustStop call to do and http sd ( #1404 )
2021-06-25 11:43:32 +03:00
Aliaksandr Valialkin
f1a2aa0992
vendor: make vendor-update
2021-06-24 17:28:37 +03:00
Aliaksandr Valialkin
01af676436
docs: consistently put the link to articles and slides about VictoriaMetrics after the links to case studies
2021-06-24 15:38:49 +03:00
Aliaksandr Valialkin
634f2128d8
docs/CaseStudies.md: add a case study for DFKI
2021-06-24 15:25:19 +03:00
Aliaksandr Valialkin
b054563703
Add case study for Groove X
2021-06-24 15:06:19 +03:00
Aliaksandr Valialkin
65576ebb5b
docs/CaseStudies.md: add Sensedia case study
2021-06-24 14:35:47 +03:00
Aliaksandr Valialkin
1b6850bab3
docs/CHANGELOG.md: document the bugfix in increase_pure() function from the commit fb4f758715
2021-06-24 12:06:15 +03:00
Aliaksandr Valialkin
b84aea1e6e
lib/protoparser/clusternative: do not pool unmarshalWork structs, since they can occupy big amounts of memory (more than 100MB per each struct)
...
This should reduce memory usage for vmstorage under high ingestion rate when the vmstorage runs on a system with big number of CPU cores
2021-06-23 15:45:08 +03:00
Aliaksandr Valialkin
856aecae05
app/vmselect/promql: return the last timestamp for the max / min value from tmax_over_time()
and tmin_over_time()
function as most users expect
2021-06-23 14:18:37 +03:00
Aliaksandr Valialkin
dffa2afefe
docs/CHANGELOG.md: document the bugfix for incorrect stats collection for concurrently executed tag filter
...
Follow up for c22114c6f0
2021-06-23 14:06:33 +03:00
Aliaksandr Valialkin
c18017a9c3
app/vminsert/netstorage: sort the -storageNode
list passed to vminsert
nodes
...
This should reduce resource usage (CPU, RAM, disk IO) at vmstorage nodes
if the addresses of vmstorage nodes are passed in random order to vminsert nodes.
2021-06-23 14:00:08 +03:00
Aliaksandr Valialkin
a22f37599b
lib/storage: tune tag filters search logic
...
Tune the logic according to the logs provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-864293624
The previous logic had a race when multiple concurrent queries execute the same tag filter without prior stats.
This could result in incorrectly stored stats for such tag filter, which then could result in non-optimal sorting of tag filters
for further queries.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-23 13:30:36 +03:00
Aliaksandr Valialkin
f10fa0d1d7
lib/promscrape/discovery/consul: properly pass namespace to Consul watcher
...
Follow-up for 58a2989fe7
2021-06-22 17:43:20 +03:00
Aliaksandr Valialkin
4adf6c9766
lib/promscrape/discovery/http: follow up after e307bbb29a
2021-06-22 13:42:10 +03:00
Nikolay
e03a3d3a36
adds http_sd ( #1399 )
...
* adds http_sd
* adds X-Prometheus-Refresh-Interval-Seconds header
* Update lib/promscrape/discovery/http/api.go
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-06-22 13:42:09 +03:00
Aliaksandr Valialkin
3ab3902f17
lib/promscrape/discovery: support generic auth configs in Consul service discovery in the same way as Prometheus 2.28 does
2021-06-22 13:18:51 +03:00
Aliaksandr Valialkin
61fc9b98e5
docs/CHANGELOG.md: document the support for Consul namepsace
...
See 58a2989fe7
2021-06-22 12:56:12 +03:00
Nikolay
827a2396d2
adds consul enterprise namespace support ( #1400 )
...
* adds consul enterprise namespace support
* Update lib/promscrape/discovery/consul/consul.go
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-06-22 12:56:11 +03:00
Aliaksandr Valialkin
a575882ca2
docs/PerTenantStatistic.md: document that the per-tenant statistic is a part of cluster version of VictoriaMetrics
2021-06-22 12:43:30 +03:00
Roman Khavronenko
79474baf99
vmctl: add more context to flags description in vm-native mode ( #1395 )
2021-06-18 19:20:52 +03:00
Aliaksandr Valialkin
23fcafd437
docs/CHANGELOG.md: typo fixes
2021-06-18 19:15:29 +03:00
Aliaksandr Valialkin
b92d110cad
app/vmselect: log slow requests to all the /api/v1/*
handlers if their execution time exceeds -search.logSlowQueryDuration
2021-06-18 19:07:03 +03:00
Aliaksandr Valialkin
d29e130181
docs/Single-server-VictoriaMetrics.md: mention that it is recommended to use a single scrape_interval across all the scrape targets
2021-06-18 15:41:35 +03:00
Aliaksandr Valialkin
4acc4602b3
app/vmctl: limit JSON line size by 10K samples ( #1394 )
...
This should reduce the maximum memory usage at VictoriaMetrics when importing time series with big number of samples.
2021-06-18 15:41:34 +03:00
Aliaksandr Valialkin
3ec3705943
docs/Cluster-VictoriaMetrics.md: clarify docs about VictoriaMetrics cluster architecture
2021-06-18 14:35:55 +03:00
Aliaksandr Valialkin
cd697b88c5
docs/CHANGELOG.md: document the reduced disk write IO usage
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-18 14:03:42 +03:00
Aliaksandr Valialkin
f9069ba32a
lib/promscrape: show jobs with empty scrape targets on /targets page
2021-06-18 10:54:12 +03:00
Roman Khavronenko
5cb378f5b5
dasbhoard: display tweaks ( #1387 )
...
* rm cumulative visualisation for panel `Disk space used`.
It uses % threshold and cumulative display breaks it.
* remove area filling for resource usage row;
* add job name for panels in resource usage row.
2021-06-18 10:48:40 +03:00
Nikolay
9ea1dca3dd
fixes DO service discovery labels ( #1389 )
...
adds test for digitalocean sd
2021-06-17 17:21:10 +03:00
Aliaksandr Valialkin
60bc35f550
docs/{vmgateway,vmbackupmanager}: explicitly mention that these components are a part of an enterprise package
2021-06-17 17:19:13 +03:00
Aliaksandr Valialkin
a207be3ffb
lib/storage: fix infinite loop introduced in aa9b56a046
...
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 14:27:30 +03:00
Aliaksandr Valialkin
0efd37cec1
lib/{mergeset,storage}: reduce the number of fsync calls on data ingestion path on systems with many cpu cores
...
VictoriaMetrics maintains a buffer per CPU core for the ingested data. These buffers are flushed to disk every second.
These buffers are flushed to disk in parallel starting from the commit 56b6b893ce
.
This resulted in increased write disk IO usage on systems with many cpu cores
as described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-863046999 .
This commit merges the per-CPU buffers into bigger in-memory buffers before flushing them to disk.
This should reduce the rate of fsync syscalls and, consequently, the write disk IO on systems with many CPU cores.
This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 13:51:42 +03:00
Aliaksandr Valialkin
51fc469642
app/vmagent/remotewrite: go fmt
after 0a796f7c3a
2021-06-17 13:51:40 +03:00
Aliaksandr Valialkin
90c3606269
docs/vmagent.md: sync with app/vmagent/README.md via make docs-sync
2021-06-16 12:37:55 +03:00
Aliaksandr Valialkin
644102b03b
docs/CHANGELOG.md: document the changed -remoteWrite.queues
value
...
This is a follow-up for 0a796f7c3a
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1385
2021-06-16 12:37:55 +03:00
Zongyang
cf506e300d
Change default value of '-remoteWrite.queues' to cgroup.AvailableCPUs * 2 ( #1385 )
...
* Change default value of '-remoteWrite.queues' to cgroup.AvailableCPUS() * 2 to reduce scrape interval
Default value of vmagent option '-remotewrite.queues' is 4 and default
size of vmagent ScheudleUnmarshalWorkers is number of CPUs, when available
CPUs is much greater than 4, e.g 32, worker are competing push queues
which will increase scrape interval and may cause scrape timeout.
* Update README and flag description
Co-authored-by: xiaozy <xiaozy01@fenbi.com>
2021-06-16 12:37:55 +03:00
Roman Khavronenko
a15c947045
promql: fix increase_pure
calculation for cases with stale series ( #1381 )
...
Due to staleness handling, increase_pure were using incorrect previous value
during calculation in cases where series disappears for period longer
than staleness period and then returns back. The fix suppose to account
for a real datapoint value before staleness takes place. The fix should
remove unexpected spikes while using `increase_pure` for staled series.
2021-06-15 17:37:51 +03:00
Aliaksandr Valialkin
234152d66e
docs/Single-server-VictoriaMetrics.md: mention that VictoriaMetrics works great with APM workloads (aka Application Performance Monitoring)
2021-06-15 17:33:41 +03:00
Aliaksandr Valialkin
b133de1e37
lib/storage: move deletedMetricIDs set from indexDB to Storage
...
This makes consitent the list of deleted metricIDs when it is used from both the current indexDB and the previous indexDB (aka extDB).
This should fix the issue, which could lead to storing new samples under deleted metricIDs after indexDB rotation.
See more details at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347#issuecomment-861232136 .
Thanks to @tangqipengleoo for the initial analysis and the pull request - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .
This commit resolves the issue in more generic way compared to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .
The downside of the commit is the deletedMetricIDs set isn't cleaned from the metricIDs outside the retention. It needs app restart.
This should be OK in most cases.
2021-06-15 15:07:54 +03:00