VictoriaMetrics/lib/storage/index_db_timing_test.go

348 lines
9.6 KiB
Go
Raw Permalink Normal View History

2019-05-22 21:16:55 +00:00
package storage
import (
"fmt"
lib/storage: restore ability to put empty metric ID list into tagFiltersToMetricIDsCache (#7064) ### Describe Your Changes Currently it the metricID list is empty it won't be mashalled and as the result won't be put into the tagFiltersToMetricIDsCache which causes the cache misses for the corresponding tagFilters. In some setups this causes severe search speed detradation (see #7009). The empty metric IDs was covered before but then was accidentally removed in 6c21439. This PR restores the coverage of this case. A new unit test can be used as a proof that empty metricID lists are not added to the cache (just remove the fix in index_db.go and run the test to see the result) Also a benchmark has been added to see the implications of the compression. ``` user@laptop:~/p/github.com/rtm0/VictoriaMetrics/01/src$ go test ./lib/storage/ -run=NONE -bench BenchmarkMarshalUnmarshalMetricIDs --loggerLevel=ERROR goos: linux goarch: amd64 pkg: github.com/VictoriaMetrics/VictoriaMetrics/lib/storage cpu: 13th Gen Intel(R) Core(TM) i7-1355U BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-0-12 3237240 363.5 ns/op 0 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-1-12 2831049 451.8 ns/op 0.4706 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-10-12 1152764 1009 ns/op 1.667 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-100-12 297055 3998 ns/op 5.755 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-1000-12 31172 34566 ns/op 8.484 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-10000-12 4900 289659 ns/op 9.416 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-100000-12 447 2341173 ns/op 9.456 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-1000000-12 42 24926928 ns/op 9.468 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-10000000-12 5 204098872 ns/op 9.467 compression-rate PASS ok github.com/VictoriaMetrics/VictoriaMetrics/lib/storage 15.018s ``` ### Checklist The following checks are **mandatory**: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-09-20 15:21:53 +00:00
"math/rand"
"regexp"
2019-05-22 21:16:55 +00:00
"strconv"
"testing"
"time"
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
2019-05-22 21:16:55 +00:00
)
func BenchmarkRegexpFilterMatch(b *testing.B) {
b.ReportAllocs()
b.RunParallel(func(pb *testing.PB) {
re := regexp.MustCompile(`.*foo-bar-baz.*`)
b := []byte("fdsffd foo-bar-baz assd fdsfad dasf dsa")
for pb.Next() {
if !re.Match(b) {
panic("BUG: regexp must match!")
}
b[0]++
}
})
}
func BenchmarkRegexpFilterMismatch(b *testing.B) {
b.ReportAllocs()
b.RunParallel(func(pb *testing.PB) {
re := regexp.MustCompile(`.*foo-bar-baz.*`)
b := []byte("fdsffd foo-bar sfddsf assd nmn,mfdsdsakj")
for pb.Next() {
if re.Match(b) {
panic("BUG: regexp mustn't match!")
}
b[0]++
}
})
}
2019-05-22 21:16:55 +00:00
func BenchmarkIndexDBAddTSIDs(b *testing.B) {
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
const path = "BenchmarkIndexDBAddTSIDs"
s := MustOpenStorage(path, retentionMax, 0, 0)
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
db := s.idb()
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
const recordsPerLoop = 1e3
2019-05-22 21:16:55 +00:00
b.ReportAllocs()
b.SetBytes(recordsPerLoop)
b.ResetTimer()
b.RunParallel(func(pb *testing.PB) {
var mn MetricName
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
var genTSID generationTSID
2019-05-22 21:16:55 +00:00
// The most common tags.
mn.Tags = []Tag{
{
Key: []byte("job"),
},
{
Key: []byte("instance"),
},
}
startOffset := 0
for pb.Next() {
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
benchmarkIndexDBAddTSIDs(db, &genTSID, &mn, startOffset, recordsPerLoop)
2019-05-22 21:16:55 +00:00
startOffset += recordsPerLoop
}
})
b.StopTimer()
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
s.MustClose()
fs.MustRemoveAll(path)
2019-05-22 21:16:55 +00:00
}
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
func benchmarkIndexDBAddTSIDs(db *indexDB, genTSID *generationTSID, mn *MetricName, startOffset, recordsPerLoop int) {
date := uint64(0)
is := db.getIndexSearch(noDeadline)
2019-05-22 21:16:55 +00:00
defer db.putIndexSearch(is)
for i := 0; i < recordsPerLoop; i++ {
mn.MetricGroup = strconv.AppendUint(mn.MetricGroup[:0], uint64(i+startOffset), 10)
for j := range mn.Tags {
mn.Tags[j].Value = strconv.AppendUint(mn.Tags[j].Value[:0], uint64(i*j), 16)
}
mn.sortTags()
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
generateTSID(&genTSID.TSID, mn)
createAllIndexesForMetricName(is, mn, &genTSID.TSID, date)
2019-05-22 21:16:55 +00:00
}
}
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
func BenchmarkHeadPostingForMatchers(b *testing.B) {
// This benchmark is equivalent to https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52
// See https://www.robustperception.io/evaluating-performance-and-correctness for more details.
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
const path = "BenchmarkHeadPostingForMatchers"
s := MustOpenStorage(path, retentionMax, 0, 0)
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
db := s.idb()
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
// Fill the db with data as in https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L66
is := db.getIndexSearch(noDeadline)
defer db.putIndexSearch(is)
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
var mn MetricName
var genTSID generationTSID
date := uint64(0)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
addSeries := func(kvs ...string) {
mn.Reset()
for i := 0; i < len(kvs); i += 2 {
mn.AddTag(kvs[i], kvs[i+1])
}
mn.sortTags()
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
generateTSID(&genTSID.TSID, &mn)
createAllIndexesForMetricName(is, &mn, &genTSID.TSID, date)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
}
for n := 0; n < 10; n++ {
ns := strconv.Itoa(n)
lib/storage: increase the number of created time series in BenchmarkHeadPostingForMatchers in order to be on par with Promethues The previous commit was accidentally creating 10x smaller number of time series than Prometheus and this led to invalid benchmark results. The updated benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 6194893 -97.73% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 10781372 -92.19% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 10632834 -92.11% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 10679975 -94.55% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 100118510 -98.74% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 154955671 -97.96% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 258003769 -77.42% BenchmarkHeadPostingForMatchers/i!="" 9964150263 159783895 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 10937895 -94.96% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 10990027 -94.57% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 87004349 -82.11% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 53342793 -84.79% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 54256156 -85.76% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 21823279 -75.62% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 46671359 -87.70% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 53915842 -87.30% VictoriaMetrics uses 1GB of RAM during the benchmark (vs 3.5GB of RAM for Prometheus)
2019-11-18 17:48:24 +00:00
for i := 0; i < 100000; i++ {
ix := strconv.Itoa(i)
addSeries("i", ix, "n", ns, "j", "foo")
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
// Have some series that won't be matched, to properly test inverted matches.
addSeries("i", ix, "n", ns, "j", "bar")
addSeries("i", ix, "n", "0_"+ns, "j", "bar")
addSeries("i", ix, "n", "1_"+ns, "j", "bar")
addSeries("i", ix, "n", "2_"+ns, "j", "foo")
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
}
}
// Make sure all the items can be searched.
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
db.s.DebugFlush()
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
b.ResetTimer()
benchSearch := func(b *testing.B, tfs *TagFilters, expectedMetricIDs int) {
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
tfss := []*TagFilters{tfs}
tr := TimeRange{
MinTimestamp: 0,
MaxTimestamp: timestampFromTime(time.Now()),
}
for i := 0; i < b.N; i++ {
is := db.getIndexSearch(noDeadline)
metricIDs, err := is.searchMetricIDs(nil, tfss, tr, 2e9)
db.putIndexSearch(is)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
if err != nil {
b.Fatalf("unexpected error in searchMetricIDs: %s", err)
}
if len(metricIDs) != expectedMetricIDs {
b.Fatalf("unexpected metricIDs found; got %d; want %d", len(metricIDs), expectedMetricIDs)
}
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
}
}
addTagFilter := func(tfs *TagFilters, key, value string, isNegative, isRegexp bool) {
if err := tfs.Add([]byte(key), []byte(value), isNegative, isRegexp); err != nil {
b.Fatalf("cannot add tag filter %q=%q, isNegative=%v, isRegexp=%v", key, value, isNegative, isRegexp)
}
}
b.Run(`n="1"`, func(b *testing.B) {
tfs := NewTagFilters()
addTagFilter(tfs, "n", "1", false, false)
benchSearch(b, tfs, 2e5)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
})
b.Run(`n="1",j="foo"`, func(b *testing.B) {
tfs := NewTagFilters()
addTagFilter(tfs, "n", "1", false, false)
addTagFilter(tfs, "j", "foo", false, false)
benchSearch(b, tfs, 1e5)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
})
b.Run(`j="foo",n="1"`, func(b *testing.B) {
tfs := NewTagFilters()
addTagFilter(tfs, "j", "foo", false, false)
addTagFilter(tfs, "n", "1", false, false)
benchSearch(b, tfs, 1e5)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
})
b.Run(`n="1",j!="foo"`, func(b *testing.B) {
tfs := NewTagFilters()
addTagFilter(tfs, "n", "1", false, false)
addTagFilter(tfs, "j", "foo", true, false)
benchSearch(b, tfs, 1e5)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
})
b.Run(`i=~".*"`, func(b *testing.B) {
tfs := NewTagFilters()
addTagFilter(tfs, "i", ".*", false, true)
benchSearch(b, tfs, 0)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
})
b.Run(`i=~".+"`, func(b *testing.B) {
tfs := NewTagFilters()
addTagFilter(tfs, "i", ".+", false, true)
benchSearch(b, tfs, 5e6)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
})
b.Run(`i=~""`, func(b *testing.B) {
tfs := NewTagFilters()
addTagFilter(tfs, "i", "", false, true)
benchSearch(b, tfs, 0)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
})
b.Run(`i!=""`, func(b *testing.B) {
tfs := NewTagFilters()
addTagFilter(tfs, "i", "", true, false)
benchSearch(b, tfs, 5e6)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
})
b.Run(`n="1",i=~".*",j="foo"`, func(b *testing.B) {
tfs := NewTagFilters()
addTagFilter(tfs, "n", "1", false, false)
addTagFilter(tfs, "i", ".*", false, true)
addTagFilter(tfs, "j", "foo", false, false)
benchSearch(b, tfs, 1e5)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
})
b.Run(`n="1",i=~".*",i!="2",j="foo"`, func(b *testing.B) {
tfs := NewTagFilters()
addTagFilter(tfs, "n", "1", false, false)
addTagFilter(tfs, "i", ".*", false, true)
addTagFilter(tfs, "i", "2", true, false)
addTagFilter(tfs, "j", "foo", false, false)
benchSearch(b, tfs, 1e5-1)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
})
b.Run(`n="1",i!=""`, func(b *testing.B) {
tfs := NewTagFilters()
addTagFilter(tfs, "n", "1", false, false)
addTagFilter(tfs, "i", "", true, false)
benchSearch(b, tfs, 2e5)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
})
b.Run(`n="1",i!="",j="foo"`, func(b *testing.B) {
tfs := NewTagFilters()
addTagFilter(tfs, "n", "1", false, false)
addTagFilter(tfs, "i", "", true, false)
addTagFilter(tfs, "j", "foo", false, false)
benchSearch(b, tfs, 1e5)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
})
b.Run(`n="1",i=~".+",j="foo"`, func(b *testing.B) {
tfs := NewTagFilters()
addTagFilter(tfs, "n", "1", false, false)
addTagFilter(tfs, "i", ".+", false, true)
addTagFilter(tfs, "j", "foo", false, false)
benchSearch(b, tfs, 1e5)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
})
b.Run(`n="1",i=~"1.+",j="foo"`, func(b *testing.B) {
tfs := NewTagFilters()
addTagFilter(tfs, "n", "1", false, false)
addTagFilter(tfs, "i", "1.+", false, true)
addTagFilter(tfs, "j", "foo", false, false)
benchSearch(b, tfs, 11110)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
})
b.Run(`n="1",i=~".+",i!="2",j="foo"`, func(b *testing.B) {
tfs := NewTagFilters()
addTagFilter(tfs, "n", "1", false, false)
addTagFilter(tfs, "i", ".+", false, true)
addTagFilter(tfs, "i", "2", true, false)
addTagFilter(tfs, "j", "foo", false, false)
benchSearch(b, tfs, 1e5-1)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
})
b.Run(`n="1",i=~".+",i!~"2.*",j="foo"`, func(b *testing.B) {
tfs := NewTagFilters()
addTagFilter(tfs, "n", "1", false, false)
addTagFilter(tfs, "i", ".+", false, true)
addTagFilter(tfs, "i", "2.*", true, true)
addTagFilter(tfs, "j", "foo", false, false)
benchSearch(b, tfs, 88889)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
})
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
s.MustClose()
fs.MustRemoveAll(path)
lib/storage: add BenchmarkHeadPostingForMatchers similar to the benchmark from Prometheus See the corresponding benchmark in Prometheus - https://github.com/prometheus/prometheus/blob/23c0299d85bfeb5d9b59e994861553a25ca578e5/tsdb/head_bench_test.go#L52 The benchmark allows performing apples-to-apples comparison of time series search in Prometheus and VictoriaMetrics. The following article - https://www.robustperception.io/evaluating-performance-and-correctness - contains incorrect numbers for VictoriaMetrics, since there wasn't this benchmark yet. Fix this. Benchmarks can be repeated with the following commands from Prometheus and VictoriaMetrics source code roots: - Prometheus: GOMAXPROCS=1 go test ./tsdb/ -run=111 -bench=BenchmarkHeadPostingForMatchers - VictoriaMetrics: GOMAXPROCS=1 go test ./lib/storage/ -run=111 -bench=BenchmarkHeadPostingForMatchers Benchmark results: benchmark old ns/op new ns/op delta BenchmarkHeadPostingForMatchers/n="1" 272756688 364977 -99.87% BenchmarkHeadPostingForMatchers/n="1",j="foo" 138132923 1181636 -99.14% BenchmarkHeadPostingForMatchers/j="foo",n="1" 134723762 1141578 -99.15% BenchmarkHeadPostingForMatchers/n="1",j!="foo" 195823953 1148056 -99.41% BenchmarkHeadPostingForMatchers/i=~".*" 7962582919 8716755 -99.89% BenchmarkHeadPostingForMatchers/i=~".+" 7589543864 12096587 -99.84% BenchmarkHeadPostingForMatchers/i=~"" 1142371741 16164560 -98.59% BenchmarkHeadPostingForMatchers/i!="" 9964150263 12230021 -99.88% BenchmarkHeadPostingForMatchers/n="1",i=~".*",j="foo" 216995884 1173476 -99.46% BenchmarkHeadPostingForMatchers/n="1",i=~".*",i!="2",j="foo" 202541348 1299743 -99.36% BenchmarkHeadPostingForMatchers/n="1",i!="" 486285711 11555193 -97.62% BenchmarkHeadPostingForMatchers/n="1",i!="",j="foo" 350776931 5607506 -98.40% BenchmarkHeadPostingForMatchers/n="1",i=~".+",j="foo" 380888565 6380335 -98.32% BenchmarkHeadPostingForMatchers/n="1",i=~"1.+",j="foo" 89500296 2078970 -97.68% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!="2",j="foo" 379529654 6561368 -98.27% BenchmarkHeadPostingForMatchers/n="1",i=~".+",i!~"2.*",j="foo" 424563825 6757132 -98.41% The first column (old) is for Prometheus, the second column (new) is for VictoriaMetrics. As you can see, VictoriaMetrics outperforms Prometheus by more than 100x in almost all the test cases of this benchmark. Prometheus was using 3.5GB of RAM during the benchmark, while VictoriaMetrics was using 400MB of RAM.
2019-11-18 16:21:27 +00:00
}
2019-05-22 21:16:55 +00:00
func BenchmarkIndexDBGetTSIDs(b *testing.B) {
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
const path = "BenchmarkIndexDBGetTSIDs"
s := MustOpenStorage(path, retentionMax, 0, 0)
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
db := s.idb()
2019-05-22 21:16:55 +00:00
const recordsPerLoop = 1000
const recordsCount = 1e5
// Fill the db with recordsCount records.
var mn MetricName
mn.MetricGroup = []byte("rps")
for i := 0; i < 2; i++ {
key := fmt.Sprintf("key_%d", i)
value := fmt.Sprintf("value_%d", i)
mn.AddTag(key, value)
}
mn.sortTags()
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
var genTSID generationTSID
date := uint64(12345)
2019-05-22 21:16:55 +00:00
is := db.getIndexSearch(noDeadline)
2019-05-22 21:16:55 +00:00
defer db.putIndexSearch(is)
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
2019-05-22 21:16:55 +00:00
for i := 0; i < recordsCount; i++ {
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
generateTSID(&genTSID.TSID, &mn)
createAllIndexesForMetricName(is, &mn, &genTSID.TSID, date)
2019-05-22 21:16:55 +00:00
}
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
db.s.DebugFlush()
2019-05-22 21:16:55 +00:00
b.SetBytes(recordsPerLoop)
b.ReportAllocs()
b.ResetTimer()
b.RunParallel(func(pb *testing.PB) {
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
var genTSIDLocal generationTSID
2019-05-22 21:16:55 +00:00
var metricNameLocal []byte
var mnLocal MetricName
mnLocal.CopyFrom(&mn)
mnLocal.sortTags()
2019-05-22 21:16:55 +00:00
for pb.Next() {
is := db.getIndexSearch(noDeadline)
2019-05-22 21:16:55 +00:00
for i := 0; i < recordsPerLoop; i++ {
metricNameLocal = mnLocal.Marshal(metricNameLocal[:0])
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
if !is.getTSIDByMetricName(&genTSIDLocal, metricNameLocal, date) {
panic(fmt.Errorf("cannot obtain tsid for row %d", i))
2019-05-22 21:16:55 +00:00
}
}
db.putIndexSearch(is)
2019-05-22 21:16:55 +00:00
}
})
b.StopTimer()
lib/storage: switch from global to per-day index for `MetricName -> TSID` mapping Previously all the newly ingested time series were registered in global `MetricName -> TSID` index. This index was used during data ingestion for locating the TSID (internal series id) for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names). The `MetricName -> TSID` index is stored on disk in order to make sure that the data isn't lost on VictoriaMetrics restart or unclean shutdown. The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache, and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics uses in-memory cache for speeding up the lookup for active time series. This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk. VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases: - If `storage/tsid` cache capacity isn't enough for active time series. Then just increase available memory for VictoriaMetrics or reduce the number of active time series ingested into VictoriaMetrics. - If new time series is ingested into VictoriaMetrics. In this case it cannot find the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index, since it doesn't know that the index has no the corresponding entry too. This is a typical event under high churn rate, when old time series are constantly substituted with new time series. Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index, are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics. Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName` for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod. This index can become very large under high churn rate and long retention. VictoriaMetrics caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups. The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series. This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics consults only the index for the current day when new time series is ingested into it. The downside of this change is increased indexdb size on disk for workloads without high churn rate, e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store identical `MetricName -> TSID` entries for static time series for every day. This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation, since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 . At the same time the change fixes the issue, which could result in lost access to time series, which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685 This is a follow-up for 1f28b46ae9350795af41cbfc3ca0e8a5af084fce
2023-07-13 22:33:41 +00:00
s.MustClose()
fs.MustRemoveAll(path)
2019-05-22 21:16:55 +00:00
}
lib/storage: restore ability to put empty metric ID list into tagFiltersToMetricIDsCache (#7064) ### Describe Your Changes Currently it the metricID list is empty it won't be mashalled and as the result won't be put into the tagFiltersToMetricIDsCache which causes the cache misses for the corresponding tagFilters. In some setups this causes severe search speed detradation (see #7009). The empty metric IDs was covered before but then was accidentally removed in 6c21439. This PR restores the coverage of this case. A new unit test can be used as a proof that empty metricID lists are not added to the cache (just remove the fix in index_db.go and run the test to see the result) Also a benchmark has been added to see the implications of the compression. ``` user@laptop:~/p/github.com/rtm0/VictoriaMetrics/01/src$ go test ./lib/storage/ -run=NONE -bench BenchmarkMarshalUnmarshalMetricIDs --loggerLevel=ERROR goos: linux goarch: amd64 pkg: github.com/VictoriaMetrics/VictoriaMetrics/lib/storage cpu: 13th Gen Intel(R) Core(TM) i7-1355U BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-0-12 3237240 363.5 ns/op 0 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-1-12 2831049 451.8 ns/op 0.4706 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-10-12 1152764 1009 ns/op 1.667 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-100-12 297055 3998 ns/op 5.755 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-1000-12 31172 34566 ns/op 8.484 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-10000-12 4900 289659 ns/op 9.416 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-100000-12 447 2341173 ns/op 9.456 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-1000000-12 42 24926928 ns/op 9.468 compression-rate BenchmarkMarshalUnmarshalMetricIDs/numMetricIDs-10000000-12 5 204098872 ns/op 9.467 compression-rate PASS ok github.com/VictoriaMetrics/VictoriaMetrics/lib/storage 15.018s ``` ### Checklist The following checks are **mandatory**: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Artem Fetishev <wwctrsrx@gmail.com> Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-09-20 15:21:53 +00:00
func BenchmarkMarshalUnmarshalMetricIDs(b *testing.B) {
rng := rand.New(rand.NewSource(1))
f := func(b *testing.B, numMetricIDs int) {
metricIDs := make([]uint64, numMetricIDs)
// metric IDs need to be sorted.
ts := uint64(time.Now().UnixNano())
for i := range numMetricIDs {
metricIDs[i] = ts + uint64(rng.Intn(100))
}
var marshalledLen int
b.ResetTimer()
for range b.N {
marshalled := marshalMetricIDs(nil, metricIDs)
marshalledLen = len(marshalled)
_ = mustUnmarshalMetricIDs(nil, marshalled)
}
b.StopTimer()
compressionRate := float64(numMetricIDs*8) / float64(marshalledLen)
b.ReportMetric(compressionRate, "compression-rate")
}
for _, n := range []int{0, 1, 10, 100, 1e3, 1e4, 1e5, 1e6, 1e7} {
b.Run(fmt.Sprintf("numMetricIDs-%d", n), func(b *testing.B) {
f(b, n)
})
}
}