Commit graph

2156 commits

Author SHA1 Message Date
Nikolay
fac272bc10
lib/vmselectapi: do not send empty label names for labelNames request (#4936)
* lib/vmselectapi: do not send empty label names for labelNames request
it breaks cluster communication, since vmselect incorrectly reads request buffer, leaving unread data on it
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4932

* typo fix

* wip

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-09-01 23:24:51 +02:00
Aliaksandr Valialkin
d8afd7fe98
Makefile: update golangci-lint from v1.51.2 to v1.54.2
See https://github.com/golangci/golangci-lint/releases/tag/v1.54.2
2023-09-01 10:25:49 +02:00
Dima Lazerka
1d60c236b1
Add flagutil.Duration to avoid conversion bugs (#4835)
* Introduce flagutil.Duration

To avoid conversion bugs

* Fix tests

* Comment why not .Seconds()
2023-09-01 09:30:30 +02:00
Nikolay
ae85b20c5b
lib/promscrape/k8s_sd: set resourceVersion to 0 by default for watch … (#4901)
* lib/promscrape/k8s_sd: set resourceVersion to 0 by default for watch requests
it must reduce load for kubernetes ETCD servers. Since requests without resourceVersion performs force cache sync at kubernetes API server with ETCD
more info at https://kubernetes.io/docs/reference/using-api/api-concepts/\#semantics-for-watch
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4855

* wip

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-08-30 16:04:14 +02:00
Aliaksandr Valialkin
3a2d035283
lib/auth: add NewTokenPossibleMultitenant() for parsing auth token, which can be multitenant
Disallow parsing multitenant token at auth.NewToken().

Use auth.NewTokenPossibleMultitenant() at vminsert only. All the other callers should call auth.NewToken(),
since they do not support multitenant token.

This is a follow-up for f0c06b428e

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4910
2023-08-30 14:13:51 +02:00
Aliaksandr Valialkin
b56294c174
lib/proxy: consistently use gopkg.in/yaml.v2 across all the code 2023-08-29 13:16:20 +02:00
Aliaksandr Valialkin
339879edd0
lib/netutil/tcpdialer.go: reduce the code difference with enterprise branch 2023-08-29 12:45:57 +02:00
Aliaksandr Valialkin
19d61737c1
app/{vminsert,vmselect}: follow-up after 2b7b3293c1
- Document the change at docs/CHANGELOG.md
- Set the default value for -vmstorageUserTimeout to 3 seconds. This is much better
  than the 0 value, which means that TCP connection to unreachable vmstorage could block
  for up to 16 minutes.
- Document -vmstorageUserTimeout at docs/Cluster-VictoriaMetrics.md
2023-08-29 12:17:39 +02:00
Will Jordan
2b7b3293c1
Add vmstorageUserTimeout flags to configure TCP user timeout (Linux) (#4423)
`TCP_USER_TIMEOUT` (since Linux 2.6.37) specifies the maximum amount of
time that transmitted data may remain unacknowledged before TCP will
forcibly close the connection and return `ETIMEDOUT` to the application.

Setting a low TCP user timeout allows RPC connections quickly reroute
around unavailable storage nodes during network interruptions.
2023-08-29 11:46:39 +02:00
crossoverJie
64b27c9217
lib/logstorage: Set ptwHot to nil when the partition pointed by ptwHot is dropped (#4902) 2023-08-29 11:22:53 +02:00
hagen1778
d70b346623
lib/promscrape: follow-up after eabcfc9bcd
`-promscrape.cluster.membersCount` by default should be `1`, like every
single vmagent is a cluster of one member on its own.
The change additionally validates that user can't set `-promscrape.cluster.membersCount`
to value lower than `1`.

eabcfc9bcd
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-08-29 11:18:12 +02:00
Haleygo
685c3d95e7
fix clusterMembersCount check (#4900) 2023-08-29 11:15:50 +02:00
Aliaksandr Valialkin
1e1aa94ffb
lib/logstorage: eliminate data race when clearing s.ptwHot after deleting the corresponding partition
The previous code could result in the following data race:
1. The s.ptwHot partition is marked to be deleted
2. ptw.decRef() is called on it
3. ptw.pt is set to nil
4. s.ptwHot.pt is accessed from concurrent goroutine, which leads to panic.

The change clears s.ptwHot under s.partitionsLock in order to prevent from the data race.

This is a follow-up for 8d50032dd6

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4895
2023-08-29 11:12:07 +02:00
crossoverJie
db0ae3fffb
lib/logstorage: add nil check for ptwHot.pt (#4896)
(cherry picked from commit cde5029bce)
2023-08-27 09:14:28 +02:00
Zakhar Bessarab
46e86add2f
lib/promscrape/client: sync timeout for HostClient and http.Client (#4889)
Initially, stream parse mode was reading data from response and parsing it on flight. This was causing longer delay to read the whole response and required increasing timeout value to allow data processing while reading. So that 908e35affd increased timeout value to fix this.

But after 74c00a8762 response in stream parse mode is saved into memory and then parsed eliminating necessity of having timeout value higher that for usual scrape.

Updates: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4847
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
(cherry picked from commit 6e8611f301)
2023-08-27 09:06:00 +02:00
hagen1778
b18e9b5bb0
app/vmagent: follow-up after 6788704152
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4884
Signed-off-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 4ebe8bb1d5)
2023-08-27 09:05:22 +02:00
Zakhar Bessarab
1242460fa6
lib/promscrape/client: make User-Agent consistent between fasthttp and native client (#4886)
User agent was not set for native client which resulted in using one provided by Golang.

See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4884

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
(cherry picked from commit 6788704152)
2023-08-27 09:05:08 +02:00
Aliaksandr Valialkin
c813b5e4b1
lib/promscrape: add -promscrape.cluster.memberLabel command-line flag
This flag allows specifying an additional label to add to all the scraped metrics.
The flag must contain label name to add. The label value will be equal to -promscrape.cluster.memberNum.

This functionality can help when there is a need to differentiate metrics scraped
by distinct vmagent instances in the cluster according to https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4247

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4247#issuecomment-1692279393
2023-08-24 22:04:34 +02:00
Nikolay
1a943bb16a
lib/storage: properly caclucate nextRotationTimestamp (#4874)
cause of typo unix millis was used instead of unix for current timestamp
calculation
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4873

(cherry picked from commit c5aac34b68)
2023-08-23 13:29:32 +02:00
Dmytro Kozlov
1929d3bca9
lib/protoparser: handle unexpected EOF error when parsing lines in prometheus exposition format (#4851)
Previously only io.EOF was handled, and io.ErrUnexpectedEOF was ignored, but it may happen if the client interrupts the connection.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4817
(cherry picked from commit b7d07e5acf)
2023-08-18 08:56:27 +02:00
Aliaksandr Valialkin
67dd975be5
lib/envflag: do not allow unsupported form for boolean command-line flags in the form -boolFlag value
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4845
2023-08-17 13:37:05 +02:00
Aliaksandr Valialkin
4b1f01e45d
lib/promrelabel: properly replace : char with _ in metric names when -usePromCompatibleNaming command-line flag is set
This addresses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3113#issuecomment-1275077071 comment from @johnseekins
2023-08-14 16:18:17 +02:00
Aliaksandr Valialkin
e8fe00d39e
lib/promrelabel: stop emitting DEBUG log lines when parsing if expressions
These lines were accidentally left in the commit 62651570bb

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4635
2023-08-14 16:18:17 +02:00
Aliaksandr Valialkin
bde876f7c9
all: refer to https://docs.victoriametrics.com/#resource-usage-limits in the error message about -search.max* limit
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4827
2023-08-14 02:02:12 -07:00
Aliaksandr Valialkin
1361239393
app/vmbackup: add ability to make server-side copying of existing backups 2023-08-13 17:26:26 -07:00
Nikolay
bb2885d57d
lib/protoparser/openetelemetry: fixes panic (#4821)
Opentelemetry format allows histograms with non-counter buckets. In this case it makes no sense to add buckets into database
and save only counter with _count suffix.
It could be used as gauge.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4814

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-08-12 05:10:05 -07:00
Nikolay
89fcb7baf0
lib/promscrape: adds validation for proxy_url scheme (#4823)
* lib/promscrape: adds validation for proxy_url scheme
adds tests
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4811

* Update lib/proxy/proxy.go

* Update lib/proxy/proxy.go

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-08-12 05:05:56 -07:00
Aliaksandr Valialkin
0ee8a9120a
lib/flagutil: add defaultValue arg to NewArray{Int,Bytes,Duration} functions
The defaultValue is printed in the flag description when passing -help to the app.

This is a follow-up for aef31f201a

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4776
2023-08-12 04:19:34 -07:00
Zakhar Bessarab
15b1810dc8
lib/promrelabel: fix relabeling if clause (#4816)
* lib/promrelabel: fix relabeling if clause being applied to labels outside of current context

Relabeling is applied to each metric row separately, but in order to lower amount of memory allocations it is reusing labels.

Functions which are working on current metric row labels are supposed to use only current metric labels by using provided offset, but if clause matcher was using the whole labels set instead of local metrics.

This leaded to invalid relabeling results such as one described here: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4806

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* docs/CHANGELOG.md: document the bugfix

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1998
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4806

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-08-11 06:44:46 -07:00
Aliaksandr Valialkin
fa400f83b6
lib/httpserver: properly quote the returned address from GetQuotedRemoteAddr() for requests with X-Forwarded-For header
Make sure that the quoted address can be used as JSON string.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4676#issuecomment-1663203424

This is a follow up for 252643d100 and ac0b7e0421

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4676
2023-08-11 05:47:28 -07:00
Zakhar Bessarab
9efc077214
vmbackupmanager: fixes for windows compatibility (#641)
* app/vmbackupmanager/storage: fix path join for windows

See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4704

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/backup: fixes for windows support

- close dir before running os.RemoveAll. Windows FS does not allow to delete directory before all handles will be closed.

- add path "normalization" for local FS to use the same format of paths for both *unix and Windows

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4704

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-08-11 05:43:28 -07:00
Aliaksandr Valialkin
8b4bf5d269
app/vlstorage: expose vl_data_size_bytes metric at /metrics page for tracking the on-disk data size (both indexdb and the data itself) 2023-07-31 07:56:16 -07:00
Aliaksandr Valialkin
3e62c71e8c
lib/promscrape: add a comment why honor_timestamps is set to false by default
This should prevent from returning it back to true in the future

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4697
2023-07-28 21:36:55 -07:00
Aliaksandr Valialkin
ee98f9ae66
lib/promscrape: use local scrape timestamp for scraped metrics unless honor_timestamps: true is set explicitly
This fixes the case with gaps for metrics collected from cadvisor,
which exports invalid timestamps, which break staleness detection at VictoriaMetrics side.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4697 ,
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4697#issuecomment-1654614799
and https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4697#issuecomment-1656540535

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1773
2023-07-28 21:11:46 -07:00
Aliaksandr Valialkin
89ccf19b70
lib/storage: update nextRotationTimestamp relative to the timestamp of the indexdb rotation
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4563
2023-07-28 19:48:42 -07:00
Roman Khavronenko
303d3616ec
vmalert: revert unittest feature (#4734)
* Revert "vmalert: unittest support stale datapoint (#4696)"

This reverts commit 0b44df7ec8.

* Revert "docs: specify min version and limitations for vmalert's unit tests"

This reverts commit a24541bd

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Revert "vmalert: init unit test (#4596)"

This reverts commit da60a68d

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* docs: mention unittest revert in changelog

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 9f1b9b86cc)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-28 11:09:52 +02:00
Aliaksandr Valialkin
1f30f53df2
lib/promscrape/discovery: close unused HTTP connections to service discovery servers
This should prevent from connection leaks

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4724
2023-07-27 14:47:55 -07:00
Nikolay
85de94e85c
lib/protoparser: adds opentelemetry parser (#2570)
* lib/protoparser: adds opentelemetry parser
app/{vmagent,vminsert}: adds opentelemetry ingestion path

Adds ability to ingest data with opentelemetry protocol
protobuf and json encoding is supported
data converted into prometheus protobuf timeseries
each data type has own converter and it may produce multiple timeseries
from single datapoint (for summary and histogram).
only cumulative aggregationFamily is supported for sum(prometheus
counter) and histogram.

Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

updates deps

fixes tests

wip

wip

wip

wip

lib/protoparser/opentelemetry: moves to vtprotobuf generator

go mod vendor

lib/protoparse/opentelemetry: reduce memory allocations

* wip

- Remove support for JSON parsing, since it is too fragile and is rarely used in practice.
  The most clients send OpenTelemetry metrics in protobuf.
  The JSON parser can be added in the future if needed.
- Remove unused code from lib/protoparser/opentelemetry/pb and lib/protoparser/opentelemetry/proto
- Do not re-use protobuf message between ParseStream() calls, since there is high chance
  of high fragmentation of the re-used message because of too complex nested structure of the message.

* wip

* wip

* wip

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-07-27 13:37:15 -07:00
Alexander Marshalov
74237ce5c0
fixed label values decoding for pushgateway compatibility (#4727)
Fixed decoding of label values with slash for pushgateway and prometheus golang client compatibility + added some tests. (#4962)
2023-07-27 13:03:48 -07:00
Aliaksandr Valialkin
3179d0528b
lib/promrelabel: return correct string representation for IfExpression containing a single selector
This is a follow-up for 62651570bb

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4635
2023-07-24 19:33:04 -07:00
Aliaksandr Valialkin
6e43664e24
lib/promrelabel: add support for a list of series selectors at IfExpression
This makes possible specifying a list of series selectors at the following places:

- Inside `if` option at relabeling rules
- Inside `match` option at stream aggregation rules

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4635
2023-07-24 17:09:59 -07:00
Aliaksandr Valialkin
c049778ad1
lib/streamaggr: follow-up for 736197179e
- Use a byte slice instead of a map for tracking indexes for matching series.
  This improves performance, since access by slice index is faster than access by map key.
- Re-use the byte slice for tracking indexes for matching series.
  This removes unnecessary memory allocations and improves stream aggregation performance a bit.
- Add an ability to return to the previous behvaiour by specifying -remoteWrite.streamAggr.dropInput command-line flag.
  In this case all the input samples are dropped when stream aggregation is enabled.
- Backport the new stream aggregation behaviour from vmagent to single-node VictoriaMetrics when -streamAggr.config
  option is set.
- Improve docs regarding this change at docs/CHANGELOG.md
- Document the new behavior at docs/stream-aggregation.md

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4243
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4575
2023-07-24 17:06:09 -07:00
Zakhar Bessarab
470afac5ff
{lib/streamaggr,vmagent/remotewrite}: breaking change for keepInput flag (#4575)
* {lib/streamaggr,vmagent/remotewrite}: breaking change for keepInput flag

Changes default behaviour of keepInput flag to write series which did not match any aggregators to the remote write.
See: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4243

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* Update app/vmagent/remotewrite/remotewrite.go

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-07-24 16:34:38 -07:00
Aliaksandr Valialkin
9d14c29667
lib/streamaggr: skip de-duplication for series, which do not match the configured aggregation rules
Previously all the incoming samples were de-duplicated, even if their series doesn't
match aggregation rule filters. This could result in increased CPU usage.

Now the de-duplication isn't applied to samples for series, which do not match
aggregation rule filters. Such samples are just ignored.
2023-07-22 16:46:17 -07:00
Nikolay
30b32583f4
lib/storage: pre-create timeseries before indexDB rotation (#4652)
* lib/storage: pre-create timeseries before indexDB rotation
during an hour before indexDB rotation start creating records at the next indexDB
it must improve performance during switch for the next indexDB and remove ingestion issues.
Since there is no need for creation new index records for timeseries already ingested into current indexDB
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4563

* lib/storage: further work on indexdb rotation optimization

- Document the change at docs/CHAGNELOG.md
- Move back various caches from indexDB to Storage. This makes the change less intrusive.
  The dateMetricIDCache now takes into account indexDB generation, so it stores (date, metricID)
  entries for both the current and the next indexDB.
- Consolidate the code responsible for idbNext pre-filling into prefillNextIndexDB() function.
  This improves code readability and maintainability a bit.
- Rewrite and simplify the code responsible for calculating the next retention timestamp.
  Add various tests for corner cases of this code.
- Remove indexdb pre-filling from RegisterMetricNames() function, since this function is rarely called.
  It is OK to add indexdb entries on demand in this function. This simplifies the code.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401

* docs/CHANGELOG.md: refer to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4563

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-07-22 15:23:14 -07:00
Aliaksandr Valialkin
1ce82f874c
lib/streamaggr: follow up for 70773f53d7
- Round staleness_interval durations to the upper number of seconds.
  This should prevent from under-calculations for fractional staleness intervals.
- Rename stalenessInterval field at *AggrState structs into stalenessSecs, since it holds seconds.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4667
2023-07-20 21:56:36 -07:00
Aliaksandr Valialkin
dcf5b42670
lib/encoding/zstd: switch back from atomic.Pointer to atomic.Value for map[...]...
The map[...]... is already a pointer type, so atomic.Pointer[map[...]...] results in double pointer.

This is a follow-up for 140e7b6b74
2023-07-20 21:54:51 -07:00
Aliaksandr Valialkin
324a3c5288
lib/promscrape: follow-up after 6aa50ca954
- Improve docs
- Hide `debug relabeling` column when -promscrape.dropOriginalLabels command-line flag is set
- Inline the code from the added template functions, since the code is harder to follow
  with the template functions, especially when these functions have misleading names.
  Also, these functions are used only in one place, e.g. they do not reduce the amounts of code.
- Hide `click to show original labels` title at `labels` column when original labels aren't available.
- Show the reason on whey original labels aren't available at /service-discovery page.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4597
2023-07-20 21:54:09 -07:00
Aliaksandr Valialkin
30098ac8bd
app/vlinsert/loki: follow-up after 09df5b66fd
- Parse protobuf if Content-Type isn't set to `application/json` - this behavior is documented at https://grafana.com/docs/loki/latest/api/#push-log-entries-to-loki

- Properly handle gzip'ped JSON requests. The `gzip` header must be read from `Content-Encoding` instead of `Content-Type` header

- Properly flush all the parsed logs with the explicit call to vlstorage.MustAddRows() at the end of query handler

- Check JSON field types more strictly.

- Allow parsing Loki timestamp as floating-point number. Such a timestamp can be generated by some clients,
  which store timestamps in float64 instead of int64.

- Optimize parsing of Loki labels in Prometheus text exposition format.

- Simplify tests.

- Remove lib/slicesutil, since there are no more users for it.

- Update docs with missing info and fix various typos. For example, it should be enough to have `instance` and `job` labels
  as stream fields in most Loki setups.

- Allow empty of missing timestamps in the ingested logs.
  The current timestamp at VictoriaLogs side is then used for the ingested logs.
  This simplifies debugging and testing of the provided HTTP-based data ingestion APIs.

The remaining MAJOR issue, which needs to be addressed: victoria-logs binary size increased from 13MB to 22MB
after adding support for Loki data ingestion protocol at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4482 .
This is because of shitty protobuf dependencies. They must be replaced with another protobuf implementation
similar to the one used at lib/prompb or lib/prompbmarshal .
2023-07-20 21:52:11 -07:00
Alexander Marshalov
9ba03b4838
allow configuring staleness interval in stream aggregation (#4667) (#4670)
---------

Signed-off-by: Alexander Marshalov <_@marshalov.org>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-07-20 21:47:29 -07:00
Haleygo
939c8b8372
vmalert: init unit test (#4596)
vmalert: support unit tests

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2945
---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2023-07-20 21:19:45 -07:00
Dmytro Kozlov
f0d8f77e6d
app/vmagent: fix creating target id if --promscrape.dropOriginalLabels flag was used (#4616)
* app/vmagent: fix creating target id if `--promscrape.dropOriginalLabels` flag was used

* app/vmagent: hide links if OriginalLabels was dropped

* app/vmagent: update CHANGELOG.md and added information to the docs

* app/vmagent: fix comments
2023-07-20 19:21:41 -07:00
Zakhar Bessarab
5b3cbd4db1
app/vlinsert: add support of loki push protocol (#4482)
* app/vlinsert: add support of loki push protocol

- implemented loki push protocol for both Protobuf and JSON formats
- added examples in documentation
- added example docker-compose

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vlinsert: move protobuf metric into its own file

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* deployment/docker/victorialogs/promtail: update reference to docker image

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* deployment/docker/victorialogs/promtail: make volume name unique

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vlinsert/loki: add license reference

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* deployment/docker/victorialogs/promtail: fix volume name

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* docs/VictoriaLogs/data-ingestion: add stream fields for loki JSON ingestion example

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vlinsert/loki: move entities to places where those are used

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vlinsert/loki: refactor to use common components

- use CommonParameters from insertutils
- stop ingestion after first error similar to elasticsearch and jsonline

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vlinsert/loki: address review feedback

- add missing logstorage.PutLogRows calls
- refactor tenant ID parsing to use common function
- reduce number of allocations for parsing by reusing  logfields slices
- add tests and benchmarks for requests processing funcs

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-07-20 16:49:43 -07:00
Aliaksandr Valialkin
992c300ce9
all: replace atomic.Value with atomic.Pointer[T]
This eliminates the need in .(*T) casting for results obtained from Load()

Leave atomic.Value for map, since atomic.Pointer[map[...]...] makes double pointer to map,
because map is already a pointer type.
2023-07-19 17:48:26 -07:00
Yury Molodov
3ad80e281f
vmui: add Active Queries page (#4653)
* feat: add page to display a list of active queries (#4598)

* app/vmagent: code formatting

* fix: remove console

---------

Co-authored-by: dmitryk-dk <kozlovdmitriyy@gmail.com>
2023-07-19 16:02:58 -07:00
Roman Khavronenko
80768d53dd
docs: follow-up after aec4b5db81 (#4638)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-19 14:48:17 -07:00
Aliaksandr Valialkin
5819d4e6f7
lib/logstorage: properly encode "offset" search word just after _time filter 2023-07-18 16:03:57 -07:00
Aliaksandr Valialkin
da2ef397fa
lib/logstorage: add abilty to speficy offset for the selected _time filter
The following syntax is supported: _time:filter offset off
For example:

- _time:5m offset 1h - 5-minute duration one hour before the current time
- _time:2023 offset 2w - 2023 year with the 2 weeks offset in the past
2023-07-17 19:07:14 -07:00
Aliaksandr Valialkin
e1f7e0b455
lib/logstorage: log the -retentionPeriod and -futureRetention values when the ingested log entry has timestamp outside the configured retention
This should simplify debugging
2023-07-17 18:23:45 -07:00
Aliaksandr Valialkin
6751a08071
lib/logstorage: support for short form of _time:(now-duration, now] filter: _time:duration 2023-07-17 18:23:43 -07:00
Aliaksandr Valialkin
8fdfd13a29
lib/logstorage: LogsQL: replace exact_prefix("...") with exact("..."*)
This makes LogsQL queries more consistent with i("...") and i("..."*) syntax
2023-07-17 17:19:45 -07:00
Aliaksandr Valialkin
5ace0701d3
app/vmselect/promql: add the ability to copy all the labels from one side of group_left()/group_right() operation
This is performed by specifying `*` inside group_left()/group_right().
Also allow specifying prefix for the copied labels via `group_left(...) prefix "..."` and `group_right(...) prefix "..."` syntax.
For example, the following query adds all the namespace-related labels to pod info, and prefixes all the copied label names with "ns_" prefix:

  kube_pod_info * on(namespace) group_left(*) prefix "ns_" kube_namespace_labels

This resolves the following StackOverflow questions:

- https://stackoverflow.com/questions/76661818/how-to-add-namespace-labels-to-pod-labels-in-prometheus
- https://stackoverflow.com/questions/76653997/how-can-i-make-a-new-copy-of-kube-namespace-labels-metric-with-a-different-name
2023-07-17 16:58:30 -07:00
Aliaksandr Valialkin
a7fdc3fcc7
all: add support for or filters in series selectors
This commit adds ability to select series matching distinct filters via a single series selector.
For example, the following selector selects series with either {env="prod",job="a"}
or {env="dev",job="b"} labels:

  {env="prod",job="a" or env="dev",job="b"}

The `or` filter is supported in all the VictoriaMetrics tools now.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3997
Uses https://github.com/VictoriaMetrics/metricsql/pull/14
2023-07-15 23:56:18 -07:00
Aliaksandr Valialkin
3d23fd9853
lib/storage: move series registration in caches from createAllIndexesForMetricName into a separate function - putSeriesToCache
This makes the code more clear and easier to read

This is a follow-up for 7094fa38bc
2023-07-13 23:17:14 -07:00
Aliaksandr Valialkin
4b86522f4c
lib/mergeset: skip common prefix in binarySearchKey() function
This should improve performance a bit when the search if performed among items with long common prefix
2023-07-13 22:05:14 -07:00
Aliaksandr Valialkin
203a436066
lib/storage: optimize BenchmarkIndexDBGetTSIDs()
- Sort MetricName tags only once before the benchmark loop.
- Obtain indexSearch per each benchmark loop in order to give a chance for background merge
  for the recently created parts
2023-07-13 21:49:54 -07:00
Aliaksandr Valialkin
fbddb4ad32
lib/storage: typo fix after e1cf962bad
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401
2023-07-13 21:29:02 -07:00
Aliaksandr Valialkin
7d359d17d1
lib/storage: properly free up resources from newTestStorage() by calling stopTestStorage() 2023-07-13 17:13:34 -07:00
Aliaksandr Valialkin
e1cf962bad
lib/storage: switch from global to per-day index for MetricName -> TSID mapping
Previously all the newly ingested time series were registered in global `MetricName -> TSID` index.
This index was used during data ingestion for locating the TSID (internal series id)
for the given canonical metric name (the canonical metric name consists of metric name plus all its labels sorted by label names).

The `MetricName -> TSID` index is stored on disk in order to make sure that the data
isn't lost on VictoriaMetrics restart or unclean shutdown.

The lookup in this index is relatively slow, since VictoriaMetrics needs to read the corresponding
data block from disk, unpack it, put the unpacked block into `indexdb/dataBlocks` cache,
and then search for the given `MetricName -> TSID` entry there. So VictoriaMetrics
uses in-memory cache for speeding up the lookup for active time series.
This cache is named `storage/tsid`. If this cache capacity is enough for all the currently ingested
active time series, then VictoriaMetrics works fast, since it doesn't need to read the data from disk.

VictoriaMetrics starts reading data from `MetricName -> TSID` on-disk index in the following cases:

- If `storage/tsid` cache capacity isn't enough for active time series.
  Then just increase available memory for VictoriaMetrics or reduce the number of active time series
  ingested into VictoriaMetrics.

- If new time series is ingested into VictoriaMetrics. In this case it cannot find
  the needed entry in the `storage/tsid` cache, so it needs to consult on-disk `MetricName -> TSID` index,
  since it doesn't know that the index has no the corresponding entry too.
  This is a typical event under high churn rate, when old time series are constantly substituted
  with new time series.

Reading the data from `MetricName -> TSID` index is slow, so inserts, which lead to reading this index,
are counted as slow inserts, and they can be monitored via `vm_slow_row_inserts_total` metric exposed by VictoriaMetrics.

Prior to this commit the `MetricName -> TSID` index was global, e.g. it contained entries sorted by `MetricName`
for all the time series ever ingested into VictoriaMetrics during the configured -retentionPeriod.
This index can become very large under high churn rate and long retention. VictoriaMetrics
caches data from this index in `indexdb/dataBlocks` in-memory cache for speeding up index lookups.
The `indexdb/dataBlocks` cache may occupy significant share of available memory for storing
recently accessed blocks at `MetricName -> TSID` index when searching for newly ingested time series.

This commit switches from global `MetricName -> TSID` index to per-day index. This allows significantly
reducing the amounts of data, which needs to be cached in `indexdb/dataBlocks`, since now VictoriaMetrics
consults only the index for the current day when new time series is ingested into it.

The downside of this change is increased indexdb size on disk for workloads without high churn rate,
e.g. with static time series, which do no change over time, since now VictoriaMetrics needs to store
identical `MetricName -> TSID` entries for static time series for every day.

This change removes an optimization for reducing CPU and disk IO spikes at indexdb rotation,
since it didn't work correctly - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 .

At the same time the change fixes the issue, which could result in lost access to time series,
which stop receving new samples during the first hour after indexdb rotation - see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698

The issue with the increased CPU and disk IO usage during indexdb rotation will be addressed
in a separate commit according to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401#issuecomment-1553488685

This is a follow-up for 1f28b46ae9
2023-07-13 17:03:50 -07:00
Aliaksandr Valialkin
1bce67df06
lib/storage: fix possible test failure in TestStorageAddRowsConcurrent
The number of parts in the snapshot partition may be zero if concurrent goroutine just
started creating new partition, but didn't put data into it yet when the current
goroutine made a snapshot.
2023-07-13 15:03:51 -07:00
Aliaksandr Valialkin
733032e514
lib/mergeset: simplify fulsuhInmemoryParts() a bit 2023-07-13 12:33:43 -07:00
Dmytro Kozlov
3d0f846a79
lib/logstorage: fix panic (#4620) 2023-07-13 12:04:59 -07:00
Aliaksandr Valialkin
d8b8fc0343
lib/logstorage: fix TestValuesEncoder() on 32-bit architectures 2023-07-13 11:28:04 -07:00
Zakhar Bessarab
ddd918b93c
docs: make httpAuth.* flags description less ambiguous (#4588)
* docs: make `httpAuth.*` flags description less ambiguous

Currently, it may confuse users whether `httpAuth.*` flags are used by HTTP client or server configuration(see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4586 for example).

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* docs: fix a typo

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-07-09 12:36:14 -07:00
Aliaksandr Valialkin
eea088d87f
docs/CHANGELOG.md: clarify description for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336 bugfix
This is a follow-up for 5eb5df96e2
2023-07-06 22:42:02 -07:00
Alexander Marshalov
eb611c3dc3
fix removing storage data dir before restoring from backup (#598)
* fix removing storage data dir before restoring from backup

Signed-off-by: Alexander Marshalov <_@marshalov.org>

* fix review comment

Signed-off-by: Alexander Marshalov <_@marshalov.org>

* fix review comment

Signed-off-by: Alexander Marshalov <_@marshalov.org>

* fixes after merge with `enterprise-single-node` branch

Signed-off-by: Alexander Marshalov <_@marshalov.org>

---------

Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-07-06 22:32:12 -07:00
Aliaksandr Valialkin
eda26a8352
lib/backup/actions: remove misleading comment about the default value for Concurrency field 2023-07-06 22:31:40 -07:00
Aliaksandr Valialkin
ebd08cd822
lib/logstorage: go fmt 2023-07-06 22:24:18 -07:00
Aliaksandr Valialkin
5a12a518a3
lib/logstorage: fix make test-pure tests 2023-07-06 22:22:08 -07:00
Aliaksandr Valialkin
f2f9532fa5
lib/httputils: fix test after b49d04b3dc
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4459
2023-07-06 22:21:43 -07:00
Haleygo
b029286298
fix parse for invalid partial RFC3339 format (#4539)
The validation was needed for covering corner cases when storage is tested with data from 1970.
This resulted into unexpected search results, as year was parsed incorrectly from the given timestamp.


Co-authored-by: hagen1778 <roman@victoriametrics.com>
2023-07-06 22:09:35 -07:00
Alexander Marshalov
677c8a5465
show backup progress percentage in vmbackup log during backup uploading and restoring progress percentage in vmrestore log during backup downloading (#4460) (#4530)
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-07-06 21:56:54 -07:00
Aliaksandr Valialkin
a9eb2409ea
app/vlstorage: export vl_active_merges and vl_merges_total metrics 2023-07-06 21:38:09 -07:00
Aliaksandr Valialkin
08634ae612
app/vlinsert/jsonline: code prettifying 2023-07-06 21:35:55 -07:00
Aliaksandr Valialkin
efee71986f
app/vlselect/logsql: sort query results by _time if their summary size doesnt exceed -select.maxSortBufferSize 2023-07-06 21:25:00 -07:00
Aliaksandr Valialkin
1c39af56ab
app/victoria-logs: add ability to debug data ingestion by passing debug query arg to data ingestion API 2023-07-06 21:19:58 -07:00
Aliaksandr Valialkin
374890294e
app/victoria-logs: initial code release 2023-07-06 17:30:05 -07:00
Aliaksandr Valialkin
de574e7128
lib/storage: do not create flock.lock files at partition directories, since it is created at the Storage level 2023-07-06 17:26:37 -07:00
Aliaksandr Valialkin
833a0e25a7
lib/netutil: ignore arificial timeout generated by net/http.Server
This prevents from the inflated vm_tcplistener_read_timeouts_total counter
2023-07-06 17:26:15 -07:00
Aliaksandr Valialkin
115667df82
lib/mergeset: do not create flock.lock file at mergeset table, since it is created at the lib/storage.Storage level 2023-07-06 17:25:45 -07:00
Aliaksandr Valialkin
ed5f4a0c5a
lib/fs: add ReaderAt.Path() function
This function is going to be used in VictoriaLogs
2023-07-06 17:25:19 -07:00
Aliaksandr Valialkin
4c80193a86
lib/encoding: add MarshalBool/UnmarshalBool and GetUint32s/PutUint32s functions
These functions are going to be used by VictoriaLogs
2023-07-06 17:24:52 -07:00
Aliaksandr Valialkin
d01f0a89db
lib/cgroup: add SetGOGC() function
This function is going to be used by VictoriaLogs
2023-07-06 17:24:31 -07:00
Aliaksandr Valialkin
af6c14d5e7
lib/bytesutil: substitute parentheses with slashes in ByteBuffer.Path() output, so it can be passed to path manipulating functions
This is needed for the upcoming VictoriaLogs
2023-07-06 17:23:52 -07:00
Aliaksandr Valialkin
427ce69426
app/vmselect: move common http functionality from app/vmselect/searchutils to lib/httputils
While at it, move app/vmselect/bufferedwriter to lib/bufferedwriter, since it is going to be used in VictoriaLogs
2023-07-06 17:22:23 -07:00
Aliaksandr Valialkin
46210c4d5e
lib/promutils.ParseTime(): add support for timestamps in milliseconds
See https://stackoverflow.com/questions/76437098/how-to-handle-time-unit-and-step-while-ingesting-or-querying-in-victoriametrics/76438405

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4459
2023-07-06 17:11:54 -07:00
Nikolay
dd7ebd6779
lib/storage: creates parts.json on start-up if it not exists. (#4450)
* lib/storage: creates parts.json on start-up if it not exists.
It fixes migrations from versions below v1.90.0.
Previously parts.json was created only after successful merge.
But if merge was interruped for some reason (OOM or shutdown), parts.json wasn't created and partitions left after interruped merge weren't properly deleted.
Since VM cannot check if it must be removed or not.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336

* Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* Update lib/storage/partition.go

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

---------

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-07-06 17:10:26 -07:00
Roman Khavronenko
09c05608f2
lib/storage: add comment for how mustBeDeleted field should be used (#4454)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-06 17:02:44 -07:00
Roman Khavronenko
897d17a5b3
lib/mergeset: add comment for how mustBeDeleted field should be used (#4449)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-06 17:00:55 -07:00
Alexander Marshalov
4084dba9e4
fixed service name detection for consulagent service discovery in case of a difference in service name and service id (#4390) (#4439)
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-07-06 16:53:29 -07:00