Commit graph

2758 commits

Author SHA1 Message Date
Aliaksandr Valialkin
bf243df9ce
lib/logstorage: make sure that the number of output (bloom, values) shards is bigger than zero.
If the number of output (bloom, values) shards is zero, then this may lead to panic
as shown at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7391 .

This panic may happen when parts with only constant fields with distinct values are merged into
output part with non-constant fields, which should be written to (bloom, values) shards.

(cherry picked from commit 102e9d4f4e)
2024-10-30 15:19:51 +01:00
cangqiaoyuzhuo
07cf3189f8
chore: fix function name (#7381)
### Describe Your Changes

 fix function name

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

(cherry picked from commit 45896fb477)
2024-10-30 13:13:05 +01:00
Aliaksandr Valialkin
1dd01b8a8f
lib/logstorage: follow-up for af831a6c906158f371f1b6810706fa0a54b78386
Sync the code between top and sort pipes regarding the code related to rank.

(cherry picked from commit 7a623c225f)
2024-10-30 09:52:52 +01:00
Aliaksandr Valialkin
329d9a46ee
lib/logstorage: add an ability to return rank from top pipe results
(cherry picked from commit 3c06d083ea)
2024-10-30 09:52:51 +01:00
Aliaksandr Valialkin
fe5f16b662
lib/logstorage: dynamically adjust the number of (bloom, values) shards in a part depending on the number of non-const columns
This allows reducing the amounts of data, which must be read during queries over logs with big number of fields (aka "wide events").
This, in turn, improves query performance when the data, which needs to be scanned during the query, doesn't fit OS page cache.

(cherry picked from commit 7a62eefa34)
2024-10-30 09:52:51 +01:00
Aliaksandr Valialkin
76b21c8560
lib/logstorage: avoid reading columnsHeader data when field_values pipe is applied directly to log filters
This improves performance of `field_values` pipe when it is applied to large number of data blocks.
This also improves performance of /select/logsql/field_values HTTP API.

(cherry picked from commit 8d968acd0a)
2024-10-30 09:52:50 +01:00
Hui Wang
9616814728
vmalert: integrate with victorialogs (#7255)
address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6706.
See
https://github.com/VictoriaMetrics/VictoriaMetrics/blob/vmalert-support-vlog-ds/docs/VictoriaLogs/vmalert.md.

Related fix
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7254.

Note: in this pull request, vmalert doesn't support
[backfilling](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/vmalert-support-vlog-ds/docs/VictoriaLogs/vmalert.md#rules-backfilling)
for rules with a customized time filter. It might be added in the
future, see [this
issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7289)
for details.

Feature can be tested with image
`victoriametrics/vmalert:heads-vmalert-support-vlog-ds-0-g420629c-scratch`.

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 68bad22fd2)
2024-10-29 16:32:00 +01:00
Zakhar Bessarab
517bd9392c
lib/storage/partition: prevent panic in case resulting in-memory part is empty after merge (#7329)
It is possible for in-memory part to be empty if ingested samples are
removed by retention filters. In this case, data will not be discarded
due to retention before creating in memory part. After in-memory parts
merge samples will be removed resulting in creating completely empty
part at destination.

 This commit checks for resulting part and skips it, if it's empty.

---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2024-10-27 20:42:42 +01:00
Zhu Jiekun
3d55605ae5
lib/promscrape: adds support for PuppetDB service discovery
This commit adds support for
[PuppetDB](https://www.puppet.com/docs/puppetdb/8/overview.html) service
discovery to the `vmagent` and `victoria-metrics-single` components.

Related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5744
2024-10-27 20:42:42 +01:00
Andrii Chubatiuk
f6f4884ba6
lib/promscrape/discovery/kubernetes: support kubernetes native sidecars (#7324)
This commit adds Kubernetes Native Sidecar support. 

It's the special type of init containers, that have restartPolicy == "Always" and continue to run after container initialization. 


related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7287
2024-10-27 20:25:15 +01:00
Zakhar Bessarab
8198e7241d
lib/mergeset: add sparse indexdb cache (#7269)
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7182

- add a separate index cache for searches which might read through large
amounts of random entries. Primary use-case for this is retention and
downsampling filters, when applying filters background merge needs to
fetch large amount of random entries which pollutes an index cache.
Using different caches allows to reduce effect on memory usage and cache
efficiency of the main cache while still having high cache hit rate. A
separate cache size is 5% of allowed memory.

- reduce size of indexdb/dataBlocks cache in order to free memory for
new sparse cache. Reduced size by 5% and moved this to a separate cache.

- add a separate metricName search which does not cache metric names -
this is needed in order to allow disabling metric name caching when
applying downsampling/retention filters. Applying filters during
background merge accesses random entries, this fills up cache and does
not provide an actual improvement due to random access nature.

Merge performance and memory usage stats before and after the change:

- before

![image](https://github.com/user-attachments/assets/485fffbb-c225-47ae-b5c5-bc8a7c57b36e)

- after

![image](https://github.com/user-attachments/assets/f4ba3440-7c1c-4ec1-bc54-4d2ab431eef5)

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
(cherry picked from commit 837d0d136d)
2024-10-24 12:43:06 -03:00
Artem Fetishev
a722ddbdd9
lib/storage: Fix flaky test: TestStorageRotateIndexDB (#7267)
This commit fixes the TestStorageRotateIndexDB flaky test reported at:
#6977. Sample test failure: https://pastebin.com/bTSs8HP1

The test fails because one goroutine adds items to the indexDB table
while another goroutine is closing that table. This may happen if
indexDB rotation happens twice during one Storage.add() operation:
-  Storage.add() takes the current indexDB and adds index recods to it
- First index db rotation makes the current index DB a previous one
(still ok at this point)
- Second index db rotation removes the indexDB that was current two
rotations earlier. It does this by setting the mustDrop flag to true and
decrementing the ref counter. The ref counter reaches zero which cases
the underlying indexdb table to release its resources gracefully.
Graceful release assumes that the table is not written anymore. But
Storage.add() still adds items to it.

The solution is to increment the indexDB ref counters while it is used
inside add().
The unit test has been changed a little so that the test fails reliably.
The idea is to make add() function invocation to last much longer,
therefore the test inserts not just one record at a time but thouthands
of them.

To see the test fail, just replace the idbsLocked() func with:

```go
unc (s *Storage) idbsLocked2() (*indexDB, *indexDB, func()) {
       return s.idbCurr.Load(), s.idbNext.Load(), func() {}
}
```

---------

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>

(cherry picked from commit 6b9f57e5f7)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-10-24 15:45:54 +02:00
Zhu Jiekun
85f60237e2
vmstorage: auto calculate maxUniqueTimeseries based on resources (#6961)
### Describe Your Changes

Add support for
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6930

Calculate `-search.maxUniqueTimeseries` by
`-search.maxConcurrentRequests` and remaining memory if it's **not set**
or **less equal than 0**.

The remaining memory is affected by `-memory.allowedPercent`,
`-memory.allowedBytes` and cgroup memory limit.
### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2024-10-18 13:41:43 +02:00
Andrii Chubatiuk
1d352b92c7
lib/promscrape: fixed reload on max_scrape_size change (#7282)
### Describe Your Changes

fixed reload on max_scrape_size change
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7260

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 965a33c893)
2024-10-18 11:42:47 +02:00
Aliaksandr Valialkin
62e4baf556
lib/logstorage: use simpler in-memory cache instead of workingsetcache for caching recently ingested _stream values and recently queried set of streams
These caches aren't expected to grow big, so it is OK to use the most simplest cache based on sync.Map.
The benefit of this cache compared to workingsetcache is better scalability on systems with many CPU cores,
since it doesn't use mutexes at fast path.
An additional benefit is lower memory usage on average, since the size of in-memory cache equals
working set for the last 3 minutes.

The downside is that there is no upper bound for the cache size, so it may grow big during workload spikes.
But this is very unlikely for typical workloads.

(cherry picked from commit 0f24078146)
2024-10-18 11:42:16 +02:00
Aliaksandr Valialkin
f9d86a913c
lib/logstorage: do not persist streamIDCache, since it may go out of sync with partition directories, which can be changed manually between VictoriaLogs restarts
Partition directories can be manually deleted and copied from another sources such as backups or other VitoriaLogs instances.
In this case the persisted cache becomes out of sync with partitions. This can result in missing index entries
during data ingestion or in incorrect results during querying. So it is better to do not persist caches.
This shouldn't hurt VictoriaLogs performance just after the restart too much, since its caches usually contain
small amounts of data, which can be quickly re-populated from the persisted data.

(cherry picked from commit 8aa144fa74)
2024-10-18 11:42:16 +02:00
Aliaksandr Valialkin
b9fae4378a
lib/logstorage: consistently use "pHits := m[..]" pattern
Consistency improves maintainability of the code a bit.

(cherry picked from commit 1892e357c3)
2024-10-18 11:42:16 +02:00
Aliaksandr Valialkin
92b9b13df1
lib/logstorage: optimize performance for queries, which select all the log fields for logs containing hundreds of log fields (aka "wide events")
Unpack the full columnsHeader block instead of unpacking meta-information per each individual column
when the query, which selects all the columns, is executed. This improves performance when scanning
logs with big number of fields.

(cherry picked from commit 2023f017b1)
2024-10-18 11:42:15 +02:00
Aliaksandr Valialkin
5d541322c6
lib/logstorage: improve performance of top and field_values pipes on systems with many CPU cores
- Parallelize mering of per-CPU results.
- Parallelize writing the results to the next pipe.

(cherry picked from commit 78c6fb0883)
2024-10-18 11:42:15 +02:00
Aliaksandr Valialkin
cd7823a310
lib/logstorage: optimize 'stats by(...)' calculations for by(...) fields with millions of unique values on multi-CPU systems
- Parallelize merging of per-CPU `stats by(...)` result shards.
- Parallelize writing `stats by(...)` results to the next pipe.

(cherry picked from commit c4b2fdff70)
2024-10-18 11:42:15 +02:00
Aliaksandr Valialkin
1000ae437c
lib/logstorage: optimize performance for top pipe when it is applied to a field with millions of unique values
- Use parallel merge of per-CPU shard results. This improves merge performance on multi-CPU systems.
- Use topN heap sort of per-shard results. This improves performance when results contain millions of entries.

(cherry picked from commit 192c07f76a)
2024-10-18 11:42:15 +02:00
hagen1778
d14cf4b679
docs: follow-up after f0d1db81dc
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-10-17 11:19:03 -03:00
Roman Khavronenko
4114301955
lib/flagutil: rename Duration to RetentionDuration (#7284)
The purpose of this change is to reduce confusion between using
`flag.Duration` and `flagutils.Duration`. The reason is that
`flagutils.Duration` was mistakenly used for cases that required `m`
support. See
ab0d31a7b0

The change in name should clearly indicate the purpose of this data
type.

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-10-17 11:18:45 -03:00
Alexander Frolov
4ed01e3ab8
lib/flagutil: rm misleading minutes support from flagutil.Duration docs (#7066)
### Describe Your Changes

`flagutil.Duration` docs state that `m` suffix stands for `minute`, but
in fact this suffix is not supported due to ambiguity with `month`

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

Signed-off-by: Alexander Frolov <winningpiece@gmail.com>
2024-10-17 11:11:46 -03:00
Zakhar Bessarab
efb7213f6b
lib/flagutil/dict: properly update default value in case there is no key value set (#7211)
### Describe Your Changes

If a dict flag has only one value without a prefix it is supposed to
replace default value.

Previously, when flag was set to `-flag=2` and the default value in
`NewDictInt` was set to 1 the resulting value for any `flag.Get()` call
would be 1 which is not expected.

This commit updates default value for the flag in case there is only one
entry for flag and the entry is a number without a key.

This affects cluster version and specifically `replicationFactor` flag
usage with vmstorage [node
groups](https://docs.victoriametrics.com/cluster-victoriametrics/#vmstorage-groups-at-vmselect).
Previously, the following configuration would effectively be ignored:
```
/path/to/vmselect \
 -replicationFactor=2 \
 -storageNode=g1/host1,g1/host2,g1/host3 \
 -storageNode=g2/host4,g2/host5,g2/host6 \
 -storageNode=g3/host7,g3/host8,g3/host9
```

Changes from this PR will force default value for `replicationFactor`
flag to be set to `2` which is expected as the result of this
configuration.


---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2024-10-17 11:11:45 -03:00
Aliaksandr Valialkin
54ccf09fdd
lib/logstorage: follow-up for 72941eac36
- Allow dropping metrics if the query result contains at least a single metric.
- Allow copying by(...) fields.
- Disallow overriding by(...) fields via `math` pipe.
- Allow using `format` pipe in stats query. This is useful for constructing some labels from the existing by(...) fields.
- Add more tests.
- Remove the check for time range in the query filter according to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7254/files#r1803405826

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7254
2024-10-17 11:09:16 -03:00
Hui Wang
21864de527
victorialogs: add more checks for stats query APIs (#7254)
1. Verify if field in [fields
pipe](https://docs.victoriametrics.com/victorialogs/logsql/#fields-pipe)
exists. If not, it generates a metric with illegal float value "" for
prometheus metrics protocol.
2. check if multiple time range filters produce conflicted query time
range, for instance:
```
query: _time: 5m | stats count(), 
start:2024-10-08T10:00:00.806Z, 
end: 2024-10-08T12:00:00.806Z, 
time: 2024-10-10T10:02:59.806Z
```
must give no result due to invalid final time range.

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2024-10-17 11:09:16 -03:00
Aliaksandr Valialkin
3346576a3a
lib/logstorage: refactor storage format to be more efficient for querying wide events
It has been appeared that VictoriaLogs is frequently used for collecting logs with tens of fields.
For example, standard Kuberntes setup on top of Filebeat generates more than 20 fields per each log.
Such logs are also known as "wide events".

The previous storage format was optimized for logs with a few fields. When at least a single field
was referenced in the query, then the all the meta-information about all the log fields was unpacked
and parsed per each scanned block during the query. This could require a lot of additional disk IO
and CPU time when logs contain many fields. Resolve this issue by providing an (field -> metainfo_offset)
index per each field in every data block. This index allows reading and extracting only the needed
metainfo for fields used in the query. This index is stored in columnsHeaderIndexFilename ( columns_header_index.bin ).
This allows increasing performance for queries over wide events by 10x and more.

Another issue was that the data for bloom filters and field values across all the log fields except of _msg
was intermixed in two files - fieldBloomFilename ( field_bloom.bin ) and fieldValuesFilename ( field_values.bin ).
This could result in huge disk read IO overhead when some small field was referred in the query,
since the Operating System usually reads more data than requested. It reads the data from disk
in at least 4KiB blocks (usually the block size is much bigger in the range 64KiB - 512KiB).
So, if 512-byte bloom filter or values' block is read from the file, then the Operating System
reads up to 512KiB of data from disk, which results in 1000x disk read IO overhead. This overhead isn't visible
for recently accessed data, since this data is usually stored in RAM (aka Operating System page cache),
but this overhead may become very annoying when performing the query over large volumes of data
which isn't present in OS page cache.

The solution for this issue is to split bloom filters and field values across multiple shards.
This reduces the worst-case disk read IO overhead by at least Nx where N is the number of shards,
while the disk read IO overhead is completely removed in best case when the number of columns doesn't exceed N.
Currently the number of shards is 8 - see bloomValuesShardsCount . This solution increases
performance for queries over large volumes of newly ingested data by up to 1000x.

The new storage format is versioned as v1, while the old storage format is version as v0.
It is stored in the partHeader.FormatVersion.

Parts with the old storage format are converted into parts with the new storage format during background merge.
It is possible to force merge by querying /internal/force_merge HTTP endpoint - see https://docs.victoriametrics.com/victorialogs/#forced-merge .
2024-10-17 11:09:16 -03:00
Nikolay
135e3ced8c
lib/storage: properly unmarshal SearchQuery (#7277)
After adding multitenant query feature at v1.104.0, searchQuery wasn't
properly unmarshalled at bottom vmselect in multi-level cluster setup.
It resulted into empty query responses.

This commit adds fallback to Unmarshal method of SearchQuery to fill
TenantTokens. It allows to properly execute search requests
at vmselect side.

Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7270

---------

Signed-off-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2024-10-17 11:45:27 +02:00
Andrii Chubatiuk
019171fdfc
lib/protoparser/influx: enable batch processing by default (#7165)
### Describe Your Changes

Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7090

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit daa7183749)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-10-15 11:51:48 +02:00
Aliaksandr Valialkin
0881e5fd5c
app/vlselect: do not show empty fields in query results
Empty fields are treated as non-existing fields by VictoriaLogs data model.
So there is no sense in returning empty fields in query results, since they may mislead and confuse users.

(cherry picked from commit bac193e50b)
2024-10-15 11:49:32 +02:00
Aliaksandr Valialkin
f627d7f686
app/vlstorage: add support for forced merge via /internal/force_merge HTTP endpoint
(cherry picked from commit 3c73dbbacc)
2024-10-15 11:49:31 +02:00
Aliaksandr Valialkin
ac2b6e8704
lib/logstorage: make a copy of s.partitions slice when performing queries over the selected partitions
s.partitions can be changed when new partition is registered or when old partition is dropped.
This could lead to data races and panics when s.partitions slice is accessed by concurrently executed queries.

The fix is to make a copy of the selected partitions under s.partitionsLock before performing the query.

(cherry picked from commit b4b79a4961)
2024-10-15 11:49:31 +02:00
Aliaksandr Valialkin
b694ca4952
lib/logstorage: move getConstColumnValue() and getColumnHeader() methods from columnsHeader to blockSearch
This localizes blockSearch.getColumnsHeader() call at block_search.go .
This call is going to be optimized in the next commits in order to avoid
unmarshaling of header data for unneeded columns, which weren't requested
by getConstColumnValue() / getColumnHeader().

(cherry picked from commit 507b206a7d)
2024-10-15 11:49:30 +02:00
Aliaksandr Valialkin
beeb80e4f8
lib/logstorage: avoid redundant copying of column names and column values for dictionary-encoded columns during querying
Refer the original byte slice with the marshaled columnsHeader for columns names and dictionary-encoded column values.
This improves query performance a bit when big number of blocks with big number of columns are scanned during the query.

(cherry picked from commit 279e25e7c8)
2024-10-15 11:49:30 +02:00
Aliaksandr Valialkin
afe5158443
lib/logstorage: avoid calling columnsHeader.initFromBlockHeader() multiple times for the same blockSearch
This should improve performance when blockSearch.getColumnsHeader() is called multiple times
from different places of the code.

(cherry picked from commit 9e48074b59)
2024-10-15 11:49:30 +02:00
Aliaksandr Valialkin
e581338b84
lib/logstorage: make sure that bs.br is non-nil before checking br.bs.bsw.bh.rowsCount there
br.bs may be nil when br contains the block with additional filters applied during pipe calculations.
For example, `* | count() if (error) errors`.

(cherry picked from commit 867f671cc4)
2024-10-15 11:49:29 +02:00
Andrii Chubatiuk
68b6834542
lib/protoparser/opentelemetry: added exponential histograms support (#6354)
### Describe Your Changes

added opentelemetry exponential histograms support. Such histograms are automatically converted into
VictoriaMetrics histogram with `vmrange` buckets.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
(cherry picked from commit 9eb0c1fd86)
2024-10-11 14:28:19 +02:00
Aliaksandr Valialkin
b3bbf94310
lib/logstorage: disallow using pipe names as the first unquoted words in filter pipe
Improperly written pipes could be silently parsed as filter pipe.
For example, the following query:

   * | by (x)

was silently parsed to:

   * | filter "by" x

It is better to return error, so the user could identify and fix invalid pipe
instead of silently executing invalid query with `filter` pipe.

(cherry picked from commit 7b475ed95d)
2024-10-11 14:27:46 +02:00
Aliaksandr Valialkin
834e2ad855
lib/logstorage: disallow using by as the first word in log filters, since it frequently clashes with stats by(...) pipe where stats word is omitted
(cherry picked from commit 6acf543b90)
2024-10-11 14:27:46 +02:00
Zakhar Bessarab
06947c2685
vmagent: add support of HTTP2 client for Kubernetes SD (#7114)
### Describe Your Changes

Currently, vmagent always uses a separate `http.Client` for every group
watcher in Kubernetes SD. With a high number of group watchers this
leads to large amount of opened connections.

This PR adds 2 changes to address this:
- re-use of existing `http.Client` - in case `http.Client` is connecting
to the same API server and uses the same parameters it will be re-used
between group watchers
- HTTP2 support - this allows to reuse connections more efficiently due
to ability of using streaming via existing connections.

See this issue for the details and test results -
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5971

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
(cherry picked from commit eefae85450)
2024-10-08 10:37:48 +02:00
Aliaksandr Valialkin
0a61222627
lib/logstorage: quote logfmt strings only if they contain special chars, which could break logfmt parsing and/or reading
(cherry picked from commit 462b7cd597)
2024-10-07 14:47:22 +02:00
Artem Fetishev
ac128d268f
lib/promscrape: Fix TestClientProxyReadOk flaky test (#7173)
This PR fixes #7062

For hijacked connections, one has to read from the connection buffer,
but still write directly to the connection. Otherwise, when reading
directly from such connections, the first byte may be lost. This, in
turn corrupts the ClientHello TLS handshake message and when the backend
server receives it, it closes the connection and reports the following
error in the log:

```
http: TLS handshake error from 127.0.0.1:33150: tls: first record does not look
like a TLS handshake
```

The first byte may be lost because underlying HTTP request handler may
read it from the connection and put it into the buffer. As the result,
subsequent connection reads won't see that byte.

-   See: https://github.com/golang/go/issues/27408
-   The fix is taken from : https://github.com/k3s-io/k3s/pull/6216

### Checklist

The following checks are **mandatory**:

- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
(cherry picked from commit c1cd3e85a7)
2024-10-04 10:42:52 +02:00
Aliaksandr Valialkin
7a44614e0b
lib/logstorage: add len pipe for calculating byte length of log field values
(cherry picked from commit 364f084b43)
2024-10-04 10:42:51 +02:00
Roman Khavronenko
dfb2ad4ab4
(app|lib)/vmstorage: do not increment vm_rows_ignored_total on NaNs (#7166)
`vm_rows_ignored_total` metric is a metric for users to signalize about
ingestion issues, such as bad timestamp or parsing error.
In commit
a5424e95b3
this metric started to increment each time vmstorage gets NaN. But NaN
is a valid value for Prometheus data model and for Prometheus metrics
exposition format. Exporters from Prometheus ecosystem could expose NaNs
as values for metrics and these values will be delivered to vmstorage
and increment the metric.
Since there is nothing user can do with this, in opposite to parsing
errors or bad timestamps, there is not much sense in incrementing this
metric. So this commit rolls-back `reason="nan_value"` increments.

### Describe Your Changes

Please provide a brief description of the changes you made. Be as
specific as possible to help others understand the purpose and impact of
your modifications.

### Checklist

The following checks are **mandatory**:

- [ ] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).

Signed-off-by: hagen1778 <roman@victoriametrics.com>

(cherry picked from commit 0d4f4b8f7d)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2024-10-02 12:43:13 +02:00
Zakhar Bessarab
44b071296d
vmselect: add support of multi-tenant queries (#6346)
### Describe Your Changes

Added an ability to query data across multiple tenants. See:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1434

Currently, the following endpoints work with multi-tenancy:
- /prometheus/api/v1/query
- /prometheus/api/v1/query_range
- /prometheus/api/v1/series
- /prometheus/api/v1/labels
- /prometheus/api/v1/label/<label_name>/values
- /prometheus/api/v1/status/active_queries
- /prometheus/api/v1/status/top_queries
- /prometheus/api/v1/status/tsdb
- /prometheus/api/v1/export
- /prometheus/api/v1/export/csv
- /vmui


A note regarding VMUI: endpoints such as `active_queries` and
`top_queries` have been updated to indicate whether query was a
single-tenant or multi-tenant, but UI needs to be updated to display
this info.
cc: @Loori-R 

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: f41gh7 <nik@victoriametrics.com>
2024-10-01 16:37:18 +02:00
Aliaksandr Valialkin
81f3e07e1e
lib/logstorage: do not count dictionary values which have no matching logs in count_uniq stats function
Create blockResultColumn.forEachDictValue* helper functions for visiting matching
dictionary values. These helper functions should prevent from counting dictionary values
without matching logs in the future.

This is a follow-up for 0c0f013a60
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7152
2024-10-01 13:36:27 +02:00
Aliaksandr Valialkin
8c55b699f4
app/vlogscli: add interactive command-line tool for querying VictoriaLogs 2024-10-01 12:24:53 +02:00
Zhu Jiekun
d1d59d6348
feature: [vmagent] Add service discovery support for OVH Cloud VPS and dedicated server (#6160)
### Describe Your Changes
related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6071

#### Added
- Added service discovery support for OVH Cloud:
    - VPS.
    - Dedicated server.

#### Docs
- `CHANGELOG.md`, `sd_configs.md`, `vmagent.md` are updated.

#### Note
- Useful links: 
    - OVH Cloud VPS API: https://eu.api.ovh.com/console/#/vps~GET
- OVH Cloud Dedicated server API:
https://eu.api.ovh.com/console/#/dedicated/server~GET
    - OVH Cloud SDK: https://github.com/ovh/go-ovh
- Prometheus SD:
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#ovhcloud_sd_config

Tested on OVH Cloud VPS and dedicated server.
<img width="1722" alt="image"
src="https://github.com/VictoriaMetrics/VictoriaMetrics/assets/30280396/d3f0adc8-b0ef-423e-9379-8a9b9b0792ee">

<img width="1724" alt="image"
src="https://github.com/VictoriaMetrics/VictoriaMetrics/assets/30280396/18b5b730-3512-4fc0-8b2c-f2450ac550fd">

---
Signed-off-by: Jiekun <jiekun@victoriametrics.com>
Co-authored-by: hagen1778 <roman@victoriametrics.com>
2024-09-30 15:06:14 +02:00
Hui Wang
bf3d9ba57b
stream aggregation: fix possible duplicated aggregation results (#7118)
When ingesting samples with the same labels(duplicated samples or
samples with the same labels after `by` or `without` options). They
could register different entries for the same labelset in
LabelsCompressor.
For example, both index 99 and 100 can be assigned to label `foo=1` in
two concurrent pushes. Then due to differing label indexes in encoded
keys, the samples will appear as distinct in aggrState, resulting in
duplicated results after decompressing the label indexes.

fbde238cdc/lib/streamaggr/streamaggr.go (L933)

In this pull request, since we need to store `idxToLabel` first to
ensure the idx can be searched after `lc.labelToIdxStore`,
the `lc.idxToLabel` still could contain a duplicated entries
[100]="foo=1". But given the low likelihood of this issue and the size
of idxToLabel, it should be fine.
2024-09-30 14:30:34 +02:00