Commit graph

2021 commits

Author SHA1 Message Date
Aliaksandr Valialkin
497ec3f3e6
lib/encoding: add MarshalBool/UnmarshalBool and GetUint32s/PutUint32s functions
These functions are going to be used by VictoriaLogs
2023-06-19 22:40:55 -07:00
Aliaksandr Valialkin
3409317a67
lib/cgroup: add SetGOGC() function
This function is going to be used by VictoriaLogs
2023-06-19 22:39:00 -07:00
Aliaksandr Valialkin
c1bed35b39
lib/bytesutil: substitute parentheses with slashes in ByteBuffer.Path() output, so it can be passed to path manipulating functions
This is needed for the upcoming VictoriaLogs
2023-06-19 22:37:26 -07:00
Aliaksandr Valialkin
78eaa056c0
app/vmselect: move common http functionality from app/vmselect/searchutils to lib/httputils
While at it, move app/vmselect/bufferedwriter to lib/bufferedwriter, since it is going to be used in VictoriaLogs
2023-06-19 22:34:20 -07:00
Aliaksandr Valialkin
b49d04b3dc
lib/promutils.ParseTime(): add support for timestamps in milliseconds
See https://stackoverflow.com/questions/76437098/how-to-handle-time-unit-and-step-while-ingesting-or-querying-in-victoriametrics/76438405

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4459
2023-06-19 22:25:04 -07:00
Nikolay
5eb5df96e2
lib/storage: creates parts.json on start-up if it not exists. (#4450)
* lib/storage: creates parts.json on start-up if it not exists.
It fixes migrations from versions below v1.90.0.
Previously parts.json was created only after successful merge.
But if merge was interruped for some reason (OOM or shutdown), parts.json wasn't created and partitions left after interruped merge weren't properly deleted.
Since VM cannot check if it must be removed or not.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4336

* Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

* Update lib/storage/partition.go

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

---------

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-06-15 11:19:22 +02:00
Roman Khavronenko
f50f35a8e0
lib/storage: add comment for how mustBeDeleted field should be used (#4454)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-15 11:17:45 +02:00
Roman Khavronenko
f71cc99a8c
lib/mergeset: add comment for how mustBeDeleted field should be used (#4449)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-14 18:13:16 +02:00
Alexander Marshalov
40d12be607
fixed service name detection for consulagent service discovery in case of a difference in service name and service id (#4390) (#4439)
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-06-12 16:16:43 +02:00
Roman Khavronenko
dfe53a36fc
lib/promscrape/discoveryutils: properly check for net.ErrClosed (#4426)
This error may be wrapped in another error, and should normally be tested using
`errors.Is(err, net.ErrClosed)`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-09 09:26:33 +02:00
Roman Khavronenko
3305a6901c
app/vmagent: mention enable_http2 in changelog (#4403)
Follow-up after
72c3cd47eb

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-06-05 16:31:58 +02:00
Haleygo
72c3cd47eb
vmagent:scrape config support enable_http2 (#4295)
app/vmagent: support `enable_http2` in scrape config 

This change adds HTTP2 support for scrape config
and improves compatibility with Prometheus config. 

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4283
2023-06-05 15:56:49 +02:00
Nikolay
f263031fe9
app/vmauth: properly handle LOCAL proxy protocol command (#4373)
app/vmauth: properly handle LOCAL proxy protocol command

It is required for handling health checks from load balancers

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3335
2023-05-31 15:37:59 +02:00
Haleygo
b3d0ff463a
vmagent:support follow_redirects on SD level (#4286)
* vmagent:support follow_redirects on SD level

* fix follow_redirects on sd level

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4282
2023-05-26 09:39:45 +02:00
Aliaksandr Valialkin
1f2f74e70e
lib/promrelabel: use monospace font at textarea for writing relabel configs on /metric-relabel-debug and /target-relabel-debug pages
This simplifies visual inspection of indentation in yaml configs
2023-05-18 20:48:41 -07:00
Aliaksandr Valialkin
1f28b46ae9
lib/storage: revert the migration from global to per-day index for (MetricName -> TSID)
This reverts the following commits:
- e0e16a2d36
- 2ce02a7fe6

The reason for revert: the updated logic breaks assumptions made
when fixing https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698 .
For example, if a time series stop receiving new samples during the first
day after the indexdb rotation, there are chances that the time series
won't be registered in the new indexdb. This is OK until the next indexdb
rotation, since the time series is registered in the previous indexdb,
so it can be found during queries. But the time series will become invisible
for search after the next indexdb rotation, while its data is still there.

There is also incompletely solved issue with the increased CPU and disk IO resource
usage just after the indexdb rotation. There was an attempt to fix it, but it didn't fix
it in full, while introducing the issue mentioned above. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401

TODO: to find out the solution, which simultaneously solves the following issues:
- increased memory usage for setups high churn rate and long retention (e.g. what the reverted commit does)
- increased CPU and disk IO usage during indexdb rotation ( https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1401 )
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2698

Possible solution - to create the new indexdb in one hour before the indexdb rotation
and to gradually pre-populate it with the needed index data during the last hour before indexdb rotation.
Then the new indexdb will contain all the needed data just after the rotation,
so it won't trigger increased CPU and disk IO.
2023-05-18 11:30:49 -07:00
Haleygo
1531d757ea fix lint check 2023-05-17 13:51:36 +02:00
Aliaksandr Valialkin
e0e16a2d36
lib/storage: follow-up after 2ce02a7fe6
- Document the change at docs/CHANGELOG.md
- Clarify comments for non-trivial code touched by the commit
- Improve the logic behind maybeCreateIndexes():
  - Correctly create per-day indexes if the indexdb rotation is performed during
    the first hour or the last hour of the day by UTC.
    Previously there was a possibility of missing index entries on that day.
  - Increase the duration for creating new indexes in the current indexdb for up to 22 hours
    after indexdb rotation. This should reduce the increased resource usage
    after indexdb rotation.
    It is safe to postpone index creation for the current day until the last hour
    of the current day after indexdb rotation by UTC, since the corresponding (date, ...)
    entries exist in the previous indexdb.
- Search for TSID by (date, MetricName) in both the current and the previous indexdb.
  Previously the search was performed only in the current indexdb. This could lead
  to excess creation of per-day indexes for the current day just after indexdb rotation.
- Search for (date, metricID) entries in both the current and the previous indexdb.
  Previously the search was performed only in the current indexdb. This could lead
  to excess creation of per-day indexes for the current day just after indexdb rotation.
2023-05-16 23:19:27 -07:00
Roman Khavronenko
2ce02a7fe6
lib/storage: introduce per-day MetricName=>TSID index (#4252)
The new index substitutes global MetricName=>TSID index
used for locating TSIDs on ingestion path.
For installations with high ingestion and churn rate, global
MetricName=>TSID index can grow enormously making
index lookups too expensive. This also results into bigger
than expected cache growth for indexdb blocks.

New per-day index supposed to be much smaller and more efficient.
This should improve ingestion speed and reliability during
re-routings in cluster.

The negative outcome could be occupied disk size, since
per-day index is more expensive comparing to global index.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-05-16 15:46:42 -07:00
Aliaksandr Valialkin
278278af95
lib/storage: reduce the unimportant logging during Storage start / stop
This should improve the visibility of potentially important logs
2023-05-16 15:14:21 -07:00
Aliaksandr Valialkin
d330c7e6fc
lib/mergeset: remove superflouos logging when opening and closing the Table
The logged messages had little useful info, while they were polluting log output during VictoriaMetrics start/stop
2023-05-16 15:01:25 -07:00
Aliaksandr Valialkin
3cbc0975f6
lib/mergeset: close and open the table before making snapshots at TestTableCreateSnapshotAt()
This gives guarantees that all the in-memory data is written to disk at the snapshot time.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4272
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4316
2023-05-16 14:55:11 -07:00
Aliaksandr Valialkin
09b403d38a
lib/{mergeset,storage}: make it clear that DebugFlush() doesn't store all the recently ingested data to disk
DebugFlush() makes sure that the recently ingested data becomes visible to search.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4272
2023-05-16 11:50:17 -07:00
Alexander Marshalov
3b2dc2b098
backup metadata are written in separate file (#560)
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-05-16 11:24:54 -07:00
Zakhar Bessarab
242050ba94
lib/storage: follow-up after a50d63c376 (#4289)
* lib/storage: follow-up after a50d63c376

- ensure retentionMsecs is rounded to day
- remove localTimeOffset in test as localOffset is ignored when using `UnixMilli`

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/storage: restore retention timezone offset effect on retention deadline

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-05-16 17:14:08 +02:00
Aliaksandr Valialkin
1c47acda11
lib/promutils: add ParseTimeAt() function 2023-05-13 20:12:31 -07:00
Aliaksandr Valialkin
616175b1ce
lib/promutils: properly return error when incorrect Prometheus label names are passed to NewLabelsFromString()
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4284
See also https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4304
2023-05-12 16:52:29 -07:00
Aliaksandr Valialkin
318a87c36f
Revert "lib/promrelabel: show error message if labels not in prometheus exposition format (#4304)"
This reverts commit 193a9c3328.

Reason for revert: the commit doesn't fix the real issue with promutils.NewLabelsFromString()
function, which must return error when improperly formatted Prometheus metric with labels is passed to it.
See https://github.com/prometheus/docs/blob/main/content/docs/instrumenting/exposition_formats.md#text-format-example

E.g. the promutils.NewLabelsFromString() must return error when the following strings are passed to it:

- `{foo:"bar"}`, since `:` is disallowed in Prometheus text exposition format. The corect value is `{foo="bar"}`
- `{"foo":"bar"}`, since label name shouldn't be quoted. The correct value is `{foo="bar"}`.

The reverted commit introduces another set of bugs, which happily accept the following invalid input:

- `{foo=~"bar"}`
- `{foo!="bar"}`
- `{foo!~"bar"}`

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4284
See also https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4304
2023-05-12 16:07:37 -07:00
Aliaksandr Valialkin
160453b86c
lib/protoparser/csvimport: properly parse the last empty column in CSV line
Do not ignore the last empty column in CSV line.
While at it, properly parse CSV columns in single quotes, e.g. `'foo,bar',baz` is parsed as two columns - `foo,bar` and `baz`

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4048

See also https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4298
2023-05-12 15:51:41 -07:00
Aliaksandr Valialkin
b7fe7b801c
Revert "lib/protoparser: fix skip csv line when metric can be collect from the line (#4298)"
This reverts commit 410ae99c2e.

Reason for revert: the commit masks the real issue instead of fixing it.
The real issue is that the scanner.NextColumn() skips the last column if it is empty.

The commit also introduces two bugs:

- a panic if all the metric values in CSV line are empty
- silent import of CSV lines with too small number of columns

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4048
See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4298
2023-05-12 15:22:27 -07:00
Dmytro Kozlov
193a9c3328
lib/promrelabel: show error message if labels not in prometheus exposition format (#4304)
lib/promrelabel: show error message if labels not in prometheus exposition format

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4284
2023-05-12 10:42:56 +02:00
Dmytro Kozlov
410ae99c2e
lib/protoparser: fix skip csv line when metric can be collect from the line (#4298)
* lib/protoparser: fix skip csv line when metric can be collect from the line

* lib/protoparser: fix comment
2023-05-12 11:04:16 +03:00
Alexander Marshalov
9855b38da2
fixed error with double slash in vmbackupmanager (#557)
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-05-11 13:38:07 -07:00
Aliaksandr Valialkin
73812c71a5
lib/promutils: properly parse time strings with timezones at ParseTime() 2023-05-11 13:24:00 -07:00
Aliaksandr Valialkin
da037cafc5
lib/bytesutil: go fmt after 2ec17bed2c 2023-05-10 20:29:03 -07:00
Aliaksandr Valialkin
2ec17bed2c
lib/bytesutil: add benchmarks for ToUnsafeString() and ToUnsafeBytes() 2023-05-10 12:59:26 -07:00
Alexander Marshalov
2e494e2375
fixed typos in documentation and commandline flags descriptions (#4275) 2023-05-10 09:50:41 +02:00
Aliaksandr Valialkin
b9bb64ce55
lib/promscrape/discovery/consulagent: substitute metaPrefix with the __meta_consulagent_ plaintext string
This simplifies future code navigation and search for the specific meta-label starting from __meta_consulagent_* prefix.
For example, `grep __meta_consulagent_namespace` finds the exact place where this label is defined.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3953
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4217
2023-05-08 23:40:13 -07:00
Aliaksandr Valialkin
7db647e924
lib/fs: move common code outside arch-specific implementations of mustRemoveDirAtomic()
This is a follow-up for 73b6c23271
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2023-05-08 23:10:20 -07:00
Aliaksandr Valialkin
887555669e
Revert "lib/streamaggr: discard samples with timestamps outside of aggregation interval (#4199)"
This reverts commit 9e99f2f5b3.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4068

Reason for revert: this breaks valid use cases:

- If timestamps aren't specified in the incoming samples on purpose. For example, if stream aggregation is used
  as StatsD replacement. StatsD protocol has no timestamp concept for incoming samples.
  See https://github.com/b/statsd_spec

- If all the samples must be aggregated, even if they contain stale timestamps.
  for example, if the stream aggregation produces some counter of some events,
  it may be better to count all the events even if they were delayed before
  being ingested into VictoriaMetrics.

Is is also unclear how to determine whether the sample becomes stale.
For example, if the aggregation interval equals to 1h, and the previous
aggregation cycle just finished 10 minutes ago, what to do with the newly
incoming sample with the timestamp 30 minutes older than the current time?
The answer highly depends on the context, so it is unsafe to uncoditionally
use a single logic for dropping the old samples here.
2023-05-08 16:52:27 -07:00
Aliaksandr Valialkin
74155afb71
docs: clarify docs after 5ee344824f
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4183
2023-05-08 16:11:44 -07:00
Aliaksandr Valialkin
ec3943d14a
app/vmselect: small cleanup after 4f3f9950d0
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3807
2023-05-08 14:57:11 -07:00
Aliaksandr Valialkin
80946f06c2
app/{vmselect,vmctl}: move ParseTime() to lib/promutils
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4091

This is a follow-up for e2053baf32
2023-05-08 14:17:57 -07:00
Alexander Marshalov
8225a48b56
fixed vm_promscrape_config_last_reload_successful metric value recovery after successful reloading with unchanged content (#4260) (#4268)
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-05-08 13:32:51 +02:00
Nikolay
8f4de6fa47
lib/storage: properly update link for entry at dateMetricID cache (#4258)
previously during sync for mutable and immutable cache parts, link for hotEntry with current date may be not properly updated
it corrupts cache for backfilling metrics and increased cpu load
2023-05-05 21:45:47 -07:00
Zakhar Bessarab
4e71003620
lib/promscrape/discovery/kubernetes: follow-up for d5e94721db (#4255)
- add changelog reference to an author
- fix tests
- add metadata to match Prometheus behavior

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-05-05 14:41:17 +02:00
Vasilchenko Anton
22e65402af
Add endpoint labels for pod targets discovered form endpoint but has different ports (#4253)
Signed-off-by: Vasilchenko Anton <vasilchenko-as@yandex.ru>
2023-05-05 15:46:07 +04:00
Zakhar Bessarab
aca256735c
lib/storage: fix indexdb rotation infinite loop (#4249)
When using `retentionTimezoneOffset` and having local timezone being more than 4 hours different from UTC indexdb retention calculation could return negative value. This caused indexdb rotation to get in loop.
Fix calculation of offset to use `retentionTimezoneOffset` value properly and add test to cover all legit timezone configs.
See:
- https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4207
- https://github.com/VictoriaMetrics/VictoriaMetrics/pull/4206

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-05-04 17:16:48 +02:00
Alexander Marshalov
56b84140a9
added new consulagent service discovery (#3953) (#4217) 2023-05-04 11:36:21 +02:00
Alexander Marshalov
2eb27ddb22
max value for memory.allowedPercent changed from 200 to 100 (#4171) (#4251)
Signed-off-by: Alexander Marshalov <_@marshalov.org>
2023-05-04 11:34:57 +02:00
justcompile
49b77ec01a
squash commits (#4166) 2023-05-03 10:51:08 +02:00
Nikolay
4786f036de
lib/backup: fixes path generation for windows (#4133)
replaces custom fsync function with standard Fsync methods for files.
fixes pattern matching for parts and properly generate backup path for local fs.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2023-05-03 10:48:53 +02:00
Nikolay
73b6c23271
lib/fs: do not panic at windows at dir deletion (#4132)
Windows doesn't allow to remove dir with opened files. Usually it's a case for snapshots, hard cannot be removed if file is openned.
With this change, dir will be renamed and properly deleted at the next process start.
It's recommended to restart vmstorage/vmsingle for snapshots deletion completion periodically.
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/70
2023-05-03 10:47:02 +02:00
Zakhar Bessarab
bf3b6732bd
lib/promscrape/discovery/kubernetes: add common labels to all ports discovered from endpoints (#4235)
* lib/promscrape/discovery/kubernetes: add common labels to all ports discovered from endpoints

Sets
`__meta_kubernetes_endpoints_name` and `__meta_kubernetes_namespace` labels to all ports of pod.
Prometheus sets those labels to all ports in pod (0ab9553611/discovery/kubernetes/endpoints.go (L267C15-L269)) even if port is not matching any service.

See: #4154

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/promscrape/discovery/kubernetes: fix test for updated discovery logic

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-05-03 02:17:33 +02:00
Roman Khavronenko
eb746a4dab
Revert "http server: limit max concurrent requests (#4185)" (#4215)
This reverts commit 77f76371

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-04-27 13:02:47 +02:00
Zakhar Bessarab
9e99f2f5b3
lib/streamaggr: discard samples with timestamps outside of aggregation interval (#4199)
* lib/streamaggr: discard samples with timestamps not matching aggregation interval

Samples with timestamps lower than `now - aggregation_interval` are likely to be written via backfilling and should not be used for calculation of aggregation.
See #4068

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* lib/streamaggr: make log message more descriptive, fix imports

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-04-27 11:59:49 +02:00
Haleygo
03150c8973
lib/opentsdbhttp: fix a typo preventing from using writeconcurrencylimiter (#4208) 2023-04-27 09:22:42 +02:00
Nikolay
5ee344824f
lib/promscrape: adds filter for consul_sd_configs: (#4184)
* lib/promscrape: adds filter for consul_sd_configs:
it allows advanced filtering for consul service discovery requests
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4183

* typo fix

* removes deprecation mentions since it's not relevant

* Update docs/CHANGELOG.md

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

---------

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-04-26 19:16:27 +02:00
Dmytro Kozlov
bc17f4828c
app/vmagent,lib/persistentqueue: show warning message if --remoteWrite.maxDiskUsagePerURL flag lower than 500MB (#4196)
* app/vmagent,lib/persistentqueue: show warning message if `--remoteWrite.maxDiskUsagePerURL` flag lower than 500MB

* app/vmagent,lib/persistentqueue: linter fix

* app/vmagent,lib/persistentqueue: fix comment
2023-04-26 13:23:01 +03:00
Yury Molodov
4f3f9950d0
vmui: add metric relabel debug (#3889)
* feat: add metric relabel debug (#3807)

* fix: add link to relabeling cookbook

* lib/promrelabel: merge, fix conflicts

* lib/promrelabel: fix diff

* docs/vmui: add metric relabel playground

---------

Co-authored-by: dmitryk-dk <kozlovdmitriyy@gmail.com>
2023-04-26 11:53:29 +03:00
Roman Khavronenko
77f76371d0
http server: limit max concurrent requests (#4185)
* lib/httpserver: introduce `-http.maxConcurrentRequests` command-line flag

Introduce `-http.maxConcurrentRequests` command-line flag to protect
VM components from resource exhaustion during unexpected spikes of HTTP requests.
By default, the new flag's value is set to 0 which means no limits are applied.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/httpserver: mention http.maxConcurrentRequests in docs

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-04-24 14:52:06 +02:00
Zakhar Bessarab
472fe3fd03
lib/httpserver: add handler to serve /robots.txt and deny search indexing (#4143)
This handler will instruct search engines that indexing is not allowed for the content exposed to the internet. This should help to address issues like #4128 when instances are exposed to the internet without authentication.
2023-04-18 16:47:26 +04:00
Aliaksandr Valialkin
2a4c48c59d
lib/{mergeset,storage}: make mustReadPartNames() code more clear 2023-04-14 23:16:59 -07:00
Aliaksandr Valialkin
52006149b2
lib/storage: replace OpenStorage() with MustOpenStorage()
Callers of OpenStorage() log the returned error and exit.
The error logging and exit can be performed inside MustOpenStorage()
alongside with printing the stack trace for better debuggability.
This simplifies the code at caller side.
2023-04-14 23:02:40 -07:00
Aliaksandr Valialkin
2a2036160d
lib/storage: fix a bug, which prevents from reading pre-v1.90.0 parts
The bug has been introduced in c0b852d50d
2023-04-14 22:33:08 -07:00
Aliaksandr Valialkin
3727251910
lib/fs: add MustReadDir() function
Use fs.MustReadDir() instead of os.ReadDir() across the code in order to reduce the code verbosity.
The fs.MustReadDir() logs the error with the directory name and the call stack on error
before exit. This information should be enough for debugging the cause of the error.
2023-04-14 22:10:46 -07:00
Aliaksandr Valialkin
60d92894c5
lib/storage: validate rows in partition.AddRows() only during tests 2023-04-14 20:52:36 -07:00
Aliaksandr Valialkin
df619bdff0
all: consistently use fs.MustClose() for closing lock files 2023-04-14 20:14:21 -07:00
Aliaksandr Valialkin
2a3b19e1d2
lib/fs: convert CreateFlockFile to MustCreateFlockFile
Callers of CreateFlockFile log the returned err and exit.
It is better to log the error inside the MustCreateFlockFile together with the path
to the specified directory and the call stack. This simplifies
the code at the callers' side while leaving the debuggability at the same level.
2023-04-14 19:50:01 -07:00
Aliaksandr Valialkin
c0b852d50d
lib/{storage,mergeset}: convert InitFromFilePart to MustInitFromFilePart
Callers of InitFromFilePart log the error and exit.
It is better to log the error with the path to the part and the call stack
directly inside the MustInitFromFilePart() function.
This simplifies the code at callers' side while leaving the same level of debuggability.
2023-04-14 15:46:12 -07:00
Aliaksandr Valialkin
9183a439c7
lib/filestream: change Create() to MustCreate()
Callers of this function log the returned error and exit.
It is better logging the error together with the path to the filename
and call stack directly inside the function. This simplifies
the code at callers' side without reducing the level of debuggability
2023-04-14 15:12:48 -07:00
Aliaksandr Valialkin
5eb163a08a
lib/filestream: transform Open() -> MustOpen()
Callers of this function log the returned error and exit.
Let's log the error with the path to the filename and call stack
inside the function. This simplifies the code at callers' side
without reducing the level of debuggability.
2023-04-14 15:03:42 -07:00
Aliaksandr Valialkin
fda1a54343
lib/fs: improve error logging at ReaderAt.MustReadAt()
- Add 'BUG:' prefix to error messages related to programming errors aka bugs.
- Consistently log the path to the file in all the messages in order to improve debuggability.
2023-04-14 14:51:06 -07:00
Aliaksandr Valialkin
f341b7b3f8
lib/fs: substitute ReadFullData with MustReadData
Callers of ReadFullData() log the error and then exit.
So let's log the error with the path to the filename and the call stack
inside MustReadData(). This simplifies the code at callers' side,
while leaving the debuggability at the same level.
2023-04-14 14:39:29 -07:00
Aliaksandr Valialkin
bd6de6406a
lib/fs: improve error logging inside MustWriteData
Log the path to file on errors inside MustWriteData().
This improves debuggability of errors, which may occur inside MustWriteData().
2023-04-14 14:32:45 -07:00
Aliaksandr Valialkin
e0595af2bf
lib/{mergeset,storage}: remove isInMerge flag from parts only when they werent removed yet from the list of active parts
This prevents from possible panic during access to pw.p when it is set to nil at partWrapper.decRef() called inside swapSrcWithDstParts()
2023-04-14 00:08:11 -07:00
Aliaksandr Valialkin
9f8209d593
docs/CHANGELOG.md: run at least 4 background mergers on systems with less than 4 CPU cores
This reduces the probability of sudden spike in the number of small parts when all the background mergers
are busy with big merges.
2023-04-13 23:43:17 -07:00
Aliaksandr Valialkin
550d5c7ea4
lib/{mergeset,storage}: make sure that getFlushToDiskDeadline() takes into account only in-memory parts 2023-04-13 23:43:17 -07:00
Aliaksandr Valialkin
809fbaeaac
lib/fs: add Must prefix to CopyDirectory and CopyFile functions
Callers of these functions log the returned error and then exit.
Let's log the error with the call stack inside the function itself.
This simplifies the code at callers' side, while leaving the same
level of debuggability in case of errors.
2023-04-13 23:02:59 -07:00
Aliaksandr Valialkin
780abc3b3b
lib/fs: rename SymlinkRelative to MustSymlinkRelative
Callers of this function log the returned error and then exit.
Let's log the error with the call stack inside the function itself.
This simplifies the code at callers' side, while leaving the same
level of debuggability in case of errors.
2023-04-13 22:52:55 -07:00
Aliaksandr Valialkin
5f487ed996
lib/fs: rename HardLinkFiles to MustHardLinkFiles
Callers of this function log the returned error and then exit.
Let's log the error with the call stack inside the function itself.
This simplifies the code at callers' side, while leaving the same
level of debuggability in case of errors.
2023-04-13 22:48:07 -07:00
Aliaksandr Valialkin
30425ca81a
lib/fs: rename WriteFileAtomically to MustWriteAtomic
Callers of this function log the returned error and exit.
So let's just log the error with the given filepath and the call stack
inside the function itself and then exit. This simplifies the code
at callers' place while leaves the same level of debuggability in case of errors.
2023-04-13 22:41:15 -07:00
Aliaksandr Valialkin
036a7b7365
lib/fs: replace MkdirAllIfNotExist->MustMkdirIfNotExist and MkdirAllFailIfExist->MustMkdirFailIfExist
Callers of these functions log the returned error and then exit. The returned error already contains the path
to directory, which was failed to be created. So let's just log the error together with the call stack
inside these functions. This leaves the debuggability of the returned error at the same level
while allows simplifying the code at callers' side.

While at it, properly use MustMkdirFailIfExist instead of MustMkdirIfNotExist inside inmemoryPart.MustStoreToDisk().
It is expected that the inmemoryPart.MustStoreToDick() must fail if there is already a directory under the given path.
2023-04-13 22:11:59 -07:00
Aliaksandr Valialkin
344209e5e6
lib/fs: rename MustWriteFileAndSync to MustWriteSync in order to improve readability a bit
This is a follow-up for 2a8395be05
2023-04-13 21:43:32 -07:00
Aliaksandr Valialkin
b15c5961ab
lib/{mergeset,storage}: remove unused path field from blockStreamWriter
This is a follow-up after 42bba64aa7
2023-04-13 21:39:59 -07:00
Aliaksandr Valialkin
2a8395be05
lib/fs: replace WriteFileAndSync with MustWriteAndSync
When WriteFileAndSync fails, then the caller eventually logs the error message
and exits. The error message returned by WriteFileAndSync already contains the path
to the file, which couldn't be created. This information alongside the call stack
is enough for debugging the issue. So just use log.Panicf("FATAL: ...") inside MustWriteAndSync().
This simplifies error handling at caller side a bit.
2023-04-13 21:33:19 -07:00
Aliaksandr Valialkin
25f089de9d
lib/{mergeset,storage}: properly fsync part directory listing after writing in-memory part to disk
This is a follow-up after 42bba64aa7

Previously the part directory listing was fsync'ed implicitly inside partHeader.WriteMetadata()
by calling fs.WriteFileAtomically(). Now it must be fsync'ed explicitly.

There is no need in fsync'ing the parent directory, since it is fsync'ed by the caller
when updating parts.json file.
2023-04-13 21:19:04 -07:00
Aliaksandr Valialkin
42bba64aa7
lib/{mergeset,storage}: explicitly fsync the created part directory listing
Previously the created part directory listing was fsynced implicitly
when storing metadata.json file in it.

Also remove superflouous fsync for part directory listing,
which was called at blockStreamWriter.MustClose().
After that the metadata.json file is created, so an additional fsync
for the directory contents is needed.
2023-04-13 21:03:08 -07:00
Aliaksandr Valialkin
e1211a1187
app/vmstorage: deprecate -bigMergeConcurrency command-line flag
Improperly configured -bigMergeConcurrency command-line flag usually leads to uncontrolled
growth of unmerged parts, which, in turn, increases CPU usage and query durations.

So it is better deprecating this flag. In rare cases -smallMergeConcurrency command-line flag
can be used instead for controlling the concurrency of background merges.
2023-04-13 20:40:24 -07:00
Aliaksandr Valialkin
ca54e58c1f
lib/{fs,persistentqueue}: use filepath.Join() instead of concatenating path parts with /
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4014
2023-04-13 20:13:45 -07:00
Aliaksandr Valialkin
90b876cd1e
app/vmbackupmanager: sync with enterprise-single-node branch after 41a54c775891c87e3d5ed59ff0769c869dd2fe71 2023-04-13 19:29:06 -07:00
Zakhar Bessarab
81f28f0f1f
lib/backup/actions: store metadata(creation and completion time) in backup files (#4117)
This makes it easier to understand exact point in time which is included in this backup.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-04-12 18:51:27 +02:00
Haleygo
0ad6010c91
fix sort pendingDateMetricsIDs (#4102) 2023-04-10 10:23:12 -07:00
Dmytro Kozlov
244c18fa38
app/vmctl: add multiple filters defined in --vm-native-filter-match flag to discovered metric names (#4063)
* app/vmctl: add multiple filters defined in `--vm-native-filter-match` flag to discovered metric names

* app/vmctl: fix comments

* app/vmctl: move function buildMatchWithFilter to the correct place

* app/vmctl: update CHANGELOG.md

* app/vmctl: fix CI, remove error wrapping

* app/vmctl: fix CI, simplify `Set()`
2023-04-06 15:06:52 -07:00
Aliaksandr Valialkin
593c151831
lib/encoding: fix test after 4725549cb2 2023-04-05 21:38:37 -07:00
Aliaksandr Valialkin
19b189e9b7
lib/storage: use shorter code after 03bde173b7 2023-04-02 21:35:52 -07:00
faceair
38fc55976e
lib/storage: fix reuse pendingMetricRow (#4049) 2023-04-02 21:35:50 -07:00
faceair
f3af8331ec
lib/storage: remove unused code (#4050) 2023-04-02 21:24:42 -07:00
Aliaksandr Valialkin
f638496298
lib/promscrape: do not re-use previously loaded scrape targets on failed attempt to load updated scrape targets at file_sd_configs
The logic employed for re-using the previously loaded scrape target was broken initially.
The commit cc0427897c tried to fix it, but the new logic
became too complex and fragile. So it is better to just remove this logic,
since the targets from temporarily broken file should be eventually loaded on next
attempts every -promscrape.fileSDCheckInterval

This also allows removing fragile hacks around __vm_filepath label.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3989
2023-04-02 21:05:28 -07:00
Dmytro Kozlov
cc0427897c
lib/promscrape: fix the problem with scrape work duplicates when file_sd_config can't be read (#4027)
* lib/promscrape: fix the problem with scrape work duplicates when file_sd_config can't be read

* lib/promscrape: clarified comment

* lib/promscrape: made better approach to handle a problem with growing []*ScrapeWork on each error when loading config

* lib/promscrape: added CHANGELOG.md

* Update docs/CHANGELOG.md

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-04-02 20:26:13 -07:00