Commit graph

247 commits

Author SHA1 Message Date
hagen1778
1043fc1fd9
alerts: add docs section for the full list of alerting rules
The change also includes update of all references in other docs
to the alerting rules.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-08-03 10:46:25 +02:00
Jan Kielmann
5d73a07cc3 Fix spelling mistake in opentelemetry ingestion API path 2023-08-03 00:47:05 -07:00
Nikolay
46ecbbea26
lib/protoparser: adds opentelemetry parser (#2570)
* lib/protoparser: adds opentelemetry parser
app/{vmagent,vminsert}: adds opentelemetry ingestion path

Adds ability to ingest data with opentelemetry protocol
protobuf and json encoding is supported
data converted into prometheus protobuf timeseries
each data type has own converter and it may produce multiple timeseries
from single datapoint (for summary and histogram).
only cumulative aggregationFamily is supported for sum(prometheus
counter) and histogram.

Apply suggestions from code review

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>

updates deps

fixes tests

wip

wip

wip

wip

lib/protoparser/opentelemetry: moves to vtprotobuf generator

go mod vendor

lib/protoparse/opentelemetry: reduce memory allocations

* wip

- Remove support for JSON parsing, since it is too fragile and is rarely used in practice.
  The most clients send OpenTelemetry metrics in protobuf.
  The JSON parser can be added in the future if needed.
- Remove unused code from lib/protoparser/opentelemetry/pb and lib/protoparser/opentelemetry/proto
- Do not re-use protobuf message between ParseStream() calls, since there is high chance
  of high fragmentation of the re-used message because of too complex nested structure of the message.

* wip

* wip

* wip

---------

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-07-27 13:26:45 -07:00
Aliaksandr Valialkin
0c860c112c
docs/Cluster-VictoriaMetrics.md: use 1. instead of N. in numbered bullets, so they are automatically adjusted by Github Markdown engine
See https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax#lists
2023-07-26 14:11:30 -07:00
Roman Khavronenko
c32a01c52e
docs: follow-up after aec4b5db81 (#4638)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-19 10:10:51 +02:00
Aliaksandr Valialkin
7d45fc1f99
app/vmselect/netstorage: follow-up after 173ccf4333
- Clarify docs about -replicationFactor command-line flag at vmselect
- Clarify description for -replicationFactor and -search.skipSlowReplicas command-line flags
- Fix the logic for returning responses if -search.skipSlowReplicas command-line flag
  is enabled. The logic was broken in the 173ccf4333,
  so it could return responses only if some of vmstorage nodes return error,
  while it should return when query results are successfully collected from more than
  (len(storageNodes) - replicationFactor) vmstorage nodes.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1207
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711
2023-07-09 12:40:06 -07:00
Roman Khavronenko
fb03762d4d
vmselect: introduce search.skipSlowReplicas cmd-line flag (#4538)
* vmselect: introduce `search.skipSlowReplicas` cmd-line flag

vmselect has two logical conditions during request processing when
`-replicationFactor` cmd-line flag is set:
1. If at least `len(storageNodes) - replicationFactor` responded, it could skip
waiting for the rest of nodes to respond. This could lead to problems described
here https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1207.
2. Mark response as partial if less than `len(storageNodes) - replicationFactor` responded
without an error.

The P1 showed itself error-prone and became the main reason why
`-replicationFactor` wasn't recommended to use at vmselect level.
However, this optimization could be still very useful in situations
when there are slow and fast replicas in cluster.

But P2 remains viable and important conditionless.
Hiding P1 behind the feature-flag `search.skipSlowReplicas`
should make `-replicationFactor` flag usable again. And let users
choose whether they want P1 to be respected.

Related issues
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1207
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* docs: update changelog

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-07-09 12:30:06 -07:00
Zakhar Bessarab
51a9cc9783
docs: make httpAuth.* flags description less ambiguous (#4588)
* docs: make `httpAuth.*` flags description less ambiguous

Currently, it may confuse users whether `httpAuth.*` flags are used by HTTP client or server configuration(see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4586 for example).

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* docs: fix a typo

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-07-07 13:50:13 +02:00
Artem Navoiev
8f1f80c4f8
update logo width in cluster doc to 300
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-07-06 23:16:01 -07:00
Aliaksandr Valialkin
5c36935b8b
docs/Cluster-VictoriaMetrics.md: update -help output 2023-07-06 23:13:07 -07:00
Zakhar Bessarab
5b7bfc41ad
docs: clarify downsampling periods requirements (#4542)
It is required for periods to be multiplies, but it was not stated clearly in documentation.

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
2023-06-30 10:37:22 +02:00
Dmytro Kozlov
24f34347f1
docs: clarify -retentionPeriod flag usage (#4417)
app/vmstorage: clarify the min value for `-retentionPeriod` flag

Co-authored-by: Roman Khavronenko <roman@victoriametrics.com>
2023-06-09 09:46:25 +02:00
Dmytro Kozlov
c7884f8686
app/{graphite,netstorage,prometheus}: fix graphite search tags api limits, remove redudant limit from SeriesHandler handler (#4352)
* app/{graphite,netstorage,prometheus}: fix graphite search tags api limits, remove unused limit from SeriesHandler handler,

* app/{graphite,netstorage,prometheus}: use search.maxTagValues for Graphite

* app/{graphite,netstorage,prometheus}: update CHANGELOG.md

* app/{graphite,netstorage,prometheus}: use own flags for Graphite API

* app/{graphite,netstorage,prometheus}: cleanup

* app/{graphite,netstorage,prometheus}: cleanup

* app/{graphite,netstorage,prometheus}: update docs

---------

Co-authored-by: Nikolay <nik@victoriametrics.com>
2023-06-02 14:34:04 +02:00
Artem Navoiev
39ba4fc1c4
update logo width in cluster doc to 300
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-05-16 15:33:15 -07:00
Alexander Marshalov
2e494e2375
fixed typos in documentation and commandline flags descriptions (#4275) 2023-05-10 09:50:41 +02:00
Artem Navoiev
ee1da35071 prepare static docs to migration
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-04-30 12:33:36 +02:00
Roman Khavronenko
eb746a4dab
Revert "http server: limit max concurrent requests (#4185)" (#4215)
This reverts commit 77f76371

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-04-27 13:02:47 +02:00
Roman Khavronenko
77f76371d0
http server: limit max concurrent requests (#4185)
* lib/httpserver: introduce `-http.maxConcurrentRequests` command-line flag

Introduce `-http.maxConcurrentRequests` command-line flag to protect
VM components from resource exhaustion during unexpected spikes of HTTP requests.
By default, the new flag's value is set to 0 which means no limits are applied.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/httpserver: mention http.maxConcurrentRequests in docs

Signed-off-by: hagen1778 <roman@victoriametrics.com>

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2023-04-24 14:52:06 +02:00
Aliaksandr Valialkin
e1211a1187
app/vmstorage: deprecate -bigMergeConcurrency command-line flag
Improperly configured -bigMergeConcurrency command-line flag usually leads to uncontrolled
growth of unmerged parts, which, in turn, increases CPU usage and query durations.

So it is better deprecating this flag. In rare cases -smallMergeConcurrency command-line flag
can be used instead for controlling the concurrency of background merges.
2023-04-13 20:40:24 -07:00
Aliaksandr Valialkin
ffdf430be0
app/vmselect/graphite: open source Graphite Render API 2023-03-31 23:25:04 -07:00
Aliaksandr Valialkin
46127b432d
lib/bytesutil: add -internStringDisableCache and -internStringCacheExpireDuration command-line flags
This commit is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/3872
2023-02-27 14:16:49 -08:00
Aliaksandr Valialkin
0d3f31f60e
lib/storage: follow-up for 39cdc546dd
- Use flag.Duration instead of flagutil.Duration for -snapshotCreateTimeout,
  since the flagutil.Duration is intended mostly for big durations, e.g. days, months and years,
  while the -snapshotCreateTimeout is usually smaller than one hour.
- Add links to https://docs.victoriametrics.com/#how-to-work-with-snapshots in docs/CHANGELOG.md,
  so readers could easily find the corresponding docs when reading the changelog.
- Properly remove all the created directories on unsuccessful attempt to create
  snapshot in Storage.CreateSnapshot().

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3551
2023-02-27 13:07:38 -08:00
Zakhar Bessarab
39cdc546dd
lib/storage: enhancements for snapshots process (#3873)
* lib/{fs,mergeset,storage}: skip `.must-remove.` dirs when creating snapshot (#3858)

* lib/{mergeset,storage}: add timeout configuration for snapshots creation, remove incomplete snapshots from storage

* docs: fix formatting

* app/vmstorage: add metrics to track status of snapshots

* app/vmstorage: use `vm_http_requests_total` metric for snapshot endpoints metrics, rename new flag to make name more clear

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vmstorage: update flag name in docs

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

* app/vmstorage: reflect new metrics names change in docs

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>

---------

Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-02-27 12:12:03 -08:00
Aliaksandr Valialkin
b4bf8dc2f0
docs: update -help output after e1c3267e34 2023-02-24 17:16:13 -08:00
Roman Khavronenko
e1c3267e34
vmselect/promql: check for deadline in count_values fn (#3806)
* vmselect/promql: check for deadline in `count_values` fn

`count_values` could be very slow during the data processing.
Checking for deadline between iterations supposed to reduce
probability of exceeding `search.maxQueryDuration`.

The change also adds a new trace record, which captures the time
spent in aggregation function. Before that, the trace for aggr funcs
could be confusing since it doesn't account for all the places where
time was spent.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* wip

---------

Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2023-02-24 16:59:26 -08:00
Aliaksandr Valialkin
a02cf92fd1
docs: update --help descriptions after recent changes 2023-02-23 19:02:27 -08:00
Alexander Marshalov
b6845951a5
fixed typo in dns+srv documentation (#3861) 2023-02-23 02:40:00 +01:00
Artem Navoiev
6b4ccc17c2
docs: fix typo OpentTSDB -> OpenTSDB (#3854)
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2023-02-22 11:24:26 +01:00
Aliaksandr Valialkin
776391917f
app/vmauth: improve load balancing by sending incoming requests to backends with the lowest number of concurrent requests
While at it, stop sending requests to unavailable backend for 3 seconds
before the next attempt. This should reduce the amounts of useless work
and the number of useless network packets when the backend is temporarily unavailable.
2023-02-11 00:30:31 -08:00
earthgecko
3a51a3bc42
Clarifications between standalone/cluster ingestion endpoints (#3771)
docs: clarifications between standalone/cluster ingestion endpoints

This is an attempt to make it a bit clearer to the user that the cluster version ingestion URLs are different from the standalone ones.  I have also changed the order of the list items to make it a bit clearer and hopefully stop the user simply inferring that `/prometheus/api/v1` is only related to Prometheus data.
2023-02-07 15:17:12 +01:00
Aliaksandr Valialkin
51ad94677c
docs: update security chapters after bd716d1b0c 2023-01-27 15:45:35 -08:00
Denys Holius
bd716d1b0c
Improving docs by adding additional security sections (#3713)
* docs/Cluster-VictoriaMetrics.md: adds security section

* docs/Quick-Start.md: adds Security recommendation section
2023-01-27 15:22:21 -08:00
Aliaksandr Valialkin
119010f7f2
docs/Cluster-VictoriaMetrics.md: move Naive_cluster_scheme.png from the docs/assets/images/ folder into docs/ folder and add Cluster-VictoriaMetrics_ prefix to the image name
The docs/assets folder should be used only for assets specific to docs generation at https://docs.victoriametrics.com, e.g. css, js and images.

All the other assets related to specific docs should be placed in the same folder as the corresponding *.md file.
These assets should have the same name prefix as the corresponding doc file name. This simplifies tracking the lifetime of these assets.
For example, if the doc is removed, it is very easy to remove all assets associated with it with a simple `rm -rf docs/doc-name*` command.

This also simplifies generating correct urls for doc-specific assets from both https://docs.victoriametrics.com
and from https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/docs/ - just refer to the asset name without any directory prefixes.
2023-01-27 10:54:27 -08:00
Aliaksandr Valialkin
558165521b
docs/Cluster-VictoriaMetrics.md: update command-line descriptions after ebebaecd94 2023-01-27 00:04:41 -08:00
Aliaksandr Valialkin
9fdd1a10c6
docs/Cluster-VictoriaMetrics.md: mention the -vmui.customDashboardsPath command-line flag at vmselect
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3322
2023-01-11 23:39:55 -08:00
Aliaksandr Valialkin
af263fe881
all: small improvements in error messages and command-line flag descriptions related to concurrency limiters 2023-01-07 00:11:44 -08:00
Aliaksandr Valialkin
c63755c316
lib/writeconcurrencylimiter: improve the logic behind -maxConcurrentInserts limit
Previously the -maxConcurrentInserts was limiting the number of established client connections,
which write data to VictoriaMetrics. Some of these connections could be idle.
Such connections do not consume big amounts of CPU and RAM, so there is a little sense in limiting
the number of such connections. So now the -maxConcurrentInserts command-line option
limits the number of concurrently executed insert requests, not including idle connections.

It is recommended removing -maxConcurrentInserts command-line option, since the default value
for this option should work good for most cases.
2023-01-06 22:20:19 -08:00
Aliaksandr Valialkin
f299d2ca1a
lib/vmselectapi: limit the number of concurrently executed requests
This should prevent from out of memory errors when big number of vmselect
nodes send many concurrent requests to vmstorage

The limit can be controlled at vmstorage via the following command-line flags:
- search.maxConcurrentRequests
- search.maxQueueDuration

See https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#resource-usage-limits
2023-01-06 22:11:34 -08:00
Aliaksandr Valialkin
fd175ad80b
docs: update -help outputs for vm* tools 2023-01-03 23:27:06 -08:00
Artem Navoiev
34c705988e fix broken links
Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2022-12-27 13:17:22 +02:00
Artem Navoiev
7d9c4bebc0
update links to grafana dashboards (#3534)
docs: update links to grafana dashboards

Signed-off-by: Artem Navoiev <tenmozes@gmail.com>
2022-12-25 17:36:20 +01:00
Roman Khavronenko
86dae56bd0
vmselect: support overriding of -search.latencyOffset (#3489)
support overriding of `-search.latencyOffset` value via
URL param `latency_offset`.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3481

Signed-off-by: hagen1778 <roman@victoriametrics.com>

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-16 16:54:57 -08:00
Aliaksandr Valialkin
5d30080555
lib/flagutil: support for TB and TiB suffixes for command-line flags, which accept byte sizes 2022-12-14 17:52:32 -08:00
Aliaksandr Valialkin
e0009ec466
docs/Cluster-VictoriaMetrics.md: mention the vm_storage_is_read_only metric, which can help debugging readonly mode at vmstorage 2022-12-14 09:31:28 -08:00
Aliaksandr Valialkin
3f4cb9a142
docs: sync with cluster branch after 97b41e727c 2022-12-10 02:32:24 -08:00
Aliaksandr Valialkin
a8b8e23d68
lib/promscrape: implement target-level and metric-level relabel debugging
Target-level debugging is performed by clicking the 'debug' link at the corresponding target
on either http://vmagent:8429/targets page or on http://vmagent:8428/service-discovery page.

Metric-level debugging is perfromed at http://vmagent:8429/metric-relabel-debug page.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3407

See https://docs.victoriametrics.com/vmagent.html#relabel-debug
2022-12-10 02:09:44 -08:00
Aliaksandr Valialkin
bce1c5d572
docs/Cluster-VictoriaMetrics.md: typo fix 2022-12-07 12:26:34 -08:00
Aliaksandr Valialkin
59430e4274
docs/Cluster-VictoriaMetrics.md: make docs-sync after 5de8330ce00adfc5ac794070d30a2617ddc14bf2 2022-12-07 12:02:38 -08:00
Aliaksandr Valialkin
35a3170d97
docs: document the addition of -storageNode.discoveryInterval command-line flag in VictoriaMetrics cluster enterprise
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3417
2022-12-06 19:57:06 -08:00
Roman Khavronenko
cd5c451ea3
docs: fix typo in cluster's README (#3430)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-12-01 16:24:16 +01:00