app/vlselect/logsql: add an ability to delay returning matching logs from live tailing via offset query arg

By default the delay equals 1 second.

While at it, document refresh_interval query arg at /select/logsql/tail endpoint.

Thanks to @Fusl for the idea and the initial implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7428
Aliaksandr Valialkin 2024-11-08 16:17:41 +01:00
parent e5537bc64d
commit a44787372f
3 changed files with 27 additions and 3 deletions

@@ -406,13 +406,20 @@ func ProcessLiveTailRequest(ctx context.Context, w http.ResponseWriter, r *http.
 	}
 	startOffset := startOffsetMsecs * 1e6
+	offsetMsecs, err := httputils.GetDuration(r, "offset", 1000)
+	if err != nil {
+		httpserver.Errorf(w, r, "%s", err)
+		return
+	}
+	offset := offsetMsecs * 1e6
 	ctxWithCancel, cancel := context.WithCancel(ctx)
 	tp := newTailProcessor(cancel)
 	ticker := time.NewTicker(refreshInterval)
 	defer ticker.Stop()
-	end := time.Now().UnixNano()
+	end := time.Now().UnixNano() - offset
 	start := end - startOffset
 	doneCh := ctxWithCancel.Done()
 	flusher, ok := w.(http.Flusher)
@@ -441,7 +448,7 @@ func ProcessLiveTailRequest(ctx context.Context, w http.ResponseWriter, r *http.
 			return
 		case <-ticker.C:
 			start = end - tailOffsetNsecs
-			end = time.Now().UnixNano()
+			end = time.Now().UnixNano() - offset
 		}
 	}
 }
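
The two hunks above shift the end of each queried time range back by `offset`. The following is a minimal standalone sketch of that arithmetic, not the code from the commit: the `window` type and the `firstWindow` and `nextWindow` helpers are hypothetical names, and the overlap value (called `tailOffsetNsecs` in the diff) is assumed to be zero because its real value is not shown here.

```go
package main

import (
	"fmt"
	"time"
)

// window is a hypothetical type for this sketch: the time range,
// in Unix nanoseconds, queried on a single refresh tick.
type window struct {
	start, end int64
}

// firstWindow mirrors the initial computation in the diff:
// end lags the current time by offset, and start reaches
// startOffset further back.
func firstWindow(nowNsecs, offsetNsecs, startOffsetNsecs int64) window {
	end := nowNsecs - offsetNsecs
	return window{start: end - startOffsetNsecs, end: end}
}

// nextWindow mirrors the ticker branch: the new range starts
// overlapNsecs before the previous end and ends offset behind
// the current time.
func nextWindow(prev window, nowNsecs, offsetNsecs, overlapNsecs int64) window {
	return window{start: prev.end - overlapNsecs, end: nowNsecs - offsetNsecs}
}

func main() {
	offset := int64(time.Second) // the default delay of 1 second
	overlap := int64(0)          // assumed value, for illustration only
	now := time.Now().UnixNano()

	w := firstWindow(now, offset, 0)
	fmt.Println("first window length:", time.Duration(w.end-w.start))

	// Simulate the next tick, one second later.
	w = nextWindow(w, now+int64(time.Second), offset, overlap)
	fmt.Println("next window length:", time.Duration(w.end-w.start))
}
```

Under these assumptions every window has the same length as before the change; only its position moves `offset` into the past, which gives late-arriving logs time to be ingested before they are queried.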

@@ -17,6 +17,7 @@ according to [these docs](https://docs.victoriametrics.com/victorialogs/quicksta
 * FEATURE: [`join` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#join-pipe): add an ability to add prefix to all the log field names from the joined query, by using `| join by (<by_fields>) (<query>) prefix "some_prefix"` syntax.
 * FEATURE: [`_time` filter](https://docs.victoriametrics.com/victorialogs/logsql/#time-filter): allow specifying offset without time range. For example, `_time:offset 1d` matches all the logs until `now-1d` in the [`_time` field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#time-field). This is useful when building graphs for time ranges with some offset in the past.
+* FEATURE: [`/select/logsql/tail` HTTP endpoint](): support for `offset` query arg, which can be used for delayed emission of matching logs during live tailing. Thanks to @Fusl for the initial idea and implementation in [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7428).
 * BUGFIX: [HTTP querying APIs](https://docs.victoriametrics.com/victorialogs/querying/#http-api): properly take into account the `end` query arg when calculating time range for [`_time:duration` filter](https://docs.victoriametrics.com/victorialogs/logsql/#time-filter). Previously the `_time:duration` filter was treated as `_time:[now-duration, now)`, while it should be treated as `_time:[end-duration, end)`.

@@ -148,13 +148,29 @@ The `<query>` must conform the following rules:
 across multiple streams.
 Live tailing supports returning historical logs, which were ingested into VictoriaLogs before the start of live tailing. Pass `start_offset=<d>` query
-arg to `/select/logsql/tail` where `<d>` is the duration for returning historical logs. For example, the following request returns historical logs
+arg to `/select/logsql/tail` where `<d>` is the duration for returning historical logs. For example, the following command returns historical logs
 which were ingested into VictoriaLogs during the last hour, before starting live tailing:
 ```sh
 curl -N http://localhost:9428/select/logsql/tail -d 'query=*' -d 'start_offset=1h'
 ```
+Live tailing delays delivering new logs for one second, so they could be properly delivered from log collectors to VictoriaLogs.
+This delay can be changed via `offset` query arg. For example, the following command delays delivering new logs for 30 seconds:
+```sh
+curl -N http://localhost:9428/select/logsql/tail -d 'query=*' -d 'offset=30s'
+```
+Live tailing checks for new logs every second. The frequency for the check can be changed via `refresh_interval` query arg.
+For example, the following command instructs live tailing to check for new logs every 10 seconds:
+```sh
+curl -N http://localhost:9428/select/logsql/tail -d 'query=*' -d 'refresh_interval=10s'
+```
+It isn't recommended setting too low value for `refresh_interval` query arg, since this may increase load on VictoriaLogs without measurable benefits.
 **Performance tip**: live tailing works the best if it matches newly ingested logs at relatively slow rate (e.g. up to 1K matching logs per second),
 e.g. it is optimized for the case when real humans inspect the output of live tailing in the real time. If live tailing returns logs at too high rate,
 then it is recommended adding more specific [filters](https://docs.victoriametrics.com/victorialogs/logsql/#filters) to the `<query>`, so it matches less logs.
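
The `offset` and `refresh_interval` args documented in this hunk can also be set from a Go program. The sketch below only builds the form body that the `curl -d` flags would send. `tailForm` is a hypothetical helper, not part of any VictoriaLogs client library. Formatting durations as whole seconds is an assumption chosen to match the `30s` and `10s` values in the examples.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
	"time"
)

// tailForm is a hypothetical helper for this sketch. It builds the
// form body for /select/logsql/tail from the query args documented
// in the diff. Durations are rounded down to whole seconds.
func tailForm(query string, offset, refreshInterval time.Duration) url.Values {
	secs := func(d time.Duration) string {
		return strconv.FormatInt(int64(d/time.Second), 10) + "s"
	}
	v := url.Values{}
	v.Set("query", query)
	v.Set("offset", secs(offset))
	v.Set("refresh_interval", secs(refreshInterval))
	return v
}

func main() {
	form := tailForm("error", 30*time.Second, 10*time.Second)
	// Values.Encode sorts keys, so the printed order is stable.
	fmt.Println(form.Encode())
}
```

The resulting values could be passed to `http.PostForm` against `http://localhost:9428/select/logsql/tail`. The response is a long-lived stream, so it would need to be read incrementally instead of being buffered in full.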