app/vlselect/logsql: add an ability to delay returning matching logs from live tailing via offset query arg

By default the delay equals to 1 second.

While at it, document refresh_interval query arg at /select/logsql/tail endpoint.

Thanks to @Fusl for the idea and the initial implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7428

(cherry picked from commit a44787372f)
This commit is contained in:
Aliaksandr Valialkin 2024-11-08 16:17:41 +01:00 committed by hagen1778
parent a02d26e853
commit 2818ce29f8
No known key found for this signature in database
GPG key ID: E92986095E0DD614
3 changed files with 27 additions and 3 deletions

View file

@ -406,13 +406,20 @@ func ProcessLiveTailRequest(ctx context.Context, w http.ResponseWriter, r *http.
}
startOffset := startOffsetMsecs * 1e6
offsetMsecs, err := httputils.GetDuration(r, "offset", 1000)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
return
}
offset := offsetMsecs * 1e6
ctxWithCancel, cancel := context.WithCancel(ctx)
tp := newTailProcessor(cancel)
ticker := time.NewTicker(refreshInterval)
defer ticker.Stop()
end := time.Now().UnixNano()
end := time.Now().UnixNano() - offset
start := end - startOffset
doneCh := ctxWithCancel.Done()
flusher, ok := w.(http.Flusher)
@ -441,7 +448,7 @@ func ProcessLiveTailRequest(ctx context.Context, w http.ResponseWriter, r *http.
return
case <-ticker.C:
start = end - tailOffsetNsecs
end = time.Now().UnixNano()
end = time.Now().UnixNano() - offset
}
}
}

View file

@ -17,6 +17,7 @@ according to [these docs](https://docs.victoriametrics.com/victorialogs/quicksta
* FEATURE: [`join` pipe](https://docs.victoriametrics.com/victorialogs/logsql/#join-pipe): add an ability to add prefix to all the log field names from the joined query, by using `| join by (<by_fields>) (<query>) prefix "some_prefix"` syntax.
* FEATURE: [`_time` filter](https://docs.victoriametrics.com/victorialogs/logsql/#time-filter): allow specifying offset without time range. For example, `_time:offset 1d` matches all the logs until `now-1d` in the [`_time` field](https://docs.victoriametrics.com/victorialogs/keyconcepts/#time-field). This is useful when building graphs for time ranges with some offset in the past.
* FEATURE: [`/select/logsql/tail` HTTP endpoint](): support for `offset` query arg, which can be used for delayed emission of matching logs during live tailing. Thanks to @Fusl for the initial idea and implementation in [this pull request](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7428).
* BUGFIX: [HTTP querying APIs](https://docs.victoriametrics.com/victorialogs/querying/#http-api): properly take into account the `end` query arg when calculating time range for [`_time:duration` filter](https://docs.victoriametrics.com/victorialogs/logsql/#time-filter). Previously the `_time:duration` filter was treated as `_time:[now-duration, now)`, while it should be treated as `_time:[end-duration, end)`.

View file

@ -148,13 +148,29 @@ The `<query>` must conform the following rules:
across multiple streams.
Live tailing supports returning historical logs, which were ingested into VictoriaLogs before the start of live tailing. Pass `start_offset=<d>` query
arg to `/select/logsql/tail` where `<d>` is the duration for returning historical logs. For example, the following request returns historical logs
arg to `/select/logsql/tail` where `<d>` is the duration for returning historical logs. For example, the following command returns historical logs
which were ingested into VictoriaLogs during the last hour, before starting live tailing:
```sh
curl -N http://localhost:9428/select/logsql/tail -d 'query=*' -d 'start_offset=1h'
```
Live tailing delays delivering new logs for one second, so they could be properly delivered from log collectors to VictoriaLogs.
This delay can be changed via `offset` query arg. For example, the following command delays delivering new logs for 30 seconds:
```sh
curl -N http://localhost:9428/select/logsql/tail -d 'query=*' -d 'offset=30s'
```
Live tailing checks for new logs every second. The frequency for the check can be changed via `refresh_interval` query arg.
For example, the following command instructs live tailing to check for new logs every 10 seconds:
```sh
curl -N http://localhost:9428/select/logsql/tail -d 'query=*' -d 'refresh_interval=10s'
```
It isn't recommended setting too low value for `refresh_interval` query arg, since this may increase load on VictoriaLogs without measurable benefits.
**Performance tip**: live tailing works the best if it matches newly ingested logs at relatively slow rate (e.g. up to 1K matching logs per second),
e.g. it is optimized for the case when real humans inspect the output of live tailing in the real time. If live tailing returns logs at too high rate,
then it is recommended adding more specific [filters](https://docs.victoriametrics.com/victorialogs/logsql/#filters) to the `<query>`, so it matches less logs.