VRL
Vector Remap Language
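A language for defining the transformation rules that structure data in Vector
Good points
There is a REPL, which makes trial and error easy. Incredibly handy!
The error messages are remarkably helpful, so it is easy to see what went wrong
Most of the features I wanted are already there, so looking for a built-in function beats writing it yourself
Bad points
It is hard to get into at first; the code makes little sense unless you skim the syntax reference beforehand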
code:vrl
structured = parse_syslog!(.message)
. = merge(., structured)
Trying it out
Using https://hub.docker.com/r/timberio/vector makes this easy
code:shell
docker run -it --rm timberio/vector:latest-alpine
REPL
Start it with vector vrl
code:shell
$ docker run -it --rm timberio/vector:latest-alpine vrl
VVVVVVVV VVVVVVVVRRRRRRRRRRRRRRRRR LLLLLLLLLLL
V::::::V V::::::VR::::::::::::::::R L:::::::::L
V::::::V V::::::VR::::::RRRRRR:::::R L:::::::::L
V::::::V V::::::VRR:::::R R:::::RLL:::::::LL
V:::::V V:::::V R::::R R:::::R L:::::L
V:::::V V:::::V R::::R R:::::R L:::::L
V:::::V V:::::V R::::RRRRRR:::::R L:::::L
V:::::V V:::::V R:::::::::::::RR L:::::L
V:::::V V:::::V R::::RRRRRR:::::R L:::::L
V:::::V V:::::V R::::R R:::::R L:::::L
V:::::V:::::V R::::R R:::::R L:::::L
V:::::::::V R::::R R:::::R L:::::L LLLLLL
V:::::::V RR:::::R R:::::RLL:::::::LLLLLLLLL:::::L
V:::::V R::::::R R:::::RL::::::::::::::::::::::L
V:::V R::::::R R:::::RL::::::::::::::::::::::L
VVV RRRRRRRR RRRRRRRLLLLLLLLLLLLLLLLLLLLLLLL
VECTOR REMAP LANGUAGE
Welcome!
The CLI is running in REPL (Read-eval-print loop) mode.
To run the CLI in regular mode, add a program to your command.
VRL REPL commands:
help Learn more about VRL
next Load the next object or create a new one
prev Load the previous object
exit Terminate the program
Any other value is resolved to a VRL expression.
Try it out now by typing . and hitting enter to see the result.
$ log = s'192.168.102.197 - - [01/Nov/2022:10:10:25 +0900] "GET /redmine/projects/ HTTP/1.1" 502 182 "https://project.beproud.jp/redmine/projects/proj/issues" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:106.0) Gecko/20100101 Firefox/106.0"'
"192.168.102.197 - - [01/Nov/2022:10:10:25 +0900] \"GET /redmine/projects/ HTTP/1.1\" 502 182 \"https://project.beproud.jp/redmine/projects/proj/issues\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:106.0) Gecko/20100101 Firefox/106.0\""
$ parsed = parse_nginx_log(log, "combined")
error[E103]: unhandled fallible assignment
┌─ :1:10
│
1 │ parsed = parse_nginx_log(log, "combined")
│ -------- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
│ │ │
│ │ this expression is fallible
│ │ update the expression to be infallible
│ or change this to an infallible assignment:
│ parsed, err = parse_nginx_log(log, "combined")
│
= see documentation about error handling at https://errors.vrl.dev/#handling
= learn more about error code 103 at https://errors.vrl.dev/103
= see language documentation at https://vrl.dev
= try your code in the VRL REPL, learn more at https://vrl.dev/examples
$ parsed = parse_nginx_log!(log, "combined")
{ "agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:106.0) Gecko/20100101 Firefox/106.0", "client": "192.168.102.197", "method": "GET", "path": "/redmine/projects/", "protocol": "HTTP/1.1", "referer": "https://project.beproud.jp/redmine/projects/proj/issues", "request": "GET /redmine/projects/ HTTP/1.1", "size": 182, "status": 502, "timestamp": t'2022-11-01T01:10:25Z' }
$ .http = parsed
{ "agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:106.0) Gecko/20100101 Firefox/106.0", "client": "192.168.102.197", "method": "GET", "path": "/redmine/projects/", "protocol": "HTTP/1.1", "referer": "https://project.beproud.jp/redmine/projects/proj/issues", "request": "GET /redmine/projects/ HTTP/1.1", "size": 182, "status": 502, "timestamp": t'2022-11-01T01:10:25Z' }
$ .
{ "http": { "agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:106.0) Gecko/20100101 Firefox/106.0", "client": "192.168.102.197", "method": "GET", "path": "/redmine/projects/", "protocol": "HTTP/1.1", "referer": "https://project.beproud.jp/redmine/projects/proj/issues", "request": "GET /redmine/projects/ HTTP/1.1", "size": 182, "status": 502, "timestamp": t'2022-11-01T01:10:25Z' } }
$ flatten(.)
{ "http.agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:106.0) Gecko/20100101 Firefox/106.0", "http.client": "192.168.102.197", "http.method": "GET", "http.path": "/redmine/projects/", "http.protocol": "HTTP/1.1", "http.referer": "https://project.beproud.jp/redmine/projects/proj/issues", "http.request": "GET /redmine/projects/ HTTP/1.1", "http.size": 182, "http.status": 502, "http.timestamp": t'2022-11-01T01:10:25Z' }
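https://master--vector-project.netlify.app/docs/reference/vrl/examples/ has many transformation examples, so you can practice while reading them
I had been structuring Nginx logs with OpenTelemetry Collector; rewriting that in Vector VRL made it far easier to understand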
code:otel-collector.yml
filelog/nginx/access:
  # https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/filelogreceiver
  include:
    - /var/log/nginx/access.log
  start_at: end
  attributes:
    service: nginx
    env: prod
  operators:
    # https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/operators/regex_parser.md
    - type: regex_parser
      regex: '^(?P<client_ip>.+) - (?P<user>.+) \[(?P<time>.+)\] "(?P<method>[^ ]+) (?P<path>[^ ]+) (?P<version>[^\"]+)" (?P<status_code>\d{3}) (?P<read_bytes>\d+) "(?P<refer>[^\"]+)" "(?P<user_agent>[^\"]+)"$'
      parse_to: attributes.http
      timestamp:
        # https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/types/timestamp.md
        parse_from: attributes.http.time
        layout_type: strptime
        # https://github.com/observiq/ctimefmt/blob/3e07deba22cf7a753f197ef33892023052f26614/ctimefmt.go#L63
        layout: "%d/%b/%Y:%H:%M:%S %z"
      # https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/types/severity.md
      severity:
        parse_from: attributes.http.status_code
        preset: none
        mapping:
          info:
            - 2xx
            - 3xx
          warn: 4xx
          error: 5xx
    # https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/operators/remove.md
    - type: remove
      field: attributes.http.time
    # https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/pkg/stanza/docs/operators/uri_parser.md
    - type: uri_parser
      parse_from: attributes.http.path
code:vector.toml
[sources.nginx]
# https://vector.dev/docs/reference/configuration/sources/file/
type = "file"
include = ["/var/log/nginx/access.log", "/var/log/nginx/error.log"]
read_from = "end"
[transforms.modify_nginx]
type = "remap"
inputs = ["nginx"]
# https://vector.dev/docs/reference/configuration/transforms/remap/
source = '''
.http = parse_nginx_log(.message, "combined") ?? parse_nginx_log!(.message, "error")
path_parts = split(string(.http.path), "?", limit: 2)
.http.path = path_parts[0]
status = to_int!(.http.status)
.severity = if (status<400){"info"} else if (status<500){"warn"} else {"error"}
.timestamp = .http.timestamp || now()
. = flatten(.)
."http.query" = parse_query_string(to_string(path_parts[1]) || "")
."service" = "nginx"
'''
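How to use ?? and ! confused me at first
?? is the error coalescing operator: when the expression on its left fails, the expression on its right is evaluated instead, so recoveries can be chained
! is the abort-on-error form of a function call: the compiler stops asking for error handling, and if the call fails at runtime the program is aborted (the error is not suppressed)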
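The three ways of dealing with a fallible call can be compared side by side. This is a minimal sketch for the REPL, assuming a hypothetical .message field holding a JSON string; parse_json and parse_syslog are built-in functions, the sample value is made up.
code:vrl
# Hypothetical input for illustration
.message = s'{"status": 502}'
# 1. Capture the error: err is null on success and holds the error message on failure
parsed, err = parse_json(.message)
# 2. Abort on error: if parsing fails, the program stops here with a runtime error
parsed = parse_json!(.message)
# 3. Error coalescing: try JSON, then syslog, then fall back to an empty object
parsed = parse_json(.message) ?? parse_syslog(.message) ?? {}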
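I would expect there to be a function that maps an HTTP status_code to a severity, but I have not found one so far
parse_nginx_log() structures the log for you with no extra work, which is extremely convenient
My favorite is the flatten() function
Given the input {"http": {"status": 200, "path": "/foo/bar"}}
it produces the output {"http.status": 200, "http.path": "/foo/bar"}
On the side that receives the telemetry, the latter seems easier to work with than the former?
At least in Uptrace, filter conditions are hard to specify when the data is nested like the former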
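A minimal sketch of the flatten() behaviour described above, meant for pasting into the REPL. The event is the hypothetical input from the example, and the variable name status is only for illustration. After flattening, each key contains a literal dot, so it has to be quoted when accessed.
code:vrl
# Hypothetical event, same shape as the input above
. = {"http": {"status": 200, "path": "/foo/bar"}}
# Collapse nested keys into dot-joined top-level keys
. = flatten(.)
# The key is now the single string "http.status", so quote it in the path
status = ."http.status"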