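Scenario
Xiao Wang collects logs in JSON format into Cloud Log Service (CLS). Two cases need handling:
1. The JSON is nested across multiple levels. Xiao Wang wants to extract the user and App fields, where user is a second-level nested field.
2. The JSON log contains an array, and each element of the array needs to be split out into its own log.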
Raw logs
[{"content": {"App": "App-1","start_time": "2021-10-14T02:15:08.221","resonsebody": {"method": "GET","user": "Tom"},"response_code_details": "3000","bytes_sent": 69}},{"content": {"App": "App-2","start_time": "2222-10-14T02:15:08.221","resonsebody": {"method": "POST","user": "Jerry"},"response_code_details": "2222","bytes_sent": 1}}]
{"timestamp": 1732099684144000,"topic": "log-containers","records": [{"category": "kube-request","log": "{\"requestID\":\"12345\",\"stage\":\"Complete\"}"},{"category": "db-request","log": "{\"requestID\":\"67890\",\"stage\":\"Response\"}"}]}
Processing results
[{"App":"App-1","user":"Tom"},{"App":"App-2","user":"Jerry"}]
[{"category":"kube-request","requestID":"12345","stage":"Complete","timestamp":"1732099684144000","topic":"log-containers"},{"category":"db-request","requestID":"67890","stage":"Response","timestamp":"1732099684144000","topic":"log-containers"}]
DSL processing functions
//Use the ext_json function to extract structured data from the JSON; by default it flattens all fields
ext_json("content")
//Drop the content field
fields_drop("content")
//Drop the unneeded fields bytes_sent, method, response_code_details, start_time
fields_drop("bytes_sent","method","response_code_details","start_time")
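To make the flatten-then-drop steps concrete, the following Python sketch approximates the effect of this pipeline on the first raw log. It is an illustration only and does not call CLS: the helper names `flatten_leaves` and `extract_app_and_user` are hypothetical, and the assumption that flattening keeps only leaf key/value pairs is inferred from the processing result shown earlier, not from the ext_json implementation.

```python
import json

# Fields removed by the second fields_drop call in the DSL.
DROPPED = ("bytes_sent", "method", "response_code_details", "start_time")


def flatten_leaves(obj, out=None):
    """Copy every leaf key/value pair of a nested dict into one flat dict."""
    if out is None:
        out = {}
    for key, value in obj.items():
        if isinstance(value, dict):
            flatten_leaves(value, out)  # descend; the nested key itself is not kept
        else:
            out[key] = value
    return out


def extract_app_and_user(log):
    """Rough analogue of ext_json("content") followed by the two fields_drop calls."""
    fields = {k: v for k, v in log.items() if k != "content"}  # drop content
    fields.update(flatten_leaves(log["content"]))              # lift its leaves
    for name in DROPPED:
        fields.pop(name, None)
    return fields


raw = """
[{"content": {"App": "App-1", "start_time": "2021-10-14T02:15:08.221",
              "resonsebody": {"method": "GET", "user": "Tom"},
              "response_code_details": "3000", "bytes_sent": 69}},
 {"content": {"App": "App-2", "start_time": "2222-10-14T02:15:08.221",
              "resonsebody": {"method": "POST", "user": "Jerry"},
              "response_code_details": "2222", "bytes_sent": 1}}]
"""

results = [extract_app_and_user(log) for log in json.loads(raw)]
print(json.dumps(results))
```

Because the flattening discards the intermediate resonsebody key, user ends up at the top level next to App, which is why the DSL never needs to drop resonsebody explicitly.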
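//Split the array into separate logs; this yields 2 logs
log_split_jsonarray_jmes("records")
//Drop the original records field
fields_drop("records")
//Expand the key-value pairs inside log
ext_json("log")
//Drop the original log field
fields_drop("log")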
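The split-then-expand steps for the array case can be approximated in plain Python as follows. This is a sketch under stated assumptions and does not call CLS: `split_records` and `expand_json_field` are hypothetical helper names, the sketch assumes (as the processing result suggests) that splitting copies the shared top-level fields and each element's keys into every new log, and values are converted to strings only to match the string-typed fields in that result.

```python
import json


def split_records(log, array_field="records"):
    """Yield one log per array element, carrying over the shared top-level fields."""
    shared = {k: v for k, v in log.items() if k != array_field}
    for element in log[array_field]:
        yield {**shared, **element}


def expand_json_field(log, field="log"):
    """Parse a JSON string field, lift its keys to the top level, drop the field."""
    expanded = {k: v for k, v in log.items() if k != field}
    expanded.update(json.loads(log[field]))
    return expanded


raw = {
    "timestamp": 1732099684144000,
    "topic": "log-containers",
    "records": [
        # Each log value is itself a JSON document serialized as a string.
        {"category": "kube-request",
         "log": json.dumps({"requestID": "12345", "stage": "Complete"})},
        {"category": "db-request",
         "log": json.dumps({"requestID": "67890", "stage": "Response"})},
    ],
}

results = [
    {k: str(v) for k, v in expand_json_field(entry).items()}
    for entry in split_records(raw)
]
print(json.dumps(results))
```

Splitting before expanding matters here: the log string only becomes reachable as a top-level field once each array element has been turned into its own log.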