I have a large (20 GB) CSV file in the following format:
date,ip,dev_type,env,time,cpu_usage
2015-11-09,10.241.121.172,M2,production,11:01,8
2015-11-09,10.241.121.172,M2,production,11:02,9
2015-11-09,10.241.121.243,C1,preproduction,11:01,4
2015-11-09,10.241.121.243,C1,preproduction,11:02,8
2015-11-10,10.241.121.172,M2,production,11:01,3
2015-11-10,10.241.121.172,M2,production,11:02,9
2015-11-10,10.241.121.243,C1,preproduction,11:01,4
2015-11-10,10.241.121.243,C1,preproduction,11:02,8
I imported it into Elasticsearch, where each document looks like this:
{
"_index": "cpuusage",
"_type": "logs",
"_id": "AVFOkMS7Q4jUWMFNfSrZ",
"_score": 1,
"_source": {
"date": "2015-11-10",
"ip": "10.241.121.172",
"dev_type": "M2",
"env": "production",
"time": "11:02",
"cpu_usage": "9"
},
"fields": {
"date": [
1447113600000
]
}
}
...
So, when I find the maximum cpu_usage for each ip on each day, how can I output all the fields (date, ip, dev_type, env, cpu_usage) of the matching document?
curl -XGET 'localhost:9200/cpuusage/_search?pretty' -d '{
"size": 0,
"aggs": {
"by_date": {
"date_histogram": {
"field": "date",
"interval": "day"
},
"aggs" : {
"genders" : {
"terms" : {
"field" : "ip",
"size": 100000,
"order" : { "_count" : "asc" }
},
"aggs" : {
"cpu_usage" : { "max" : { "field" : "cpu_usage" } }
}
}
}
}
}
}'
----output ----
"aggregations" : {
"by_date" : {
"buckets" : [ {
"key_as_string" : "2015-11-09T00:00:00.000Z",
"key" : 1447027200000,
"doc_count" : 4,
"genders" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [ {
"key" : "10.241.121.172",
"doc_count" : 2,
"cpu_usage" : {
"value" : 9.0
}
}, {
"key" : "10.241.121.243",
"doc_count" : 2,
"cpu_usage" : {
"value" : 8.0
}
} ]
}
},
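To make the desired result concrete, here is a small sketch in plain Python over the sample rows from the question. It is not an Elasticsearch query; it only spells out what "the full row with the maximum cpu_usage per day and ip" means. The names `SAMPLE` and `max_rows` are made up for this illustration.

```python
import csv
import io

# The sample rows from the question, copied verbatim.
SAMPLE = """date,ip,dev_type,env,time,cpu_usage
2015-11-09,10.241.121.172,M2,production,11:01,8
2015-11-09,10.241.121.172,M2,production,11:02,9
2015-11-09,10.241.121.243,C1,preproduction,11:01,4
2015-11-09,10.241.121.243,C1,preproduction,11:02,8
2015-11-10,10.241.121.172,M2,production,11:01,3
2015-11-10,10.241.121.172,M2,production,11:02,9
2015-11-10,10.241.121.243,C1,preproduction,11:01,4
2015-11-10,10.241.121.243,C1,preproduction,11:02,8
"""


def max_rows(csv_text):
    """Return, for every (date, ip) pair, the whole row with the highest cpu_usage."""
    best = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["date"], row["ip"])
        # Keep the first row seen for a key, then replace it only on a strictly higher value.
        if key not in best or int(row["cpu_usage"]) > int(best[key]["cpu_usage"]):
            best[key] = row
    return list(best.values())


for row in max_rows(SAMPLE):
    print(row["date"], row["ip"], row["dev_type"], row["env"], row["cpu_usage"])
```

The max aggregation in the query above yields only the number for each bucket; the remaining columns of the winning row are what is missing from the output.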
Posted on 2015-11-28 17:15:56
You can do this with a top_hits aggregation.
Try this:
{
"size": 0,
"aggs": {
"by_date": {
"date_histogram": {
"field": "date",
"interval": "day"
},
"aggs": {
"genders": {
"terms": {
"field": "ip",
"size": 100000,
"order": {
"_count": "asc"
}
},
"aggs": {
"cpu_usage": {
"max": {
"field": "cpu_usage"
}
},
"include_source": {
"top_hits": {
"size": 1,
"sort": [ { "cpu_usage": { "order": "desc" } } ],
"_source": {
"include": [
"date", "ip", "dev_type", "env", "cpu_usage"
]
}
}
}
}
}
}
}
}
}
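In the response, each ip bucket then carries an include_source object whose hits.hits list holds the selected document. A minimal sketch of reading it in Python follows, assuming the response has already been parsed into a dict (for example with json.loads) and that the aggregation names by_date, genders and include_source are the ones used in the query above. The function name `rows_from_response` and the sample dict are hypothetical, written by hand for illustration, not real output.

```python
def rows_from_response(response):
    """Collect the _source of the first top hit in every (day, ip) bucket."""
    rows = []
    for day in response["aggregations"]["by_date"]["buckets"]:
        for ip_bucket in day["genders"]["buckets"]:
            hits = ip_bucket["include_source"]["hits"]["hits"]
            if hits:  # a bucket can in principle come back without hits
                rows.append(hits[0]["_source"])
    return rows


# Hand-written sample shaped like a search response (illustrative only).
sample = {
    "aggregations": {
        "by_date": {
            "buckets": [
                {
                    "key_as_string": "2015-11-09T00:00:00.000Z",
                    "genders": {
                        "buckets": [
                            {
                                "key": "10.241.121.172",
                                "cpu_usage": {"value": 9.0},
                                "include_source": {
                                    "hits": {
                                        "hits": [
                                            {
                                                "_source": {
                                                    "date": "2015-11-09",
                                                    "ip": "10.241.121.172",
                                                    "dev_type": "M2",
                                                    "env": "production",
                                                    "cpu_usage": "9",
                                                }
                                            }
                                        ]
                                    }
                                },
                            }
                        ]
                    },
                }
            ]
        }
    }
}

for row in rows_from_response(sample):
    print(row)
```

The first hit is the document with the highest cpu_usage only if the top_hits aggregation sorts on cpu_usage in descending order; without a sort, top_hits orders by score, which says nothing about cpu_usage here.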
Does this help?
https://stackoverflow.com/questions/33973478