有奖捉虫:办公协同&微信生态&物联网文档专题 HOT
本文主要介绍 CTSDB 中常用查询及关键字的使用,并给出简单 CURL 示例。所有示例使用的 metric 结构如下所示:
{
"ctsdb_test" : {
"tags" : {
"region" : "string"
},
"time" : {
"name" : "timestamp",
"format" : "epoch_millis"
},
"fields" : {
"cpuUsage" : "float",
"diskUsage" : "string",
"dcpuUsage" : "integer"
},
"options" : {
"expire_day" : 7,
"refresh_interval" : "10s",
"number_of_shards" : 5,
"indexed_fields" : "cpuUsage"
}
}
}

组合查询

这里的组合查询即可以用于单查询也可用于复合查询。查询体中的 query 关键字通过 Query DSL(Domain Specific Language)来定义查询条件。 下文会先介绍如何构造过滤条件,再介绍怎样组合过滤条件和如何处理返回结果集。

常用过滤条件

1. Range

Range 即区间查询,Range 查询支持的字段类型包括 string、long、integer、short、double、float、date。Range 查询可包含的参数如下表所示:
参数名称
描述
gte
大于或等于
gt
大于
lte
小于或等于
lt
小于
说明
当 Range 查询涉及时间类型时,可通过 format 参数来指定时间格式。具体时间格式请参考 新建 metric
时间范围查询的 CURL 示例:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'
{
"query": {
"range": {
"timestamp": {
"gte": "01/01/2018",
"lte": "03/01/2018",
"format": "MM/dd/yyyy",
"time_zone":"+08:00"
}
}
}
}'
数字范围查询 CURL 示例:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'
{
"query": {
"range": {
"cpuUsage": {
"gte": 1.0,
"lte": 10.0
}
}
}
}'

2. Terms

Terms 关键字用于查询时匹配特定字段,其值需要用中括号包裹。
CURL 示例说明:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'
{
"query": {
"terms": {
"region": ["sh", "bj"]
}
}
}'

过滤条件组合

复合查询常使用 Bool 关键字来组合多个查询条件,常见的用于 Bool 查询的组合关键字有 filter(类似于 AND)、must_not(类似于 NOT)、should(类似于 OR)。为提高查询性能,请务必加上 time 字段的 range 查询,且 time 字段,不论查询时以何种方式写入,返回值统一为 epoch_millis 格式。query-bool-filter 的组合形式可显著提升查询性能,请务必采用这种查询方式。

1. AND 条件 CURL 示例说明

curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'
{
"query": {
"bool": {
"filter": [
{
"range": {
"timestamp": {
"format": "yyyy-MM-dd HH:mm:ss",
"gte": "2017-11-06 23:00:00",
"lt": "2018-03-06 23:05:00",
"time_zone":"+08:00"
}

}
},
{
"terms": {
"region": ["sh"]
}
},
{
"terms": {
"cpuUsage": ["2.0"]
}
}
]
}
},
"docvalue_fields": [
"cpuUsage",
"timestamp"
]
}'
说明
此查询条件类似于 timestamp>='2017-11-06 23:00:00' AND timestamp<'2018-03-06 23:05:00' AND region=sh AND cpuUsage=2.0

2. OR 条件 CURL 示例说明

curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'
{
"query": {
"bool": {
"filter": [
{
"range": {
"timestamp": {
"format": "yyyy-MM-dd HH:mm:ss",
"gte": "2017-11-06 23:00:00",
"lt": "2018-11-06 23:05:00",
"time_zone":"+08:00"
}
}
},
{
"term": {
"region": "gz"
}
}
],
"should": [
{
"terms": {
"cpuUsage": ["2.0"]
}
},
{
"terms": {
"cpuUsage": ["2.5"]
}
}
],
"minimum_should_match": 1
}
},
"docvalue_fields": [
"cpuUsage",
"timestamp"
]
}'
说明
此查询条件类似于 timestamp>='2017-11-06 23:00:00' AND timestamp<'2018-11-06 23:05:00' AND region='gz' AND (cpuUsage=2.0 or cpuUsage=2.5)。minimum_should_match 参数的意义在于设置 cpuUsage=2.0 和 cpuUsage=2.5 至少匹配的个数,系统默认 minimum_should_match 为0。

3. NOT 条件 CURL 示例说明

curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'
{
"query": {
"bool": {
"filter": [
{
"range": {
"timestamp": {
"format": "yyyy-MM-dd HH:mm:ss",
"gte": "2017-11-06 23:00:00",
"lt": "2018-11-06 23:05:00",
"time_zone":"+08:00"
}

}
},
{
"terms": {
"region": ["gz"]
}
}
],
"must_not": [
{
"terms": {
"cpuUsage": ["2.0"]
}
}
]
}
},
"docvalue_fields": [
"cpuUsage",
"timestamp"
]
}'
说明
此查询条件类似于 timestamp>=2017-11-06 23:00:00 AND timestamp<2018-11-06 23:05:00 AND region='gz' AND cpuUsage !=2.0

返回结果集处理

1. From/Size

通过设置 From 和 Size 关键字可对查询结果进行分页。From 关键字定义了查询结果中第一条数据的偏移量,Size 关键字配置返回结果的最大条数。From 系统默认为 0,Size 系统默认为 10,From 与 Size 的总和系统默认不能超过 65536。若想要更大的返回结果,请参考本文的 scroll 关键字查询。
CURL 示例说明:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'
{
"from": 0,
"size": 5,
"query": {
"bool": {
"filter": {
"range": {
"timestamp": {
"gte": "01/01/2018",
"lte": "03/01/2018",
"format": "dd/MM/yyyy",
"time_zone":"+08:00"
}
}
},
"must_not": {
"terms": {
"region": ["bj"]
}
}
}
}
}'

2. Scroll

Scroll 可以理解为关系型数据库里的 cursor,因此 scroll 并不适合用来做实时搜索,而更适用于后台批处理任务。 可以把 scroll 分为初始化和遍历两个阶段,初始化时将所有符合搜索条件的搜索结果缓存起来,类似于快照,在遍历时,从这个快照里取数据。注意,在 Scroll 初始化后对 metric 的插入、删除、更新都不会影响遍历结果。
初始化阶段可通过 size 关键字指定返回结果集的大小,size 默认值为10,最大值为65536;初始化阶段可通过 query 关键字指定查询条件;初始化阶段可通过 docvalue_fields 关键字指定返回字段。初始化阶段和遍历阶段都返回_scroll_id 字段,后续遍历可以通过指定前一次遍历返回的 _scroll_id 来进行新的遍历,直到返回结果为空;初始化和遍历两个阶段都可以通过指定 scroll 参数来设定遍历的上下文环境保留时间,过期后 _scroll_id 将变得无效。时间格式如下表所示:
格式
描述
d
days
h
hours
m
minutes
s
seconds
ms
milliseconds
micros
microseconds
nanos
nanoseconds
CURL 示例说明: scroll 初始化:
curl -u root:le201909 -H 'Content-Type:application/json' -X POST 172.xx.xx.4:9201/ctsdb_test/_search?scroll=1m -d'
{
"size":5,
"query": {
"bool": {
"filter": [
{
"terms": {
"region": ["gz"]
}
}
]
}
},
"docvalue_fields": [
"cpuUsage",
"region",
"timestamp"
]
}'
scroll 初始化返回:
{
"_scroll_id": "DnF1ZXJ5VGhlbkZldGNoAwAAAAAADrOFFm5YSEhnMjdnUWNPcndHS1k5Wjc3bHcAAAAAAAz_1RZiRkZTcGp4dFRXR18xMGtzSmhEUFJRAAAAAAAP5vQWOXFOR29lc0hROHFWMmFGTkVmSkxmZw==",
"took": 10641,
"timed_out": false,
"_shards": {
"total": 3,
"successful": 3,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value":1592072666,
"relation":"eq"
},
"max_score": 0.65708643,
"hits": [
{
"_index": "ctsdb_test@0_-1",
"_type": "doc",
"_id": "oyylNU0U65cZjByyt7sW_JmPPgAACY4Bh0",
"_score": 0.65708643,
"_routing": "354d14eb",
"fields": {
"region": [
"gz"
],
"cpuUsage": [
"2.0"
],
"timestamp": [
1509909300000
]
}
},
{
"_index": "ctsdb_test@0_-1",
"_type": "doc",
"_id": "oyykFN0yd1d9NDPfzjRdrJ8whQAACYsBqc",
"_score": 0.65708643,
"_routing": "14dd3277",
"fields": {
"region": [
"gz"
],
"cpuUsage": [
"1.8"
],
"timestamp": [
1509908340000
]
}
},
{
"_index": "ctsdb_test@0_-1",
"_type": "doc",
"_id": "oyylHLp9jl_3sF4N2rnh67h4SgAABHIBso",
"_score": 0.65708643,
"_routing": "1cba7d8e",
"fields": {
"region": [
"gz"
],
"cpuUsage": [
"2.5"
],
"timestamp": [
1509909720000
]
}
},
{
"_index": "ctsdb_test@0_-1",
"_type": "doc",
"_id": "oyylH2JsOKnFHGUinQ7jM-ZwkgAAAvcBso",
"_score": 0.65708643,
"_routing": "1f626c38",
"fields": {
"region": [
"gz"
],
"cpuUsage": [
"2.1"
],
"timestamp": [
1509909720000
]
}
},
{
"_index": "ctsdb_test@0_-1",
"_type": "doc",
"_id": "oyylHLp9jl_3sF4N2rnh67h4SgAABGsBso",
"_score": 0.65708643,
"_routing": "1cba7d8e",
"fields": {
"region": [
"gz"
],
"cpuUsage": [
"2.0"
],
"timestamp": [
1509909720000
]
}
}
]
}
}
scroll 遍历:
curl -u root:le201909 -H 'Content-Type:application/json' -X POST 172.xx.xx.4:9201/_search/scroll -d'
{
"scroll" : "1m",
"scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAADrOFFm5YSEhnMjdnUWNPcndHS1k5Wjc3bHcAAAAAAAz_1RZiRkZTcGp4dFRXR18xMGtzSmhEUFJRAAAAAAAP5vQWOXFOR29lc0hROHFWMmFGTkVmSkxmZw=="
}'
说明
此请求中的 scroll_id 是 scroll 初始化返回的 _scroll_id 值。而下一次遍历则需要将 scroll_id 参数调整为上一次遍历返回的 _scroll_id 值,即每次请求中的 scroll_id 参数是上一次请求返回的 _scroll_id 值,直到返回结果为空,遍历结束。
两次遍历返回的 _scroll_id 值可能相同,无法利用 _scroll_id 进行指定页跳转。

3. Sort

Sort 关键字主要用于对查询结果进行排序。排序方式有 asc 和 desc 两种,CTSDB 对于用户自定义字段的默认排序方式为 asc。排序模式有 min、max、sum、avg、median。其中 sum、avg、median 只适用于类型为 array 并存储数字的字段。
CURL 示例说明:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'
{
"query": {
"bool": {
"must": {
"range": {
"timestamp": {
"gte": "01/01/2018",
"lte": "03/01/2018",
"format": "MM/dd/yyyy",
"time_zone":"+08:00"
}
}
}
}
},
"sort": [
{
"cpuUsage": {
"order": "asc",
"mode": "min"
}
},
{
"timestamp": {
"order": "asc"
}
},
"diskUsage"
]
}'

4. docvalue_fields

docvalue_fields 关键词指定需要返回的字段名称,需要以数组的形式指定。
CURL 示例说明:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'
{
"query": {
"terms": {
"region": ["sh", "bj"]
}
},
"docvalue_fields": ["timestamp", "cpuUsage"]
}'

聚合查询

agg 关键字主要用于构造聚合查询。用户可到返回的 aggregations 字段取聚合结果。聚合返回字段说明详见下表。如果只关注聚合结果,请在查询时设置 size 参数为0。
字段名称
描述
hits
返回匹配的查询结果,其中 total 字段表示参与聚合的数据条数。里面的 hits 字段为数组结构,若无指定,则包含所有查询结果的前十条。hits 数组里的每个结果中包含_index(查询涉及到的 子表),若查询中指定了 docvalue_fields,则会返回 fields字段具体指明每个字段的值。
took
整个查询耗费的毫秒数。
_shards
参与查询的分片数。其中 total 标识总分片数,successful 标识执行成功的数量,failed 标识执行失败的数量,skipped 标识跳过执行的分片数。
timed_out
查询是否超时,取值有 false 和 true。
aggregations
聚合返回结果。
下面列举几种常用的聚合方式。

普通聚合

普通聚合需要指定聚合名称、聚合方式(常用的聚合方式有 min、max、avg、value_count、sum 等)和聚合所作用的字段。
CURL 示例说明:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'
{
"size":0,
"query": {
"terms": {
"region": ["sh", "bj"]
}
},
"aggs": {
"myname": {
"max": {
"field": "cpuUsage"
}
}
}
}'
说明
上例中对字段 cpuUsage 做 max 聚合(用户也可指定 min、avg 等),返回结果取别名 myname(用户可任意指定该名称)。
返回结果:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 20,
"successful": 20,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value":7,
"relation":"eq"
},
"max_score": 0,
"hits": []
},
"aggregations": {
"myname": {
"value": 4
}
}
}

terms 聚合

terms 聚合主要用于查询某个字段的所有唯一值以及该值的个数,用户可指定唯一值返回时的排序规则,以及返回结果数,并且可对参与聚合的数据字段进行模糊或者精确匹配,详情请参考如下示例,使用 filter_path 参数可定制返回结果字段,使用方法请参考 批量查询数据
CURL 示例说明: 请求:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search?filter_path=aggregations -d'
{
"aggs": {
"myname": {
"terms": {"field":"region"}
}
}
}'
返回:
{
"aggregations": {
"myname": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "sh",
"doc_count": 10
},
{
"key": "Motor_sports",
"doc_count": 6
},
{
"key": "gz",
"doc_count": 3
},
{
"key": "bj",
"doc_count": 2
},
{
"key": "cd",
"doc_count": 2
},
{
"key": "Winter_sports",
"doc_count": 1
},
{
"key": "water_sports",
"doc_count": 1
}
]
}
}
}
说明
上例表示查看 ctsdb_test 的 region 字段中所有的唯一值及其出现的次数。分析返回字段 aggregations 中的 buckets 字段发现,region 字段共有7种类型的值,分别是 sh、Motor_sports、gz、bj、cd、Winter_sports、water_sports, 并且返回字段通过 doc_count 指明了每种值的个数。
用户可通过 size 字段指定返回的唯一值的个数,例如某字段名为 region,其唯一值有7个,可通过设置 size 字段为5指定只返回前5个,具体请参考示例。
请求:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search?filter_path=aggregations -d'
{
"aggs": {
"myname": {
"terms": {
"field":"region",
"size":5
}
}
}
}'
返回:
{
"aggregations": {
"myname": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 2,
"buckets": [
{
"key": "sh",
"doc_count": 10
},
{
"key": "Motor_sports",
"doc_count": 6
},
{
"key": "gz",
"doc_count": 3
},
{
"key": "bj",
"doc_count": 2
},
{
"key": "cd",
"doc_count": 2
}
]
}
}
}
用户可对返回结果进行排序,详情请参考如下示例。 1.请求(返回结果按照唯一值的个数降序排列):
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search?filter_path=aggregations -d'
{
"aggs": {
"myname": {
"terms": {
"field":"region",
"order":{"_count":"desc"}
}
}
}
}'
返回:
{
"aggregations": {
"myname": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "sh",
"doc_count": 10
},
{
"key": "Motor_sports",
"doc_count": 6
},
{
"key": "gz",
"doc_count": 3
},
{
"key": "bj",
"doc_count": 2
},
{
"key": "cd",
"doc_count": 2
},
{
"key": "Winter_sports",
"doc_count": 1
},
{
"key": "water_sports",
"doc_count": 1
}
]
}
}
}
2.请求(返回结果按照唯一值的字母升序进行排列):
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search?filter_path=aggregations -d'
{
"aggs": {
"myname": {
"terms": {
"field":"region",
"order":{"_term":"asc"}
}
}
}
}'
返回:
{
"aggregations": {
"myname": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Motor_sports",
"doc_count": 6
},
{
"key": "Winter_sports",
"doc_count": 1
},
{
"key": "bj",
"doc_count": 2
},
{
"key": "cd",
"doc_count": 2
},
{
"key": "gz",
"doc_count": 3
},
{
"key": "sh",
"doc_count": 10
},
{
"key": "water_sports",
"doc_count": 1
}
]
}
}
}
用户可设置正则表达式对参与 terms 聚合的数据字段进行模糊匹配或者指定字段进行精确匹配,详情请参考示例。
模糊匹配示例: 请求(只针对 region 字段中有 sport 并且不是以 water_ 开头的数据进行聚合并返回结果):
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search?filter_path=aggregations -d'
{
"aggs": {
"myname": {
"terms": {
"field":"region",
"include" : ".*sport.*",
"exclude" : "water_.*"
}
}
}
}'
返回:
{
"aggregations": {
"myname": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Motor_sports",
"doc_count": 6
},
{
"key": "Winter_sports",
"doc_count": 1
}
]
}
}
}
精确匹配示例 请求(region_zone 指定值为"sh","bj","cd","gz"的 region 字段进行聚合,而 region_sports 指定值不为"sh","bj","cd","gz"的 region 字段进行聚合)
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search?filter_path=aggregations -d'
{
"aggs" : {
"region_zone" : {
"terms" : {
"field" : "region",
"include" : ["sh", "bj","cd","gz"]
}
},
"region_sports" : {
"terms" : {
"field" : "region",
"exclude" : ["sh", "bj","cd","gz"]
}
}
}
}'
返回:
{
"aggregations": {
"region_sport": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Motor_sports",
"doc_count": 6
},
{
"key": "Winter_sports",
"doc_count": 1
},
{
"key": "water_sports",
"doc_count": 1
}
]
},
"region_zone": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "sh",
"doc_count": 10
},
{
"key": "gz",
"doc_count": 3
},
{
"key": "bj",
"doc_count": 2
},
{
"key": "cd",
"doc_count": 2
}
]
}
}
}

Date Histogram 聚合

Date Histogram 主要对日期做直方图聚合。 CURL 示例说明:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'
{
"query": {
"terms": {
"region": ["sh", "bj"]
}
},
"aggs": {
"time_1h_agg": {
"date_histogram": {
"field": "timestamp",
"interval": "1h"
},
"aggs": {
"avgCpuUsage": {
"avg": {
"field": "cpuUsage"
}
}
}
}
}
}'
返回:
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 20,
"successful": 20,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value":6,
"relation":"eq"
},
"max_score": 0.074107975,
"hits": [
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2QtGR5xcjRaw2ETf-",
"_score": 0.074107975,
"_routing": "sh"
},
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2QtGR5xcjRaw2ETf_",
"_score": 0.074107975,
"_routing": "sh"
},
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2Q5Xr5xcjRaw2ETgA",
"_score": 0.074107975,
"_routing": "sh"
},
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2Q5Xr5xcjRaw2ETgB",
"_score": 0.074107975,
"_routing": "sh"
},
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2RGfF5xcjRaw2ETgC",
"_score": 0.074107975,
"_routing": "sh"
},
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2RGfF5xcjRaw2ETgD",
"_score": 0.074107975,
"_routing": "sh"
}
]
},
"aggregations": {
"time_1h_agg": {
"buckets": [
{
"key_as_string": "1520222400",
"key": 1520222400000,
"doc_count": 1,
"avgCpuUsage": {
"value": 2.5
}
},
{
"key_as_string": "1520226000",
"key": 1520226000000,
"doc_count": 0,
"avgCpuUsage": {
"value": null
}
},
{
"key_as_string": "1520229600",
"key": 1520229600000,
"doc_count": 0,
"avgCpuUsage": {
"value": null
}
},
{
"key_as_string": "1520233200",
"key": 1520233200000,
"doc_count": 0,
"avgCpuUsage": {
"value": null
}
},
{
"key_as_string": "1520236800",
"key": 1520236800000,
"doc_count": 0,
"avgCpuUsage": {
"value": null
}
},
{
"key_as_string": "1520240400",
"key": 1520240400000,
"doc_count": 0,
"avgCpuUsage": {
"value": null
}
},
{
"key_as_string": "1520244000",
"key": 1520244000000,
"doc_count": 0,
"avgCpuUsage": {
"value": null
}
},
{
"key_as_string": "1520247600",
"key": 1520247600000,
"doc_count": 1,
"avgCpuUsage": {
"value": 2
}
},
{
"key_as_string": "1520251200",
"key": 1520251200000,
"doc_count": 4,
"avgCpuUsage": {
"value": 2.25
}
}
]
}
}
}
说明
上例是对字段 cpuUsage 进行粒度为1小时的 date_histogram 聚合。返回结果中总的聚合名称为 time_1h_agg(用户可任意指定该名称),每个时间分段内的聚合名称为 avgCpuUsage(用户可任意指定该名称)。interval 字段可选的时间粒度有 year、quarter、 month、week、day、hour、minute、second,用户也可将时间粒度用时间单位来表示,例如1y代表1year,1h代表1hour。系统不支持小数粒度的时间单位,例如1.5h您需要转换为90min。

Percentiles 聚合

Percentiles 聚合即百分位聚合。百分位可以任意指定,系统默认的百分位是 1、5、25、50、75、95、99,可自行选择其他数值。
CURL 示例说明:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'
{
"query": {
"terms": {
"region": ["sh", "bj"]
}
},
"aggs":
{
"myname":
{
"percentiles":
{
"field": "cpuUsage",
"percents": [1,25,50,70,99]
}
}
}
}'
返回:
{
"took": 18,
"timed_out": false,
"_shards": {
"total": 20,
"successful": 20,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value":6,
"relation":"eq"
},
"max_score": 0.074107975,
"hits": [
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2QtGR5xcjRaw2ETf-",
"_score": 0.074107975,
"_routing": "sh"
},
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2QtGR5xcjRaw2ETf_",
"_score": 0.074107975,
"_routing": "sh"
},
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2Q5Xr5xcjRaw2ETgA",
"_score": 0.074107975,
"_routing": "sh"
},
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2Q5Xr5xcjRaw2ETgB",
"_score": 0.074107975,
"_routing": "sh"
},
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2RGfF5xcjRaw2ETgC",
"_score": 0.074107975,
"_routing": "sh"
},
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2RGfF5xcjRaw2ETgD",
"_score": 0.074107975,
"_routing": "sh"
}
]
},
"aggregations": {
"myname": {
"values": {
"1.0": 2,
"25.0": 2,
"50.0": 2.25,
"70.0": 2.5,
"99.0": 2.5
}
}
}
}
说明
上例中是对字段 cpuUsage 做 percentiles 聚合,选取的百分位为 1,25,50,70,99。聚合返回结果的别名是 myname(用户可任意指定该名称)。

Cardinality 聚合

Cardinality 聚合主要用于统计去重后的数量。默认情况下,当聚合结果小于等于3000时,Cardinality 返回的结果是精确值,大于3000时,为近似值。
CURL 示例说明:
curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.xx.xx.4:9201/ctsdb_test/_search -d'
{
"query": {
"terms": {
"region": ["sh", "bj"]
}
},
"aggs":
{
"myname":
{
"cardinality":
{
"field": "cpuUsage"
}
}
}
}'
返回:
{
"took": 15,
"timed_out": false,
"_shards": {
"total": 20,
"successful": 20,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value":6,
"relation":"eq"
},
"max_score": 0.074107975,
"hits": [
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2QtGR5xcjRaw2ETf-",
"_score": 0.074107975,
"_routing": "sh"
},
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2QtGR5xcjRaw2ETf_",
"_score": 0.074107975,
"_routing": "sh"
},
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2Q5Xr5xcjRaw2ETgA",
"_score": 0.074107975,
"_routing": "sh"
},
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2Q5Xr5xcjRaw2ETgB",
"_score": 0.074107975,
"_routing": "sh"
},
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2RGfF5xcjRaw2ETgC",
"_score": 0.074107975,
"_routing": "sh"
},
{
"_index": "ctsdb_test@1520092800000_3",
"_type": "doc",
"_id": "AWH2RGfF5xcjRaw2ETgD",
"_score": 0.074107975,
"_routing": "sh"
}
]
},
"aggregations": {
"myname": {
"value": 2
}
}
}
说明
上例的聚合是对字段 cpuUsage 做 cardinality 聚合,返回的聚合结果取别名 myname(用户可任意指定该名称)。