常用查询示例

最近更新时间:2019-10-18 16:32:51

此篇文章主要介绍 CTSDB 中常用查询及关键字的使用,并给出简单 CURL 示例。所有示例使用的 metric 结构如下所示:

    {
        "ctsdb_test" : {
          "tags" : {
            "region" : "string"
          },
          "time" : {
            "name" : "timestamp",
            "format" : "epoch_millis"
          },
          "fields" : {
            "cpuUsage" : "float",
            "diskUsage" : "string",
            "dcpuUsage" : "integer"
          },
          "options" : {
            "expire_day" : 7,
            "refresh_interval" : "10s",
            "number_of_shards" : 5
          }
        }
    }

组合查询

这里的组合查询即可以用于单查询也可用于复合查询。查询体中的 query 关键字通过 Query DSL(Domain Specific Language) 来定义查询条件。下文会先介绍如何构造过滤条件再介绍怎样组合过滤条件和如何处理返回结果集。

常用过滤条件

1. Range

Range 即区间查询,Range 查询支持的字段类型包括 string、long、integer、short、double、float、date。Range 查询可包含的参数如下表所示:

参数名称 描述
gte 大于或等于
gt 大于
lte 小于或等于
lt 小于
说明:

当 Range 查询涉及时间类型时,可通过 format 参数来指定时间格式。具体时间格式请参考 新建 metric

时间范围查询的 CURL 示例:

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
    {
        "query": {
            "range": {
                "timestamp": {
                    "gte": "01/01/2018",
                    "lte": "03/01/2018",
                    "format": "MM/dd/yyyy",
                    "time_zone":"+08:00"
                  }
              }
          }
     }'

数字范围查询 CURL 示例:

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
    {
        "query": {
            "range": {
                "cpuUsage": {
                    "gte": 1.0,
                    "lte": 10.0
                }
            }
        }
    }'  

2. Terms

Terms 关键字用于查询时匹配特定字段,其值需要用中括号包裹。

CURL 示例说明:

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
    { 
        "query": {
            "terms": {
                "region": ["sh", "bj"]
            }
        }
    }'

过滤条件组合

复合查询常使用 Bool 关键字来组合多个查询条件,常见的用于 Bool 查询的组合关键字有 filter(类似于 AND)、must_not(类似于 NOT)、should(类似于 OR)。为提高查询性能,请务必加上 time 字段的 range 查询,且 time 字段,不论查询时以何种方式写入,返回值统一为 epoch_millis 格式。query-bool-filter 的组合形式可显著提升查询性能,请务必采用这种查询方式。

1. AND 条件 CURL 示例说明

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
    {
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "timestamp": {
                  "format": "yyyy-MM-dd HH:mm:ss",
                  "gte": "2017-11-06 23:00:00",
                  "lt": "2018-03-06 23:05:00",
                  "time_zone":"+08:00"
                }

              }
            },
            {
              "terms": {
                "region": ["sh"]
              }
            },
            {
              "terms": {
                "cpuUsage": ["2.0"]
              }
            }
          ]
        }
      },
       "docvalue_fields": [
        "cpuUsage",
        "timestamp"
        ]
    }'
说明:

此查询条件类似于 timestamp>='2017-11-06 23:00:00' AND timestamp<'2018-03-06 23:05:00' AND region=sh AND cpuUsage=2.0。

2. OR 条件 CURL 示例说明

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
    {
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "timestamp": {
                  "format": "yyyy-MM-dd HH:mm:ss",
                  "gte": "2017-11-06 23:00:00",
                  "lt": "2018-11-06 23:05:00",
                  "time_zone":"+08:00"
                }
              }
            },
            {
              "term": {
                "region": "gz"
              }
            }
          ],
          "should": [
            {
              "terms": {
                "cpuUsage": ["2.0"]
              }
            },
            {
              "terms": {
                "cpuUsage": ["2.5"]
              }
            }
          ],
          "minimum_should_match": 1
        }
      },
      "docvalue_fields": [
        "cpuUsage",
        "timestamp"
      ]
    }'
说明:

此查询条件类似于 timestamp>='2017-11-06 23:00:00' AND timestamp<'2018-11-06 23:05:00' AND region='gz' AND (cpuUsage=2.0 or cpuUsage=2.5)。minimum_should_match 参数的意义在于设置 cpuUsage=2.0 和 cpuUsage=2.5 至少匹配的个数,系统默认 minimum_should_match 为0。

3.NOT 条件 CURL 示例说明

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
    {
      "query": {
        "bool": {
          "filter": [
                    {
              "range": {
                "timestamp": {
                  "format": "yyyy-MM-dd HH:mm:ss",
                  "gte": "2017-11-06 23:00:00",
                  "lt": "2018-11-06 23:05:00",
                  "time_zone":"+08:00"
                }

              }
            },
            {
              "terms": {
                "region": ["gz"]
              }
            }
          ],
          "must_not": [
           {
              "terms": {
                "cpuUsage": ["2.0"]
              }
            }
          ]
        }
      },
      "docvalue_fields": [
        "cpuUsage",
        "timestamp"
          ]
      }'
说明:

此查询条件类似于 timestamp>=2017-11-06 23:00:00 AND timestamp<2018-11-06 23:05:00 AND region='gz' AND cpuUsage !=2.0。

返回结果集处理

1. From/Size

通过设置 From 和 Size 关键字可对查询结果进行分页。From 关键字定义了查询结果中第一条数据的偏移量,Size 关键字配置返回结果的最大条数。From 系统默认为 0,Size 系统默认为 10,From 与 Size 的总和系统默认不能超过 65536。若想要更大的返回结果,请参考本文的 scroll 关键字查询。

CURL 示例说明:

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
    {
        "from": 0,
        "size": 5,
        "query": {
            "bool": {
                "filter": {
                    "range": {
                        "timestamp": {
                            "gte": "01/01/2018",
                            "lte": "03/01/2018",
                            "format": "dd/MM/yyyy",
                            "time_zone":"+08:00"
                        }
                    }
                },
                "must_not": {
                    "terms": {
                        "region": ["bj"]
                    }
                }
            }
        }
    }'

2. Scroll

Scroll 可以理解为关系型数据库里的 cursor,因此 scroll 并不适合用来做实时搜索,而更适用于后台批处理任务。
可以把 scroll 分为初始化和遍历两个阶段,初始化时将所有符合搜索条件的搜索结果缓存起来,类似于快照,在遍历时,从这个快照里取数据。注意,在 Scroll 初始化后对 metric 的插入、删除、更新都不会影响遍历结果。
初始化阶段可通过 size 关键字指定返回结果集的大小,size 默认值为10,最大值为65536;初始化阶段可通过 query 关键字指定查询条件;初始化阶段可通过 docvalue_fields 关键字指定返回字段。初始化阶段和遍历阶段都返回_scroll_id 字段,后续遍历可以通过指定前一次遍历返回的 _scroll_id 来进行新的遍历,直到返回结果为空;初始化和遍历两个阶段都可以通过指定 scroll 参数来设定遍历的上下文环境保留时间,过期后 scroll_id 将变得无效。时间格式如下表所示:

格式 描述
d days
h hours
m minutes
s seconds
ms milliseconds
micros microseconds
nanos nanoseconds

CURL 示例说明:

scroll 初始化:

    curl -u root:le201909 -X POST 172.16.345.14:9201/ctsdb_test/_search?scroll=1m -d'
    {
       "size":5,
       "query": {
       "bool": {
         "filter": [
           {
             "terms": {
               "region": ["gz"]
             }
           }
         ]
       }
     },
    "docvalue_fields": [
          "cpuUsage",
        "region",
        "timestamp"
        ]
    }'

scroll 初始化返回:

    {
      "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoAwAAAAAADrOFFm5YSEhnMjdnUWNPcndHS1k5Wjc3bHcAAAAAAAz_1RZiRkZTcGp4dFRXR18xMGtzSmhEUFJRAAAAAAAP5vQWOXFOR29lc0hROHFWMmFGTkVmSkxmZw==",
      "took": 10641,
      "timed_out": false,
      "_shards": {
       "total": 3,
       "successful": 3,
       "skipped": 0,
       "failed": 0
      },
      "hits": {
        "total": 1592072666,
        "max_score": 0.65708643,
        "hits": [
          {
            "_index": "ctsdb_test@0_-1",
            "_type": "doc",
            "_id": "oyylNU0U65cZjByyt7sW_JmPPgAACY4Bh0",
            "_score": 0.65708643,
            "_routing": "354d14eb",
            "fields": {
              "region": [
                "gz"
              ],
              "cpuUsage": [
                "2.0"
              ],
              "timestamp": [
                1509909300000
              ]
            }
          },
          {
            "_index": "ctsdb_test@0_-1",
            "_type": "doc",
            "_id": "oyykFN0yd1d9NDPfzjRdrJ8whQAACYsBqc",
            "_score": 0.65708643,
            "_routing": "14dd3277",
            "fields": {
              "region": [
                "gz"
              ],
              "cpuUsage": [
                "1.8"
              ],
              "timestamp": [
                1509908340000
              ]
            }
          },
          {
            "_index": "ctsdb_test@0_-1",
            "_type": "doc",
            "_id": "oyylHLp9jl_3sF4N2rnh67h4SgAABHIBso",
            "_score": 0.65708643,
            "_routing": "1cba7d8e",
            "fields": {
              "region": [
                "gz"
              ],
              "cpuUsage": [
                "2.5"
              ],
              "timestamp": [
                1509909720000
              ]
            }
          },
          {
            "_index": "ctsdb_test@0_-1",
            "_type": "doc",
            "_id": "oyylH2JsOKnFHGUinQ7jM-ZwkgAAAvcBso",
            "_score": 0.65708643,
            "_routing": "1f626c38",
            "fields": {
              "region": [
                "gz"
              ],
              "cpuUsage": [
                "2.1"
              ],
              "timestamp": [
                1509909720000
              ]
            }
          },
          {
            "_index": "ctsdb_test@0_-1",
            "_type": "doc",
            "_id": "oyylHLp9jl_3sF4N2rnh67h4SgAABGsBso",
            "_score": 0.65708643,
            "_routing": "1cba7d8e",
            "fields": {
              "region": [
                "gz"
              ],
              "cpuUsage": [
                "2.0"
              ],
              "timestamp": [
                1509909720000
              ]
            }
          }
        ]
          }
    }

scroll 遍历:

    curl -u root:le201909 -X POST 172.16.345.14:9201/_search/scroll  -d'
    {
    "scroll" : "1m", 
    "scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAADrOFFm5YSEhnMjdnUWNPcndHS1k5Wjc3bHcAAAAAAAz_1RZiRkZTcGp4dFRXR18xMGtzSmhEUFJRAAAAAAAP5vQWOXFOR29lc0hROHFWMmFGTkVmSkxmZw==" 
    }'
说明:

此请求中的 scroll_id 是 scroll 初始化返回的 scrollid 值。而下一次遍历则需要将 scroll_id 参数调整为上一次遍历返回的 scrollid 值,即每次请求中的 scroll_id 参数是上一次请求返回的 scrollid 值,直到返回结果为空,遍历结束。

3. Sort

Sort 关键字主要用于对查询结果进行排序。排序方式有 asc 和 desc 两种,CTSDB 对于用户自定义字段的默认排序方式为 asc。排序模式有 min、max、sum、avg、median。其中 sum、avg、median 只适用于类型为 array 并存储数字的字段。

CURL 示例说明:

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
    {
        "query": {
            "bool": {
                "must": {
                    "range": {
                        "timestamp": {
                            "gte": "01/01/2018",
                            "lte": "03/01/2018",
                            "format": "MM/dd/yyyy",
                            "time_zone":"+08:00"
                        }
                    }
                }
            }
        },
        "sort": [
            {
                "cpuUsage": {
                    "order": "asc",
                    "mode": "min"
                }
            },
            {
                "timestamp": {
                    "order": "asc"
                }
            }, 
            "diskUsage"
        ]
    }'

4. docvalue_fields

docvalue_fields 关键词指定需要返回的字段名称,需要以数组的形式指定。

CURL 示例说明:

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
    {
        "query": {
            "terms": {
                "region": ["sh", "bj"]
            }
        },
        "docvalue_fields": ["timestamp", "cpuUsage"]
    }'

聚合查询

agg 关键字主要用于构造聚合查询。用户可到返回的 aggregations 字段取聚合结果。聚合返回字段说明详见下表。如果只关注聚合结果,请在查询时设置 size 参数为0。

字段名称 描述
hits 返回匹配的查询结果,其中 total 字段表示参与聚合的数据条数。里面的 hits 字段为数组结构,若无指定,则包含所有查询结果的前十条。hits 数组里的每个结果中包含_index(查询涉及到的子表),若查询中指定了 docvalue_fields,则会返回 fields字段具体指明每个字段的值。
took 整个查询耗费的毫秒数。
_shards 参与查询的分片数。其中 total 标识总分片数,successful 标识执行成功的数量,failed 标识执行失败的数量,skipped标识跳过执行的分片数。
timed_out 查询是否超时,取值有false和true。
aggregations 聚合返回结果。

下面列举几种常用的聚合方式。

普通聚合

普通聚合需要指定聚合名称、聚合方式(常用的聚合方式有 min、max、avg、value_count、sum 等)和聚合所作用的字段。

CURL 示例说明:

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
    {
        "size":0,
        "query": {
            "terms": {
                "region": ["sh", "bj"]
            }
        },
        "aggs": {
            "myname": {
                "max": {
                    "field": "cpuUsage"
                }
            }
        }
    }'
说明:

上例中对字段 cpuUsage 做 max 聚合(用户也可指定 min、avg 等),返回结果取别名 myname(用户可任意指定该名称)。

返回结果:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 20,
    "successful": 20,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 7,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "myname": {
      "value": 4
    }
  }
}

terms 聚合

terms 聚合主要用于查询某个字段的所有唯一值以及该值的个数,用户可指定唯一值返回时的排序规则,以及返回结果数,并且可对参与聚合的数据字段进行模糊或者精确匹配,详情请参考如下示例,使用 filter_path 参数可定制返回结果字段,使用方法请参考 批量查询数据

CURL 示例说明:

请求:

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search?filter_path=aggregations -d'
    {
        "aggs": {
            "myname": {
                "terms": {"field":"region"}
            }
        }
    }'

返回:

    {
      "aggregations": {
        "myname": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "sh",
              "doc_count": 10
            },
            {
              "key": "Motor_sports",
              "doc_count": 6
            },
            {
              "key": "gz",
              "doc_count": 3
            },
            {
              "key": "bj",
              "doc_count": 2
            },
            {
              "key": "cd",
              "doc_count": 2
            },
            {
              "key": "Winter_sports",
              "doc_count": 1
            },
            {
              "key": "water_sports",
              "doc_count": 1
            }
          ]
        }
      }
    }
说明:

上例表示查看 ctsdb_test 的 region 字段中所有的唯一值及其出现的次数。分析返回字段 aggregations 中的 buckets 字段发现,region 字段共有7种类型的值,分别是 sh、Motor_sports、gz、bj、cd、Winter_sports、water_sports, 并且返回字段通过 doc_count 指明了每种值的个数。

用户可通过 size 字段指定返回的唯一值的个数,例如某字段名为 region,其唯一值有7个,可通过设置 size 字段为5指定只返回前5个,具体请参考示例。

请求:

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search?filter_path=aggregations -d'
    {
        "aggs": {
            "myname": {
                "terms": {
                   "field":"region",
                    "size":5
                }
            }
        }
    }'

返回:

    {
      "aggregations": {
        "myname": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 2,
          "buckets": [
            {
              "key": "sh",
              "doc_count": 10
            },
            {
              "key": "Motor_sports",
              "doc_count": 6
            },
            {
              "key": "gz",
              "doc_count": 3
            },
            {
              "key": "bj",
              "doc_count": 2
            },
            {
              "key": "cd",
              "doc_count": 2
            }
          ]
        }
      }
    }

用户可对返回结果进行排序,详情请参考如下示例。
1.请求(返回结果按照唯一值的个数降序排列):

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search?filter_path=aggregations -d'
    {
      "aggs": {
          "myname": {
              "terms": {
                "field":"region",
                "order":{"_count":"desc"}
              }
          }
      }
    }'

返回:

   {
      "aggregations": {
        "myname": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "sh",
              "doc_count": 10
            },
            {
              "key": "Motor_sports",
              "doc_count": 6
            },
            {
              "key": "gz",
              "doc_count": 3
            },
            {
              "key": "bj",
              "doc_count": 2
            },
            {
              "key": "cd",
              "doc_count": 2
            },
            {
              "key": "Winter_sports",
              "doc_count": 1
            },
            {
              "key": "water_sports",
              "doc_count": 1
            }
          ]
        }
      }
    }

2.请求(返回结果按照唯一值的字母升序进行排列):

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search?filter_path=aggregations -d'
    {
      "aggs": {
          "myname": {
              "terms": {
                "field":"region",
                "order":{"_term":"asc"}
              }
          }
      }
    }'

返回:

    {
      "aggregations": {
        "myname": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "Motor_sports",
              "doc_count": 6
            },
            {
              "key": "Winter_sports",
              "doc_count": 1
            },
            {
              "key": "bj",
              "doc_count": 2
            },
            {
              "key": "cd",
              "doc_count": 2
            },
            {
              "key": "gz",
              "doc_count": 3
            },
            {
              "key": "sh",
              "doc_count": 10
            },
            {
              "key": "water_sports",
              "doc_count": 1
            }
          ]
        }
      }
    }

用户可设置正则表达式对参与 terms 聚合的数据字段进行模糊匹配或者指定字段进行精确匹配,详情请参考示例。

模糊匹配示例:
请求(只针对 region 字段中有 sport 并且不是以 water_ 开头的数据进行聚合并返回结果):

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search?filter_path=aggregations -d'
    {
      "aggs": {
          "myname": {
              "terms": {
                "field":"region",
                "include" : ".*sport.*",
                  "exclude" : "water_.*"
              }
          }
      }
    }'

返回:

    {
      "aggregations": {
        "myname": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "Motor_sports",
              "doc_count": 6
            },
            {
              "key": "Winter_sports",
              "doc_count": 1
            }
          ]
        }
      }
    }

精确匹配示例
请求(region_zone 指定值为"sh", "bj","cd","gz"的 region 字段进行聚合,而 region_sports 指定值不为"sh", "bj","cd","gz"的 region 字段进行聚合)

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search?filter_path=aggregations -d'
    {
    "aggs" : {
      "region_zone" : {
        "terms" : {
          "field" : "region",
          "include" : ["sh", "bj","cd","gz"]
          }
        },
        "region_sports" : {
         "terms" : {
             "field" : "region",
             "exclude" : ["sh", "bj","cd","gz"]
         }
        }
      }
    }'

返回:

    {
      "aggregations": {
        "region_sport": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "Motor_sports",
              "doc_count": 6
            },
            {
              "key": "Winter_sports",
              "doc_count": 1
            },
            {
              "key": "water_sports",
              "doc_count": 1
            }
          ]
        },
        "region_zone": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": "sh",
              "doc_count": 10
            },
            {
              "key": "gz",
              "doc_count": 3
            },
            {
              "key": "bj",
              "doc_count": 2
            },
            {
              "key": "cd",
              "doc_count": 2
            }
          ]
        }
      }
    }

Date Histogram 聚合

Date Histogram 主要对日期做直方图聚合。
CURL 示例说明:

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
    {
        "query": {
            "terms": {
                "region": ["sh", "bj"]
            }
        },
        "aggs": {
            "time_1h_agg": {
                "date_histogram": {
                    "field": "timestamp",
                    "interval": "1h"
                },
                "aggs": {
                    "avgCpuUsage": {
                        "avg": {
                            "field": "cpuUsage"
                        }
                    }
                }
            }
        }
    }'

返回:

    {
      "took": 5,
      "timed_out": false,
      "_shards": {
        "total": 20,
        "successful": 20,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 6,
        "max_score": 0.074107975,
        "hits": [
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2QtGR5xcjRaw2ETf-",
            "_score": 0.074107975,
            "_routing": "sh"
          },
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2QtGR5xcjRaw2ETf_",
            "_score": 0.074107975,
            "_routing": "sh"
          },
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2Q5Xr5xcjRaw2ETgA",
            "_score": 0.074107975,
            "_routing": "sh"
          },
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2Q5Xr5xcjRaw2ETgB",
            "_score": 0.074107975,
            "_routing": "sh"
          },
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2RGfF5xcjRaw2ETgC",
            "_score": 0.074107975,
            "_routing": "sh"
          },
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2RGfF5xcjRaw2ETgD",
            "_score": 0.074107975,
            "_routing": "sh"
          }
        ]
      },
      "aggregations": {
        "time_1h_agg": {
          "buckets": [
            {
              "key_as_string": "1520222400",
              "key": 1520222400000,
              "doc_count": 1,
              "avgCpuUsage": {
                "value": 2.5
              }
            },
            {
              "key_as_string": "1520226000",
              "key": 1520226000000,
              "doc_count": 0,
              "avgCpuUsage": {
                "value": null
              }
            },
            {
              "key_as_string": "1520229600",
              "key": 1520229600000,
              "doc_count": 0,
              "avgCpuUsage": {
                "value": null
              }
            },
            {
              "key_as_string": "1520233200",
              "key": 1520233200000,
              "doc_count": 0,
              "avgCpuUsage": {
                "value": null
              }
            },
            {
              "key_as_string": "1520236800",
              "key": 1520236800000,
              "doc_count": 0,
              "avgCpuUsage": {
                "value": null
              }
            },
            {
              "key_as_string": "1520240400",
              "key": 1520240400000,
              "doc_count": 0,
              "avgCpuUsage": {
                "value": null
              }
            },
            {
              "key_as_string": "1520244000",
              "key": 1520244000000,
              "doc_count": 0,
              "avgCpuUsage": {
                "value": null
              }
            },
            {
              "key_as_string": "1520247600",
              "key": 1520247600000,
              "doc_count": 1,
              "avgCpuUsage": {
                "value": 2
              }
            },
            {
              "key_as_string": "1520251200",
              "key": 1520251200000,
              "doc_count": 4,
              "avgCpuUsage": {
                "value": 2.25
              }
            }
          ]
        }
      }
    }
说明:

上例是对字段 cpuUsage 进行粒度为1小时的 date_histogram 聚合。返回结果中总的聚合名称为 time_1h_agg(用户可任意指定该名称),每个时间分段内的聚合名称为 avgCpuUsage(用户可任意指定该名称)。interval 字段可选的时间粒度有 year、quarter、 month、week、day、hour、minute、second,用户也可将时间粒度用时间单位来表示,例如1y代表1year,1h代表1hour。系统不支持小数粒度的时间单位,例如1.5h您需要转换为90m。

Percentiles 聚合

Percentiles 聚合即百分位聚合。百分位可以任意指定,系统默认的百分位是 1、5、25、50、75、95、99。

CURL 示例说明:

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
    {
        "query": {
            "terms": {
                "region": ["sh", "bj"]
            }
        },
        "aggs": 
        {
            "myname": 
            {
                "percentiles":
                {
                    "field": "cpuUsage",
                    "percents": [1,25,50,70,99]
                }
            }
        }
    }'

返回:

    {
      "took": 18,
      "timed_out": false,
      "_shards": {
        "total": 20,
        "successful": 20,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 6,
        "max_score": 0.074107975,
        "hits": [
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2QtGR5xcjRaw2ETf-",
            "_score": 0.074107975,
            "_routing": "sh"
          },
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2QtGR5xcjRaw2ETf_",
            "_score": 0.074107975,
            "_routing": "sh"
          },
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2Q5Xr5xcjRaw2ETgA",
            "_score": 0.074107975,
            "_routing": "sh"
          },
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2Q5Xr5xcjRaw2ETgB",
            "_score": 0.074107975,
            "_routing": "sh"
          },
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2RGfF5xcjRaw2ETgC",
            "_score": 0.074107975,
            "_routing": "sh"
          },
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2RGfF5xcjRaw2ETgD",
            "_score": 0.074107975,
            "_routing": "sh"
          }
        ]
      },
      "aggregations": {
        "myname": {
          "values": {
            "1.0": 2,
            "25.0": 2,
            "50.0": 2.25,
            "70.0": 2.5,
            "99.0": 2.5
          }
        }
      }
    }
说明:

上例中是对字段 cpuUsage 做 percentiles 聚合,选取的百分位为 1,25,50,70,99。聚合返回结果的别名是 myname(用户可任意指定该名称)。

Cardinality 聚合

Cardinality 聚合主要用于统计去重后的数量。默认情况下,当聚合结果小于等于3000时,Cardinality 返回的结果是精确值,大于3000时,为近似值。

CURL 示例说明:

    curl -u root:le201909 -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
    {
        "query": {
            "terms": {
                "region": ["sh", "bj"]
            }
        },
        "aggs": 
        {
            "myname": 
            {
                "cardinality":
                {
                    "field": "cpuUsage"
                }
            }
        }
    }'

返回:

    {
      "took": 15,
      "timed_out": false,
      "_shards": {
        "total": 20,
        "successful": 20,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 6,
        "max_score": 0.074107975,
        "hits": [
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2QtGR5xcjRaw2ETf-",
            "_score": 0.074107975,
            "_routing": "sh"
          },
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2QtGR5xcjRaw2ETf_",
            "_score": 0.074107975,
            "_routing": "sh"
          },
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2Q5Xr5xcjRaw2ETgA",
            "_score": 0.074107975,
            "_routing": "sh"
          },
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2Q5Xr5xcjRaw2ETgB",
            "_score": 0.074107975,
            "_routing": "sh"
          },
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2RGfF5xcjRaw2ETgC",
            "_score": 0.074107975,
            "_routing": "sh"
          },
          {
            "_index": "ctsdb_test@1520092800000_3",
            "_type": "doc",
            "_id": "AWH2RGfF5xcjRaw2ETgD",
            "_score": 0.074107975,
            "_routing": "sh"
          }
        ]
      },
      "aggregations": {
        "myname": {
          "value": 2
        }
      }
    }
说明:

上例的聚合是对字段 cpuUsage 做 cardinality 聚合,返回的聚合结果取别名 myname(用户可任意指定该名称)。