有奖调研

常用查询示例

最近更新时间:2021-07-06 19:06:29

本文主要介绍 CTSDB 中常用查询及关键字的使用,并给出简单 CURL 示例。所有示例使用的 metric 结构如下所示:

   {
       "ctsdb_test" : {
         "tags" : {
           "region" : "string"
         },
         "time" : {
           "name" : "timestamp",
           "format" : "epoch_millis"
         },
         "fields" : {
           "cpuUsage" : "float",
           "diskUsage" : "string",
           "dcpuUsage" : "integer"
         },
         "options" : {
           "expire_day" : 7,
           "refresh_interval" : "10s",
           "number_of_shards" : 5
         }
       }
   }

组合查询

这里的组合查询即可以用于单查询也可用于复合查询。查询体中的 query 关键字通过 Query DSL(Domain Specific Language)来定义查询条件。
下文会先介绍如何构造过滤条件,再介绍怎样组合过滤条件和如何处理返回结果集。

常用过滤条件

1. Range

Range 即区间查询,Range 查询支持的字段类型包括 string、long、integer、short、double、float、date。Range 查询可包含的参数如下表所示:

参数名称 描述
gte 大于或等于
gt 大于
lte 小于或等于
lt 小于
说明:

当 Range 查询涉及时间类型时,可通过 format 参数来指定时间格式。具体时间格式请参考 新建 metric

时间范围查询的 CURL 示例:

   curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
   {
       "query": {
           "range": {
               "timestamp": {
                   "gte": "01/01/2018",
                   "lte": "03/01/2018",
                   "format": "MM/dd/yyyy",
                   "time_zone":"+08:00"
                 }
             }
         }
    }'

数字范围查询 CURL 示例:

curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
{
    "query": {
        "range": {
            "cpuUsage": {
                "gte": 1.0,
                "lte": 10.0
            }
        }
    }
}'  

2. Terms

Terms 关键字用于查询时匹配特定字段,其值需要用中括号包裹。

CURL 示例说明:

   curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
   { 
       "query": {
           "terms": {
               "region": ["sh", "bj"]
           }
       }
   }'

过滤条件组合

复合查询常使用 Bool 关键字来组合多个查询条件,常见的用于 Bool 查询的组合关键字有 filter(类似于 AND)、must_not(类似于 NOT)、should(类似于 OR)。为提高查询性能,请务必加上 time 字段的 range 查询,且 time 字段,不论查询时以何种方式写入,返回值统一为 epoch_millis 格式。query-bool-filter 的组合形式可显著提升查询性能,请务必采用这种查询方式。

1. AND 条件 CURL 示例说明

   curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
   {
     "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "timestamp": {
              "format": "yyyy-MM-dd HH:mm:ss",
              "gte": "2017-11-06 23:00:00",
              "lt": "2018-03-06 23:05:00",
              "time_zone":"+08:00"
            }

          }
        },
        {
          "terms": {
            "region": ["sh"]
          }
        },
        {
          "terms": {
            "cpuUsage": ["2.0"]
          }
        }
      ]
    }
  },
      "docvalue_fields": [
    "cpuUsage",
    "timestamp"
       ]
}'
说明:

此查询条件类似于 timestamp>='2017-11-06 23:00:00' AND timestamp<'2018-03-06 23:05:00' AND region=sh AND cpuUsage=2.0。

2. OR 条件 CURL 示例说明

   curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
   {
     "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "timestamp": {
              "format": "yyyy-MM-dd HH:mm:ss",
              "gte": "2017-11-06 23:00:00",
              "lt": "2018-11-06 23:05:00",
              "time_zone":"+08:00"
            }
          }
        },
        {
          "term": {
            "region": "gz"
          }
        }
      ],
      "should": [
        {
          "terms": {
            "cpuUsage": ["2.0"]
          }
        },
        {
          "terms": {
            "cpuUsage": ["2.5"]
          }
        }
      ],
      "minimum_should_match": 1
    }
     },
     "docvalue_fields": [
    "cpuUsage",
    "timestamp"
     ]
}'
说明:

此查询条件类似于 timestamp>='2017-11-06 23:00:00' AND timestamp<'2018-11-06 23:05:00' AND region='gz' AND (cpuUsage=2.0 or cpuUsage=2.5)。minimum_should_match 参数的意义在于设置 cpuUsage=2.0 和 cpuUsage=2.5 至少匹配的个数,系统默认 minimum_should_match 为0。

3.NOT 条件 CURL 示例说明

curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
{
     "query": {
    "bool": {
      "filter": [
                {
          "range": {
            "timestamp": {
              "format": "yyyy-MM-dd HH:mm:ss",
              "gte": "2017-11-06 23:00:00",
              "lt": "2018-11-06 23:05:00",
              "time_zone":"+08:00"
            }

          }
        },
        {
          "terms": {
            "region": ["gz"]
          }
        }
      ],
      "must_not": [
       {
          "terms": {
            "cpuUsage": ["2.0"]
          }
        }
      ]
    }
     },
     "docvalue_fields": [
    "cpuUsage",
    "timestamp"
         ]
     }'
说明:

此查询条件类似于 timestamp>=2017-11-06 23:00:00 AND timestamp<2018-11-06 23:05:00 AND region='gz' AND cpuUsage !=2.0。

返回结果集处理

1. From/Size

通过设置 From 和 Size 关键字可对查询结果进行分页。From 关键字定义了查询结果中第一条数据的偏移量,Size 关键字配置返回结果的最大条数。From 系统默认为 0,Size 系统默认为 10,From 与 Size 的总和系统默认不能超过 65536。若想要更大的返回结果,请参考本文的 scroll 关键字查询。

CURL 示例说明:

curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
{
    "from": 0,
    "size": 5,
    "query": {
        "bool": {
            "filter": {
                "range": {
                    "timestamp": {
                        "gte": "01/01/2018",
                        "lte": "03/01/2018",
                        "format": "dd/MM/yyyy",
                        "time_zone":"+08:00"
                    }
                }
            },
            "must_not": {
                "terms": {
                    "region": ["bj"]
                }
            }
        }
    }
}'

2. Scroll

Scroll 可以理解为关系型数据库里的 cursor,因此 scroll 并不适合用来做实时搜索,而更适用于后台批处理任务。
可以把 scroll 分为初始化和遍历两个阶段,初始化时将所有符合搜索条件的搜索结果缓存起来,类似于快照,在遍历时,从这个快照里取数据。注意,在 Scroll 初始化后对 metric 的插入、删除、更新都不会影响遍历结果。
初始化阶段可通过 size 关键字指定返回结果集的大小,size 默认值为10,最大值为65536;初始化阶段可通过 query 关键字指定查询条件;初始化阶段可通过 docvalue_fields 关键字指定返回字段。初始化阶段和遍历阶段都返回_scroll_id 字段,后续遍历可以通过指定前一次遍历返回的 _scroll_id 来进行新的遍历,直到返回结果为空;初始化和遍历两个阶段都可以通过指定 scroll 参数来设定遍历的上下文环境保留时间,过期后 scroll_id 将变得无效。时间格式如下表所示:

格式 描述
d days
h hours
m minutes
s seconds
ms milliseconds
micros microseconds
nanos nanoseconds

CURL 示例说明:
scroll 初始化:

   curl -u root:le201909 -H 'Content-Type:application/json' -X POST 172.16.345.14:9201/ctsdb_test/_search?scroll=1m -d'
   {
      "size":5,
      "query": {
      "bool": {
        "filter": [
          {
            "terms": {
              "region": ["gz"]
            }
          }
        ]
      }
    },
   "docvalue_fields": [
         "cpuUsage",
       "region",
       "timestamp"
       ]
}'

scroll 初始化返回:

   {
     "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoAwAAAAAADrOFFm5YSEhnMjdnUWNPcndHS1k5Wjc3bHcAAAAAAAz_1RZiRkZTcGp4dFRXR18xMGtzSmhEUFJRAAAAAAAP5vQWOXFOR29lc0hROHFWMmFGTkVmSkxmZw==",
     "took": 10641,
     "timed_out": false,
     "_shards": {
      "total": 3,
      "successful": 3,
      "skipped": 0,
      "failed": 0
     },
     "hits": {
       "total": 1592072666,
       "max_score": 0.65708643,
       "hits": [
         {
           "_index": "ctsdb_test@0_-1",
           "_type": "doc",
           "_id": "oyylNU0U65cZjByyt7sW_JmPPgAACY4Bh0",
           "_score": 0.65708643,
           "_routing": "354d14eb",
           "fields": {
             "region": [
               "gz"
             ],
             "cpuUsage": [
               "2.0"
             ],
             "timestamp": [
               1509909300000
             ]
           }
         },
         {
           "_index": "ctsdb_test@0_-1",
           "_type": "doc",
           "_id": "oyykFN0yd1d9NDPfzjRdrJ8whQAACYsBqc",
           "_score": 0.65708643,
           "_routing": "14dd3277",
           "fields": {
             "region": [
               "gz"
             ],
             "cpuUsage": [
               "1.8"
             ],
             "timestamp": [
               1509908340000
             ]
           }
         },
         {
           "_index": "ctsdb_test@0_-1",
           "_type": "doc",
           "_id": "oyylHLp9jl_3sF4N2rnh67h4SgAABHIBso",
           "_score": 0.65708643,
           "_routing": "1cba7d8e",
           "fields": {
             "region": [
               "gz"
             ],
             "cpuUsage": [
               "2.5"
             ],
             "timestamp": [
               1509909720000
             ]
           }
         },
         {
           "_index": "ctsdb_test@0_-1",
           "_type": "doc",
           "_id": "oyylH2JsOKnFHGUinQ7jM-ZwkgAAAvcBso",
           "_score": 0.65708643,
           "_routing": "1f626c38",
           "fields": {
             "region": [
               "gz"
             ],
             "cpuUsage": [
               "2.1"
             ],
             "timestamp": [
               1509909720000
             ]
           }
         },
         {
           "_index": "ctsdb_test@0_-1",
           "_type": "doc",
           "_id": "oyylHLp9jl_3sF4N2rnh67h4SgAABGsBso",
           "_score": 0.65708643,
           "_routing": "1cba7d8e",
           "fields": {
             "region": [
               "gz"
             ],
             "cpuUsage": [
               "2.0"
             ],
             "timestamp": [
               1509909720000
             ]
           }
         }
       ]
         }
   }

scroll 遍历:

   curl -u root:le201909 -H 'Content-Type:application/json' -X POST 172.16.345.14:9201/_search/scroll  -d'
   {
   "scroll" : "1m", 
   "scroll_id" : "DnF1ZXJ5VGhlbkZldGNoAwAAAAAADrOFFm5YSEhnMjdnUWNPcndHS1k5Wjc3bHcAAAAAAAz_1RZiRkZTcGp4dFRXR18xMGtzSmhEUFJRAAAAAAAP5vQWOXFOR29lc0hROHFWMmFGTkVmSkxmZw==" 
   }'
说明:

  • 此请求中的 scroll_id 是 scroll 初始化返回的 _scroll_id 值。而下一次遍历则需要将 scroll_id 参数调整为上一次遍历返回的 _scroll_id 值,即每次请求中的 scroll_id 参数是上一次请求返回的 _scroll_id 值,直到返回结果为空,遍历结束。
  • 两次遍历返回的 _scroll_id 值可能相同,无法利用 _scroll_id 进行指定页跳转。

3. Sort

Sort 关键字主要用于对查询结果进行排序。排序方式有 asc 和 desc 两种,CTSDB 对于用户自定义字段的默认排序方式为 asc。排序模式有 min、max、sum、avg、median。其中 sum、avg、median 只适用于类型为 array 并存储数字的字段。

CURL 示例说明:

   curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
   {
       "query": {
           "bool": {
               "must": {
                   "range": {
                       "timestamp": {
                           "gte": "01/01/2018",
                           "lte": "03/01/2018",
                           "format": "MM/dd/yyyy",
                           "time_zone":"+08:00"
                       }
                   }
               }
           }
       },
       "sort": [
           {
               "cpuUsage": {
                   "order": "asc",
                   "mode": "min"
               }
           },
           {
               "timestamp": {
                   "order": "asc"
               }
           }, 
           "diskUsage"
       ]
   }'

4. docvalue_fields

docvalue_fields 关键词指定需要返回的字段名称,需要以数组的形式指定。

CURL 示例说明:

   curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
   {
       "query": {
           "terms": {
               "region": ["sh", "bj"]
           }
       },
       "docvalue_fields": ["timestamp", "cpuUsage"]
   }'

聚合查询

agg 关键字主要用于构造聚合查询。用户可到返回的 aggregations 字段取聚合结果。聚合返回字段说明详见下表。如果只关注聚合结果,请在查询时设置 size 参数为0。

字段名称 描述
hits 返回匹配的查询结果,其中 total 字段表示参与聚合的数据条数。里面的 hits 字段为数组结构,若无指定,则包含所有查询结果的前十条。hits 数组里的每个结果中包含_index(查询涉及到的 子表),若查询中指定了 docvalue_fields,则会返回 fields字段具体指明每个字段的值。
took 整个查询耗费的毫秒数。
_shards 参与查询的分片数。其中 total 标识总分片数,successful 标识执行成功的数量,failed 标识执行失败的数量,skipped标识跳过执行的分片数。
timed_out 查询是否超时,取值有false和true。
aggregations 聚合返回结果。

下面列举几种常用的聚合方式。

普通聚合

普通聚合需要指定聚合名称、聚合方式(常用的聚合方式有 min、max、avg、value_count、sum 等)和聚合所作用的字段。

CURL 示例说明:

   curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
   {
       "size":0,
       "query": {
           "terms": {
               "region": ["sh", "bj"]
           }
       },
       "aggs": {
           "myname": {
               "max": {
                   "field": "cpuUsage"
               }
           }
       }
   }'
说明:

上例中对字段 cpuUsage 做 max 聚合(用户也可指定 min、avg 等),返回结果取别名 myname(用户可任意指定该名称)。

返回结果:

{
 "took": 1,
 "timed_out": false,
 "_shards": {
   "total": 20,
   "successful": 20,
   "skipped": 0,
   "failed": 0
 },
 "hits": {
   "total": 7,
   "max_score": 0,
   "hits": []
 },
 "aggregations": {
   "myname": {
     "value": 4
   }
 }
}

terms 聚合

terms 聚合主要用于查询某个字段的所有唯一值以及该值的个数,用户可指定唯一值返回时的排序规则,以及返回结果数,并且可对参与聚合的数据字段进行模糊或者精确匹配,详情请参考如下示例,使用 filter_path 参数可定制返回结果字段,使用方法请参考 批量查询数据

CURL 示例说明:

请求:

curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search?filter_path=aggregations -d'
{
    "aggs": {
        "myname": {
            "terms": {"field":"region"}
        }
    }
}'

返回:

{
  "aggregations": {
    "myname": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "sh",
          "doc_count": 10
        },
        {
          "key": "Motor_sports",
          "doc_count": 6
        },
        {
          "key": "gz",
          "doc_count": 3
        },
        {
          "key": "bj",
          "doc_count": 2
        },
        {
          "key": "cd",
          "doc_count": 2
        },
        {
          "key": "Winter_sports",
          "doc_count": 1
        },
        {
          "key": "water_sports",
          "doc_count": 1
        }
      ]
    }
  }
}
说明:

上例表示查看 ctsdb_test 的 region 字段中所有的唯一值及其出现的次数。分析返回字段 aggregations 中的 buckets 字段发现,region 字段共有7种类型的值,分别是 sh、Motor_sports、gz、bj、cd、Winter_sports、water_sports, 并且返回字段通过 doc_count 指明了每种值的个数。

用户可通过 size 字段指定返回的唯一值的个数,例如某字段名为 region,其唯一值有7个,可通过设置 size 字段为5指定只返回前5个,具体请参考示例。

请求:

curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search?filter_path=aggregations -d'
{
    "aggs": {
        "myname": {
            "terms": {
               "field":"region",
                "size":5
            }
        }
    }
}'

返回:

{
  "aggregations": {
    "myname": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 2,
      "buckets": [
        {
          "key": "sh",
          "doc_count": 10
        },
        {
          "key": "Motor_sports",
          "doc_count": 6
        },
        {
          "key": "gz",
          "doc_count": 3
        },
        {
          "key": "bj",
          "doc_count": 2
        },
        {
          "key": "cd",
          "doc_count": 2
        }
      ]
    }
  }
}

用户可对返回结果进行排序,详情请参考如下示例。
1.请求(返回结果按照唯一值的个数降序排列):

   curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search?filter_path=aggregations -d'
   {
     "aggs": {
         "myname": {
             "terms": {
               "field":"region",
               "order":{"_count":"desc"}
             }
         }
     }
   }'

返回:

  {
     "aggregations": {
       "myname": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
           {
             "key": "sh",
             "doc_count": 10
           },
           {
             "key": "Motor_sports",
             "doc_count": 6
           },
           {
             "key": "gz",
             "doc_count": 3
           },
           {
             "key": "bj",
             "doc_count": 2
           },
           {
             "key": "cd",
             "doc_count": 2
           },
           {
             "key": "Winter_sports",
             "doc_count": 1
           },
           {
             "key": "water_sports",
             "doc_count": 1
           }
         ]
       }
     }
   }

2.请求(返回结果按照唯一值的字母升序进行排列):

   curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search?filter_path=aggregations -d'
   {
     "aggs": {
         "myname": {
             "terms": {
               "field":"region",
               "order":{"_term":"asc"}
             }
         }
     }
   }'

返回:

   {
     "aggregations": {
       "myname": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
           {
             "key": "Motor_sports",
             "doc_count": 6
           },
           {
             "key": "Winter_sports",
             "doc_count": 1
           },
           {
             "key": "bj",
             "doc_count": 2
           },
           {
             "key": "cd",
             "doc_count": 2
           },
           {
             "key": "gz",
             "doc_count": 3
           },
           {
             "key": "sh",
             "doc_count": 10
           },
           {
             "key": "water_sports",
             "doc_count": 1
           }
         ]
       }
     }
   }

用户可设置正则表达式对参与 terms 聚合的数据字段进行模糊匹配或者指定字段进行精确匹配,详情请参考示例。

模糊匹配示例:
请求(只针对 region 字段中有 sport 并且不是以 water_ 开头的数据进行聚合并返回结果):

curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search?filter_path=aggregations -d'
{
 "aggs": {
     "myname": {
         "terms": {
           "field":"region",
           "include" : ".*sport.*",
             "exclude" : "water_.*"
         }
     }
 }
}'

返回:

{
  "aggregations": {
    "myname": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "Motor_sports",
          "doc_count": 6
        },
        {
          "key": "Winter_sports",
          "doc_count": 1
        }
      ]
    }
  }
}

精确匹配示例
请求(region_zone 指定值为"sh", "bj","cd","gz"的 region 字段进行聚合,而 region_sports 指定值不为"sh", "bj","cd","gz"的 region 字段进行聚合)

curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search?filter_path=aggregations -d'
{
"aggs" : {
 "region_zone" : {
   "terms" : {
     "field" : "region",
     "include" : ["sh", "bj","cd","gz"]
     }
   },
   "region_sports" : {
    "terms" : {
        "field" : "region",
        "exclude" : ["sh", "bj","cd","gz"]
    }
   }
 }
}'

返回:

{
  "aggregations": {
    "region_sport": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "Motor_sports",
          "doc_count": 6
        },
        {
          "key": "Winter_sports",
          "doc_count": 1
        },
        {
          "key": "water_sports",
          "doc_count": 1
        }
      ]
    },
    "region_zone": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "sh",
          "doc_count": 10
        },
        {
          "key": "gz",
          "doc_count": 3
        },
        {
          "key": "bj",
          "doc_count": 2
        },
        {
          "key": "cd",
          "doc_count": 2
        }
      ]
    }
  }
}

Date Histogram 聚合

Date Histogram 主要对日期做直方图聚合。
CURL 示例说明:

   curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
   {
       "query": {
           "terms": {
               "region": ["sh", "bj"]
           }
       },
       "aggs": {
           "time_1h_agg": {
               "date_histogram": {
                   "field": "timestamp",
                   "interval": "1h"
               },
               "aggs": {
                   "avgCpuUsage": {
                       "avg": {
                           "field": "cpuUsage"
                       }
                   }
               }
           }
       }
   }'

返回:

   {
     "took": 5,
     "timed_out": false,
     "_shards": {
       "total": 20,
       "successful": 20,
       "skipped": 0,
       "failed": 0
     },
     "hits": {
       "total": 6,
       "max_score": 0.074107975,
       "hits": [
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2QtGR5xcjRaw2ETf-",
           "_score": 0.074107975,
           "_routing": "sh"
         },
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2QtGR5xcjRaw2ETf_",
           "_score": 0.074107975,
           "_routing": "sh"
         },
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2Q5Xr5xcjRaw2ETgA",
           "_score": 0.074107975,
           "_routing": "sh"
         },
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2Q5Xr5xcjRaw2ETgB",
           "_score": 0.074107975,
           "_routing": "sh"
         },
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2RGfF5xcjRaw2ETgC",
           "_score": 0.074107975,
           "_routing": "sh"
         },
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2RGfF5xcjRaw2ETgD",
           "_score": 0.074107975,
           "_routing": "sh"
         }
       ]
     },
     "aggregations": {
       "time_1h_agg": {
         "buckets": [
           {
             "key_as_string": "1520222400",
             "key": 1520222400000,
             "doc_count": 1,
             "avgCpuUsage": {
               "value": 2.5
             }
           },
           {
             "key_as_string": "1520226000",
             "key": 1520226000000,
             "doc_count": 0,
             "avgCpuUsage": {
               "value": null
             }
           },
           {
             "key_as_string": "1520229600",
             "key": 1520229600000,
             "doc_count": 0,
             "avgCpuUsage": {
               "value": null
             }
           },
           {
             "key_as_string": "1520233200",
             "key": 1520233200000,
             "doc_count": 0,
             "avgCpuUsage": {
               "value": null
             }
           },
           {
             "key_as_string": "1520236800",
             "key": 1520236800000,
             "doc_count": 0,
             "avgCpuUsage": {
               "value": null
             }
           },
           {
             "key_as_string": "1520240400",
             "key": 1520240400000,
             "doc_count": 0,
             "avgCpuUsage": {
               "value": null
             }
           },
           {
             "key_as_string": "1520244000",
             "key": 1520244000000,
             "doc_count": 0,
             "avgCpuUsage": {
               "value": null
             }
           },
           {
             "key_as_string": "1520247600",
             "key": 1520247600000,
             "doc_count": 1,
             "avgCpuUsage": {
               "value": 2
             }
           },
           {
             "key_as_string": "1520251200",
             "key": 1520251200000,
             "doc_count": 4,
             "avgCpuUsage": {
               "value": 2.25
             }
           }
         ]
       }
     }
   }
说明:

上例是对字段 cpuUsage 进行粒度为1小时的 date_histogram 聚合。返回结果中总的聚合名称为 time_1h_agg(用户可任意指定该名称),每个时间分段内的聚合名称为 avgCpuUsage(用户可任意指定该名称)。interval 字段可选的时间粒度有 year、quarter、 month、week、day、hour、minute、second,用户也可将时间粒度用时间单位来表示,例如1y代表1year,1h代表1hour。系统不支持小数粒度的时间单位,例如1.5h您需要转换为90m。

Percentiles 聚合

Percentiles 聚合即百分位聚合。百分位可以任意指定,系统默认的百分位是 1、5、25、50、75、95、99。

CURL 示例说明:

curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
{
    "query": {
        "terms": {
            "region": ["sh", "bj"]
        }
    },
    "aggs": 
    {
        "myname": 
        {
            "percentiles":
            {
                "field": "cpuUsage",
                "percents": [1,25,50,70,99]
            }
        }
    }
}'

返回:

   {
     "took": 18,
     "timed_out": false,
     "_shards": {
       "total": 20,
       "successful": 20,
       "skipped": 0,
       "failed": 0
     },
     "hits": {
       "total": 6,
       "max_score": 0.074107975,
       "hits": [
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2QtGR5xcjRaw2ETf-",
           "_score": 0.074107975,
           "_routing": "sh"
         },
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2QtGR5xcjRaw2ETf_",
           "_score": 0.074107975,
           "_routing": "sh"
         },
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2Q5Xr5xcjRaw2ETgA",
           "_score": 0.074107975,
           "_routing": "sh"
         },
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2Q5Xr5xcjRaw2ETgB",
           "_score": 0.074107975,
           "_routing": "sh"
         },
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2RGfF5xcjRaw2ETgC",
           "_score": 0.074107975,
           "_routing": "sh"
         },
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2RGfF5xcjRaw2ETgD",
           "_score": 0.074107975,
           "_routing": "sh"
         }
       ]
     },
     "aggregations": {
       "myname": {
         "values": {
           "1.0": 2,
           "25.0": 2,
           "50.0": 2.25,
           "70.0": 2.5,
           "99.0": 2.5
         }
       }
     }
   }
说明:

上例中是对字段 cpuUsage 做 percentiles 聚合,选取的百分位为 1,25,50,70,99。聚合返回结果的别名是 myname(用户可任意指定该名称)。

Cardinality 聚合

Cardinality 聚合主要用于统计去重后的数量。默认情况下,当聚合结果小于等于3000时,Cardinality 返回的结果是精确值,大于3000时,为近似值。

CURL 示例说明:

curl -u root:le201909 -H 'Content-Type:application/json' -X GET 172.16.345.14:9201/ctsdb_test/_search -d'
{
    "query": {
        "terms": {
            "region": ["sh", "bj"]
        }
    },
    "aggs": 
    {
        "myname": 
        {
            "cardinality":
            {
                "field": "cpuUsage"
            }
        }
    }
}'

返回:

   {
     "took": 15,
     "timed_out": false,
     "_shards": {
       "total": 20,
       "successful": 20,
       "skipped": 0,
       "failed": 0
     },
     "hits": {
       "total": 6,
       "max_score": 0.074107975,
       "hits": [
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2QtGR5xcjRaw2ETf-",
           "_score": 0.074107975,
           "_routing": "sh"
         },
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2QtGR5xcjRaw2ETf_",
           "_score": 0.074107975,
           "_routing": "sh"
         },
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2Q5Xr5xcjRaw2ETgA",
           "_score": 0.074107975,
           "_routing": "sh"
         },
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2Q5Xr5xcjRaw2ETgB",
           "_score": 0.074107975,
           "_routing": "sh"
         },
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2RGfF5xcjRaw2ETgC",
           "_score": 0.074107975,
           "_routing": "sh"
         },
         {
           "_index": "ctsdb_test@1520092800000_3",
           "_type": "doc",
           "_id": "AWH2RGfF5xcjRaw2ETgD",
           "_score": 0.074107975,
           "_routing": "sh"
         }
       ]
     },
     "aggregations": {
       "myname": {
         "value": 2
       }
     }
   }
说明:

上例的聚合是对字段 cpuUsage 做 cardinality 聚合,返回的聚合结果取别名 myname(用户可任意指定该名称)。

目录