前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >Elasticsearch探索:7.0版本精确的总 hits 数

Elasticsearch探索:7.0版本精确的总 hits 数

原创
作者头像
HLee
修改2020-12-31 18:06:46
1.5K0
修改2020-12-31 18:06:46
举报
文章被收录于专栏:房东的猫房东的猫

简介

从 Elasticsearch 7.0之后,为了提高搜索的性能,在 hits 字段中返回的文档数有时不是最精确的数值。Elasticsearch 限制了最多的数值为10000。

当文档的数值大于10000时,返回的 total 数值为10000,并在 relation 中指出 gte。

实操

启动elasticsearch-7.2.1版本+kibana-7.2.1版本

选中“Add data”:

这样我们就把Sample flight data的数据加载到Elasticsearch中去了。

我们在Dev tools中来查询我们的文档个数:

代码语言:javascript
复制
GET kibana_sample_data_flights/_count

{
  "count" : 13059,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}

我们可以看到有13059个数值。假如我们使用如下的方式来进行搜索的话:

代码语言:javascript
复制
GET kibana_sample_data_flights/_search

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 13059,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "kibana_sample_data_flights",
        "_type" : "_doc",
        "_id" : "ZOgJuHYBrB5g-f-njyTu",
        "_score" : 1.0,
        "_source" : {
          "FlightNum" : "9HY9SWR",
          "DestCountry" : "AU",
          "OriginWeather" : "Sunny",
          "OriginCityName" : "Frankfurt am Main",
          "AvgTicketPrice" : 841.2656419677076,
          "DistanceMiles" : 10247.856675613455,
          "FlightDelay" : false,
          "DestWeather" : "Rain",
          "Dest" : "Sydney Kingsford Smith International Airport",
          "FlightDelayType" : "No Delay",
          "OriginCountry" : "DE",
          "dayOfWeek" : 0,
          "DistanceKilometers" : 16492.32665375846,
          "timestamp" : "2020-12-21T00:00:00",
          "DestLocation" : {
            "lat" : "-33.94609833",
            "lon" : "151.177002"
          },
          "DestAirportID" : "SYD",
          "Carrier" : "Kibana Airlines",
          "Cancelled" : false,
          "FlightTimeMin" : 1030.7704158599038,
          "Origin" : "Frankfurt am Main Airport",
          "OriginLocation" : {
            "lat" : "50.033333",
            "lon" : "8.570556"
          },
          "DestRegion" : "SE-BD",
          "OriginAirportID" : "FRA",
          "OriginRegion" : "DE-HE",
          "DestCityName" : "Sydney",
          "FlightTimeHour" : 17.179506930998397,
          "FlightDelayMin" : 0
        }
      }
    ]
  }
}

显然我们得到的文档的数目是10000个,但是它并不是我们的实际的满足条件的所有文档数。假如我们想得到所有的文档数,那么我们可以做如下的方式:

代码语言:javascript
复制
GET kibana_sample_data_flights/_search
{
  "track_total_hits":true
}

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "kibana_sample_data_flights",
        "_type" : "_doc",
        "_id" : "ZOgJuHYBrB5g-f-njyTu",
        "_score" : 1.0,
        "_source" : {
          "FlightNum" : "9HY9SWR",
          "DestCountry" : "AU",
          "OriginWeather" : "Sunny",
          "OriginCityName" : "Frankfurt am Main",
          "AvgTicketPrice" : 841.2656419677076,
          "DistanceMiles" : 10247.856675613455,
          "FlightDelay" : false,
          "DestWeather" : "Rain",
          "Dest" : "Sydney Kingsford Smith International Airport",
          "FlightDelayType" : "No Delay",
          "OriginCountry" : "DE",
          "dayOfWeek" : 0,
          "DistanceKilometers" : 16492.32665375846,
          "timestamp" : "2020-12-21T00:00:00",
          "DestLocation" : {
            "lat" : "-33.94609833",
            "lon" : "151.177002"
          },
          "DestAirportID" : "SYD",
          "Carrier" : "Kibana Airlines",
          "Cancelled" : false,
          "FlightTimeMin" : 1030.7704158599038,
          "Origin" : "Frankfurt am Main Airport",
          "OriginLocation" : {
            "lat" : "50.033333",
            "lon" : "8.570556"
          },
          "DestRegion" : "SE-BD",
          "OriginAirportID" : "FRA",
          "OriginRegion" : "DE-HE",
          "DestCityName" : "Sydney",
          "FlightTimeHour" : 17.179506930998397,
          "FlightDelayMin" : 0
        }
      }
    ]
  }
}

我们在请求的参数中加入 track_total_hits,并设置为true,那么我们可以看到在返回的参数中,它正确地显示了所有满足条件的文档个数。

原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。

如有侵权,请联系 cloudcommunity@tencent.com 删除。

原创声明:本文系作者授权腾讯云开发者社区发表,未经许可,不得转载。

如有侵权,请联系 cloudcommunity@tencent.com 删除。

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 简介
  • 实操
相关产品与服务
Elasticsearch Service
腾讯云 Elasticsearch Service(ES)是云端全托管海量数据检索分析服务,拥有高性能自研内核,集成X-Pack。ES 支持通过自治索引、存算分离、集群巡检等特性轻松管理集群,也支持免运维、自动弹性、按需使用的 Serverless 模式。使用 ES 您可以高效构建信息检索、日志分析、运维监控等服务,它独特的向量检索还可助您构建基于语义、图像的AI深度应用。
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档