
Elasticsearch: Using Field Collapsing to Reduce Search Results Based on a Single Field



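Field collapsing lets you collapse search results based on the values of a field. Collapsing is done by selecting only the top-sorted document for each collapse key. This is not hard to understand if we take a Baidu Music page as an example.

Such a page displays many popular songs. Each of these songs may be the most prominent one on its album. When we build the page, there is no need to put every song of an album in the cover position. We may only want to show the most clicked or most popular song as the representative of the album. When we click on the album, we can then see the other songs in it.

Field collapsing was designed for exactly this. The same applies to news headlines shown in a title bar: after clicking through, we can see more news in the related category.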

Let's walk through an example to show how it is used.

Preparing the data

The data we use today is a data set of best games. It can be downloaded from my GitHub project:

git clone https://github.com/liu-xiao-guo/best_games_json_data

We then import the downloaded JSON data into Elasticsearch, naming the index best_games.

The data is now ready. The index contains 500 documents in total, and each document looks like this:

{"id":"madden-nfl-2002-ps2-2001","name":"Madden NFL 2002","year":2001,"platform":"PS2","genre":"Sports","publisher":"Electronic Arts","global_sales":3.08,"critic_score":94,"user_score":7,"developer":"EA Sports","image_url":"http://www.mobygames.com/images/covers/l/202684-madden-nfl-2002-playstation-2-back-cover.png"}

Its mapping is:

{
  "best_games" : {
    "mappings" : {
      "_meta" : {
        "created_by" : "ml-file-data-visualizer"
      },
      "properties" : {
        "critic_score" : {
          "type" : "long"
        },
        "developer" : {
          "type" : "text"
        },
        "genre" : {
          "type" : "keyword"
        },
        "global_sales" : {
          "type" : "double"
        },
        "id" : {
          "type" : "keyword"
        },
        "image_url" : {
          "type" : "keyword"
        },
        "name" : {
          "type" : "text"
        },
        "platform" : {
          "type" : "keyword"
        },
        "publisher" : {
          "type" : "keyword"
        },
        "user_score" : {
          "type" : "long"
        },
        "year" : {
          "type" : "long"
        }
      }
    }
  }
}
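The import step itself is not shown as code in this article. As one possible alternative, here is a minimal sketch of how records like the one above could be turned into a request body for the Elasticsearch _bulk API, where each document is preceded by an action line. The helper name build_bulk_body and the trimmed sample record are assumptions made for illustration, and actually sending the body to a cluster is left out.

```python
import json


def build_bulk_body(records, index_name="best_games"):
    """Return an NDJSON string usable as the body of POST /_bulk.

    build_bulk_body is a hypothetical helper, not part of any client library.
    """
    lines = []
    for record in records:
        # Action line: index the next document into the target index.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(record, ensure_ascii=False))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


# Trimmed copy of the sample document shown above (some fields omitted).
sample = {
    "id": "madden-nfl-2002-ps2-2001",
    "name": "Madden NFL 2002",
    "year": 2001,
    "platform": "PS2",
    "genre": "Sports",
    "publisher": "Electronic Arts",
    "critic_score": 94,
}

bulk_body = build_bulk_body([sample])
print(bulk_body)
```

Letting Elasticsearch generate the _id, as this sketch does, is consistent with the auto-generated IDs seen in the search responses later in the article.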

Field collapsing

Now let's search our data using collapsing:

GET best_games/_search
{
  "query": {
    "match": {
      "name": "Final Fantasy"
    }
  },
  "collapse": {
    "field": "publisher"
  },
  "sort": [
    {
      "critic_score": {
        "order": "desc"
      }
    }
  ]
}

The search result is:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 11, "relation" : "eq" },
    "max_score" : null,
    "hits" : [
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "E3JzF28BjrINWI3xtt80",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-ix-ps-2000",
          "name" : "Final Fantasy IX",
          "year" : 2000,
          "platform" : "PS",
          "genre" : "Role-Playing",
          "publisher" : "SquareSoft",
          "global_sales" : 5.3,
          "critic_score" : 94,
          "user_score" : 8,
          "developer" : "SquareSoft",
          "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"
        },
        "fields" : { "publisher" : [ "SquareSoft" ] },
        "sort" : [ 94 ]
      },
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "wnJzF28BjrINWI3xtt40",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-vii-ps-1997",
          "name" : "Final Fantasy VII",
          "year" : 1997,
          "platform" : "PS",
          "genre" : "Role-Playing",
          "publisher" : "Sony Computer Entertainment",
          "global_sales" : 9.72,
          "critic_score" : 92,
          "user_score" : 9,
          "developer" : "SquareSoft",
          "image_url" : "https://r.hswstatic.com/w_907/gif/finalfantasyvii-MAIN.jpg"
        },
        "fields" : { "publisher" : [ "Sony Computer Entertainment" ] },
        "sort" : [ 92 ]
      },
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "_nJzF28BjrINWI3xtt40",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-xii-ps2-2006",
          "name" : "Final Fantasy XII",
          "year" : 2006,
          "platform" : "PS2",
          "genre" : "Role-Playing",
          "publisher" : "Square Enix",
          "global_sales" : 5.95,
          "critic_score" : 92,
          "user_score" : 7,
          "developer" : "Square Enix",
          "image_url" : "https://m.media-amazon.com/images/M/MV5BM2I4MDMyMDQtNjM2OC00ZWNkLTg0ODQtNzYxZjY0M2QxODQyXkEyXkFqcGdeQXVyNjY5NTM5MjA@._V1_.jpg"
        },
        "fields" : { "publisher" : [ "Square Enix" ] },
        "sort" : [ 92 ]
      },
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "FXJzF28BjrINWI3xtt80",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-x-2-ps2-2003",
          "name" : "Final Fantasy X-2",
          "year" : 2003,
          "platform" : "PS2",
          "genre" : "Role-Playing",
          "publisher" : "Electronic Arts",
          "global_sales" : 5.29,
          "critic_score" : 85,
          "user_score" : 6,
          "developer" : "SquareSoft",
          "image_url" : "https://upload.wikimedia.org/wikipedia/en/thumb/6/6c/FFX-2_box.jpg/220px-FFX-2_box.jpg"
        },
        "fields" : { "publisher" : [ "Electronic Arts" ] },
        "sort" : [ 85 ]
      }
    ]
  }
}

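The result above shows that:

  • We search for all games whose name matches Final Fantasy and sort them by critic_score in descending order.
  • Because we use collapse with publisher as the collapse field, each publisher contributes only one hit, the one ranked first by the sort, even though a publisher may have many matching games.
  • hits.total still reports 11, because it counts the matching documents before collapsing; only four collapsed hits, one per publisher, are returned.

For example, there are as many as three games whose publisher is SquareSoft and whose name contains Final Fantasy: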

GET best_games/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "Final Fantasy"
          }
        },
        {
          "match": {
            "publisher": "SquareSoft"
          }
        }
      ]
    }
  },
  "sort": [
    {
      "critic_score": {
        "order": "desc"
      }
    }
  ]
}

The result of the query above:

    "hits" : [
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "E3JzF28BjrINWI3xtt80",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-ix-ps-2000",
          "name" : "Final Fantasy IX",
          "year" : 2000,
          "platform" : "PS",
          "genre" : "Role-Playing",
          "publisher" : "SquareSoft",
          "global_sales" : 5.3,
          "critic_score" : 94,
          "user_score" : 8,
          "developer" : "SquareSoft",
          "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"
        },
        "sort" : [ 94 ]
      },
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "0nJzF28BjrINWI3xtt40",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-viii-ps-1999",
          "name" : "Final Fantasy VIII",
          "year" : 1999,
          "platform" : "PS",
          "genre" : "Role-Playing",
          "publisher" : "SquareSoft",
          "global_sales" : 7.86,
          "critic_score" : 90,
          "user_score" : 8,
          "developer" : "SquareSoft",
          "image_url" : "https://gamingheartscollection.files.wordpress.com/2018/02/final-fantasy-8.png?w=585"
        },
        "sort" : [ 90 ]
      },
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "SHJzF28BjrINWI3xtuA1",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-tactics-ps-1997",
          "name" : "Final Fantasy Tactics",
          "year" : 1997,
          "platform" : "PS",
          "genre" : "Role-Playing",
          "publisher" : "SquareSoft",
          "global_sales" : 2.45,
          "critic_score" : 83,
          "user_score" : 8,
          "developer" : "SquareSoft",
          "image_url" : "https://www.thefinalfantasy.com/gallery/screenshots/ff-tactics/dynamic_previews/ff-tactics-screenshot-1_scale_800_700.jpg"
        },
        "sort" : [ 83 ]
      }
    ]
  }

With collapse, however, only one of these games is returned: the one with the highest critic_score.
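To make the behaviour concrete, here is a small conceptual sketch in plain Python that mimics what collapsing does: sort the matching documents, then keep only the first one seen for each collapse key. It is not how Elasticsearch implements the feature internally, the function name collapse is our own, and the data is limited to the six games that appear in the responses above (a subset of the 11 matches).

```python
def collapse(hits, field, sort_key):
    """Keep only the top-sorted hit for each distinct value of `field`."""
    # Sort descending; Python's sort is stable, so ties keep their input order.
    ranked = sorted(hits, key=lambda h: h[sort_key], reverse=True)
    seen = set()
    collapsed = []
    for hit in ranked:
        key = hit[field]
        if key not in seen:
            # First (best ranked) document for this key becomes its representative.
            seen.add(key)
            collapsed.append(hit)
    return collapsed


# Six of the matching games, taken from the responses shown above.
games = [
    {"name": "Final Fantasy IX", "publisher": "SquareSoft", "critic_score": 94},
    {"name": "Final Fantasy VIII", "publisher": "SquareSoft", "critic_score": 90},
    {"name": "Final Fantasy Tactics", "publisher": "SquareSoft", "critic_score": 83},
    {"name": "Final Fantasy VII", "publisher": "Sony Computer Entertainment", "critic_score": 92},
    {"name": "Final Fantasy XII", "publisher": "Square Enix", "critic_score": 92},
    {"name": "Final Fantasy X-2", "publisher": "Electronic Arts", "critic_score": 85},
]

top_per_publisher = collapse(games, "publisher", "critic_score")
for game in top_per_publisher:
    print(game["publisher"], "->", game["name"], game["critic_score"])
```

Of the three SquareSoft games, only Final Fantasy IX survives, which matches the collapsed response from Elasticsearch.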

Note: the field used for collapsing must be a single-valued numeric or keyword field with doc_values enabled.
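As a rough illustration of this rule, the sketch below scans the properties section of the best_games mapping and lists the fields whose type would allow collapsing. The helper collapsible_fields and the NUMERIC_TYPES set are our own assumptions; the check relies on doc_values being on by default for keyword and numeric fields, and it cannot tell whether a field is single-valued, since that depends on the documents rather than the mapping.

```python
# Common Elasticsearch numeric field types (assumed sufficient for this sketch).
NUMERIC_TYPES = {
    "long", "integer", "short", "byte",
    "double", "float", "half_float", "scaled_float",
}


def collapsible_fields(properties):
    """Return the names of fields that are keyword or numeric with doc_values."""
    eligible = []
    for name, spec in properties.items():
        field_type = spec.get("type")
        # doc_values defaults to true unless the mapping disables it explicitly.
        has_doc_values = spec.get("doc_values", True)
        if (field_type == "keyword" or field_type in NUMERIC_TYPES) and has_doc_values:
            eligible.append(name)
    return sorted(eligible)


# The "properties" section of the best_games mapping shown earlier.
properties = {
    "critic_score": {"type": "long"},
    "developer": {"type": "text"},
    "genre": {"type": "keyword"},
    "global_sales": {"type": "double"},
    "id": {"type": "keyword"},
    "image_url": {"type": "keyword"},
    "name": {"type": "text"},
    "platform": {"type": "keyword"},
    "publisher": {"type": "keyword"},
    "user_score": {"type": "long"},
    "year": {"type": "long"},
}

print(collapsible_fields(properties))
```

In this mapping, name and developer are text fields, so they are left out, while publisher qualifies as a keyword field.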

Expanding collapse results

We can also expand each collapsed top hit with the inner_hits option:

GET best_games/_search
{
  "query": {
    "match": {
      "name": "Final Fantasy"
    }
  },
  "collapse": {
    "field": "publisher",
    "inner_hits": {
      "name": "top 3 games",
      "size": 3,
      "sort": [{"user_score": "desc"}]
    }
  },
  "sort": [
    {
      "critic_score": {
        "order": "desc"
      }
    }
  ]
}

Running it gives the following result:

  "hits" : [
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "E3JzF28BjrINWI3xtt80",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-ix-ps-2000",
          "name" : "Final Fantasy IX",
          "year" : 2000,
          "platform" : "PS",
          "genre" : "Role-Playing",
          "publisher" : "SquareSoft",
          "global_sales" : 5.3,
          "critic_score" : 94,
          "user_score" : 8,
          "developer" : "SquareSoft",
          "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"
        },
        "fields" : { "publisher" : [ "SquareSoft" ] },
        "sort" : [ 94 ],
        "inner_hits" : {
          "top 3 games" : {
            "hits" : {
              "total" : { "value" : 3, "relation" : "eq" },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "0nJzF28BjrINWI3xtt40",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-viii-ps-1999",
                    "name" : "Final Fantasy VIII",
                    "year" : 1999,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 7.86,
                    "critic_score" : 90,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "https://gamingheartscollection.files.wordpress.com/2018/02/final-fantasy-8.png?w=585"
                  },
                  "sort" : [ 8 ]
                },
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "E3JzF28BjrINWI3xtt80",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-ix-ps-2000",
                    "name" : "Final Fantasy IX",
                    "year" : 2000,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 5.3,
                    "critic_score" : 94,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"
                  },
                  "sort" : [ 8 ]
                },
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "SHJzF28BjrINWI3xtuA1",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-tactics-ps-1997",
                    "name" : "Final Fantasy Tactics",
                    "year" : 1997,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 2.45,
                    "critic_score" : 83,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "https://www.thefinalfantasy.com/gallery/screenshots/ff-tactics/dynamic_previews/ff-tactics-screenshot-1_scale_800_700.jpg"
                  },
                  "sort" : [ 8 ]
                }
              ]
            }
          }
        }
      },

We can see that for each publisher, inner_hits contains up to three games under the name top 3 games, sorted by user_score in descending order.
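The relationship between the collapsed hit and its inner hits can be sketched in plain Python as follows: documents are grouped by the collapse key, the representative is chosen by the outer sort, and the inner hits are chosen by their own sort and size. This is a conceptual model only, with a made-up function name collapse_with_inner_hits and four games taken from the responses above; how documents with equal sort values are ordered is an artifact of this sketch, not a statement about Elasticsearch.

```python
from collections import defaultdict


def collapse_with_inner_hits(hits, field, sort_key, inner_sort_key, size=3):
    """Group hits by `field`; return one representative plus inner hits per group."""
    groups = defaultdict(list)
    for hit in hits:
        groups[hit[field]].append(hit)

    results = []
    for key, members in groups.items():
        # Representative: best document of the group according to the outer sort.
        representative = max(members, key=lambda h: h[sort_key])
        # Inner hits: up to `size` documents of the group, using their own sort.
        inner = sorted(members, key=lambda h: h[inner_sort_key], reverse=True)[:size]
        results.append({"key": key, "top": representative, "inner_hits": inner})

    # Order the groups by their representative, as the outer sort would.
    results.sort(key=lambda r: r["top"][sort_key], reverse=True)
    return results


games = [
    {"name": "Final Fantasy IX", "publisher": "SquareSoft", "critic_score": 94, "user_score": 8},
    {"name": "Final Fantasy VIII", "publisher": "SquareSoft", "critic_score": 90, "user_score": 8},
    {"name": "Final Fantasy Tactics", "publisher": "SquareSoft", "critic_score": 83, "user_score": 8},
    {"name": "Final Fantasy VII", "publisher": "Sony Computer Entertainment", "critic_score": 92, "user_score": 9},
]

expanded = collapse_with_inner_hits(games, "publisher", "critic_score", "user_score")
for group in expanded:
    print(group["key"], "top:", group["top"]["name"],
          "inner:", [g["name"] for g in group["inner_hits"]])
```

Note that the representative (Final Fantasy IX) also appears among its own inner hits, just as it does in the Elasticsearch response.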

It is also possible to request multiple inner_hits for each collapsed hit. This is useful when you want several representations of the collapsed hits.

GET best_games/_search
{
  "query": {
    "match": {
      "name": "Final Fantasy"
    }
  },
  "collapse": {
    "field": "publisher",
    "inner_hits": [
      {
        "name": "top user liked",
        "size": 3,
        "sort": [
          {
            "user_score": "desc"
          }
        ]
      },
      {
        "name": "top most recent games",
        "size": 3,
        "sort": [
          {
            "year": "desc"
          }
        ]
      }
    ]
  },
  "sort": [
    {
      "critic_score": {
        "order": "desc"
      }
    }
  ]
}

The result is:

    "hits" : [
      {
        "_index" : "best_games",
        "_type" : "_doc",
        "_id" : "E3JzF28BjrINWI3xtt80",
        "_score" : null,
        "_source" : {
          "id" : "final-fantasy-ix-ps-2000",
          "name" : "Final Fantasy IX",
          "year" : 2000,
          "platform" : "PS",
          "genre" : "Role-Playing",
          "publisher" : "SquareSoft",
          "global_sales" : 5.3,
          "critic_score" : 94,
          "user_score" : 8,
          "developer" : "SquareSoft",
          "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"
        },
        "fields" : { "publisher" : [ "SquareSoft" ] },
        "sort" : [ 94 ],
        "inner_hits" : {
          "top user liked" : {
            "hits" : {
              "total" : { "value" : 3, "relation" : "eq" },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "0nJzF28BjrINWI3xtt40",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-viii-ps-1999",
                    "name" : "Final Fantasy VIII",
                    "year" : 1999,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 7.86,
                    "critic_score" : 90,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "https://gamingheartscollection.files.wordpress.com/2018/02/final-fantasy-8.png?w=585"
                  },
                  "sort" : [ 8 ]
                },
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "E3JzF28BjrINWI3xtt80",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-ix-ps-2000",
                    "name" : "Final Fantasy IX",
                    "year" : 2000,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 5.3,
                    "critic_score" : 94,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"
                  },
                  "sort" : [ 8 ]
                },
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "SHJzF28BjrINWI3xtuA1",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-tactics-ps-1997",
                    "name" : "Final Fantasy Tactics",
                    "year" : 1997,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 2.45,
                    "critic_score" : 83,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "https://www.thefinalfantasy.com/gallery/screenshots/ff-tactics/dynamic_previews/ff-tactics-screenshot-1_scale_800_700.jpg"
                  },
                  "sort" : [ 8 ]
                }
              ]
            }
          },
          "top most recent games" : {
            "hits" : {
              "total" : { "value" : 3, "relation" : "eq" },
              "max_score" : null,
              "hits" : [
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "E3JzF28BjrINWI3xtt80",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-ix-ps-2000",
                    "name" : "Final Fantasy IX",
                    "year" : 2000,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 5.3,
                    "critic_score" : 94,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "http://gamesdatabase.org/Media/SYSTEM/Sony_Playstation/Snap/Thumb/Thumb_Final_Fantasy_IX_-_2000_-_Square_Co.,_Ltd..jpg"
                  },
                  "sort" : [ 2000 ]
                },
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "0nJzF28BjrINWI3xtt40",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-viii-ps-1999",
                    "name" : "Final Fantasy VIII",
                    "year" : 1999,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 7.86,
                    "critic_score" : 90,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "https://gamingheartscollection.files.wordpress.com/2018/02/final-fantasy-8.png?w=585"
                  },
                  "sort" : [ 1999 ]
                },
                {
                  "_index" : "best_games",
                  "_type" : "_doc",
                  "_id" : "SHJzF28BjrINWI3xtuA1",
                  "_score" : null,
                  "_source" : {
                    "id" : "final-fantasy-tactics-ps-1997",
                    "name" : "Final Fantasy Tactics",
                    "year" : 1997,
                    "platform" : "PS",
                    "genre" : "Role-Playing",
                    "publisher" : "SquareSoft",
                    "global_sales" : 2.45,
                    "critic_score" : 83,
                    "user_score" : 8,
                    "developer" : "SquareSoft",
                    "image_url" : "https://www.thefinalfantasy.com/gallery/screenshots/ff-tactics/dynamic_previews/ff-tactics-screenshot-1_scale_800_700.jpg"
                  },
                  "sort" : [ 1997 ]
                }
              ]
            }
          }
        }
      },

This way, for each publisher we get both the three games users like most and the three most recent games.
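When consuming such a response in application code, each named inner_hits group sits under the inner_hits key of the collapsed hit, with the documents nested in hits.hits. The following sketch shows one way to walk that structure. The helper inner_hit_names is our own, and the sample hit is a heavily trimmed copy of the first collapsed hit above that keeps only the fields the helper reads.

```python
def inner_hit_names(collapsed_hit):
    """Map each inner_hits group name to the list of game names it contains."""
    summary = {}
    for group_name, group in collapsed_hit.get("inner_hits", {}).items():
        # Each group has the same shape as a normal search response: hits.hits.
        summary[group_name] = [h["_source"]["name"] for h in group["hits"]["hits"]]
    return summary


# Trimmed copy of the first collapsed hit from the response above.
hit = {
    "_source": {"name": "Final Fantasy IX", "publisher": "SquareSoft"},
    "inner_hits": {
        "top user liked": {
            "hits": {
                "hits": [
                    {"_source": {"name": "Final Fantasy VIII"}},
                    {"_source": {"name": "Final Fantasy IX"}},
                    {"_source": {"name": "Final Fantasy Tactics"}},
                ]
            }
        },
        "top most recent games": {
            "hits": {
                "hits": [
                    {"_source": {"name": "Final Fantasy IX"}},
                    {"_source": {"name": "Final Fantasy VIII"}},
                    {"_source": {"name": "Final Fantasy Tactics"}},
                ]
            }
        },
    },
}

names_by_group = inner_hit_names(hit)
for group_name, names in names_by_group.items():
    print(group_name, "->", names)
```

Because the groups are keyed by the name given in the request, choosing descriptive names such as top user liked makes the response easier to consume.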

Reference:

[1] https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-body.html#request-body-search-collapse



Original article: https://elasticstack.blog.csdn.net/article/details/103593811
