I have run into a problem: I need to sort buckets by a keyword field, and I have tried two approaches to do so. My top_hits aggregation is:
"user_data": {
  "top_hits": {
    "_source": {
      "includes": ["username"]
    },
    "size": 1
  }
},
To sort the buckets, I first tried bucket_sort. My bucket_sort looks like this:
"sorting": {
  "bucket_sort": {
    "sort": [
      {
        "user_data>username": {  ----> This is the error
          "order": "desc"
        }
      }
    ],
    "from": 0,
    "size": 25
  }
}
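But I get a syntax error; basically, the buckets path is wrong. The second approach I tried was a max aggregation on the username field:
"to_sort": {
  "max": {
    "field": "username"
  }
}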
combined with the following bucket_sort:
"sorting": {
  "bucket_sort": {
    "sort": [
      {
        "to_sort": {
          "order": "desc"
        }
      }
    ],
    "from": 0,
    "size": 25
  }
}
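But the max aggregation cannot be used on a keyword field. Is there a way to sort my buckets by username, given that username is a keyword field?
The parent of my aggregations is:
"aggs": {
  "CountryId": {
    "terms": {
      "field": "countryId",
      "size": 10000
    }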
The username value is different in each bucket.
The buckets currently come back like this:
"buckets" : [
  {
    "key" : "11111",
    "doc_count" : 17,
    "user_data" : {
      "hits" : {
        "total" : 10,
        "max_score" : 11,
        "hits" : [
          {
            "_index" : "index_name",
            "_type" : "index_name",
            "_id" : "101010",
            "_score" : 0.0,
            "_source" : {
              "username" : "cccccc"
            }
          }
        ]
      }
    }
  },
  {
    "key" : "33333",
    "doc_count" : 17,
    "user_data" : {
      "hits" : {
        "total" : 10,
        "max_score" : 11,
        "hits" : [
          {
            "_index" : "index_name",
            "_type" : "index_name",
            "_id" : "101010",
            "_score" : 0.0,
            "_source" : {
              "username" : "bbbbb"
            }
          }
        ]
      }
    }
  },
  {
    "key" : "22222",
    "doc_count" : 17,
    "user_data" : {
      "hits" : {
        "total" : 10,
        "max_score" : 11,
        "hits" : [
          {
            "_index" : "index_name",
            "_type" : "index_name",
            "_id" : "101010",
            "_score" : 0.0,
            "_source" : {
              "username" : "aaaaa"
            }
          }
        ]
      }
    }
  }
]
The bucket result below is what I want:
"buckets" : [
  {
    "key" : "22222",
    "doc_count" : 17,
    "user_data" : {
      "hits" : {
        "total" : 10,
        "max_score" : 11,
        "hits" : [
          {
            "_index" : "index_name",
            "_type" : "index_name",
            "_id" : "101010",
            "_score" : 0.0,
            "_source" : {
              "username" : "aaaaa"
            }
          }
        ]
      }
    }
  },
  {
    "key" : "33333",
    "doc_count" : 17,
    "user_data" : {
      "hits" : {
        "total" : 10,
        "max_score" : 11,
        "hits" : [
          {
            "_index" : "index_name",
            "_type" : "index_name",
            "_id" : "101010",
            "_score" : 0.0,
            "_source" : {
              "username" : "bbbbb"
            }
          }
        ]
      }
    }
  },
  {
    "key" : "11111",
    "doc_count" : 17,
    "user_data" : {
      "hits" : {
        "total" : 10,
        "max_score" : 11,
        "hits" : [
          {
            "_index" : "index_name",
            "_type" : "index_name",
            "_id" : "101010",
            "_score" : 0.0,
            "_source" : {
              "username" : "ccccc"
            }
          }
        ]
      }
    }
  }
]
As you can see, these buckets are ordered by username.
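For reference, the desired ordering can be reproduced outside Elasticsearch by sorting the returned buckets in the client. The sketch below is a fallback under stated assumptions, not a server-side solution: the helper name `sort_buckets_by_username` is hypothetical, the bucket shape is trimmed from the response shown in the question, and it only gives a correct global order when all buckets are fetched in one response (as with `"size": 10000` on the terms aggregation). Any paging would then have to happen after the sort.

```python
def sort_buckets_by_username(buckets, descending=False):
    """Return the buckets ordered by the username of each bucket's first top hit."""

    def username_of(bucket):
        hits = bucket["user_data"]["hits"]["hits"]
        # Buckets without a hit, or without a username, sort as an empty string.
        return hits[0]["_source"].get("username", "") if hits else ""

    return sorted(buckets, key=username_of, reverse=descending)


# Trimmed copy of the buckets from the question (only the fields the sort needs).
buckets = [
    {"key": "11111", "user_data": {"hits": {"hits": [{"_source": {"username": "ccccc"}}]}}},
    {"key": "33333", "user_data": {"hits": {"hits": [{"_source": {"username": "bbbbb"}}]}}},
    {"key": "22222", "user_data": {"hits": {"hits": [{"_source": {"username": "aaaaa"}}]}}},
]

ordered = sort_buckets_by_username(buckets)
print([b["key"] for b in ordered])  # ['22222', '33333', '11111']
```

Slicing `ordered[0:25]` afterwards would play the role of the `from` and `size` options of bucket_sort.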
Answered on 2021-02-13 10:13:36
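I had a similar problem and could not find an answer anywhere on the internet, so I tried to build my own solution, which took me almost a week. It will not always work, because generating an ordered hash code for a string has limits: you have to choose your own charset and the number of leading characters of the string to use for sorting (6 in my case). Run some tests, because you want to stay within the positive range of the long type, otherwise it will not work at all (with my charset the length could go up to 13). Basically, I use a scripted_metric that manually finds the top hit to build a metric for bucket_sort, and I tweaked it to compute an ordered hash code of the keyword. Below is my query, in which I sort each user's last session by the sso.name keyword; you should be able to adapt it to your problem more or less.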
{
  "size": 0,
  "timeout": "60s",
  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "user_id"
          }
        }
      ]
    }
  },
  "aggregations": {
    "by_user": {
      "terms": {
        "field": "user_id",
        "size": 10000,
        "order": [
          {
            "_count": "desc"
          },
          {
            "_key": "asc"
          }
        ]
      },
      "aggregations": {
        "my_top_hits_sso_ordered_hash": {
          "scripted_metric": {
            "init_script": "state.timestamp_latest = 0L; state.last_sso_ordered_hash = 0L",
            "map_script": """
              // Keep only the hash of the most recent login seen on this shard.
              def current_date = doc['login_timestamp'].getValue().toInstant().toEpochMilli();
              if (current_date > state.timestamp_latest) {
                state.timestamp_latest = current_date;
                state.last_sso_ordered_hash = 0L;
                if (doc['sso.name'].size() > 0) {
                  String charset = "abcdefghijklmnopqrstuvwxyz";
                  String ssoName = doc['sso.name'].value;
                  int length = charset.length();
                  // Encode the first 6 characters as digits of a number in base charset.length().
                  for (int i = 0; i < Math.min(ssoName.length(), 6); i++) {
                    state.last_sso_ordered_hash = state.last_sso_ordered_hash * length + charset.indexOf(String.valueOf(ssoName.charAt(i))) + 1;
                  }
                }
              }
            """,
            "combine_script": "return state",
            "reduce_script": """
              // Start from a numeric 0L (not an empty string) so bucket_sort always gets a number.
              def last_sso_ordered_hash = 0L;
              def timestamp_latest = 0L;
              for (s in states) {
                if (s.timestamp_latest > timestamp_latest) {
                  timestamp_latest = s.timestamp_latest;
                  last_sso_ordered_hash = s.last_sso_ordered_hash;
                }
              }
              return last_sso_ordered_hash;
            """
          }
        },
        "user_last_session": {
          "top_hits": {
            "from": 0,
            "size": 1,
            "sort": [
              {
                "login_timestamp": {
                  "order": "desc"
                }
              }
            ]
          }
        },
        "pagination": {
          "bucket_sort": {
            "sort": [
              {
                "my_top_hits_sso_ordered_hash.value": {
                  "order": "desc"
                }
              }
            ],
            "from": 0,
            "size": 100
          }
        }
      }
    }
  }
}
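To make the limits of the ordered hash easier to test before adapting the query, here is a small Python sketch of the same arithmetic as the loop in map_script. The function names (`ordered_hash`, `padded_ordered_hash`, `max_prefix_len`) are hypothetical and exist only for illustration. `padded_ordered_hash` is a variation that is not part of the query above: it assumes a base of `len(charset) + 1` and pads short names with zeros, so that names shorter than the prefix length still compare in dictionary order.

```python
CHARSET = "abcdefghijklmnopqrstuvwxyz"
PREFIX_LEN = 6
LONG_MAX = 2**63 - 1  # largest value of a signed 64-bit long


def ordered_hash(name, charset=CHARSET, prefix_len=PREFIX_LEN):
    """Same arithmetic as the map_script loop: base len(charset), digit = index + 1."""
    base = len(charset)
    value = 0
    for ch in name[:prefix_len]:
        # str.find returns -1 for characters outside the charset, giving digit 0.
        value = value * base + charset.find(ch) + 1
    return value


def padded_ordered_hash(name, charset=CHARSET, prefix_len=PREFIX_LEN):
    """Variation (assumption, not from the query): fixed width, base len(charset) + 1."""
    base = len(charset) + 1
    value = 0
    for i in range(prefix_len):
        # Positions past the end of the name count as digit 0 (zero padding).
        digit = charset.find(name[i]) + 1 if i < len(name) else 0
        value = value * base + digit
    return value


def max_prefix_len(base):
    """Largest n such that base**n - 1 still fits in a positive signed 64-bit long."""
    n = 0
    while base ** (n + 1) - 1 <= LONG_MAX:
        n += 1
    return n


# Names of equal length keep their dictionary order.
print(ordered_hash("aaaaa") < ordered_hash("bbbbb") < ordered_hash("ccccc"))  # True

# Names of different length do not: "aa" sorts before "b" alphabetically,
# but the unpadded hash of "b" is smaller.
print(ordered_hash("b") < ordered_hash("aa"))  # True

# The padded variation restores dictionary order in that case.
print(padded_ordered_hash("aa") < padded_ordered_hash("b"))  # True

# With 26 letters plus a padding digit, at most 13 characters fit in a long.
print(max_prefix_len(len(CHARSET) + 1))  # 13
```

Characters outside the charset (uppercase letters, digits, punctuation) all map to digit 0 in both functions, so names that differ only in such characters receive the same hash and their relative order is undefined.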
https://stackoverflow.com/questions/66002081