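We want to use ElasticSearch to find similar objects.
Say I have an object with 4 fields: product_name, seller_name, seller_phone, platform_id.
Similar products can have different product names and seller names on different platforms (fuzzy match).
The phone number, however, is strict: a single variation could yield a wrong record (strict match).
What I am trying to create is a query which, written as pseudocode, would look something like this:
((product_name like 'some_product_name') OR (seller_name like 'some_seller_name') OR (seller_phone = 'some_phone')) AND (platform_id = 123)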
Posted on 2017-06-13 12:29:04
To get an exact match on seller_phone, I index this field without an ngram analyzer, and I use a fuzzy query for product_name and seller_name.
Mapping
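As a rough illustration of that choice, here is a small Python sketch (not Elasticsearch code) of what an ngram filter with min_gram 2 and max_gram 10 would emit for a token, and of the edit distance that a fuzzy query tolerates. The helper names `ngrams` and `edit_distance` are made up for this example, and `edit_distance` is plain Levenshtein, which only approximates the Damerau-Levenshtein distance Elasticsearch uses by default.

```python
def ngrams(token, min_gram=2, max_gram=10):
    """Return every substring of `token` whose length is in [min_gram, max_gram].

    Hypothetical helper that imitates an ngram token filter on one token.
    """
    token = token.lower()
    out = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(token):
                out.append(token[start:start + size])
    return out


def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute).

    Approximation only: Elasticsearch also counts a transposition as one edit.
    """
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    # If seller_phone went through the ngram filter, partial numbers such as
    # "99" or "988" would be indexed as terms and could match.
    print(ngrams("9988"))
    # The whole word is still one of the indexed terms, so a fuzzy query
    # can reach it from a misspelling within the allowed distance.
    print("macbok" in ngrams("macbok"), edit_distance("macbouk", "macbok"))
```

With partial numbers indexed as separate terms, a lookup for a fragment of a phone number could hit the wrong record, which is why the phone field is better left out of the ngram analyzer.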
PUT index111
{
  "settings": {
    "analysis": {
      "analyzer": {
        "edge_n_gram_analyzer": {
          "tokenizer": "whitespace",
          "filter": ["lowercase", "edge_gram_filter"]
        }
      },
      "filter": {
        "edge_gram_filter": {
          "type": "ngram",
          "min_gram": 2,
          "max_gram": 10
        }
      }
    }
  },
  "mappings": {
    "document_type": {
      "properties": {
        "product_name": {
          "type": "text",
          "analyzer": "edge_n_gram_analyzer"
        },
        "seller_name": {
          "type": "text",
          "analyzer": "edge_n_gram_analyzer"
        },
        "seller_phone": {
          "type": "keyword"
        },
        "platform_id": {
          "type": "keyword"
        }
      }
    }
  }
}
Index a document
POST index111/document_type
{
  "product_name": "macbok",
  "seller_name": "apple",
  "seller_phone": "9988",
  "platform_id": "123"
}
For the following pseudo-SQL query
((product_name like 'some_product_name') OR (seller_name like 'some_seller_name') OR (seller_phone = 'some_phone')) AND (platform_id = 123)
Elasticsearch query
POST index111/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "platform_id": {
              "value": "123"
            }
          }
        },
        {
          "bool": {
            "should": [
              {
                "fuzzy": {
                  "product_name": {
                    "value": "macbouk",
                    "boost": 1.0,
                    "fuzziness": 2,
                    "prefix_length": 0,
                    "max_expansions": 100
                  }
                }
              },
              {
                "fuzzy": {
                  "seller_name": {
                    "value": "apdle",
                    "boost": 1.0,
                    "fuzziness": 2,
                    "prefix_length": 0,
                    "max_expansions": 100
                  }
                }
              },
              {
                "term": {
                  "seller_phone": {
                    "value": "9988"
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}
Hope this helps.
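If the search body has to be generated for each record, something along these lines might work. This is a minimal Python sketch that only builds the request body as a dict. The function name `build_similar_query` and its parameters are invented for this example, and lowercasing the fuzzy values is my assumption, made because fuzzy is a term-level query (the value is not analyzed) while the index analyzer above lowercases its tokens.

```python
import json


def build_similar_query(product_name, seller_name, seller_phone,
                        platform_id, fuzziness=2):
    """Build a bool query: (fuzzy name OR fuzzy seller OR exact phone) AND platform.

    Hypothetical helper; it mirrors the structure of the query shown above.
    """
    def fuzzy(field, value):
        # The value is lowercased here because a fuzzy query does not run
        # the field's analyzer on it (assumption stated in the lead-in).
        return {"fuzzy": {field: {"value": value.lower(),
                                  "fuzziness": fuzziness,
                                  "prefix_length": 0,
                                  "max_expansions": 100}}}

    return {
        "query": {
            "bool": {
                "must": [
                    # Mandatory clause: restrict to one platform.
                    {"term": {"platform_id": {"value": str(platform_id)}}},
                    # At least one of the "should" clauses has to match,
                    # since this inner bool has no must or filter clause.
                    {"bool": {"should": [
                        fuzzy("product_name", product_name),
                        fuzzy("seller_name", seller_name),
                        {"term": {"seller_phone": {"value": seller_phone}}},
                    ]}},
                ]
            }
        }
    }


if __name__ == "__main__":
    body = build_similar_query("macbouk", "apdle", "9988", 123)
    print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of POST index111/_search. Note that a multi-word name would need one fuzzy clause per word, because the whitespace tokenizer indexes each word as a separate token.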
https://stackoverflow.com/questions/44517875