Installing and trying out the Chinese analyzer for Elasticsearch

Author: 算法之名
Published 2019-08-20 17:36:19
This article is part of the column: 算法之名

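I am using Elasticsearch 5.2.1, so the matching Chinese analyzer (the IK analysis plugin) is the v5.2.1 tag at https://github.com/medcl/elasticsearch-analysis-ik/tree/v5.2.1 (download the release that matches the Elasticsearch version you run).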

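Installation is fairly simple: build the plugin, then unzip the resulting archive into the plugins directory of the Elasticsearch installation. The command below uses my installation path.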

unzip -d /usr/local/elasticsearch/plugins/ik/ elasticsearch-analysis-ik-5.2.1.zip

Then restart Elasticsearch.
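To confirm that the plugin was picked up after the restart, one option is to ask the node for its plugin list through the `_cat/plugins` API. The following is a minimal sketch, not part of my original setup: it assumes the node is reachable at `http://localhost:9200` and that the IK plugin registers itself under the name `analysis-ik`. The function names are made up for illustration.

```python
import json
import urllib.request


def has_plugin(rows, component="analysis-ik"):
    """Return True if any row from _cat/plugins reports the given plugin.

    `rows` is the parsed JSON from GET /_cat/plugins?format=json, i.e. a
    list of dicts with "name", "component" and "version" keys.
    The default component name is an assumption about the IK plugin.
    """
    return any(row.get("component") == component for row in rows)


def fetch_plugins(base_url="http://localhost:9200"):
    """Fetch the plugin list as JSON rows (one row per node and plugin).

    base_url is an assumed address; replace it with your own node.
    """
    with urllib.request.urlopen(base_url + "/_cat/plugins?format=json") as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `has_plugin(fetch_plugins())` against a running node should return `True` once the plugin is loaded; if it returns `False`, the Elasticsearch startup log is the next place to look.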

If you run Elasticsearch 6.2.4 with Docker instead, the steps are as follows.

docker pull docker.elastic.co/elasticsearch/elasticsearch:6.2.4

docker run -d --name myes -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -v /etc/localtime:/etc/localtime:ro docker.elastic.co/elasticsearch/elasticsearch:6.2.4

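Install the Chinese analyzer

docker cp elasticsearch-analysis-ik-6.2.4.zip myes:/usr/share/elasticsearch    # copy the plugin archive into the container

docker exec -it myes bash    # enter the container

mkdir ./plugins/analysis-ik    # create an analysis-ik directory under plugins; this step differs from 5.2.1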

unzip -d ./ elasticsearch-analysis-ik-6.2.4.zip

cp -r ./elasticsearch/* ./plugins/analysis-ik/    # copy everything from the extracted ./elasticsearch directory into ./plugins/analysis-ik

rm -rf ./elasticsearch

rm elasticsearch-analysis-ik-6.2.4.zip

exit

docker restart myes    # restart the Elasticsearch service
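Once the service is back up, a quick way to see the analyzer at work is the `_analyze` API, which returns the tokens an analyzer produces for a piece of text. The sketch below is an illustration under assumptions: the host `http://localhost:9200` and the helper names are mine, while `ik_max_word` (fine-grained) and `ik_smart` (coarse-grained) are the two analyzers the IK plugin provides.

```python
import json
import urllib.request


def analyze_request(text, analyzer="ik_max_word", base_url="http://localhost:9200"):
    """Build (but do not send) a POST request to the _analyze API."""
    body = json.dumps({"analyzer": analyzer, "text": text}, ensure_ascii=False)
    return urllib.request.Request(
        base_url + "/_analyze",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def token_texts(response):
    """Extract the token strings from a parsed _analyze response."""
    return [t["token"] for t in response.get("tokens", [])]


def analyze(text, analyzer="ik_max_word", base_url="http://localhost:9200"):
    """Send the request and return the tokens the analyzer produced."""
    req = analyze_request(text, analyzer, base_url)
    with urllib.request.urlopen(req) as resp:
        return token_texts(json.loads(resp.read().decode("utf-8")))
```

For example, comparing `analyze("牙膏生产商", "ik_max_word")` with `analyze("牙膏生产商", "standard")` shows how IK segments the phrase differently from the built-in standard analyzer. If the plugin is not loaded, the request fails with an error about an unknown analyzer.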

As before, we insert four documents. Searching for all of them returns the following.

http://192.168.5.182:9200/ecommerce/toothpaste/_search

{
  "took": 123,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": {
    "total": 4,
    "max_score": 1,
    "hits": [
      { "_index": "ecommerce", "_type": "toothpaste", "_id": "2", "_score": 1,
        "_source": { "name": "佳洁士牙膏", "desc": "有效防蛀", "price": 25, "producer": "佳洁士生产商", "tags": [ "防蛀" ] } },
      { "_index": "ecommerce", "_type": "toothpaste", "_id": "4", "_score": 1,
        "_source": { "name": "特别牙膏", "desc": "特别美白", "price": 50, "producer": "特别牙膏生产商", "tags": [ "美白", "防蛀" ] } },
      { "_index": "ecommerce", "_type": "toothpaste", "_id": "1", "_score": 1,
        "_source": { "name": "高露洁牙膏", "desc": "高效美白", "price": 30, "producer": "高露洁生产商", "tags": [ "美白", "防蛀" ] } },
      { "_index": "ecommerce", "_type": "toothpaste", "_id": "3", "_score": 1,
        "_source": { "name": "中华牙膏", "desc": "草本植物", "price": 40, "producer": "中华生产商", "tags": [ "清新" ] } }
    ]
  }
}
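The insert step itself is not shown here. One way to load these four documents in a single request is the `_bulk` API, which takes newline-delimited JSON: an action line followed by the document source, repeated per document. This is a sketch under assumptions rather than the commands I actually ran; the helper name `bulk_body` is hypothetical, and the index and type names are taken from the result above.

```python
import json

# The four documents, copied from the search result above (keyed by _id).
DOCS = {
    "1": {"name": "高露洁牙膏", "desc": "高效美白", "price": 30,
          "producer": "高露洁生产商", "tags": ["美白", "防蛀"]},
    "2": {"name": "佳洁士牙膏", "desc": "有效防蛀", "price": 25,
          "producer": "佳洁士生产商", "tags": ["防蛀"]},
    "3": {"name": "中华牙膏", "desc": "草本植物", "price": 40,
          "producer": "中华生产商", "tags": ["清新"]},
    "4": {"name": "特别牙膏", "desc": "特别美白", "price": 50,
          "producer": "特别牙膏生产商", "tags": ["美白", "防蛀"]},
}


def bulk_body(docs, index="ecommerce", doc_type="toothpaste"):
    """Serialize docs as newline-delimited JSON for the _bulk API.

    Each document becomes two lines: an "index" action, then the source.
    The body must end with a newline, otherwise the last item is rejected.
    """
    lines = []
    for doc_id, source in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source, ensure_ascii=False))
    return "\n".join(lines) + "\n"
```

The string returned by `bulk_body(DOCS)` can be sent as the UTF-8 encoded body of a POST to `/_bulk` with the header `Content-Type: application/x-ndjson`.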

Full-text search

[root@host2 bin]# curl -XGET 'http://192.168.5.182:9200/ecommerce/toothpaste/_search' -d'
> {
>   "query":{
>     "match":{
>       "producer":"牙膏生产商"
>     }
>   }
> }'
{"took":76,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":4,"max_score":3.6265414,"hits":[
{"_index":"ecommerce","_type":"toothpaste","_id":"4","_score":3.6265414,"_source":{ "name":"特别牙膏", "desc":"特别美白", "price":50, "producer":"特别牙膏生产商", "tags":["美白","防蛀"] }},
{"_index":"ecommerce","_type":"toothpaste","_id":"3","_score":1.7306128,"_source":{ "name":"中华牙膏", "desc":"草本植物", "price":40, "producer":"中华生产商", "tags":["清新"] }},
{"_index":"ecommerce","_type":"toothpaste","_id":"2","_score":1.6805282,"_source":{ "name":"佳洁士牙膏", "desc":"有效防蛀", "price":25, "producer":"佳洁士生产商", "tags":["防蛀"] }},
{"_index":"ecommerce","_type":"toothpaste","_id":"1","_score":1.5775073,"_source":{ "name":"高露洁牙膏", "desc":"高效美白", "price":30, "producer":"高露洁生产商", "tags":["美白","防蛀"] }}
]}}

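The result shows that the Chinese full-text search works: every document whose producer field contains 生产商 is returned, and 特别牙膏生产商, the closest match for the query 牙膏生产商, ranks first with a score of 3.6265414, well ahead of the other three (all below 1.74).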
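One caveat: the mapping of the ecommerce index is not shown in this article. Installing the plugin only makes the IK analyzers available; a text field uses the standard analyzer unless its mapping (or an index-level default) says otherwise, and the standard analyzer splits Chinese text into single characters. The sketch below shows, under that assumption, how the fields could be mapped to `ik_max_word` when the index is created. The helper names and the choice of fields are illustrative, not something taken from my setup.

```python
import json


def ik_text_field(index_analyzer="ik_max_word", search_analyzer="ik_max_word"):
    """Mapping for one text field analyzed with IK at index and search time."""
    return {
        "type": "text",
        "analyzer": index_analyzer,
        "search_analyzer": search_analyzer,
    }


def index_settings(doc_type="toothpaste", fields=("name", "desc", "producer")):
    """Body for PUT /<index> that maps the given fields to IK-analyzed text.

    The type-keyed "mappings" layout matches Elasticsearch 5.x and 6.x.
    """
    properties = {field: ik_text_field() for field in fields}
    return {"mappings": {doc_type: {"properties": properties}}}


def match_query(field, text):
    """Body for a match query like the one used in the search above."""
    return {"query": {"match": {field: text}}}


# Print the request body so it can be inspected or pasted into curl.
print(json.dumps(index_settings(), ensure_ascii=False, indent=2))
```

The analyzer of an existing field cannot be changed in place, so a body like this has to be sent when the index is created, before any documents are indexed. `ik_smart` can be passed as `search_analyzer` for coarser-grained query segmentation.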

This article is part of the Tencent Cloud self-media syndication program and is shared from the author's personal site/blog.