Copyright notice: reposting is welcome; please credit the source. Thank you. https://blog.csdn.net/boling_cavalry/article/details/86668180
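The Chinese tokenizer most commonly used with elasticsearch is the ik tokenizer. For details on installing and using it, see the article "Installing and Using the ik Tokenizer with elasticsearch".
With the official elasticsearch image we can quickly bring up an elasticsearch service in a docker environment, but how do we install the ik tokenizer there?
Method one: run docker exec to enter the container, then follow the same steps as on a physical machine. Repeating this every time a container is created is clearly too costly.
Method two: build an elasticsearch image with the ik tokenizer integrated, so that every container started from it already includes the ik tokenizer.
Today's hands-on exercise is method two: building our own elasticsearch image with the ik tokenizer integrated.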
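For comparison, method one might look roughly like the sketch below. It is not part of the original walkthrough. It assumes a running container named `elasticsearch` (a hypothetical name), the elasticsearch home directory `/usr/share/elasticsearch`, and a release URL pattern taken from the ik project's GitHub release naming, which should be checked before use.

```shell
# Sketch of the "docker exec" approach; the container name and URL pattern are assumptions.
ES_VERSION="6.5.0"          # must match the elasticsearch version in the container
CONTAINER="elasticsearch"   # hypothetical container name

# Print the release URL for the ik version given as $1 (assumed naming pattern).
ik_release_url() {
  echo "https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v$1/elasticsearch-analysis-ik-$1.zip"
}

# Install the plugin inside the running container, then restart the container
# so that elasticsearch loads the plugin when it starts again.
install_ik_in_container() {
  docker exec "$CONTAINER" /usr/share/elasticsearch/bin/elasticsearch-plugin \
    install --batch "$(ik_release_url "$ES_VERSION")" &&
    docker restart "$CONTAINER"
}
```

Calling `install_ik_in_container` would have to be repeated for every new container, which is the cost that method two avoids.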
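First, a recap of the usual steps for installing the ik tokenizer: download the ik source code, build it with maven, and unzip the resulting package into an ik folder under the elasticsearch plugins directory.
Those are the usual installation steps; all that remains is to repeat them inside the elasticsearch image.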
The Dockerfile is shown below; its comments explain each step, so I will not repeat them here:
# Docker image of elasticsearch with ik tokenizer
# VERSION 6.5.0
# Author: bolingcavalry
# Base image: elasticsearch:6.5.0
FROM elasticsearch:6.5.0
# Maintainer
MAINTAINER BolingCavalry <zq2599@gmail.com>
# elasticsearch plugin directory
ENV ES_PLUGINS_PATH /usr/share/elasticsearch/plugins
# Directory where maven is installed
ENV MAVEN_BASE_PATH /opt
# Directory where the ik tokenizer source is built
ENV IK_SRC_COMPILE_PATH /opt/ik_build
# Name of the folder created when the maven archive is extracted
ENV MAVEN_PACKAGE_NAME apache-maven-3.6.0
# Add maven's bin directory to PATH
ENV PATH="${MAVEN_BASE_PATH}/${MAVEN_PACKAGE_NAME}/bin:${PATH}"
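# Go to the directory where maven will be installed
RUN cd $MAVEN_BASE_PATH && \
# Download the maven archive
wget https://mirrors.tuna.tsinghua.edu.cn/apache/maven/maven-3/3.6.0/binaries/apache-maven-3.6.0-bin.tar.gz && \
# Extract maven
tar -zxvf ${MAVEN_PACKAGE_NAME}-bin.tar.gz && \
# Create the directory for building the ik tokenizer source
mkdir $IK_SRC_COMPILE_PATH && \
# Go to the ik build directory
cd $IK_SRC_COMPILE_PATH && \
# Download the ik source archive (master branch; the plugin version it builds
# must match the elasticsearch version of the base image, 6.5.0 here)
wget https://codeload.github.com/medcl/elasticsearch-analysis-ik/zip/master && \
# Unzip the source archive
unzip master && \
# Go to the extracted source directory
cd elasticsearch-analysis-ik-master && \
# Build with maven
mvn clean package -U -DskipTests && \
# Create the ik folder under the plugin directory
mkdir $ES_PLUGINS_PATH/ik && \
# After a successful build, move the plugin package to the plugin directory
mv target/releases/*.zip $ES_PLUGINS_PATH/ik && \
# Go to the ik folder
cd $ES_PLUGINS_PATH/ik && \
# Unzip the plugin package
unzip *.zip && \
# Go back to the maven install directory
cd $MAVEN_BASE_PATH && \
# Remove the maven archive and the extracted maven folder, which are no longer needed
rm -rf ${MAVEN_PACKAGE_NAME}-bin.tar.gz ${MAVEN_PACKAGE_NAME} && \
# Remove the ik source directory
rm -rf $IK_SRC_COMPILE_PATH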
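Run the following command in the directory that contains the Dockerfile to build the image:
docker build -t bolingcavalry/elasticsearch-with-ik:6.5.0 .
During the build, maven downloads many jar packages while compiling, which takes a while, so please be patient. Once the build finishes, docker history shows the layers of the new image: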
[root@hedy es]# docker history bolingcavalry/elasticsearch-with-ik:6.5.0
IMAGE CREATED CREATED BY SIZE COMMENT
abef02e45496 About an hour ago /bin/sh -c cd $MAVEN_BASE_PATH && wget htt... 50.2 MB
92a91169e693 About an hour ago /bin/sh -c #(nop) ENV PATH=/opt/apache-ma... 0 B
9ddccd9a491a About an hour ago /bin/sh -c #(nop) ENV MAVEN_PACKAGE_NAME=... 0 B
d4a3e11e500e About an hour ago /bin/sh -c #(nop) ENV IK_SRC_COMPILE_PATH... 0 B
cde29a40070e About an hour ago /bin/sh -c #(nop) ENV MAVEN_BASE_PATH=/opt 0 B
979b6bb94f88 About an hour ago /bin/sh -c #(nop) ENV ES_PLUGINS_PATH=/us... 0 B
61d45dcbea07 About an hour ago /bin/sh -c #(nop) MAINTAINER BolingCavalr... 0 B
ff171d17e77c 2 months ago /bin/sh -c #(nop) CMD ["eswrapper"] 0 B
<missing> 2 months ago /bin/sh -c #(nop) ENTRYPOINT ["/usr/local... 0 B
<missing> 2 months ago /bin/sh -c #(nop) LABEL org.label-schema.... 0 B
<missing> 2 months ago /bin/sh -c #(nop) EXPOSE 9200 9300 0 B
<missing> 2 months ago /bin/sh -c chgrp 0 /usr/local/bin/docker-e... 5.05 kB
<missing> 2 months ago /bin/sh -c #(nop) COPY --chown=1000:0file:... 4.36 kB
<missing> 2 months ago /bin/sh -c #(nop) ENV PATH=/usr/share/ela... 0 B
<missing> 2 months ago /bin/sh -c #(nop) COPY --chown=1000:0dir:5... 237 MB
<missing> 2 months ago /bin/sh -c #(nop) WORKDIR /usr/share/elast... 0 B
<missing> 2 months ago /bin/sh -c groupadd -g 1000 elasticsearch ... 296 kB
<missing> 2 months ago /bin/sh -c yum update -y && yum instal... 25.7 MB
<missing> 2 months ago /bin/sh -c ln -sf /etc/pki/ca-trust/extrac... 0 B
<missing> 2 months ago /bin/sh -c #(nop) ENV JAVA_HOME=/opt/jdk-... 0 B
<missing> 2 months ago /bin/sh -c curl -s https://download.java.n... 310 MB
<missing> 2 months ago /bin/sh -c #(nop) ENV ELASTIC_CONTAINER=true 0 B
<missing> 3 months ago /bin/sh -c #(nop) CMD ["/bin/bash"] 0 B
<missing> 3 months ago /bin/sh -c #(nop) LABEL org.label-schema.... 0 B
<missing> 3 months ago /bin/sh -c #(nop) ADD file:fbe9badfd2790f0... 200 MB
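The history above only shows that the RUN layer added about 50 MB. To confirm that the plugin folder really ended up in the image before starting a cluster, one option is to list the plugins in a throwaway container. This is a minimal sketch: the function names are hypothetical, and it assumes the plugin is reported under the folder name `ik` created by the Dockerfile.

```shell
IMAGE="bolingcavalry/elasticsearch-with-ik:6.5.0"

# List the installed plugins of image $1 in a temporary container
# that is removed as soon as the command exits.
list_plugins() {
  docker run --rm \
    --entrypoint /usr/share/elasticsearch/bin/elasticsearch-plugin \
    "$1" list
}

# Succeed only if one of the reported plugins is named exactly "ik"
# (assumption: the plugin is listed under its folder name).
has_ik_plugin() {
  list_plugins "$1" | grep -qx "ik"
}
```

For example, `has_ik_plugin "$IMAGE" && echo "ik plugin found"` prints the message only when the check passes.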
To verify the image, save the following as docker-compose.yml; it starts a two-node elasticsearch cluster from the new image:
version: '2.2'
services:
  elasticsearch:
    image: bolingcavalry/elasticsearch-with-ik:6.5.0
    container_name: elasticsearch
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - http.cors.enabled=true
      - http.cors.allow-origin=*
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata1:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
    networks:
      - esnet
  elasticsearch2:
    image: bolingcavalry/elasticsearch-with-ik:6.5.0
    container_name: elasticsearch2
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - http.cors.enabled=true
      - http.cors.allow-origin=*
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - "discovery.zen.ping.unicast.hosts=elasticsearch"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata2:/usr/share/elasticsearch/data
    networks:
      - esnet
volumes:
  esdata1:
    driver: local
  esdata2:
    driver: local
networks:
  esnet:
[root@hedy ik]# docker-compose up -d
Creating network "ik_esnet" with the default driver
Creating elasticsearch ... done
Creating elasticsearch2 ... done
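The containers are reported as created almost immediately, but elasticsearch itself needs some time before it answers HTTP requests. A small polling helper such as the sketch below can bridge that gap. The function names and the retry settings are hypothetical; the host address is the one used in the curl commands of this article and should be replaced with your own.

```shell
ES_URL="http://192.168.1.101:9200"   # host used in this article; adjust to yours

# Poll the root endpoint until elasticsearch answers or the attempts run out.
# $1 = base URL, $2 = maximum number of attempts (default 30)
wait_for_es() {
  attempts_left="${2:-30}"
  while [ "$attempts_left" -gt 0 ]; do
    if curl -s -f -o /dev/null "$1"; then
      return 0
    fi
    attempts_left=$((attempts_left - 1))
    sleep 2
  done
  return 1
}

# Show the cluster members and the plugins each node has loaded.
show_cluster() {
  curl -s "$1/_cat/nodes?v"
  curl -s "$1/_cat/plugins?v"
}
```

Running `wait_for_es "$ES_URL" && show_cluster "$ES_URL"` should list both nodes and their plugins once the cluster is up.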
curl -X PUT http://192.168.1.101:9200/test001
curl -X POST \
'http://192.168.1.101:9200/test001/_analyze?pretty=true' \
-H 'Content-Type: application/json' \
-d '{"text":"我们是软件工程师","tokenizer":"ik_smart"}'
The response is shown below; the ik tokenizer is clearly working:
{
"tokens" : [
{
"token" : "我们",
"start_offset" : 0,
"end_offset" : 2,
"type" : "CN_WORD",
"position" : 0
},
{
"token" : "是",
"start_offset" : 2,
"end_offset" : 3,
"type" : "CN_CHAR",
"position" : 1
},
{
"token" : "软件",
"start_offset" : 3,
"end_offset" : 5,
"type" : "CN_WORD",
"position" : 2
},
{
"token" : "工程师",
"start_offset" : 5,
"end_offset" : 8,
"type" : "CN_WORD",
"position" : 3
}
]
}
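Besides ik_smart, the ik plugin also provides the finer-grained ik_max_word tokenizer. To compare the two on the same text, the request above can be wrapped in a small helper. This is a sketch: the `analyze` function is hypothetical, and it assumes the host address and the test001 index used in this article.

```shell
ES_URL="http://192.168.1.101:9200"   # host used in this article; adjust to yours
INDEX="test001"                      # index created earlier

# Send text $2 to the _analyze API using tokenizer $1.
# The text is pasted into the JSON body as is, so it must not contain
# double quotes or backslashes.
analyze() {
  curl -s -X POST "$ES_URL/$INDEX/_analyze?pretty=true" \
    -H 'Content-Type: application/json' \
    -d "{\"text\":\"$2\",\"tokenizer\":\"$1\"}"
}
```

Calling `analyze ik_smart "我们是软件工程师"` repeats the request above, and `analyze ik_max_word "我们是软件工程师"` sends the same text through the other tokenizer.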
The image built above exists only on the local machine; we can push it to a docker registry so that more users can use it.
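Pushing could look like the sketch below. It assumes the target registry is Docker Hub and that you are allowed to push to the `bolingcavalry` namespace; with your own account, the image would first need a tag in your namespace (for example via `docker tag`). The `push_image` function is a hypothetical helper.

```shell
IMAGE="bolingcavalry/elasticsearch-with-ik:6.5.0"

# Log in to the registry (docker prompts for credentials), then push image $1.
push_image() {
  docker login && docker push "$1"
}
```

Running `push_image "$IMAGE"` uploads the image, after which other machines can fetch it with `docker pull`.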
That completes building and verifying the ik tokenizer image. I hope it helps you use the elasticsearch service more conveniently under docker.