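Elasticsearch is a highly scalable, open-source, distributed, RESTful full-text search and analytics engine. It lets users store, search, and analyze large volumes of data quickly, in near real time. It is typically used as the underlying engine for applications with complex search features and requirements. Here are some scenarios where ES can be used: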
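Use ES to store a product catalog and inventory, and to provide search and autocomplete suggestions for them.
Use Logstash to collect, aggregate, and parse data, then have Logstash feed that data into ES. You can then search and aggregate in ES to find the information developers care about.
Use Kibana to build custom dashboards that visualize the data. You can also use ES aggregations to run complex business analytics over that data.
Why bring up Doug Cutting? Because Elasticsearch is built on Lucene, and Lucene was written by none other than Doug Cutting.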
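Quoted from: 鲜枣课堂
On September 4, 1998, Google was founded in Silicon Valley. As everyone knows, it is a company that started out with a search engine.
http://static.cyblogs.com/640.jpg
As it happens, an American engineer named Doug Cutting was also fascinated by search engines. He wrote a function library for text search (think of it as a reusable software component) and named it Lucene.
http://static.cyblogs.com/pUm6Hxkd434Mk1VTAruKa8.jpg
Left: Doug Cutting. Right: the Lucene logo.
Lucene is written in Java, and its goal is to add full-text search to all kinds of small and medium-sized applications. Because it is easy to use and open source, it became very popular with programmers. In the early days the project was published on Doug Cutting's personal website and on SourceForge (an open-source software site). Later, at the end of 2001, Lucene became a subproject of the Apache Software Foundation's Jakarta project.
http://static.cyblogs.com/iaqYeVeXiaLwMxssVyfyV0f69tfVMod6.jpg
The Apache Software Foundation, which anyone working in IT will recognize.
In 2004, Doug Cutting kept up the momentum. Building on Lucene, he worked with fellow Apache contributor Mike Cafarella to develop an open-source search engine that could replace the mainstream search engines of the time, and named it Nutch.
http://static.cyblogs.com/aqYeVeXiaLwMxssV.png
Nutch is a web search application built on top of the Lucene core that can be downloaded and used directly. It adds a web crawler and other web-related features to Lucene, with the aim of scaling from simple site search up to search across the whole web, just like Google.
Nutch had more influence in the industry than Lucene. Many websites adopted the Nutch platform, which lowered the technical barrier considerably and made it possible for cheap commodity computers to replace expensive web servers. For a while there was even a wave of low-cost startups in Silicon Valley built on Nutch.
Over time, both Google and Nutch faced the problem that the volume of data to be searched kept growing. Google in particular, as an internet search engine, had to store an enormous number of web pages while continually improving its search algorithms and search efficiency.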
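http://static.cyblogs.com/TAruKa8WKbr3qDia9ba.jpg
The Google search box.
Along the way, Google found quite a few good solutions and shared them generously.
In 2003, Google published a technical paper describing its Google File System (GFS), a dedicated file system that Google designed to store massive amounts of search data.
The following year, 2004, Doug Cutting implemented a distributed file storage system based on the GFS paper and named it NDFS (Nutch Distributed File System).
http://static.cyblogs.com/google_gfs_ndfs.jpg
Also in 2004, Google published another technical paper, this time introducing its MapReduce programming model, which is used for parallel analysis of large data sets (larger than 1 TB).
The following year, 2005, Doug Cutting implemented MapReduce in the Nutch search engine as well.
http://static.cyblogs.com/goole_mapreduce.jpg
In 2006, Yahoo, which was still a powerhouse at the time, hired Doug Cutting.
http://static.cyblogs.com/goole_mapreduce_002.jpg
Some background on why Yahoo hired Doug: before 2004, Yahoo, an internet pioneer, used Google's search engine for its own search service. Starting in 2004, Yahoo dropped Google and began developing its own search engine. So...
After joining Yahoo, Doug Cutting upgraded and reworked NDFS and MapReduce and renamed the result Hadoop (NDFS was renamed HDFS, the Hadoop Distributed File System).
That is the origin of Hadoop, the now famous big data framework. Doug Cutting came to be known as the father of Hadoop.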
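http://static.cyblogs.com/goole_mapreduce_003.jpg
The name Hadoop actually comes from a yellow toy elephant that belonged to Doug Cutting's son. That is why the Hadoop logo is a running yellow elephant.
http://static.cyblogs.com/goole_mapreduce_004.jpg
Let's continue.
Still in 2006, Google published yet another paper. This time it introduced BigTable, a distributed data storage system: a non-relational database for handling massive amounts of data.
Doug Cutting did not let this one pass either. He brought BigTable into his Hadoop system and named it HBase.
http://static.cyblogs.com/goole_mapreduce_005.jpg
In short, he kept pace with Google: whatever Google published, he learned from.
As a result, the core parts of Hadoop basically all carry Google's shadow.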
http://static.cyblogs.com/goole_mapreduce_006.png
This also shows that by standing on the shoulders of giants, or by imitating the strong, you can still carve out a path of your own.
➜ Tools brew search elasticsearch
==> Formulae
elasticsearch elasticsearch@2.4 elasticsearch@5.6
➜ Tools brew install elasticsearch@5.6
==> Downloading https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.16.tar.gz
######################################################################## 100.0%
Warning: elasticsearch@5.6 has been deprecated!
==> Caveats
Data: /usr/local/var/elasticsearch/elasticsearch_chenyuan/
Logs: /usr/local/var/log/elasticsearch/elasticsearch_chenyuan.log
Plugins: /usr/local/opt/elasticsearch@5.6/libexec/plugins/
Config: /usr/local/etc/elasticsearch/
plugin script: /usr/local/opt/elasticsearch@5.6/libexec/bin/elasticsearch-plugin
elasticsearch@5.6 is keg-only, which means it was not symlinked into /usr/local,
because this is an alternate version of another formula.
If you need to have elasticsearch@5.6 first in your PATH run:
echo 'export PATH="/usr/local/opt/elasticsearch@5.6/bin:$PATH"' >> ~/.zshrc
To have launchd start elasticsearch@5.6 now and restart at login:
brew services start elasticsearch@5.6
Or, if you don't want/need a background service you can just run:
/usr/local/opt/elasticsearch@5.6/bin/elasticsearch
==> Summary
🍺 /usr/local/Cellar/elasticsearch@5.6/5.6.16: 106 files, 36.0MB, built in 10 seconds
Check the version number. The command is not found:
➜ ~ elasticsearch --version
zsh: command not found: elasticsearch
Looking back at the install log above, the formula is keg-only, so you need to add it to PATH yourself:
echo 'export PATH="/usr/local/opt/elasticsearch@5.6/bin:$PATH"' >> ~/.zshrc
# Reload the environment variables
➜ ~ source ~/.zshrc
➜ ~ elasticsearch --version
Version: 5.6.16, Build: 3a740d1/2019-03-13T15:33:36.565Z, JVM: 1.8.0_162
Start Elasticsearch
➜ ~ elasticsearch
[2020-05-14T21:47:06,301][INFO ][o.e.n.Node ] [] initializing ...
[2020-05-14T21:47:06,403][INFO ][o.e.e.NodeEnvironment ] [vXW29Yn] using [1] data paths, mounts [[/ (/dev/disk1s5)]], net usable_space [42.5gb], net total_space [465.7gb], spins? [unknown], types [apfs]
[2020-05-14T21:47:06,404][INFO ][o.e.e.NodeEnvironment ] [vXW29Yn] heap size [1.9gb], compressed ordinary object pointers [true]
[2020-05-14T21:47:06,406][INFO ][o.e.n.Node ] node name [vXW29Yn] derived from node ID [vXW29YnkRDaIb8XuGeKRxQ]; set [node.name] to override
[2020-05-14T21:47:06,406][INFO ][o.e.n.Node ] version[5.6.16], pid[75858], build[3a740d1/2019-03-13T15:33:36.565Z], OS[Mac OS X/10.15.4/x86_64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_162/25.162-b12]
[2020-05-14T21:47:06,406][INFO ][o.e.n.Node ] JVM arguments [-Xms2g, -Xmx2g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -Djdk.io.permissionsUseCanonicalPath=true, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Dlog4j.skipJansi=true, -XX:+HeapDumpOnOutOfMemoryError, -Des.path.home=/usr/local/Cellar/elasticsearch@5.6/5.6.16/libexec]
[2020-05-14T21:47:07,237][INFO ][o.e.p.PluginsService ] [vXW29Yn] loaded module [aggs-matrix-stats]
[2020-05-14T21:47:07,237][INFO ][o.e.p.PluginsService ] [vXW29Yn] loaded module [ingest-common]
[2020-05-14T21:47:07,237][INFO ][o.e.p.PluginsService ] [vXW29Yn] loaded module [lang-expression]
[2020-05-14T21:47:07,238][INFO ][o.e.p.PluginsService ] [vXW29Yn] loaded module [lang-groovy]
[2020-05-14T21:47:07,238][INFO ][o.e.p.PluginsService ] [vXW29Yn] loaded module [lang-mustache]
[2020-05-14T21:47:07,238][INFO ][o.e.p.PluginsService ] [vXW29Yn] loaded module [lang-painless]
[2020-05-14T21:47:07,238][INFO ][o.e.p.PluginsService ] [vXW29Yn] loaded module [parent-join]
[2020-05-14T21:47:07,238][INFO ][o.e.p.PluginsService ] [vXW29Yn] loaded module [percolator]
[2020-05-14T21:47:07,238][INFO ][o.e.p.PluginsService ] [vXW29Yn] loaded module [reindex]
[2020-05-14T21:47:07,238][INFO ][o.e.p.PluginsService ] [vXW29Yn] loaded module [transport-netty3]
[2020-05-14T21:47:07,238][INFO ][o.e.p.PluginsService ] [vXW29Yn] loaded module [transport-netty4]
[2020-05-14T21:47:07,239][INFO ][o.e.p.PluginsService ] [vXW29Yn] no plugins loaded
[2020-05-14T21:47:08,643][INFO ][o.e.d.DiscoveryModule ] [vXW29Yn] using discovery type [zen]
[2020-05-14T21:47:09,099][INFO ][o.e.n.Node ] initialized
[2020-05-14T21:47:09,099][INFO ][o.e.n.Node ] [vXW29Yn] starting ...
[2020-05-14T21:47:09,347][INFO ][o.e.t.TransportService ] [vXW29Yn] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[2020-05-14T21:47:12,405][INFO ][o.e.c.s.ClusterService ] [vXW29Yn] new_master {vXW29Yn}{vXW29YnkRDaIb8XuGeKRxQ}{0aNOjaAGSGGLSXHQUS-lyg}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2020-05-14T21:47:12,425][INFO ][o.e.h.n.Netty4HttpServerTransport] [vXW29Yn] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
[2020-05-14T21:47:12,425][INFO ][o.e.n.Node ] [vXW29Yn] started
[2020-05-14T21:47:12,431][INFO ][o.e.g.GatewayService ] [vXW29Yn] recovered [0] indices into cluster_state
Open http://localhost:9200/ directly in a browser:
http://static.cyblogs.com/QQ20200514-214940@2x.jpg
➜ ~ brew search kibana
==> Formulae
kibana kibana@5.6
➜ ~ brew install kibana@5.6
==> Downloading https://mirrors.ustc.edu.cn/homebrew-bottles/bottles/kibana%405.6-5.6.16.catalina.bottle.1.tar.gz
==> Downloading from https://akamai.bintray.com/f4/f451a8784dc52182670152d040f6533d4dc2f1b251ef3797eed6c6ff565db8af?__gda__=exp=1589465020~hm
######################################################################## 100.0%
Warning: kibana@5.6 has been deprecated!
==> Pouring kibana@5.6-5.6.16.catalina.bottle.1.tar.gz
==> Caveats
Config: /usr/local/etc/kibana/
If you wish to preserve your plugins upon upgrade, make a copy of
/usr/local/opt/kibana@5.6/plugins before upgrading, and copy it into the
new keg location after upgrading.
kibana@5.6 is keg-only, which means it was not symlinked into /usr/local,
because this is an alternate version of another formula.
If you need to have kibana@5.6 first in your PATH run:
echo 'export PATH="/usr/local/opt/kibana@5.6/bin:$PATH"' >> ~/.zshrc
To have launchd start kibana@5.6 now and restart at login:
brew services start kibana@5.6
Or, if you don't want/need a background service you can just run:
/usr/local/opt/kibana@5.6/bin/kibana
==> Summary
🍺 /usr/local/Cellar/kibana@5.6/5.6.16: 37,391 files, 200MB
In the same way, add Kibana to PATH.
Start Kibana
➜ ~ kibana
log [13:56:55.842] [info][status][plugin:kibana@5.6.16] Status changed from uninitialized to green - Ready
log [13:56:55.901] [info][status][plugin:elasticsearch@5.6.16] Status changed from uninitialized to yellow - Waiting for Elasticsearch
log [13:56:55.923] [info][status][plugin:console@5.6.16] Status changed from uninitialized to green - Ready
log [13:56:55.952] [info][status][plugin:metrics@5.6.16] Status changed from uninitialized to green - Ready
log [13:56:56.158] [info][status][plugin:timelion@5.6.16] Status changed from uninitialized to green - Ready
log [13:56:56.162] [info][listening] Server running at http://localhost:5601
log [13:56:56.163] [info][status][ui settings] Status changed from uninitialized to yellow - Elasticsearch plugin is yellow
log [13:57:01.165] [info][status][plugin:elasticsearch@5.6.16] Status changed from yellow to yellow - No existing Kibana index found
log [13:57:02.456] [info][status][plugin:elasticsearch@5.6.16] Status changed from yellow to green - Kibana index ready
log [13:57:02.457] [info][status][ui settings] Status changed from yellow to green - Ready
Open http://localhost:5601/ directly in a browser:
http://static.cyblogs.com/QQ20200514-215814@2x.jpg
PUT /megacorp/employee/1
{
"first_name" : "John",
"last_name" : "Smith",
"age" : 25,
"about" : "I love to go rock climbing",
"interests": [ "sports", "music" ]
}
http://static.cyblogs.com/QQ20200514-220241@2x.jpg
{
"_index": "megacorp",
"_type": "employee",
"_id": "2",
"_version": 1, // document version
"result": "created", // whether the document was created or updated
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"created": true
}
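I worked through the syntax and explanations in the official documentation one by one. Personally, I think the official Elasticsearch documentation is quite good: https://www.elastic.co/guide/cn/elasticsearch/guide/current/foreword_id.html It is easy to follow.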
POST /my_store/products/_bulk
{ "index": { "_id": 1 }}
{ "price" : 10, "productID" : "XHDK-A-1293-#fJ3" }
{ "index": { "_id": 2 }}
{ "price" : 20, "productID" : "KDKE-B-9947-#kL5" }
{ "index": { "_id": 3 }}
{ "price" : 30, "productID" : "JODL-X-1937-#pV7" }
{ "index": { "_id": 4 }}
{ "price" : 30, "productID" : "QQPX-R-3956-#aD8" }
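The `_bulk` request above uses a newline-delimited format: each action line is followed by its document source on the next line, and the body ends with a newline. As a minimal sketch of how such a body could be assembled (the helper name `build_bulk_body` is hypothetical and not part of any client library):

```python
import json


def build_bulk_body(docs):
    """Serialize (doc_id, source) pairs into a newline-delimited bulk body."""
    lines = []
    for doc_id, source in docs:
        # Action line: tells Elasticsearch to index the next line under this _id.
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(source))
    # The bulk body must be terminated by a final newline.
    return "\n".join(lines) + "\n"


products = [
    (1, {"price": 10, "productID": "XHDK-A-1293-#fJ3"}),
    (2, {"price": 20, "productID": "KDKE-B-9947-#kL5"}),
]
body = build_bulk_body(products)
print(body)
```

The resulting string could be sent as the request body of `POST /my_store/products/_bulk`; sending it is left out here so that the sketch stays free of network assumptions.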
GET /my_store/products/_search
{
"query" : {
"constant_score" : {
"filter" : {
"term" : {
"price" : 20
}
}
}
}
}
GET /my_store/products/_search
{
"query" : {
"constant_score" : {
"filter" : {
"term" : {
"productID" : "XHDK-A-1293-#fJ3"
}
}
}
}
}
GET /my_store/_analyze
{
"field": "productID",
"text": "XHDK-A-1293-#fJ3"
}
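The `_analyze` request above helps explain why the `term` query on `productID` finds nothing when the field is an analyzed text field: the indexed terms are the analyzer's tokens, not the original string. The sketch below is only a rough approximation of the standard analyzer (split on non-alphanumeric characters and lowercase); the function name is hypothetical and the real analyzer follows Unicode text segmentation rules.

```python
import re


def approx_standard_analyze(text):
    """Rough stand-in for the standard analyzer: split on non-alphanumerics, lowercase."""
    return [token.lower() for token in re.split(r"[^0-9A-Za-z]+", text) if token]


product_id = "XHDK-A-1293-#fJ3"
tokens = approx_standard_analyze(product_id)
print(tokens)

# A term query compares its value against the indexed tokens exactly,
# so the original, unanalyzed string is not among them.
print(product_id in tokens)
```

This is why the index is deleted and recreated with an explicit mapping that stores `productID` as a single, unanalyzed term.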
DELETE /my_store
PUT /my_store
{
"mappings" : {
"products" : {
"properties" : {
"productID" : {
"type" : "keyword"
}
}
}
}
}
GET /my_store/products/_search
{
"query" : {
"constant_score" : {
"filter" : {
"bool" : {
"should" : [
{ "term" : {"price" : 20}},
{ "term" : {"productID" : "XHDK-A-1293-#fJ3"}}
],
"must_not" : {
"term" : {"price" : 30}
}
}
}
}
}
}
GET /my_store/products/_search
{
"query" : {
"constant_score" : {
"filter" : {
"terms" : {
"price" : [20, 30]
}
}
}
}
}
GET /my_store/products/_search
{
"query" : {
"constant_score" : {
"filter" : {
"range" : {
"price" : {
"gte" : 20,
"lt" : 40
}
}
}
}
}
}
POST /my_index/posts/_bulk
{ "index": { "_id": "1" }}
{ "tags" : ["search"] }
{ "index": { "_id": "2" }}
{ "tags" : ["search", "open_source"] }
{ "index": { "_id": "3" }}
{ "other_field" : "some data" }
{ "index": { "_id": "4" }}
{ "tags" : null }
{ "index": { "_id": "5" }}
{ "tags" : ["search", null] }
GET /my_index/posts/_search
{
"query" : {
"constant_score" : {
"filter" : {
"exists" : { "field" : "tags" }
}
}
}
}
GET /my_index/posts/_search
{
"query" : {
"constant_score" : {
"filter": {
"bool" : {
"must_not" : {
"exists" : { "field" : "tags" }
}
}
}
}
}
}
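The two requests above select documents that do, or do not, have a value for `tags`. A field counts as present only if it holds at least one non-null value. The following sketch models that rule over the five posts indexed earlier; `has_value` is a hypothetical helper, not an Elasticsearch API.

```python
def has_value(doc, field):
    """Return True if the field holds at least one non-null value."""
    value = doc.get(field)
    if value is None:
        return False
    if isinstance(value, list):
        # An array counts only if at least one element is non-null.
        return any(item is not None for item in value)
    return True


posts = {
    "1": {"tags": ["search"]},
    "2": {"tags": ["search", "open_source"]},
    "3": {"other_field": "some data"},
    "4": {"tags": None},
    "5": {"tags": ["search", None]},
}

with_tags = sorted(doc_id for doc_id, doc in posts.items() if has_value(doc, "tags"))
without_tags = sorted(doc_id for doc_id in posts if doc_id not in with_tags)
print(with_tags, without_tags)
```

Document 5 still counts as having `tags` because one of its array elements is non-null, while documents 3 and 4 do not.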
POST /cars/transactions/_bulk
{ "index": {}}
{ "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" }
{ "index": {}}
{ "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" }
{ "index": {}}
{ "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }
{ "index": {}}
{ "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" }
{ "index": {}}
{ "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" }
{ "index": {}}
{ "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }
GET /cars/transactions/_search
{
"size" : 0,
"aggs" : {
"popular_colors" : {
"terms" : {
"field" : "color.keyword"
}
}
}
}
GET /cars/transactions/_search
{
"size" : 0,
"aggs": {
"colors": {
"terms": {
"field": "color.keyword"
},
"aggs": {
"avg_price": {
"avg": {
"field": "price"
}
}
}
}
}
}
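To make the nesting concrete, here is a small sketch of what a `terms` aggregation with an `avg` sub-aggregation computes, run in memory over the eight car transactions indexed above. It is an illustration of the semantics only; the function name is hypothetical, and breaking ties by key is an assumption of this sketch.

```python
from collections import defaultdict

CARS = [
    {"price": 10000, "color": "red", "make": "honda"},
    {"price": 20000, "color": "red", "make": "honda"},
    {"price": 30000, "color": "green", "make": "ford"},
    {"price": 15000, "color": "blue", "make": "toyota"},
    {"price": 12000, "color": "green", "make": "toyota"},
    {"price": 20000, "color": "red", "make": "honda"},
    {"price": 80000, "color": "red", "make": "bmw"},
    {"price": 25000, "color": "blue", "make": "ford"},
]


def terms_with_avg(docs, bucket_field, metric_field):
    """Group docs by bucket_field, then average metric_field inside each bucket."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[bucket_field]].append(doc[metric_field])
    buckets = [
        {"key": key, "doc_count": len(values), "avg_price": sum(values) / len(values)}
        for key, values in groups.items()
    ]
    # Largest buckets first; ties broken by key (an assumption of this sketch).
    buckets.sort(key=lambda bucket: (-bucket["doc_count"], bucket["key"]))
    return buckets


colors = terms_with_avg(CARS, "color", "price")
for bucket in colors:
    print(bucket)
```

Each bucket corresponds to one color, and the metric is computed only over the documents that fall into that bucket.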
GET /cars/transactions/_search
{
"size" : 0,
"aggs": {
"colors": {
"terms": {
"field": "color.keyword"
},
"aggs": {
"avg_price": {
"avg": {
"field": "price"
}
},
"make": {
"terms": {
"field": "make.keyword"
}
}
}
}
}
}
GET /cars/transactions/_search
{
"size" : 0,
"aggs": {
"colors": {
"terms": {
"field": "color.keyword"
},
"aggs": {
"avg_price": { "avg": { "field": "price" }
},
"make" : {
"terms" : {
"field" : "make.keyword"
},
"aggs" : {
"min_price" : { "min": { "field": "price"} },
"max_price" : { "max": { "field": "price"} }
}
}
}
}
}
}
GET /cars/transactions/_search
{
"size" : 0,
"aggs":{
"price":{
"histogram":{
"field": "price",
"interval": 20000
},
"aggs":{
"revenue": {
"sum": {
"field" : "price"
}
}
}
}
}
}
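The `histogram` aggregation above places each document into a fixed-width bucket whose key is the price rounded down to a multiple of the interval, and the `sum` sub-aggregation adds up revenue per bucket. A minimal in-memory sketch of that arithmetic, using the prices indexed earlier (the function name is hypothetical, and only non-empty buckets are produced here):

```python
PRICES = [10000, 20000, 30000, 15000, 12000, 20000, 80000, 25000]


def histogram_revenue(prices, interval):
    """Bucket prices by floor(price / interval) * interval and sum each bucket."""
    buckets = {}
    for price in prices:
        key = (price // interval) * interval
        bucket = buckets.setdefault(key, {"doc_count": 0, "revenue": 0})
        bucket["doc_count"] += 1
        bucket["revenue"] += price
    # Return buckets ordered by key, lowest price range first.
    return dict(sorted(buckets.items()))


revenue_by_range = histogram_revenue(PRICES, 20000)
for key, bucket in revenue_by_range.items():
    print(key, bucket)
```

With an interval of 20000, a car sold for 25000 lands in the bucket keyed 20000, which covers prices from 20000 up to but not including 40000.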
GET /cars/transactions/_search
{
"size" : 0,
"aggs": {
"makes": {
"terms": {
"field": "make.keyword",
"size": 10
},
"aggs": {
"stats": {
"extended_stats": {
"field": "price"
}
}
}
}
}
}
GET /cars/transactions/_search
{
"size" : 0,
"aggs": {
"sales": {
"date_histogram": {
"field": "sold",
"interval": "month",
"format": "yyyy-MM-dd",
"min_doc_count" : 0,
"extended_bounds" : {
"min" : "2014-01-01",
"max" : "2014-12-31"
}
}
}
}
}
GET /cars/transactions/_search
{
"size" : 0,
"aggs": {
"sales": {
"date_histogram": {
"field": "sold",
"interval": "quarter",
"format": "yyyy-MM-dd",
"min_doc_count" : 0,
"extended_bounds" : {
"min" : "2014-01-01",
"max" : "2014-12-31"
}
},
"aggs": {
"per_make_sum": {
"terms": {
"field": "make.keyword"
},
"aggs": {
"sum_price": {
"sum": { "field": "price" }
}
}
},
"total_sum": {
"sum": { "field": "price" }
}
}
}
}
}
GET /cars/transactions/_search
{
"query" : {
"match" : {
"make" : "ford"
}
},
"aggs" : {
"colors" : {
"terms" : {
"field" : "color.keyword"
}
}
}
}
GET /cars/transactions/_search
{
"size" : 0,
"query" : {
"match" : {
"make" : "ford"
}
},
"aggs" : {
"single_avg_price": {
"avg" : { "field" : "price" }
},
"all": {
"global" : {},
"aggs" : {
"avg_price": {
"avg" : { "field" : "price" }
}
}
}
}
}
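The last request combines two scopes: `single_avg_price` is computed over the documents that match the query (`make` is ford), while the `global` bucket ignores the query and covers every document in the index. The sketch below reproduces that difference in memory with the same eight transactions; it illustrates the scoping only and is not how Elasticsearch executes the request.

```python
CARS = [
    (10000, "honda"), (20000, "honda"), (30000, "ford"), (15000, "toyota"),
    (12000, "toyota"), (20000, "honda"), (80000, "bmw"), (25000, "ford"),
]


def average(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Scoped to the query: only the ford transactions.
single_avg_price = average([price for price, make in CARS if make == "ford"])

# Global scope: every transaction, regardless of the query.
global_avg_price = average([price for price, _ in CARS])

print(single_avg_price, global_avg_price)
```

Comparing the two numbers side by side is the usual reason for using a `global` bucket: one request returns both the figure for the filtered set and the baseline for the whole index.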
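A bool filter is made up of three parts:
{
"bool" : {
"must" : [],
"should" : [],
"must_not" : []
}
}
must: all of the clauses must match; equivalent to AND.
must_not: none of the clauses may match; equivalent to NOT.
should: at least one of the clauses must match; equivalent to OR.
It is that simple! When you need multiple filters, just place them into the appropriate sections of the bool filter.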
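The three sections can be read as plain boolean logic. The sketch below evaluates them over the four products indexed earlier, following the AND / NOT / OR description above. It is a simplification: `term` and `matches_bool` are hypothetical helpers, `productID` is compared as an exact value, and details such as `minimum_should_match` are not modeled.

```python
def term(field, value):
    """Build a clause that matches when the field equals the value exactly."""
    return lambda doc: doc.get(field) == value


def matches_bool(doc, must=(), should=(), must_not=()):
    """Apply must (AND), must_not (NOT) and should (OR) clauses to one document."""
    if not all(clause(doc) for clause in must):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    # If should clauses are given, at least one of them has to match.
    if should and not any(clause(doc) for clause in should):
        return False
    return True


products = {
    1: {"price": 10, "productID": "XHDK-A-1293-#fJ3"},
    2: {"price": 20, "productID": "KDKE-B-9947-#kL5"},
    3: {"price": 30, "productID": "JODL-X-1937-#pV7"},
    4: {"price": 30, "productID": "QQPX-R-3956-#aD8"},
}

hits = [
    doc_id
    for doc_id, doc in products.items()
    if matches_bool(
        doc,
        should=[term("price", 20), term("productID", "XHDK-A-1293-#fJ3")],
        must_not=[term("price", 30)],
    )
]
print(hits)
```

Product 1 matches through its `productID`, product 2 through its price, and products 3 and 4 are excluded by the `must_not` clause.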
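A traditional database adds an index to specific columns, for example a B-Tree index, to speed up lookups. Elasticsearch and Lucene use an inverted index for the same purpose; Lucene's term index for the inverted index is built on an FST (finite state transducer).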
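As a rough illustration of the idea, an inverted index maps each term to the set of documents that contain it, so a lookup by term does not need to scan every document. The sketch below uses a plain dictionary in place of Lucene's FST-based term structures, and whitespace splitting in place of a real analyzer; both are simplifying assumptions, and the second sample sentence is made up for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


docs = {
    1: "I love to go rock climbing",
    2: "I like to collect rock albums",
}
index = build_inverted_index(docs)

print(sorted(index["rock"]))
print(sorted(index["climbing"]))
```

Looking up a term is a single dictionary access here; in Lucene, compact structures such as the FST serve to locate terms efficiently without holding the whole term dictionary in memory.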
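Master election in Elasticsearch is handled by the ZenDiscovery module, which discovers the other nodes in the same cluster and connects to them via unicast (older versions also supported multicast).
How does a node pick the node that it considers to be the master?
It sorts all master-eligible nodes (node.master: true) lexicographically by nodeId and picks the first one (position 0), provisionally treating it as the master.
If the number of votes for a node reaches a quorum (n/2+1, where n is the number of master-eligible nodes) and that node also votes for itself, it becomes the master. Otherwise the election is repeated until these conditions are met.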
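The rules above can be sketched as a small simulation. This is a simplified model under stated assumptions, not the actual ZenDiscovery code: every node is master-eligible, `views` (a hypothetical structure) lists which node IDs each node can currently see, and the quorum is computed as n/2+1 rather than read from the `discovery.zen.minimum_master_nodes` setting.

```python
def pick_candidate(visible_ids):
    """Each node provisionally picks the lexicographically smallest node ID it sees."""
    return sorted(visible_ids)[0]


def elect_master(views):
    """Return the elected master ID, or None if no candidate reaches a quorum.

    views maps each node ID to the set of master-eligible node IDs it can see.
    """
    quorum = len(views) // 2 + 1
    votes = {}
    for node_id, visible_ids in views.items():
        votes.setdefault(pick_candidate(visible_ids), set()).add(node_id)
    for candidate, voters in votes.items():
        # The candidate needs a quorum of votes and must have voted for itself.
        if len(voters) >= quorum and candidate in voters:
            return candidate
    return None


healthy = {"a": {"a", "b", "c"}, "b": {"a", "b", "c"}, "c": {"b", "c"}}
isolated = {"a": {"a"}, "b": {"b"}, "c": {"c"}}

print(elect_master(healthy))
print(elect_master(isolated))
```

In the second case every node only sees itself, no candidate reaches two votes, and the election would have to be retried, which is the quorum's protection against split-brain.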
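Read and write flow for shards in a cluster
First: routing and replica consistency
Elasticsearch uses a very simple method for routing:
shard = hash(routing) % number_of_primary_shards
Every document has a routing parameter, which defaults to its _id. The _id is hashed and taken modulo the number of primary shards of the index; the result is the ID of the shard where the document is stored.
Because the modulo depends entirely on the divisor, an Elasticsearch index has a limitation: the number of primary shards cannot be changed freely. If it changed, existing documents would route to different shards and could no longer be found.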
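The effect of changing the divisor is easy to see in a small experiment. The sketch below uses `zlib.crc32` as a stand-in hash, which is an assumption made for illustration (Elasticsearch uses a Murmur3 hash of the routing value); the function name `shard_for` is hypothetical.

```python
import zlib


def shard_for(routing, number_of_primary_shards):
    """Compute a shard ID as hash(routing) % number_of_primary_shards."""
    # crc32 stands in for the real hash function; it is stable across runs.
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


doc_ids = [str(i) for i in range(1, 101)]

# Where each document lives with 5 primary shards, and where the same
# formula would look for it if the index suddenly had 6.
with_5_shards = {doc_id: shard_for(doc_id, 5) for doc_id in doc_ids}
with_6_shards = {doc_id: shard_for(doc_id, 6) for doc_id in doc_ids}

moved = [doc_id for doc_id in doc_ids if with_5_shards[doc_id] != with_6_shards[doc_id]]
print(len(moved), "of", len(doc_ids), "documents would be looked up on a different shard")
```

Most documents would be looked up on a shard other than the one that holds them, which is why the primary shard count is fixed when the index is created.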
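For a distributed system, data replicas are more or less standard, so the Elasticsearch write path naturally involves replicas. With replicas configured, the flow from the moment data is sent to an Elasticsearch node until that node's response is received is as follows:
Second: shard allocation configuration
The section above covered how documents are indexed into shards: the routing calculation determines the ID of the shard a document belongs to. But how is the placement of shards across the cluster decided?
In general, Elasticsearch decides automatically which node a shard is allocated to. The following situations trigger allocation: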
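How can we integrate mature open-source frameworks more efficiently? Here is a recommendation: a handy ES + Spring framework that is REST-based and supports a MyBatis-like way of writing queries.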
<!-- ES start -->
<dependency>
<groupId>com.bbossgroups.plugins</groupId>
<artifactId>bboss-elasticsearch-spring-boot-starter</artifactId>
<version>6.1.1</version>
<exclusions>
<exclusion>
<artifactId>slf4j-log4j12</artifactId>
<groupId>org.slf4j</groupId>
</exclusion>
</exclusions>
</dependency>
<!-- ES end -->
<!-- Makes it easy to work with the various ECharts domain objects -->
<dependency>
<groupId>com.github.abel533</groupId>
<artifactId>ECharts</artifactId>
<version>3.0.0.6</version>
</dependency>
A later post will be devoted to integrating and customizing bboss-elasticsearch-spring-boot-starter.
# Elasticsearch configuration
spring.elasticsearch.bboss.elasticsearch.rest.hostNames=elasticsearch-test.za.net:9200
<properties>
<property name="pieLoanSuccessAndGroupByProduct">
<![CDATA[
{
"query": {
"bool": {
"filter": {
"range": {
"gmt_created": {
"include_lower": true,
"include_upper": true,
"from": #[from],
"to": #[to]
}
}
},
"must": {
"match": {
"status": 5
}
}
}
},
"size": 0,
"aggs": {
"list": {
"terms": {
"field": "product_code"
}
}
}
}
]]>
</property>
</properties>
@Test
public void testSearchAgg() throws Exception {
String mappath = "esmapper/LoanApply.xml";
// Create a client that loads the DSL config file; it is used to search documents and is a thread-safe single instance
ClientInterface clientInterface = bbossESStarter.getConfigRestClient(mappath);
Map<String, Object> params = new HashMap<String, Object>();
params.put("from", "2017-01-14 12:14:09");
params.put("to", "2018-05-14 12:14:09");
String path = index + "/" + type + "/_search";
ESAggDatas<LongAggHit> response = clientInterface.searchAgg(path,
"pieLoanSuccessAndGroupByProduct",
params,
LongAggHit.class,
"list");
log.info("response={}", JSONUtils.toFormatJsonString(response));
}
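That is a simple example. A detailed introduction to this open-source project may follow in a later post, although its own documentation is already very nice: https://esdoc.bbossgroups.com/#/quickstart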