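Elasticsearch RESTful API and Java API

I am currently using Elasticsearch 5.2.1, and the server IP is 192.168.5.182, so the Java API and the jar dependencies shown here may differ from those of other versions.

The commonly used RESTful APIs are as follows: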

Health check
GET http://192.168.5.182:9200/_cat/health?v

List indices
GET http://192.168.5.182:9200/_cat/indices?v

Create an index
PUT http://192.168.5.182:9200/test_index?pretty

Delete an index
DELETE http://192.168.5.182:9200/test_index

Index a document (index ecommerce, type product, ID 1)
PUT http://192.168.5.182:9200/ecommerce/product/1
BODY
{
  "name": "zhonghua yagao",
  "desc": "caoben zhiwu",
  "price": 40,
  "producer": "zhonghua producer",
  "tags": ["qingxin"]
}

Get a document
GET http://192.168.5.182:9200/ecommerce/product/1

Update a document, option 1 (full replacement, every field must be supplied)
PUT http://192.168.5.182:9200/ecommerce/product/1
BODY
{
  "name": "jiaqiangban zhonghua yagao",
  "desc": "caoben zhiwu",
  "price": 40,
  "producer": "zhonghua producer",
  "tags": ["qingxin"]
}

Update a document, option 2 (partial update)
POST http://192.168.5.182:9200/ecommerce/product/1/_update
BODY
{
  "doc": {
    "name": "gaolujie yagao"
  }
}

Delete a document
DELETE http://192.168.5.182:9200/ecommerce/product/1

Search all documents
GET http://192.168.5.182:9200/ecommerce/product/_search

Query string search
GET http://192.168.5.182:9200/ecommerce/product/_search?q=name:yagao&sort=price:desc

Query DSL search
curl -XGET 'http://192.168.5.182:9200/ecommerce/product/_search' -d'
{
  "query": {
    "match_all": {}
  }
}'

Sorted query
curl -XGET 'http://192.168.5.182:9200/ecommerce/product/_search' -d'
{
  "query": {
    "match": { "name": "yagao" }
  },
  "sort": [
    { "price": "desc" }
  ]
}'

Paged query
curl -XGET 'http://192.168.5.182:9200/ecommerce/product/_search' -d'
{
  "query": {
    "match_all": {}
  },
  "from": 1,
  "size": 1
}'

Return only the specified fields
curl -XGET 'http://192.168.5.182:9200/ecommerce/product/_search' -d'
{
  "query": {
    "match_all": {}
  },
  "_source": ["name", "price"]
}'

Query filter: search for yagao with a price greater than 25
curl -XGET 'http://192.168.5.182:9200/ecommerce/product/_search' -d'
{
  "query": {
    "bool": {
      "must": {
        "match": { "name": "yagao" }
      },
      "filter": {
        "range": {
          "price": { "gt": 25 }
        }
      }
    }
  }
}'

Full-text search
curl -XGET 'http://192.168.5.182:9200/ecommerce/product/_search' -d'
{
  "query": {
    "match": { "producer": "yagao producer" }
  }
}'

Phrase search
curl -XGET 'http://192.168.5.182:9200/ecommerce/product/_search' -d'
{
  "query": {
    "match_phrase": { "producer": "yagao producer" }
  }
}'

Highlight search
curl -XGET 'http://192.168.5.182:9200/ecommerce/product/_search' -d'
{
  "query": {
    "match": { "producer": "producer" }
  },
  "highlight": {
    "fields": {
      "producer": {}
    }
  }
}'

Set the fielddata property of the text field tags to true
PUT http://192.168.5.182:9200/ecommerce/_mapping/product
BODY
{
  "properties": {
    "tags": {
      "type": "text",
      "fielddata": true
    }
  }
}

Aggregate on tags (the matching documents are returned as well)
curl -XGET 'http://192.168.5.182:9200/ecommerce/product/_search' -d'
{
  "aggs": {
    "group_by_tags": {
      "terms": { "field": "tags" }
    }
  }
}'

Aggregate on tags (aggregation results only, no documents)
curl -XGET 'http://192.168.5.182:9200/ecommerce/product/_search' -d'
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": { "field": "tags" }
    }
  }
}'

Aggregate only over documents that match a query
curl -XGET 'http://192.168.5.182:9200/ecommerce/product/_search' -d'
{
  "size": 0,
  "query": {
    "match": { "name": "yagao" }
  },
  "aggs": {
    "group_by_tags": {
      "terms": { "field": "tags" }
    }
  }
}'

Average price per tag
curl -XGET 'http://192.168.5.182:9200/ecommerce/product/_search' -d'
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": { "field": "tags" },
      "aggs": {
        "avg_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}'

Average price per tag, sorted by the average in descending order
curl -XGET 'http://192.168.5.182:9200/ecommerce/product/_search' -d'
{
  "size": 0,
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags",
        "order": { "avg_price": "desc" }
      },
      "aggs": {
        "avg_price": {
          "avg": { "field": "price" }
        }
      }
    }
  }
}'

Group by price range, then by tags, then compute the average price
curl -XGET 'http://192.168.5.182:9200/ecommerce/product/_search' -d'
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          { "from": 0, "to": 20 },
          { "from": 20, "to": 40 },
          { "from": 40, "to": 60 }
        ]
      },
      "aggs": {
        "group_by_tags": {
          "terms": { "field": "tags" },
          "aggs": {
            "average_price": {
              "avg": { "field": "price" }
            }
          }
        }
      }
    }
  }
}'

Create the company index with fielddata enabled on country (builds the forward index needed to aggregate on that text field)
PUT http://192.168.5.182:9200/company
BODY
{
  "mappings": {
    "employee": {
      "properties": {
        "age": { "type": "long" },
        "country": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword", "ignore_above": 256 }
          },
          "fielddata": true
        },
        "join_date": { "type": "date" },
        "name": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword", "ignore_above": 256 }
          }
        },
        "position": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword", "ignore_above": 256 }
          }
        },
        "salary": { "type": "long" }
      }
    }
  }
}
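The same REST calls can also be issued from plain Java. What follows is only a minimal sketch, assuming Java 11+ (for java.net.http) and the server address used in this article. The class name EsRestSketch and its two helper methods are made up for illustration. It builds the paged match_all body and the matching request object without sending anything, and it uses POST because the _search endpoint accepts POST as well as GET.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class EsRestSketch {
    // Assumption: the host and port are the ones used throughout this article.
    static final String BASE_URL = "http://192.168.5.182:9200";

    /** Builds a match_all search body for a 1-based page number. */
    static String pagedMatchAllBody(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        int from = (page - 1) * size; // "from" is a zero-based document offset
        return "{\"query\":{\"match_all\":{}},\"from\":" + from + ",\"size\":" + size + "}";
    }

    /** Builds (but does not send) a search request against index/type. */
    static HttpRequest searchRequest(String index, String type, String jsonBody) {
        return HttpRequest.newBuilder()
                .uri(URI.create(BASE_URL + "/" + index + "/" + type + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    public static void main(String[] args) {
        // Page 2 with one document per page corresponds to "from":1,"size":1 above.
        String body = pagedMatchAllBody(2, 1);
        HttpRequest request = searchRequest("ecommerce", "product", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
        // To send it: HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    }
}
```

Computing "from" out of a page number keeps the zero-based offset in one place, which is where paging bugs usually come from.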

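For the Java API we first need the corresponding jar. The Maven configuration is shown below (before starting, run the RESTful call above that creates the company index with fielddata enabled on country).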

<dependency>
   <groupId>org.elasticsearch.client</groupId>
   <artifactId>transport</artifactId>
   <version>5.2.1</version>
</dependency>

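For 5.2.1 this single dependency is enough. The required configuration differs from version to version; higher versions also need the following: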

<dependency>
   <groupId>org.elasticsearch</groupId>
   <artifactId>elasticsearch</artifactId>
</dependency>

As usual, we add the following settings to the configuration file under resources (note that the RESTful API uses port 9200, while the Java API uses port 9300).

elasticsearch:
  clusterName: aubin-cluster
  clusterNodes: 192.168.5.182:9300

The configuration class is as follows:

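import java.net.InetAddress;
import java.net.UnknownHostException;

import lombok.Getter;
import lombok.Setter;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;
import org.springframework.boot.context.properties.ConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Getter
@Setter
@Configuration
@ConfigurationProperties(prefix = "elasticsearch")
public class ElasticSearchConfig {

   private String clusterName;

   private String clusterNodes;

    /**
     * Only triggered when the Elasticsearch implementation is used.
     *
     * @return the configured TransportClient
     */
   @Bean
   public TransportClient transportClient() {
      // Set the cluster name
      Settings settings = Settings.builder().put("cluster.name", this.clusterName).build();
      TransportClient client = new PreBuiltTransportClient(settings);
      try {
         // The node list is a comma-separated list of host:port entries
         for (String clusterNode : this.clusterNodes.split(",")) {
            String ip = clusterNode.split(":")[0];
            String port = clusterNode.split(":")[1];
            client.addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName(ip), Integer.parseInt(port)));
         }
      } catch (UnknownHostException e) {
         e.printStackTrace();
      }

      return client;
   }
}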
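The loop in transportClient() assumes every entry of clusterNodes is a well-formed host:port pair. As a hedged illustration of that parsing step on its own, here is a standard-library sketch that also tolerates spaces and trailing commas. ClusterNodesParser is a hypothetical helper, not part of Elasticsearch or Spring, and the second address in the demo is invented.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

class ClusterNodesParser {
    /** Parses "host1:port1,host2:port2" into unresolved socket addresses. */
    static List<InetSocketAddress> parse(String clusterNodes) {
        List<InetSocketAddress> addresses = new ArrayList<>();
        for (String node : clusterNodes.split(",")) {
            String trimmed = node.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate empty entries such as a doubled comma
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon <= 0 || colon == trimmed.length() - 1) {
                throw new IllegalArgumentException("Expected host:port but got: " + trimmed);
            }
            String host = trimmed.substring(0, colon);
            int port = Integer.parseInt(trimmed.substring(colon + 1));
            // createUnresolved avoids a DNS lookup while parsing
            addresses.add(InetSocketAddress.createUnresolved(host, port));
        }
        return addresses;
    }

    public static void main(String[] args) {
        // The second node is a made-up address, used only to show a multi-node list.
        for (InetSocketAddress address : parse("192.168.5.182:9300, 192.168.5.183:9300")) {
            System.out.println(address.getHostString() + " -> " + address.getPort());
        }
    }
}
```

Validating the list up front turns a malformed entry into a clear error message instead of an ArrayIndexOutOfBoundsException inside the bean method.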

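In 5.2.1 the address is built with InetSocketTransportAddress, a concrete class that implements the TransportAddress interface. In 6.0 and later, InetSocketTransportAddress no longer exists and TransportAddress is used here instead; in those versions TransportAddress is itself a concrete class rather than an interface.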

Next we create a data class:

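import java.io.IOException;
import java.util.Iterator;
import java.util.Map;

import javax.annotation.PostConstruct;

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.xcontent.XContentFactory;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.aggregations.Aggregation;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramInterval;
import org.elasticsearch.search.aggregations.bucket.histogram.Histogram;
import org.elasticsearch.search.aggregations.bucket.terms.StringTerms;
import org.elasticsearch.search.aggregations.metrics.avg.Avg;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

@Component
public class DataEs {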
    @Autowired
    private TransportClient transportClient;

    /**
     * Adds the initial data.
     * @throws IOException
     */
    @PostConstruct
    private void init() throws IOException {
        transportClient.prepareIndex("company","employee","1").setSource(XContentFactory.jsonBuilder().startObject()
                .field("name","jack")
                .field("age",27)
                .field("position","technique software")
                .field("country","China")
                .field("join_date","2018-01-01")
                .field("salary",10000)
                .endObject()).get();
        transportClient.prepareIndex("company","employee","2").setSource(XContentFactory.jsonBuilder().startObject()
                .field("name","marry")
                .field("age",35)
                .field("position","technique manager")
                .field("country","China")
                .field("join_date","2018-01-01")
                .field("salary",12000)
                .endObject()).get();
        transportClient.prepareIndex("company","employee","3").setSource(XContentFactory.jsonBuilder().startObject()
                .field("name","tom")
                .field("age",32)
                .field("position","senior technique software")
                .field("country","China")
                .field("join_date","2017-01-01")
                .field("salary",11000)
                .endObject()).get();
        transportClient.prepareIndex("company","employee","4").setSource(XContentFactory.jsonBuilder().startObject()
                .field("name","jen")
                .field("age",25)
                .field("position","junior finance")
                .field("country","USA")
                .field("join_date","2017-01-01")
                .field("salary",7000)
                .endObject()).get();
        transportClient.prepareIndex("company","employee","5").setSource(XContentFactory.jsonBuilder().startObject()
                .field("name","mike")
                .field("age",37)
                .field("position","finance manager")
                .field("country","USA")
                .field("join_date","2016-01-01")
                .field("salary",15000)
                .endObject()).get();
    }

    /**
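     * Employee search application.
     * Searches for employees whose position contains "technique"
     * and whose age is between 30 and 40.
     * Paged query: fetches the first page.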
     */
    public void executeSearch() {
        SearchResponse searchResponse = transportClient.prepareSearch("company")
                .setTypes("employee")
                .setQuery(QueryBuilders.matchQuery("position", "technique"))
                .setPostFilter(QueryBuilders.rangeQuery("age").from(30).to(40))
                .setFrom(0).setSize(1)
                .get();
        SearchHit[] hits = searchResponse.getHits().getHits();
        for (int i = 0;i < hits.length;i++) {
            System.out.println(hits[i].getSourceAsString());
        }
    }

    /**
     * Employee aggregation application.
     * First groups by country,
     * then, within each country, groups by the year of joining,
     * and finally computes the average salary of each group.
     */
    public void executeAggregation() {
        SearchResponse searchResponse = transportClient.prepareSearch("company")
                .addAggregation(AggregationBuilders.terms("group_by_country").field("country")
                .subAggregation(AggregationBuilders.dateHistogram("group_by_join_date")
                .field("join_date").dateHistogramInterval(DateHistogramInterval.YEAR)
                .subAggregation(AggregationBuilders.avg("avg_salary").field("salary"))))
                .execute().actionGet();
        Map<String,Aggregation> aggrMap = searchResponse.getAggregations().asMap();
        StringTerms groupByCountry = (StringTerms) aggrMap.get("group_by_country");
        Iterator<StringTerms.Bucket> groupByCountryBucketIterator = groupByCountry.getBuckets().iterator();
        while (groupByCountryBucketIterator.hasNext()) {
            StringTerms.Bucket groupByCountryBucket = groupByCountryBucketIterator.next();
            System.out.println(groupByCountryBucket.getKey() + ":" + groupByCountryBucket.getDocCount());
            Histogram groupByJoinDate = (Histogram) groupByCountryBucket.getAggregations().asMap().get("group_by_join_date");
            Iterator<? extends Histogram.Bucket> groupByJoinDateIterator = groupByJoinDate.getBuckets().iterator();
            while (groupByJoinDateIterator.hasNext()) {
                Histogram.Bucket groupByJoinDateBucket = groupByJoinDateIterator.next();
                System.out.println(groupByJoinDateBucket.getKey() + ":" + groupByJoinDateBucket.getDocCount());
                Avg avg = (Avg) groupByJoinDateBucket.getAggregations().asMap().get("avg_salary");
                System.out.println(avg.getValue());
            }
        }
    }
    public void close() {
        transportClient.close();
    }
}
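To make clear what executeAggregation() is expected to return, the following standard-library sketch reproduces the same grouping (country, then join year, then average salary) in memory over the five employees indexed in init(). It does not call Elasticsearch at all. AggregationSketch and its nested Employee class are stand-ins invented for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class AggregationSketch {
    /** Minimal stand-in for an employee document (only the fields the aggregation uses). */
    static final class Employee {
        final String country;
        final int joinYear;
        final long salary;

        Employee(String country, int joinYear, long salary) {
            this.country = country;
            this.joinYear = joinYear;
            this.salary = salary;
        }
    }

    /** The same five employees that init() indexes. */
    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("China", 2018, 10000),
                new Employee("China", 2018, 12000),
                new Employee("China", 2017, 11000),
                new Employee("USA", 2017, 7000),
                new Employee("USA", 2016, 15000));
    }

    /** country -> join year -> average salary, mirroring terms > date_histogram > avg. */
    static Map<String, Map<Integer, Double>> averageSalary(List<Employee> employees) {
        // country -> join year -> {salary sum, employee count}
        Map<String, Map<Integer, long[]>> totals = new TreeMap<>();
        for (Employee e : employees) {
            long[] total = totals
                    .computeIfAbsent(e.country, country -> new TreeMap<>())
                    .computeIfAbsent(e.joinYear, year -> new long[2]);
            total[0] += e.salary;
            total[1] += 1;
        }
        Map<String, Map<Integer, Double>> averages = new TreeMap<>();
        for (Map.Entry<String, Map<Integer, long[]>> byCountry : totals.entrySet()) {
            Map<Integer, Double> byYear = new TreeMap<>();
            for (Map.Entry<Integer, long[]> entry : byCountry.getValue().entrySet()) {
                long[] total = entry.getValue();
                byYear.put(entry.getKey(), (double) total[0] / total[1]);
            }
            averages.put(byCountry.getKey(), byYear);
        }
        return averages;
    }

    public static void main(String[] args) {
        averageSalary(sampleEmployees()).forEach((country, byYear) ->
                byYear.forEach((year, avg) ->
                        System.out.println(country + " " + year + " avg salary = " + avg)));
    }
}
```

One difference to expect from the real query: country is an analyzed text field, so with the default analyzer the bucket keys returned by Elasticsearch are lowercased tokens such as china and usa rather than the original strings.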

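The main program calls it as follows. (In practice you may want to load the data first and run the searches in a later step: Elasticsearch is near real-time, so a newly written document only becomes searchable after the next index refresh, which by default happens once per second.)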

@SpringBootApplication
public class EsApplication {
   public static void main(String[] args) {
      ApplicationContext applicationContext = SpringApplication.run(EsApplication.class, args);
      DataEs dataEs = (DataEs) applicationContext.getBean(DataEs.class);
      dataEs.executeSearch();
      dataEs.executeAggregation();
      dataEs.close();
   }
}
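Because of the refresh delay mentioned above, searching immediately after init() may return nothing. One option is an explicit refresh through the client, for example transportClient.admin().indices().prepareRefresh("company").get(). Another is to poll until the data is visible. Below is a small standard-library sketch of the polling approach. SearchVisibility and waitUntil are hypothetical names, and the condition in main() is only a stand-in for a real check such as "the search returns the expected number of hits".

```java
import java.util.function.BooleanSupplier;

class SearchVisibility {
    /**
     * Polls the condition until it returns true or the timeout elapses.
     *
     * @return true if the condition became true in time, false on timeout or interruption
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag for the caller
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Stand-in condition: turns true after about 300 ms, like a document
        // becoming searchable once the index has been refreshed.
        boolean visible = waitUntil(() -> System.currentTimeMillis() - start >= 300, 2000, 50);
        System.out.println("visible = " + visible);
    }
}
```

Polling keeps the default refresh behaviour of the index untouched, whereas forcing a refresh on every write is simpler but costs indexing throughput.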
