Cloudera Day 4: Flume

DataScience
Published 2019-12-30 11:38:34
Included in the column: A2Data
What we call life

is a series of determined efforts

· Here comes the main text ·

Flume

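Flume is a scalable, real-time ingestion framework that lets you route, filter, aggregate, and perform "mini-operations" on data on its way into a scalable processing platform such as CDH. You should, however, keep the logic performed during ingestion to a minimum: this preserves capacity for other workloads and prevents ingestion bottlenecks, while still letting you use the scalability of the CDH cluster for heavier processing. If you need heavy aggregation or multi-step ETL on incoming data, use Spark, an in-memory processing framework that scales with the rest of the platform and has advanced analytics built in.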

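Note also that in a real production system, routing log events through syslog is probably the better choice. It makes for a more robust deployment because it does not depend on file appends.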


Last login: Thu Dec 27 05:55:50 2018 from 192.168.6.1

Creating an empty configuration

For the sake of this tutorial, you won't need to actually execute steps 1 or 2, as we have included the configuration and the schema file in your cluster already. They can be reviewed by exploring /opt/examples/flume/solr_configs.

If you were doing this on your own, you would generate the configs by executing the following command:

[root@quickstart ~]# solrctl --zk quickstart:2181/solr instancedir --generate solr_configs

You don't need to do this for this tutorial. We have already generated the configuration for you. This instruction is here in case you want to create your own index.

The result of this command would be a skeleton configuration that you could then customize to your liking. The primary thing that you would ordinarily be customizing is the conf/schema.xml, which we cover in the next step.

Edit your schema

As mentioned previously, we have already generated the configuration files for you. You can view the modified sample schema here.

The most common area that you would be interested in is the <fields></fields> section. From this area you can define the fields that are present and searchable in your index.
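
As a quick way to see which fields a schema defines, the sketch below prints the name attribute of each <field> element. The function name list_schema_fields is made up for illustration, and the sketch assumes each <field> element sits on a single line, which is common in schema.xml files but not guaranteed. The path in the usage comment is an assumption that combines the solr_configs directory and the conf/schema.xml file mentioned above.

```shell
# Hypothetical helper: print the name of every <field .../> element in a
# Solr schema file, one per line. Assumes one <field> element per line.
# <fieldType> and <dynamicField> elements are not matched, because the
# pattern requires whitespace right after "<field".
list_schema_fields() {
  sed -n 's/.*<field[[:space:]][^>]*name="\([^"]*\)".*/\1/p' "$1"
}

# Usage (assumed path, run on the cluster node):
#   list_schema_fields /opt/examples/flume/solr_configs/conf/schema.xml
```

A proper XML parser would be more robust against multi-line elements; sed is used here only because it is available on practically every node.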

Uploading your configuration


[root@quickstart ~]# cd /opt/examples/flume
[root@quickstart flume]#  solrctl --zk quickstart:2181/solr instancedir --create live_logs ./solr_configs
Uploading configs from ./solr_configs/conf to quickstart:2181/solr. This may take up to a minute.


Creating your collection

[root@quickstart flume]# solrctl --zk quickstart:2181/solr collection --create live_logs -s 1

You can verify that you successfully created your collection in Solr by going to Hue and clicking Search in the top menu.

Then click Indexes in the top-right corner to view all indexes/collections.

You can now see the fields that we defined in the schema.xml file.
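
Besides Hue, the collection can also be checked from the shell. The sketch below pipes the output of solrctl's collection --list subcommand into a small filter. The helper name has_collection is made up for illustration, and the exact layout of the listing is an assumption: the filter only relies on the collection name appearing as a whole word somewhere in the output.

```shell
# Hypothetical helper: succeed if the collection name given as $1 appears
# as a whole word in the listing read from stdin.
has_collection() {
  grep -qw -- "$1"
}

# Usage (run on the cluster node, same ZooKeeper address as above):
#   solrctl --zk quickstart:2181/solr collection --list | has_collection live_logs \
#     && echo "live_logs exists"
```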

Starting the Log Generator

Your Cloudera Live cluster comes with a log generator that produces sample data.
Start the log generator by running the following commands:

[root@quickstart ~]# start_logs
[root@quickstart ~]# tail_logs
195.205.97.205 - - [03/Jan/2019:08:07:23 -0800] "GET /departments HTTP/1.1" 200 1272 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36"
211.12.138.198 - - [03/Jan/2019:08:07:24 -0800] "GET /departments HTTP/1.1" 200 2194 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:30.0) Gecko/20100101 Firefox/30.0"
100.27.86.204 - - [03/Jan/2019:08:07:16 -0800] "GET /categories/mlb%20players/products HTTP/1.1" 200 2214 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36"
85.39.147.56 - - [03/Jan/2019:08:07:17 -0800] "GET /departments HTTP/1.1" 200 324 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.77.4 (KHTML, like Gecko) Version/7.0.5 Safari/537.77.4"
110.91.75.175 - - [03/Jan/2019:08:07:18 -0800] "GET /product/1145 HTTP/1.1" 200 196 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.76.4 (KHTML, like Gecko) Version/7.0.4 Safari/537.76.4"
218.75.175.52 - - [03/Jan/2019:08:07:19 -0800] "GET /departments HTTP/1.1" 200 429 "-" "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:30.0) Gecko/20100101 Firefox/30.0"
140.117.68.194 - - [03/Jan/2019:08:07:20 -0800] "GET /department/outdoors/categories HTTP/1.1" 200 1500 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36"
145.216.153.36 - - [03/Jan/2019:08:07:21 -0800] "GET /product/1243 HTTP/1.1" 200 264 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36"
10.179.9.112 - - [03/Jan/2019:08:07:22 -0800] "GET /departments HTTP/1.1" 200 1779 "-" "Mozilla/5.0 (Windows NT 6.1; rv:30.0) Gecko/20100101 Firefox/30.0"
163.133.60.127 - - [03/Jan/2019:08:07:23 -0800] "GET /department/golf/products HTTP/1.1" 200 1640 "-" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36"
188.9.126.123 - - [03/Jan/2019:08:07:24 -0800] "GET /department/golf/categories HTTP/1.1" 200 1881 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.153 Safari/537.36"
85.39.147.56 - - [03/Jan/2019:08:07:25 -0800] "GET /departments HTTP/1.1" 200 1451 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.77.4 (KHTML, like Gecko) Version/7.0.5 Safari/537.77.4"
14.31.248.176 - - [03/Jan/2019:08:07:26 -0800] "GET /department/footwear/categories HTTP/1.1" 200 689 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36"
173.236.144.147 - - [03/Jan/2019:08:07:27 -0800] "GET /departments HTTP/1.1" 200 2115 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.77.4 (KHTML, like Gecko) Version/7.0.5 Safari/537.77.4"
108.81.243.150 - - [03/Jan/2019:08:07:28 -0800] "GET /departments HTTP/1.1" 200 1399 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"
^Z
[1]+  Stopped                 tail_logs
[root@quickstart ~]# stop_logs
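
To get a feel for what the generator produces before it is indexed, the lines can be summarized with standard tools. The sketch below counts requests per URL path. It assumes the whitespace-separated layout shown in the output above, where the path is the seventh field; the function name count_paths is made up for illustration.

```shell
# Hypothetical helper: read access-log lines from stdin and print
# "<count> <path>" pairs, most requested path first.
count_paths() {
  awk '{ count[$7]++ } END { for (p in count) print count[p], p }' |
    sort -k1,1nr -k2,2
}

# Usage, assuming tail_logs streams log lines to stdout as shown above:
#   tail_logs | head -n 100 | count_paths
```

This kind of ad hoc summary is only a sanity check; the point of the tutorial is that the same questions can later be answered through the Solr index.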

Flume and the morphline

Run the following command to start the Flume agent:
[root@quickstart ~]# flume-ng agent \
    --conf /opt/examples/flume/conf \
    --conf-file /opt/examples/flume/conf/flume.conf \
    --name agent1 \
    -Dflume.root.logger=DEBUG,INFO,console
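
The value passed to --name has to match the prefix used for the property keys inside the file given to --conf-file (for example agent1.sources, agent1.channels, agent1.sinks); with a mismatched name the agent usually starts without any components. The sketch below is a small pre-flight check for that. The helper name agent_defined is made up for illustration, and the contents of the tutorial's flume.conf are not shown in this article, so the check relies only on Flume's general key layout.

```shell
# Hypothetical helper: succeed if the config file ($1) declares sources,
# channels, or sinks for the given agent name ($2).
agent_defined() {
  grep -Eq "^[[:space:]]*$2\.(sources|channels|sinks)[[:space:]]*=" "$1"
}

# Usage (run on the cluster node before starting the agent):
#   agent_defined /opt/examples/flume/conf/flume.conf agent1 \
#     || echo "agent1 is not defined in flume.conf"
```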

Now you can go back to the Hue UI (see your cluster's guidance page for the link) and click Search from the collections page.

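'What we call success'

Persisting in doing simple things well is no simple feat.

Persisting in doing ordinary things well is extraordinary.

Everyone has latent energy, but it is all too easily:

masked by habit,

blurred by time,

worn away by inertia.

So what is success? It is extraordinary persistence in the midst of the ordinary.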

This article is part of the Tencent Cloud self-media sharing program and was shared from the DataScience WeChat official account.
Originally published: 2019-01-04. In case of infringement, contact cloudcommunity@tencent.com for removal.
