Flume 结构以及使用
# 每个组件的名称
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# netcat监控方式、监控的ip:localhost、端口:44444
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
# sink 的方式 logger
a1.sinks.k1.type = logger
# 写入到内存、
a1.channels.c1.type = memory
# 绑定source和sink到channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
用于监控Linux命令
a1.sources = r1
a1.channels = c1
# 指定类型、命令
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/secure
a1.sources.r1.channels = c1
用于监控文件,比Exec监控更加可靠
a1.channels = ch-1
a1.sources = src-1
fs.sources.r3.type=spooldir
fs.sources.r3.spoolDir=/opt/modules/apache-flume-1.6.0-bin/flume_template
fs.sources.r3.fileHeader=true
fs.sources.r3.ignorePattern=^(.)*\\.out$ # 过滤out结尾的文件
Spooling Directory Source 详细参数
中间文件存放在内存中
a1.channels = c1
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 10000
a1.channels.c1.byteCapacityBufferPercentage = 20
a1.channels.c1.byteCapacity = 800000
中间文件存放在文件中
a1.channels = c1
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /mnt/flume/checkpoint
a1.channels.c1.dataDirs = /mnt/flume/data
在INFO级别记录文件,通常用于调试
a1.channels = c1
a1.sinks = k1
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
记录文件写入到HDFS中
a1.channels = c1
a1.sinks = k1
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/events/%y-%m-%d/%H%M/%S
a1.sinks.k1.hdfs.filePrefix = events-
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute