
《快学BigData》 (Quick Learning BigData) -- Hadoop Summary (I) (42)

Author: 小徐
Published: 2019-08-05 14:43:43
Included in the column: Greenplum

Hadoop Summary

Overview

CDH

Installing Hadoop 2.6.4 (non-ZooKeeper cluster version)

Installing Hadoop 2.6.4 (ZooKeeper cluster version)

The overall MapReduce workflow in detail

The Hadoop HDFS system in detail

Operating HDFS from Java

Hadoop MapReduce examples

Other Hadoop notes

Hadoop optimization summary

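TestDFSIO, mrbench, and nnbench are three widely used Hadoop benchmarks. The tests below were run on HDP 2.6.0.3-8.

For the detailed test procedure, see: http://blog.csdn.net/xfg0218/article/details/78592512

1-1) Hadoop test programs

A) Change to the directory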

# cd /usr/hdp/2.6.0.3-8/hadoop-mapreduce

B) View the available options

# hadoop jar hadoop-mapreduce-client-jobclient-2.7.3.2.6.0.3-8.jar

*****

1-2) TestDFSIO write performance test

The main purpose is to measure Hadoop's write speed.

A) View the parameters

# hadoop jar hadoop-mapreduce-client-jobclient-2.7.3.2.6.0.3-8.jar TestDFSIO

17/11/21 14:46:38 INFO fs.TestDFSIO: TestDFSIO.1.8

Missing arguments.

Usage: TestDFSIO [genericOptions] -read [-random | -backward | -skip [-skipSize Size]] | -write | -append | -truncate | -clean [-compression codecClassName] [-nrFiles N] [-size Size[B|KB|MB|GB|TB]] [-resFile resultFileName] [-bufferSize Bytes] [-rootDir]

B) Run an example

# hadoop jar hadoop-mapreduce-client-jobclient-2.7.3.2.6.0.3-8.jar TestDFSIO -write -nrFiles 10 -size 10MB

*********

C) Inspect the data

# hadoop fs -ls -h /benchmarks/TestDFSIO/io_data

Found 10 items

-rw-r--r-- 3 admin hdfs 10 M 2017-11-21 14:53 /benchmarks/TestDFSIO/io_data/test_io_0

-rw-r--r-- 3 admin hdfs 10 M 2017-11-21 14:53 /benchmarks/TestDFSIO/io_data/test_io_1

***********

D) View the results

# cat TestDFSIO_results.log

----- TestDFSIO ----- : write

Date & time: Tue Nov 21 14:53:44 CST 2017

Number of files: 10

Total MBytes processed: 100.0

Throughput mb/sec: 19.485580670303975

Average IO rate mb/sec: 24.091276168823242

IO rate std deviation: 9.242316274402379

Test exec time sec: 63.103
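The throughput figure is much smaller than a naive "100 MB in 63 s" would suggest, so it helps to work the arithmetic backwards. The sketch below assumes, based on my reading of the TestDFSIO source, that "Throughput mb/sec" is the total data volume divided by the summed per-task I/O time (not by the job's wall-clock time). The helper name `io_seconds` is made up for this illustration.

```shell
# io_seconds TOTAL_MB THROUGHPUT_MB_PER_SEC
# Prints the summed per-task I/O time in seconds.
# Assumption: throughput = total MB / sum of per-task I/O time.
io_seconds() {
  awk -v mb="$1" -v tp="$2" 'BEGIN {
    if (tp <= 0) exit 1            # avoid dividing by zero on a failed run
    printf "%.3f\n", mb / tp
  }'
}

# Values taken from the write result above.
io_seconds 100.0 19.485580670303975   # prints 5.132
```

Under that assumption, only about 5 of the 63.103 seconds were spent writing data; the rest would be job submission and task startup overhead, which is why such a small data set says little about sustained write speed.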

1-3) TestDFSIO read performance test

The main purpose is to measure how fast Hadoop reads files.

A) Run the command

TestDFSIO usage:

Usage: TestDFSIO [genericOptions] -read [-random | -backward | -skip [-skipSize Size]] | -write | -append | -clean [-compression codecClassName] [-nrFiles N] [-size Size[B|KB|MB|GB|TB]] [-resFile resultFileName] [-bufferSize Bytes] [-rootDir]

# hadoop jar hadoop-mapreduce-client-jobclient-2.7.3.2.6.0.3-8.jar TestDFSIO -read -nrFiles 10 -size 10

***************

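B) View the results

The results file is appended to on each run, so the earlier write result appears before the new read result.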

# cat TestDFSIO_results.log

----- TestDFSIO ----- : write

Date & time: Tue Nov 21 14:53:44 CST 2017

Number of files: 10

Total MBytes processed: 100.0

Throughput mb/sec: 19.485580670303975

Average IO rate mb/sec: 24.091276168823242

IO rate std deviation: 9.242316274402379

Test exec time sec: 63.103

----- TestDFSIO ----- : read

Date & time: Tue Nov 21 15:04:33 CST 2017

Number of files: 10

Total MBytes processed: 100.0

Throughput mb/sec: 617.283950617284

Average IO rate mb/sec: 688.1331176757812

IO rate std deviation: 182.42935237458195

Test exec time sec: 36.148
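Because every run is appended to the same log, a one-line-per-run summary makes write and read results easier to compare. The following is a minimal sketch, assuming the log keeps the field labels shown above; the function name `summarize_dfsio` and the output layout are my own choices, not part of Hadoop.

```shell
# Read a TestDFSIO results log on stdin and print one summary line per run.
summarize_dfsio() {
  awk '
    /----- TestDFSIO -----/   { op = $NF }      # "write" or "read"
    /Number of files:/        { files = $NF }
    /Total MBytes processed:/ { mb = $NF }
    /Throughput mb\/sec:/     { tp = $NF }
    /Test exec time sec:/     {
      # The exec-time line is the last field of a run in the log above.
      printf "%s files=%d total_mb=%.1f throughput_mb_s=%.2f exec_s=%.3f\n",
             op, files, mb, tp, $NF
    }
  '
}

# Demo on the two runs recorded in this article.
summarize_dfsio <<'EOF'
----- TestDFSIO ----- : write
Number of files: 10
Total MBytes processed: 100.0
Throughput mb/sec: 19.485580670303975
Test exec time sec: 63.103
----- TestDFSIO ----- : read
Number of files: 10
Total MBytes processed: 100.0
Throughput mb/sec: 617.283950617284
Test exec time sec: 36.148
EOF
```

Against a real log the call would be `summarize_dfsio < TestDFSIO_results.log`.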

1-4) Clean up the test data

# hadoop jar hadoop-mapreduce-client-jobclient-2.7.3.2.6.0.3-8.jar TestDFSIO -clean

17/11/21 15:15:35 INFO fs.TestDFSIO: TestDFSIO.1.8

17/11/21 15:15:35 INFO fs.TestDFSIO: nrFiles = 1

17/11/21 15:15:35 INFO fs.TestDFSIO: nrBytes (MB) = 1.0

17/11/21 15:15:35 INFO fs.TestDFSIO: bufferSize = 1000000

17/11/21 15:15:35 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO

17/11/21 15:15:35 INFO fs.TestDFSIO: Cleaning up test files

1-5) Check the Hadoop file system

# hadoop fs -ls /benchmarks/
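The write, read, and clean steps above can be chained into one small script. This is only a sketch: the `JAR` default is assembled from the directory and jar name used in this article, and the `DRY_RUN` switch and the function names `run` and `dfsio_cycle` are hypothetical conveniences. By default it only prints the commands, so it can be tried on a machine without Hadoop.

```shell
# Assumed jar location (directory from step 1-1 plus the jar name used above).
JAR="${JAR:-/usr/hdp/2.6.0.3-8/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.7.3.2.6.0.3-8.jar}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = execute them

# Print or execute a command, depending on DRY_RUN.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# dfsio_cycle NR_FILES SIZE   e.g. dfsio_cycle 10 10MB
dfsio_cycle() {
  run hadoop jar "$JAR" TestDFSIO -write -nrFiles "$1" -size "$2"
  run hadoop jar "$JAR" TestDFSIO -read -nrFiles "$1" -size "$2"
  run hadoop jar "$JAR" TestDFSIO -clean
}

dfsio_cycle 10 10MB
```

Running it with `DRY_RUN=0` on a cluster node would execute the three commands in order; the read must follow a write with the same file count and size, since it reads the files the write created.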

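1-6) nnbench test [NameNode benchmark (nnbench)]

nnbench load-tests the NameNode: it generates a large number of HDFS-related requests to put the NameNode under heavy pressure.

The test can create, read, rename, and delete files on HDFS.

A) View the nnbench options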

# hadoop jar hadoop-mapreduce-client-jobclient-2.7.3.2.6.0.3-8.jar nnbench

*********

B) Run the command

The following example uses 10 mappers and 5 reducers to create 1000 files.

# hadoop jar hadoop-mapreduce-client-jobclient-2.7.3.2.6.0.3-8.jar nnbench -operation create_write -maps 10 -reduces 5 -numberOfFiles 1000 -replicationFactorPerFile 3 -readFileAfterOpen true

***************

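C) View the results

Note that this particular run reports 0 successful file operations and 5000 exceptions, so its TPS and latency figures (0, Infinity, NaN) do not measure NameNode performance.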

# cat NNBench_results.log

-------------- NNBench -------------- :

Version: NameNode Benchmark 0.4

Date & time: 2017-11-21 15:21:35,703

Test Operation: create_write

Start time: 2017-11-21 15:21:08,692

Maps to run: 10

Reduces to run: 5

Block Size (bytes): 1

Bytes to write: 0

Bytes per checksum: 1

Number of files: 1000

Replication factor: 3

Successful file operations: 0

# maps that missed the barrier: 5

# exceptions: 5000

TPS: Create/Write/Close: 0

Avg exec time (ms): Create/Write/Close: Infinity

Avg Lat (ms): Create/Write: NaN

Avg Lat (ms): Close: NaN

RAW DATA: AL Total #1: 0

RAW DATA: AL Total #2: 0

RAW DATA: TPS Total (ms): 21176

RAW DATA: Longest Map Time (ms): 4535.0

RAW DATA: Late maps: 5

RAW DATA: # of exceptions: 5000
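The log above reports zero successful operations and 5000 exceptions, which is easy to overlook among the other fields. A small check like the following can flag such a run automatically. It is a sketch that assumes the field labels shown in this log; the function name `check_nnbench` and its messages are invented for illustration.

```shell
# Read an NNBench results log on stdin.
# Exit status 0 if operations succeeded without exceptions, 1 otherwise.
check_nnbench() {
  awk '
    /Successful file operations:/ { ok = $NF }
    /^[ \t]*# exceptions:/        { ex = $NF }   # skips the "RAW DATA" line
    END {
      if (ok + 0 == 0 || ex + 0 > 0) {
        printf "FAILED: %d successful operations, %d exceptions\n", ok, ex
        exit 1
      }
      printf "OK: %d successful operations\n", ok
    }
  '
}

# Demo with the relevant lines from the run above.
check_nnbench <<'EOF' || echo "inspect the job logs before trusting TPS or latency"
Successful file operations: 0
# maps that missed the barrier: 5
# exceptions: 5000
TPS: Create/Write/Close: 0
EOF
```

On a real run the call would be `check_nnbench < NNBench_results.log`.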

1-7) mrbench test [MapReduce benchmark (mrbench)]

mrbench runs a small job repeatedly to check whether small jobs on the cluster run repeatably and efficiently.

A) View the help

# hadoop jar hadoop-mapreduce-client-jobclient-2.7.3.2.6.0.3-8.jar mrbench --help

MRBenchmark.0.0.2

Usage: mrbench [-baseDir <base DFS path for output/input, default is /benchmarks/MRBench>] [-jar <local path to job jar file containing Mapper and Reducer implementations, default is current jar file>] [-numRuns <number of times to run the job, default is 1>] [-maps <number of maps for each run, default is 2>] [-reduces <number of reduces for each run, default is 1>] [-inputLines <number of input lines to generate, default is 1>] [-inputType <type of input to generate, one of ascending (default), descending, random>] [-verbose]

B) The following example runs a small job twice

# hadoop jar hadoop-mapreduce-client-jobclient-2.7.3.2.6.0.3-8.jar mrbench -numRuns 2

MRBenchmark.0.0.2

*************

DataLines Maps Reduces AvgTime (milliseconds)

1 2 1 39012
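To compare mrbench runs over time, the average time can be pulled out of the result table and converted to seconds. This is a minimal sketch, assuming the table layout shown above (a header line starting with `DataLines`, followed by one data row whose last column is the average in milliseconds); the function name `mrbench_avg_seconds` is hypothetical.

```shell
# Read mrbench output on stdin and print the average job time in seconds.
mrbench_avg_seconds() {
  awk '
    seen && NF   { printf "%.3f\n", $NF / 1000; exit }  # first non-empty row after the header
    /^DataLines/ { seen = 1 }
  '
}

# Demo with the result table from the run above.
mrbench_avg_seconds <<'EOF'
DataLines Maps Reduces AvgTime (milliseconds)
1 2 1 39012
EOF
```

For the run above this gives 39.012 seconds per job, even though each job processes a single input line, so the figure mostly reflects job scheduling and startup cost.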

Originally published on 2018-03-28; shared from the WeChat official account 河马coding.
