经过上一篇的引导,相信小伙伴们已经可以进行下面的操作了
首先,我们需要知道配置伪分布式集群要修改的配置文件
所有配置文件都在 /opt/module/hadoop-2.7.2/etc/hadoop/
内
序号 | 文件名 |
---|---|
01 | hadoop-env.sh |
02 | core-site.xml |
03 | hdfs-site.xml |
序号 | 文件名 |
---|---|
01 | yarn-env.sh |
02 | yarn-site.xml |
03 | mapred-env.sh |
序号 | 文件名 |
---|---|
01 | mapred-site.xml |
序号 | 文件名 |
---|---|
01 | yarn-site.xml |
①Linux系统中获取JDK的安装路径(如果能记住路径可以省略):
[bigdata@hadoop001 ~]$ echo $JAVA_HOME
/opt/module/jdk1.8.0_144
下面需要修改JAVA_HOME 路径:
export JAVA_HOME=/opt/module/jdk1.8.0_144
[bigdata@hadoop001 hadoop]$ vim core-site.xml
<!-- 指定HDFS中NameNode的地址 -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop001:9000</value>
</property>
<!-- 指定Hadoop运行时产生文件的存储目录 -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/module/hadoop-2.7.2/data/tmp</value>
</property>
[bigdata@hadoop001 hadoop]$ vim hdfs-site.xml
<!-- 指定HDFS副本的数量 -->
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
[bigdata@hadoop001 hadoop-2.7.2]$ bin/hdfs namenode -format
和上述一样即为正确。
[bigdata@hadoop001 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start namenode
[bigdata@hadoop001 hadoop-2.7.2]$ sbin/hadoop-daemon.sh start datanode
[bigdata@hadoop001 logs]$ ll
# 下面的为日志文件
总用量 220
-rw-rw-r--. 1 bigdata bigdata 82138 4月 21 02:38 hadoop-bigdata-datanode-hadoop001.log
-rw-rw-r--. 1 bigdata bigdata 719 4月 21 02:38 hadoop-bigdata-datanode-hadoop001.out
-rw-rw-r--. 1 bigdata bigdata 719 4月 21 02:28 hadoop-bigdata-datanode-hadoop001.out.1
-rw-rw-r--. 1 bigdata bigdata 111269 4月 21 02:38 hadoop-bigdata-namenode-hadoop001.log
-rw-rw-r--. 1 bigdata bigdata 719 4月 21 02:38 hadoop-bigdata-namenode-hadoop001.out
-rw-rw-r--. 1 bigdata bigdata 719 4月 21 02:36 hadoop-bigdata-namenode-hadoop001.out.1
-rw-rw-r--. 1 bigdata bigdata 719 4月 21 02:30 hadoop-bigdata-namenode-hadoop001.out.2
-rw-rw-r--. 1 bigdata bigdata 719 4月 21 02:28 hadoop-bigdata-namenode-hadoop001.out.3
-rw-rw-r--. 1 bigdata bigdata 0 4月 21 02:28 SecurityAuth-bigdata.audit
[bigdata@hadoop001 logs]$ cat hadoop-bigdata-datanode-hadoop001.log
[bigdata@hadoop001 hadoop]$ vim yarn-env.sh
export JAVA_HOME=/opt/module/jdk1.8.0_144
[bigdata@hadoop001 hadoop]$ yarn-site.xml
<!-- Reducer获取数据的方式 -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>2</value>
</property>
<!-- 指定YARN的ResourceManager的地址 -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop001</value>
</property>
export JAVA_HOME=/opt/module/jdk1.8.0_144
[bigdata@hadoop001 hadoop]$ mv mapred-site.xml.template mapred-site.xml
[bigdata@hadoop001 hadoop]$ vim mapred-site.xml
<!-- 指定MR运行在YARN上 -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
# 启动服务
[bigdata@hadoop001 hadoop-2.7.2]$ sbin/yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /opt/module/hadoop-2.7.2/logs/yarn-bigdata-resourcemanager-hadoop001.out
[bigdata@hadoop001 hadoop-2.7.2]$ sbin/yarn-daemon.sh start nodemanager
starting nodemanager, logging to /opt/module/hadoop-2.7.2/logs/yarn-bigdata-nodemanager-hadoop001.out
# 查看是否启动成功
[bigdata@hadoop001 hadoop-2.7.2]$ jps
3414 DataNode
3993 ResourceManager
3722 NodeManager
3327 NameNode
4159 Jps
YARN的浏览器页面查看:http://hadoop001:8088/cluster
如果想要查看程序的历史运行情况,需要配置一下历史服务器。具体配置步骤如下:
[bigdata@hadoop001 hadoop]$ vim mapred-site.xml
# 在该文件里面增加如下配置。
<!-- 历史服务器端地址 -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop001:10020</value>
</property>
<!-- 历史服务器web端地址 -->
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop001:19888</value>
</property>
[bigdata@hadoop001 hadoop-2.7.2]$ sbin/mr-jobhistory-daemon.sh start historyserver
[bigdata@hadoop001 hadoop-2.7.2]$ jps
4304 JobHistoryServer
26210 Jps
3414 DataNode
3993 ResourceManager
3327 NameNode
4495 NodeManager
http://hadoop001:19888/jobhistory
日志聚集概念:应用运行完成以后,将程序运行日志信息上传到HDFS系统上。 日志聚集功能好处:可以方便的查看到程序运行详情,方便开发调试。
注意:开启日志聚集功能,需要重新启动NodeManager 、ResourceManager和HistoryManager。 下面为开启日志聚集功能具体操作步骤:
[bigdata@hadoop001 hadoop]$ vim yarn-site.xml
# 在该文件里面增加如下配置。
<!-- 日志聚集功能使能 -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<!-- 日志保留时间设置7天 -->
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
[bigdata@hadoop001 hadoop-2.7.2]$ sbin/yarn-daemon.sh stop resourcemanager
stopping resourcemanager
[bigdata@hadoop001 hadoop-2.7.2]$ sbin/yarn-daemon.sh stop nodemanager
stopping nodemanager
[bigdata@hadoop001 hadoop-2.7.2]$ sbin/mr-jobhistory-daemon.sh stop historyserver
stopping historyserver
[bigdata@hadoop001 hadoop-2.7.2]$ sbin/yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /opt/module/hadoop-2.7.2/logs/yarn-bigdata-resourcemanager-hadoop001.out
[bigdata@hadoop001 hadoop-2.7.2]$ sbin/yarn-daemon.sh start nodemanager
starting nodemanager, logging to /opt/module/hadoop-2.7.2/logs/yarn-bigdata-nodemanager-hadoop001.out
[bigdata@hadoop001 hadoop-2.7.2]$ sbin/mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /opt/module/hadoop-2.7.2/logs/mapred-bigdata-historyserver-hadoop001.out
[bigdata@hadoop001 hadoop-2.7.2]$ bin/hdfs dfs -rm -R /user/bigdata/output
# 如果没有input 可先创建
[bigdata@hadoop001 hadoop-2.7.2]$ bin/hdfs dfs -mkdir -p /user/bigdata/input
# 运行程序
[bigdata@hadoop001 hadoop-2.7.2]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /user/bigdata/input /user/bigdata/output
http://hadoop001:19888/jobhistory
各位路过的朋友,如果觉得可以学到些什么的话,点个赞再走吧,欢迎各位路过的大佬评论,指正错误,也欢迎有问题的小伙伴评论留言,私信。