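Hadoop Basics Tutorial - Chapter 9: High Availability (9.4 YARN High Availability) (Draft)

Chapter 9 High Availability (HA)

9.4 YARN High Availability

9.4.1 RM Single Point of Failure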

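The ResourceManager (RM) tracks the resources in a cluster and schedules applications (for example, MapReduce jobs). Before Hadoop 2.4, the ResourceManager was a single point of failure in a YARN cluster. The high-availability feature adds redundancy in the form of an Active/Standby ResourceManager pair to remove this single point of failure.

Reference: http://hadoop.apache.org/docs/r2.7.3/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html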

YARN high availability means ResourceManager high availability. The nodes are planned as follows:

IP               nodename   RM   NM
192.168.80.131   node1      Y    Y
192.168.80.132   node2      -    Y
192.168.80.133   node3      Y    Y

9.4.2 Configure yarn-site.xml

[root@node1 hadoop]# vi yarn-site.xml
[root@node1 hadoop]# cat yarn-site.xml 
<?xml version="1.0"?>
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
        <description>Auxiliary service run by the NodeManager (needed to run MapReduce jobs)</description>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
        <description>Enable RM high availability</description>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yarn1</value>
        <description>YARN cluster ID</description>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
        <description>List of RM IDs in the cluster when HA is enabled</description>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>node1</value>
        <description>Hostname of the first ResourceManager</description>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>node3</value>
        <description>Hostname of the second ResourceManager</description>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>node1:2181,node2:2181,node3:2181</value>
        <description>ZooKeeper quorum address list</description>
    </property>
</configuration>

[root@node1 hadoop]# 
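
Before distributing the file, it can help to confirm that every ID listed in yarn.resourcemanager.ha.rm-ids has a matching yarn.resourcemanager.hostname.<id> entry. The following is a minimal shell sketch, not part of the original setup: get_prop and list_rm_hosts are hypothetical helper names, and the line-based lookup assumes each <name> element is immediately followed by its <value> on the next line, as in the listing above.

```shell
# Hypothetical helper: print the <value> that follows a given <name> line.
# Assumes one element per line, with <value> directly after <name>.
get_prop() {
  file=$1
  name=$2
  grep -A1 -F "<name>$name</name>" "$file" |
    sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Hypothetical helper: print "id=hostname" for every RM ID in rm-ids.
list_rm_hosts() {
  file=$1
  ids=$(get_prop "$file" yarn.resourcemanager.ha.rm-ids)
  for id in $(printf '%s' "$ids" | tr ',' ' '); do
    printf '%s=%s\n' "$id" "$(get_prop "$file" "yarn.resourcemanager.hostname.$id")"
  done
}
```

Running list_rm_hosts yarn-site.xml in the configuration directory prints one id=hostname pair per RM; for the file above that is rm1=node1 and rm2=node3. An empty hostname points to a missing or misspelled property.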

9.4.3 Modify yarn-env.sh

Edit yarn-env.sh and set the directory where PID files are stored.

[root@node1 hadoop]# vi yarn-env.sh

Add the following line:

export YARN_PID_DIR=/var/run
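
yarn-daemon.sh writes each daemon's PID to a file under YARN_PID_DIR and reads it back when stopping the daemon; the default directory is /tmp, which the system may clean up periodically. A minimal sketch for checking a daemon through its PID file follows. The helper name pid_file_alive is hypothetical, and the default path assumes the yarn-<user>-<daemon>.pid naming used by yarn-daemon.sh together with the root user from this chapter.

```shell
# Hypothetical helper: succeed if the PID recorded in a PID file
# belongs to a running process.
pid_file_alive() {
  # Assumed default path: ResourceManager started by root, PID dir /var/run.
  pid_file=${1:-/var/run/yarn-root-resourcemanager.pid}
  [ -r "$pid_file" ] || return 1      # no PID file: daemon was never started here
  pid=$(cat "$pid_file")
  kill -0 "$pid" 2>/dev/null          # signal 0 only tests that the process exists
}
```

After YARN has been started, pid_file_alive with no argument should succeed on a node that runs a ResourceManager, and pid_file_alive /var/run/yarn-root-nodemanager.pid does the same for a NodeManager.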

9.4.4 Distribute the Configuration Files

[root@node1 hadoop]# scp yarn-site.xml node2:/opt/hadoop-2.7.3/etc/hadoop/
yarn-site.xml                                                                                                                                              100%  841     0.8KB/s   00:00    
[root@node1 hadoop]# scp yarn-site.xml node3:/opt/hadoop-2.7.3/etc/hadoop/
yarn-site.xml                                                                                                                                              100%  841     0.8KB/s   00:00     
[root@node1 hadoop]# scp yarn-env.sh node2:/opt/hadoop-2.7.3/etc/hadoop/
yarn-env.sh                                                                                                                                                100% 4595     4.5KB/s   00:00    
[root@node1 hadoop]# scp yarn-env.sh node3:/opt/hadoop-2.7.3/etc/hadoop/
yarn-env.sh                                                                                                                                                100% 4595     4.5KB/s   00:00    
[root@node1 hadoop]# 
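
The four scp commands above can also be written as a loop. This is a sketch under the same assumptions as the transcript (Hadoop installed under /opt/hadoop-2.7.3 on every node, passwordless SSH from node1); the function name distribute_config and the DRY_RUN switch are hypothetical additions for illustration.

```shell
# Hypothetical helper: copy both YARN config files to each host given as argument.
# Run it from the etc/hadoop directory on node1.
distribute_config() {
  conf_dir=/opt/hadoop-2.7.3/etc/hadoop
  for host in "$@"; do
    for f in yarn-site.xml yarn-env.sh; do
      if [ "${DRY_RUN:-0}" = 1 ]; then
        # Only show what would be executed.
        echo "scp $f $host:$conf_dir/"
      else
        scp "$f" "$host:$conf_dir/"
      fi
    done
  done
}
```

DRY_RUN=1 distribute_config node2 node3 prints the four scp commands without running them; dropping DRY_RUN performs the copy.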

9.4.5 Start YARN

[root@node1 hadoop]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-resourcemanager-node1.out
node2: starting nodemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-nodemanager-node2.out
node3: starting nodemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-nodemanager-node3.out
node1: starting nodemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-nodemanager-node1.out
[root@node1 hadoop]# 

node1

[root@node1 hadoop]# jps
4337 Jps
4036 NodeManager
3189 DataNode
3529 NameNode
3402 DFSZKFailoverController
3931 ResourceManager
3310 JournalNode
[root@node1 hadoop]#

node2

[root@node2 ~]# jps
4306 Jps
3828 DFSZKFailoverController
3670 JournalNode
4200 NodeManager
3449 DataNode
3947 NameNode
[root@node2 ~]#

node3

[root@node3 ~]# jps
2643 JournalNode
2771 NodeManager
2549 DataNode
2871 Jps

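start-yarn.sh starts a ResourceManager only on the node where it is run, so start the standby ResourceManager on node3 separately: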

[root@node3 ~]# yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-resourcemanager-node3.out
[root@node3 ~]# jps
2643 JournalNode
2771 NodeManager
2900 ResourceManager
2932 Jps
2549 DataNode
[root@node3 ~]# 
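
With both ResourceManagers running, their HA roles can be queried with yarn rmadmin -getServiceState, which reports active or standby for a given RM ID. The wrapper below is only a sketch; the function name print_rm_states is hypothetical, and rm1 and rm2 are the IDs configured in yarn-site.xml.

```shell
# Hypothetical helper: print the HA state of each RM ID passed as argument.
print_rm_states() {
  for id in "$@"; do
    # Fall back to "unknown" if the RM cannot be reached.
    state=$(yarn rmadmin -getServiceState "$id" 2>/dev/null) || state=unknown
    echo "$id: $state"
  done
}
```

Calling print_rm_states rm1 rm2 at this point would be expected to show rm1 (node1, started first) as active and rm2 (node3) as standby.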

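9.4.6 Web UI

Open http://192.168.80.131:8088 in a browser.

Opening http://192.168.80.133:8088 in a browser redirects to http://node1:8088/, because the standby RM forwards web requests to the active RM.

http://node1:8088/ is in fact http://192.168.80.131:8088. If the redirected page fails to load, it is only because the hosts file on the Windows host machine has no entry for node1.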

9.4.7 Testing

[root@node1 hadoop]# jps
4337 Jps
4036 NodeManager
3189 DataNode
3529 NameNode
3402 DFSZKFailoverController
3931 ResourceManager
3310 JournalNode
[root@node1 hadoop]# kill 3931
[root@node1 hadoop]# 

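Now open http://192.168.80.133:8088 again. With automatic failover enabled (the default when RM HA is on), the ResourceManager on node3 should have become active and should serve the page itself instead of redirecting to node1.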
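
The failover can also be observed from the command line by polling yarn rmadmin -getServiceState until the surviving RM reports active. The sketch below is illustrative only: wait_for_active and the POLL_INTERVAL variable are hypothetical names, and the retry count is an arbitrary choice.

```shell
# Hypothetical helper: poll an RM ID until it reports "active" or retries run out.
# Usage: wait_for_active <rm-id> [max-tries]
wait_for_active() {
  id=$1
  tries=${2:-10}
  while [ "$tries" -gt 0 ]; do
    if [ "$(yarn rmadmin -getServiceState "$id" 2>/dev/null)" = active ]; then
      echo "$id is active"
      return 0
    fi
    tries=$((tries - 1))
    sleep "${POLL_INTERVAL:-1}"   # seconds between polls
  done
  echo "$id did not become active" >&2
  return 1
}
```

After killing the ResourceManager on node1, wait_for_active rm2 should return once node3 has taken over. The killed ResourceManager can then be brought back with yarn-daemon.sh start resourcemanager on node1, where it is expected to rejoin as standby.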

9.4.8 ZooKeeper

[root@node2 ~]# zkCli.sh 
Connecting to localhost:2181
2017-07-22 10:36:18,876 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
2017-07-22 10:36:18,881 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=node2
2017-07-22 10:36:18,881 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.8.0_112
2017-07-22 10:36:18,884 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation
2017-07-22 10:36:18,885 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/opt/jdk1.8.0_112/jre
2017-07-22 10:36:18,885 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/opt/zookeeper-3.4.10/bin/../build/classes:/opt/zookeeper-3.4.10/bin/../build/lib/*.jar:/opt/zookeeper-3.4.10/bin/../lib/slf4j-log4j12-1.6.1.jar:/opt/zookeeper-3.4.10/bin/../lib/slf4j-api-1.6.1.jar:/opt/zookeeper-3.4.10/bin/../lib/netty-3.10.5.Final.jar:/opt/zookeeper-3.4.10/bin/../lib/log4j-1.2.16.jar:/opt/zookeeper-3.4.10/bin/../lib/jline-0.9.94.jar:/opt/zookeeper-3.4.10/bin/../zookeeper-3.4.10.jar:/opt/zookeeper-3.4.10/bin/../src/java/lib/*.jar:/opt/zookeeper-3.4.10/bin/../conf:.::/opt/jdk1.8.0_112/lib
2017-07-22 10:36:18,885 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2017-07-22 10:36:18,885 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp
2017-07-22 10:36:18,885 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>
2017-07-22 10:36:18,885 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux
2017-07-22 10:36:18,885 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=amd64
2017-07-22 10:36:18,886 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=3.10.0-514.el7.x86_64
2017-07-22 10:36:18,886 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=root
2017-07-22 10:36:18,886 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/root
2017-07-22 10:36:18,886 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/root
2017-07-22 10:36:18,888 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@506c589e
Welcome to ZooKeeper!
2017-07-22 10:36:19,007 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1032] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
JLine support is enabled
2017-07-22 10:36:19,252 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@876] - Socket connection established to localhost/127.0.0.1:2181, initiating session
2017-07-22 10:36:19,282 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1299] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x25d6a8d40030005, negotiated timeout = 30000

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper, test, yarn-leader-election, hadoop-ha]
[zk: localhost:2181(CONNECTED) 1] ls /yarn-leader-election
[yarn1]
[zk: localhost:2181(CONNECTED) 2] ls /yarn-leader-election/yarn1
[ActiveBreadCrumb, ActiveStandbyElectorLock]
[zk: localhost:2181(CONNECTED) 3] 
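
The ActiveStandbyElectorLock znode is the lock held by the currently active RM, and ActiveBreadCrumb records which RM was last active; its data is a small binary record containing the cluster ID and the RM ID. As a rough sketch, the RM ID can be picked out of the output of a zkCli.sh get on that znode. The helper name active_rm_from_breadcrumb is hypothetical, and the plain substring match assumes that no RM ID is a prefix of another.

```shell
# Hypothetical helper: read znode data from stdin and print the first
# RM ID (given as arguments) that occurs in it.
active_rm_from_breadcrumb() {
  data=$(mktemp)
  cat > "$data"
  found=1
  for id in "$@"; do
    # -a treats the partly binary data as text, -F matches the ID literally.
    if grep -a -q -F "$id" "$data"; then
      echo "$id"
      found=0
      break
    fi
  done
  rm -f "$data"
  return "$found"
}
```

A possible invocation, assuming zkCli.sh is on the PATH as in the transcript above: zkCli.sh -server node1:2181 get /yarn-leader-election/yarn1/ActiveBreadCrumb | active_rm_from_breadcrumb rm1 rm2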
