Hadoop HA Setup

Environment: three Ubuntu 16.04.2 machines.

We'll go straight through the deployment steps. First up: deploying ZooKeeper. Here is zoo.cfg:

tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/apps/svr/install/apache-zookeeper-3.5.7-bin/data
dataLogDir=/apps/svr/install/apache-zookeeper-3.5.7-bin/log
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.0=192.168.1.7:2888:3888
server.1=192.168.1.8:2888:3888
server.2=192.168.1.9:2888:3888

Edit zoo.cfg as above, create the data and log directories, and on each node create a myid file in the data directory whose content is that node's server ID (0, 1, or 2, matching the server.X lines). Once everything is in place, start ZooKeeper on all three nodes.
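A minimal sketch of that step for the first node (192.168.1.7, server ID 0; write 1 and 2 into myid on the other two nodes), using the paths from zoo.cfg above:

mkdir -p /apps/svr/install/apache-zookeeper-3.5.7-bin/data
mkdir -p /apps/svr/install/apache-zookeeper-3.5.7-bin/log
echo 0 > /apps/svr/install/apache-zookeeper-3.5.7-bin/data/myid
bin/zkServer.sh start
bin/zkServer.sh status    # one node should report Mode: leader, the others Mode: follower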

Now for the Hadoop HA deployment; we'll configure everything in one pass. First, the configuration files.

###core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
<!-- Note: Hadoop 2.x defaults to port 9000; Hadoop 3.x defaults to port 9820 -->
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://mycluster</value>
        </property>
        <!-- Note: create this temporary directory yourself -->
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/opt/hadoop/ha</value>
        </property>
        <property>
                <name>ha.zookeeper.quorum</name>
                <value>192.168.1.7:2181,192.168.1.8:2181,192.168.1.9:2181</value>
        </property>
</configuration>
###hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
        <!-- Note: the replication factor defaults to 3 when not set -->
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
        <property>
            <!-- On a non-HA cluster this would put the SecondaryNameNode on node2 (port 50090 in Hadoop 2.x).
                 With HA enabled below, the standby NameNode takes over checkpointing, so this setting is effectively unused. -->
            <name>dfs.namenode.secondary.http-address</name>
            <value>ubuntu-node2:50090</value>
        </property>
        <!-- Disable HDFS permission checking -->
        <property>
            <name>dfs.permissions.enabled</name>
            <value>false</value>
        </property>
        <property>
            <name>dfs.nameservices</name>
            <value>mycluster</value>
        </property>
        <property>
            <name>dfs.ha.namenodes.mycluster</name>
            <value>node1,node2</value>
        </property>
        <property>
            <name>dfs.namenode.rpc-address.mycluster.node1</name>
            <value>ubuntu-node1:8020</value>
        </property>
        <property>
            <name>dfs.namenode.rpc-address.mycluster.node2</name>
            <value>ubuntu-node2:8020</value>
        </property>
        <property>
            <name>dfs.namenode.http-address.mycluster.node1</name>
            <value>ubuntu-node1:50070</value>
        </property>
        <property>
            <name>dfs.namenode.http-address.mycluster.node2</name>
            <value>ubuntu-node2:50070</value>
        </property>
        <property>
            <name>dfs.namenode.shared.edits.dir</name>
            <value>qjournal://ubuntu-node1:8485;ubuntu-node2:8485;ubuntu-node3:8485/mycluster</value>
        </property>
        <property>
            <name>dfs.client.failover.proxy.provider.mycluster</name>
            <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        </property>
        <property>
            <name>dfs.ha.fencing.methods</name>
            <value>sshfence</value>
        </property>
        <property>
            <name>dfs.ha.fencing.ssh.private-key-files</name>
            <value>/home/ubuntu/.ssh/id_rsa</value>
        </property>
        <property>
            <name>dfs.ha.automatic-failover.enabled</name>
            <value>true</value>
        </property>
        <property>
            <name>dfs.journalnode.edits.dir</name>
            <value>/tmp/hadoop/journalnode/data</value>
        </property>
</configuration>

Startup. First start the JournalNodes (the QJM quorum); they should run on an odd number of nodes, here all three:

sbin/hadoop-daemon.sh start journalnode
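Run the command above on each of ubuntu-node1, ubuntu-node2 and ubuntu-node3, the hosts listed in dfs.namenode.shared.edits.dir. A quick sanity check that each JournalNode is up:

jps | grep JournalNode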

First, on namenode1 (ubuntu-node1), format the NameNode:

bin/hdfs namenode -format

Then, still on namenode1, initialize the HA state in ZooKeeper:

bin/hdfs zkfc -formatZK

Start HDFS:

sbin/start-dfs.sh
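With automatic failover enabled, start-dfs.sh brings up the DataNodes and the ZKFC daemons as well. On namenode1, jps should now show roughly these daemons (namenode2's NameNode only comes up after the bootstrap step below):

jps    # expect NameNode, DataNode, JournalNode, DFSZKFailoverController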

On namenode2, bootstrap the standby, which copies over the metadata just formatted on namenode1:

bin/hdfs namenode -bootstrapStandby

Start the NameNode on namenode2:

sbin/hadoop-daemon.sh start namenode

That completes the Hadoop HA setup. To check a NameNode's state:

bin/hdfs haadmin -getServiceState <id>
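Here <id> is one of the NameNode IDs from dfs.ha.namenodes.mycluster, so with the configuration above:

bin/hdfs haadmin -getServiceState node1    # prints "active" or "standby"
bin/hdfs haadmin -getServiceState node2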

Next, the YARN HA setup. The yarn-site.xml:

<?xml version="1.0"?>
<configuration>
	<!-- Enable ResourceManager HA -->
	<property>
		<name>yarn.resourcemanager.ha.enabled</name>
		<value>true</value>
	</property>
	<property>
		<name>yarn.resourcemanager.cluster-id</name>
		<value>yarn-cluster</value>
	</property>
	<property>
		<name>yarn.resourcemanager.ha.rm-ids</name>
		<value>rm1,rm2</value>
	</property>
	<property>
		<name>yarn.resourcemanager.hostname.rm1</name>
		<value>ubuntu-node1</value>
	</property>
	<property>
		<name>yarn.resourcemanager.hostname.rm2</name>
		<value>ubuntu-node2</value>
	</property>
	<!-- ZooKeeper quorum -->
	<property>
		<name>yarn.resourcemanager.zk-address</name>
		<value>192.168.1.7:2181,192.168.1.8:2181,192.168.1.9:2181</value>
	</property>
	<!-- Auxiliary service run by the NodeManager; must be set to mapreduce_shuffle to run MapReduce jobs -->
	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
	</property>
	<!-- Enable automatic recovery -->
	<property>
		<name>yarn.resourcemanager.recovery.enabled</name>
		<value>true</value>
	</property>

	<!-- Store ResourceManager state in the ZooKeeper cluster -->
	<property>
		<name>yarn.resourcemanager.store.class</name>     
		<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
	</property>
	<!-- Log aggregation -->
	<property>
		<name>yarn.log-aggregation-enable</name>
		<value>true</value>
	</property>
	<!-- Job history server -->
	<property>
		<name>yarn.log.server.url</name>
		<value>http://ubuntu-node1:19888/jobhistory/logs/</value>
	</property>
	<!-- How long to keep aggregated logs on HDFS, in seconds (86400 = 1 day) -->
	<property>
		<name>yarn.log-aggregation.retain-seconds</name>
		<value>86400</value>
	</property>
</configuration>
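Note that yarn.log.server.url above assumes a MapReduce JobHistory server on ubuntu-node1. start-yarn.sh does not start that daemon; assuming the default mapred-site.xml history settings, start it there with:

sbin/mr-jobhistory-daemon.sh start historyserver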

Start YARN from one of the ResourceManager nodes:

sbin/start-yarn.sh

start-yarn.sh only starts the ResourceManager on the node it is run from, so start the standby ResourceManager on the other RM node by hand:

sbin/yarn-daemon.sh start resourcemanager

To check a ResourceManager's state:

bin/yarn rmadmin -getServiceState <id>
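With the rm1/rm2 IDs from yarn.resourcemanager.ha.rm-ids above:

bin/yarn rmadmin -getServiceState rm1    # prints "active" or "standby"
bin/yarn rmadmin -getServiceState rm2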

A manual failover command also exists (with dfs.ha.automatic-failover.enabled set to true as above, it refuses to run unless you pass --forcemanual):

bin/hdfs haadmin -transitionToActive [--forcemanual] <id>
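A quick way to exercise automatic failover, assuming node1 is currently active: kill its NameNode process and confirm that node2 takes over (replace <pid> with the pid reported by jps):

jps | grep NameNode    # on ubuntu-node1, note the NameNode pid
kill -9 <pid>
bin/hdfs haadmin -getServiceState node2    # should now print "active"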
