
Building a Highly Available Hadoop 3.1.3 Cluster with Ansible

Original article. Author: 小朋友呢. Last modified: 2020-08-10 17:38:50

I. Node information

  • Kernel version: 3.10.0-1062.el7.x86_64
  • OS version: Red Hat Enterprise Linux Server release 7.7 (Maipo)

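Node    IP             Memory  JDK  Hadoop  ZK   NN   DN   RM   NM   JN   ZKFC
hdp-01  192.186.10.11  1G      yes  yes     -    yes  -    -    -    -    yes
hdp-02  192.186.10.12  1G      yes  yes     -    yes  -    -    -    -    yes
hdp-03  192.186.10.13  1G      yes  yes     -    -    -    yes  -    -    -
hdp-04  192.186.10.14  1G      yes  yes     -    -    -    yes  -    -    -
hdp-05  192.186.10.15  4G      yes  yes     yes  -    yes  -    yes  yes  -
hdp-06  192.186.10.16  4G      yes  yes     yes  -    yes  -    yes  yes  -
hdp-07  192.186.10.17  4G      yes  yes     yes  -    yes  -    yes  yes  -

Role placement follows the inventory groups in section V; a ZKFC runs alongside each NameNode.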

II. Preparation

1. Login environment

Boot into the text console rather than the graphical desktop:

systemctl set-default multi-user.target &&\
systemctl isolate multi-user.target

2. Network interfaces

ens33 connects to the external network and is used to download software:

TYPE="Ethernet"
BOOTPROTO="dhcp"
NAME="ens33"
DEVICE="ens33"
ONBOOT="yes"

ens34 connects to the internal network and carries traffic between the cluster nodes (the values below are those of hdp-03):

BOOTPROTO=static
NAME=ens34
DEVICE=ens34
ONBOOT=yes
IPADDR=192.186.10.13
PREFIX=24

3. Firewall and SELinux

Disable the firewall and SELinux:

yum install -y iptables-services
iptables -F && \
service iptables save && \
systemctl stop firewalld && \
systemctl disable firewalld && \
setenforce 0 && \
sed -ri 's#(SELINUX=)(enforcing)#\1disabled#' /etc/selinux/config

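4. Passwordless SSH login

hdp-01 and hdp-02 form the HDFS HA pair, so each of them must be able to log in without a password both to itself and to the other.

The playbook already configures this:

  • hdp-01 -> hdp-01
  • hdp-01 -> hdp-02
  • hdp-01 -> all other hosts
  • hdp-02 -> hdp-02
  • hdp-02 -> hdp-01
  • hdp-02 -> all other hosts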
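
The login matrix above amounts to "every NameNode host reaches every cluster host, itself included". A minimal Python sketch of that enumeration follows; the helper name `ssh_key_pairs` is hypothetical and is not part of the playbook.

```python
# Hypothetical helper: list every (source, target) pair that needs a copied public key.
def ssh_key_pairs(namenodes, all_hosts):
    """Each NameNode host must reach every cluster host, itself included."""
    return [(src, dst) for src in namenodes for dst in all_hosts]

hosts = [f"hdp-0{i}" for i in range(1, 8)]            # hdp-01 .. hdp-07
pairs = ssh_key_pairs(["hdp-01", "hdp-02"], hosts)    # the two NameNode hosts
print(len(pairs))                                     # 2 sources x 7 targets
```

This mirrors what the ssh-pub-key-copy task in 01-ssh.yml does when it loops over groups['hdp'] on the hosts in the nn group.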

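5. Software installation

The playbook already installs the JDK, Hadoop and ZooKeeper and configures their environment variables.

6. Configuring hosts

The playbook also generates and distributes the hosts file.

III. Configuration file

  • When writing ansible.cfg, keep every section header shown below, otherwise Ansible reports errors when it runs.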
[defaults]
inventory    	= /root/ansible/inventory	
roles_path   	= /root/ansible/roles	
remote_user  	= root
ask_pass      = False
forks		= 10								
[inventory]
[privilege_escalation]
[paramiko_connection]
[ssh_connection]
[persistent_connection]
[accelerate]
[selinux]
[colors]
[diff]

IV. Directory layout

[root@hdp-01 ~]# tree ansible/
ansible/
├── hadoop_ha.yml # playbook that applies the role
├── inventory # host inventory
└── roles
    └── hadoop_ha
        ├── defaults
        ├── files
        ├── handlers
        ├── meta
        ├── README.md # help document
        ├── tasks
        │   ├── 01-ssh.yml # generate the hosts file, set hostnames, passwordless login from the NameNode hosts
        │   ├── 02-install-soft.yml # install the JDK, Hadoop and ZooKeeper, set environment variables
        │   ├── 03-config_zk.yml # configure the ZooKeeper ensemble
        │   ├── 04-copy_conf_file.yml # copy the configuration files to all hosts
        │   ├── 05-init_ha.yml # initialize the cluster
        │   ├── 06-start-cluster.yml # start the cluster
        │   └── main.yml # task entry point
        ├── templates
        │   ├── core-site.xml.j2 # core-site.xml template
        │   ├── hadoop-env.sh.j2 # hadoop-env.sh template
        │   ├── hdfs-site.xml.j2 # hdfs-site.xml template
        │   ├── mapred-site.xml.j2 # mapred-site.xml template
        │   ├── workers.j2 # workers template
        │   └── yarn-site.xml.j2 # yarn-site.xml template
        ├── tests
        └── vars
            ├── core.yml # core-site.xml variables
            ├── hdfs.yml # hdfs-site.xml variables
            ├── soft.yml # software and network variables
            └── yarn.yml # yarn-site.xml variables

V. Inventory

[hdp]
hdp-0[1:7] ansible_user=root ansible_ssh_pass="123456"

[nn]
hdp-0[1:2]

[rm]
hdp-0[3:4]

[zk]
hdp-0[5:7]

[jn]
hdp-0[5:7]

[dn]
hdp-0[5:7]

[nm]
hdp-0[5:7]

[nn1]
hdp-01

[nn2]
hdp-02

VI. Role

tasks

main.yml
- name: include vars
  include_vars:
    dir: vars/
    depth: 1
  tags: "always"

- name: config ssh yml
  import_tasks: "01-ssh.yml"
  tags: "config-ssh"

- name: install soft yml
  import_tasks: "02-install-soft.yml"
  tags: "install-soft"

- name: config zk
  import_tasks: "03-config_zk.yml"
  tags: "config-zk"

- name: copy config file
  import_tasks: "04-copy_conf_file.yml"
  tags: "copy-con-file"

- name: init ha
  import_tasks: "05-init_ha.yml"
  tags: "ini-ha"

- name: start cluster
  import_tasks: "06-start-cluster.yml"
  tags: "start-cluster"

01-ssh.yml
# 1. Run the script that generates the host entries
- name: 1. make hosts
  script: hosts.sh
  register: r
  when: ansible_hostname in groups['nn1']

# 2. Write the generated entries into /etc/hosts
- name: 2. out vars
  lineinfile:
    path: /etc/hosts
    line: "{{ hostname }}"
    regexp: '^{{ hostname }}'
    owner: root
    group: root
    mode: '0644'
  with_items: "{{ r.stdout_lines }}"
  loop_control:
    loop_var: hostname  
  when: ansible_hostname in groups['nn1']

# 3. Generate a key pair on the NameNode hosts
- name: gen-pub-key
  shell: echo 'y' |ssh-keygen -t rsa -P "" -f /root/.ssh/id_rsa
  when: ansible_hostname in groups['nn']

# 4. Copy the hosts file from hdp-01 to every host
#    (no `when` condition, so the task runs on all hosts, not only nn1)
- name: copy-hosts
  copy:
    src: /etc/hosts
    dest: /etc/hosts
    mode: '0644'
    force: yes

# 5. Set the hostname on every host by looking up its own IP in /etc/hosts
- name: set-hostname
  shell: hostnamectl set-hostname $(cat /etc/hosts|grep  `ifconfig |grep "inet "|awk '{print $2}'|grep "{{ network }}"`|cut -d " "  -f2)

# 6. Copy the public key from each NameNode host to all hosts
- name: ssh-pub-key-copy
  shell: sshpass -p "{{ ansible_ssh_pass }}" ssh-copy-id -i ~/.ssh/id_rsa.pub "{{ ansible_user }}"@"{{ host }}" -o StrictHostKeyChecking=no
  with_items: "{{ groups['hdp'] }}"
  loop_control:
    loop_var: host
  when: ansible_hostname in groups['nn']


# 7. Flush the iptables rules and disable SELinux on all hosts
- name: clean
  shell: 'source /etc/profile ; iptables -F ; setenforce 0 ; sed -ri "s#(SELINUX=)(enforcing)#\1disabled#" /etc/selinux/config'
  ignore_errors: true
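
The set-hostname task finds the host's own address on the cluster network and then takes the second field of the matching /etc/hosts line. A minimal Python sketch of that lookup is shown below; `hostname_for` and the sample text are illustrative assumptions. It compares the whole first field, which avoids the substring matches a plain grep could produce.

```python
def hostname_for(ip, hosts_text):
    """Return the first name mapped to `ip` in /etc/hosts-style text, or None."""
    for line in hosts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == ip:
            return fields[1]
    return None

# Illustrative hosts content in the "IP name" layout the shell pipeline expects.
sample = """127.0.0.1 localhost
192.186.10.11 hdp-01
192.186.10.12 hdp-02
"""
print(hostname_for("192.186.10.12", sample))
```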

02-install-soft.yml
# 1. Create the software installation directory
- name: create apps directory
  file:
    path: "{{ soft_install_path }}"
    state: directory
    mode: '0755'

# 2. Install the JDK and Hadoop on all hosts
- name: install-jdk-hadoop
  unarchive:
    src: "{{ soft }}"
    dest: "{{ soft_install_path }}"
  with_items:
  - [ "{{ hadoop_soft }}", "{{ jdk_soft }}" ]
  loop_control:
    loop_var: soft
  tags: install-ha-jdk

  
# 3. Remove any existing JDK, Hadoop and ZooKeeper environment variables
- name: clean jdk,hadoop env
  shell: sed -ri '/HADOOP_HOME/d;/JAVA_HOME/d;/ZOOKEEPER_HOME/d'  "{{ env_file }}"
  tags: set-env

# 4. Set the JDK and Hadoop environment variables for the user
- name: set jdk hadoop env
  lineinfile:
    dest: "{{ env_file }}" 
    line: "{{ soft_env.env }}"
    regexp: "{{ soft_env.reg }}"
    state: present
  with_items:
  - { env: 'export JAVA_HOME={{ jdk_home }}' ,reg: '^export JAVA_HOME=' }
  - { env: 'export HADOOP_HOME={{ hdp_home }}' ,reg: '^export HADOOP_HOME' }
  loop_control:
    loop_var: soft_env
  tags: set-env
 
# 5. Install ZooKeeper on the zk host group
- name: install zookeeper
  unarchive:
    src: "{{ zookeeper_soft }}"
    dest: "{{ soft_install_path }}"
  when: ansible_hostname in groups['zk']
  tags: install-zookeeper
   
# 6. Set the ZooKeeper environment variable for the user
- name: set zookeeper env
  lineinfile:
    dest: "{{ env_file }}"
    line: "{{ zk_env.env }}"
    regexp: "{{ zk_env.reg }}"
    state: present
  with_items:
  - { env: 'export ZOOKEEPER_HOME={{ zk_home }}' ,reg: '^export ZOOKEEPER_HOME=' }  
  loop_control:
    loop_var: zk_env  
  when: ansible_hostname in groups['zk']
  tags: set-env
   
# 7. Add the JDK and Hadoop directories to PATH on all hosts
- name: export jdk hadoop env
  lineinfile:
    dest: "{{ env_file }}"
    line: 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$JAVA_HOME/bin'
    regexp: "^export PATH"
    state: present
  tags: set-env

# 8. Append the ZooKeeper bin directory to PATH on the zk hosts
- name: export zookeeper env
  replace:
    path: "{{ env_file }}"
    regexp: "^(export PATH=)(.+)$"
    replace: '\1\2:$ZOOKEEPER_HOME/bin'
  when: ansible_hostname in groups['zk']
  tags: set-env

03-config_zk.yml
# 1. Create zoo.cfg from the sample configuration
- name: copy config file
  copy:
    src: "{{ zk_home }}/conf/zoo_sample.cfg"
    dest: "{{ zk_home }}/conf/zoo.cfg"
    remote_src: yes
  when: ansible_hostname in groups['zk']

# 2. Create the ZooKeeper runtime data directory
- name: create zk data directory
  file:
    path: "{{ zk_data_dir  }}"
    state: directory
    mode: '0755'
  when: ansible_hostname in groups['zk']

# 3. Point dataDir in zoo.cfg at that directory
- name: set zookeeper dataDir
  lineinfile:
    dest: "{{ zk_home }}/conf/zoo.cfg"
    line: "dataDir={{ zk_data_dir }}"
    regexp: "^dataDir="
    state: present
  when: ansible_hostname in groups['zk']

# 4. Add the cluster members (the regexp matches an existing server.N entry)
- name: set cluster info
  lineinfile:
    dest: "{{ zk_home }}/conf/zoo.cfg"
    line: "server.{{ item.0 + 1 }}={{ item.1 }}:2888:3888"
    regexp: '^server\.{{ item.0 + 1 }}='
  with_indexed_items: "{{ groups['zk'] }}"
  when: ansible_hostname in groups['zk']

# 5. Derive each host's myid from its server.N line in zoo.cfg
- name: make server id
  shell: 'cat {{ zk_home }}/conf/zoo.cfg |grep {{ ansible_hostname }}|cut -d "." -f2|head -c1 > {{ zk_data_dir }}/myid'
  when: ansible_hostname in groups['zk']
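
Tasks 4 and 5 tie the two ZooKeeper files together: the N in each server.N line must equal the number written to that host's myid file. A small Python sketch of the relationship follows, using the zk group from the inventory; the function names are hypothetical.

```python
def zk_server_lines(zk_hosts):
    """Build the server.N entries for zoo.cfg; N starts at 1."""
    return [f"server.{i}={host}:2888:3888"
            for i, host in enumerate(zk_hosts, start=1)]

def myid_for(host, zk_hosts):
    """The myid of a host is its 1-based position in the zk group."""
    return zk_hosts.index(host) + 1

zk_hosts = ["hdp-05", "hdp-06", "hdp-07"]
for line in zk_server_lines(zk_hosts):
    print(line)
print(myid_for("hdp-07", zk_hosts))
```

The shell version in task 5 keeps only one character of the ID (head -c1), so it fits ensembles of up to nine members.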

04-copy_conf_file.yml
# 1. Capture the output of `hadoop classpath` for use in the templates
- name: hadoopath
  shell: 'source {{ env_file }} ; hadoop classpath'
  register: r

# 2. Render the configuration templates on all hosts
- name: template
  template:
    src: "{{ item }}"
    dest: "{{ hdp_conf }}/{{ item |  replace('.j2','') }}"
    mode: '0644'        
  vars:
    hdp_classpath: "{{ r.stdout }}"
  with_items: ["core-site.xml.j2","hdfs-site.xml.j2","mapred-site.xml.j2","yarn-site.xml.j2","hadoop-env.sh.j2","workers.j2"]

05-init_ha.yml
# 1. On the zk hosts, first delete everything under the Hadoop data directory
- name: delete hdp data
  shell: "rm -rf {{ hdp_data }}/*"
  when: ansible_hostname in groups['zk']

# 2. Start the ZooKeeper server
- name: start zookeeper
  shell: 'source {{ env_file }} && nohup zkServer.sh  restart'
  when: ansible_hostname in groups['zk']

# 3. Start the JournalNodes
- name: start journalnode
  shell: 'source {{ env_file }}  ; nohup hdfs --daemon stop journalnode ; nohup hdfs --daemon start journalnode'
  when: ansible_hostname in groups['jn']


# 4. On the NameNode hosts, delete everything under the Hadoop data directory
- name: delete hdp data
  shell: "rm -rf {{ hdp_data }}/*"
  when: ansible_hostname in groups['nn']

# 5. Formatting requires reachable JournalNodes whose directories are empty
- name: format namenode
  shell: 'source {{ env_file }} && nohup echo y | hdfs namenode -format'    
  when: ansible_hostname in groups['nn1']
    
# 6. Start the NameNode on nn1
- name: start namenode
  shell: 'source {{ env_file }} ; nohup hdfs --daemon stop namenode ; nohup hdfs --daemon start namenode'
  when: ansible_hostname in groups['nn1']

# 7. The NameNode on nn1 must be running before nn2 copies its metadata
- name: copy meta data
  shell: 'source {{ env_file }} && nohup hdfs namenode -bootstrapStandby'
  when: ansible_hostname in groups['nn2']

# 8. Format the ZKFC state in ZooKeeper from nn1
- name: format zkfc
  shell: 'source {{ env_file }} && nohup echo y |hdfs zkfc -formatZK'
  when: ansible_hostname in groups['nn1']
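
The order of the initialization steps matters. The sketch below records the sequence from 05-init_ha.yml and checks it against the ordering constraints stated in the comments, plus the assumption that ZooKeeper must be up before the ZKFC state is formatted. Step names and the helper are illustrative only.

```python
# (step, host group) in the order the role runs them.
INIT_STEPS = [
    ("start zookeeper", "zk"),
    ("start journalnode", "jn"),
    ("format namenode", "nn1"),
    ("start namenode", "nn1"),
    ("bootstrap standby", "nn2"),
    ("format zkfc", "nn1"),
]

# Each pair (a, b) means step a must come before step b.
MUST_PRECEDE = [
    ("start journalnode", "format namenode"),   # format writes to the JournalNodes
    ("start namenode", "bootstrap standby"),    # nn2 copies metadata from a running nn1
    ("start zookeeper", "format zkfc"),         # assumption: formatZK needs a live ensemble
]

def order_is_valid(steps, constraints):
    """True when every constraint (a, b) has a positioned before b."""
    position = {name: i for i, (name, _group) in enumerate(steps)}
    return all(position[a] < position[b] for a, b in constraints)

print(order_is_valid(INIT_STEPS, MUST_PRECEDE))
```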

06-start-cluster.yml
# Restart ZooKeeper
- name: start zookeeper
  shell: "source {{ env_file }} ; zkServer.sh restart"
  when: ansible_hostname in groups['zk']

# Start HDFS
- name: start dfs
  shell: "source {{ env_file }} ;nohup stop-dfs.sh ; nohup start-dfs.sh"
  when: ansible_hostname in groups['nn1']

# Start YARN
- name: start yarn
  shell: "source {{ env_file }} ; nohup stop-yarn.sh ; nohup start-yarn.sh"
  when: ansible_hostname in groups['nn1']

vars

00-soft.yml
# Network prefix of the cluster hosts
network: "192.186.10."

# Software installation path
soft_install_path: "/root/apps"

# Hadoop archive
hadoop_soft: "/root/soft/hadoop-3.1.3.tar.gz"

# Hadoop home directory
hdp_home: "{{ soft_install_path }}/hadoop-3.1.3"

# Hadoop configuration directory
hdp_conf: "{{ hdp_home }}/etc/hadoop"

# Hadoop data directory
hdp_data: "/root/hdpdata"

# User that runs Hadoop
hdp_user: "root"

# JDK archive
jdk_soft: "/root/soft/jdk1.8.0.tar.gz"

# JDK home directory
jdk_home: "{{ soft_install_path }}/jdk1.8.0"

# ZooKeeper archive
zookeeper_soft: "/root/soft/apache-zookeeper-3.5.8-bin.tar.gz"

# ZooKeeper installation directory
zk_home: "{{ soft_install_path }}/apache-zookeeper-3.5.8-bin"

# ZooKeeper runtime data directory
zk_data_dir: "/root/zkdata"

# File that holds the environment variables
env_file: "/root/.bashrc"

01-core.yml
# Name of the HDFS cluster (nameservice)
dfs_cluster_name: "mycluster"

# Hadoop temporary directory
tmp_dir: "/root/hdpdata/tmp"

# ZooKeeper ensemble address
zk_cluster: "hdp-05:2181,hdp-06:2181,hdp-07:2181"

03-hdfs.yml
# NameNode metadata directory
name_dir: "/root/hdpdata/name"

# DataNode data directory
data_dir: "/root/hdpdata/data"

# Logical names of the NameNodes
nn_names: ["nn1","nn2"]

# RPC addresses of the NameNodes
nn_rpc_address: ["hdp-01:9000","hdp-02:9000"]

# HTTP addresses of the NameNodes
nn_http_address: ["hdp-01:9870","hdp-02:9870"]

# Where the NameNodes' shared edits are stored
edits_dir: "qjournal://hdp-05:8485;hdp-06:8485;hdp-07:8485/{{ dfs_cluster_name }}"

# Where each JournalNode stores its data
jn_data_dir: "/root/hdpdata/journaldata"

# Location of the SSH private key
pri_key: /root/.ssh/id_rsa

# Connection timeout of the sshfence fencing method (milliseconds)
ssh_fen_con_timeout: 3000
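
The edits_dir value has a fixed shape: the JournalNode addresses joined with semicolons, followed by the nameservice as the journal ID. A short Python sketch of how it is assembled is shown below; `qjournal_uri` is a hypothetical helper, and the port default is taken from the value above.

```python
def qjournal_uri(jn_hosts, cluster_name, port=8485):
    """Join JournalNode addresses with ';' and append the nameservice as journal ID."""
    nodes = ";".join(f"{host}:{port}" for host in jn_hosts)
    return f"qjournal://{nodes}/{cluster_name}"

print(qjournal_uri(["hdp-05", "hdp-06", "hdp-07"], "mycluster"))
```

Deriving the value from the jn group this way would keep it in step with the inventory if the JournalNode hosts ever change.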

04-yarn.yml
# YARN cluster ID
yarn_cluster_id: yrc

# Logical names of the ResourceManagers
rm_names: ["rm1","rm2"]

# Hostnames of the ResourceManagers
rm_hostnames: ["hdp-03","hdp-04"]

# Web addresses of the ResourceManagers
rm_webapp_address: ["hdp-03:8088","hdp-04:8088"]

# Environment variable whitelist
env_whitelist: ["JAVA_HOME","HADOOP_HOME"]

templates

1. hadoop-env.sh.j2
export HADOOP_OS_TYPE=${HADOOP_OS_TYPE:-$(uname -s)}
export HADOOP_HOME={{ hdp_home }}
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_LIBEXEC_DIR=$HADOOP_HOME/libexec
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export JAVA_LIBRARY_PATH=$HADOOP_COMMON_LIB_NATIVE_DIR:$JAVA_LIBRARY_PATH
export HDFS_NAMENODE_USER={{ hdp_user }}
export HDFS_DATANODE_USER={{ hdp_user }}
export YARN_NODEMANAGER_USER={{ hdp_user }}
export YARN_RESOURCEMANAGER_USER={{ hdp_user }}
export HDFS_JOURNALNODE_USER={{ hdp_user }}
export HDFS_ZKFC_USER={{ hdp_user }}
export JAVA_HOME={{ jdk_home }}

2. core-site.xml.j2
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <!-- Default file system: the HA nameservice -->
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://{{ dfs_cluster_name }}/</value>
    </property>

    <!-- Hadoop temporary directory -->
    <property>
      <name>hadoop.tmp.dir</name>
      <value>{{ tmp_dir }}</value>
    </property>

    <!-- ZooKeeper quorum address -->
    <property>
      <name>ha.zookeeper.quorum</name>
      <value>{{ zk_cluster }}</value>
    </property>     
</configuration>

3. hdfs-site.xml.j2
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <!-- HDFS nameservice; must match the one used in core-site.xml -->
    <property>
        <name>dfs.nameservices</name>
        <value>{{ dfs_cluster_name }}</value>
    </property>

    <!-- Logical names of the NameNodes -->
    <property>
        <name>dfs.ha.namenodes.{{ dfs_cluster_name }}</name>
        <value>{{ nn_names | join(',') }}</value>
    </property>


	{% for nn in nn_names %}
    <!-- RPC address of {{ nn }} -->
    <property>
        <name>dfs.namenode.rpc-address.{{ dfs_cluster_name }}.{{ nn }}</name>
        <value>{{ nn_rpc_address[loop.index0] }}</value>
    </property>

	{% endfor %}


	{% for nn in nn_names %}
    <!-- HTTP address of {{ nn }} -->
    <property>
        <name>dfs.namenode.http-address.{{ dfs_cluster_name }}.{{ nn }}</name>
        <value>{{ nn_http_address[loop.index0] }}</value>
    </property>

	{% endfor %}

    <!-- NameNode metadata directory -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>{{ name_dir }}</value>
    </property>

    <!-- DataNode data directory -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>{{ data_dir }}</value>
    </property>

    <!-- Where the NameNodes' shared edits are stored on the JournalNodes -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>{{ edits_dir }}</value>
    </property>

    <!-- Local directory where each JournalNode stores its data -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>{{ jn_data_dir }}</value>
    </property>

    <!-- Enable automatic NameNode failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>

    <!-- Client-side failover proxy provider -->
    <property>
        <name>dfs.client.failover.proxy.provider.{{ dfs_cluster_name }}</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

    <!-- Fencing methods; separate multiple methods with newlines, one per line -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
		  sshfence
  		  shell(/bin/true)
		 </value>
    </property>

    <!-- sshfence needs passwordless SSH; this is the private key it uses -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>{{ pri_key }}</value>
    </property>

    <!-- Connection timeout of the sshfence method, in milliseconds -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>{{ ssh_fen_con_timeout }}</value>
    </property>
</configuration>
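
The loops in this template pair each logical NameNode name with the address at the same index in nn_rpc_address and nn_http_address. The Python sketch below produces the equivalent property names and values so the mapping can be checked without rendering the template. The function is hypothetical, and it emits the RPC and HTTP entries interleaved rather than in two separate passes.

```python
def namenode_properties(cluster, names, rpc, http):
    """Return (name, value) pairs equivalent to the loops in hdfs-site.xml.j2."""
    props = [(f"dfs.ha.namenodes.{cluster}", ",".join(names))]
    for nn, rpc_addr, http_addr in zip(names, rpc, http):
        props.append((f"dfs.namenode.rpc-address.{cluster}.{nn}", rpc_addr))
        props.append((f"dfs.namenode.http-address.{cluster}.{nn}", http_addr))
    return props

props = namenode_properties(
    "mycluster",
    ["nn1", "nn2"],
    ["hdp-01:9000", "hdp-02:9000"],
    ["hdp-01:9870", "hdp-02:9870"],
)
for name, value in props:
    print(f"{name} = {value}")
```

Because the pairing is purely positional, the three lists in the vars file must stay the same length and in the same order.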

4. yarn-site.xml.j2
<?xml version="1.0"?>
<configuration>
        <!-- Enable ResourceManager HA -->
        <property>
            <name>yarn.resourcemanager.ha.enabled</name>
            <value>true</value>
        </property>

        <!-- Cluster ID of the ResourceManagers -->
        <property>
            <name>yarn.resourcemanager.cluster-id</name>
            <value>{{ yarn_cluster_id }}</value>
        </property>

        <!-- Logical IDs of the ResourceManagers -->
        <property>
            <name>yarn.resourcemanager.ha.rm-ids</name>
            <value>{{ rm_names | join(',') }}</value>
        </property>
			
		{%- for rm in rm_names -%}
        <!-- Hostname of {{ rm }} -->
        <property>
            <name>yarn.resourcemanager.hostname.{{ rm }}</name>
            <value>{{ rm_hostnames[loop.index0] }}</value>
        </property>
		{%- endfor -%}

        <!-- Essential: set the web addresses explicitly even though defaults exist -->

	    {%- for rm in rm_names -%}

        <!-- Web application address of {{ rm }} -->
        
        <property>
            <name>yarn.resourcemanager.webapp.address.{{ rm }}</name>
            <value>{{ rm_webapp_address[loop.index0] }}</value>
        </property>
		{%- endfor -%}


        <!-- ZooKeeper ensemble address -->
        <property>
            <name>yarn.resourcemanager.zk-address</name>
            <value>{{ zk_cluster }}</value>
        </property>

        <!-- Enable ResourceManager recovery -->
        <property>
            <name>yarn.resourcemanager.recovery.enabled</name>
            <value>true</value>
        </property>

        
        <!-- Enable automatic failover -->
        <property>
            <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
            <value>true</value>
        </property>

    
        <!-- Store the ResourceManager state in the ZooKeeper ensemble -->
        <property>
            <name>yarn.resourcemanager.store.class</name>
            <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
        </property>

        <!-- Auxiliary service on the NodeManagers; must be mapreduce_shuffle for MapReduce jobs to run -->
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>

        <!-- NodeManager environment variable whitelist, rendered as a comma-separated list -->
        <property>
            <name>yarn.nodemanager.env-whitelist</name>
            <value>{{ env_whitelist | join(',') }}</value>
        </property>

        <!-- Classpath for YARN applications -->
        <property>
            <name>yarn.application.classpath</name>
            <value>{{ hdp_classpath }}</value>
        </property>

        <!-- Let the NodeManager detect memory and CPU automatically -->
        <property>
            <name>yarn.nodemanager.resource.detect-hardware-capabilities</name>
            <value>true</value>
        </property>

</configuration>

5. mapred-site.xml.j2
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
    </property>
</configuration>

6. workers.j2
{% for host in groups['dn'] %}
{{ host }}
{% endfor %}

VII. Usage

1. Run everything

  • View the playbook that applies the hadoop_ha role
[root@hdp-01 ansible]# cat hadoop_ha.yml
- hosts: all
  roles:
  - { role: hadoop_ha }
  • Run every step from the beginning; suited to a freshly initialized environment
[root@hdp-01 ansible]# ansible-playbook hadoop_ha.yml

2. Run selected steps

  • List all tags defined in the role's tasks
[root@hdp-01 ~]# ansible-playbook --list-tags hadoop_ha.yml
[always, config-ssh, config-zk, copy-con-file, ini-ha, install-ha-jdk, install-soft, install-zookeeper, set-env, start-cluster]
  • Pass a tag to run only the matching tasks when a single function is needed
ansible-playbook -t config-ssh hadoop_ha.yml

VIII. Testing the cluster

1. Check the processes on every node

[root@hdp-01 ~]# ansible -m shell -a 'jps' hdp
hdp-02 | CHANGED | rc=0 >>
13909 Jps
11597 NameNode
11663 DFSZKFailoverController

hdp-04 | CHANGED | rc=0 >>
11219 Jps
9802 ResourceManager

hdp-03 | CHANGED | rc=0 >>
9827 ResourceManager
11436 Jps

hdp-01 | CHANGED | rc=0 >>
2882 Jps
1829 NameNode
1957 DFSZKFailoverController

hdp-05 | CHANGED | rc=0 >>
12560 Jps
10281 JournalNode
10026 QuorumPeerMain
10219 DataNode
10475 NodeManager

hdp-06 | CHANGED | rc=0 >>
10197 JournalNode
9942 QuorumPeerMain
10135 DataNode
12430 Jps
10399 NodeManager

hdp-07 | CHANGED | rc=0 >>
10112 DataNode
12518 Jps
9927 QuorumPeerMain
10375 NodeManager

2. Testing MapReduce

1). Check the state of the YARN ResourceManagers

[root@hdp-02 ~]# yarn rmadmin -getAllServiceState
hdp-03:8033                                        active    
hdp-04:8033                                        standby

2). Change to the examples directory

[root@hdp-01 ~]# cd /root/apps/hadoop-3.1.3/share/hadoop/mapreduce

3). Run the pi MapReduce example

[root@hdp-01 mapreduce]# hadoop jar hadoop-mapreduce-examples-3.1.3.jar pi 3 5

4). Result

Estimated value of Pi is 3.73333333333333333333
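
The estimate is far from π because `pi 3 5` uses only 3 maps × 5 samples = 15 points. The sketch below works through the arithmetic, assuming the example reports 4 × inside / total, where inside is the number of sample points that fall inside the circle.

```python
from fractions import Fraction

maps, samples_per_map = 3, 5            # the two arguments passed to `pi`
total = maps * samples_per_map          # 15 sample points in all

reported = 3.73333333333333333333       # value printed by the job
inside = round(reported * total / 4)    # points that fell inside the circle
estimate = 4 * Fraction(inside, total)  # exact form of the printed value

print(inside, total, estimate)
```

With 15 points the result can only move in steps of 4/15, so a larger sample count is needed for a closer estimate.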

3. Testing HDFS high availability

1). Upload a file to HDFS

[root@hdp-01 ~]# hadoop fs -put /var/log/messages /

2). Find the host whose NameNode is active and kill that NameNode

[root@hdp-01 ~]# hdfs haadmin -getAllServiceState
hdp-01:9000                                        standby   
hdp-02:9000                                        active 

[root@hdp-02 ~]# jps
14020 Jps
11597 NameNode
11663 DFSZKFailoverController

[root@hdp-02 ~]# kill -9 11597

3). Check the state of nn1, the NameNode on hdp-01

[root@hdp-01 ~]# hdfs haadmin -getServiceState nn1
active

4). List the file in HDFS again; it is still accessible, so the failover succeeded

[root@hdp-01 ~]# hadoop fs -ls /messages
-rw-r--r--   3 root supergroup     684483 2020-08-10 14:48 /messages

5). Restart the NameNode that was killed and check the cluster state; hdp-02 is now standby

[root@hdp-02 ~]# hdfs --daemon start namenode
[root@hdp-02 ~]# hdfs haadmin -getAllServiceState
hdp-01:9000                                        active    
hdp-02:9000                                        standby 
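
The failover check above can also be scripted by parsing the two-column output of `hdfs haadmin -getAllServiceState`. A minimal Python sketch follows; the helper names are hypothetical, and the sample strings are copied from the outputs shown in this section.

```python
def parse_service_states(output):
    """Map each 'host:port' to its state from -getAllServiceState output."""
    states = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 2:
            states[fields[0]] = fields[1]
    return states

def active_node(states):
    """Return the single active address, or None if there is not exactly one."""
    active = [addr for addr, state in states.items() if state == "active"]
    return active[0] if len(active) == 1 else None

before = "hdp-01:9000    standby\nhdp-02:9000    active\n"
after = "hdp-01:9000    active\nhdp-02:9000    standby\n"
print(active_node(parse_service_states(before)))
print(active_node(parse_service_states(after)))
```

A healthy HA pair should always yield exactly one active address, which is why the helper returns None in every other case.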

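Original-work statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.

For infringement concerns, contact cloudcommunity@tencent.com to request removal.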
