
Dynamically Adding and Removing Hadoop DataNodes, and Restoring a Removed Node

Author: 星哥玩云
Published 2022-07-12 13:58:54

1. Configure the system environment

Hostname, SSH mutual trust, environment variables, and so on.

JDK installation is omitted here; make sure the JDK path on the datanode matches the JAVA_HOME set in etc/hadoop/hadoop-env.sh. The Hadoop version used is 2.7.5.

Edit /etc/sysconfig/network.

Then run "hostname <hostname>". Log out of the system and log back in, and the new hostname takes effect.

[root@localhost ~]# hostname
localhost.localdomain
[root@localhost ~]# hostname -i
::1 127.0.0.1
[root@localhost ~]# cat /etc/sysconfig/network
# Created by anaconda
NETWORKING=yes
HOSTNAME=slave2
GATEWAY=192.168.48.2
# Oracle-rdbms-server-11gR2-preinstall : Add NOZEROCONF=yes
NOZEROCONF=yes
[root@localhost ~]# hostname slave2
[root@localhost ~]# hostname
slave2
[root@localhost ~]# su - hadoop
Last login: Sat Feb 24 14:25:48 CST 2018 on pts/1
[hadoop@slave2 ~]$ su - root
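The hostname edit above can also be scripted. A minimal sketch, run here against a temporary copy rather than the real /etc/sysconfig/network (on an actual node you would edit that file as root and run `hostname slave2`, as the transcript shows):

```shell
# Work on a temporary stand-in for /etc/sysconfig/network.
net=$(mktemp)
printf 'NETWORKING=yes\nHOSTNAME=localhost.localdomain\nGATEWAY=192.168.48.2\n' > "$net"
# Rewrite the HOSTNAME= line in place.
sed -i 's/^HOSTNAME=.*/HOSTNAME=slave2/' "$net"
grep '^HOSTNAME=' "$net"   # prints HOSTNAME=slave2
```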

Create the datanode data directory and change its owner

(For the exact path values, refer to dfs.name.dir, dfs.data.dir, dfs.tmp.dir, etc. in /usr/hadoop/hadoop-2.7.5/etc/hadoop/hdfs-site.xml and core-site.xml on the namenode.)

su - root

# mkdir -p /usr/local/hadoop-2.7.5/tmp/dfs/data

# chmod -R 777 /usr/local/hadoop-2.7.5/tmp

# chown -R hadoop:hadoop /usr/local/hadoop-2.7.5

[root@slave2 ~]# mkdir -p /usr/local/hadoop-2.7.5/tmp/dfs/data
[root@slave2 ~]# chmod -R 777 /usr/local/hadoop-2.7.5/tmp
[root@slave2 ~]# chown -R hadoop:hadoop /usr/local/hadoop-2.7.5
[root@slave2 ~]# pwd
/root
[root@slave2 ~]# cd /usr/local/
[root@slave2 local]# ll
total 0
drwxr-xr-x. 2 root   root   46 Mar 21  2017 bin
drwxr-xr-x. 2 root   root    6 Jun 10  2014 etc
drwxr-xr-x. 2 root   root    6 Jun 10  2014 games
drwxr-xr-x  3 hadoop hadoop 16 Feb 24 18:18 hadoop-2.7.5
drwxr-xr-x. 2 root   root    6 Jun 10  2014 include
drwxr-xr-x. 2 root   root    6 Jun 10  2014 lib
drwxr-xr-x. 2 root   root    6 Jun 10  2014 lib64
drwxr-xr-x. 2 root   root    6 Jun 10  2014 libexec
drwxr-xr-x. 2 root   root    6 Jun 10  2014 sbin
drwxr-xr-x. 5 root   root   46 Dec 17  2015 share
drwxr-xr-x. 2 root   root    6 Jun 10  2014 src
[root@slave2 local]#

Set up SSH mutual trust, i.e. passwordless login from master to slave2.

On master:

[root@hadoop-master ~]# cat /etc/hosts

127.0.0.1  localhost localhost.localdomain localhost4 localhost4.localdomain4

::1        localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.48.129    hadoop-master

192.168.48.132    slave1

192.168.48.131    slave2

[hadoop@hadoop-master ~]$ scp /usr/hadoop/.ssh/authorized_keys hadoop@slave2:/usr/hadoop/.ssh

The authenticity of host 'slave2 (192.168.48.131)' can't be established.

ECDSA key fingerprint is 1e:cd:d1:3d:b0:5b:62:45:a3:63:df:c7:7a:0f:b8:7c.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'slave2,192.168.48.131' (ECDSA) to the list of known hosts.

hadoop@slave2's password:

authorized_keys       

[hadoop@hadoop-master ~]$ ssh hadoop@slave2

Last login: Sat Feb 24 18:27:33 2018

[hadoop@slave2 ~]$

[hadoop@slave2 ~]$ exit

logout

Connection to slave2 closed.

[hadoop@hadoop-master ~]$
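If the master's key pair does not exist yet, generate it before copying authorized_keys. A hedged sketch, using a temporary directory as a stand-in for the hadoop user's ~/.ssh:

```shell
# A temporary directory stands in for ~/.ssh of the hadoop user on master.
keydir=$(mktemp -d)
# Generate a passphrase-less RSA key pair.
ssh-keygen -t rsa -N "" -f "$keydir/id_rsa" -q
# Append the public key to authorized_keys and set the permissions sshd requires.
cat "$keydir/id_rsa.pub" >> "$keydir/authorized_keys"
chmod 700 "$keydir"
chmod 600 "$keydir/authorized_keys"
# On the real cluster, authorized_keys is then copied to slave2 with scp,
# as shown in the transcript above.
```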

2. Add the new node to the slaves file on the namenode

[hadoop@hadoop-master hadoop]$ pwd

/usr/hadoop/hadoop-2.7.5/etc/hadoop

[hadoop@hadoop-master hadoop]$ vi slaves

slave1

slave2

3. On the namenode, copy hadoop-2.7.5 to the new node, then delete the contents of the data and logs directories on the new node

On master:

[hadoop@hadoop-master ~]$ scp -r hadoop-2.7.5 hadoop@slave2:/usr/hadoop

On slave2:

[hadoop@slave2 hadoop-2.7.5]$ ll

total 124

drwxr-xr-x 2 hadoop hadoop  4096 Feb 24 14:29 bin

drwxr-xr-x 3 hadoop hadoop    19 Feb 24 14:30 etc

drwxr-xr-x 2 hadoop hadoop  101 Feb 24 14:30 include

drwxr-xr-x 3 hadoop hadoop    19 Feb 24 14:29 lib

drwxr-xr-x 2 hadoop hadoop  4096 Feb 24 14:29 libexec

-rw-r--r-- 1 hadoop hadoop 86424 Feb 24 18:44 LICENSE.txt

drwxrwxr-x 2 hadoop hadoop  4096 Feb 24 14:30 logs

-rw-r--r-- 1 hadoop hadoop 14978 Feb 24 18:44 NOTICE.txt

-rw-r--r-- 1 hadoop hadoop  1366 Feb 24 18:44 README.txt

drwxr-xr-x 2 hadoop hadoop  4096 Feb 24 14:29 sbin

drwxr-xr-x 4 hadoop hadoop    29 Feb 24 14:30 share

[hadoop@slave2 hadoop-2.7.5]$ pwd

/usr/hadoop/hadoop-2.7.5

[hadoop@slave2 hadoop-2.7.5]$ rm -R logs/*

4. Start the datanode and nodemanager processes on the new node

Before proceeding, confirm that the host being added does not appear in the etc/hadoop/excludes file on the namenode or on the current datanodes.
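This pre-check is easy to automate. A small sketch, run here against a temporary stand-in for the excludes file (the assumed real path is /usr/hadoop/hadoop-2.7.5/etc/hadoop/excludes):

```shell
# $excl stands in for the excludes file on the namenode (assumed path above).
excl=$(mktemp)
printf '# decommissioned hosts\n' > "$excl"
# Fail fast if the host we are about to add is still listed.
if grep -qw 'slave2' "$excl"; then
  echo "slave2 is still excluded -- remove it before starting the datanode"
else
  echo "slave2 not excluded"
fi
```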

[hadoop@slave2 hadoop-2.7.5]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /usr/hadoop/hadoop-2.7.5/logs/hadoop-hadoop-datanode-slave2.out
[hadoop@slave2 hadoop-2.7.5]$ sbin/yarn-daemon.sh start nodemanager
starting nodemanager, logging to /usr/hadoop/hadoop-2.7.5/logs/yarn-hadoop-nodemanager-slave2.out
[hadoop@slave2 hadoop-2.7.5]$ jps
3897 DataNode
6772 NodeManager
8189 Jps
[hadoop@slave2 ~]$

5. Refresh the nodes on the NameNode

[hadoop@hadoop-master ~]$ hdfs dfsadmin -refreshNodes
Refresh nodes successful
[hadoop@hadoop-master ~]$ sbin/start-balancer.sh

6. Check the cluster status on the namenode and confirm that the new node has joined

[hadoop@hadoop-master hadoop]$ hdfs dfsadmin -report
Configured Capacity: 58663657472 (54.63 GB)
Present Capacity: 15487176704 (14.42 GB)
DFS Remaining: 15486873600 (14.42 GB)
DFS Used: 303104 (296 KB)
DFS Used%: 0.00%
Under replicated blocks: 5
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (2):

Name: 192.168.48.131:50010 (slave2)
Hostname: 183.221.250.11
Decommission Status : Normal
Configured Capacity: 38588669952 (35.94 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 36887191552 (34.35 GB)
DFS Remaining: 1701470208 (1.58 GB)
DFS Used%: 0.00%
DFS Remaining%: 4.41%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 01 19:36:33 PST 2018

Name: 192.168.48.132:50010 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 20074987520 (18.70 GB)
DFS Used: 294912 (288 KB)
Non DFS Used: 6289289216 (5.86 GB)
DFS Remaining: 13785403392 (12.84 GB)
DFS Used%: 0.00%
DFS Remaining%: 68.67%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Mar 01 19:36:35 PST 2018

[hadoop@hadoop-master hadoop]$

7. Dynamically remove a datanode

7.1 Edit hdfs-site.xml on the NameNode: reduce the dfs.replication replica count as appropriate and add a dfs.hosts.exclude property

[hadoop@hadoop-master hadoop]$ pwd
/usr/hadoop/hadoop-2.7.5/etc/hadoop
[hadoop@hadoop-master hadoop]$ cat hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/usr/local/hadoop-2.7.5/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/usr/local/hadoop-2.7.5/tmp/dfs/data</value>
  </property>
  <property>
    <name>dfs.hosts.exclude</name>
    <value>/usr/hadoop/hadoop-2.7.5/etc/hadoop/excludes</value>
  </property>

</configuration>

7.2 Create an excludes file in the corresponding directory (etc/hadoop/) on the namenode and add the IP or hostname of the DataNode to be removed

[hadoop@hadoop-master hadoop]$ pwd
/usr/hadoop/hadoop-2.7.5/etc/hadoop
[hadoop@hadoop-master hadoop]$ vi excludes
####slave2
192.168.48.131
[hadoop@hadoop-master hadoop]$

7.3 Refresh all DataNodes from the NameNode

hdfs dfsadmin -refreshNodes
sbin/start-balancer.sh

7.4 Check the cluster status on the namenode and confirm that the node has been removed; slave2 no longer appears in the report

[hadoop@hadoop-master hadoop]$ hdfs dfsadmin -report

Alternatively, on the web UI (ip:50070) you can watch the DataNode's state gradually change to Dead.

http://192.168.48.129:50070/

When the datanode's Admin State has changed from "In Service" to "Decommissioned", the removal has succeeded.
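The same check can be done from the shell by filtering the report. A sketch, run here against a sample string in the report format shown above; in practice you would pipe the real `hdfs dfsadmin -report` output instead:

```shell
# Sample text in the format of `hdfs dfsadmin -report`; real usage would be:
#   hdfs dfsadmin -report | grep -A2 'slave2' | grep 'Decommission Status'
report='Name: 192.168.48.131:50010 (slave2)
Hostname: slave2
Decommission Status : Decommissioned'
status=$(printf '%s\n' "$report" | grep 'Decommission Status')
echo "$status"
case "$status" in
  *Decommissioned*) echo "slave2 fully decommissioned" ;;
  *)                echo "still decommissioning" ;;
esac
```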

7.5 Stop the processes on the removed node

[hadoop@slave2 hadoop-2.7.5]$ jps
9530 Jps
3897 DataNode
6772 NodeManager
[hadoop@slave2 hadoop-2.7.5]$ sbin/hadoop-daemon.sh stop datanode
stopping datanode
[hadoop@slave2 hadoop-2.7.5]$ sbin/yarn-daemon.sh stop nodemanager
stopping nodemanager
[hadoop@slave2 hadoop-2.7.5]$ jps
9657 Jps
[hadoop@slave2 hadoop-2.7.5]$

8. Restore a removed node

Remove the node's entry created in step 7.2, then repeat steps 4, 5, and 6.
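Step 8 boils down to deleting the node's lines from the excludes file and refreshing. A sketch against a temporary copy (the real file is /usr/hadoop/hadoop-2.7.5/etc/hadoop/excludes on the namenode):

```shell
# $excludes stands in for the real excludes file on the namenode.
excludes=$(mktemp)
printf '####slave2\n192.168.48.131\n' > "$excludes"
# Delete both the comment line and the IP entry for slave2.
sed -i '/slave2/d; /192\.168\.48\.131/d' "$excludes"
cat "$excludes"   # now empty
# Then on the namenode: hdfs dfsadmin -refreshNodes
# and restart the datanode/nodemanager as in steps 4-6.
```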
