
hadoop-3.1.3 cluster setup on linux

Original article by esse LL (column: 操作系统实验), last modified 2023-11-04 19:51:57

1. config environment

1.1 switch to root user

Code language: shell
sudo passwd root  # set a root password first if the account has none
su

1.2 make directories to proceed

Code language: shell
mkdir /opt/software /opt/module
cd /opt/software

1.3 download jdk1.8 and extract to target path

Code language: shell
wget "https://mirrors.tuna.tsinghua.edu.cn/Adoptium/8/jdk/x64/linux/OpenJDK8U-jdk_x64_linux_hotspot_8u392b08.tar.gz"
tar -xzvf /opt/software/OpenJDK8U-jdk_x64_linux_hotspot_8u392b08.tar.gz -C /opt/module

1.4 copy hadoop tarball and extract
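The hadoop-3.1.3 tarball is assumed to already be copied into /opt/software. If it is not, one option (shown here as an example, not part of the original steps) is to fetch it from the Apache archive:

Code language: shell
cd /opt/software
wget "https://archive.apache.org/dist/hadoop/common/hadoop-3.1.3/hadoop-3.1.3.tar.gz"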

Code language: shell
tar -xzvf /opt/software/hadoop-3.1.3.tar.gz -C /opt/module

1.5 config env variables

Code language: shell
vi /etc/profile

add the following lines:

Code language: shell
export JAVA_HOME="/opt/module/jdk8u392-b08"
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME="/opt/module/hadoop-3.1.3"
export PATH=$PATH:$HADOOP_HOME/bin
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

reload the profile to apply the changes:

Code language: shell
source /etc/profile
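A quick way to confirm the variables took effect (exact output will vary slightly by build):

Code language: shell
java -version      # should report openjdk version "1.8.0_392"
hadoop version     # should report Hadoop 3.1.3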

1.6 config hostname and add hosts map

Code language: shell
vi /etc/hostname
# modify hostname to master
vi /etc/hosts
# add:
# 192.168.56.104 master
# 192.168.56.105 slave1

ping slave1 to verify the hosts mapping and the network connection, as shown below
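For example (assuming slave1 is already up at 192.168.56.105):

Code language: shell
ping -c 3 slave1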

1.7 rsa id auth

Code language: shell
ssh-keygen -t rsa        # creates ~/.ssh if it does not exist yet
cd ~/.ssh
cat id_rsa.pub >> authorized_keys
ssh master
# rsa auth, no need to type in a password
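start-dfs.sh will also ssh into slave1, so the public key needs to reach that node as well. A minimal sketch, assuming root login on slave1 is allowed (you will be asked for slave1's root password once):

Code language: shell
ssh-copy-id root@slave1
ssh slave1   # should now log in without a password
exit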

2. config hadoop

need to config hadoop-env.sh first,

then the workers file,

and then 4 xml files: core-site, hdfs-site, mapred-site and yarn-site
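All of these files live under $HADOOP_HOME/etc/hadoop; a quick listing to confirm they are there (illustrative only):

Code language: shell
cd $HADOOP_HOME
ls etc/hadoop/ | grep -E 'hadoop-env.sh|workers|core-site|hdfs-site|mapred-site|yarn-site'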

2.1 hadoop-env

Code language: shell
cd $HADOOP_HOME
vi etc/hadoop/hadoop-env.sh

add the following lines:

Code language: shell
export JAVA_HOME="/opt/module/jdk8u392-b08"
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_NODEMANAGER_USER=root
export YARN_RESOURCEMANAGER_USER=root

2.2 workers

Code language: shell
vi etc/hadoop/workers

remove the default localhost entry and add the following lines:

Code language: txt
master
slave1

2.3.1 core

Code language: shell
vi etc/hadoop/core-site.xml
Code language: xml
<configuration>
     <property>
         <name>fs.defaultFS</name>
         <value>hdfs://master:9000</value>
     </property>
</configuration>

2.3.2 hdfs

create the directories referenced in the config:

Code language: shell
mkdir -p /opt/data/nameNode /opt/data/dataNode
vi etc/hadoop/hdfs-site.xml
Code language: xml
<configuration>
    <property>
            <name>dfs.replication</name>
            <value>2</value>
    </property>
    <property>
            <name>dfs.namenode.name.dir</name>
            <value>/opt/data/nameNode</value>
    </property>
    <property>
            <name>dfs.datanode.data.dir</name>
            <value>/opt/data/dataNode</value>
    </property>
</configuration>

2.3.3 mapreduce

Code language: shell
vi etc/hadoop/mapred-site.xml
Code language: xml
<configuration>
     <property>
         <name>mapreduce.framework.name</name>
         <value>yarn</value>
     </property>
     <property>
         <name>mapreduce.application.classpath</name>
         <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
     </property>
</configuration>

2.3.4 yarn

Code language: shell
vi etc/hadoop/yarn-site.xml
Code language: xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
</configuration>

3. config slave1 node

3.1 slave1 dest path

Code language: shell
ssh slave1
mkdir /opt/module
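While still logged in to slave1, it also helps to create the HDFS storage directories referenced in hdfs-site.xml (an extra precaution added here; DataNodes can usually create them on their own):

Code language: shell
mkdir -p /opt/data/nameNode /opt/data/dataNode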

3.2 config hostname

modify hostname to slave1

Code language: shell
vi /etc/hostname
# modify hostname to slave1

exit from slave1, then run scp from master

3.3 scp hadoop to slave1

Code language: shell
scp -r /opt/module/hadoop-3.1.3 slave1:/opt/module
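slave1 also needs the JDK, since hadoop-env.sh points JAVA_HOME at /opt/module/jdk8u392-b08 on every node. A hedged sketch, assuming the same paths are used on both machines (copying /etc/profile is optional but keeps the environments identical):

Code language: shell
scp -r /opt/module/jdk8u392-b08 slave1:/opt/module
scp /etc/profile slave1:/etc/profile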

4. start hdfs

Code language: shell
hdfs namenode -format
$HADOOP_HOME/sbin/start-dfs.sh

namenode will run on master, and datanodes will run on master and slave1

1) use jps to check namenode and datanode

2) ssh slave1 and use jps to check the datanode
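yarn and mapreduce were configured above but are not started by start-dfs.sh. If you also want to run MapReduce jobs, the ResourceManager and NodeManagers can be started the same way (an extra, optional step); by default the NameNode web UI listens on port 9870 and the ResourceManager UI on 8088:

Code language: shell
$HADOOP_HOME/sbin/start-yarn.sh
jps
# ResourceManager and NodeManager should appear on master, NodeManager on slave1
# web UIs (from a browser that can reach the nodes):
#   http://master:9870   - HDFS NameNode
#   http://master:8088   - YARN ResourceManager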

5. test hdfs

Code language: shell
hdfs dfs -mkdir /test-path
hdfs dfs -ls /
# test with file:
wget -O ~/alice.txt "https://www.gutenberg.org/files/11/11-0.txt"
hdfs dfs -put ~/alice.txt /test-path
hdfs dfs -ls /test-path
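To confirm the file is actually readable back from HDFS (the file name matches the wget -O target above):

Code language: shell
hdfs dfs -cat /test-path/alice.txt | head -n 5
hdfs dfs -get /test-path/alice.txt /tmp/alice-from-hdfs.txt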

for more operations, see https://sparkbyexamples.com/apache-hadoop/hadoop-hdfs-dfs-commands-and-starting-hdfs-dfs-services/

Original statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.

In case of infringement, please contact cloudcommunity@tencent.com to request removal.
