
Setting up a Spark + Hadoop cluster

foochane
Published 2019-05-23 14:52:18

Environment: hadoop-2.6.5, spark-2.3.0, scala-2.12.5

1 Set the IP addresses
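
The later steps address the nodes both by IP (192.168.200.122 for the master, 192.168.200.123 and 192.168.200.124 for the slaves) and by the names Master, Slave1 and Slave2, so every node has to be able to resolve those names. A minimal sketch, assuming the name-to-address mapping implied by those commands and assuming /etc/hosts is used for name resolution; `print_hosts_entries` is a made-up helper that only prints the lines and does not edit any file:

```shell
#!/bin/sh
# Print /etc/hosts entries for the cluster nodes.
# The name/address pairs are an assumption inferred from the commands used later.
print_hosts_entries() {
  printf '%s\t%s\n' \
    192.168.200.122 Master \
    192.168.200.123 Slave1 \
    192.168.200.124 Slave2
}

print_hosts_entries
```

After reviewing the output, it could be appended on every node, for example with `print_hosts_entries | sudo tee -a /etc/hosts`.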

2 Configure SSH
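
Hadoop's start scripts log in to every host listed in the slaves file over SSH, so the master account normally needs key-based (passwordless) access to each slave. A minimal sketch, assuming an RSA key pair and the accounts and addresses that appear in the scp commands later in this guide; `print_ssh_setup` is an illustrative helper that prints the commands for review instead of running them:

```shell
#!/bin/sh
# Print the commands that set up key-based SSH from the master to each target.
# Targets are user@host pairs; nothing is executed, the output is for review.
print_ssh_setup() {
  echo "ssh-keygen -t rsa -f ~/.ssh/id_rsa"
  for target in "$@"; do
    echo "ssh-copy-id -i ~/.ssh/id_rsa.pub $target"
  done
}

print_ssh_setup slave1@192.168.200.123 slave2@192.168.200.124
```

Once the keys are copied, `ssh slave1@192.168.200.123` from the master should log in without asking for a password.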

3 Install Java

3.1 Download the JDK. This guide uses jdk-8u171-linux-x64.tar.gz.

3.2 Create a new directory

sudo mkdir /usr/local/java

3.3 Extract the downloaded archive into the java directory. Change to the directory that holds the JDK archive and extract it; the archive can be deleted afterwards.

cd ~/cluster/software
sudo tar zxvf jdk-8u171-linux-x64.tar.gz -C /usr/local/java

3.4 Set the JDK environment variables

vim ~/.bashrc

Add the following lines:

export JAVA_HOME=/usr/local/java/jdk1.8.0_171
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
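
Editing ~/.bashrc by hand works, but re-running a setup tends to leave duplicate export lines behind. A small sketch of an idempotent alternative; `append_once` is a hypothetical helper, not part of any tool used in this guide, and the demo writes to a temporary file rather than to the real ~/.bashrc:

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
append_once() {
  file=$1
  line=$2
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

demo=$(mktemp)
append_once "$demo" 'export JAVA_HOME=/usr/local/java/jdk1.8.0_171'
append_once "$demo" 'export JAVA_HOME=/usr/local/java/jdk1.8.0_171'   # second call is a no-op
cat "$demo"
rm -f "$demo"
```

Single quotes around the export line keep `$`-references literal, so they are expanded when ~/.bashrc is sourced, not when the line is appended.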

Then reload the file so that the settings take effect immediately:

source ~/.bashrc

3.5 Finally, check that the settings took effect

java -version
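
`java -version` only shows that some java binary is on the PATH. To confirm that JAVA_HOME itself points at a usable JDK, a check along these lines can help; `check_java_home` is an illustrative name, and the function only tests for an executable at $JAVA_HOME/bin/java:

```shell
#!/bin/sh
# Return 0 if the given directory (default: $JAVA_HOME) contains an executable bin/java.
check_java_home() {
  home=${1:-${JAVA_HOME:-}}
  if [ -n "$home" ] && [ -x "$home/bin/java" ]; then
    echo "OK: $home/bin/java"
  else
    echo "JAVA_HOME is unset or has no executable bin/java: '$home'" >&2
    return 1
  fi
}

check_java_home /usr/local/java/jdk1.8.0_171 || echo "check failed"
```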

4 Install Scala

Installing Scala is similar to installing Java.

4.1 Download the archive scala-2.12.5.tgz from http://www.scala-lang.org/download/

4.2 Create a directory and extract the archive into it

sudo mkdir /usr/local/scala
sudo tar zxvf scala-2.12.5.tgz -C /usr/local/scala

4.3 Add the environment variables

vim ~/.bashrc

Add the following lines:

export SCALA_HOME=/usr/local/scala/scala-2.12.5
export PATH=/usr/local/scala/scala-2.12.5/bin:$PATH

Then reload the file:

source ~/.bashrc

4.4 Test

scala -version

Scala code runner version 2.12.5 -- Copyright 2002-2018, LAMP/EPFL and Lightbend, Inc.

5 Install Hadoop

Download hadoop-2.6.5.tar.gz from http://hadoop.apache.org/releases.html

Install Hadoop into /usr/local/:

sudo tar -zxf ./hadoop-2.6.5.tar.gz -C /usr/local
cd /usr/local/
sudo mv ./hadoop-2.6.5/ ./hadoop            # rename the directory to hadoop
sudo chown -R master:master ./hadoop        # make the master user the owner

Hadoop is ready to use once it has been extracted. Run the following commands to check that it works; on success they print the Hadoop version:

cd /usr/local/hadoop
./bin/hadoop version

The cluster setup requires changes to six configuration files, all located in /usr/local/hadoop/etc/hadoop/

1. slaves

Delete the original localhost entry from the file and add:

Slave1
Slave2

2. hadoop-env.sh

vim hadoop-env.sh
# line 27
export JAVA_HOME=/usr/local/java/jdk1.8.0_171

export HADOOP_HOME=/usr/local/hadoop

3. core-site.xml

<configuration>
<!-- Default file system URI (scheme and address); this is the address of the HDFS NameNode -->
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://Master:9000</value>
        </property>
        <!-- Directory where Hadoop stores the files it generates at runtime -->
        <property>
            <name>hadoop.tmp.dir</name>
            <value>file:/usr/local/hadoop/tmp</value>
            <description>Abase for other temporary directories.</description>
        </property>
        <property>
            <name>hadoop.native.lib</name>
            <value>true</value>
            <description>Should native hadoop libraries, if present, be used.</description>
        </property>
</configuration>
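
After editing the *-site.xml files, it can be useful to read a value back and confirm the file says what was intended. With Hadoop on the PATH, `hdfs getconf -confKey fs.defaultFS` does this. Where that is not available yet, a rough awk-based sketch can stand in; `get_hadoop_property` is a hypothetical helper and assumes each <name> and <value> element sits on a single line, as in the files shown here:

```shell
#!/bin/sh
# Print the value of one property from a Hadoop *-site.xml file.
# Usage: get_hadoop_property FILE PROPERTY_NAME
get_hadoop_property() {
  awk -v want="$2" '
    /<name>/ {
      name = $0
      sub(/.*<name>/, "", name)
      sub(/<\/name>.*/, "", name)
    }
    /<value>/ && name == want {
      value = $0
      sub(/.*<value>/, "", value)
      sub(/<\/value>.*/, "", value)
      print value
      exit
    }
  ' "$1"
}

# Demo on a temporary file holding the fs.defaultFS property from core-site.xml.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://Master:9000</value>
        </property>
</configuration>
EOF
get_hadoop_property "$conf" fs.defaultFS
rm -f "$conf"
```

This is a line-based text match, not an XML parser, so it is only suitable for quick checks on files laid out like the ones in this guide.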

4. hdfs-site.xml

<configuration>
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>Master:50090</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:/usr/local/hadoop/tmp/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:/usr/local/hadoop/tmp/dfs/data</value>
        </property>
        <property>
                <name>dfs.namenode.checkpoint.dir</name>
                <value>file:/usr/local/hadoop/tmp/dfs/namesecondary</value>
        </property>
</configuration>

5. mapred-site.xml

The file has to be renamed first; its default name is mapred-site.xml.template:

mv mapred-site.xml.template mapred-site.xml

Then edit mapred-site.xml so that it contains:

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>master:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>master:19888</value>
        </property>
</configuration>

6. yarn-site.xml

<configuration>
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>master</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
</configuration>

Add Hadoop to the environment variables

vim ~/.bashrc

Add the following lines:

export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native

export OPTS="-Djava.library.path=${HADOOP_HOME}/lib"

Then reload the file:

source ~/.bashrc

Copy the JDK archive to the slave so that Java can be installed there in the same way:

scp jdk-8u171-linux-x64.tar.gz slave1@192.168.200.123:/home/slave1/cluster/software

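Configure the slaves

Once the configuration is done, copy the /usr/local/hadoop directory from the Master to every node. If pseudo-distributed mode was run earlier, delete the old temporary files before switching to cluster mode. Start the slave virtual machines first, then run the following on the Master node:
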
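cd /usr/local                # the paths below are relative to /usr/local
sudo rm -r ./hadoop/tmp     # delete the Hadoop temporary files
sudo rm -r ./hadoop/logs/*   # delete the log files
sudo tar -zcf hadoop.master.tar.gz  hadoop   # compress first, then copy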
scp hadoop.master.tar.gz  slave1@192.168.200.123:/home/slave1/cluster/software
scp hadoop.master.tar.gz  slave2@192.168.200.124:/home/slave2/cluster/software
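
With more than two slaves, repeating the scp line by hand gets error-prone. The same copy can be sketched as a loop; `print_copy_commands` is an illustrative helper, the /home/<user>/cluster/software destination follows the pattern of the commands above, and the function prints the scp commands instead of running them so they can be reviewed first:

```shell
#!/bin/sh
# Print one scp command per target (user@host), copying ARCHIVE to
# /home/<user>/cluster/software on that host.
print_copy_commands() {
  archive=$1
  shift
  for target in "$@"; do
    user=${target%%@*}
    echo "scp $archive $target:/home/$user/cluster/software"
  done
}

print_copy_commands hadoop.master.tar.gz \
  slave1@192.168.200.123 \
  slave2@192.168.200.124
```

Piping the output to `sh` runs the copies.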

Run the following on the Slave1 node:

sudo rm -r /usr/local/hadoop    # remove the old copy (if it exists)
sudo tar -zxf ~/cluster/software/hadoop.master.tar.gz -C /usr/local
sudo chown -R slave1:slave1 /usr/local/hadoop

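Likewise, for every other Slave node, transfer hadoop.master.tar.gz to the node and extract it there.

Remember to set the environment variables on each node as well (same as above).

Before the first start, format the NameNode on the Master node:

./bin/hdfs namenode -format     # needed on the first run only, not afterwards

The command prints output like the following:
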
master@master-pc:/usr/local/hadoop$ ./bin/hdfs namenode -format
18/04/20 02:02:12 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master-pc/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.5
STARTUP_MSG:   classpath = /usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/hadoop-annotations-2.6.5.jar:[... remaining jar entries under /usr/local/hadoop/share/hadoop/ omitted ...]:/usr/local/hadoop/contrib/capacity-scheduler/*.jar
STARTUP_MSG:   build = https://github.com/apache/hadoop.git -r e8c9fe0b4c252caf2ebf1464220599650f119997; compiled by 'sjlee' on 2016-10-02T23:43Z
STARTUP_MSG:   java = 1.8.0_171
************************************************************/
18/04/20 02:02:12 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
18/04/20 02:02:12 INFO namenode.NameNode: createNameNode [-format]
18/04/20 02:02:13 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Formatting using clusterid: CID-a3bb4139-d668-46e9-a5f4-b96c960a427e
18/04/20 02:02:13 INFO namenode.FSNamesystem: No KeyProvider found.
18/04/20 02:02:13 INFO namenode.FSNamesystem: fsLock is fair:true
18/04/20 02:02:13 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
18/04/20 02:02:13 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
18/04/20 02:02:13 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
18/04/20 02:02:13 INFO blockmanagement.BlockManager: The block deletion will start around 2018 Apr 20 02:02:13
18/04/20 02:02:13 INFO util.GSet: Computing capacity for map BlocksMap
18/04/20 02:02:13 INFO util.GSet: VM type       = 64-bit
18/04/20 02:02:13 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB
18/04/20 02:02:13 INFO util.GSet: capacity      = 2^21 = 2097152 entries
18/04/20 02:02:13 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
18/04/20 02:02:13 INFO blockmanagement.BlockManager: defaultReplication         = 1
18/04/20 02:02:13 INFO blockmanagement.BlockManager: maxReplication             = 512
18/04/20 02:02:13 INFO blockmanagement.BlockManager: minReplication             = 1
18/04/20 02:02:13 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
18/04/20 02:02:13 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
18/04/20 02:02:13 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
18/04/20 02:02:13 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
18/04/20 02:02:13 INFO namenode.FSNamesystem: fsOwner             = master (auth:SIMPLE)
18/04/20 02:02:13 INFO namenode.FSNamesystem: supergroup          = supergroup
18/04/20 02:02:13 INFO namenode.FSNamesystem: isPermissionEnabled = true
18/04/20 02:02:13 INFO namenode.FSNamesystem: HA Enabled: false
18/04/20 02:02:13 INFO namenode.FSNamesystem: Append Enabled: true
18/04/20 02:02:13 INFO util.GSet: Computing capacity for map INodeMap
18/04/20 02:02:13 INFO util.GSet: VM type       = 64-bit
18/04/20 02:02:13 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB
18/04/20 02:02:13 INFO util.GSet: capacity      = 2^20 = 1048576 entries
18/04/20 02:02:13 INFO namenode.NameNode: Caching file names occuring more than 10 times
18/04/20 02:02:13 INFO util.GSet: Computing capacity for map cachedBlocks
18/04/20 02:02:13 INFO util.GSet: VM type       = 64-bit
18/04/20 02:02:13 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB
18/04/20 02:02:13 INFO util.GSet: capacity      = 2^18 = 262144 entries
18/04/20 02:02:13 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
18/04/20 02:02:13 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
18/04/20 02:02:13 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
18/04/20 02:02:13 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
18/04/20 02:02:13 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
18/04/20 02:02:13 INFO util.GSet: Computing capacity for map NameNodeRetryCache
18/04/20 02:02:13 INFO util.GSet: VM type       = 64-bit
18/04/20 02:02:13 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
18/04/20 02:02:13 INFO util.GSet: capacity      = 2^15 = 32768 entries
18/04/20 02:02:13 INFO namenode.NNConf: ACLs enabled? false
18/04/20 02:02:13 INFO namenode.NNConf: XAttrs enabled? true
18/04/20 02:02:13 INFO namenode.NNConf: Maximum size of an xattr: 16384
18/04/20 02:02:13 INFO namenode.FSImage: Allocated new BlockPoolId: BP-519211981-127.0.1.1-1524204133780
18/04/20 02:02:13 INFO common.Storage: Storage directory /usr/local/hadoop/tmp/dfs/name has been successfully formatted.
18/04/20 02:02:13 INFO namenode.FSImageFormatProtobuf: Saving image file /usr/local/hadoop/tmp/dfs/name/current/fsimage.ckpt_0000000000000000000 using no compression
18/04/20 02:02:14 INFO namenode.FSImageFormatProtobuf: Image file /usr/local/hadoop/tmp/dfs/name/current/fsimage.ckpt_0000000000000000000 of size 323 bytes saved in 0 seconds.
18/04/20 02:02:14 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
18/04/20 02:02:14 INFO util.ExitUtil: Exiting with status 0
18/04/20 02:02:14 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master-pc/127.0.1.1
************************************************************/

On success the output contains "successfully formatted" and "Exiting with status 0"; "Exiting with status 1" means the format failed.
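
When the format step is scripted, the two success markers can be checked automatically. A minimal sketch, assuming the command's output (stdout and stderr) was saved to a file; `format_succeeded` is a hypothetical helper name:

```shell
#!/bin/sh
# Return 0 if a saved `hdfs namenode -format` log contains both success markers.
format_succeeded() {
  grep -q 'successfully formatted' "$1" && grep -q 'Exiting with status 0' "$1"
}

# Demo with two lines taken from the format output shown in this guide.
log=$(mktemp)
cat > "$log" <<'EOF'
18/04/20 02:02:13 INFO common.Storage: Storage directory /usr/local/hadoop/tmp/dfs/name has been successfully formatted.
18/04/20 02:02:14 INFO util.ExitUtil: Exiting with status 0
EOF
if format_succeeded "$log"; then echo "format OK"; else echo "format FAILED"; fi
rm -f "$log"
```

On the Master, the log could be captured with `./bin/hdfs namenode -format 2>&1 | tee format.log`.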

Next, start the NameNode and DataNode daemons:

./sbin/start-dfs.sh

error

Fix for a DataNode that fails to start

    ./sbin/stop-dfs.sh   # stop HDFS
    rm -r ./tmp     # delete the tmp directory; note that this deletes all existing data in HDFS
    ./bin/hdfs namenode -format   # reformat the NameNode
    ./sbin/start-dfs.sh  # restart
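
A frequent reason for a DataNode refusing to start after the NameNode has been reformatted is that the clusterID stored by the DataNode no longer matches the one in the NameNode's metadata; deleting tmp removes the stale ID. The sketch below compares the two values before anything is deleted. It assumes the storage directories configured in hdfs-site.xml in this guide, each holding a current/VERSION file with a clusterID= line; `read_cluster_id` and `same_cluster_id` are illustrative names:

```shell
#!/bin/sh
# Print the clusterID recorded in an HDFS VERSION file.
read_cluster_id() {
  sed -n 's/^clusterID=//p' "$1"
}

# Return 0 if two VERSION files carry the same, non-empty clusterID.
same_cluster_id() {
  id_a=$(read_cluster_id "$1")
  id_b=$(read_cluster_id "$2")
  [ -n "$id_a" ] && [ "$id_a" = "$id_b" ]
}

# Demo with two temporary VERSION files; the real ones would be
# /usr/local/hadoop/tmp/dfs/name/current/VERSION on the Master and
# /usr/local/hadoop/tmp/dfs/data/current/VERSION on a slave.
nn=$(mktemp)
dn=$(mktemp)
echo 'clusterID=CID-a3bb4139-d668-46e9-a5f4-b96c960a427e' > "$nn"
echo 'clusterID=CID-00000000-0000-0000-0000-000000000000' > "$dn"
if same_cluster_id "$nn" "$dn"; then echo "clusterID matches"; else echo "clusterID mismatch"; fi
rm -f "$nn" "$dn"
```

If the IDs differ, the tmp directory on the affected slave needs to be removed as well, since that is where the DataNode keeps its copy.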

Start Hadoop

Other commands:

  ./sbin/start-all.sh 
  ./sbin/stop-all.sh 

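The jps command shows the processes started on each node. If everything is correct, the Master node shows the NameNode, ResourceManager, SecondaryNameNode and JobHistoryServer processes (the JobHistoryServer appears only after it has been started separately with ./sbin/mr-jobhistory-daemon.sh start historyserver), and each Slave node shows the DataNode and NodeManager processes. The status of the DataNodes and the NameNode can also be checked on the web page http://master:50070/. If something fails to start, check the startup logs to find the cause.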
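
The jps check can be scripted so that a missing daemon stands out. A minimal sketch, assuming jps output in its usual "<pid> <name>" form is piped in on standard input; `missing_daemons` is a hypothetical helper, and the sample output in the demo is made up:

```shell
#!/bin/sh
# Read `jps` output on stdin and print every expected daemon that is not listed.
# Usage: jps | missing_daemons NameNode ResourceManager ...
missing_daemons() {
  running=$(awk '{ print $2 }')
  for daemon in "$@"; do
    printf '%s\n' "$running" | grep -qx -- "$daemon" || echo "$daemon"
  done
}

# Demo with made-up jps output for a Master node.
printf '%s\n' '2101 NameNode' '2345 SecondaryNameNode' '2570 ResourceManager' '3011 Jps' |
  missing_daemons NameNode SecondaryNameNode ResourceManager JobHistoryServer
```

On a slave, the equivalent call would be `jps | missing_daemons DataNode NodeManager`; empty output means every expected daemon is running.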

Test examples are described at http://www.powerxing.com/install-hadoop/.

Install Spark

Download page: http://spark.apache.org/downloads.html. Download the pre-built package spark-2.3.0-bin-hadoop2.6.tgz.

After downloading, run the following commands to install it:

sudo tar -zxf  spark-2.3.0-bin-hadoop2.6.tgz -C /usr/local/
cd /usr/local
sudo mv ./spark-2.3.0-bin-hadoop2.6/ ./spark
sudo chown -R master:master ./spark  

After installation, Spark's classpath has to be set in ./conf/spark-env.sh. Run the following commands to create that file from its template:

cd /usr/local/spark
cp ./conf/spark-env.sh.template ./conf/spark-env.sh

Edit ./conf/spark-env.sh (vim ./conf/spark-env.sh) and append the following at the end:

export JAVA_HOME=/usr/local/java/jdk1.8.0_171
export SCALA_HOME=/usr/local/scala/scala-2.12.5
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
export SPARK_MASTER_IP=192.168.200.122
export SPARK_WORKER_INSTANCES=2
export SPARK_WORKER_MEMORY=1g
export SPARK_WORKER_CORES=1
export SPARK_HOME=/usr/local/spark
export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)

Next, create the slaves file from its template and list the worker nodes in it:

cd conf/
cp slaves.template slaves
vim slaves

Add the following lines:
Slave1
Slave2

Set the system environment variables by adding the following to ~/.bashrc:

export SPARK_HOME=/usr/local/spark
export PATH=${SPARK_HOME}/sbin:$PATH
export PATH=${SPARK_HOME}/bin:$PATH

source ~/.bashrc

Edit spark-defaults.conf

cp  spark-defaults.conf.template  spark-defaults.conf

Add the following lines:

spark.executor.extraJavaOptions     -XX:+PrintGCDetails -DKey=value -Dnumbers="one two three"
spark.eventLog.enabled              true
spark.eventLog.dir                  hdfs://Master:9000/historyserverforSpark
spark.yarn.historyServer.address    Master:18080
spark.history.fs.logDirectory       hdfs://Master:9000/historyserverforSpark
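
The history server only finds applications if spark.eventLog.dir (where applications write their event logs) and spark.history.fs.logDirectory (where the history server reads them) point at the same location. A small sketch that reads both keys back from spark-defaults.conf and compares them; `spark_conf_value` and `history_dirs_match` are hypothetical helpers and assume the whitespace-separated "key value" layout used above:

```shell
#!/bin/sh
# Print the value of KEY from a spark-defaults.conf style file.
# If the key appears more than once, the last occurrence is printed.
spark_conf_value() {
  awk -v key="$2" '$1 == key { $1 = ""; sub(/^[ \t]+/, ""); value = $0 } END { print value }' "$1"
}

# Return 0 if the event log directory and the history server directory agree.
history_dirs_match() {
  written=$(spark_conf_value "$1" spark.eventLog.dir)
  read_from=$(spark_conf_value "$1" spark.history.fs.logDirectory)
  [ -n "$written" ] && [ "$written" = "$read_from" ]
}

# Demo on a temporary file holding the two relevant lines.
conf=$(mktemp)
cat > "$conf" <<'EOF'
spark.eventLog.dir                  hdfs://Master:9000/historyserverforSpark
spark.history.fs.logDirectory       hdfs://Master:9000/historyserverforSpark
EOF
if history_dirs_match "$conf"; then echo "directories match"; else echo "directories differ"; fi
rm -f "$conf"
```

Against the real file the call would be `history_dirs_match /usr/local/spark/conf/spark-defaults.conf`.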

Configure the slaves

cd /usr/local
tar -zcf ~/spark.master.tar.gz ./spark   # compress first, then copy
cd ~
scp ./spark.master.tar.gz slave1@slave1:/home/slave1/cluster/software
scp ./spark.master.tar.gz slave2@slave2:/home/slave2/cluster/software

Run the following on the Slave1 node (and the same on the other slaves):

sudo rm -r /usr/local/spark    # remove the old copy (if it exists)
sudo tar -zxf ~/cluster/software/spark.master.tar.gz -C /usr/local
sudo chown -R slave1:slave1 /usr/local/spark

Configure historyserverforSpark

hdfs dfs -rm -r -f /historyserverforSpark    # HDFS must be running
hdfs dfs -mkdir /historyserverforSpark

That completes the basic configuration.

Now let's test it.

Final test

First start Hadoop, then Spark, then submit the example job:

cd /usr/local/hadoop/sbin/

./start-all.sh                                  # starts HDFS and YARN
./mr-jobhistory-daemon.sh start historyserver   # MapReduce JobHistoryServer

cd /usr/local/spark/sbin/

./start-all.sh
./start-history-server.sh

cd /usr/local/spark

./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://Master:7077 ./examples/jars/spark-examples_2.11-2.3.0.jar 10
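
SparkPi reports its result on a single line starting with "Pi is roughly", which is easy to miss among the log messages. A minimal sketch of filtering for it; `extract_pi` is an illustrative helper and the sample input in the demo is made up:

```shell
#!/bin/sh
# Read spark-submit output on stdin and print the estimate from the "Pi is roughly" line.
extract_pi() {
  sed -n 's/^Pi is roughly //p'
}

# Demo with made-up output; in practice the spark-submit command is piped into extract_pi.
printf '%s\n' 'INFO DAGScheduler: Job 0 finished' 'Pi is roughly 3.1418' | extract_pi
```

Empty output means the result line never appeared, which usually points to a failed job; the spark-submit log then shows the reason.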

References:
https://blog.csdn.net/crystal_zero/article/details/50969586
http://www.powerxing.com/install-hadoop/
https://blog.csdn.net/weixin_36394852/article/category/7047453

Originally published: 2018.04.20
