Setting Up a Simple Hadoop Cluster on Tencent Cloud Servers

Environment

This environment consists of three machines: master (10.0.0.2), slave1 (10.0.0.3), and slave2 (10.0.0.4).

I. Initialize the Hadoop Environment

1. Create the hadoop account

useradd -d /data/hadoop -u 600 -g root hadoop

# set the hadoop user's password

passwd hadoop

2. Set the hostnames

  • Change the hostname to master on the master, and to slave1 and slave2 on the slaves.

vi /etc/hostname

master

**Note: on slave1 enter slave1 here, and on slave2 enter slave2.**
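
On systemd-based images (CentOS 7 and later, for example), hostnamectl can do the same thing in one step; a minimal sketch:

hostnamectl set-hostname master    # run "hostnamectl set-hostname slave1" on slave1, and so on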

  • Edit the hosts configuration; copy the same entries to every machine

vi /etc/hosts

10.0.0.2 master

10.0.0.3 slave1

10.0.0.4 slave2

127.0.0.1  localhost  localhost.localdomain  localhost

  • Edit the network configuration

vi /etc/sysconfig/network

# Created by anaconda
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=master

(On the slaves, set HOSTNAME to slave1 or slave2 accordingly.)

  • Reboot master, slave1 and slave2 so the changes take effect (a quick check follows below)
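
After the reboot, it is worth confirming that each node sees its own name and can resolve the others via the /etc/hosts entries above:

hostname            # should print master (or slave1/slave2)
ping -c 1 slave1    # should resolve to 10.0.0.3
ping -c 1 slave2    # should resolve to 10.0.0.4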

3. Set up passwordless SSH login

  • Generate a key pair

Run ssh-keygen -t rsa and just press Enter at every prompt.

  • Copy the public key to slave1 and slave2 (a full sketch follows below)

Run ssh-copy-id -i ~/.ssh/id_rsa.pub slave1, and the same again for slave2.
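
A minimal end-to-end sketch, run as the hadoop user on the master. Copying the key to master itself is also worthwhile, since the start scripts SSH into every node, including the local one:

ssh-keygen -t rsa                         # accept the defaults at every prompt
ssh-copy-id -i ~/.ssh/id_rsa.pub master   # lets scripts SSH into the master itself
ssh-copy-id -i ~/.ssh/id_rsa.pub slave1
ssh-copy-id -i ~/.ssh/id_rsa.pub slave2
ssh slave1 hostname                       # should print "slave1" with no password prompt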

II. Install Java

1. Download the JDK from the Oracle website.

2. Extract the JDK into the installation directory:

tar -xzvf jdk-8u91-linux-x64.tar.gz

3. Set the Java environment variables:

vi ~/.bash_profile

The .bash_profile file should end up looking like this:

# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/.local/bin:$HOME/bin:/data/hadoop/hadoop-2.6.4/share/

export PATH

JAVA_HOME=/data/hadoop/jdk1.8.0_91
CLASSPATH=.:$JAVA_HOME/lib
PATH=$JAVA_HOME/bin:$PATH

export JAVA_HOME CLASSPATH PATH

**Note:**

  1. CLASSPATH must begin with `.:` (the current directory); without it, Java fails with a "Could not find or load main class" error.
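
To confirm the settings took effect, reload the profile and check the version; with the paths above, output along these lines is expected:

source ~/.bash_profile
echo $JAVA_HOME    # expect: /data/hadoop/jdk1.8.0_91
java -version      # expect: java version "1.8.0_91"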

III. Install Hadoop

1. Download Hadoop. We use hadoop-2.6.4.tar.gz, which is a prebuilt binary release, so it only needs to be extracted.

2. Extract the archive into the installation directory:

tar -xzvf hadoop-2.6.4.tar.gz

3. Set the environment variables:

vi ~/.bashrc

# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=

# User specific aliases and functions

export HADOOP_PREFIX=$HOME/hadoop-2.6.4
export HADOOP_COMMON_HOME=$HADOOP_PREFIX
export HADOOP_HDFS_HOME=$HADOOP_PREFIX
export HADOOP_MAPRED_HOME=$HADOOP_PREFIX
export HADOOP_YARN_HOME=$HADOOP_PREFIX
export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop

export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin

Run source ~/.bashrc to make the configuration take effect.
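
As a quick sanity check that the PATH now picks up the Hadoop binaries:

hadoop version    # should report Hadoop 2.6.4
which hdfs        # should resolve to /data/hadoop/hadoop-2.6.4/bin/hdfs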

4. Edit the Hadoop configuration files

  • Go to the configuration directory

cd /data/hadoop/hadoop-2.6.4/etc/hadoop

  • Edit hadoop-env.sh and point JAVA_HOME at the JDK

# Set Hadoop-specific environment variables here.

# The only required environment variable is JAVA_HOME.  All others are
# optional.  When running a distributed configuration it is best to
# set JAVA_HOME in this file, so that it is correctly defined on
# remote nodes.

# The java implementation to use.
export JAVA_HOME=/data/hadoop/jdk1.8.0_91

  • Register the slaves in the slaves file (etc/hadoop/slaves):

#localhost
slave1
slave2

  • Edit core-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop/tmp/hadoop-master</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>

  • Edit hdfs-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- License header omitted; identical to core-site.xml. -->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:50090</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///data/hadoop/tmp/hdfs/datanode</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///data/hadoop/tmp/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>file:///data/hadoop/tmp/hdfs/namesecondary</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>

**Note: dfs.replication is the number of replicas kept of each HDFS block, not the number of nodes. With two DataNodes in this setup, 2 is the highest value that can actually be satisfied.**
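
The data directories referenced in core-site.xml and hdfs-site.xml above do not exist yet; creating them up front on every node avoids permission surprises (paths taken straight from the configs above):

# run as the hadoop user on master, slave1 and slave2
mkdir -p /data/hadoop/tmp/hadoop-master
mkdir -p /data/hadoop/tmp/hdfs/namenode
mkdir -p /data/hadoop/tmp/hdfs/datanode
mkdir -p /data/hadoop/tmp/hdfs/namesecondary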

  • Edit mapred-site.xml:
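
In a stock 2.6.4 tarball this file only ships as a template, so it may need to be created first:

cd /data/hadoop/hadoop-2.6.4/etc/hadoop
cp mapred-site.xml.template mapred-site.xml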

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- License header omitted; identical to core-site.xml. -->

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.staging.root.dir</name>
    <value>/user</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
</configuration>

  • Edit yarn-site.xml:

<?xml version="1.0"?>
<!-- License header omitted; identical to core-site.xml. -->

<configuration>

<!-- Site specific YARN configuration properties -->

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
</configuration>

5. Pack up everything under the hadoop user's home directory and copy it to slave1 and slave2 (see the sketch below).
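
One way to do this, relying on the passwordless SSH set up in step I.3 (a sketch; rsync would work just as well):

cd /data/hadoop
tar -czf hadoop-home.tar.gz jdk1.8.0_91 hadoop-2.6.4 .bashrc .bash_profile
scp hadoop-home.tar.gz hadoop@slave1:/data/hadoop/
scp hadoop-home.tar.gz hadoop@slave2:/data/hadoop/
# then on each slave: cd /data/hadoop && tar -xzf hadoop-home.tar.gz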

IV. Start and Verify Hadoop

1. Start Hadoop

Run start-all.sh (on the very first start, format the NameNode first; see below).
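
On the very first start, the NameNode has to be formatted on the master. This is a once-only step; reformatting later destroys HDFS metadata. A typical first-start sequence:

hdfs namenode -format    # first start only, on the master
start-all.sh             # or: start-dfs.sh && start-yarn.sh
jps                      # master: NameNode, SecondaryNameNode, ResourceManager
                         # slaves: DataNode, NodeManager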

2. Check that everything is healthy

hdfs dfsadmin -report

[hadoop@master hadoop]$ hdfs dfsadmin -report
Configured Capacity: 20867301376 (19.43 GB)
Present Capacity: 16041099264 (14.94 GB)
DFS Remaining: 15645147136 (14.57 GB)
DFS Used: 395952128 (377.61 MB)
DFS Used%: 2.47%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Live datanodes (2):

Name: 10.0.0.3:50010 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 10433650688 (9.72 GB)
DFS Used: 197976064 (188.80 MB)
Non DFS Used: 2413105152 (2.25 GB)
DFS Remaining: 7822569472 (7.29 GB)
DFS Used%: 1.90%
DFS Remaining%: 74.97%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Jun 23 11:03:48 CST 2016

Name: 10.0.0.4:50010 (slave2)
Hostname: slave2
Decommission Status : Normal
Configured Capacity: 10433650688 (9.72 GB)
DFS Used: 197976064 (188.80 MB)
Non DFS Used: 2413096960 (2.25 GB)
DFS Remaining: 7822577664 (7.29 GB)
DFS Used%: 1.90%
DFS Remaining%: 74.97%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Jun 23 11:03:49 CST 2016

**Note: since the cluster has two DataNodes, seeing both of them listed here means everything is working.**
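
The web UIs give the same picture; with Hadoop 2.x default ports, and assuming the ports are reachable (e.g. opened in the security group), you can browse:

# NameNode UI:        http://master:50070
# ResourceManager UI: http://master:8088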
