Setting Up a Simple Hadoop Cluster on Tencent Cloud Instances

Environment Overview

This environment consists of three machines: master (10.0.0.2), slave1 (10.0.0.3), and slave2 (10.0.0.4).

I. Initialize the Hadoop Environment

1. Create the hadoop account

useradd -d /data/hadoop -u 600 -g root hadoop

# set the password for the hadoop user

passwd hadoop
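
A quick sanity check that the account was created as intended (an extra verification step, not in the original):

id hadoop     # expect uid=600(hadoop) gid=0(root)
su - hadoop   # log in as hadoop; pwd should print /data/hadoop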

2. Change the hostnames

  • Change the hostname of the master to master, and the slaves to slave1 and slave2 respectively.

vi /etc/hostname

master

**Note: if the machine is slave1, enter slave1 here; if it is slave2, enter slave2.**
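
On distributions with systemd (e.g. CentOS 7), you can alternatively set the hostname in one step with hostnamectl; this assumes systemd is present and is not a step from the original:

hostnamectl set-hostname master   # use slave1 / slave2 on the respective nodes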

  • Edit the hosts configuration, and copy the same configuration to the other machines

vi /etc/hosts

10.0.0.2 master

10.0.0.3 slave1

10.0.0.4 slave2

127.0.0.1  localhost  localhost.localdomain
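
After saving /etc/hosts, a quick check that all three names resolve (an added verification step):

ping -c 1 master
ping -c 1 slave1
ping -c 1 slave2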

  • Edit the network configuration

vi /etc/sysconfig/network

# Created by anaconda

NETWORKING=yes

NETWORKING_IPV6=no

HOSTNAME=master

  • Reboot master, slave1, and slave2 so the changes take effect

3. Set up passwordless login

  • Generate a key pair

Run the ssh-keygen -t rsa command and press Enter at every prompt to accept the defaults.

  • Copy the public key to slave1 and slave2

Run ssh-copy-id -i ~/.ssh/id_rsa.pub slave1
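
Repeat the same command for slave2 (and, if you will run the start scripts on master, for master itself), then verify that login is now passwordless:

ssh-copy-id -i ~/.ssh/id_rsa.pub slave2
ssh slave1 hostname   # should print "slave1" without asking for a password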

II. Install Java

1. Download the JDK from the Oracle website.

2. Extract the JDK archive into the installation directory

tar -xzvf jdk-8u91-linux-x64.tar.gz

3. Set the Java environment variables

vi ~/.bash_profile

The .bash_profile file should look like this:

# .bash_profile



# Get the aliases and functions

if [ -f ~/.bashrc ]; then

        . ~/.bashrc

fi



# User specific environment and startup programs



PATH=$PATH:$HOME/.local/bin:$HOME/bin:/data/hadoop/hadoop-2.6.4/share/



export PATH



JAVA_HOME=/data/hadoop/jdk1.8.0_91

CLASSPATH=.:$JAVA_HOME/lib

PATH=$JAVA_HOME/bin:$PATH

export JAVA_HOME CLASSPATH PATH

**Note:**

  1. The CLASSPATH must begin with .: (the current directory); otherwise Java fails with a "Could not find or load main class" error.
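
Reload the profile and confirm the JDK is picked up (assuming the archive was unpacked to /data/hadoop/jdk1.8.0_91 as above):

source ~/.bash_profile
java -version     # should report java version "1.8.0_91"
echo $JAVA_HOME   # should print /data/hadoop/jdk1.8.0_91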

III. Install Hadoop

1. Download Hadoop. We use hadoop-2.6.4.tar.gz, which is a pre-built binary release, so it only needs to be extracted.

2. Extract the archive into the installation directory

tar -xzvf hadoop-2.6.4.tar.gz

3. Set the environment variables

vi ~/.bashrc

# .bashrc



# Source global definitions

if [ -f /etc/bashrc ]; then

        . /etc/bashrc

fi



# Uncomment the following line if you don't like systemctl's auto-paging feature:

# export SYSTEMD_PAGER=



# User specific aliases and functions



export HADOOP_PREFIX=$HOME/hadoop-2.6.4

export HADOOP_COMMON_HOME=$HADOOP_PREFIX

export HADOOP_HDFS_HOME=$HADOOP_PREFIX

export HADOOP_MAPRED_HOME=$HADOOP_PREFIX

export HADOOP_YARN_HOME=$HADOOP_PREFIX

export HADOOP_CONF_DIR=$HADOOP_PREFIX/etc/hadoop

export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin

Run source ~/.bashrc to make the configuration take effect.
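
A quick check that the variables took effect (an added verification step):

hadoop version          # should print Hadoop 2.6.4
echo $HADOOP_CONF_DIR   # should print /data/hadoop/hadoop-2.6.4/etc/hadoop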

4. Edit the Hadoop configuration files

  • Go to the configuration directory

cd /data/hadoop/hadoop-2.6.4/etc/hadoop

  • Edit hadoop-env.sh and set JAVA_HOME to the Java installation path

# distributed under the License is distributed on an "AS IS" BASIS,

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

# See the License for the specific language governing permissions and

# limitations under the License.



# Set Hadoop-specific environment variables here.



# The only required environment variable is JAVA_HOME.  All others are

# optional.  When running a distributed configuration it is best to

# set JAVA_HOME in this file, so that it is correctly defined on

# remote nodes.



# The java implementation to use.

export JAVA_HOME=/data/hadoop/jdk1.8.0_91

  • Register the slaves by listing them in the slaves file (vi slaves), one hostname per line:
#localhost

slave1

slave2

  • Edit core-site.xml
<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!--

  Licensed under the Apache License, Version 2.0 (the "License");

  you may not use this file except in compliance with the License.

  You may obtain a copy of the License at



    http://www.apache.org/licenses/LICENSE-2.0



  Unless required by applicable law or agreed to in writing, software

  distributed under the License is distributed on an "AS IS" BASIS,

  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

  See the License for the specific language governing permissions and

  limitations under the License. See accompanying LICENSE file.

-->



<!-- Put site-specific property overrides in this file. -->



<configuration>

  <property>

    <name>fs.defaultFS</name>

    <value>hdfs://master:9000</value>

  </property>

  <property>

    <name>hadoop.tmp.dir</name>

    <value>/data/hadoop/tmp/hadoop-master</value>

    <description>Abase for other temporary directories.</description>

  </property>

</configuration>
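
Since hadoop.tmp.dir points at /data/hadoop/tmp/hadoop-master, it is safest to create that directory up front on every node so ownership and permissions are correct (an added precaution; Hadoop can usually create it itself):

mkdir -p /data/hadoop/tmp/hadoop-master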

  • Edit hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!--

  Licensed under the Apache License, Version 2.0 (the "License");

  you may not use this file except in compliance with the License.

  You may obtain a copy of the License at



    http://www.apache.org/licenses/LICENSE-2.0



  Unless required by applicable law or agreed to in writing, software

  distributed under the License is distributed on an "AS IS" BASIS,

  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

  See the License for the specific language governing permissions and

  limitations under the License. See accompanying LICENSE file.

-->



<!-- Put site-specific property overrides in this file. -->



<configuration>



   <property>

       <name>dfs.namenode.secondary.http-address</name>

       <value>master:50090</value>

   </property>



  <property>

    <name>dfs.datanode.data.dir</name>

    <value>file:///data/hadoop/tmp/hdfs/datanode</value>

  </property>



  <property>

    <name>dfs.namenode.name.dir</name>

    <value>file:///data/hadoop/tmp/hdfs/namenode</value>

  </property>



  <property>

    <name>dfs.namenode.checkpoint.dir</name>

    <value>file:///data/hadoop/tmp/hdfs/namesecondary</value>

  </property>



  <property>

    <name>dfs.replication</name>

    <value>2</value>

  </property>



</configuration>

**Note: dfs.replication sets the number of replicas kept for each HDFS block (at most one replica per DataNode). Since this example has two slaves, it is set to 2.**
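
The three file:/// paths above must exist and be writable by the hadoop user on every node; they can be created in one line (an added convenience step):

mkdir -p /data/hadoop/tmp/hdfs/{datanode,namenode,namesecondary}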

  • Edit mapred-site.xml
<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!--

  Licensed under the Apache License, Version 2.0 (the "License");

  you may not use this file except in compliance with the License.

  You may obtain a copy of the License at



    http://www.apache.org/licenses/LICENSE-2.0



  Unless required by applicable law or agreed to in writing, software

  distributed under the License is distributed on an "AS IS" BASIS,

  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

  See the License for the specific language governing permissions and

  limitations under the License. See accompanying LICENSE file.

-->



<!-- Put site-specific property overrides in this file. -->



<configuration>



  <property>

    <name>mapreduce.framework.name</name>

    <value>yarn</value>

  </property>



  <property>

    <name>mapreduce.jobtracker.staging.root.dir</name>

    <value>/user</value>

  </property>



  <property>

      <name>mapreduce.jobhistory.address</name>

      <value>master:10020</value>

  </property>

  <property>

      <name>mapreduce.jobhistory.webapp.address</name>

      <value>master:19888</value>

  </property>



</configuration>
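
Note that a fresh Hadoop 2.6.4 distribution ships only mapred-site.xml.template in etc/hadoop; copy it to mapred-site.xml before editing:

cp mapred-site.xml.template mapred-site.xml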

  • Edit yarn-site.xml
<?xml version="1.0"?>

<!--

  Licensed under the Apache License, Version 2.0 (the "License");

  you may not use this file except in compliance with the License.

  You may obtain a copy of the License at



    http://www.apache.org/licenses/LICENSE-2.0



  Unless required by applicable law or agreed to in writing, software

  distributed under the License is distributed on an "AS IS" BASIS,

  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

  See the License for the specific language governing permissions and

  limitations under the License. See accompanying LICENSE file.

-->

<configuration>



<!-- Site specific YARN configuration properties -->



  <property>

    <name>yarn.nodemanager.aux-services</name>

    <value>mapreduce_shuffle</value>

  </property>



  <property>

    <name>yarn.resourcemanager.hostname</name>

    <value>master</value>

  </property>



</configuration>

5. Package everything under the hadoop user's home directory and copy it to slave1 and slave2, as sketched below.
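
One way to do this is to stream a tar archive over the passwordless ssh connection set up in step 3 (a sketch; rsync or scp -r would work equally well):

cd /data
tar -czf - hadoop | ssh slave1 'cd /data && tar -xzf -'
tar -czf - hadoop | ssh slave2 'cd /data && tar -xzf -'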

IV. Start and Verify Hadoop

1. Start Hadoop
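
Before the very first start, the NameNode has to be formatted on master, otherwise HDFS will refuse to come up (a standard first-run step the original omits):

hdfs namenode -format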

Then simply run start-all.sh
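
To confirm the daemons are up, run jps on each node (an added check): on master you should see NameNode, SecondaryNameNode, and ResourceManager; on each slave, DataNode and NodeManager.

jps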

2. Check that the cluster is healthy

hdfs dfsadmin -report

[hadoop@master hadoop]$ hdfs dfsadmin -report

Configured Capacity: 20867301376 (19.43 GB)

Present Capacity: 16041099264 (14.94 GB)

DFS Remaining: 15645147136 (14.57 GB)

DFS Used: 395952128 (377.61 MB)

DFS Used%: 2.47%

Under replicated blocks: 0

Blocks with corrupt replicas: 0

Missing blocks: 0



-------------------------------------------------

Live datanodes (2):



Name: 10.0.0.3:50010 (slave1)

Hostname: slave1

Decommission Status : Normal

Configured Capacity: 10433650688 (9.72 GB)

DFS Used: 197976064 (188.80 MB)

Non DFS Used: 2413105152 (2.25 GB)

DFS Remaining: 7822569472 (7.29 GB)

DFS Used%: 1.90%

DFS Remaining%: 74.97%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 1

Last contact: Thu Jun 23 11:03:48 CST 2016





Name: 10.0.0.4:50010 (slave2)

Hostname: slave2

Decommission Status : Normal

Configured Capacity: 10433650688 (9.72 GB)

DFS Used: 197976064 (188.80 MB)

Non DFS Used: 2413096960 (2.25 GB)

DFS Remaining: 7822577664 (7.29 GB)

DFS Used%: 1.90%

DFS Remaining%: 74.97%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 1

Last contact: Thu Jun 23 11:03:49 CST 2016

**Note: since this cluster has two DataNodes, seeing two live datanodes in this report means everything is working correctly.**
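
You can also confirm the cluster state in a browser via the NameNode web UI at http://master:50070 and the ResourceManager UI at http://master:8088 (the default ports in Hadoop 2.x).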
