Setting Up a Fully Distributed Hadoop Cluster on Tencent Cloud

Preface

"What is learned from books is always shallow; to truly grasp a thing you must practice it yourself."

This series walks through big data operations on Tencent Cloud in a systematic way. In today's Internet era we stand on the shoulders of giants: almost any technique is a quick search away. Yet the articles scattered across the web are fragmentary, lacking the systematic coverage, convenience, and focus that cloud big data practitioners need. Adapting to the Tencent Cloud platform therefore costs a great deal of time and effort, and development costs rise sharply, which matters greatly for production projects.

This article was written to fill that gap, offering selective, hands-on walkthroughs on the Tencent Cloud platform. I hope to join forces with like-minded readers to tackle these challenges and advance cloud computing and big data together.

Mind map of the setup process covered in this article (figure).

I. What You Need Before Setup

  1. Three Tencent Cloud servers of the same type in the same availability zone, running 64-bit CentOS 6.5; scale the configuration up or down as needed (see the figure below).
  2. Mapping of Tencent Cloud hosts to cluster nodes and their roles (figure).
  3. Cluster plan (figure).

II. Create the hadoop User

1. Add a hadoop group:

groupadd hadoop

2. Create the hadoop user and add it to the hadoop group:

useradd -m -g hadoop hadoop

3. Set the hadoop user's password to hadoop: passwd hadoop
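To confirm the user and group were created correctly, you can check with id (the numeric IDs shown are illustrative and will vary):

id hadoop
# e.g. uid=500(hadoop) gid=500(hadoop) groups=500(hadoop)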

III. Install Java and Configure Environment Variables

  1. Download the Java package, version 1.8.0_131. Weiyun download link: http://url.cn/49Sxz1E
  2. Upload the package to the server with an FTP tool, or fetch it directly with wget; the details are not covered here.
  3. Install Java: rpm -ivh java.rpm
  4. Verify the installation succeeded: java -version
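If the RPM installed correctly, the output should look roughly like the following (the exact build string may differ):

java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)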

IV. Configure the .bashrc File

vi /home/hadoop/.bashrc

export JAVA_HOME=/usr/java/jdk1.8.0_131
export HADOOP_HOME=/home/hadoop/bigdata/hadoop
export HADOOP_USER_NAME=hadoop
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
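For the new variables to take effect in the current shell, reload the file and spot-check one of them:

source /home/hadoop/.bashrc
echo $HADOOP_HOME    # should print /home/hadoop/bigdata/hadoop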

V. Configure the master Node and SSH

  1. Edit the hosts file so the three nodes can resolve each other by name: vim /etc/hosts (a sketch of the expected entries follows this list)
  2. Edit the SSH daemon config and uncomment the indicated lines: vim /etc/ssh/sshd_config
  3. Restart the sshd service: service sshd restart
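A minimal sketch of both edits, assuming private IPs 10.0.0.10 through 10.0.0.12 (hypothetical; substitute the actual private IPs shown in your Tencent Cloud console). The sshd_config lines are the public-key options typically uncommented on CentOS 6:

# /etc/hosts
10.0.0.10 master
10.0.0.11 slave01
10.0.0.12 slave02

# /etc/ssh/sshd_config — uncomment these lines
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys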

VI. Upload and Configure Hadoop

  1. Upload the Hadoop archive to the home directory.
  2. Extract the archive: tar -zxf hadoop-2.7.1.tar.gz
  3. Create a bigdata directory under home: mkdir bigdata
  4. Move the extracted hadoop-2.7.1 directory into bigdata: mv hadoop-2.7.1 bigdata/
  5. Change into bigdata: cd bigdata
  6. Rename hadoop-2.7.1 to hadoop: mv hadoop-2.7.1 hadoop
  7. Edit the Hadoop configuration files (under hadoop/etc/hadoop) as follows:
  • core-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/bigdata/data/hadoop/tmp</value>
  </property>
</configuration>
  • hdfs-site.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:9001</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoop/bigdata/data/hadoop/hdfs/datanode</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoop/bigdata/data/hadoop/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
  • mapred-site.xml (if it does not exist, create it by copying mapred-site.xml.template):

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
  • yarn-site.xml:

<?xml version="1.0"?>
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
  • slaves — list the worker hostnames, one per line:

slave01
slave02
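The configuration above points at data directories that do not exist yet; creating them ahead of time, as the hadoop user on every node, avoids permission surprises at startup. A sketch using the paths from the files above:

mkdir -p /home/hadoop/bigdata/data/hadoop/tmp
mkdir -p /home/hadoop/bigdata/data/hadoop/hdfs/namenode
mkdir -p /home/hadoop/bigdata/data/hadoop/hdfs/datanode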

VII. Create an Image to Provision Multiple Servers Quickly

  1. Shut down the master server and create a custom image from it.
  2. After confirming, wait a few minutes for the image to finish building, then start master again.
  3. Reinstall slave01 and slave02 from the new image.

VIII. Configure Hostnames and Passwordless SSH

  1. Set the machine's hostname (edit the HOSTNAME line): vim /etc/sysconfig/network
  2. Reboot the server: reboot
  3. Switch to the hadoop user: su - hadoop
  4. Go to the home directory: cd ~
  5. Generate a key pair, pressing Enter at every prompt: ssh-keygen -t rsa
  6. Change into the .ssh directory: cd .ssh/
  7. Append the contents of id_rsa.pub to an authorized_keys file: cat id_rsa.pub >> authorized_keys
  8. From master, copy authorized_keys into the hadoop home .ssh directory on slave01: scp authorized_keys hadoop@slave01:/home/hadoop/.ssh/
  9. From master, copy authorized_keys into the hadoop home .ssh directory on slave02: scp authorized_keys hadoop@slave02:/home/hadoop/.ssh/
  10. Set permissions: the hadoop home directory should be 755 or 700, the .ssh directory 755, and id_rsa.pub and authorized_keys 644 (a chmod sketch follows this section).

  11. Verify passwordless login with ssh plus the hostname, e.g. ssh slave01
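A minimal sketch of step 10's permission settings, run as the hadoop user on every node:

chmod 700 /home/hadoop                 # 755 also works
chmod 755 /home/hadoop/.ssh
chmod 644 /home/hadoop/.ssh/id_rsa.pub /home/hadoop/.ssh/authorized_keys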

IX. Start Hadoop

  1. Format the NameNode: hadoop namenode -format. If the command cannot be found, check your environment variables, or run ./hadoop namenode -format from the bin directory instead.
  2. Start the Hadoop daemons:
  3. Change into the sbin directory: cd /home/hadoop/bigdata/hadoop/sbin
  4. Start all daemons: sh start-all.sh
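Note that on Hadoop 2.x, start-all.sh still works but prints a deprecation notice; starting HDFS and YARN separately is equivalent and gives clearer output. A sketch:

sh start-dfs.sh     # starts NameNode, SecondaryNameNode, and the DataNodes
sh start-yarn.sh    # starts ResourceManager and the NodeManagers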

X. Verify That Hadoop Is Running

  1. Check the running Java processes: jps

On master, four processes should be running; a sketch of the expected output appears below.

  2. Log in to each slave node and verify its processes as well: ssh slave01
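Roughly what jps should report on each node type when the cluster is healthy (process IDs omitted; the exact list is an assumption based on this article's master/slave layout):

# on master
NameNode
SecondaryNameNode
ResourceManager
Jps

# on slave01 / slave02
DataNode
NodeManager
Jps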

Coming Up

In the next article, I will cover how to install and configure Hive on Tencent Cloud.
