
hadoop2: Installing and Testing Hive


Before installing and testing Hive, we need to start all of the Hadoop services.
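A quick, optional way to confirm the daemons are up is jps; on this cluster that should show at least NameNode, DataNode, ResourceManager, and NodeManager (the exact set depends on the deployment; HA clusters also run JournalNode, DFSZKFailoverController, and so on):

# Optional: list the running Hadoop Java daemons
jps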

Before installing Hive, we need to install a MySQL database, which will hold the Hive metastore.

# MySQL installation guide: https://segmentfault.com/a/1190000003049498
# Check whether a system-bundled MySQL is already installed
yum list installed | grep mysql

# Remove the bundled MySQL and its dependencies
yum -y remove mysql-libs.x86_64


# Add the MySQL rpm repository to CentOS and enable a newer release
# (yum-config-manager is provided by the yum-utils package)
wget dev.mysql.com/get/mysql-community-release-el6-5.noarch.rpm
yum localinstall mysql-community-release-el6-5.noarch.rpm
yum repolist all | grep mysql
yum-config-manager --disable mysql55-community
yum-config-manager --disable mysql56-community
yum-config-manager --enable mysql57-community-dmr
yum repolist enabled | grep mysql

# Install the MySQL server
yum install mysql-community-server

# Start MySQL
service mysqld start

# Check whether MySQL starts on boot, and enable it
chkconfig --list | grep mysqld
chkconfig mysqld on

# Find the initial temporary password
grep 'temporary password' /var/log/mysqld.log

# Run the MySQL security setup
mysql_secure_installation

# Make sure MySQL is running
service mysqld start
# Log in
mysql -u root -p
# The password set during mysql_secure_installation
!QAZ2wsx3edc

-- Enable remote access
grant all on *.* to root@'%' identified by '!QAZ2wsx3edc';

select * from mysql.user;

-- Allow access from node1 as well
grant all on *.* to root@'node1' identified by '!QAZ2wsx3edc';

-- Create the hive database, which is needed later; Hive will not create it automatically
create database hive;
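As an optional sanity check (the node1 hostname and the password are the ones used above), you can confirm that the metastore database exists and is reachable over the network:

# Should print 'hive' if the database and the remote grant are in place
mysql -h node1 -u root -p'!QAZ2wsx3edc' -e "show databases like 'hive';"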

Install and configure Hive

# Install Hive
cd ~
tar -zxvf apache-hive-0.13.1-bin.tar.gz

# Create a symlink
ln -sf /root/apache-hive-0.13.1-bin /home/hive

# Create the configuration file from the template
cd /home/hive/conf/

cp -a hive-default.xml.template hive-site.xml

# Start Hive
cd /home/hive/bin/

./hive

# Exit the Hive shell
quit;

# Edit the configuration
cd /home/hive/conf/

vi hive-site.xml

# Properties that need to be changed:
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://node1/hive</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
  <description>username to use against metastore database</description>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>!QAZ2wsx3edc</value>
  <description>password to use against metastore database</description>
</property>

:wq
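After saving, an optional quick check that the four metastore properties were written as intended:

# Each matched <name> line is printed together with the <value> line below it
grep -A 1 'javax.jdo.option.Connection' /home/hive/conf/hive-site.xml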

Add the MySQL JDBC driver

# Copy the MySQL driver into /home/hive/lib/
cp -a mysql-connector-java-5.1.23-bin.jar /home/hive/lib/
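With the driver in place, Hive should now be able to reach the MySQL metastore. A minimal smoke test (hive -e runs a single statement and exits; a JDBC misconfiguration would surface here as a connection error):

# An empty table list means the metastore connection works
cd /home/hive/bin/
./hive -e "show tables;"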

Here I wrote a small Java program that generates the test data file.

GenerateTestFile.java

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.util.Random;

/**
 * @author Hongwei
 * @created 31 Oct 2018
 */
public class GenerateTestFile {

    public static void main(String[] args) throws Exception {
        int num = 20000000;
        File writename = new File("/root/output1.txt");
        System.out.println("begin");
        writename.createNewFile();
        BufferedWriter out = new BufferedWriter(new FileWriter(writename));
        // Reuse a single Random instance instead of allocating one per row.
        Random random = new Random();
        // Each row is "id,nameID,age,Sales" with age uniform in [0, 50).
        // Writing row by row keeps memory use flat instead of buffering the
        // whole ~600 MB file in a StringBuilder first.
        for (int i = 1; i < num; i++) {
            out.write(i + ",name" + i + "," + random.nextInt(50) + ",Sales\n");
        }
        out.flush();
        out.close();
        System.out.println("done........");
    }
}

Compile and run it:

cd
javac GenerateTestFile.java
java GenerateTestFile

This produces /root/output1.txt (about 600 MB), the test file we will load into Hive.
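Before loading it, an optional quick look confirms the id,name,age,dept_name format and the row count:

head -3 /root/output1.txt    # rows look like: 1,name1,<age>,Sales
wc -l /root/output1.txt      # expect 19999999 rows
ls -lh /root/output1.txt     # roughly 600 MB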

Start Hive

# Start Hive
cd /home/hive/bin/
./hive

Create the t_emp2 table

create table t_emp2(
id int,
name string,
age int,
dept_name string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';

Output:

hive> create table t_emp2(
    > id int,
    > name string,
    > age int,
    > dept_name string
    > )
    > ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY ',';
OK
Time taken: 0.083 seconds
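Optionally, describe t_emp2; inside the Hive shell (or hive -e from bash, as below) confirms the schema that was recorded in the metastore:

./hive -e "describe t_emp2;"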

Load the data file

load data local inpath '/root/output1.txt' into table t_emp2;

Output:

hive> load data local inpath '/root/output1.txt' into table t_emp2;
Copying data from file:/root/output1.txt
Copying file: file:/root/output1.txt
Loading data to table default.t_emp2
Table default.t_emp2 stats: [numFiles=1, numRows=0, totalSize=593776998, rawDataSize=0]
OK
Time taken: 148.455 seconds
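To spot-check that the rows were parsed into columns correctly (optional; a plain SELECT with LIMIT is served by a fetch task and typically does not launch a MapReduce job):

./hive -e "select * from t_emp2 limit 3;"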

Test: count the total number of rows in the t_emp2 table:

hive> select count(*) from t_emp2;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1541003514112_0002, Tracking URL = http://node1:8088/proxy/application_1541003514112_0002/
Kill Command = /home/hadoop-2.5/bin/hadoop job  -kill job_1541003514112_0002
Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 1
2018-10-31 09:41:49,863 Stage-1 map = 0%,  reduce = 0%
2018-10-31 09:42:26,846 Stage-1 map = 33%,  reduce = 0%, Cumulative CPU 33.56 sec
2018-10-31 09:42:47,028 Stage-1 map = 44%,  reduce = 0%, Cumulative CPU 53.03 sec
2018-10-31 09:42:48,287 Stage-1 map = 56%,  reduce = 0%, Cumulative CPU 53.79 sec
2018-10-31 09:42:54,173 Stage-1 map = 67%,  reduce = 0%, Cumulative CPU 56.99 sec
2018-10-31 09:42:56,867 Stage-1 map = 78%,  reduce = 0%, Cumulative CPU 57.52 sec
2018-10-31 09:42:58,201 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 58.44 sec
2018-10-31 09:43:16,966 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 60.62 sec
MapReduce Total cumulative CPU time: 1 minutes 0 seconds 620 msec
Ended Job = job_1541003514112_0002
MapReduce Jobs Launched: 
Job 0: Map: 3  Reduce: 1   Cumulative CPU: 60.62 sec   HDFS Read: 593794153 HDFS Write: 9 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 0 seconds 620 msec
OK
19999999
Time taken: 105.013 seconds, Fetched: 1 row(s)

Count the rows where age=20. Ages are drawn uniformly from [0, 50), so we expect roughly 19,999,999 / 50 ≈ 400,000 matches:

hive> select count(*) from t_emp2 where age=20;
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1541003514112_0003, Tracking URL = http://node1:8088/proxy/application_1541003514112_0003/
Kill Command = /home/hadoop-2.5/bin/hadoop job  -kill job_1541003514112_0003
Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 1
2018-10-31 09:44:28,452 Stage-1 map = 0%,  reduce = 0%
2018-10-31 09:44:45,102 Stage-1 map = 11%,  reduce = 0%, Cumulative CPU 5.54 sec
2018-10-31 09:44:49,318 Stage-1 map = 33%,  reduce = 0%, Cumulative CPU 7.63 sec
2018-10-31 09:45:14,247 Stage-1 map = 44%,  reduce = 0%, Cumulative CPU 13.97 sec
2018-10-31 09:45:15,274 Stage-1 map = 67%,  reduce = 0%, Cumulative CPU 14.99 sec
2018-10-31 09:45:41,594 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 18.7 sec
2018-10-31 09:45:50,973 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 26.08 sec
MapReduce Total cumulative CPU time: 26 seconds 80 msec
Ended Job = job_1541003514112_0003
MapReduce Jobs Launched: 
Job 0: Map: 3  Reduce: 1   Cumulative CPU: 33.19 sec   HDFS Read: 593794153 HDFS Write: 7 SUCCESS
Total MapReduce CPU Time Spent: 33 seconds 190 msec
OK
399841
Time taken: 98.693 seconds, Fetched: 1 row(s)

========================================================

More reading, and English is important.

I'm Hongten
