
Hive-1.2.1_02: Basic Operations and Access Methods

1. Displaying the Current Database in the Hive CLI

1. Configure on the fly: run the set command in the CLI when needed. This takes effect for the current session only and is lost once the CLI is reopened:
hive> set hive.cli.print.current.db=true;

2. Configure once, effective permanently: create a .hiverc file in the current user's HOME directory and put the set command in it; the file is executed every time the CLI starts.
[yun@mini01 ~]$ pwd
/app
[yun@mini01 ~]$ cat .hiverc
set hive.cli.print.current.db=true;

3. Add the property to the Hive configuration file [recommended]; the Hive configuration in the previous article already includes this entry.
  <!-- Display the database currently in use -->
  <property>
    <name>hive.cli.print.current.db</name>
    <value>true</value>
    <description>Whether to include the current database in the Hive prompt.</description>
  </property>
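The second option above is easy to automate from a provisioning script. A minimal Python sketch (the helper name `ensure_hiverc` is mine, not part of Hive; the real file lives in the user's HOME directory):

```python
from pathlib import Path

def ensure_hiverc(home: Path,
                  setting: str = "set hive.cli.print.current.db=true;") -> Path:
    """Append the setting to <home>/.hiverc if it is not already there.

    Mirrors option 2 above; pass Path.home() for the real user HOME.
    """
    hiverc = home / ".hiverc"
    existing = hiverc.read_text() if hiverc.exists() else ""
    if setting not in existing:
        with hiverc.open("a") as f:
            f.write(setting + "\n")
    return hiverc
```

Calling `ensure_hiverc(Path.home())` twice leaves a single copy of the line, so repeated runs are safe.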

2. Creating a Database

# The current database is not displayed yet
[yun@mini01 ~]$ hive

Logging initialized using configuration in jar:file:/app/hive-1.2.1/lib/hive-common-1.2.1.jar!/hive-log4j.properties
hive> show databases;  # the default database is "default"
OK
default
Time taken: 0.774 seconds, Fetched: 1 row(s)
# Create a database
hive> create database zhang;
OK
Time taken: 0.168 seconds
hive> show databases;
OK
default
zhang
Time taken: 0.02 seconds, Fetched: 2 row(s)

Browser view (HDFS web UI)

3. Creating a Table

# The current database is now displayed by default
hive (default)> show databases;
OK
default
zhang
Time taken: 0.729 seconds, Fetched: 2 row(s)
hive (default)> use zhang;
OK
Time taken: 0.036 seconds
hive (zhang)> create table t_sz01(id int, name string)
            > row format delimited
            > fields terminated by ',';
OK
Time taken: 0.187 seconds
hive (zhang)> show tables;
OK
t_sz01
Time taken: 0.031 seconds, Fetched: 1 row(s)

Browser view (HDFS web UI)

4. Creating and Uploading Data

[yun@mini01 hive]$ cat sz.dat 
1,zhangsan
5,李四
3,wangwu
2,赵六
4,zhouqi
6,孙八
[yun@mini01 hive]$ hadoop fs -put sz.dat /user/hive/warehouse/zhang.db/t_sz01   # upload
[yun@mini01 hive]$ hadoop fs -ls /user/hive/warehouse/zhang.db/t_sz01/
Found 1 items
-rw-r--r--   2 yun supergroup         56 2018-07-03 21:27 /user/hive/warehouse/zhang.db/t_sz01/sz.dat
[yun@mini01 hive]$ hadoop fs -cat /user/hive/warehouse/zhang.db/t_sz01/sz.dat
1,zhangsan
5,李四
3,wangwu
2,赵六
4,zhouqi
6,孙八
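Because the table was declared with `row format delimited fields terminated by ','`, Hive reads each line of the uploaded file as one row and splits it on the comma. A minimal Python sketch of that mapping (the parser is an illustration, not Hive's actual SerDe; Hive, for instance, yields NULL for a malformed int field instead of raising):

```python
# Sample lines copied from sz.dat above.
sample = """1,zhangsan
5,李四
3,wangwu
2,赵六
4,zhouqi
6,孙八"""

def parse_rows(text, delimiter=","):
    """Split each line on the first delimiter into an (id, name) pair."""
    rows = []
    for line in text.splitlines():
        id_field, name = line.split(delimiter, 1)
        rows.append((int(id_field), name))
    return rows

print(parse_rows(sample)[0])  # (1, 'zhangsan')
```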

5. Querying Data in Hive

hive (zhang)> show tables;
OK
t_sz01
Time taken: 0.028 seconds, Fetched: 1 row(s)
hive (zhang)> select * from t_sz01;  # full-table query
OK
1    zhangsan
5    李四
3    wangwu
2    赵六
4    zhouqi
6    孙八
Time taken: 0.264 seconds, Fetched: 6 row(s)
hive (zhang)> select count(1) from t_sz01;  # row count
Query ID = yun_20180703213443_ebca743c-2025-405a-8832-59359e1566c2
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1530619476293_0001, Tracking URL = http://mini02:8088/proxy/application_1530619476293_0001/
Kill Command = /app/hadoop/bin/hadoop job  -kill job_1530619476293_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2018-07-03 21:34:53,810 Stage-1 map = 0%,  reduce = 0%
2018-07-03 21:35:00,224 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.5 sec
2018-07-03 21:35:07,882 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 6.37 sec
MapReduce Total cumulative CPU time: 6 seconds 370 msec
Ended Job = job_1530619476293_0001
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 6.37 sec   HDFS Read: 6513 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 6 seconds 370 msec
OK
6
Time taken: 25.312 seconds, Fetched: 1 row(s)
hive (zhang)> select id,name from t_sz01 where id >3;  # rows with id > 3
OK
5    李四
4    zhouqi
6    孙八
Time taken: 0.126 seconds, Fetched: 3 row(s)
hive (zhang)> select id,name from t_sz01 where id >3 limit 2;  # limit m,n is not supported in this version
OK
5    李四
4    zhouqi
Time taken: 0.072 seconds, Fetched: 2 row(s)
hive (zhang)> select id, name from t_sz01 order by name;  # sort with order by
Query ID = yun_20180703214314_db222afe-3287-4c8e-8077-73aa4fec62ef
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1530619476293_0002, Tracking URL = http://mini02:8088/proxy/application_1530619476293_0002/
Kill Command = /app/hadoop/bin/hadoop job  -kill job_1530619476293_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2018-07-03 21:43:25,676 Stage-1 map = 0%,  reduce = 0%
2018-07-03 21:43:34,166 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.64 sec
2018-07-03 21:43:40,606 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4.85 sec
MapReduce Total cumulative CPU time: 4 seconds 850 msec
Ended Job = job_1530619476293_0002
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 4.85 sec   HDFS Read: 5789 HDFS Write: 74 SUCCESS
Total MapReduce CPU Time Spent: 4 seconds 850 msec
OK
3    wangwu
1    zhangsan
4    zhouqi
6    孙八
5    李四
2    赵六
Time taken: 26.768 seconds, Fetched: 6 row(s)

MapReduce job information can be viewed at:

http://mini02:8088
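The results above can be reproduced over the same six rows. A minimal Python sketch of what the count, filter, limit, and order-by queries compute (only count(1) and order by launch a MapReduce job in Hive; the simple filters are served by a direct file scan). Python sorts strings by code point and Hive compares UTF-8 bytes, and since UTF-8 byte order matches code-point order, the sort below agrees with Hive's output:

```python
# The six rows from t_sz01, as (id, name) pairs.
rows = [(1, "zhangsan"), (5, "李四"), (3, "wangwu"),
        (2, "赵六"), (4, "zhouqi"), (6, "孙八")]

count = len(rows)                           # select count(1) from t_sz01
gt3 = [(i, n) for i, n in rows if i > 3]    # where id > 3 (input order preserved)
first_two = gt3[:2]                         # ... limit 2
by_name = sorted(rows, key=lambda r: r[1])  # order by name

print(count)      # 6
print(first_two)  # [(5, '李四'), (4, 'zhouqi')]
```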

6. Hive Access Methods

6.1. Hive Interactive Shell

# Environment variables were configured earlier
[yun@mini01 ~]$ hive

6.2. Hive Thrift Service

  How to start it (on mini01, for example):

# Environment variables were configured earlier
Start in the foreground: hiveserver2
Start in the background: nohup hiveserver2 1>/app/hive/logs/hiveserver.log 2>/app/hive/logs/hiveserver.err &
# Create the /app/hive/logs directory if it does not exist

  Once it has started, you can connect from other nodes with beeline.

Method 1

# beeline is not installed on any other machine, so we connect from this host
[yun@mini01 bin]$ beeline 
Beeline version 1.2.1 by Apache Hive
beeline> !connect jdbc:hive2://mini01:10000   # JDBC connection; the host part can be mini01, 127.0.0.1, 10.0.0.11, or 172.16.1.11
Connecting to jdbc:hive2://mini01:10000
Enter username for jdbc:hive2://mini01:10000: yun
Enter password for jdbc:hive2://mini01:10000: 
Connected to: Apache Hive (version 1.2.1)
Driver: Hive JDBC (version 1.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://mini01:10000>

Method 2

# Or connect directly at startup:
[yun@mini01 ~]$ beeline -u jdbc:hive2://mini01:10000 -n yun 

From here you can run normal SQL queries.

For example:

0: jdbc:hive2://mini01:10000> show databases;
+----------------+--+
| database_name  |
+----------------+--+
| default        |
| zhang          |
+----------------+--+
2 rows selected (0.437 seconds)
0: jdbc:hive2://mini01:10000> use zhang;
No rows affected (0.058 seconds)
0: jdbc:hive2://mini01:10000> show tables;
+-----------+--+
| tab_name  |
+-----------+--+
| t_sz01    |
+-----------+--+
1 row selected (0.054 seconds)
0: jdbc:hive2://mini01:10000> select * from t_sz01;
+------------+--------------+--+
| t_sz01.id  | t_sz01.name  |
+------------+--------------+--+
| 1          | zhangsan     |
| 5          | 李四         |
| 3          | wangwu       |
| 2          | 赵六         |
| 4          | zhouqi       |
| 6          | 孙八         |
+------------+--------------+--+
6 rows selected (0.641 seconds)
0: jdbc:hive2://10.0.0.11:10000> select count(1) from t_sz01;  # row count
INFO  : Number of reduce tasks determined at compile time: 1
INFO  : In order to change the average load for a reducer (in bytes):
INFO  :   set hive.exec.reducers.bytes.per.reducer=<number>
INFO  : In order to limit the maximum number of reducers:
INFO  :   set hive.exec.reducers.max=<number>
INFO  : In order to set a constant number of reducers:
INFO  :   set mapreduce.job.reduces=<number>
INFO  : number of splits:1
INFO  : Submitting tokens for job: job_1530619476293_0003
INFO  : The url to track the job: http://mini02:8088/proxy/application_1530619476293_0003/
INFO  : Starting Job = job_1530619476293_0003, Tracking URL = http://mini02:8088/proxy/application_1530619476293_0003/
INFO  : Kill Command = /app/hadoop/bin/hadoop job  -kill job_1530619476293_0003
INFO  : Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
INFO  : 2018-07-03 22:58:43,405 Stage-1 map = 0%,  reduce = 0%
INFO  : 2018-07-03 22:58:49,882 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.56 sec
INFO  : 2018-07-03 22:58:57,815 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 5.28 sec
INFO  : MapReduce Total cumulative CPU time: 5 seconds 280 msec
INFO  : Ended Job = job_1530619476293_0003
+------+--+
| _c0  |
+------+--+
| 6    |
+------+--+
1 row selected (25.433 seconds)

6.3. hive -e "HiveQL"

Suited for scripting.

[yun@mini01 ~]$ hive -e "use exercise; select * from student;"

Logging initialized using configuration in jar:file:/app/hive-1.2.1/lib/hive-common-1.2.1.jar!/hive-log4j.properties
OK
Time taken: 1.109 seconds
OK
95001    李勇    男    20    CS
95002    刘晨    女    19    IS
95003    王敏    女    22    MA
95004    张立    男    19    IS
95005    刘刚    男    18    MA
95006    孙庆    男    23    CS
95007    易思玲    女    19    MA
95008    李娜    女    18    CS
95009    梦圆圆    女    18    MA
95010    孔小涛    男    19    CS
Time taken: 0.786 seconds, Fetched: 10 row(s)
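When `hive -e` is driven from another program rather than typed by hand, it helps to build the argv explicitly and skip the shell entirely. A minimal Python sketch (the helper names are mine; `-S` is Hive's silent flag, which suppresses the OK/Time taken noise so stdout carries only the query results):

```python
import subprocess

def build_hive_e(sql, hive_bin="hive"):
    """Argv for `hive -S -e "<sql>"`; no quoting issues since no shell is involved."""
    return [hive_bin, "-S", "-e", sql]

def run_hive_e(sql):
    # Requires a working Hive install on PATH; returns a CompletedProcess
    # whose .stdout holds the tab-separated query results.
    return subprocess.run(build_hive_e(sql), capture_output=True, text=True)

print(build_hive_e("use exercise; select * from student;"))
# ['hive', '-S', '-e', 'use exercise; select * from student;']
```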

6.4. hive -f 'test.sql'

  Suited for having Hive execute a script file that consists entirely of HiveQL statements.

# aztest.sql  (script name)
CREATE DATABASE IF NOT EXISTS azkaban;
use azkaban;
DROP TABLE IF EXISTS aztest;
DROP TABLE IF EXISTS azres;
create table aztest(id int,name string) row format delimited fields terminated by ',';
load data inpath '/aztest/hiveinput/azkb.dat' into table aztest;
create table azres row format delimited fields terminated by '#' as select * from aztest;
insert overwrite directory '/aztest/hiveoutput' select count(1) from aztest;

