前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >Hive 编程专题之 - 表值函数

Hive 编程专题之 - 表值函数

作者头像
Lenis
发布2019-12-25 12:37:53
7950
发布2019-12-25 12:37:53
举报
文章被收录于专栏:有关SQL有关SQL

环境:

代码语言:javascript
复制
Hive: 2.7.7
Oracle SQL Developer
Cloudera JDBC Driver

案例 - 1 : 生成多列

代码语言:javascript
复制
select  explode(deductions) as (type,vol)
from default.employee 

结果:

image

错误:

1 - 环境部署

代码语言:javascript
复制
create table account(accountid int, account array)
代码语言:javascript
复制
在行: 3 上开始执行命令时出错 -
create table account(accountid int, account array)
错误报告 -
[Cloudera][HiveJDBCDriver](500051) ERROR processing query/statement. Error Code: 40000, SQL state: TStatus(statusCode:ERROR_STATUS, infoMessages:[*org.apache.hive.service.cli.HiveSQLException:Error while compiling statement: FAILED: ParseException line 1:49 mismatched input ')' expecting < near 'array' in list type:17:16, org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:380, org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:206, org.apache.hive.service.cli.operation.SQLOperation:runInternal:SQLOperation.java:290, org.apache.hive.service.cli.operation.Operation:run:Operation.java:320, org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementInternal:HiveSessionImpl.java:530, org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementAsync:HiveSessionImpl.java:517, org.apache.hive.service.cli.CLIService:executeStatementAsync:CLIService.java:310, org.apache.hive.service.cli.thrift.ThriftCLIService:ExecuteStatement:ThriftCLIService.java:530, org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1437, org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1422, org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39, org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39, org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56, org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286, java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1142, java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:617, java.lang.Thread:run:Thread.java:745, *org.apache.hadoop.hive.ql.parse.ParseException:line 1:49 mismatched input ')'
expecting < near 'array' in list type:22:6, org.apache.hadoop.hive.ql.parse.ParseDriver:parse:ParseDriver.java:211, org.apache.hadoop.hive.ql.parse.ParseUtils:parse:ParseUtils.java:77, org.apache.hadoop.hive.ql.parse.ParseUtils:parse:ParseUtils.java:70, org.apache.hadoop.hive.ql.Driver:compile:Driver.java:468, org.apache.hadoop.hive.ql.Driver:compileInternal:Driver.java:1317, org.apache.hadoop.hive.ql.Driver:compileAndRespond:Driver.java:1295, org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:204], sqlState:42000, errorCode:40000, errorMessage:Error while compiling statement: FAILED: ParseException line 1:49 mismatched input ')' expecting < near 'array' in list type), Query: create table account(accountid int, account array).

解决方法:

代码语言:javascript
复制
create table account(accountid int, account array<string>)

array, map, struct 都有自己的定义方法,而他们的调用方法基本都相同:

调用方法:

代码语言:javascript
复制
array('Ali','Tencent')
map('company','ali','city','hangzhou')
struct('ZheJiang','Hangzhou','Ali')

定义数据类型:

代码语言:javascript
复制
create table employee(
    [name] string,
    salary float,
    subordinates array<string>,
    deductions map<string,float>,
    address struct<street:string,city:string,state:string,zip:int> ) ;

异常:需要输入绑定

image

最终发现,执行脚本与执行语句,还是有些不一样的地方。

执行语句的时候 :var 是代表一个参数,所以需要输入参数值

2 - 生成测试数据

代码语言:javascript
复制
insert into account(accountid,account) values(2,array('Ali.com','Tencent'))


在行: 6 上开始执行命令时出错 -
insert into account(accountid,account) values(2,array('Ali.com','Tencent'))
错误位于命令行: 6 列: 1
错误报告 -
SQL 错误: [Cloudera][HiveJDBCDriver](500051) ERROR processing query/statement. Error Code: [Cloudera][JSQLEngine](12000) The column "account" could not be found., SQL state: HY000, Query: insert into account(accountid,account) values(2,array('Ali.com','Tencent')).

其实是 array() 不能用在 insert … values… 这种语句中,换成如下便可:

代码语言:javascript
复制
create table employee(
        name string
    ,   salary float  
    ,   subordinates array<string>
)    

insert into employee(name,salary,subordinates) 
select 'ali',320,array('ali','acai','ayun') 
from account 
limit 1

完整的 Array, Map, Struct 调用方法如下:

代码语言:javascript
复制
insert into default.employee (name,salary,subordinates,deductions,address)
select 'ali'
    ,320
    ,array('ali','acai','ayun') as sub
    , map('ali',1,'acai',2,'ayun',7) as ded
    ,named_struct('street',"zhejiang",  'city',"hangzhou",  'state',"hubin",  'zip',"201210")
from student 
limit 1

值得注意的地方,struct() 与 named_struct() 的不同:

代码语言:javascript
复制
struct("zhejiang","hangzhou","hubin","201210")

会被编译成这样的数据结构:

代码语言:javascript
复制
struct("col0":"zhejiang","col1":"hangzhou","col2":"hubin","col3":"201210")

但 named_struct 的节点就被定死了:

代码语言:javascript
复制
named_struct('street',"zhejiang",  'city',"hangzhou",  'state',"hubin",  'zip',"201210")

编译过后就是:

代码语言:javascript
复制
struct('street',"zhejiang",  'city',"hangzhou",  'state',"hubin",  'zip',"201210")

而这种结构正符合表中定义的数据类型。

案例 - 2 : 生成多行

代码语言:javascript
复制
select  explode(subordinates) as subs
from default.employee 

image

本文参与 腾讯云自媒体分享计划,分享自微信公众号。
原始发表:2018-11-16,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 有关SQL 微信公众号,前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体分享计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档