版权声明:本文为博主原创文章,未经博主允许不得转载。 https://cloud.tencent.com/developer/article/1433190
Hive从0.14版本开始支持事务和行级更新,但缺省是不支持的,需要一些附加的配置。要想支持行级insert、update、delete,需要配置Hive支持事务。
一、Hive具有ACID语义事务的使用场景
二、配置Hive支持事务(Hive 2.0版)
<property>
<name>hive.support.concurrency</name>
<value>true</value>
</property>
<property>
<name>hive.exec.dynamic.partition.mode</name>
<value>nonstrict</value>
</property>
<property>
<name>hive.txn.manager</name>
<value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
<name>hive.compactor.initiator.on</name>
<value>true</value>
</property>
<property>
<name>hive.compactor.worker.threads</name>
<value>1</value>
</property>
INSERT INTO NEXT_LOCK_ID VALUES(1);
INSERT INTO NEXT_COMPACTION_QUEUE_ID VALUES(1);
INSERT INTO NEXT_TXN_ID VALUES(1);
COMMIT;
说明:初始时这三个表没有数据,如果不添加数据,会报以下错误:
org.apache.hadoop.hive.ql.lockmgr.DbTxnManager FAILED: Error in acquiring locks: Error communicating with the metastore
三、测试
$HADOOP_HOME/sbin/start-dfs.sh
$HADOOP_HOME/sbin/start-yarn.sh
~/mysql/bin/mysqld &
use test;
create table t1(id int, name string)
clustered by (id) into 8 buckets
stored as orc TBLPROPERTIES ('transactional'='true');
说明:建表语句必须带有into buckets子句和stored as orc TBLPROPERTIES ('transactional'='true')子句,并且不能带有sorted by子句。
insert into t1 values (1,'aaa');
insert into t1 values (2,'bbb');
update t1 set name='ccc' where id=1;
delete from t1 where id=2;
执行结果分别如图1-3所示。
图1
图2
图3
说明:不能修改bucket列的值,否则会报以下错误:
FAILED: SemanticException Error 10302: Updating values of bucketing columns is not supported. Column id.
-- 建立非分区表并加载数据
CREATE TABLE t1 (id INT, name STRING, cty STRING, st STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
LOAD DATA LOCAL INPATH '/home/grid/a.txt' INTO TABLE t1;
SELECT * FROM t1;
-- 建立外部分区事务表并加载数据
CREATE EXTERNAL TABLE t2 (id INT, name STRING) PARTITIONED BY (country STRING, state STRING)
CLUSTERED BY (id) INTO 8 BUCKETS
STORED AS ORC TBLPROPERTIES ('transactional'='true');
INSERT INTO T2 PARTITION (country, state) SELECT * FROM T1;
SELECT * FROM t2;
-- 修改数据
INSERT INTO TABLE t2 PARTITION (country, state) VALUES (5,'刘','DD','DD');
UPDATE t2 SET name='张' WHERE id=1;
DELETE FROM t2 WHERE name='李四';
SELECT * FROM t2;
修改前和修改后的数据分别如图4、图5所示。
图4
图5
说明:不能update分区键,否则会报以下错误:
FAILED: SemanticException Error 10292: Updating values of partition columns is not supported
四、参考
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Delete
http://unmeshasreeveni.blogspot.in/2014/11/updatedeleteinsert-in-hive-0140.html