前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >大数据问题排查系列 - HIVE踩坑记

大数据问题排查系列 - HIVE踩坑记

作者头像
数据社
发布2021-09-23 11:21:30
1.8K0
发布2021-09-23 11:21:30
举报
文章被收录于专栏:数据社数据社

前言

本片博文是“大数据线上问题排查系列”大类别之一,讲述前段时间我司某产品在某券商遇到的一个问题及解决方案,其背后涉及到 hive 的一个 BUG,在 hive 3.0 才修复。以下是正文。

问题现象

cdh6.2.1中,开启 kerberos 和 sentry 的hive 中,使用 dml 语句 insert overwrite 插入数据到分区表的新分区时,会报错。同时注意到:

  • 查看底层 hdfs 发现对应的目录和数据都有了;
  • 使用 show partitions xxx 查看不到对应的新分区;
  • 使用 select 语句也查不到该新分区下的数据;
  • 只要是插入数据到新的分区,不管使用静态分区还是动态分区模式,都会报错;

问题分析

为确定问题,首先尝试了不同场景,发现如下现象:

  • 当使用 insert overwrite 插入数据到分区表的已有分区时,不会报错;(不管该分区下有没有数据);
  • 使用 insert into 插入数据到分区表的新分区时,不会报错;
  • 对于非分区表,insert overwrite 和 insert into 都不会报错;
  • 当使用insert overwrite 插入数据到分区表的新分区报错后,可以使用 msck repair test0317 修复hive metastore中相关元数据,修复后 select, show partitions 等语句可以查询到分区和该分区的底层数据,没有问题;

为进一步分析问题,常看了相关日志,包括客户端日志,hiveserver2 日志和 hive metastore 日志分别如下所示。

客户端日志(beeline)

代码语言:javascript
复制
INFO  : Loading data to table apollo_ods_jzfix.test0317 partition (ptdate=1) from hdfs://dev-dw-nn01:8020/user/hive/warehouse/apollo_ods_jzfix.db/test0317/ptdate=1/.hive-staging_hive_2021-03-17_15-09-13_232_1543365768355672834-7333/-ext-10000
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. org.apache.thrift.TApplicationException: Internal error processing fire_listener_event
INFO  : MapReduce Jobs Launched: 
INFO  : Stage-Stage-1: Map: 1   Cumulative CPU: 2.56 sec   HDFS Read: 5344 HDFS Write: 609 HDFS EC Read: 0 SUCCESS
INFO  : Total MapReduce CPU Time Spent: 2 seconds 560 msec
INFO  : Completed executing command(queryId=hive_20210317150913_9d734c54-f0cf-4dc7-9117-bc7f59c2cb61); Time taken: 17.758 seconds
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. org.apache.thrift.TApplicationException: Internal error processing fire_listener_event (state=08S01,code=1)

hive metastore 日志

代码语言:javascript
复制
2021-03-17 15:09:30,039 INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-9-thread-122]: ugi=hive/dev-dw-nn01@GYDW.COM	ip=10.2.91.100	cmd=get_table : db=apollo_ods_jzfix tbl=test0317	
2021-03-17 15:09:30,044 INFO  org.apache.hadoop.hive.metastore.HiveMetaStore: [pool-9-thread-122]: 121: get_partition : db=apollo_ods_jzfix tbl=test0317[1]
2021-03-17 15:09:30,044 INFO  org.apache.hadoop.hive.metastore.HiveMetaStore.audit: [pool-9-thread-122]: ugi=hive/dev-dw-nn01@GYDW.COM	ip=10.2.91.100	cmd=get_partition : db=apollo_ods_jzfix tbl=test0317[1]	
2021-03-17 15:09:30,053 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-9-thread-122]: NoSuchObjectException(message:partition values=[1])
	at org.apache.hadoop.hive.metastore.ObjectStore.getPartition(ObjectStore.java:2003)
	at sun.reflect.GeneratedMethodAccessor97.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101)
	at com.sun.proxy.$Proxy26.getPartition(Unknown Source)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition(HiveMetaStore.java:3553)
	at org.apache.hadoop.hive.metastore.events.InsertEvent.<init>(InsertEvent.java:62)
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.fire_listener_event(HiveMetaStore.java:6737)
	at sun.reflect.GeneratedMethodAccessor100.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140)
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99)
	at com.sun.proxy.$Proxy28.fire_listener_event(Unknown Source)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$fire_listener_event.getResult(ThriftHiveMetastore.java:14208)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$fire_listener_event.getResult(ThriftHiveMetastore.java:14193)
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:594)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:589)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:589)
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

2021-03-17 15:09:30,054 ERROR org.apache.thrift.ProcessFunction: [pool-9-thread-122]: Internal error processing fire_listener_event
org.apache.hadoop.hive.metastore.api.NoSuchObjectException: partition values=[1]
	at org.apache.hadoop.hive.metastore.ObjectStore.getPartition(ObjectStore.java:2003) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at sun.reflect.GeneratedMethodAccessor97.invoke(Unknown Source) ~[?:?]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
	at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at com.sun.proxy.$Proxy26.getPartition(Unknown Source) ~[?:?]
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partition(HiveMetaStore.java:3553) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.metastore.events.InsertEvent.<init>(InsertEvent.java:62) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.fire_listener_event(HiveMetaStore.java:6737) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at sun.reflect.GeneratedMethodAccessor100.invoke(Unknown Source) ~[?:?]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:140) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:99) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at com.sun.proxy.$Proxy28.fire_listener_event(Unknown Source) ~[?:?]
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$fire_listener_event.getResult(ThriftHiveMetastore.java:14208) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$fire_listener_event.getResult(ThriftHiveMetastore.java:14193) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:594) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:589) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_181]
	at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_181]
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875) [hadoop-common-3.0.0-cdh6.2.1.jar:?]
	at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:589) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]

hiveserver2 日志

代码语言:javascript
复制
	2021-03-17 15:09:30,016 INFO  org.apache.hadoop.hive.ql.exec.MoveTask: [HiveServer2-Background-Pool: Thread-183022]: Partition is: {ptdate=1}
2021-03-17 15:09:30,033 INFO  org.apache.hadoop.hive.common.FileUtils: [HiveServer2-Background-Pool: Thread-183022]: Creating directory if it doesn't exist: hdfs://dev-dw-nn01:8020/user/hive/warehouse/apollo_ods_jzfix.db/test0317/ptdate=1
2021-03-17 15:09:30,057 WARN  org.apache.hadoop.hive.metastore.RetryingMetaStoreClient: [HiveServer2-Background-Pool: Thread-183022]: MetaStoreClient lost connection. Attempting to reconnect.
org.apache.thrift.TApplicationException: Internal error processing fire_listener_event
	at org.apache.thrift.TApplicationException.read(TApplicationException.java:111) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_fire_listener_event(ThriftHiveMetastore.java:4836) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.fire_listener_event(ThriftHiveMetastore.java:4823) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.fireListenerEvent(HiveMetaStoreClient.java:2531) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at sun.reflect.GeneratedMethodAccessor88.invoke(Unknown Source) ~[?:?]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at com.sun.proxy.$Proxy34.fireListenerEvent(Unknown Source) [?:?]
	at sun.reflect.GeneratedMethodAccessor88.invoke(Unknown Source) ~[?:?]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2562) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at com.sun.proxy.$Proxy34.fireListenerEvent(Unknown Source) [?:?]
	at org.apache.hadoop.hive.ql.metadata.Hive.fireInsertEvent(Hive.java:2431) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.metadata.Hive.loadPartitionInternal(Hive.java:1629) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1525) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1489) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:501) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2200) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1843) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1563) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1339) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1334) [hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:256) [hive-service-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hive.service.cli.operation.SQLOperation.access$600(SQLOperation.java:92) [hive-service-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:345) [hive-service-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_181]
	at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_181]
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875) [hadoop-common-3.0.0-cdh6.2.1.jar:?]
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:357) [hive-service-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_181]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
2021-03-17 15:09:31,063 INFO  hive.metastore: [HiveServer2-Background-Pool: Thread-183022]: Closed a connection to metastore, current connections: 9231
2021-03-17 15:09:31,063 INFO  hive.metastore: [HiveServer2-Background-Pool: Thread-183022]: Trying to connect to metastore with URI thrift://dev-dw-dn01:9083
2021-03-17 15:09:31,065 INFO  hive.metastore: [HiveServer2-Background-Pool: Thread-183022]: Opened a connection to metastore, current connections: 9232
2021-03-17 15:09:31,065 INFO  hive.metastore: [HiveServer2-Background-Pool: Thread-183022]: Connected to metastore.
2021-03-17 15:09:31,196 ERROR org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-183022]: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. org.apache.thrift.TApplicationException: Internal error processing fire_lis
tener_event2021-03-17 15:09:31,196 INFO  org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-183022]: MapReduce Jobs Launched: 
2021-03-17 15:09:31,197 INFO  org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-183022]: Stage-Stage-1: Map: 1   Cumulative CPU: 2.56 sec   HDFS Read: 5344 HDFS Write: 609 HDFS EC Read: 0 SUCCESS
2021-03-17 15:09:31,197 INFO  org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-183022]: Total MapReduce CPU Time Spent: 2 seconds 560 msec
2021-03-17 15:09:31,197 INFO  org.apache.hadoop.hive.ql.Driver: [HiveServer2-Background-Pool: Thread-183022]: Completed executing command(queryId=hive_20210317150913_9d734c54-f0cf-4dc7-9117-bc7f59c2cb61); Time taken: 17.758 seconds
2021-03-17 15:09:31,206 ERROR org.apache.hive.service.cli.operation.Operation: [HiveServer2-Background-Pool: Thread-183022]: Error running hive query: 
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. org.apache.thrift.TApplicationException: Internal error processing fire_listener_event
	at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:329) ~[hive-service-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:258) ~[hive-service-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hive.service.cli.operation.SQLOperation.access$600(SQLOperation.java:92) ~[hive-service-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:345) [hive-service-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_181]
	at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_181]
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875) [hadoop-common-3.0.0-cdh6.2.1.jar:?]
	at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:357) [hive-service-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_181]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.TApplicationException: Internal error processing fire_listener_event
	at org.apache.hadoop.hive.ql.metadata.Hive.fireInsertEvent(Hive.java:2433) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.metadata.Hive.loadPartitionInternal(Hive.java:1629) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1525) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1489) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:501) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2200) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1843) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1563) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1339) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1334) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:256) ~[hive-service-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	... 11 more
Caused by: org.apache.thrift.TApplicationException: Internal error processing fire_listener_event
	at org.apache.thrift.TApplicationException.read(TApplicationException.java:111) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_fire_listener_event(ThriftHiveMetastore.java:4836) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.fire_listener_event(ThriftHiveMetastore.java:4823) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.fireListenerEvent(HiveMetaStoreClient.java:2531) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at sun.reflect.GeneratedMethodAccessor88.invoke(Unknown Source) ~[?:?]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:154) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at com.sun.proxy.$Proxy34.fireListenerEvent(Unknown Source) ~[?:?]
	at sun.reflect.GeneratedMethodAccessor88.invoke(Unknown Source) ~[?:?]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_181]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_181]
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2562) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at com.sun.proxy.$Proxy34.fireListenerEvent(Unknown Source) ~[?:?]
	at org.apache.hadoop.hive.ql.metadata.Hive.fireInsertEvent(Hive.java:2431) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.metadata.Hive.loadPartitionInternal(Hive.java:1629) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1525) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1489) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:501) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2200) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1843) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1563) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1339) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1334) ~[hive-exec-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]
	at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:256) ~[hive-service-2.1.1-cdh6.2.1.jar:2.1.1-cdh6.2.1]

问题临时解决方法

通过以上不同场景下的尝试和相关日志,可以确定这应该是某些配置项引起的 HIVE 的一个 BUG,但笔者当时并没有发现相关的JIRA。为了快速推进项目,笔者当时提出了以下 workaround:

  1. 业务测问题解决方法一:在 insert overwrite 后 (会报错),增加脚本执行命令 msck repair test0317 修复hive metastore 中相关元数据: 该方法的缺点是不太方便嵌入到调度流程中,因insert overwrite报错后,调度工具就不会继续调度执行后续流程了,无法做到业务代码不侵入;
  2. 业务测问题解决方法二:在执行正常业务流程前,添加一个前置流程,该前置流程的内容是提前创建好正常流程中需要的新分区,其内容实例如下:alter table test0317 add partition (ptdate=10); (ALTER TABLE table_name ADD [IF NOT EXISTS] PARTITION partition_spec;

最终为快速推进项目,我们采用了上述方案2临时解决了问题。

问题解决方法

在有了临时解决方案,确保了项目能顺利推进的同时,为彻底了解背后技术细节,笔者也在 cloudera 社区发了帖子寻求支持,但社区的相应速度没有那么及时,前几日才收到回馈,截图如下:

JIRA截图

根据 cloudra 官方的回复,参考上述 jira 链接,并对照源码,确认这是 HIVE 的一个BUG,且在 HIVE 3.0 才修复完毕,其主要问题概括如下:

在 hive.metastore.dml.events 配置为 true时,向不存在的分区插入数据,比如向新分区 insert overwrite,或以动态分区模式插入数据到新分区时,hive原先的处理有问题,会因为 metastore 找不到分区而断开链接并报错。

所以,在上述业务侧可以采取的方案二之外,还可以采取如下解决方案:

  1. 配置 hive.metastore.dml.events 为false
  2. 或升级到hive3.0.

本文参与 腾讯云自媒体分享计划,分享自微信公众号。
原始发表:2021-08-26,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 数据社 微信公众号,前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体分享计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
目录
  • 前言
  • 问题现象
  • 问题分析
  • 客户端日志(beeline)
  • hive metastore 日志
  • hiveserver2 日志
  • 问题临时解决方法
  • 问题解决方法
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档