
Why HBase Snapshot Data Migration Fails, and How to Fix It

Author: 一见
Published: 2019-10-24 15:25:15

Contents

1. Background

2. Environment

3. Command Executed

4. Problem Description

5. Error Messages

6. Root Cause

7. Solution

1. Background

As part of a data-center decommissioning, the data in a source HBase cluster had to be migrated to a target HBase cluster, using the snapshot export approach.

2. Environment

Hadoop-3.1.2 + HBase-2.2.1

3. Command Executed

time hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -overwrite -snapshot test.snapshot -copy-from hdfs://192.168.32.30/hbase -copy-to hdfs://192.168.31.30/hbase -mappers 10 -bandwidth 30
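For context, ExportSnapshot only copies a snapshot that already exists; the snapshot itself is taken beforehand in the HBase shell. A minimal sketch of that step (the table name test is an inference from the paths in the error log below, not something stated in the command above):

# On the source cluster, take the snapshot before exporting it
hbase shell
snapshot 'test', 'test.snapshot'   # snapshot of table 'test', named 'test.snapshot'
list_snapshots                     # confirm the snapshot exists before running ExportSnapshot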

4. Problem Description

Migrating small tables (a few minutes each) never hit an error, but migrating large tables (over 30 minutes) consistently failed with "Can't find hfile". Material on this problem is scarce online; the one article that proposed a fix did not resolve it, and judging from the error messages the cause is not an undersized memory configuration as that article claimed.

5. Error Messages

2019-10-18 19:57:27,261 INFO  [main] snapshot.ExportSnapshot: Finalize the Snapshot Export
2019-10-18 19:57:27,272 INFO  [main] snapshot.ExportSnapshot: Verify snapshot integrity
2019-10-18 19:57:27,323 ERROR [VerifySnapshot-pool1-t7] snapshot.SnapshotReferenceUtil: Can't find hfile: 643c8e0f85e5487982241077ae245f34 in the real (hdfs://192.168.31.30/hbase/data/test/135c6968cf1923ecde60afa8917354bb/cf1/643c8e0f85e5487982241077ae245f34) or archive (hdfs://192.168.31.30/hbase/archive/data/test/135c6968cf1923ecde60afa8917354bb/cf1/643c8e0f85e5487982241077ae245f34) directory for the primary table.
2019-10-18 19:57:27,325 ERROR [main] snapshot.ExportSnapshot: Snapshot export failed
org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Can't find hfile: 643c8e0f85e5487982241077ae245f34 in the real (hdfs://192.168.31.30/hbase/data/test/135c6968cf1923ecde60afa8917354bb/cf1/643c8e0f85e5487982241077ae245f34) or archive (hdfs://192.168.31.30/hbase/archive/data/test/135c6968cf1923ecde60afa8917354bb/cf1/643c8e0f85e5487982241077ae245f34) directory for the primary table.
        at org.apache.hadoop.hbase.snapshot.SnapshotReferenceUtil.concurrentVisitReferencedFiles(SnapshotReferenceUtil.java:238)
        at org.apache.hadoop.hbase.snapshot.SnapshotReferenceUtil.verifySnapshot(SnapshotReferenceUtil.java:197)
        at org.apache.hadoop.hbase.snapshot.SnapshotReferenceUtil.verifySnapshot(SnapshotReferenceUtil.java:181)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.verifySnapshot(ExportSnapshot.java:825)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.run(ExportSnapshot.java:1043)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.innerMain(ExportSnapshot.java:1102)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.main(ExportSnapshot.java:1106)
2019-10-18 19:57:27,328 ERROR [VerifySnapshot-pool1-t8] snapshot.SnapshotReferenceUtil: Can't find hfile: 26f079c7a824495fbfd53e19bb17b61b in the real (hdfs://192.168.31.30/hbase/data/test/0efcc556e8e6d9591db883f28377d94b/cf1/26f079c7a824495fbfd53e19bb17b61b) or archive (hdfs://192.168.31.30/hbase/archive/data/test/0efcc556e8e6d9591db883f28377d94b/cf1/26f079c7a824495fbfd53e19bb17b61b) directory for the primary table.
2019-10-18 19:57:27,326 ERROR [VerifySnapshot-pool1-t1] snapshot.SnapshotReferenceUtil: Can't find hfile: b99bf4640cea4b61b236f254523bb411 in the real (hdfs://192.168.31.30/hbase/data/test/01df524951ebfb58fd7997a661849f0c/cf1/b99bf4640cea4b61b236f254523bb411) or archive (hdfs://192.168.31.30/hbase/archive/data/test/01df524951ebfb58fd7997a661849f0c/cf1/b99bf4640cea4b61b236f254523bb411) directory for the primary table.
2019-10-18 19:57:27,326 ERROR [VerifySnapshot-pool1-t6] snapshot.SnapshotReferenceUtil: Can't find hfile: 0c078354218b40f989b5512e87a1d40d in the real (hdfs://192.168.31.30/hbase/data/test/0a3b62dfd3c602a34e2b1a4b5829f37a/cf1/0c078354218b40f989b5512e87a1d40d) or archive (hdfs://192.168.31.30/hbase/archive/data/test/0a3b62dfd3c602a34e2b1a4b5829f37a/cf1/0c078354218b40f989b5512e87a1d40d) directory for the primary table.
2019-10-18 19:57:27,326 ERROR [VerifySnapshot-pool1-t5] snapshot.SnapshotReferenceUtil: Can't find hfile: f539907e9608424a8403d298fddf4570 in the real (hdfs://192.168.31.30/hbase/data/test/12c92f19120021d17d531f036e3d3eac/cf1/f539907e9608424a8403d298fddf4570) or archive (hdfs://192.168.31.30/hbase/archive/data/test/12c92f19120021d17d531f036e3d3eac/cf1/f539907e9608424a8403d298fddf4570) directory for the primary table.
2019-10-18 19:57:27,326 ERROR [VerifySnapshot-pool1-t4] snapshot.SnapshotReferenceUtil: Can't find hfile: 9e7dc776fc204c2688d46e79f7810b01 in the real (hdfs://192.168.31.30/hbase/data/test/01927a47aa1ea2a1a660af6c71f19c35/cf1/9e7dc776fc204c2688d46e79f7810b01) or archive (hdfs://192.168.31.30/hbase/archive/data/test/01927a47aa1ea2a1a660af6c71f19c35/cf1/9e7dc776fc204c2688d46e79f7810b01) directory for the primary table.
2019-10-18 19:57:27,326 ERROR [VerifySnapshot-pool1-t2] snapshot.SnapshotReferenceUtil: Can't find hfile: 807e7447315344e3b4bffbd3d15f1b2a in the real (hdfs://192.168.31.30/hbase/data/test/0209bf6c89327d5b0fb07782188f73fb/cf1/807e7447315344e3b4bffbd3d15f1b2a) or archive (hdfs://192.168.31.30/hbase/archive/data/test/0209bf6c89327d5b0fb07782188f73fb/cf1/807e7447315344e3b4bffbd3d15f1b2a) directory for the primary table.
2019-10-18 19:57:27,336 ERROR [VerifySnapshot-pool1-t7] snapshot.SnapshotReferenceUtil: Can't find hfile: ec45664b691941fe84502783b88f6ed7 in the real (hdfs://192.168.31.30/hbase/data/test/02f45bdfc56d563ee4ffa924c02dcb10/cf1/ec45664b691941fe84502783b88f6ed7) or archive (hdfs://192.168.31.30/hbase/archive/data/test/02f45bdfc56d563ee4ffa924c02dcb10/cf1/ec45664b691941fe84502783b88f6ed7) directory for the primary table.
2019-10-18 19:57:27,338 ERROR [VerifySnapshot-pool1-t3] snapshot.SnapshotReferenceUtil: Can't find hfile: 0c6a05097eb743ea8366b74cdfd3df8e in the real (hdfs://192.168.31.30/hbase/data/test/092dba98be14004efdca23e1d0ecd157/cf1/0c6a05097eb743ea8366b74cdfd3df8e) or archive (hdfs://192.168.31.30/hbase/archive/data/test/092dba98be14004efdca23e1d0ecd157/cf1/0c6a05097eb743ea8366b74cdfd3df8e) directory for the primary table.

6. Root Cause

The files copied over from the source cluster no longer exist on the target cluster. Inspecting the target cluster shows that the "missing" files did appear on its NameNode at some point; in other words, the files were copied successfully but were deleted while the export was still running.
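One way to verify this on the target cluster is to list the two paths from the error log and then trace one of the missing hfile names through the NameNode's HDFS audit log: a cmd=create entry followed later by a cmd=delete entry confirms the copy-then-delete sequence. A sketch, assuming a typical audit-log location (the actual path varies by deployment):

# Both locations from the error log are now empty or absent on the target cluster
hdfs dfs -ls hdfs://192.168.31.30/hbase/data/test/135c6968cf1923ecde60afa8917354bb/cf1/
hdfs dfs -ls hdfs://192.168.31.30/hbase/archive/data/test/135c6968cf1923ecde60afa8917354bb/cf1/

# On the target NameNode, trace the file's lifetime in the audit log
# (the log path below is an assumption; adjust to your deployment)
grep 643c8e0f85e5487982241077ae245f34 /var/log/hadoop/hdfs-audit.log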

7. Solution

Until the exported snapshot is fully established on the target, HBase periodically cleans up data under the archive directory. Testing bears this out: turn on DEBUG logging for "org.apache.hadoop.hbase.master.cleaner.CleanerChore" (by editing HBase's log4j.properties) and you can watch the files being deleted:

log4j.logger.org.apache.hadoop.hbase.master.cleaner.CleanerChore=DEBUG

The CleanerChore thread's cleanup of the archive directory is governed by the configuration item hbase.master.hfilecleaner.ttl, which defaults to 5 minutes (the value is given in milliseconds, i.e. 300000); migrating a large table takes far longer than that. After raising hbase.master.hfilecleaner.ttl to a comfortably large value of two hours, the problem disappeared.
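A minimal sketch of that change, assuming it goes into hbase-site.xml on the target cluster's HMaster and that the HMaster is restarted for the new value to take effect:

<!-- hbase-site.xml: keep archived hfiles for 2 hours instead of the 5-minute default -->
<property>
  <name>hbase.master.hfilecleaner.ttl</name>
  <!-- value is in milliseconds; 7200000 ms = 2 hours (default 300000 = 5 minutes) -->
  <value>7200000</value>
</property>

Once the export and its verification have finished, the value can be restored to the default so that stale archive files do not accumulate indefinitely.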

