Apache Hudi 0.12.2 Released

从大数据到人工智能 (From Big Data to Artificial Intelligence)
Published 2023-01-12 11:27:17
Included in the column: 大数据-BigData

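Long-Term Support Release

We aim to maintain 0.12 for a longer period of time and to provide a stable release, through the latest 0.12.x version, for users to migrate to. This release (0.12.2) is the latest 0.12 release.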

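Migration Guide

This release (0.12.2) does not introduce any new table version, so no migration is needed if you are on 0.12.0.

If you are migrating from an older release, review the migration guides in the previous release notes, specifically the upgrade instructions in 0.6.0, 0.9.0, 0.10.0, 0.11.0, and 0.12.0.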
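
As a rough illustration of the advice above, the following Python sketch lists which of the named releases' migration guides are worth reviewing for a given installed version. The helper names (`parse_version`, `guides_to_review`) are hypothetical and not part of Hudi. The rule that only guides for releases newer than the installed one apply is an assumption made for this sketch, not something the release notes state.

```python
from typing import List, Tuple

# Releases whose notes carry upgrade instructions, as listed in the text above.
MIGRATION_GUIDE_RELEASES = ["0.6.0", "0.9.0", "0.10.0", "0.11.0", "0.12.0"]


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '0.10.1' into (0, 10, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def guides_to_review(current_version: str) -> List[str]:
    """Hypothetical helper: listed releases that are newer than current_version.

    Assumption: a guide only matters if its release is newer than the one
    currently installed.
    """
    current = parse_version(current_version)
    return [
        release
        for release in MIGRATION_GUIDE_RELEASES
        if parse_version(release) > current
    ]


if __name__ == "__main__":
    for installed in ("0.8.0", "0.10.1", "0.12.0"):
        print(installed, "->", guides_to_review(installed))
```

Versions are compared as integer tuples rather than strings because a plain string comparison would rank "0.9.0" above "0.10.0".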

Bug Fixes

The 0.12.2 release is mainly intended for bug fixes and stability. The fixes span many components, including:

  • DeltaStreamer
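  • Data type/schema related bug fixes
  • Table services
  • Metadata table
  • Spark SQL
  • Presto stability/performance fixes
  • Trino stability/performance fixes
  • Meta sync
  • Flink engine
  • Unit, functional, and integration tests, and CI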

Release Notes

Sub-task

  • [HUDI-5244] – Fix bugs in schema evolution client with lost operation field and not found schema

Bug

  • [HUDI-3453] – Metadata table throws NPE when scheduling compaction plan
  • [HUDI-3661] – Flink async compaction is not thread safe when use watermark
  • [HUDI-4281] – Using hudi to build a large number of tables in spark on hive causes OOM
  • [HUDI-4588] – Ingestion failing if source column is dropped
  • [HUDI-4855] – Bootstrap table from Deltastreamer cannot be read in Spark
  • [HUDI-4893] – More than 1 splits are created for a single log file for MOR table
  • [HUDI-4898] – for mor table, presto/hive should respect payload class during merge parquet file and log file
  • [HUDI-4901] – Add avro version to Flink profiles
  • [HUDI-4946] – merge into with no preCombineField has dup row in only insert
  • [HUDI-4952] – Reading from metadata table could fail when there are no completed commits
  • [HUDI-4966] – Meta sync throws exception if TimestampBasedKeyGenerator is used to generate partition path containing slashes
  • [HUDI-4971] – aws bundle causes class loading issue
  • [HUDI-4975] – datahub sync bundle causes class loading issue
  • [HUDI-4998] – Inference of META_SYNC_PARTITION_EXTRACTOR_CLASS does not work
  • [HUDI-5003] – InLineFileSystem will throw NumberFormatException, cause the type of startOffset is int and out of bounds
  • [HUDI-5007] – Prevent Hudi from reading the entire timeline when performing a LATEST streaming read
  • [HUDI-5008] – Avoid unset HoodieROTablePathFilter in IncrementalRelation
  • [HUDI-5025] – Rollback failed with log file not found when rollOver in rollback process
  • [HUDI-5041] – lock metric register conflict error
  • [HUDI-5057] – Fix msck repair hudi table
  • [HUDI-5058] – The primary key cannot be empty when Flink reads an error from the hudi table
  • [HUDI-5061] – bulk insert operation don't throw other exception except IOE Exception
  • [HUDI-5063] – totalScantime and other run time stats missing from commit metadata
  • [HUDI-5070] – Fix Flaky TestCleaner test : testInsertAndCleanByCommits
  • [HUDI-5076] – Non serializable path used with engineContext with metadata table initialization
  • [HUDI-5087] – Max value read from metatable incorrect
  • [HUDI-5088] – Failed to synchronize the hive metadata of the Flink table
  • [HUDI-5092] – Querying Hudi table throws NoSuchMethodError in Databricks runtime
  • [HUDI-5096] – boolean param is broken in HiveSyncTool
  • [HUDI-5097] – Read 0 records from partitioned table without partition fields in table configs
  • [HUDI-5151] – Flink data skipping doesn't work with ClassNotFoundException of InLineFileSystem
  • [HUDI-5157] – Duplicate partition path for chained hudi tables.
  • [HUDI-5163] – Failure handling w/ spark ds write failures
  • [HUDI-5176] – Incremental source may miss commits if there are inflight commits before completed commits
  • [HUDI-5185] – Compaction run fails with --hoodieConfigs
  • [HUDI-5203] – Debezium payload does not handle null-field cases
  • [HUDI-5228] – Flink table service job fs view conf overwrites the one of writing job
  • [HUDI-5242] – Do not fail Meta sync in Deltastreamer when inline table service fails
  • [HUDI-5251] – Unexpected avro dependency in flink 1.15 bundle
  • [HUDI-5253] – HoodieMergeOnReadTableInputFormat could have duplicate records issue if it contains delta files while still splittable
  • [HUDI-5260] – Insert into sql with strict insert mode and no preCombineField should not overwrite existing records
  • [HUDI-5277] – RunClusteringProcedure can't exit correctly
  • [HUDI-5286] – UnsupportedOperationException throws when enabling filesystem retry
  • [HUDI-5291] – NPE in column stats for null values
  • [HUDI-5320] – Spark SQL CTAS does not propagate Table properties to actual SparkSqlWriter
  • [HUDI-5325] – Fix Create Table to propagate properly Metadata Table enabling config
  • [HUDI-5336] – Fix log file parsing to consider "." at the beginning
  • [HUDI-5346] – Fixing performance traps in CTAS
  • [HUDI-5347] – Fix Merge Into performance traps
  • [HUDI-5350] – oom cause compaction event lost
  • [HUDI-5351] – Handle meta fields being disabled in Bulk Insert Partitioners
  • [HUDI-5373] – Different fileids are assigned to the same bucket
  • [HUDI-5375] – Fix re-using of file readers w/ metadata table in FileIndex
  • [HUDI-5393] – Remove the reuse of metadata table writer for flink write client
  • [HUDI-5403] – Input Format class has metadata table enabled for file listing unexpectedly by default
  • [HUDI-5409] – Avoid file index and use fs view cache in COW input format
  • [HUDI-5412] – Send the bootstrap event if the JM also rebooted

Improvement

  • [HUDI-4526] – improve spillableMapBasePath disk directory is full
  • [HUDI-4799] – improve analyzer exception tip when can not resolve expression
  • [HUDI-4960] – Upgrade Jetty version for Timeline server
  • [HUDI-4980] – Make avg record size calculated based on commit instant only
  • [HUDI-4995] – Dependency conflicts on apache http with other projects
  • [HUDI-4997] – use jackson-v2 replace jackson-v1 import
  • [HUDI-5002] – Remove deprecated API usage in SparkHoodieHBaseIndex#generateStatement
  • [HUDI-5027] – Replace hardcoded hbase config keys with HbaseConstants
  • [HUDI-5045] – Add tests to integ test to test bulk_insert followed by upsert
  • [HUDI-5066] – Support hoodie source metaclient cache for flink planner
  • [HUDI-5102] – source operator(monitor and reader) support user uid
  • [HUDI-5104] – Add feature flag to disable HoodieFileIndex and fall back to HoodieROTablePathFilter
  • [HUDI-5111] – Add metadata on read support to integ tests
  • [HUDI-5184] – Remove export PYSPARK_SUBMIT_ARGS="--master local[*]" from HoodiePySparkQuickstart.py
  • [HUDI-5247] – Clean up java client tests
  • [HUDI-5296] – Support disabling schema on read if not required
  • [HUDI-5338] – Adjust coalesce behavior within "NONE" sort mode for bulk insert
  • [HUDI-5344] – Upgrade com.google.protobuf:protobuf-java
  • [HUDI-5345] – Avoid fs.exists calls for metadata table in HFileBootstrapIndex
  • [HUDI-5348] – Cache file slices within MDT reader
  • [HUDI-5357] – Optimize release artifacts' deployment
  • [HUDI-5370] – Properly close file handles for Metadata writer

Test

Task

  • [HUDI-3287] – Remove unnecessary deps in hudi-kafka-connect
  • [HUDI-5081] – Resources clean-up in hudi-utilities tests
  • [HUDI-5221] – Make the decision for flink sql bucket index case-insensitive
  • [HUDI-5223] – Partial failover for flink
  • [HUDI-5227] – Upgrade Jetty to 9.4.48

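This is an original article by 「xiaozhch5」, author of the 从大数据到人工智能 blog, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.

Original link: https://cloud.tencent.com/developer/article/2208628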
