A Look at Debezium's OffsetCommitPolicy

Original article by code4it, last modified 2020-05-22 10:16:01. Part of the column 码匠的流水账.

This article takes a look at Debezium's OffsetCommitPolicy.

OffsetCommitPolicy

debezium-v1.1.1.Final/debezium-api/src/main/java/io/debezium/engine/spi/OffsetCommitPolicy.java

@Incubating
@FunctionalInterface
public interface OffsetCommitPolicy {

    boolean performCommit(long numberOfMessagesSinceLastCommit, Duration timeSinceLastCommit);

    static OffsetCommitPolicy always() {
        return new AlwaysCommitOffsetPolicy();
    }

    static OffsetCommitPolicy periodic(Properties config) {
        return new PeriodicCommitOffsetPolicy(config);
    }

}
  • OffsetCommitPolicy declares a single method, performCommit, and provides two static factory methods: always(), which creates an AlwaysCommitOffsetPolicy, and periodic(Properties), which creates a PeriodicCommitOffsetPolicy
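
Because the interface is annotated with @FunctionalInterface, a custom policy can be written as a lambda. The following is a minimal sketch, not Debezium code: `CommitPolicy` is a local stand-in with the same method signature as OffsetCommitPolicy (so the example compiles without Debezium on the classpath), and `countOrTime` is a hypothetical helper introduced only for illustration.

```java
import java.time.Duration;

// Local stand-in with the same method signature as
// io.debezium.engine.spi.OffsetCommitPolicy (assumption: used here only so
// the sketch compiles without Debezium on the classpath).
@FunctionalInterface
interface CommitPolicy {
    boolean performCommit(long numberOfMessagesSinceLastCommit, Duration timeSinceLastCommit);
}

class CustomPolicySketch {

    // Hypothetical helper: commit once maxRecords records have accumulated
    // or maxWait has elapsed, whichever happens first.
    static CommitPolicy countOrTime(long maxRecords, Duration maxWait) {
        return (records, elapsed) -> records >= maxRecords || elapsed.compareTo(maxWait) >= 0;
    }

    public static void main(String[] args) {
        CommitPolicy policy = countOrTime(100, Duration.ofSeconds(5));
        System.out.println(policy.performCommit(10, Duration.ofSeconds(1)));   // false
        System.out.println(policy.performCommit(100, Duration.ofSeconds(1)));  // true
        System.out.println(policy.performCommit(10, Duration.ofSeconds(5)));   // true
    }
}
```

With the real interface the lambda body would be the same; only the type name changes.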

AlwaysCommitOffsetPolicy

debezium-v1.1.1.Final/debezium-api/src/main/java/io/debezium/engine/spi/OffsetCommitPolicy.java

    public static class AlwaysCommitOffsetPolicy implements OffsetCommitPolicy {

        @Override
        public boolean performCommit(long numberOfMessagesSinceLastCommit, Duration timeSinceLastCommit) {
            return true;
        }
    }
  • AlwaysCommitOffsetPolicy implements the OffsetCommitPolicy interface; its performCommit always returns true, regardless of the record count or elapsed time

PeriodicCommitOffsetPolicy

debezium-v1.1.1.Final/debezium-api/src/main/java/io/debezium/engine/spi/OffsetCommitPolicy.java

    public static class PeriodicCommitOffsetPolicy implements OffsetCommitPolicy {

        private final Duration minimumTime;

        public PeriodicCommitOffsetPolicy(Properties config) {
            minimumTime = Duration.ofMillis(Long.valueOf(config.getProperty(DebeziumEngine.OFFSET_FLUSH_INTERVAL_MS_PROP)));
        }

        @Override
        public boolean performCommit(long numberOfMessagesSinceLastCommit, Duration timeSinceLastCommit) {
            return timeSinceLastCommit.compareTo(minimumTime) >= 0;
        }
    }
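  • PeriodicCommitOffsetPolicy implements the OffsetCommitPolicy interface; its performCommit evaluates timeSinceLastCommit.compareTo(minimumTime) and returns true when the result is greater than or equal to 0, i.e. once at least minimumTime has elapsed since the last commit. The record count is ignored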
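
To make the boundary behaviour concrete, here is a small self-contained sketch that mirrors the constructor and comparison quoted above. It is a local re-implementation for illustration, not the Debezium class. The property key "offset.flush.interval.ms" is, to my knowledge, the value of DebeziumEngine.OFFSET_FLUSH_INTERVAL_MS_PROP; treat it as an assumption and check it against your Debezium version.

```java
import java.time.Duration;
import java.util.Properties;

// Local re-implementation of the periodic policy shown above (illustration only).
class PeriodicPolicySketch {

    // Assumed to match DebeziumEngine.OFFSET_FLUSH_INTERVAL_MS_PROP.
    static final String FLUSH_INTERVAL_PROP = "offset.flush.interval.ms";

    private final Duration minimumTime;

    PeriodicPolicySketch(Properties config) {
        // Long.valueOf(null) throws NumberFormatException, so the property must be set.
        minimumTime = Duration.ofMillis(Long.valueOf(config.getProperty(FLUSH_INTERVAL_PROP)));
    }

    boolean performCommit(long numberOfMessagesSinceLastCommit, Duration timeSinceLastCommit) {
        return timeSinceLastCommit.compareTo(minimumTime) >= 0;
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty(FLUSH_INTERVAL_PROP, "1000");
        PeriodicPolicySketch policy = new PeriodicPolicySketch(config);

        // Just below the interval: no commit, however many records are pending.
        System.out.println(policy.performCommit(500, Duration.ofMillis(999)));  // false
        // Exactly at the interval: commit, even with zero records.
        System.out.println(policy.performCommit(0, Duration.ofMillis(1000)));   // true

        try {
            new PeriodicPolicySketch(new Properties());
        } catch (NumberFormatException e) {
            System.out.println("flush interval not configured");
        }
    }
}
```

The last case suggests that a Properties object passed to periodic(...) needs to contain the flush interval, since the constructor parses it without a null check.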

RecordCommitter

debezium-v1.1.1.Final/debezium-api/src/main/java/io/debezium/engine/DebeziumEngine.java

    public static interface RecordCommitter<R> {

        /**
         * Marks a single record as processed, must be called for each
         * record.
         *
         * @param record the record to commit
         */
        void markProcessed(R record) throws InterruptedException;

        /**
         * Marks a batch as finished, this may result in committing offsets/flushing
         * data.
         * <p>
         * Should be called when a batch of records is finished being processed.
         */
        void markBatchFinished();
    }
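  • The RecordCommitter interface defines two methods: markProcessed, which must be called for every record, and markBatchFinished, which is called once a batch has been fully processed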
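
The calling contract described in the Javadoc can be sketched as follows. `Committer` is a local stand-in that mirrors RecordCommitter<R>, and `handleBatch` and `run` are hypothetical names chosen for this example; in Debezium the equivalent logic would live in a batch consumer that receives the records together with a RecordCommitter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Local stand-in mirroring DebeziumEngine.RecordCommitter<R> (illustration only).
interface Committer<R> {
    void markProcessed(R record) throws InterruptedException;
    void markBatchFinished();
}

class BatchHandlerSketch {

    // Hypothetical batch handler: mark every record, then mark the batch once.
    static <R> void handleBatch(List<R> records, Committer<R> committer) throws InterruptedException {
        for (R record : records) {
            System.out.println("processing " + record);
            committer.markProcessed(record);
        }
        committer.markBatchFinished();
    }

    // Runs handleBatch against a committer that only records the calls it receives.
    static List<String> run(List<String> records) {
        List<String> calls = new ArrayList<>();
        try {
            handleBatch(records, new Committer<String>() {
                @Override
                public void markProcessed(String record) {
                    calls.add("processed:" + record);
                }

                @Override
                public void markBatchFinished() {
                    calls.add("batchFinished");
                }
            });
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b")));
        // [processed:a, processed:b, batchFinished]
    }
}
```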

EmbeddedEngine

debezium-v1.1.1.Final/debezium-embedded/src/main/java/io/debezium/embedded/EmbeddedEngine.java

@ThreadSafe
public final class EmbeddedEngine implements DebeziumEngine<SourceRecord> {

    //......

    protected RecordCommitter buildRecordCommitter(OffsetStorageWriter offsetWriter, SourceTask task, Duration commitTimeout) {
        return new RecordCommitter() {

            @Override
            public synchronized void markProcessed(SourceRecord record) throws InterruptedException {
                task.commitRecord(record);
                recordsSinceLastCommit += 1;
                offsetWriter.offset(record.sourcePartition(), record.sourceOffset());
            }

            @Override
            public synchronized void markBatchFinished() {
                maybeFlush(offsetWriter, offsetCommitPolicy, commitTimeout, task);
            }
        };
    }

    protected void maybeFlush(OffsetStorageWriter offsetWriter, OffsetCommitPolicy policy, Duration commitTimeout,
                              SourceTask task) {
        // Determine if we need to commit to offset storage ...
        long timeSinceLastCommitMillis = clock.currentTimeInMillis() - timeOfLastCommitMillis;
        if (policy.performCommit(recordsSinceLastCommit, Duration.ofMillis(timeSinceLastCommitMillis))) {
            commitOffsets(offsetWriter, commitTimeout, task);
        }
    }

    protected void commitOffsets(OffsetStorageWriter offsetWriter, Duration commitTimeout, SourceTask task) {
        long started = clock.currentTimeInMillis();
        long timeout = started + commitTimeout.toMillis();
        if (!offsetWriter.beginFlush()) {
            return;
        }
        Future<Void> flush = offsetWriter.doFlush(this::completedFlush);
        if (flush == null) {
            return; // no offsets to commit ...
        }

        // Wait until the offsets are flushed ...
        try {
            flush.get(Math.max(timeout - clock.currentTimeInMillis(), 0), TimeUnit.MILLISECONDS);
            // if we've gotten this far, the offsets have been committed so notify the task
            task.commit();
            recordsSinceLastCommit = 0;
            timeOfLastCommitMillis = clock.currentTimeInMillis();
        }
        catch (InterruptedException e) {
            logger.warn("Flush of {} offsets interrupted, cancelling", this);
            offsetWriter.cancelFlush();
        }
        catch (ExecutionException e) {
            logger.error("Flush of {} offsets threw an unexpected exception: ", this, e);
            offsetWriter.cancelFlush();
        }
        catch (TimeoutException e) {
            logger.error("Timed out waiting to flush {} offsets to storage", this);
            offsetWriter.cancelFlush();
        }
    }

    //......

}
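  • EmbeddedEngine's buildRecordCommitter method creates an anonymous RecordCommitter implementation. Its markProcessed calls task.commitRecord, increments recordsSinceLastCommit, and hands the record's offset to offsetWriter. Its markBatchFinished calls maybeFlush, which uses policy.performCommit to decide whether to run commitOffsets. commitOffsets calls offsetWriter.doFlush and waits for the flush for at most commitTimeout; on success it calls task.commit() and resets recordsSinceLastCommit and timeOfLastCommitMillis, and if the wait is interrupted, fails, or times out it calls offsetWriter.cancelFlush()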
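
One consequence of this flow is that the policy is consulted only when a batch is marked finished, so a periodic policy commits on the first batch boundary after the interval has elapsed rather than on a timer. The sketch below simulates just that bookkeeping under simplifying assumptions: `FlushPolicy` is a local stand-in for OffsetCommitPolicy, the clock is a manually advanced counter, and every commit is assumed to succeed (the real commitOffsets can return early or fail without resetting the counters).

```java
import java.time.Duration;

// Local stand-in with the same method signature as OffsetCommitPolicy (illustration only).
@FunctionalInterface
interface FlushPolicy {
    boolean performCommit(long numberOfMessagesSinceLastCommit, Duration timeSinceLastCommit);
}

// Simplified model of the counters kept by the engine; not Debezium code.
class MaybeFlushSketch {

    private final FlushPolicy policy;
    private long nowMillis = 0;               // hypothetical manual clock
    private long timeOfLastCommitMillis = 0;
    private long recordsSinceLastCommit = 0;
    private int commits = 0;

    MaybeFlushSketch(FlushPolicy policy) {
        this.policy = policy;
    }

    void advance(long millis) {
        nowMillis += millis;
    }

    void markProcessed() {
        recordsSinceLastCommit += 1;
    }

    // Mirrors maybeFlush: ask the policy, and reset the counters after a commit.
    void markBatchFinished() {
        long elapsed = nowMillis - timeOfLastCommitMillis;
        if (policy.performCommit(recordsSinceLastCommit, Duration.ofMillis(elapsed))) {
            commits++;                        // assumption: the flush always succeeds
            recordsSinceLastCommit = 0;
            timeOfLastCommitMillis = nowMillis;
        }
    }

    int commits() {
        return commits;
    }

    public static void main(String[] args) {
        FlushPolicy periodic = (n, t) -> t.compareTo(Duration.ofMillis(1000)) >= 0;
        MaybeFlushSketch engine = new MaybeFlushSketch(periodic);

        engine.markProcessed();
        engine.advance(400);
        engine.markBatchFinished();   // 400 ms elapsed: no commit
        engine.markProcessed();
        engine.advance(700);
        engine.markBatchFinished();   // 1100 ms elapsed: commit
        engine.advance(300);
        engine.markBatchFinished();   // 300 ms since the last commit: no commit
        System.out.println(engine.commits());  // 1
    }
}
```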

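Summary

OffsetCommitPolicy declares the performCommit method and provides the static factory methods always(), which creates an AlwaysCommitOffsetPolicy, and periodic(Properties), which creates a PeriodicCommitOffsetPolicy. EmbeddedEngine consults the policy in maybeFlush each time a batch is marked finished.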

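Original-content notice: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.

In case of infringement, contact cloudcommunity@tencent.com to request removal.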
