
Use Cases for the High-Throughput Disruptor Framework

Author: Zeal · Published 2020-11-11 · Column: Hyperledger实践

Years ago, on the concurrency site http://ifeve.com/disruptor, I came across Disruptor, a concurrency framework that seemed like black magic to me at the time, and I wondered why Netty had not integrated it. Later I learned that log4j2 and JStorm had gradually adopted it, but I never had an occasion to use it or study its details; most of the time Doug Lea's JDK concurrency package was good enough. Recently, for business reasons, I wrapped Netty into a small vert.x-like server, luoying-server (https://github.com/zealzeng/luoying-server). Business logic should not run on the event loop, so I simply used a bounded ThreadPoolExecutor as the worker pool and considered integrating Disruptor instead. After two days of reading I realized I had misunderstood Disruptor's use cases.

For an introduction, see the ifeve articles or the official repository: https://github.com/LMAX-Exchange/disruptor

1. Official performance test cases

1.1 JDK BlockingQueue throughput test: one producer, one consumer

https://github.com/zealzeng/fabric-samples/blob/master/disruptor-demo/src/main/java/com/lmax/disruptor/queue/OneToOneQueueThroughputTest.java

```java
package com.lmax.disruptor.queue;

import com.lmax.disruptor.AbstractPerfTestQueue;
import com.lmax.disruptor.support.ValueAdditionQueueProcessor;
import com.lmax.disruptor.util.DaemonThreadFactory;

import java.util.concurrent.*;

import static com.lmax.disruptor.support.PerfTestUtil.failIf;

/**
 * <pre>
 * UniCast a series of items between 1 publisher and 1 event processor.
 *
 * +----+    +-----+
 * | P1 |--->| EP1 |
 * +----+    +-----+
 *
 * Queue Based:
 * ============
 *
 *        put      take
 * +----+    +====+    +-----+
 * | P1 |--->| Q1 |<---| EP1 |
 * +----+    +====+    +-----+
 *
 * P1  - Publisher 1
 * Q1  - Queue 1
 * EP1 - EventProcessor 1
 * </pre>
 */
public final class OneToOneQueueThroughputTest extends AbstractPerfTestQueue
{
    private static final int BUFFER_SIZE = 1024 * 64;
    private static final long ITERATIONS = 1000L * 1000L * 10L;
    private final ExecutorService executor =
        Executors.newSingleThreadExecutor(DaemonThreadFactory.INSTANCE);
    private final long expectedResult = ITERATIONS * 3L;

    private final BlockingQueue<Long> blockingQueue = new LinkedBlockingQueue<Long>(BUFFER_SIZE);
    private final ValueAdditionQueueProcessor queueProcessor =
        new ValueAdditionQueueProcessor(blockingQueue, ITERATIONS - 1);

    @Override
    protected int getRequiredProcessorCount()
    {
        return 2;
    }

    @Override
    protected long runQueuePass() throws InterruptedException
    {
        final CountDownLatch latch = new CountDownLatch(1);
        queueProcessor.reset(latch);
        Future<?> future = executor.submit(queueProcessor);
        long start = System.currentTimeMillis();

        for (long i = 0; i < ITERATIONS; i++)
        {
            blockingQueue.put(3L);
        }

        latch.await();
        long opsPerSecond = (ITERATIONS * 1000L) / (System.currentTimeMillis() - start);
        queueProcessor.halt();
        future.cancel(true);

        failIf(expectedResult, 0);

        return opsPerSecond;
    }

    public static void main(String[] args) throws Exception
    {
        OneToOneQueueThroughputTest test = new OneToOneQueueThroughputTest();
        test.testImplementations();
    }
}
```

The main thread puts elements onto the queue while a single consumer thread runs ValueAdditionQueueProcessor, a Runnable whose logic is trivial: take an element from the queue and add it to a running total. The per-event cost is so small that the consumer is essentially spinning empty. Even on an old 4th-gen i5 the throughput is in the millions of ops/sec. In a real scenario, however, the handler would take tens to hundreds of milliseconds per task, and a single consumer thread would no longer keep up; the conventional approach is a ThreadPoolExecutor that assigns a thread to each dequeued task.
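As a point of reference, here is a minimal JDK-only sketch of that conventional approach (all class and method names here are hypothetical): a single dispatcher thread drains the BlockingQueue and hands each task to a bounded ThreadPoolExecutor, so one slow handler does not stall the queue.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicLong;

public class QueueWithWorkerPool {
    public static long process(int tasks) throws Exception {
        BlockingQueue<Long> queue = new LinkedBlockingQueue<>(1024);
        ExecutorService workers = new ThreadPoolExecutor(
            4, 4, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1024),
            new ThreadPoolExecutor.CallerRunsPolicy()); // back-pressure when the pool queue is full
        AtomicLong sum = new AtomicLong();
        CountDownLatch done = new CountDownLatch(tasks);

        // Single dispatcher thread: take from the queue, fan out to the pool.
        Thread dispatcher = new Thread(() -> {
            try {
                for (int i = 0; i < tasks; i++) {
                    long v = queue.take();
                    workers.submit(() -> { sum.addAndGet(v); done.countDown(); });
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        dispatcher.start();

        for (long i = 0; i < tasks; i++) {
            queue.put(3L); // producer publishes the same constant as the test above
        }
        done.await();
        workers.shutdown();
        return sum.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(process(10_000)); // 10,000 tasks of value 3 -> 30000
    }
}
```

The dispatch itself is cheap; the point is that the slow per-task work runs on the pool, which is exactly where the throughput ceiling moves once handlers take tens of milliseconds.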

Starting Queue tests

Run 0, BlockingQueue=4,539,264 ops/sec

Run 1, BlockingQueue=5,414,185 ops/sec

Run 2, BlockingQueue=4,657,661 ops/sec

Run 3, BlockingQueue=5,288,207 ops/sec

Run 4, BlockingQueue=5,339,028 ops/sec

Run 5, BlockingQueue=5,246,589 ops/sec

Run 6, BlockingQueue=5,197,505 ops/sec

1.2 Disruptor: one producer, one consumer

https://github.com/zealzeng/fabric-samples/blob/master/disruptor-demo/src/main/java/com/lmax/disruptor/sequenced/OneToOneSequencedThroughputTest.java

```java
package com.lmax.disruptor.sequenced;

import static com.lmax.disruptor.RingBuffer.createSingleProducer;
import static com.lmax.disruptor.support.PerfTestUtil.failIfNot;

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import com.lmax.disruptor.*;
import com.lmax.disruptor.support.PerfTestUtil;
import com.lmax.disruptor.support.ValueAdditionEventHandler;
import com.lmax.disruptor.support.ValueEvent;
import com.lmax.disruptor.util.DaemonThreadFactory;

/**
 * <pre>
 * UniCast a series of items between 1 publisher and 1 event processor.
 *
 * +----+    +-----+
 * | P1 |--->| EP1 |
 * +----+    +-----+
 *
 * Disruptor:
 * ==========
 *              track to prevent wrap
 *              +------------------+
 *              |                  |
 *              |                  v
 * +----+    +====+    +====+    +-----+
 * | P1 |--->| RB |<---| SB |    | EP1 |
 * +----+    +====+    +====+    +-----+
 *      claim      get    ^         |
 *                        |         |
 *                        +---------+
 *                          waitFor
 *
 * P1  - Publisher 1
 * RB  - RingBuffer
 * SB  - SequenceBarrier
 * EP1 - EventProcessor 1
 * </pre>
 */
public final class OneToOneSequencedThroughputTest extends AbstractPerfTestDisruptor
{
    private static final int BUFFER_SIZE = 1024 * 64;
    private static final long ITERATIONS = 1000L * 1000L * 100L;
    private final ExecutorService executor =
        Executors.newSingleThreadExecutor(DaemonThreadFactory.INSTANCE);
    private final long expectedResult = PerfTestUtil.accumulatedAddition(ITERATIONS);

    private final RingBuffer<ValueEvent> ringBuffer =
        createSingleProducer(ValueEvent.EVENT_FACTORY, BUFFER_SIZE, new YieldingWaitStrategy());
    private final SequenceBarrier sequenceBarrier = ringBuffer.newBarrier();
    private final ValueAdditionEventHandler handler = new ValueAdditionEventHandler();
    private final BatchEventProcessor<ValueEvent> batchEventProcessor =
        new BatchEventProcessor<ValueEvent>(ringBuffer, sequenceBarrier, handler);

    {
        ringBuffer.addGatingSequences(batchEventProcessor.getSequence());
    }

    @Override
    protected int getRequiredProcessorCount()
    {
        return 2;
    }

    @Override
    protected PerfTestContext runDisruptorPass() throws InterruptedException
    {
        PerfTestContext perfTestContext = new PerfTestContext();
        final CountDownLatch latch = new CountDownLatch(1);
        long expectedCount = batchEventProcessor.getSequence().get() + ITERATIONS;
        handler.reset(latch, expectedCount);
        executor.submit(batchEventProcessor);
        long start = System.currentTimeMillis();

        final RingBuffer<ValueEvent> rb = ringBuffer;

        for (long i = 0; i < ITERATIONS; i++)
        {
            long next = rb.next();
            rb.get(next).setValue(i);
            rb.publish(next);
        }

        latch.await();
        perfTestContext.setDisruptorOps((ITERATIONS * 1000L) / (System.currentTimeMillis() - start));
        perfTestContext.setBatchData(handler.getBatchesProcessed(), ITERATIONS);
        waitForEventProcessorSequence(expectedCount);
        batchEventProcessor.halt();

        failIfNot(expectedResult, handler.getValue());

        return perfTestContext;
    }

    private void waitForEventProcessorSequence(long expectedCount) throws InterruptedException
    {
        while (batchEventProcessor.getSequence().get() != expectedCount)
        {
            Thread.sleep(1);
        }
    }

    public static void main(String[] args) throws Exception
    {
        OneToOneSequencedThroughputTest test = new OneToOneSequencedThroughputTest();
        test.testImplementations();
    }
}
```

Instead of the fully wrapped Disruptor class, this test uses RingBuffer and BatchEventProcessor directly. With the same trivial handler logic, throughput reaches tens of millions of ops/sec. But the same caveat applies: if ValueAdditionEventHandler took tens to hundreds of milliseconds per event, the lock-free ring buffer, however efficient, would not help. So let us look at Disruptor's two consumption modes.

Starting Disruptor tests

Run 0, Disruptor=32,701,111 ops/sec BatchPercent=95.16% AverageBatchSize=20

Run 1, Disruptor=36,805,299 ops/sec BatchPercent=62.61% AverageBatchSize=2

Run 2, Disruptor=69,348,127 ops/sec BatchPercent=86.93% AverageBatchSize=7

Run 3, Disruptor=69,396,252 ops/sec BatchPercent=87.21% AverageBatchSize=7

Run 4, Disruptor=67,430,883 ops/sec BatchPercent=86.10% AverageBatchSize=7

Run 5, Disruptor=69,108,500 ops/sec BatchPercent=86.49% AverageBatchSize=7

Run 6, Disruptor=66,979,236 ops/sec BatchPercent=86.42% AverageBatchSize=7

2. Disruptor message-handling modes

2.1 Multicast (broadcast) events

Most of the official getting-started examples use this mode, i.e. Disruptor.handleEventsWith(EventHandler... handlers). Internally this builds a BatchEventProcessor for each handler, each running on its own thread, which takes events off the ring buffer and invokes its EventHandler.

Calling Disruptor.handleEventsWith() multiple times likewise adds more BatchEventProcessor consumer threads. This mode is a broadcast: every BatchEventProcessor receives every published event.

```java
// Construct the Disruptor
Disruptor<LongEvent> disruptor = new Disruptor<>(factory, bufferSize, DaemonThreadFactory.INSTANCE);

// Connect the handler
disruptor.handleEventsWith(new LongEventHandler());
```

The multiple EventProcessor consumers do not compete for each event; it is a broadcast. If the EventHandler is time-consuming, this mode is of little help on its own, and you would still have to start a thread pool to process asynchronously.
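To make the broadcast semantics concrete, here is a hypothetical JDK-only sketch (not Disruptor's implementation, which uses sequences over a single ring buffer): every handler receives every event, modelled naively by giving each handler its own queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicLong;

public class BroadcastSketch {
    // Returns how many events each handler observed.
    public static long[] broadcast(int events, int handlers) throws Exception {
        List<BlockingQueue<Long>> queues = new ArrayList<>();
        AtomicLong[] counts = new AtomicLong[handlers];
        CountDownLatch done = new CountDownLatch(handlers);
        ExecutorService pool = Executors.newFixedThreadPool(handlers);
        for (int h = 0; h < handlers; h++) {
            BlockingQueue<Long> q = new LinkedBlockingQueue<>();
            queues.add(q);
            AtomicLong count = counts[h] = new AtomicLong();
            pool.submit(() -> {
                try {
                    for (int i = 0; i < events; i++) { q.take(); count.incrementAndGet(); }
                    done.countDown();
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });
        }
        for (long i = 0; i < events; i++) {
            for (BlockingQueue<Long> q : queues) q.put(i); // publish to every handler
        }
        done.await();
        pool.shutdown();
        long[] result = new long[handlers];
        for (int h = 0; h < handlers; h++) result[h] = counts[h].get();
        return result;
    }

    public static void main(String[] args) throws Exception {
        for (long seen : broadcast(1000, 3)) System.out.println(seen); // each handler saw all 1000
    }
}
```

Contrast this with the work-pool sketch in the next section, where each event is consumed exactly once.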

2.2 Worker pool mode

This means calling Disruptor.handleEventsWithWorkerPool: each WorkHandler runs on its own thread and handles only the events assigned to it, which is essentially the familiar thread-pool usage pattern.

```java
public final EventHandlerGroup<T> handleEventsWithWorkerPool(final WorkHandler<T>... workHandlers)
{
    return createWorkerPool(new Sequence[0], workHandlers);
}
```

There are several official examples of this mode, with throughput in the millions of ops/sec; when the WorkHandler is slow, the result is not much different from an ordinary thread pool.
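The work-pool contract can be sketched in plain JDK terms as well (a hypothetical illustration, not Disruptor's WorkerPool): several workers compete on a single queue and each event is handled by exactly one of them.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicLong;

public class WorkPoolSketch {
    // Returns the total number of events handled; each event is taken exactly once.
    public static long consumeOnce(int events, int workers) throws Exception {
        BlockingQueue<Long> queue = new LinkedBlockingQueue<>();
        AtomicLong handled = new AtomicLong();
        CountDownLatch done = new CountDownLatch(events);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            pool.submit(() -> {
                try {
                    while (true) { queue.take(); handled.incrementAndGet(); done.countDown(); }
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });
        }
        for (long i = 0; i < events; i++) queue.put(i);
        done.await();
        pool.shutdownNow(); // interrupt workers blocked in take()
        return handled.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(consumeOnce(10_000, 3)); // 10000: every event handled exactly once
    }
}
```

This is the mode whose semantics match a conventional worker pool, which is why its benchmark numbers above look so similar to the BlockingQueue case.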

https://github.com/zealzeng/fabric-samples/blob/master/disruptor-demo/src/main/java/com/lmax/disruptor/workhandler/OneToThreeWorkerPoolThroughputTest.java

Starting Disruptor tests

Run 0, Disruptor=4,349,906 ops/sec BatchPercent=0.00% AverageBatchSize=-1

Run 1, Disruptor=4,591,579 ops/sec BatchPercent=0.00% AverageBatchSize=-1

Run 2, Disruptor=4,590,946 ops/sec BatchPercent=0.00% AverageBatchSize=-1

Run 3, Disruptor=4,662,222 ops/sec BatchPercent=0.00% AverageBatchSize=-1

Run 4, Disruptor=4,695,276 ops/sec BatchPercent=0.00% AverageBatchSize=-1

Run 5, Disruptor=4,690,211 ops/sec BatchPercent=0.00% AverageBatchSize=-1

Run 6, Disruptor=4,713,201 ops/sec BatchPercent=0.00% AverageBatchSize=-1

3. Disruptor use cases

Consider some frameworks that use Disruptor.

3.1 log4j2

Log4j2's asynchronous logging uses Disruptor. Logging normally goes through a buffer that is flushed to file only when full, and incremental appends combined with NIO should be fast. So whether the work runs as an EventHandler or a WorkHandler, its latency is low and only a few files are written, which makes the scenario a good fit.
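For reference, Log4j2's asynchronous loggers can be enabled globally with the system property `-DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector`, or selectively in log4j2.xml (both require the Disruptor jar on the classpath). A minimal mixed configuration, with hypothetical logger names and file paths, might look like:

```xml
<Configuration>
  <Appenders>
    <File name="file" fileName="app.log">
      <PatternLayout pattern="%d %p %c - %m%n"/>
    </File>
  </Appenders>
  <Loggers>
    <!-- events for this logger are handed to a Disruptor ring buffer -->
    <AsyncLogger name="com.example" level="info">
      <AppenderRef ref="file"/>
    </AsyncLogger>
    <Root level="warn">
      <AppenderRef ref="file"/>
    </Root>
  </Loggers>
</Configuration>
```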

3.2 Jstorm

Stream processing exchanges data between threads, much of the computation happens in memory, and data flows in and out quickly, so Disruptor should be a good choice there.

3.3 Baidu uid-generator

It caches pre-generated UIDs in a ring buffer, using tricks such as padding against false sharing, and appears to borrow in part from Disruptor's design.
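The core idea can be sketched in a few lines (a hypothetical single-threaded illustration; uid-generator's real RingBuffer adds padded sequence fields to avoid false sharing and concurrent cursors, all omitted here): IDs are pre-generated into a fixed-size ring so that handing one out is just an array read.

```java
import java.util.concurrent.atomic.AtomicLong;

public class UidRingSketch {
    private final long[] slots;
    private long head, tail;                     // head = next read, tail = next write
    private final AtomicLong generator = new AtomicLong(1L); // stand-in for real uid generation

    public UidRingSketch(int size) { slots = new long[size]; }

    public void fill() {                         // pre-generate until the ring is full
        while (tail - head < slots.length) {
            slots[(int) (tail % slots.length)] = generator.getAndIncrement();
            tail++;
        }
    }

    public long take() {                         // hand out one cached uid
        if (head == tail) fill();                // refill lazily when drained
        return slots[(int) (head++ % slots.length)];
    }

    public static void main(String[] args) {
        UidRingSketch ring = new UidRingSketch(8);
        ring.fill();
        System.out.println(ring.take());         // 1
        System.out.println(ring.take());         // 2
    }
}
```

Because generation happens ahead of time in batches, the fast path never touches the (comparatively slow) generator, which is the same "exchange fast, work elsewhere" pattern discussed throughout this article.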

3.4 Summary

For luoying-server, using Disruptor as the worker pool behind the event loop would not improve performance: the server's business logic involves database queries and similar operations, and Disruptor only makes the data exchange fast; if the business logic is slow, the whole pipeline is still slow.

It is like sending a contract by courier to the northeast: some couriers are indeed fast and arrive in two days, others take three or four, but the courier only delivers the document into the recipient's hands. If the recipient needs a week or two to process the contract, a faster courier does not solve that problem.

Scenarios that need both fast data exchange between threads and fast processing of that data are rare in my line of work; Disruptor probably finds more use in big data and middleware development.

I have not studied Disruptor in depth, but its source code seems small; please point out any mistakes. In any case the ring buffer, sequencer and related code are worth studying.

Originally published 2019-07-11 on the WeChat public account Hyperledger实践; syndicated via the Tencent Cloud self-media program.
