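I start two consumers in the same consumer group and subscribe to 20 topics (each topic has only one partition).
Only one consumer gets used:
kafka-consumer-groups --bootstrap-server XXXXX:9092 --group foo --describe --members --verbose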
Note: This will not show information about old Zookeeper-based consumers.
CONSUMER-ID HOST CLIENT-ID #PARTITIONS ASSIGNMENT
rdkafka-07cbd673-6a16-4d55-9625-7f0925866540 /xxxxx rdkafka 20 arretsBus(0), capteurMeteo(0), capteurPointMesure(0), chantier(0), coworking(0), horodateur(0), incident(0), livraison(0), meteo(0), metro(0), parkrelais(0), qair(0), rhdata(0), sensUnique(0), trafic(0), tramway(0), tweets(0), voieRapide(0), zone30(0), zoneRencontre(0)
rdkafka-9a543197-6c97-4213-bd59-cb5a48e4ec15 /xxxx rdkafka 0
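What am I doing wrong?
Posted on 2018-08-06 01:51:21
OK, I read up a bit on this behavior, and it is interesting to learn why it happens. Kafka has two partition assignment strategies.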
Range: Assigns to each consumer a consecutive subset of partitions from each topic it subscribes to. So if consumers C1 and C2 are subscribed to two topics, T1 and T2, and each of the topics has three partitions, then C1 will be assigned partitions 0 and 1 from topics T1 and T2, while C2 will be assigned partition 2 from those topics. Because each topic has an uneven number of partitions and the assignment is done for each topic independently, the first consumer ends up with more partitions than the second. This happens whenever Range assignment is used and the number of consumers does not divide the number of partitions in each topic neatly.
RoundRobin: Takes all the partitions from all subscribed topics and assigns them to consumers sequentially, one by one. If C1 and C2 described previously used RoundRobin assignment, C1 would have partitions 0 and 2 from topic T1 and partition 1 from topic T2. C2 would have partition 1 from topic T1 and partitions 0 and 2 from topic T2. In general, if all consumers are subscribed to the same topics (a very common scenario), RoundRobin assignment will end up with all consumers having the same number of partitions (or at most 1 partition difference).
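The default strategy is Range, which explains the partition distribution you are seeing: Range handles each topic independently, and with a single partition per topic the first consumer gets partition 0 of every topic, so the second consumer is left with nothing.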
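To make the difference concrete, here is a small simulation of the two strategies as described above. It is a simplified sketch, not Kafka's actual assignor code: the function names, the consumer IDs c1 and c2, and the topic names topic01 to topic20 are all made up for illustration, and consumers are simply ordered by their IDs.

```python
from itertools import cycle


def range_assign(subscriptions, partitions_per_topic):
    """Range-style: split each topic's partitions independently among sorted consumers."""
    assignment = {c: [] for c in subscriptions}
    for topic in sorted(partitions_per_topic):
        consumers = sorted(c for c, topics in subscriptions.items() if topic in topics)
        if not consumers:
            continue
        # The first `extra` consumers each receive one additional partition.
        base, extra = divmod(partitions_per_topic[topic], len(consumers))
        start = 0
        for i, c in enumerate(consumers):
            count = base + (1 if i < extra else 0)
            assignment[c].extend((topic, p) for p in range(start, start + count))
            start += count
    return assignment


def round_robin_assign(subscriptions, partitions_per_topic):
    """RoundRobin-style: deal all partitions of all topics out one by one."""
    assignment = {c: [] for c in subscriptions}
    consumers = cycle(sorted(subscriptions))
    for topic in sorted(partitions_per_topic):
        if not any(topic in topics for topics in subscriptions.values()):
            continue
        for p in range(partitions_per_topic[topic]):
            c = next(consumers)
            # Skip consumers that are not subscribed to this topic.
            while topic not in subscriptions[c]:
                c = next(consumers)
            assignment[c].append((topic, p))
    return assignment


# Hypothetical stand-in for the question: 20 topics, one partition each.
topics = [f"topic{i:02d}" for i in range(1, 21)]
parts = {t: 1 for t in topics}
subs = {"c1": set(topics), "c2": set(topics)}

print({c: len(p) for c, p in range_assign(subs, parts).items()})
# {'c1': 20, 'c2': 0}
print({c: len(p) for c, p in round_robin_assign(subs, parts).items()})
# {'c1': 10, 'c2': 10}
```

With one partition per topic, the Range-style split hands every topic's only partition to the first consumer, while the RoundRobin-style deal alternates between the two.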
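So I ran a small experiment. I created two console consumers, each listening to the topics test1, test2, test3, test4, with each topic having only one partition. As expected, all partitions were assigned to consumer-1.
Then I changed the partition assignment strategy to org.apache.kafka.clients.consumer.RoundRobinAssignor and passed it to the console consumer, and voila, each of the two consumers now gets 2 partitions.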
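The consumers in the question identify themselves as rdkafka, and librdkafka-based clients take short strategy names such as range and roundrobin for partition.assignment.strategy rather than a Java class name. Below is a minimal sketch of the equivalent setting, assuming the confluent-kafka Python client (the question does not say which binding is used); build_consumer_config is a hypothetical helper, and the broker address and group name are taken from the question.

```python
def build_consumer_config(bootstrap_servers, group_id):
    """Return a librdkafka-style config dict that asks for round-robin assignment."""
    return {
        "bootstrap.servers": bootstrap_servers,
        "group.id": group_id,
        # librdkafka uses short names here; the Java client would take
        # org.apache.kafka.clients.consumer.RoundRobinAssignor instead.
        "partition.assignment.strategy": "roundrobin",
    }


config = build_consumer_config("XXXXX:9092", "foo")
# Assumption: with the confluent-kafka package installed, this dict would be
# passed as confluent_kafka.Consumer(config) before subscribing to the topics.
print(config["partition.assignment.strategy"])
```

The members of a group need to share a strategy they all support, so the setting should be changed on both consumers.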
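Update: Oops, I did not see that this had already been answered a few minutes earlier.
Posted on 2018-08-06 01:26:32
In Kafka, a topic partition can be consumed by at most one consumer within a consumer group, to avoid race conditions between consumers.
Posted on 2018-08-06 01:33:56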
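In Apache Kafka, the number of partitions defines the level of parallelism you can get within a consumer group: two consumers that are part of the same group cannot read from the same partition. In your example, each single-partition topic will be assigned to only one consumer, while the other stays idle waiting for a rebalance; that is, if the first consumer disconnects, the second moves from idle to consuming the partition. If you expected each consumer to get 10 topics, that is not how Apache Kafka works. As I said, the unit of parallelism is the partition within a topic, not the topic itself.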
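The idle-then-takeover behavior described in this answer can be sketched for a single topic as follows. This is a toy model of a Range-style split, not Kafka's group coordinator; range_assign_topic and the member names c1 and c2 are hypothetical.

```python
def range_assign_topic(num_partitions, members):
    """Range-style split of one topic's partitions among the sorted group members."""
    members = sorted(members)
    base, extra = divmod(num_partitions, len(members))
    assignment = {}
    start = 0
    for i, member in enumerate(members):
        count = base + (1 if i < extra else 0)
        assignment[member] = list(range(start, start + count))
        start += count
    return assignment


# Two consumers, one partition: the second member has nothing to read.
before = range_assign_topic(1, ["c1", "c2"])
print(before)  # {'c1': [0], 'c2': []}

# c1 disconnects, the group rebalances, and c2 takes over partition 0.
after = range_assign_topic(1, ["c2"])
print(after)  # {'c2': [0]}
```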
https://stackoverflow.com/questions/51703503