In messaging scenarios, messages may fail to send, accumulate beyond expectations, or fail to be consumed normally. To address these issues, TDMQ for Pulsar provides message retry and dead letter mechanisms.
Automatic Retry
An automatic retry topic is designed to ensure that messages are consumed normally. If no normal response is received after a message is consumed by the consumer for the first time, it will enter the retry topic, and after a specified number of retries, it will be delivered to the dead letter topic.
When a message enters the dead letter queue, TDMQ for Pulsar can no longer process it automatically, and manual intervention is typically required. You can write a dedicated client that subscribes to the dead letter topic and processes the messages that previously failed.
Relevant Concepts
Retry topic: A retry topic corresponds to a subscription name (the unique identifier of a subscriber group) and exists as a topic in TDMQ for Pulsar. When you create a subscription in the console and turn on Automatically Create Retry & Dead Letter Queues, the system automatically creates a retry topic, which implements the message retry mechanism.
The retry topic is named as follows:
2.9.2 Version Cluster: [Topic Name]-[Subscription Name]-RETRY
2.7.2 Version Cluster: [Subscription Name]-RETRY
2.6.1 Version Cluster: [Subscription Name]-retry
How It Works
Once you create a consumer that subscribes to a topic using a specific subscription name in shared mode, if the enableRetry property is enabled, it will automatically subscribe to the retry queue corresponding to that subscription name.
When consumption fails and the consumer calls the consumer.reconsumeLater API, the client internally checks the retry count of the message. If the count has reached the specified maximum number of retries, the message is delivered to the dead letter queue (messages in the dead letter queue are not consumed automatically; if needed, users can create additional consumers to consume them). Otherwise, the message is delivered to the retry queue. The retry interval is implemented through delayed messages: what is actually delivered to the retry queue is a delayed message whose delay is the time the user specified in reconsumeLater.
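The routing decision described above can be modeled in a few lines. This is a minimal sketch in plain Java, not the SDK's internal implementation: the class `RetryRouter`, the enum `Destination`, and the method `route` are hypothetical names, and the boundary condition (a message that has already been retried the maximum number of times goes to the dead letter queue on the next failure) is an assumption based on the description on this page.

```java
/** Hypothetical model of the check that the client performs inside reconsumeLater. */
final class RetryRouter {
    enum Destination { RETRY_TOPIC, DEAD_LETTER_TOPIC }

    private final int maxRetries;

    RetryRouter(int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must not be negative");
        }
        this.maxRetries = maxRetries;
    }

    /**
     * Decides where a failed message goes.
     *
     * @param retriesSoFar how many times the message has already been retried
     */
    Destination route(int retriesSoFar) {
        // Assumption: once the retry budget is used up, the next failure goes to the DLQ.
        return retriesSoFar >= maxRetries
                ? Destination.DEAD_LETTER_TOPIC
                : Destination.RETRY_TOPIC;
    }

    public static void main(String[] args) {
        RetryRouter router = new RetryRouter(16);
        System.out.println(router.route(0));   // first failure of the original message
        System.out.println(router.route(16));  // failure after the 16th retry
    }
}
```

With a maximum of 16, the first call in `main` models a fresh message and the second models a message whose retry budget is exhausted.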
Description
Only the shared mode (including key_shared) supports the automatic retry and dead letter mechanism.
If the subscription mode is Exclusive or Failover, the specified retry time interval is invalid, and it will retry immediately. This is because the retry interval is implemented through the delayed message feature, which is not supported in Exclusive or Failover modes.
The client version needs to be consistent with the cluster version so that the client can correctly identify the automatically created retry and dead letter queues.
When accessing retry/dead letter queues using a token, you need to grant the role used by the consumer the permission to produce messages.
Here's an example using the Java language client: Assuming a subscription sub1 is created for topic1, the client subscribes to topic1 using the subscription name sub1 and enables enableRetry, as shown below:
Consumer<byte[]> consumer = client.newConsumer()
    .topic("persistent://1******30/my-ns/topic1")
    // Only the shared consumer mode supports retry and dead letter.
    .subscriptionType(SubscriptionType.Shared)
    .enableRetry(true)
    .subscriptionName("sub1")
    .subscribe();
At this point, subscription sub1 on topic1 has a delivery pattern with a retry mechanism. The consumer automatically subscribes to the retry topic (visible in the Topic list in the console) that was created together with the subscription. When the consumer fails to process a message from topic1 and calls reconsumeLater, the message is delivered to the retry topic. Because the consumer is subscribed to this topic, the message is consumed again later according to the retry rules. If the message still fails after the maximum number of retries, it is delivered to the corresponding dead letter queue to wait for manual processing.
Description
If the subscription is automatically created by the client, you can go to Topic Management > More > View Subscription in the console to manually rebuild the retry and dead letter queues on the consumption management page.
Custom Parameter Settings
TDMQ for Pulsar configures a set of retry and dead letter parameters by default, as follows:
Clusters On V2.9.2
Specifies the maximum retry count as 16 times (after 16 failures, the message will be delivered to the dead letter queue on the 17th attempt).
Specifies the retry queue as [Topic Name]-[Subscription Name]-RETRY.
Specifies the dead letter queue as [Topic Name]-[Subscription Name]-DLQ.
Clusters On V2.7.2
Specifies the maximum retry count as 16 times (after 16 failures, the message will be delivered to the dead letter queue on the 17th attempt).
Specifies the retry queue as [Subscription Name]-RETRY.
Specifies the dead letter queue as [Subscription Name]-DLQ.
Clusters On V2.6.1
Specifies the maximum retry count as 16 times (after 16 failures, the message will be delivered to the dead letter queue on the 17th attempt).
Specifies the retry queue as [Subscription Name]-retry.
Specifies the dead letter queue as [Subscription Name]-dlq.
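When writing tooling or log analysis around these defaults, it can help to compute the expected queue names per cluster version. The helper below is a small sketch in plain Java that only encodes the naming rules listed above; the class `RetryTopicNames`, the enum `ClusterVersion`, and both method names are hypothetical, and the `topic` argument is assumed to be the short topic name (for example `topic1`), not the full `persistent://` address.

```java
/** Hypothetical helper that derives the default retry and dead letter queue names. */
final class RetryTopicNames {
    enum ClusterVersion { V2_9_2, V2_7_2, V2_6_1 }

    /** Default retry queue name for the given cluster version. */
    static String retryTopic(ClusterVersion version, String topic, String subscription) {
        switch (version) {
            case V2_9_2: return topic + "-" + subscription + "-RETRY";
            case V2_7_2: return subscription + "-RETRY";
            case V2_6_1: return subscription + "-retry";
            default: throw new IllegalArgumentException("unknown version: " + version);
        }
    }

    /** Default dead letter queue name for the given cluster version. */
    static String deadLetterTopic(ClusterVersion version, String topic, String subscription) {
        switch (version) {
            case V2_9_2: return topic + "-" + subscription + "-DLQ";
            case V2_7_2: return subscription + "-DLQ";
            case V2_6_1: return subscription + "-dlq";
            default: throw new IllegalArgumentException("unknown version: " + version);
        }
    }

    public static void main(String[] args) {
        // prints "topic1-sub1-RETRY"
        System.out.println(retryTopic(ClusterVersion.V2_9_2, "topic1", "sub1"));
        // prints "sub1-dlq"
        System.out.println(deadLetterTopic(ClusterVersion.V2_6_1, "topic1", "sub1"));
    }
}
```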
To customize these parameters, configure a DeadLetterPolicy through the deadLetterPolicy API when you create the consumer.
The reconsumeLater API supports the following modes for specifying the retry delay.
The first mode: Specify an arbitrary delay time. The second parameter specifies the delay time, and the third parameter specifies the time unit. The delay time range is the same as that of delayed messages: 1 to 864,000 seconds.
The second mode: Specify any delay level (only available to users of existing Cloud SDK). The effect is basically the same as the first mode, making it more convenient to manage the delay duration in a distributed system. The delay level is explained as follows:
1.1 The second parameter in reconsumeLater(msg, 1) is the message level.
1.2 By default, MESSAGE_DELAYLEVEL = "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h". This constant determines the delay time corresponding to each level. For example, level 1 corresponds to 1s, and level 3 corresponds to 10s. If the default value does not meet actual business requirements, users can redefine it.
The third mode: Level increment (only available to users of existing Cloud SDK). Unlike the two modes above, this mode implements backoff-style retrying, where the interval between retries increases progressively. After the first failure the retry interval is 1 second, after the second failure it is 5 seconds, and so on: the more failures, the longer the interval. The specific intervals are determined by the MESSAGE_DELAYLEVEL constant introduced in the second mode.
This retry mechanism is often more practical in business scenarios. When consumption fails, the service generally does not recover immediately, so retrying at progressively longer intervals is more reasonable.
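To make the relationship between a delay level and its duration concrete, the sketch below parses a level string in the MESSAGE_DELAYLEVEL format and looks up a level by its 1-based position. It is plain Java written for illustration and is not the SDK's parser: the class `DelayLevels` and its methods are hypothetical, and support for only the `s`, `m`, and `h` suffixes is an assumption based on the default value shown above.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser for a delay-level string such as "1s 5s 10s 30s 1m". */
final class DelayLevels {
    static final String DEFAULT_LEVELS =
            "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h";

    /** Splits the spec on whitespace and converts each token to a Duration. */
    static List<Duration> parse(String spec) {
        List<Duration> levels = new ArrayList<>();
        for (String token : spec.trim().split("\\s+")) {
            long amount = Long.parseLong(token.substring(0, token.length() - 1));
            char unit = token.charAt(token.length() - 1);
            switch (unit) {
                case 's': levels.add(Duration.ofSeconds(amount)); break;
                case 'm': levels.add(Duration.ofMinutes(amount)); break;
                case 'h': levels.add(Duration.ofHours(amount)); break;
                default: throw new IllegalArgumentException("unsupported unit in: " + token);
            }
        }
        return levels;
    }

    /** Returns the delay for a 1-based level, so level 1 is the first entry. */
    static Duration delayForLevel(String spec, int level) {
        List<Duration> levels = parse(spec);
        if (level < 1 || level > levels.size()) {
            throw new IllegalArgumentException("level out of range: " + level);
        }
        return levels.get(level - 1);
    }

    public static void main(String[] args) {
        // Level 3 of the default string is the third token, "10s".
        System.out.println(delayForLevel(DEFAULT_LEVELS, 3).getSeconds()); // prints 10
    }
}
```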
Notes:
If you are using the Pulsar community SDK, the delay level and level increment modes are not supported.
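Where the delay level and level increment modes are unavailable, a similar backoff could be approximated in application code by computing the delay and passing it to the arbitrary-delay form of reconsumeLater described in the first mode. The sketch below is plain Java and does not call the SDK. The class `LevelIncrementBackoff`, its methods, and the choice to clamp at the last level are assumptions made for illustration; the table simply restates the default MESSAGE_DELAYLEVEL values in seconds.

```java
/** Hypothetical application-side emulation of level-increment backoff. */
final class LevelIncrementBackoff {
    // Default levels "1s 5s 10s 30s 1m 2m 3m 4m 5m 6m 7m 8m 9m 10m 20m 30m 1h 2h" in seconds.
    private static final long[] DELAY_SECONDS = {
        1, 5, 10, 30, 60, 120, 180, 240, 300, 360, 420, 480, 540, 600, 1200, 1800, 3600, 7200
    };

    /**
     * Delay before the next retry.
     *
     * @param failureCount 1 for the first failure, 2 for the second, and so on
     */
    static long delaySecondsAfterFailure(int failureCount) {
        if (failureCount < 1) {
            throw new IllegalArgumentException("failureCount must be at least 1");
        }
        // Assumption: failures beyond the last level keep using the longest delay.
        int index = Math.min(failureCount, DELAY_SECONDS.length) - 1;
        return DELAY_SECONDS[index];
    }

    /** Sum of the delays for the first {@code retries} retries. */
    static long totalDelaySeconds(int retries) {
        long total = 0;
        for (int failure = 1; failure <= retries; failure++) {
            total += delaySecondsAfterFailure(failure);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(delaySecondsAfterFailure(1)); // prints 1
        System.out.println(delaySecondsAfterFailure(2)); // prints 5
    }
}
```

The returned value could then be used as the delay argument, for example `reconsumeLater(msg, delaySeconds, TimeUnit.SECONDS)`, with the failure count derived from the RECONSUMETIMES property plus one.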
Message Properties Of Retried Messages:
A retried message will have the following properties attached to it:
ORIGIN_MESSAGE_ID: The ID of the message that was originally produced.
RETRY_TOPIC: Retry topic.
RECONSUMETIMES: Represents the number of times the message is retried.
Method To Get Retry Count
msg.getProperties().get("RECONSUMETIMES")
Notes:
The retry count returned by the msg.getRedeliveryCount() API applies to the negativeAcknowledge retry method, not to retries made through reconsumeLater.
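Because message properties are string key-value pairs, the value read from RECONSUMETIMES usually needs to be parsed before it can be compared with a limit. The following sketch shows one defensive way to do that in plain Java against an ordinary `Map<String, String>`, which stands in for the map returned by `msg.getProperties()`. The class `RetryCountReader` and the decision to treat a missing or malformed value as 0 are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper that reads the retry count from a message's properties. */
final class RetryCountReader {
    static final String RECONSUME_TIMES_KEY = "RECONSUMETIMES";

    /** Returns the retry count, or 0 when the property is absent or not a number. */
    static int retryCount(Map<String, String> properties) {
        String raw = properties.get(RECONSUME_TIMES_KEY);
        if (raw == null) {
            return 0; // the message has not gone through reconsumeLater yet
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return 0; // assumption: treat a malformed value as "not retried"
        }
    }

    public static void main(String[] args) {
        Map<String, String> properties = new HashMap<>();
        properties.put(RECONSUME_TIMES_KEY, "3");
        System.out.println(retryCount(properties)); // prints 3
    }
}
```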
Flow Of Message ID For Retried Messages
The flow of message IDs is as follows, allowing you to analyze related logs according to this rule.
Original consumption: msgid=1:1:0:1
First retry: msgid=2:1:-1
Second retry: msgid=2:2:-1
Third retry: msgid=2:3:-1
.......
16th retry: msgid=2:16:0:1
17th retry written to dead-letter queue: msgid=3:1:-1
Complete Code Example
For the Retry (-RETRY) topic, the feature must initially be enabled within the consumer via enableRetry(true), as it is disabled by default. Subsequently, messages are only sent to the retry topic when the reconsumeLater() API is called.
For the dead letter (-DLQ) topic, calling consumer.reconsumeLater() is essential. Once reconsumeLater() is executed, the message from the original topic is acknowledged and subsequently redirected to the retry topic. If the maximum retry attempts are exhausted, the message is then redirected to the DLQ. The Pulsar client automatically subscribes to the retry topic, but it does not automatically subscribe to the DLQ topic. Users must manually subscribe to the DLQ topic.
The following sample code implements the complete message retry mechanism with TDMQ for Pulsar, for developers' reference.
Subscription Topic
Consumer<byte[]> consumer1 = client.newConsumer()
    .topic("persistent://pulsar-****")
    .subscriptionName("my-subscription")
    .subscriptionType(SubscriptionType.Shared)
    .enableRetry(true) // Enable retry consumption.
    //.deadLetterPolicy(DeadLetterPolicy.builder()
    //    .maxRedeliverCount(maxRedeliveryCount)
    //    .retryLetterTopic("persistent://my-property/my-ns/my-subscription-retry") // You can specify a retry queue.
    //    .deadLetterTopic("persistent://my-property/my-ns/my-subscription-dlq") // You can specify the dead letter queue.
    //    .build())
    .subscribe();
As an alternative to reconsumeLater, when the client fails to consume a message and wants to consume it again, the consumer can call the negativeAcknowledge API. The message is then delivered again after a period of time, which the consumer can set through the negativeAckRedeliveryDelay configuration.
Brief Description Of Implementation Mechanism
The client caches the IDs of messages that were negatively acknowledged and periodically scans this list (the scan interval is one third of negativeAckRedeliveryDelay). When a message has waited for the specified negativeAckRedeliveryDelay, the client asks the server to redeliver it. After receiving the redelivery request, the server pushes the corresponding messages back to the client.
Notes:
1. The actual time before the message is received again may be up to one third longer than negativeAckRedeliveryDelay, because the client only checks for due messages at each scan.
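The timing described above is simple arithmetic, sketched here in plain Java. This is an illustrative model of the stated rule (scan interval equal to one third of the delay), not the client's actual code; the class `NackTiming` and its method names are hypothetical.

```java
import java.time.Duration;

/** Hypothetical model of negativeAcknowledge redelivery timing. */
final class NackTiming {
    /** The list of negatively acknowledged messages is scanned every delay / 3. */
    static Duration scanInterval(Duration redeliveryDelay) {
        return redeliveryDelay.dividedBy(3);
    }

    /** Latest expected redelivery request: the delay plus one full scan interval. */
    static Duration worstCaseRedelivery(Duration redeliveryDelay) {
        return redeliveryDelay.plus(scanInterval(redeliveryDelay));
    }

    public static void main(String[] args) {
        Duration delay = Duration.ofMinutes(1);
        System.out.println(scanInterval(delay).getSeconds());        // prints 20
        System.out.println(worstCaseRedelivery(delay).getSeconds()); // prints 80
    }
}
```

With the default delay of 1 minute, the model gives a scan every 20 seconds and a redelivery request somewhere between 60 and 80 seconds after the negative acknowledgment.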
while (true) {
    Message<byte[]> msg = consumer.receive();
    try {
        // Process the message here
        // Acknowledge the message so that it can be deleted by the message broker
        consumer.acknowledge(msg);
    } catch (Exception e) {
        // Message failed to process, redeliver later
        consumer.negativeAcknowledge(msg);
    }
}
Notes
1. In the retry mode of negativeAcknowledge, the messages that need to be retried are still seen as unacknowledged on the server-side.
2. In the retry mode of negativeAcknowledge, there is no default maximum number of retries. However, this can be achieved by configuring the maximum number of retries and a dead-letter queue. In this mode, when a message has been retried the specified number of times, it will be delivered to the dead-letter queue.
Consumer<byte[]> consumer = client.newConsumer()
    .topic("persistent://pulsar-****")
    .subscriptionName("my-subscription")
    .subscriptionType(SubscriptionType.Shared)
    // 1 min by default
    .negativeAckRedeliveryDelay(1, TimeUnit.MINUTES)
    .deadLetterPolicy(DeadLetterPolicy.builder()
        .maxRedeliverCount(5) // You can specify the maximum number of retries
        .build())
    .subscribe();

while (true) {
    Message<byte[]> msg = consumer.receive();
    try {
        // Acknowledge the message so that it can be deleted by the message broker
        consumer.acknowledge(msg);
    } catch (Exception e) {
        // Message failed to process, redeliver later
        consumer.negativeAcknowledge(msg);
    }
}
3. In the retry mode of negativeAcknowledge, the retry count can be obtained from msg.getRedeliveryCount(). However, note that if all consumers under a subscription are offline, the retry count of the message will be reset to 0 (a typical scenario: if there is only one consumer under a subscription, after the consumer restarts, the retry count of the message will be reset to 0).