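Note: the Dubbo version is 2.6.2.
Figure 1: Class inheritance diagram of Dubbo's FailbackClusterInvoker
Failback can be understood as recording failed requests in the background and resending them on a timer.
The core code is in FailbackClusterInvoker's doInvoke(Invocation, List<Invoker<T>>, LoadBalance). The source is as follows.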
    @Override
    protected Result doInvoke(Invocation invocation, List<Invoker<T>> invokers, LoadBalance loadbalance) throws RpcException {
        try {
            checkInvokers(invokers, invocation);
            Invoker<T> invoker = select(loadbalance, invocation, invokers, null);
            return invoker.invoke(invocation);
        } catch (Throwable e) {
            logger.error("Failback to invoke method " + invocation.getMethodName() + ", wait for retry in background. Ignored exception: "
                    + e.getMessage() + ", ", e);
            addFailed(invocation, this);
            return new RpcResult(); // ignore
        }
    }
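The pattern in doInvoke (try the call, and on any failure record the request and hand back an empty result) can be sketched in isolation. This is a minimal illustration, not Dubbo code: `FailbackSketch`, its `invoke` method, and the use of a `String` request with a `Function` handler are all hypothetical stand-ins for `Invocation` and `Invoker`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for a failback invoker; none of these names come from Dubbo.
class FailbackSketch {
    // Failed requests awaiting a background retry, mapped to the handler to retry with.
    final Map<String, Function<String, String>> failed = new ConcurrentHashMap<>();

    // Try the call once; on failure, record the request and return an empty result.
    String invoke(String request, Function<String, String> handler) {
        try {
            return handler.apply(request);
        } catch (RuntimeException e) {
            failed.put(request, handler);
            return ""; // the caller receives an empty result instead of the exception
        }
    }
}
```

The consequence for callers is that a failed call looks like a successful call with an empty result, so this style suits requests whose return value the caller does not depend on.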
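Next, let's look at how addFailed is implemented.
The source of addFailed(invocation, this) is shown below. On the first failure it lazily starts the scheduled retry task (using double-checked locking), and then puts the invocation and the router into failed (failed is a ConcurrentHashMap).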
    private void addFailed(Invocation invocation, AbstractClusterInvoker<?> router) {
        if (retryFuture == null) {
            synchronized (this) {
                if (retryFuture == null) {
                    retryFuture = scheduledExecutorService.scheduleWithFixedDelay(new Runnable() {
                        @Override
                        public void run() {
                            // collect retry statistics
                            try {
                                retryFailed();
                            } catch (Throwable t) { // Defensive fault tolerance
                                logger.error("Unexpected error occur at collect statistic", t);
                            }
                        }
                    }, RETRY_FAILED_PERIOD, RETRY_FAILED_PERIOD, TimeUnit.MILLISECONDS);
                }
            }
        }
        failed.put(invocation, router);
    }
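The lazy start in addFailed can be sketched on its own as follows. The names `LazyRetryScheduler`, `ensureStarted`, and the `starts` counter are hypothetical and exist only for illustration. The sketch assumes the future field is declared `volatile`, which double-checked locking needs in order to be safe. It also wraps the task in a try/catch, because a `ScheduledExecutorService` suppresses all later runs of a periodic task once one run throws.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper showing a lazily started periodic retry task.
class LazyRetryScheduler {
    private final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
    // volatile so the unsynchronized first check never sees a half-published value
    private volatile ScheduledFuture<?> retryFuture;
    // Counts how many times the task was actually scheduled (illustration only).
    final AtomicInteger starts = new AtomicInteger();

    void ensureStarted(Runnable retryTask, long periodMillis) {
        if (retryFuture == null) {              // fast path: no lock once started
            synchronized (this) {
                if (retryFuture == null) {      // re-check under the lock
                    starts.incrementAndGet();
                    retryFuture = executor.scheduleWithFixedDelay(() -> {
                        try {
                            retryTask.run();
                        } catch (Throwable t) {
                            // swallow: an escaping exception would cancel all future runs
                        }
                    }, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
                }
            }
        }
    }

    void shutdown() {
        executor.shutdownNow();
    }
}
```

However many threads hit a failure at the same moment, only one periodic task is created, and no scheduler work happens at all until the first failure occurs.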
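The source of retryFailed() is shown below. It iterates over a snapshot copy of failed and re-invokes each entry. An entry that succeeds is removed from failed. If an entry throws, the exception is only logged at error level: it is not rethrown and does not interrupt the entries that follow.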
    void retryFailed() {
        if (failed.size() == 0) {
            return;
        }
        for (Map.Entry<Invocation, AbstractClusterInvoker<?>> entry : new HashMap<Invocation, AbstractClusterInvoker<?>>(
                failed).entrySet()) {
            Invocation invocation = entry.getKey();
            Invoker<?> invoker = entry.getValue();
            try {
                invoker.invoke(invocation);
                failed.remove(invocation);
            } catch (Throwable e) {
                logger.error("Failed retry to invoke method " + invocation.getMethodName() + ", waiting again.", e);
            }
        }
    }
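A single retry pass of this kind can be sketched in a self-contained form. `RetryPass` and `retryAll` are hypothetical names, and a `Consumer<String>` that throws on failure stands in for the invoker. This is an illustration of the loop's behavior, not Dubbo's implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical helper: one retry pass over the recorded failures.
class RetryPass {
    // Retries every recorded entry once and returns how many succeeded.
    static int retryAll(Map<String, Consumer<String>> failed) {
        int succeeded = 0;
        // Iterate over a copy so that entries added during the pass wait for the next pass.
        for (Map.Entry<String, Consumer<String>> entry
                : new HashMap<String, Consumer<String>>(failed).entrySet()) {
            try {
                entry.getValue().accept(entry.getKey());
                failed.remove(entry.getKey()); // success: stop tracking this request
                succeeded++;
            } catch (RuntimeException e) {
                // failure: keep the entry for the next pass and continue with the rest
            }
        }
        return succeeded;
    }
}
```

With three recorded requests where only the second still fails, the pass removes the first and third and leaves the second in the map for the next round.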
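Suppose the scheduled task runs once every 10s and has already run at 0s. If requests A, B, and C fail between 0s and 10s, then at the 10s mark the task starts re-invoking A, B, and C.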
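The timing in this example can be written as simple arithmetic. The helper below is hypothetical and assumes that retry passes start at exact multiples of the period and take negligible time. With scheduleWithFixedDelay the delay is measured from the end of the previous run, so real ticks drift later by however long each pass takes.

```java
// Hypothetical helper: when does a failure get its first retry under the assumptions above?
class RetryTimeline {
    // Returns the first tick strictly after the failure time, with ticks at period, 2*period, ...
    static long nextRetryAtMillis(long failedAtMillis, long periodMillis) {
        return (failedAtMillis / periodMillis + 1) * periodMillis;
    }
}
```

For a 10s period, failures at 2s, 5s, and 9s all map to the 10s tick, while a failure at 12s waits for the 20s tick.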
The key point is that failed requests are recorded and then resent on a timer.