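It is well known that in HBase, flushing one column family causes every other column family in the same region to be flushed along with it. The official HBase-0.94 book (http://hbase.apache.org/0.94/book/number.of.cfs.html) also states explicitly that HBase does not do well with more than two or three column families.
HBase-2.0.0 and HBase-1.1.0 (https://issues.apache.org/jira/browse/HBASE-10201) introduced FlushLargeStoresPolicy to address this problem.
The implementation of FlushLargeStoresPolicy is very simple: before flushing, it checks the memstore size of each Store and selects a Store for flushing only when that size exceeds a configured lower bound. (Note: size is not the only criterion. A Store below the lower bound can still be selected; see the implementation of shouldFlushStore() in the HRegion class.)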
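The size check described above can be illustrated with a small stand-alone sketch. It is not HBase code: the class FlushSelectionSketch, the methods selectAll and selectLarge, the column family names, and the 16 MB lower bound are all made up for illustration, and a plain map of memstore sizes stands in for the real Store objects. It only contrasts "flush everything" with "flush only the stores above a lower bound".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FlushSelectionSketch {

    // Hypothetical stand-in for the flush-all behavior: every column family is selected.
    static List<String> selectAll(Map<String, Long> memstoreSizes) {
        return new ArrayList<>(memstoreSizes.keySet());
    }

    // Hypothetical stand-in for the size check only: a column family is selected
    // when its memstore size is strictly greater than the lower bound.
    static List<String> selectLarge(Map<String, Long> memstoreSizes, long lowerBound) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Long> entry : memstoreSizes.entrySet()) {
            if (entry.getValue() > lowerBound) {
                selected.add(entry.getKey());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Assumed example data: one busy column family and one nearly idle one.
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("cf_big", 100L * 1024 * 1024);
        sizes.put("cf_small", 2L * 1024 * 1024);
        long lowerBound = 16L * 1024 * 1024; // assumed threshold, not an HBase default

        System.out.println("flush-all:   " + selectAll(sizes));
        System.out.println("flush-large: " + selectLarge(sizes, lowerBound));
    }
}
```

With this sample data the first line lists both column families and the second lists only cf_big, so the nearly empty cf_small memstore is not written out as a tiny file.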
Related classes (previously FlushAllStoresPolicy was the only flush policy, meaning that flushing one column family also flushed all the others):
Flush process:
Related source code:
// A flush policy decides which stores (column families) of a region get flushed.
public abstract class FlushPolicy {
    protected HRegion region;

    protected void configureForRegion(HRegion region) {
        this.region = region;
    }

    public abstract Collection<Store> selectStoresToFlush();
}
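// Excerpt: the declaration of flushSizeLowerBound is not shown here.
public class FlushLargeStoresPolicy extends FlushPolicy {
    private boolean shouldFlush(Store store) {
        // Size check: the memstore of this store exceeds the lower bound.
        if (store.getMemStoreSize() > this.flushSizeLowerBound) {
            return true;
        }
        // Note the following line: a store below the lower bound can still be selected.
        return region.shouldFlushStore(store);
    }

    @Override
    public Collection<Store> selectStoresToFlush() {
        Collection<Store> stores = region.stores.values();
        Set<Store> specificStoresToFlush = new HashSet<Store>();
        for (Store store : stores) {
            if (shouldFlush(store)) {
                specificStoresToFlush.add(store);
            }
        }
        return specificStoresToFlush;
    }
}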
// The original behavior: every store of the region is selected.
public class FlushAllStoresPolicy extends FlushPolicy {
    @Override
    public Collection<Store> selectStoresToFlush() {
        return region.stores.values();
    }
}
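public class HRegion {
    // Excerpt: the declaration of maxFlushedSeqId is not shown here.
    boolean shouldFlushStore(Store store) {
        // Too many changes have accumulated since the last flushed sequence id.
        if ((maxFlushedSeqId > 0)
                && (maxFlushedSeqId + flushPerChanges < sequenceId.get())) {
            return true;
        }
        // A non-positive interval disables the age-based check.
        if (flushCheckInterval <= 0) {
            return false;
        }
        // The oldest edit in this store is older than the check interval.
        long now = EnvironmentEdgeManager.currentTime();
        if (store.timeOfOldestEdit() < now - flushCheckInterval) {
            return true;
        }
        return false;
    }
}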
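To make the two extra conditions in shouldFlushStore() easier to follow, here is a stand-alone sketch that mirrors the same comparisons using plain long values. The class ShouldFlushSketch, the parameter list, and the numbers in main are assumptions made for illustration; in HBase these values come from the region and the store, not from method arguments.

```java
public class ShouldFlushSketch {

    // Mirrors the comparisons in the excerpt above, with every input passed in explicitly.
    static boolean shouldFlushStore(long maxFlushedSeqId, long currentSeqId, long flushPerChanges,
                                    long oldestEditTime, long now, long flushCheckInterval) {
        // Condition 1: too many changes since the last flushed sequence id.
        if (maxFlushedSeqId > 0 && maxFlushedSeqId + flushPerChanges < currentSeqId) {
            return true;
        }
        // A non-positive interval turns the age check off.
        if (flushCheckInterval <= 0) {
            return false;
        }
        // Condition 2: the oldest unflushed edit is older than the interval.
        return oldestEditTime < now - flushCheckInterval;
    }

    public static void main(String[] args) {
        long perChanges = 1_000_000L;  // assumed value
        long interval = 3_600_000L;    // assumed value: one hour in milliseconds
        long now = 10_000_000L;        // assumed clock reading in milliseconds

        // Many changes accumulated: 100 + 1,000,000 < 1,000,200.
        System.out.println(shouldFlushStore(100, 1_000_200, perChanges, 9_000_000, now, interval));
        // Few changes, but the oldest edit is older than one hour.
        System.out.println(shouldFlushStore(100, 200, perChanges, 1_000_000, now, interval));
        // Few changes and a recent oldest edit.
        System.out.println(shouldFlushStore(100, 200, perChanges, 9_000_000, now, interval));
        // Age check disabled by a non-positive interval.
        System.out.println(shouldFlushStore(100, 200, perChanges, 0, now, 0));
    }
}
```

The four calls print true, true, false, false. The second case is the one that matters for FlushLargeStoresPolicy: a small store that never reaches the size lower bound is still selected once its oldest edit has been sitting in memory for too long.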