Consistent hashing is a data structure that comes up frequently in interviews and is also very useful in real engineering — it appears in cache systems, partitioning schemes, and distributed hash tables. Below is a repost of an excellent article by Tom White that clearly explains how the ring in consistent hashing works. Wikipedia also describes a similar, fairly simple algorithm, HRW (Highest Random Weight, also known as rendezvous hashing). In summary, the goal of consistent hashing is that adding or removing a node should disturb as few of the other nodes' assignments as possible — touch one hair without moving the whole body, as the saying goes.
[Reposted from https://weblogs.java.net/blog/2007/11/27/consistent-hashing ]
I've bumped into consistent hashing a couple of times lately. The paper that introduced the idea (Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web by David Karger et al) appeared ten years ago, although recently it seems the idea has quietly been finding its way into more and more services, from Amazon's Dynamo to memcached (courtesy of Last.fm). So what is consistent hashing and why should you care?
The need for consistent hashing arose from limitations experienced while running collections of caching machines - web caches, for example. If you have a collection of n cache machines then a common way of load balancing across them is to put object o in cache machine number hash(o) mod n. This works well until you add or remove cache machines (for whatever reason), for then n changes and nearly every object is hashed to a new location. This can be catastrophic since the originating content servers are swamped with requests from the cache machines. It's as if the cache suddenly disappeared. Which it has, in a sense. (This is why you should care - consistent hashing is needed to avoid swamping your servers!)
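To see how badly naive mod-n hashing behaves, here is a small sketch (my illustration, not from the original post) that assigns 10,000 hypothetical keys to 4 machines with hash(o) mod n, then re-assigns them after adding a fifth machine and counts how many keys moved:

```python
import hashlib

def bucket(key: str, n: int) -> int:
    # Place a key on one of n cache machines using hash(key) mod n.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % n

keys = [f"object-{i}" for i in range(10_000)]

# Assignments with 4 machines, then again after adding a fifth.
before = {k: bucket(k, 4) for k in keys}
after = {k: bucket(k, 5) for k in keys}

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys moved to a different machine")
```

With a well-mixed hash, roughly n/(n+1) of the keys (here about 80%) land on a different machine after the change, so almost the entire cache misses at once. Consistent hashing reduces the expected fraction of moved keys to about 1/(n+1).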