The structure of ConcurrentSkipListMap:
Its fields and some of its methods are shown in the figure:
/*
 * This class implements a tree-like two-dimensionally linked skip
 * list in which the index levels are represented in separate
 * nodes from the base nodes holding data.  There are two reasons
 * for taking this approach instead of the usual array-based
 * structure: 1) Array based implementations seem to encounter
 * more complexity and overhead 2) We can use cheaper algorithms
 * for the heavily-traversed index lists than can be used for the
 * base lists.  Here's a picture of some of the basics for a
 * possible list with 2 levels of index:
 *
 * Head nodes          Index nodes
 * +-+    right        +-+                      +-+
 * |2|---------------->| |--------------------->| |->null
 * +-+                 +-+                      +-+
 *  | down              |                        |
 *  v                   v                        v
 * +-+            +-+  +-+       +-+            +-+       +-+
 * |1|----------->| |->| |------>| |----------->| |------>| |->null
 * +-+            +-+  +-+       +-+            +-+       +-+
 *  v              |    |         |              |         |
 * Nodes  next     v    v         v              v         v
 * +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+
 * | |->|A|->|B|->|C|->|D|->|E|->|F|->|G|->|H|->|I|->|J|->|K|->null
 * +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+  +-+
 *
 * The base lists use a variant of the HM linked ordered set
 * algorithm. See Tim Harris, "A pragmatic implementation of
 * non-blocking linked lists"
 * http://www.cl.cam.ac.uk/~tlh20/publications.html and Maged
 * Michael "High Performance Dynamic Lock-Free Hash Tables and
 * List-Based Sets"
 * http://www.research.ibm.com/people/m/michael/pubs.htm.  The
 * basic idea in these lists is to mark the "next" pointers of
 * deleted nodes when deleting to avoid conflicts with concurrent
 * insertions, and when traversing to keep track of triples
 * (predecessor, node, successor) in order to detect when and how
 * to unlink these deleted nodes.
 */
This class implements a tree-like, two-dimensionally linked skip list in which the index levels are kept in separate nodes while the base nodes hold the data. There are two reasons for using this structure instead of the usual array-based one: 1) array-based implementations tend to incur more complexity and overhead; 2) cheaper algorithms can be used for the heavily traversed index lists than for the base lists.
The figure shows a possible list with two levels of index. It is split into Index nodes and base-list nodes; the Index nodes serve as the index.
The base lists use a variant of the HM linked ordered set algorithm; see http://www.cl.cam.ac.uk/~tlh20/publications.html for the algorithm, and http://www.research.ibm.com/people/m/michael/pubs.htm for high-performance lock-free hash tables and list-based sets. The basic idea is to mark the "next" pointer of a node being deleted to avoid conflicts with concurrent insertions, and, while traversing, to track (predecessor, node, successor) triples in order to decide when and how to unlink deleted nodes.
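The Harris/Michael mark-bit idea referenced above can be sketched with the JDK's AtomicMarkableReference, which pairs a reference with a boolean mark. This is a hedged illustration of the papers' technique, not JDK code; the names (MarkNode, logicallyDelete, unlink) are invented for the example.

```java
import java.util.concurrent.atomic.AtomicMarkableReference;

// Sketch of HM-style deletion: the "next" pointer carries a mark bit
// that is set when a node is deleted, so any competing CAS on that
// pointer (expecting an unmarked reference) fails.
class MarkNode<T> {
    final T item;
    final AtomicMarkableReference<MarkNode<T>> next;

    MarkNode(T item, MarkNode<T> next) {
        this.item = item;
        this.next = new AtomicMarkableReference<>(next, false);
    }
}

class HmList {
    // Logical deletion: mark n's next pointer without changing the
    // reference. Concurrent inserts CASing n.next with an expected
    // mark of false will now fail.
    static <T> boolean logicallyDelete(MarkNode<T> n) {
        MarkNode<T> succ = n.next.getReference();
        return n.next.compareAndSet(succ, succ, false, true);
    }

    // Physical deletion: the predecessor b unlinks the marked node n
    // by swinging its next pointer past n to n's successor.
    static <T> boolean unlink(MarkNode<T> b, MarkNode<T> n) {
        return b.next.compareAndSet(n, n.next.getReference(), false, false);
    }
}
```

As the next comment block explains, ConcurrentSkipListMap deliberately does not use this representation, because AtomicMarkableReference-style marking is slower and more space-hungry on the JVM.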
/*
 * Rather than using mark-bits to mark list deletions (which can
 * be slow and space-intensive using AtomicMarkedReference), nodes
 * use direct CAS'able next pointers.  On deletion, instead of
 * marking a pointer, they splice in another node that can be
 * thought of as standing for a marked pointer (indicating this by
 * using otherwise impossible field values).  Using plain nodes
 * acts roughly like "boxed" implementations of marked pointers,
 * but uses new nodes only when nodes are deleted, not for every
 * link.  This requires less space and supports faster
 * traversal. Even if marked references were better supported by
 * JVMs, traversal using this technique might still be faster
 * because any search need only read ahead one more node than
 * otherwise required (to check for trailing marker) rather than
 * unmasking mark bits or whatever on each read.
 *
 * This approach maintains the essential property needed in the HM
 * algorithm of changing the next-pointer of a deleted node so
 * that any other CAS of it will fail, but implements the idea by
 * changing the pointer to point to a different node, not by
 * marking it.  While it would be possible to further squeeze
 * space by defining marker nodes not to have key/value fields, it
 * isn't worth the extra type-testing overhead.  The deletion
 * markers are rarely encountered during traversal and are
 * normally quickly garbage collected. (Note that this technique
 * would not work well in systems without garbage collection.)
 *
 * In addition to using deletion markers, the lists also use
 * nullness of value fields to indicate deletion, in a style
 * similar to typical lazy-deletion schemes.  If a node's value is
 * null, then it is considered logically deleted and ignored even
 * though it is still reachable. This maintains proper control of
 * concurrent replace vs delete operations -- an attempted replace
 * must fail if a delete beat it by nulling field, and a delete
 * must return the last non-null value held in the field.
 *
 * (Note:  Null, rather than some special marker, is used for
 * value fields here because it just so happens to mesh with the
 * Map API requirement that method get returns null if there is
 * no mapping, which allows nodes to remain concurrently readable
 * even when deleted. Using any other marker value here would be
 * messy at best.)
 */
Instead of the slow, space-intensive mark-bit approach implemented with AtomicMarkedReference, nodes delete by CASing their next pointers directly, splicing in a marker node that stands for a marked pointer.
/*
 * Here's the sequence of events for a deletion of node n with
 * predecessor b and successor f, initially:
 *
 *        +------+       +------+      +------+
 *   ...  |   b  |------>|   n  |----->|   f  | ...
 *        +------+       +------+      +------+
 *
 * 1. CAS n's value field from non-null to null.
 *    From this point on, no public operations encountering
 *    the node consider this mapping to exist. However, other
 *    ongoing insertions and deletions might still modify
 *    n's next pointer.
 *
 * 2. CAS n's next pointer to point to a new marker node.
 *    From this point on, no other nodes can be appended to n.
 *    which avoids deletion errors in CAS-based linked lists.
 *
 *        +------+       +------+      +------+       +------+
 *   ...  |   b  |------>|   n  |----->|marker|------>|   f  | ...
 *        +------+       +------+      +------+       +------+
 *
 * 3. CAS b's next pointer over both n and its marker.
 *    From this point on, no new traversals will encounter n,
 *    and it can eventually be GCed.
 *
 *        +------+                                    +------+
 *   ...  |   b  |----------------------------------->|   f  | ...
 *        +------+                                    +------+
 *
 * A failure at step 1 leads to simple retry due to a lost race
 * with another operation. Steps 2-3 can fail because some other
 * thread noticed during a traversal a node with null value and
 * helped out by marking and/or unlinking.  This helping-out
 * ensures that no thread can become stuck waiting for progress of
 * the deleting thread.  The use of marker nodes slightly
 * complicates helping-out code because traversals must track
 * consistent reads of up to four nodes (b, n, marker, f), not
 * just (b, n, f), although the next field of a marker is
 * immutable, and once a next field is CAS'ed to point to a
 * marker, it never again changes, so this requires less care.
 *
 * Skip lists add indexing to this scheme, so that the base-level
 * traversals start close to the locations being found, inserted
 * or deleted -- usually base level traversals only traverse a few
 * nodes. This doesn't change the basic algorithm except for the
 * need to make sure base traversals start at predecessors (here,
 * b) that are not (structurally) deleted, otherwise retrying
 * after processing the deletion.
 *
 * Index levels are maintained as lists with volatile next fields,
 * using CAS to link and unlink.  Races are allowed in index-list
 * operations that can (rarely) fail to link in a new index node
 * or delete one. (We can't do this of course for data nodes.)
 * However, even when this happens, the index lists remain sorted,
 * so correctly serve as indices.  This can impact performance,
 * but since skip lists are probabilistic anyway, the net result
 * is that under contention, the effective "p" value may be lower
 * than its nominal value. And race windows are kept small enough
 * that in practice these failures are rare, even under a lot of
 * contention.
 *
 * The fact that retries (for both base and index lists) are
 * relatively cheap due to indexing allows some minor
 * simplifications of retry logic. Traversal restarts are
 * performed after most "helping-out" CASes. This isn't always
 * strictly necessary, but the implicit backoffs tend to help
 * reduce other downstream failed CAS's enough to outweigh restart
 * cost.  This worsens the worst case, but seems to improve even
 * highly contended cases.
 *
 * Unlike most skip-list implementations, index insertion and
 * deletion here require a separate traversal pass occurring after
 * the base-level action, to add or remove index nodes.  This adds
 * to single-threaded overhead, but improves contended
 * multithreaded performance by narrowing interference windows,
 * and allows deletion to ensure that all index nodes will be made
 * unreachable upon return from a public remove operation, thus
 * avoiding unwanted garbage retention. This is more important
 * here than in some other data structures because we cannot null
 * out node fields referencing user keys since they might still be
 * read by other ongoing traversals.
 *
 * Indexing uses skip list parameters that maintain good search
 * performance while using sparser-than-usual indices: The
 * hardwired parameters k=1, p=0.5 (see method doPut) mean
 * that about one-quarter of the nodes have indices. Of those that
 * do, half have one level, a quarter have two, and so on (see
 * Pugh's Skip List Cookbook, sec 3.4). The expected total space
 * requirement for a map is slightly less than for the current
 * implementation of java.util.TreeMap.
 *
 * Changing the level of the index (i.e, the height of the
 * tree-like structure) also uses CAS. The head index has initial
 * level/height of one. Creation of an index with height greater
 * than the current level adds a level to the head index by
 * CAS'ing on a new top-most head. To maintain good performance
 * after a lot of removals, deletion methods heuristically try to
 * reduce the height if the topmost levels appear to be empty.
 * This may encounter races in which it possible (but rare) to
 * reduce and "lose" a level just as it is about to contain an
 * index (that will then never be encountered). This does no
 * structural harm, and in practice appears to be a better option
 * than allowing unrestrained growth of levels.
 *
 * The code for all this is more verbose than you'd like. Most
 * operations entail locating an element (or position to insert an
 * element). The code to do this can't be nicely factored out
 * because subsequent uses require a snapshot of predecessor
 * and/or successor and/or value fields which can't be returned
 * all at once, at least not without creating yet another object
 * to hold them -- creating such little objects is an especially
 * bad idea for basic internal search operations because it adds
 * to GC overhead.  (This is one of the few times I've wished Java
 * had macros.) Instead, some traversal code is interleaved within
 * insertion and removal operations.  The control logic to handle
 * all the retry conditions is sometimes twisty. Most search is
 * broken into 2 parts. findPredecessor() searches index nodes
 * only, returning a base-level predecessor of the key. findNode()
 * finishes out the base-level search. Even with this factoring,
 * there is a fair amount of near-duplication of code to handle
 * variants.
 *
 * To produce random values without interference across threads,
 * we use within-JDK thread local random support (via the
 * "secondary seed", to avoid interference with user-level
 * ThreadLocalRandom.)
 *
 * A previous version of this class wrapped non-comparable keys
 * with their comparators to emulate Comparables when using
 * comparators vs Comparables.  However, JVMs now appear to better
 * handle infusing comparator-vs-comparable choice into search
 * loops. Static method cpr(comparator, x, y) is used for all
 * comparisons, which works well as long as the comparator
 * argument is set up outside of loops (thus sometimes passed as
 * an argument to internal methods) to avoid field re-reads.
 *
 * For explanation of algorithms sharing at least a couple of
 * features with this one, see Mikhail Fomitchev's thesis
 * (http://www.cs.yorku.ca/~mikhail/), Keir Fraser's thesis
 * (http://www.cl.cam.ac.uk/users/kaf24/), and Hakan Sundell's
 * thesis (http://www.cs.chalmers.se/~phs/).
 *
 * Given the use of tree-like index nodes, you might wonder why
 * this doesn't use some kind of search tree instead, which would
 * support somewhat faster search operations. The reason is that
 * there are no known efficient lock-free insertion and deletion
 * algorithms for search trees. The immutability of the "down"
 * links of index nodes (as opposed to mutable "left" fields in
 * true trees) makes this tractable using only CAS operations.
 *
 * Notation guide for local variables
 * Node:         b, n, f    for  predecessor, node, successor
 * Index:        q, r, d    for index node, right, down.
 *               t          for another index node
 * Head:         h
 * Levels:       j
 * Keys:         k, key
 * Values:       v, value
 * Comparisons:  c
*/
Deleting node n from a list b -> n -> f goes through the three steps described in the comment above.
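The three-step deletion can be sketched with AtomicReference in place of the JDK's Unsafe-based CAS. This is a minimal illustration, not the JDK code; SketchNode and SketchDelete are invented names, and the helping-out/retry logic real deletions need is omitted.

```java
import java.util.concurrent.atomic.AtomicReference;

// Simplified node mirroring the JDK's layout: a null value means
// "logically deleted", and a marker node's value points to itself.
class SketchNode {
    final Integer key;                    // null for marker nodes
    final AtomicReference<Object> value;
    final AtomicReference<SketchNode> next;

    SketchNode(Integer key, Object value, SketchNode next) {
        this.key = key;
        this.value = new AtomicReference<>(value);
        this.next = new AtomicReference<>(next);
    }

    // Marker constructor: value field points to the node itself.
    SketchNode(SketchNode next) {
        this.key = null;
        this.value = new AtomicReference<>(this);
        this.next = new AtomicReference<>(next);
    }
}

class SketchDelete {
    // Returns true only if this thread completed all three steps;
    // a real implementation retries / helps out on each failure.
    static boolean delete(SketchNode b, SketchNode n, SketchNode f) {
        Object v = n.value.get();
        if (v == null || !n.value.compareAndSet(v, null))
            return false;                  // step 1: null out n's value
        SketchNode marker = new SketchNode(f);
        if (!n.next.compareAndSet(f, marker))
            return false;                  // step 2: splice in a marker after n
        return b.next.compareAndSet(n, f); // step 3: link b past n and the marker
    }
}
```

After step 3 both n and its marker are unreachable from b and can be garbage collected.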
The inner classes and their fields:
Node is a singly linked list structure.
Each node holds a key and a value, and nodes are linked in sorted order. Head nodes and marker nodes are a little different, because they are given otherwise-impossible field values. next points to the following node.
/**
 * Creates a new regular node.
 */
Node(K key, Object value, Node<K,V> next) {
    this.key = key;
    this.value = value;
    this.next = next;
}

/**
 * Creates a new marker node. A marker is distinguished by
 * having its value field point to itself.  Marker nodes also
 * have null keys, a fact that is exploited in a few places,
 * but this doesn't distinguish markers from the base-level
 * header node (head.node), which also has a null key.
 */
Node(Node<K,V> next) {
    this.key = null;
    this.value = this;
    this.next = next;
}
Marker nodes are used when deleting elements.
/**
 * Creates index node with given values.
 */
Index(Node<K,V> node, Index<K,V> down, Index<K,V> right) {
    this.node = node;    // the data node this index refers to
    this.down = down;    // the index node one level below
    this.right = right;  // the next index node on this level
}

/**
 * compareAndSet right field
 */
final boolean casRight(Index<K,V> cmp, Index<K,V> val) {
    return UNSAFE.compareAndSwapObject(this, rightOffset, cmp, val);
}

/**
 * Returns true if the node this indexes has been deleted.
 * @return true if indexed node is known to be deleted
 */
final boolean indexesDeletedNode() {
    return node.value == null;
}

/**
 * Tries to CAS newSucc as successor.  To minimize races with
 * unlink that may lose this index node, if the node being
 * indexed is known to be deleted, it doesn't try to link in.
 * @param succ the expected current successor
 * @param newSucc the new successor
 * @return true if successful
 */
final boolean link(Index<K,V> succ, Index<K,V> newSucc) {
    Node<K,V> n = node;
    // point the new successor's right at the current successor,
    newSucc.right = succ;
    // then CAS this node's right from succ to newSucc
    return n.value != null && casRight(succ, newSucc);
}

/**
 * Tries to CAS right field to skip over apparent successor
 * succ.  Fails (forcing a retraversal by caller) if this node
 * is known to be deleted.
 * @param succ the expected current successor
 * @return true if successful
 */
final boolean unlink(Index<K,V> succ) {
    // CAS this node's right pointer past succ, to succ.right
    return node.value != null && casRight(succ, succ.right);
}
HeadIndex:
final int level;

HeadIndex(Node<K,V> node, Index<K,V> down, Index<K,V> right, int level) {
    super(node, down, right);
    this.level = level;
}
HeadIndex extends Index with a level field recording which index level the node is on.
The overall structure is shown in the diagram below (sourced from the web):
public class ConcurrentSkipListMap<K,V> extends AbstractMap<K,V>
    implements ConcurrentNavigableMap<K,V>, Cloneable, Serializable {

    // serialization version number
    private static final long serialVersionUID = -8627078645895051609L;
    // special value serving as the base-level header node's value
    private static final Object BASE_HEADER = new Object();
    // the topmost head index of the skiplist
    private transient volatile HeadIndex<K,V> head;
    // the comparator, or null for natural ordering
    final Comparator<? super K> comparator;
    // lazily initialized key set
    private transient KeySet<K> keySet;
    // lazily initialized entry set
    private transient EntrySet<K,V> entrySet;
    // lazily initialized values collection
    private transient Values<V> values;
    // lazily initialized descending map
    private transient ConcurrentNavigableMap<K,V> descendingMap;

    // Unsafe mechanics
    private static final sun.misc.Unsafe UNSAFE;
    // offset of the head field
    private static final long headOffset;
    // offset of Thread.threadLocalRandomSecondarySeed
    private static final long SECONDARY;
    static {
        try {
            UNSAFE = sun.misc.Unsafe.getUnsafe();
            Class<?> k = ConcurrentSkipListMap.class;
            headOffset = UNSAFE.objectFieldOffset
                (k.getDeclaredField("head"));
            Class<?> tk = Thread.class;
            SECONDARY = UNSAFE.objectFieldOffset
                (tk.getDeclaredField("threadLocalRandomSecondarySeed"));
        } catch (Exception e) {
            throw new Error(e);
        }
    }
}
For labeled break, see: https://blog.csdn.net/qwelkjzxc369/article/details/60479633

/**
 * Gets value for key. Almost the same as findNode, but returns
 * the found value (to avoid retries during re-reads)
 *
 * @param key the key
 * @return the value, or null if absent
 */
private V doGet(Object key) {
    if (key == null)
        throw new NullPointerException();
    Comparator<? super K> cmp = comparator;
    outer: for (;;) {
        for (Node<K,V> b = findPredecessor(key, cmp), n = b.next;;) {
            Object v; int c;
            if (n == null)
                break outer;             // end of list: leave both loops, return null
            Node<K,V> f = n.next;
            if (n != b.next)             // inconsistent read: another thread changed b.next
                break;                   // retry via the outer loop
            if ((v = n.value) == null) { // n has been logically deleted
                n.helpDelete(b, f);      // help finish unlinking n, then retry
                break;
            }
            if (b.value == null || v == n) // b has been deleted
                break;                     // retry via the outer loop
            if ((c = cpr(cmp, key, n.key)) == 0) {
                @SuppressWarnings("unchecked") V vv = (V)v;
                return vv;
            }
            if (c < 0)
                break outer;             // passed where key would be: return null
            b = n;
            n = f;
        }
    }
    return null;
}

/**
 * Returns a base-level node with key strictly less than given key,
 * or the base-level header if there is no such node.  Also
 * unlinks indexes to deleted nodes found along the way.  Callers
 * rely on this side-effect of clearing indices to deleted nodes.
 * @param key the key
 * @return a predecessor of key
 */
private Node<K,V> findPredecessor(Object key, Comparator<? super K> cmp) {
    if (key == null)
        throw new NullPointerException(); // don't postpone errors
    for (;;) {
        for (Index<K,V> q = head, r = q.right, d;;) { // search from the topmost index level
            if (r != null) {
                Node<K,V> n = r.node;
                K k = n.key;
                if (n.value == null) { // a null value marks the data node as deleted
                    if (!q.unlink(r))  // unlink the index; a CAS failure means another
                        break;         // thread interfered, so restart
                    r = q.right;       // reread r after the successful unlink
                    continue;
                }
                if (cpr(cmp, key, k) > 0) { // given key is larger: keep moving right
                    q = r;
                    r = r.right;
                    continue;
                }
            }
            // Two cases reach here: this level is exhausted, or the given
            // key is <= the current key; either way descend one level.
            if ((d = q.down) == null) // no lower level: return the base-level node
                return q.node;
            q = d;
            r = d.right;
        }
    }
}
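The "outer:" label used in doGet can be illustrated with a minimal, self-contained example: "break outer" exits both loops at once, while a plain "break" only exits the inner loop and lets the outer for (;;) retry.

```java
// Minimal demonstration of Java's labeled break, as used in
// doGet / doPut / doRemove to distinguish "retry" from "give up".
public class LabeledBreakDemo {
    // Returns the index of the first row containing target, or -1.
    public static int firstMatch(int[][] rows, int target) {
        int found = -1;
        outer:
        for (int i = 0; i < rows.length; i++) {
            for (int j = 0; j < rows[i].length; j++) {
                if (rows[i][j] == target) {
                    found = i;
                    break outer; // exits BOTH loops, like doGet's "break outer"
                }
            }
        }
        return found;
    }
}
```

In doGet, a plain "break" restarts the outer loop (a retry after an inconsistent read), whereas "break outer" falls through to "return null" (the key is definitely absent).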
The main steps are as follows:
/**
 * Main insertion method.  Adds element if not present, or
 * replaces value if present and onlyIfAbsent is false.
 * @param key the key
 * @param value the value that must be associated with key
 * @param onlyIfAbsent if should not insert if already present
 * @return the old value, or null if newly inserted
 */
private V doPut(K key, V value, boolean onlyIfAbsent) {
    Node<K,V> z;             // added node
    if (key == null)
        throw new NullPointerException();
    Comparator<? super K> cmp = comparator;
    outer: for (;;) {
        // b is the predecessor of key; n is b's successor (if key is
        // already in the skip list, n is the node holding it)
        for (Node<K,V> b = findPredecessor(key, cmp), n = b.next;;) {
            if (n != null) {
                Object v; int c;
                Node<K,V> f = n.next; // f is n's successor, i.e. the new node's successor
                if (n != b.next)      // inconsistent read: b.next changed,
                    break;            // restart from the outer loop
                if ((v = n.value) == null) { // n was deleted by another thread:
                    n.helpDelete(b, f);      // help unlink it, then retry
                    break;
                }
                if (b.value == null || v == n) // b was deleted: retry
                    break;
                if ((c = cpr(cmp, key, n.key)) > 0) { // given key is larger:
                    b = n;                            // keep searching right
                    n = f;
                    continue;
                }
                if (c == 0) { // found an existing node for key
                    if (onlyIfAbsent || n.casValue(v, value)) {
                        @SuppressWarnings("unchecked") V vv = (V)v;
                        return vv;
                    }
                    break; // restart if lost race to replace value
                }
                // else c < 0; fall through
            }
            // key not present: create a new data node and append it after b
            z = new Node<K,V>(key, value, n);
            if (!b.casNext(n, z))
                break; // restart if lost race to append to b
            break outer;
        }
    }
    int rnd = ThreadLocalRandom.nextSecondarySeed(); // per-thread random bits
    if ((rnd & 0x80000001) == 0) { // test highest and lowest bits
        // Build index nodes for z. Each consecutive low-order 1 bit of
        // rnd adds one level, giving a random level with p = 0.5.
        int level = 1, max;
        while (((rnd >>>= 1) & 1) != 0)
            ++level;
        Index<K,V> idx = null;    // the index chain to build
        HeadIndex<K,V> h = head;  // current head
        if (level <= (max = h.level)) {
            // The chosen level does not exceed the current maximum:
            // chain up index nodes for z from level 1 to level,
            // linked top-down via their down fields.
            for (int i = 1; i <= level; ++i)
                idx = new Index<K,V>(z, idx, null);
        }
        else { // try to grow by one level
            level = max + 1; // hold in array and later pick the one to use
            // build an Index<K,V> array holding z's index node for each level
            @SuppressWarnings("unchecked")
            Index<K,V>[] idxs = (Index<K,V>[])new Index<?,?>[level+1];
            for (int i = 1; i <= level; ++i)
                idxs[i] = idx = new Index<K,V>(z, idx, null);
            for (;;) {
                h = head;
                int oldLevel = h.level;
                if (level <= oldLevel) // lost race to add level
                    break;
                HeadIndex<K,V> newh = h; // build the replacement head
                Node<K,V> oldbase = h.node;
                for (int j = oldLevel+1; j <= level; ++j) // stack the new head levels
                    newh = new HeadIndex<K,V>(oldbase, newh, idxs[j], j);
                if (casHead(h, newh)) {
                    h = newh;
                    idx = idxs[level = oldLevel];
                    break;
                }
            }
        }
        // find insertion points and splice in, one level at a time
        splice: for (int insertionLevel = level;;) {
            int j = h.level;
            // traverse rightwards from the head index on each level
            for (Index<K,V> q = h, r = q.right, t = idx;;) {
                if (q == null || t == null)
                    break splice;
                if (r != null) {
                    Node<K,V> n = r.node;
                    // compare before deletion check avoids needing recheck
                    int c = cpr(cmp, key, n.key);
                    if (n.value == null) { // indexed node was deleted:
                        if (!q.unlink(r))  // unlink it; on CAS failure another
                            break;         // thread interfered, so retry
                        r = q.right;       // reread the right node after unlinking
                        continue;
                    }
                    if (c > 0) { // keep moving right on this level
                        q = r;
                        r = r.right;
                        continue;
                    }
                }
                if (j == insertionLevel) { // reached the level to splice into
                    if (!q.link(r, t)) // another thread interfered: retry
                        break; // restart
                    if (t.node.value == null) { // z was deleted meanwhile:
                        findNode(key);          // clean up, stop splicing
                        break splice;
                    }
                    if (--insertionLevel == 0)
                        break splice;
                }
                if (--j >= insertionLevel && j < level)
                    t = t.down;
                q = q.down;
                r = q.right;
            }
        }
    }
    return null;
}
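The random-level logic at the heart of doPut can be isolated into a standalone sketch (RandomLevel is an invented name; the bit tests are the same as doPut's, but taking the random bits as a parameter instead of the JDK's secondary seed). (rnd & 0x80000001) == 0 requires both the sign bit and the lowest bit to be zero, which happens with probability 1/4, so only about a quarter of insertions build any index at all; each further consecutive low-order 1 bit adds one level, giving the p = 0.5 geometric distribution described in the big comment above.

```java
// Isolated sketch of doPut's index-level generation.
public class RandomLevel {
    // Returns 0 if no index should be built for this insertion,
    // otherwise the number of index levels (>= 1).
    public static int randomLevel(int rnd) {
        if ((rnd & 0x80000001) != 0) // highest or lowest bit set:
            return 0;                // no index node for this insertion
        int level = 1;
        while (((rnd >>>= 1) & 1) != 0) // count consecutive 1 bits
            ++level;                    // just above the lowest bit
        return level;
    }
}
```

For example, any odd rnd yields 0 levels; rnd = 0 yields level 1; rnd = 0b110 yields level 3 (bit 0 is zero, then two consecutive 1 bits).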
The main steps are as follows:
/* ---------------- Deletion -------------- */

/**
 * Main deletion method. Locates node, nulls value, appends a
 * deletion marker, unlinks predecessor, removes associated index
 * nodes, and possibly reduces head index level.
 *
 * Index nodes are cleared out simply by calling findPredecessor.
 * which unlinks indexes to deleted nodes found along path to key,
 * which will include the indexes to this node.  This is done
 * unconditionally. We can't check beforehand whether there are
 * index nodes because it might be the case that some or all
 * indexes hadn't been inserted yet for this node during initial
 * search for it, and we'd like to ensure lack of garbage
 * retention, so must call to be sure.
 *
 * @param key the key
 * @param value if non-null, the value that must be
 * associated with key
 * @return the node, or null if not found
 */
final V doRemove(Object key, Object value) {
    if (key == null)
        throw new NullPointerException();
    Comparator<? super K> cmp = comparator;
    outer: for (;;) {
        // b is the predecessor of key; n is b's successor (if key is
        // in the skip list, n is the node holding it)
        for (Node<K,V> b = findPredecessor(key, cmp), n = b.next;;) {
            Object v; int c;
            if (n == null)
                break outer;
            Node<K,V> f = n.next; // f is n's successor
            if (n != b.next)      // inconsistent read: another thread
                break;            // changed b.next, retry
            if ((v = n.value) == null) { // n was deleted by another
                n.helpDelete(b, f);      // thread: help unlink it, retry
                break;
            }
            if (b.value == null || v == n) // predecessor b was deleted: retry
                break;
            if ((c = cpr(cmp, key, n.key)) < 0)
                break outer; // key not present
            if (c > 0) {     // keep searching right
                b = n;
                n = f;
                continue;
            }
            if (value != null && !value.equals(v))
                break outer; // value-conditional remove did not match
            if (!n.casValue(v, null)) // step 1: null out n's value
                break;
            // Steps 2-3: point n at a marker node, the marker at f,
            // then link b directly to f; n can then be GCed.
            if (!n.appendMarker(f) || !b.casNext(n, f))
                findNode(key); // if either CAS failed, retry via findNode
            else {
                findPredecessor(key, cmp); // clean index
                if (head.right == null)
                    tryReduceLevel(); // head's right is null: try dropping a level
            }
            @SuppressWarnings("unchecked") V vv = (V)v;
            return vv;
        }
    }
    return null;
}
The main steps are:
Reference: https://www.jianshu.com/p/edc2fd149255
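To tie the internals above back to the public API, here is a small usage example of the behaviors discussed: keys stay sorted, remove returns the last non-null value, get returns null for a removed (or absent) key, and the NavigableMap methods work off the same index structure.

```java
import java.util.concurrent.ConcurrentSkipListMap;

public class SkipListDemo {
    public static void main(String[] args) {
        ConcurrentSkipListMap<Integer, String> map = new ConcurrentSkipListMap<>();
        map.put(3, "C");
        map.put(1, "A");
        map.put(2, "B");

        System.out.println(map.firstKey());    // keys are kept sorted: 1
        System.out.println(map.get(2));        // B
        System.out.println(map.remove(2));     // B -- the last non-null value held
        System.out.println(map.get(2));        // null -- null doubles as "no mapping"
        System.out.println(map.ceilingKey(2)); // 3 -- navigation after removal
    }
}
```

Note how the Map contract that get returns null for a missing key is exactly what lets the implementation use a null value field as its logical-deletion mark.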