# 最通俗易懂的HashMap底层原理图文详解

HashMap的底层原理面试必考题。为什么面试官如此青睐这道题？

HashMap里面涉及了很多的知识点，可以比较全面考察面试者的基本功，想要拿到一个好offer,这是一个迈不过的坎，接下来我们用最通俗易懂的语言带着大家揭开HashMap的神秘面纱。

# 数据结构

HashMap的数据结构为 : 数组+(链表或红黑树)。

```    /**
* The table, initialized on first use, and resized as
* necessary. When allocated, length is always a power of two.
* (We also tolerate length zero in some operations to allow
* bootstrapping mechanics that are currently not needed.)
*/
transient Node<K,V>[] table;```

HashMap的数据模型：

```public class HashMap<K,V> extends AbstractMap<K,V> implements Map<K,V>, Cloneable, Serializable {

static class Node<K,V> implements Map.Entry<K,V> {
final int hash;
final K key;
V value;
Node<K,V> next;
}

...

/**
* Entry for Tree bins. Extends LinkedHashMap.Entry (which in turn
* extends Node) so can be used as extension of either regular or
*/
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
TreeNode<K,V> parent;  // red-black tree links
TreeNode<K,V> left;
TreeNode<K,V> right;
TreeNode<K,V> prev;    // needed to unlink next upon deletion
boolean red;
...
}

}```

Node的模型：

TreeNode的模型：

# 算法

### 红黑树的五个性质

• 若任意节点的左子树不空，则左子树上所有结点的值均小于它的根结点的值；
• 若任意节点的右子树不空，则右子树上所有结点的值均大于它的根结点的值；
• 任意节点的左、右子树也分别为二叉查找树。
• 没有键值相等的节点（no duplicate nodes）。

# 为什么节点数大于8的时候转红黑树？

```    /*
* Implementation notes.
*
* This map usually acts as a binned (bucketed) hash table, but
* when bins get too large, they are transformed into bins of
* TreeNodes, each structured similarly to those in
* java.util.TreeMap. Most methods try to use normal bins, but
* relay to TreeNode methods when applicable (simply by checking
* instanceof a node).  Bins of TreeNodes may be traversed and
* used like any others, but additionally support faster lookup
* when overpopulated. However, since the vast majority of bins in
* normal use are not overpopulated, checking for existence of
* tree bins may be delayed in the course of table methods.
*
* Tree bins (i.e., bins whose elements are all TreeNodes) are
* ordered primarily by hashCode, but in the case of ties, if two
* elements are of the same "class C implements Comparable<C>",
* type then their compareTo method is used for ordering. (We
* conservatively check generic types via reflection to validate
* this -- see method comparableClassFor).  The added complexity
* of tree bins is worthwhile in providing worst-case O(log n)
* operations when keys either have distinct hashes or are
* orderable, Thus, performance degrades gracefully under
* accidental or malicious usages in which hashCode() methods
* return values that are poorly distributed, as well as those in
* which many keys share a hashCode, so long as they are also
* Comparable. (If neither of these apply, we may waste about a
* factor of two in time and space compared to taking no
* precautions. But the only known cases stem from poor user
* programming practices that are already so slow that this makes
* little difference.)
*
* Because TreeNodes are about twice the size of regular nodes, we
* use them only when bins contain enough nodes to warrant use
* (see TREEIFY_THRESHOLD). And when they become too small (due to
* removal or resizing) they are converted back to plain bins.  In
* usages with well-distributed user hashCodes, tree bins are
* rarely used.  Ideally, under random hashCodes, the frequency of
* nodes in bins follows a Poisson distribution
* (http://en.wikipedia.org/wiki/Poisson_distribution) with a
* parameter of about 0.5 on average for the default resizing
* threshold of 0.75, although with a large variance because of
* resizing granularity. Ignoring variance, the expected
* occurrences of list size k are (exp(-0.5) * pow(0.5, k) /
* factorial(k)). The first values are:
*
* 0:    0.60653066
* 1:    0.30326533
* 2:    0.07581633
* 3:    0.01263606
* 4:    0.00157952
* 5:    0.00015795
* 6:    0.00001316
* 7:    0.00000094
* 8:    0.00000006
* more: less than 1 in ten million
*
* The root of a tree bin is normally its first node.  However,
* sometimes (currently only upon Iterator.remove), the root might
* be elsewhere, but can be recovered following parent links
* (method TreeNode.root()).
*
* All applicable internal methods accept a hash code as an
* argument (as normally supplied from a public method), allowing
* them to call each other without recomputing user hashCodes.
* Most internal methods also accept a "tab" argument, that is
* normally the current table, but may be a new or old one when
* resizing or converting.
*
* When bin lists are treeified, split, or untreeified, we keep
* them in the same relative access/traversal order (i.e., field
* Node.next) to better preserve locality, and to slightly
* simplify handling of splits and traversals that invoke
* iterator.remove. When using comparators on insertion, to keep a
* total ordering (or as close as is required here) across
* rebalancings, we compare classes and identityHashCodes as
* tie-breakers.
*
* The use and transitions among plain vs tree modes is
* complicated by the existence of subclass LinkedHashMap. See
* below for hook methods defined to be invoked upon insertion,
* removal and access that allow LinkedHashMap internals to
* otherwise remain independent of these mechanics. (This also
* requires that a map instance be passed to some utility methods
* that may create new nodes.)
*
* The concurrent-programming-like SSA-based coding style helps
* avoid aliasing errors amid all of the twisty pointer operations.
*/```

```public class LinkedHashMap<K,​V>
extends HashMap<K,​V>
implements Map<K,​V>```

Hash table and linked list implementation of the Map interface, with predictable iteration order. This implementation differs from HashMap in that it maintains a doubly-linked list running through all of its entries. This linked list defines the iteration ordering, which is normally the order in which keys were inserted into the map (insertion-order). Note that insertion order is not affected if a key is re-inserted into the map. (A key k is reinserted into a map m if m.put(k, v) is invoked when m.containsKey(k) would return true immediately prior to the invocation.)

This implementation spares its clients from the unspecified, generally chaotic ordering provided by HashMap (and Hashtable), without incurring the increased cost associated with TreeMap. It can be used to produce a copy of a map that has the same order as the original, regardless of the original map's implementation:

``` void foo(Map m) {
...
}```

This technique is particularly useful if a module takes a map on input, copies it, and later returns results whose order is determined by that of the copy. (Clients generally appreciate having things returned in the same order they were presented.)

# 参考资料

0 条评论

• ### 乔布斯斯坦福毕业演讲，这是我听过最精彩的毕业演讲！

2005年6月14号乔布斯在斯坦福大学的毕业典礼上做的一次精彩的演讲。被很多人称为是听过的最好的毕业演讲，而且每一次听都有新的收获。

• ### 《MongoDB极简教程》第一章 NoSQL简史 & MongoDB安装&环境配置NoSQLNoSQL 简史CAP定理（CAP theorem）BASEMongoDB 特性&优势文档参考安装&环境配置

MongoDB 是一款开源的文档数据库，并且是业内领先的 NoSQL 数据库，用 C++ 编写而成。

• ### 计算机科学领域的任何问题都可以通过增加一个间接的中间层来解决

这句话几乎概括了计算机软件体系结构的设计要点.整个体系从上到下都是按照严格的层级结构设计的.

• ### MiniZinc中的自动大规模班级调度(AI)

班级调度是一项高度受限的任务。教育机构以时间和手动计算的形式花费了大量资源，以找到满足所有要求的令人满意的时间表。令人满意的上课时间表可让所有学生在方便的时间上...

• ### 双语 | 数据科学家抢了经济学家的饭碗吗？

实在不好意思亲！大猫由于最近论文与工作的压力，R语言课堂更新有点拖（但愿不是有生之年系列Orz）。今天更新的这篇和R没有太大关系，但却讨论了一个很有意思的问题：...

• ### 极限验证潜伏学习算法的比较分析（AI LG）

在计算智能中，更具挑战性的现实世界问题之一是向非平稳流数据学习，也称为概念漂移。也许在这种情况下更具挑战性的版本是-在遵循少量初始标记数据之后-数据流仅包含未标...

• ### 关于如何从Go 1到Go 2进行不兼容的更改，而尽可能少地进行破坏的建议。

A proposal for how to make incompatible changes from Go 1 to Go 2 while breaking...

• ### 价值驱动的事后分析模型

价值评估是强化学习范式的重要组成部分。如何有效地从数据中学习预测值的问题是RL社区研究的主要问题之一，不同的方法以不同的方式利用问题域的结构。模型学习可以利用观...