HashMap Source Code Explained

提莫队长

Published on 2019-02-21 11:32:28

Included in the column: 刘晓杰

HashMap relies on one important data structure, HashMapEntry, which is defined in the source as follows:

    static class HashMapEntry<K, V> implements Entry<K, V> {
        final K key;
        V value;
        final int hash;
        HashMapEntry<K, V> next;

        ......
    }

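Entry is a static class that holds a key and a value, i.e. a key-value pair, together with the key's hash and a next reference to another Entry. To sum up: the Entry objects are the elements of the array. Each Entry is one key-value pair and holds a reference to the next element, and that chain of references forms a linked list.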
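
To make the "array of linked lists" picture concrete, here is a minimal sketch. The names BucketChainDemo, Node, buildDemoBucket and keysInBucket are hypothetical and not part of the HashMap source. Putting a new entry at the head of its bucket is also an assumption of this sketch, since addNewEntry is not shown in the listing.

```java
import java.util.ArrayList;
import java.util.List;

class BucketChainDemo {
    // Simplified stand-in for HashMapEntry: key, value, hash and a link to the next node.
    static final class Node<K, V> {
        final K key;
        V value;
        final int hash;
        Node<K, V> next;

        Node(K key, V value, int hash, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.hash = hash;
            this.next = next;
        }
    }

    // Builds one bucket that holds two entries whose hashes collide for a table of length 4.
    static Node<String, Integer> buildDemoBucket() {
        @SuppressWarnings("unchecked")
        Node<String, Integer>[] table = (Node<String, Integer>[]) new Node[4];
        int index = 1; // 1 & 3 == 1 and 5 & 3 == 1, so both hashes select bucket 1
        table[index] = new Node<>("a", 1, 1, table[index]);
        // Assumption: a new entry becomes the head and points at the previous head.
        table[index] = new Node<>("b", 2, 5, table[index]);
        return table[index];
    }

    // Collects the keys of one bucket by following the next references.
    static <K, V> List<K> keysInBucket(Node<K, V> head) {
        List<K> keys = new ArrayList<>();
        for (Node<K, V> e = head; e != null; e = e.next) {
            keys.add(e.key);
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(keysInBucket(buildDemoBucket())); // prints [b, a]
    }
}
```

Each array slot stores only the head of its chain; every other entry in the bucket is reached through next.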

1. Source walkthrough:

public class HashMap<K, V> extends AbstractMap<K, V> implements Cloneable, Serializable {
    // The array that stores the entries (one slot per bucket)
    transient HashMapEntry<K, V>[] table;

    public HashMap(int capacity) {
        if (capacity < 0) {
            throw new IllegalArgumentException("Capacity: " + capacity);
        }

        if (capacity == 0) {
            @SuppressWarnings("unchecked")
            HashMapEntry<K, V>[] tab = (HashMapEntry<K, V>[]) EMPTY_TABLE;
            table = tab;
            threshold = -1; // Forces first put() to replace EMPTY_TABLE
            return;
        }

        if (capacity < MINIMUM_CAPACITY) {
            capacity = MINIMUM_CAPACITY;
        } else if (capacity > MAXIMUM_CAPACITY) {
            capacity = MAXIMUM_CAPACITY;
        } else {
            capacity = Collections.roundUpToPowerOfTwo(capacity);
        }
        makeTable(capacity);
    }

    private HashMapEntry<K, V>[] makeTable(int newCapacity) {
        // Almost every constructor ends up executing new HashMapEntry[newCapacity]
        @SuppressWarnings("unchecked")
        HashMapEntry<K, V>[] newTable = (HashMapEntry<K, V>[]) new HashMapEntry[newCapacity];
        table = newTable;
        threshold = (newCapacity >> 1) + (newCapacity >> 2); // 3/4 capacity
        return newTable;
    }

    // ************************** The code above is the construction path: it simply allocates a HashMapEntry array that will hold the data later on


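    // Lookup. Mix the key's hashCode, select a bucket, then walk that bucket's chain node by node.
    // The value is returned when the key is the same reference, or when the hash matches and equals() is true.
    // The walk is linear in the chain length, so it gets slow when many keys share a bucket.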
    public V get(Object key) {
        if (key == null) {
            HashMapEntry<K, V> e = entryForNullKey;
            return e == null ? null : e.value;
        }

        // Compute the key's hashCode and mix its bits
        int hash = key.hashCode();
        hash ^= (hash >>> 20) ^ (hash >>> 12);
        hash ^= (hash >>> 7) ^ (hash >>> 4);

        HashMapEntry<K, V>[] tab = table;
        for (HashMapEntry<K, V> e = tab[hash & (tab.length - 1)];
                e != null; e = e.next) {
            K eKey = e.key;
            // Return the value if the key is the same reference, or if the hash matches and equals() is true
            if (eKey == key || (e.hash == hash && key.equals(eKey))) {
                return e.value;
            }
        }
        return null;
    }

    // Insert or update. Like get(), it walks one bucket's chain node by node, so its cost also grows with the chain length
    public V put(K key, V value) {
        if (key == null) {
            return putValueForNullKey(value);
        }

        int hash = secondaryHash(key);
        HashMapEntry<K, V>[] tab = table;
        int index = hash & (tab.length - 1);
        for (HashMapEntry<K, V> e = tab[index]; e != null; e = e.next) {
            if (e.hash == hash && key.equals(e.key)) {
                preModify(e);
                V oldValue = e.value;
                e.value = value;
                return oldValue;
            }
        }

        // No entry for (non-null) key is present; create one
        modCount++;
        if (size++ > threshold) {
            tab = doubleCapacity();
            index = hash & (tab.length - 1);
        }
        addNewEntry(key, value, hash, index);
        return null;
    }

    // Remove an entry, which amounts to unlinking a node from a singly linked list
    public V remove(Object key) {
        if (key == null) {
            return removeNullKey();
        }
        int hash = secondaryHash(key);
        HashMapEntry<K, V>[] tab = table;
        int index = hash & (tab.length - 1);
        for (HashMapEntry<K, V> e = tab[index], prev = null;
                e != null; prev = e, e = e.next) {
            if (e.hash == hash && key.equals(e.key)) {
                if (prev == null) {
                    tab[index] = e.next;
                } else {
                    prev.next = e.next;
                }
                modCount++;
                size--;
                postRemove(e);
                return e.value;
            }
        }
        return null;
    }
}
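
The sizing arithmetic in the constructor, makeTable() and get() can be tried in isolation. The sketch below is only an illustration under stated assumptions: CapacityDemo and its method names are hypothetical, roundUpToPowerOfTwo is a stand-in assumed to behave like Collections.roundUpToPowerOfTwo for values already clamped by the constructor, and spread() copies the two mixing lines from get().

```java
class CapacityDemo {
    // Stand-in (assumption): smallest power of two >= i, for 1 <= i <= 2^30.
    static int roundUpToPowerOfTwo(int i) {
        return i <= 1 ? 1 : Integer.highestOneBit(i - 1) << 1;
    }

    // Same expression as in makeTable(): half plus a quarter, i.e. 3/4 of the capacity.
    static int threshold(int capacity) {
        return (capacity >> 1) + (capacity >> 2);
    }

    // The two bit-mixing lines from get(), folding high bits into the low bits.
    static int spread(int hash) {
        hash ^= (hash >>> 20) ^ (hash >>> 12);
        hash ^= (hash >>> 7) ^ (hash >>> 4);
        return hash;
    }

    // Bucket index for a key in a table whose length is a power of two.
    static int indexFor(Object key, int tableLength) {
        return spread(key.hashCode()) & (tableLength - 1);
    }

    public static void main(String[] args) {
        int capacity = roundUpToPowerOfTwo(10);
        System.out.println(capacity);                              // prints 16
        System.out.println(threshold(capacity));                   // prints 12
        System.out.println(indexFor(Integer.valueOf(16), capacity)); // prints 1
    }
}
```

With a requested capacity of 10, the table gets 16 slots and a threshold of 12. In put(), the condition size++ > threshold is what triggers doubleCapacity().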

2. Clearing up a misconception:

By now it should be clear which of the following two ways of traversing a map is more efficient. The first way:

    Map map = new HashMap();
    Iterator iter = map.entrySet().iterator();
    while (iter.hasNext()) {
        Map.Entry entry = (Map.Entry) iter.next();
        Object key = entry.getKey();
        Object val = entry.getValue();
    }

The second way:

    Map map = new HashMap();
    Iterator iter = map.keySet().iterator();
    while (iter.hasNext()) {
        Object key = iter.next();
        Object val = map.get(key); // a second lookup per key: hash, select bucket, walk the chain; less efficient
    }
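
The same comparison can be written with generics and for-each loops. This is a small sketch with hypothetical names (TraversalDemo, sumViaEntrySet, sumViaKeySet), not a benchmark. Both methods return the same result; the difference is that the keySet version calls get() once per key, while the entrySet version reads the value straight from the entry it is already holding.

```java
import java.util.HashMap;
import java.util.Map;

class TraversalDemo {
    // One pass: each entry already carries both key and value.
    static int sumViaEntrySet(Map<String, Integer> map) {
        int sum = 0;
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            sum += entry.getValue();
        }
        return sum;
    }

    // One pass over the keys plus one get() lookup per key.
    static int sumViaKeySet(Map<String, Integer> map) {
        int sum = 0;
        for (String key : map.keySet()) {
            sum += map.get(key);
        }
        return sum;
    }

    // Small fixed map used by the demo.
    static Map<String, Integer> sampleMap() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    public static void main(String[] args) {
        System.out.println(sumViaEntrySet(sampleMap())); // prints 6
        System.out.println(sumViaKeySet(sampleMap()));   // prints 6
    }
}
```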

3. The hash index

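There is one more interesting detail. The expression index = hash & (tab.length - 1) shows up again and again; it selects the array slot in which an entry is stored. Why is it written this way?

1. Speed. The obvious approach is to take the hash modulo length (the division method of hashing), but modulo requires a division, which is slow; & is cheaper. The mask can stand in for the modulo only because length is a power of two.

2. No wasted space. Because length is a power of two, length - 1 is odd, and the last bit of an odd number is 1. The last bit of h & (length - 1) can therefore be 0 or 1 depending on h, so the result can be even or odd, and this keeps the distribution over the slots uniform. If length were odd instead, length - 1 would be even, its last bit would be 0, and only even indices could ever be produced, leaving half of the array unused.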
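
Both points can be checked with a few lines of Java. The sketch below uses hypothetical names (MaskDemo, maskMatchesMod, indicesReached) and compares the mask with Math.floorMod over a small range of hash values; it illustrates the argument and is not part of the HashMap source.

```java
import java.util.Set;
import java.util.TreeSet;

class MaskDemo {
    // True if h & (length - 1) equals the non-negative remainder for every h in a small range.
    static boolean maskMatchesMod(int length) {
        for (int h = -1000; h <= 1000; h++) {
            if ((h & (length - 1)) != Math.floorMod(h, length)) {
                return false;
            }
        }
        return true;
    }

    // All indices that the mask can produce for hashes 0..999.
    static Set<Integer> indicesReached(int length) {
        Set<Integer> seen = new TreeSet<>();
        for (int h = 0; h < 1000; h++) {
            seen.add(h & (length - 1));
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(maskMatchesMod(16));        // prints true
        System.out.println(maskMatchesMod(15));        // prints false
        System.out.println(indicesReached(16).size()); // prints 16
        System.out.println(indicesReached(15));        // prints [0, 2, 4, 6, 8, 10, 12, 14]
    }
}
```

With length 16 every slot is reachable and the mask agrees with the remainder. With length 15 the mask is 14 (binary 1110), so odd slots are never used.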

Originally published: 2017-02-09. In case of infringement, contact cloudcommunity@tencent.com for removal.
