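A Deep Dive into HashMap

HashMap is one of the data structures we use most in everyday development. It is the classic hash table implementation and a frequent interview topic, and studying data structures like this one is solid fundamentals training for any programmer. When we put(key, value) into a HashMap and later call get(key), why does the value we want come back so quickly?

A closer look at HashMap in JDK 1.7

(1) Important HashMap fields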

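1. size // number of key-value pairs currently stored in the HashMap
2. capacity // number of buckets, i.e. table.length, the length of the array; the default is 16
3. loadFactor // load factor; the default is 0.75
4. threshold // resize threshold, equal to capacity * loadFactor once the table is allocated; when size reaches it, resize() may be called to expand the table

The load factor defaults to 0.75. The official explanation is that this is a tradeoff between time and space costs: higher values decrease the space overhead but increase the lookup cost (reflected in most operations of the HashMap class, including get and put).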
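To make the relationship between these fields concrete, here is a minimal sketch of how the threshold follows from the capacity and the load factor. The class and method names (`ThresholdSketch`, `thresholdFor`) are made up for illustration, and the cap at the maximum capacity that the JDK applies is left out.

```java
public class ThresholdSketch {
    // Simplified assumption: threshold = capacity * loadFactor, truncated to an int.
    // The real JDK code additionally caps this value near MAXIMUM_CAPACITY.
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // Defaults from the field list: 16 buckets, load factor 0.75
        System.out.println(thresholdFor(16, 0.75f)); // 12
        // After one doubling of the table
        System.out.println(thresholdFor(32, 0.75f)); // 24
    }
}
```

With the defaults, the map can hold 12 entries before a resize becomes possible.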

When we call **new HashMap()**:

 public HashMap() {
        this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR); // use the default initial capacity and load factor
    }

(2) The put method

public V put(K key, V value) {
        // empty table: allocate the array now (lazy initialization)
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        // handle the null key
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key); // compute the hash value
        int i = indexFor(hash, table.length); // determine the bucket index
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            // check whether the key is already present
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

When key == null: putForNullKey

 private V putForNullKey(V value) {
     /**
      * Search table[0] directly. If an entry with key == null exists, overwrite it
      * and return the old value; otherwise call addEntry to add it to the bucket.
      */
        for (Entry<K,V> e = table[0]; e != null; e = e.next) {
            if (e.key == null) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }
        modCount++;
        addEntry(0, null, value, 0);
        return null;
    }
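The return values described by put and putForNullKey can be observed from the outside with the public `java.util.HashMap` API. The following is a small sketch; the class name `PutReturnDemo` and the helper `run` are illustrative only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PutReturnDemo {
    // Collects what each call returns so the sequence can be inspected.
    static List<Integer> run() {
        Map<String, Integer> map = new HashMap<>();
        List<Integer> results = new ArrayList<>();
        results.add(map.put("a", 1));   // no previous mapping: null
        results.add(map.put("a", 2));   // key exists: old value 1 is returned and replaced
        results.add(map.put(null, 10)); // first mapping for the null key: null
        results.add(map.put(null, 20)); // null key exists: old value 10 is returned
        results.add(map.get(null));     // the null key now maps to 20
        results.add(map.size());        // two keys in total: "a" and null
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [null, 1, null, 10, 20, 2]
    }
}
```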

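When key != null, **hash()** first computes a hash value, and then **indexFor()** determines the bucket index:

The hash function

/**
 * 1. The perturbation function mixes the bits of hashCode() to prevent too many hash collisions.
 * 2. When k is a String and hashSeed is non-zero, stringHash32 is used to reduce collisions.
 *    The main motivation was a message on the Tomcat mailing list: Tomcat used a HashMap to
 *    store HTTP request data, and an attacker could craft input whose keys all produce the same
 *    hash, degrading that HashMap into a linked list, at which point collisions become very costly.
 */
final int hash(Object k) {
        int h = hashSeed;
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }
        h ^= k.hashCode();
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }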
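To see why the perturbation step matters, the sketch below copies the two shift-and-XOR lines from the function above and applies them to two hash codes that differ only in their high bits. It assumes hashSeed is 0, and the class and method names (`HashSpreadSketch`, `spread`) are invented for this example.

```java
public class HashSpreadSketch {
    // The two mixing steps from the JDK 1.7 hash(), assuming hashSeed == 0.
    static int spread(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Same masking as indexFor: keep only the low bits.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int a = 0x10000; // differs from b only above bit 15
        int b = 0x20000;
        // Without mixing, both land in bucket 0 of a 16-bucket table.
        System.out.println(indexFor(a, 16) + " " + indexFor(b, 16)); // 0 0
        // After mixing, the high bits influence the low bits.
        System.out.println(indexFor(spread(a), 16) + " " + indexFor(spread(b), 16)); // 1 2
    }
}
```

Because the index only looks at the low bits, folding the high bits downward lets keys that differ only in their upper bits land in different buckets.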

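The indexFor function

// indexFor determines the bucket index
    static int indexFor(int h, int length) {
        // length must be a power of two
        return h & (length-1);
    }

The hash value h computed by the hash function is ANDed (&) with the number of buckets minus 1. This is also why the number of buckets must be a power of two.

The & operation: a result bit is 1 only when both input bits are 1.
Example: 16
In binary: 0001 0000
Minus one, in binary: 0000 1111
The bits of h are fairly random, so the low four bits kept by the mask spread the entries across all 16 buckets.

Example: 17, which is not a power of two
Minus one, in binary: 0001 0000
With 17 buckets, no matter how h varies, the bucket index can only be 0 or 16, so the number of hash collisions is easy to imagine.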
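The two examples above can be checked with a short sketch that counts which bucket indices are reachable for a given table length. The class name `IndexForDemo` and the helper `reachable` are hypothetical; `indexFor` is the same one-line mask shown earlier.

```java
import java.util.Set;
import java.util.TreeSet;

public class IndexForDemo {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Collects every bucket index produced by the hash values 0..999.
    static Set<Integer> reachable(int length) {
        Set<Integer> indices = new TreeSet<>();
        for (int h = 0; h < 1000; h++) {
            indices.add(indexFor(h, length));
        }
        return indices;
    }

    public static void main(String[] args) {
        System.out.println(reachable(16).size()); // 16: every bucket is used
        System.out.println(reachable(17));        // [0, 16]: only two buckets are ever used
    }
}
```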

The addEntry function: adding an entry

 void addEntry(int hash, K key, V value, int bucketIndex) {
        // resize condition: size has reached the threshold AND the target bucket is already occupied (a collision)
        if ((size >= threshold) && (null != table[bucketIndex])) {
            // double the capacity
            resize(2 * table.length);
            hash = (null != key) ? hash(key) : 0;
            // recompute the bucket index for the new table
            bucketIndex = indexFor(hash, table.length);
        }
        createEntry(hash, key, value, bucketIndex);
    }

The resize function: expanding the table

 void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            // the table is already at the maximum size: only raise the threshold, no further expansion is possible
            threshold = Integer.MAX_VALUE;
            return;
        }
        // allocate the new, larger table
        Entry[] newTable = new Entry[newCapacity];
        // move the existing entries over
        transfer(newTable, initHashSeedAsNeeded(newCapacity));
        table = newTable;
        // recompute the threshold
        threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }
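How addEntry and resize interact can be sketched with a small simulation of the capacity as entries are added. This is a simplification, not the JDK code: it assumes every insertion lands on an already occupied bucket, so only the `size >= threshold` half of the condition matters, and the name `capacityAfter` is made up.

```java
public class ResizeGrowthSketch {
    // Simulates the table capacity after inserting `entries` distinct keys,
    // starting from the defaults (16 buckets, load factor 0.75).
    // Assumption: the target bucket is always occupied, so a resize happens
    // as soon as size >= threshold.
    static int capacityAfter(int entries) {
        int capacity = 16;
        float loadFactor = 0.75f;
        int threshold = (int) (capacity * loadFactor);
        int size = 0;
        for (int n = 0; n < entries; n++) {
            if (size >= threshold) {
                capacity *= 2;                             // resize(2 * table.length)
                threshold = (int) (capacity * loadFactor); // recompute the threshold
            }
            size++;                                        // createEntry increments size
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(capacityAfter(12)); // 16
        System.out.println(capacityAfter(13)); // 32
        System.out.println(capacityAfter(25)); // 64
    }
}
```

In the real class the collision condition can delay a resize past these points, so the numbers are a lower bound on when the table grows.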

The transfer function: moving the entries

 void transfer(Entry[] newTable, boolean rehash) {
        int newCapacity = newTable.length; // number of buckets in the new table
        for (Entry<K,V> e : table) {
            while(null != e) {
                Entry<K,V> next = e.next; // remember the current node's successor
                if (rehash) { // recompute the hash if required
                    e.hash = null == e.key ? 0 : hash(e.key);
                }
                int i = indexFor(e.hash, newCapacity); // determine the new bucket index
                e.next = newTable[i];         // head insertion: the node becomes the new head of the bucket
                newTable[i] = e;
                e = next;
            }
        }
    }
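A side effect of head insertion is that a chain comes out in reverse order after a transfer. The sketch below isolates that loop on a bare linked list; it assumes all nodes map to the same new bucket, and `Node`, `chainOf`, and `describe` are helper names invented for the example.

```java
public class HeadInsertionSketch {
    static final class Node {
        final String key;
        Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    // Builds a chain in the given order: chainOf("A", "B") is A -> B.
    static Node chainOf(String... keys) {
        Node head = null;
        for (int i = keys.length - 1; i >= 0; i--) {
            head = new Node(keys[i], head);
        }
        return head;
    }

    // Same pointer moves as the inner loop of transfer, for a single target bucket.
    static Node transfer(Node e) {
        Node newHead = null;
        while (e != null) {
            Node next = e.next; // remember the successor
            e.next = newHead;   // link the node in front of the current head
            newHead = e;        // the node becomes the new head
            e = next;
        }
        return newHead;
    }

    static String describe(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(chainOf("A", "B", "C")));           // A -> B -> C
        System.out.println(describe(transfer(chainOf("A", "B", "C")))); // C -> B -> A
    }
}
```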

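HashMap is not thread-safe. It was never designed for concurrent use, so problems such as out-of-bounds array access and lost data are common when several threads modify it at once. There is one more serious problem: an infinite loop.

(3) Infinite loop

1) Suppose we have two threads

do {
    Entry<K,V> next = e.next; // <-- suppose thread 1 is suspended by the scheduler right after this line
    int i = indexFor(e.hash, newCapacity);
    e.next = newTable[i];
    newTable[i] = e;
    e = next;
} while (e != null);

The table is being resized, and chains are copied with head insertion. At this moment thread 1's e points to key3 and its next points to key7. While thread 1 is suspended, thread 2 completes its own resize, which reverses the chain: key7 now points to key3.

2) Thread 1 is scheduled again

 e.next = newTable[i];
    newTable[i] = e;
    e = next; // key3 is moved over and becomes the head of the bucket; e now points to key7, and next is then read as key3
// next iteration: key7 becomes the head and key3 is second; e now points to key3 and next is null
// next iteration: key3 is inserted at the head again, so key3.next = key7 while key7.next is still key3, and the circular chain appears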
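The interleaving described above can be replayed deterministically in a single thread, which makes the cycle easy to verify. This is only a sketch under simplifying assumptions: both keys map to the same new bucket, each "thread" is just a section of one method, and `Node`, `simulate`, and `hasCycle` are illustrative names.

```java
public class ResizeCycleSketch {
    static final class Node {
        final String key;
        Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    // Replays the two-thread interleaving and returns the head of thread 1's bucket.
    static Node simulate() {
        Node key7 = new Node("key7", null);
        Node key3 = new Node("key3", key7);   // old bucket: key3 -> key7

        // Thread 1 reads e and next, then is suspended.
        Node e = key3;
        Node next = e.next;                   // key7

        // Thread 2 runs its whole transfer: head insertion reverses the chain.
        Node t2Head = null;
        for (Node p = key3; p != null; ) {
            Node n = p.next;
            p.next = t2Head;
            t2Head = p;
            p = n;
        }                                     // now key7 -> key3 -> null

        // Thread 1 resumes with its stale e and next, using its own new bucket.
        Node t1Head = null;
        while (true) {
            e.next = t1Head;
            t1Head = e;
            e = next;
            if (e == null) break;
            next = e.next;
        }
        return t1Head;
    }

    // Floyd's cycle detection: a fast and a slow pointer meet only if the chain loops.
    static boolean hasCycle(Node head) {
        Node slow = head, fast = head;
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
            if (slow == fast) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasCycle(simulate())); // true
    }
}
```

Note that the transfer loop itself terminates here; the hang shows up afterwards, when a lookup walks the circular chain searching for a key that is not in it.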

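(4) Comparison with JDK 1.8

The 1.8 HashMap adds red-black trees: when a chain grows past 8 nodes (TREEIFY_THRESHOLD) and the table has at least 64 buckets (MIN_TREEIFY_CAPACITY), the chain is converted into a red-black tree; when a tree bin shrinks to 6 nodes or fewer (UNTREEIFY_THRESHOLD is 6, not 8), it is converted back into a linked list.
1.7 uses head insertion, whereas 1.8 uses tail insertion.
1.8 perturbs the hash by XORing (**^**) the high 16 bits of hashCode() into the low 16 bits.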
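The 1.8 perturbation is short enough to reproduce in full. The expression below matches the one used by the JDK 8 `HashMap.hash` method; the wrapper class `Jdk8HashSketch` and the method name `spread` are illustrative.

```java
public class Jdk8HashSketch {
    // XOR the high 16 bits of hashCode() into the low 16 bits; null hashes to 0.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        // Integer.hashCode() is the int value itself, which makes the effect easy to read.
        System.out.println(Integer.toHexString(spread(0x10000))); // 10001
        System.out.println(Integer.toHexString(spread(0x20000))); // 20002
        // With a 16-bucket table the two keys now land in buckets 1 and 2
        // instead of both landing in bucket 0.
        System.out.println((spread(0x10000) & 15) + " " + (spread(0x20000) & 15)); // 1 2
    }
}
```

Compared with the four shifts in 1.7, this is a single shift and XOR; the tree bins limit the damage when collisions do occur.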
