[Leetcode 2021 刷题计划] 705. 设计哈希集合

原创

windism

修改于 2021-03-14 16:56:25

2910

文章被收录于专栏：风扬风扬

每日一题时间: 2020-03-13 题目链接: 705. 设计哈希集合官方题解链接: 设计哈希集合

题目

不使用任何内建的哈希表库设计一个哈希集合（HashSet）。

实现 MyHashSet 类：

void add(key) 向哈希集合中插入值 key 。
bool contains(key) 返回哈希集合中是否存在这个值 key 。
void remove(key) 将给定值 key 从哈希集合中删除。如果哈希集合中没有这个值，什么也不做。

示例：
输入：
["MyHashSet", "add", "add", "contains", "contains", "add", "contains", "remove", "contains"]
[[], [1], [2], [1], [3], [2], [2], [2], [2]]
输出：
[null, null, null, true, false, null, true, null, false]

解释：
MyHashSet myHashSet = new MyHashSet();
myHashSet.add(1);      // set = [1]
myHashSet.add(2);      // set = [1, 2]
myHashSet.contains(1); // 返回 True
myHashSet.contains(3); // 返回 False ，（未找到）
myHashSet.add(2);      // set = [1, 2]
myHashSet.contains(2); // 返回 True
myHashSet.remove(2);   // set = [1]
myHashSet.contains(2); // 返回 False ，（已移除）

提示：

0 <= key <= 10^6
最多调用 10^4 次 add、remove 和 contains 。

解题方法

对于最为难的 红黑树 解法就不做编写啦，感兴趣可以自行查阅。

数组

解题思路: 题目中给出 key 的范围, 因此可以利用大于 10^6 的数据进行解题。

关于数组和容器的性能问题本部分不做探讨, 对于定长的问题应当使用数组进行解题，不定长适合容器进行解题。

class MyHashSet {
private:
    vector<bool> data = vector<bool>(1000001, false); 
public:
    /** Initialize your data structure here. */
    MyHashSet() {}
    
    void add(int key) {
        data[key] = true;
    }
    
    void remove(int key) {
        data[key] = false;
    }
    
    /** Returns true if this set contains the specified element */
    bool contains(int key) {
        return data[key];
    }
};

复杂度分析
- 时间复杂度: O(1)
- 空间复杂度: O(M)
  - M 代表 key 的范围, 本题中为 10^6

HashTable

解题思路: 常常解题中使用到 C++ 的 unordered_set, 该类就是利用 HashTable 实现的。如下图，图片取自官方题解:设计哈希集合。

对于数组解法最大的问题是如果 key 数量少，则浪费空间较多，如果 key 较大，可能空间急速膨胀。

选用固定长度 N 的数组, 则 key 的位置为 key mod N
如果 key = N, 2N, 3N 则可能同时需要在一个位置, 此时的情况叫做哈希碰撞, 对于哈希碰撞有多种解决方法
1. 链地址法: 对于同一位置 key 利用链表进行存储
2. 开放寻址法: 一旦发生了冲突，利用 key+di 寻找下一个空的散列地址(每个位置是 key 值)
3. 再哈希: 重复使用不同的Hash函数找下一个空的散列地址
4. 公共溢出区: 若有冲突的，均放置在公共溢出区

对于 N 一般选用质数, 简单来说，就是由于非质数存在非1和本身的其他因子，因此遇到该因子等差构成的序列时更容易发生哈希碰撞。
其原因详见算法分析：哈希表的大小为何是素数和 good hash table primes

class MyHashSet {
private:
    vector<list<int>> data;
    static const int base = 769;
    static int hash(int key) {
        return key % base;
    }
public:
    /** Initialize your data structure here. */
    MyHashSet(): data(base) {}
    
    void add(int key) {
        int h = hash(key);
        for (auto it = data[h].begin(); it != data[h].end(); it++) {
            if ((*it) == key) {
                return;
            }
        }
        data[h].push_back(key);
    }
    
    void remove(int key) {
        int h = hash(key);
        for (auto it = data[h].begin(); it != data[h].end(); it++) {
            if ((*it) == key) {
                data[h].erase(it);
                return;
            }
        }
    }
    
    /** Returns true if this set contains the specified element */
    bool contains(int key) {
        int h = hash(key);
        for (auto it = data[h].begin(); it != data[h].end(); it++) {
            if ((*it) == key) {
                return true;
            }
        }
        return false;
    }
};

复杂度分析
- 时间复杂度: O(N/B)
  - 哈希值是均匀分布的, 则为该式
- 空间复杂度: O(N + B)
  - N: 哈希表中的元素数量
  - B: 链表的数量

参考资料

原创声明：本文系作者授权腾讯云开发者社区发表，未经许可，不得转载。

如有侵权，请联系 cloudcommunity@tencent.com 删除。

编程算法

网站

原创声明：本文系作者授权腾讯云开发者社区发表，未经许可，不得转载。

如有侵权，请联系 cloudcommunity@tencent.com 删除。

编程算法

网站

登录后参与评论

0 条评论

热度

[Leetcode 2021 刷题计划] 705. 设计哈希集合

[Leetcode 2021 刷题计划] 705. 设计哈希集合

题目

解题方法

数组

HashTable

参考资料

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐