数据结构设计题

原创

王脸小

修改于 2019-11-07 17:03:54

1.1K0

修改于 2019-11-07 17:03:54

文章被收录于专栏：王漂亮

十万个为什么

What to return for each function? Size of data?

在做题之前就clearify(功能，use case，function的input和returns)，在左边写下来，作为design的agreement

146. LRU缓存机制 - M

LRU Cache

Design and implement a data structure for Least Recently Used (LRU) cache. It should support the following operations: get and put.

get(key) - Get the value (will always be positive) of the key if the key exists in the cache, otherwise return -1. put(key, value) - Set or insert the value if the key is not already present. When the cache reached its capacity, it should invalidate the least recently used item before inserting a new item.

The cache is initialized with a positive capacity.

Follow up: Could you do both operations in O(1) time complexity?

Example:

LRUCache cache = new LRUCache( 2 /* capacity */ );

cache.put(1, 1);
cache.put(2, 2);
cache.get(1);       // returns 1
cache.put(3, 3);    // evicts key 2
cache.get(2);       // returns -1 (not found)
cache.put(4, 4);    // evicts key 1
cache.get(1);       // returns -1 (not found)
cache.get(3);       // returns 3
cache.get(4);       // returns 4

Solution

Solution - OrderedDict

from collections import OrderedDict
class LRUCache:

    def __init__(self, capacity: int):
        self.di = OrderedDict()
        self.capacity = capacity

    def get(self, key: int) -> int:
        if key not in self.di: return -1
        else:
            v = self.di.pop(key)
            self.di[key] = v
            
        return v

    def put(self, key: int, value: int) -> None:
        if key in self.di:
            self.di.pop(key)
        else:
            if self.capacity > 0:
                self.capacity -= 1
            else:
                self.di.popitem(last=False)
                
        self.di[key] = value

抄的：主要是self.di.popitem(last=False)，追加在最后，先进先出

Solution - 哈希双向链表

如何实现LRU缓存机制

主要靠点在O1，hash实现get/put O1，双向链表实现有序，并O1删除元素，并设计空表头/表尾方便操作

class Node:
    def __init__(self, key, val):
        self.key = key
        self.val = val
        self.pre= None
        self.next = None

class LRUCache:
    def __init__(self, capacity):
        # init dict
        self.cache = {}
        self.capacity = capacity
        
        # init double linked list
        self.head, self.tail = Node(0,0), Node(0,0)
        self.head.next = self.tail
        self.tail.pre = self.head
    
    def get(self, key) -> int:
        if key in self.cache:
            node = self.cache[key]
            self._remove(node)
            self._add(node)
            return node.val
        else:
            return -1
        
    def put(self, key, value) -> None:
        if key in self.cache:
            self._remove(self.cache[key])
            
        if len(self.cache) == self.capacity:
            self._remove(self.head.next)
        
        self._add(Node(key, value))
        
    def _remove(self, node):
        self.cache.pop(node.key)
        node.pre.next = node.next
        node.next.pre = node.pre
        
    def _add(self, node):
        self.cache[node.key] = node
        pre_tail = self.tail.pre
        pre_tail.next = node
        node.pre = pre_tail
        node.next = self.tail
        self.tail.pre = node

380. 常数时间插入、删除和获取随机元素

Insert Delete GetRandom O(1)

Design a data structure that supports all following operations in average O(1) time.

insert(val): Inserts an item val to the set if not already present.
remove(val): Removes an item val from the set if present.
getRandom: Returns a random element from current set of elements. Each element must have the same probability of being returned.

Solution - 哈希+数组

get/insert不要忘记判断key是否存在！！！

random.choice(self.arr)

import random 
class RandomizedSet:

    def __init__(self):
        self.di = {}
        self.arr = []        

    def insert(self, val: int) -> bool:
        if val not in self.di:
            self.di[val] = len(self.arr)
            self.arr.append(val)
            return True
        return False
        
    def remove(self, val: int) -> bool:
        if val in self.di:
            self.di[self.arr[-1]] = self.di[val]
            self.arr[self.di[val]] = self.arr[-1]

            self.arr.pop()
            self.di.pop(val)

            return True
        return False

    def getRandom(self) -> int:
        return random.choice(self.arr)

706. 设计哈希映射

Design HashMap

Design a HashMap without using any built-in hash table libraries.

To be specific, your design should include these functions:

put(key, value) : Insert a (key, value) pair into the HashMap. If the value already exists in the HashMap, update the value.
get(key): Returns the value to which the specified key is mapped, or -1 if this map contains no mapping for the key.
remove(key) : Remove the mapping for the value key if this map contains the mapping for the key.

Example:

MyHashMap hashMap = new MyHashMap();
hashMap.put(1, 1);          
hashMap.put(2, 2);         
hashMap.get(1);            // returns 1
hashMap.get(3);            // returns -1 (not found)
hashMap.put(2, 1);          // update the existing value
hashMap.get(2);            // returns 1 
hashMap.remove(2);          // remove the mapping for 2
hashMap.get(2);            // returns -1 (not found)

All keys and values will be in the range of [0, 1000000].
The number of operations will be in the range of [1, 10000].
Please do not use the built-in HashMap library.

Solution

这道题由于key都是整数有非常粗暴的方法，直接初始化len=100000的数组，数组的index作为key

下面的实现是链地址法：size1000的list，每个位置是个链表，将100万数据映射到list中的链表里，表头为空

class Node:
    def __init__(self, key=None, val=None, nex=None):
        self.key = key
        self.val = val
        self.nex = nex

class MyHashMap:
    def __init__(self):
        self.size = 1000
        self.h = [Node() for _ in range(self.size)]

    def put(self, key: int, value: int) -> None:
        pre = self.h[key%self.size]
        cur = pre.nex
        while cur:
            if cur.key == key:
                cur.val = value
                break
            pre, cur = cur, cur.nex
        else:
            pre.nex = Node(key, value)
        
    def get(self, key: int) -> int:
        cur = self.h[key%self.size]
        while cur:
            if cur.key == key:
                return cur.val
            cur = cur.nex
        return -1

    def remove(self, key: int) -> None:
        pre = self.h[key%self.size]
        cur = pre.nex
        while cur:
            if cur.key == key:
                pre.nex = cur.nex
                break
            pre, cur = cur, cur.nex

reference 链地址法

155. 最小栈 - E

Min Stack

Design a stack that supports push, pop, top, and retrieving the minimum element in constant time.

push(x) -- Push element x onto stack.
pop() -- Removes the element on top of the stack.
top() -- Get the top element.
getMin() -- Retrieve the minimum element in the stack.

Example:

MinStack minStack = new MinStack();
minStack.push(-2);
minStack.push(0);
minStack.push(-3);
minStack.getMin();   --> Returns -3.
minStack.pop();
minStack.top();      --> Returns 0.
minStack.getMin();   --> Returns -2.

Solution - 数据栈+辅助最小栈

11.5： self.stack[-1]用-1来取栈顶； pop时最小栈部分判断栈顶是不是popitem就好

class MinStack:

    def __init__(self):
        self.stack = []
        self.min_stack = []

    def push(self, x: int) -> None:
        self.stack.append(x)
        if not self.min_stack or self.min_stack[-1] >= x: self.min_stack.append(x)

    def pop(self) -> None:
        x = self.stack.pop()
        if x == self.min_stack[-1]: self.min_stack.pop()

    def top(self) -> int:
        return self.stack[-1]

    def getMin(self) -> int:
        return self.min_stack[-1]

优化：不使用辅助栈，单个栈实现，每次压入两个值：新值以及最小值。出栈时弹出两个值，获取最小值时直接读取栈顶元素：

Trie

208. 实现 Trie (前缀树)

Implement a trie with insert, search, and startsWith methods.

Example:

Trie trie = new Trie();

trie.insert("apple");
trie.search("apple");   // returns true
trie.search("app");     // returns false
trie.startsWith("app"); // returns true
trie.insert("app");   
trie.search("app");     // returns true

Note:

You may assume that all inputs are consist of lowercase letters a-z.
All inputs are guaranteed to be non-empty strings.

Solution

单节点结构：哈希，key为前序，value为后继

11.5：self.nodes =defaultdict(TrieNode)不是defaultdict(list)

class TrieNode:
    def __init__(self):
        self.nodes = defaultdict(TrieNode)
        self.isword = False

class Trie:
    def __init__(self):
        self.root = TrieNode()
    
    def insert(self, word: str) -> None:
        curr = self.root
        for ch in word:
            curr = curr.nodes[ch]
        curr.isword = True
        
    def search(self, word: str) -> bool:
        curr = self.root
        for ch in word:
            if ch not in curr.nodes: return False
            curr = curr.nodes[ch]
        
        return curr.isword
        
    def startsWith(self, prefix: str) -> bool:
        curr = self.root
        for ch in prefix:
            if ch not in curr.nodes: return False
            curr = curr.nodes[ch]
            
        return True

insert是抄的，另外两个自己写的

复杂度：如果敏感词的长度为 m，则每个敏感词的查找时间复杂度是 O(m)，字符串的长度为 n，我们需要遍历 n 遍，所以敏感词查找这个过程的时间复杂度是 O(n * m)。如果有 t 个敏感词的话，构建 trie 树的时间复杂度是 O(t * m)。

642. 设计搜索自动补全系统

为搜索引擎设计一个搜索自动补全系统。用户会输入一条语句（最少包含一个字母，以特殊字符 '#' 结尾）。除 '#' 以外用户输入的每个字符，返回历史中热度前三并以当前输入部分为前缀的句子。下面是详细规则：

一条句子的热度定义为历史上用户输入这个句子的总次数。
返回前三的句子需要按照热度从高到低排序（第一个是最热门的）。如果有多条热度相同的句子，请按照 ASCII 码的顺序输出（ASCII 码越小排名越前）。
如果满足条件的句子个数少于 3，将它们全部输出。
如果输入了特殊字符，意味着句子结束了，请返回一个空集合。

你的工作是实现以下功能：

构造函数：

AutocompleteSystem(String[] sentences, int[] times): 这是构造函数，输入的是历史数据。 Sentences 是之前输入过的所有句子，Times 是每条句子输入的次数，你的系统需要记录这些历史信息。

现在，用户输入一条新的句子，下面的函数会提供用户输入的下一个字符：

List<String> input(char c): 其中 c 是用户输入的下一个字符。字符只会是小写英文字母（'a' 到 'z' ），空格（' '）和特殊字符（'#'）。输出历史热度前三的具有相同前缀的句子。

样例： 操作： AutocompleteSystem(["i love you", "island","ironman", "i love leetcode"], [5,3,2,2]) 系统记录下所有的句子和出现的次数： "i love you" : 5 次 "island" : 3 次 "ironman" : 2 次 "i love leetcode" : 2 次现在，用户开始新的键入：

输入： input('i') 输出： ["i love you", "island","i love leetcode"] 解释： 有四个句子含有前缀 "i"。其中 "ironman" 和 "i love leetcode" 有相同的热度，由于 ' ' 的 ASCII 码是 32 而 'r' 的 ASCII 码是 114，所以 "i love leetcode" 在 "ironman" 前面。同时我们只输出前三的句子，所以 "ironman" 被舍弃。 输入： input(' ') 输出： ["i love you","i love leetcode"] 解释: 只有两个句子含有前缀 "i "。 输入： input('a') 输出： [] 解释： 没有句子有前缀 "i a"。 输入： input('#') 输出： [] 解释：

用户输入结束，"i a" 被存到系统中，后面的输入被认为是下一次搜索。

Solution

原生Trie+search+回溯(列出所有可能)，抄了search和traverse部分

Reference

from collections import defaultdict 
class AutocompleteSystem:
    class TrieNode():
        def __init__(self):
            self.nodes = defaultdict(AutocompleteSystem.TrieNode)
            self.isword = False
            self.times = 0
            
    class Trie():
        def __init__(self):
            self.root = AutocompleteSystem.TrieNode()
        
        def insert(self, word, times=1):
            r = self.root
            for ch in word:
                r = r.nodes[ch]
            r.isword = True
            r.times += times
        
        def search(self, word):
            cur = self.root
            for ch in word:
                cur = cur.nodes[ch]
            res,path = [], word
            self.traverse(cur, path, res)
            return res
        
        # 开始回溯
        def traverse(self, cur, path, res):
            if cur.isword:
                res.append((-cur.times, path))
            for ch in cur.nodes:
                self.traverse(cur.nodes[ch], path + ch, res)
            

    def __init__(self, sentences, times):
        self.trie = AutocompleteSystem.Trie()
        self.cur = ''
        for word, cnt in zip(sentences,times):
            self.trie.insert(word,cnt)
        
    def input(self, c: str) -> List[str]:
        if c == '#':
            self.trie.insert(self.cur)
            self.cur = ''
            return []
        else:
            self.cur += c
            res = self.trie.search(self.cur)
            if not res:
                return res
            res.sort()
            res = [word for times, word in res[:3]]
            return res

其他设计

295-. 数据流的中位数

Find Median from Data Stream

Median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value. So the median is the mean of the two middle value.

For example,

[2,3,4], the median is 3

[2,3], the median is (2 + 3) / 2 = 2.5

Design a data structure that supports the following two operations:

void addNum(int num) - Add a integer number from the data stream to the data structure.
double findMedian() - Return the median of all elements so far.

Example:

addNum(1)
addNum(2)
findMedian() -> 1.5
addNum(3) 
findMedian() -> 2

Follow up:

If all integer numbers from the stream are between 0 and 100, how would you optimize it?
If 99% of all integer numbers from the stream are between 0 and 100, how would you optimize it?

Solution

基础解法：1. 排序Onlogn；2. 插入排序logn(查找)+On(插入)

Reference

最大堆，最小堆，求中位数就是前半部分的最大值，后半部分的最小值。

每次入堆，都有从另一个堆里挤出一个元素，保证最小堆和最大堆是数据流前后两部分

注意：maxheap的负号

import heapq
class MedianFinder(object):
    def __init__(self):
        self.maxheap = []
        self.minheap = []

    def addNum(self, num):
        if len(self.minheap) == len(self.maxheap):
            heapq.heappush(self.maxheap, -heapq.heappushpop(self.minheap, num))
        else:
            heapq.heappush(self.minheap, -heapq.heappushpop(self.maxheap, -num))
        

    def findMedian(self):
        if len(self.minheap) == len(self.maxheap):
            return (-self.maxheap[0] + self.minheap[0])/2.0
        else:
            return -self.maxheap[0]

348. 判定井字棋胜负

给定棋盘边长 n = 3, 玩家 1 的棋子符号是 "X"，玩家 2 的棋子符号是 "O"。

TicTacToe toe = new TicTacToe(3);

toe.move(0, 0, 1); -> 函数返回 0 (此时，暂时没有玩家赢得这场对决)
|X| | |
| | | |    // 玩家 1 在 (0, 0) 落子。
| | | |

toe.move(0, 2, 2); -> 函数返回 0 (暂时没有玩家赢得本场比赛)
|X| |O|
| | | |    // 玩家 2 在 (0, 2) 落子。
| | | |

toe.move(2, 2, 1); -> 函数返回 0 (暂时没有玩家赢得比赛)
|X| |O|
| | | |    // 玩家 1 在 (2, 2) 落子。
| | |X|

toe.move(1, 1, 2); -> 函数返回 0 (暂没有玩家赢得比赛)
|X| |O|
| |O| |    // 玩家 2 在 (1, 1) 落子。
| | |X|

toe.move(2, 0, 1); -> 函数返回 0 (暂无玩家赢得比赛)
|X| |O|
| |O| |    // 玩家 1 在 (2, 0) 落子。
|X| |X|

toe.move(1, 0, 2); -> 函数返回 0 (没有玩家赢得比赛)
|X| |O|
|O|O| |    // 玩家 2 在 (1, 0) 落子.
|X| |X|

toe.move(2, 1, 1); -> 函数返回 1 (此时，玩家 1 赢得了该场比赛)
|X| |O|
|O|O| |    // 玩家 1 在 (2, 1) 落子。
|X|X|X|

进阶:

您有没有可能将每一步的 move() 操作优化到比 O(n2) 更快吗?

Solution

记录player横/竖/对角线棋子个数，当有=n出现时返回player

class TicTacToe(object):

    def __init__(self, n):
        self.n = n
        self.rows = [[0 for _ in range(n)] for _ in range(2)]
        self.cols = [[0 for _ in range(n)] for _ in range(2)]
        self.angs = [[0 for _ in range(2)] for _ in range(2)]      

    def move(self, row, col, player):
        """
        @return The current winning condition, can be either:
                0: No one wins.
                1: Player 1 wins.
                2: Player 2 wins.
        """
        self.rows[player-1][row] += 1
        self.cols[player-1][col] += 1
        self.angs[player-1][0] += 1 if row == col else 0
        self.angs[player-1][1] += 1 if row+col==self.n-1 else 0
                  
        if self.n == max(self.rows[player-1][row],self.cols[player-1][col],max(self.angs[player-1])):
            return player
        return 0

直接抄了最优解，待自己想思路

Reference: https://leetcode-cn.com/problems/number-of-islands/solution/dfs-bfs-bing-cha-ji-python-dai-ma-java-dai-ma-by-l/ 并查集(Union-Find)算法介绍 https://blog.csdn.net/dm_vincent/article/details/7655764

原创声明：本文系作者授权腾讯云开发者社区发表，未经许可，不得转载。

如有侵权，请联系 cloudcommunity@tencent.com 删除。

缓存

http

原创声明：本文系作者授权腾讯云开发者社区发表，未经许可，不得转载。

如有侵权，请联系 cloudcommunity@tencent.com 删除。

缓存

http

登录后参与评论

0 条评论

热度