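This article introduces heaps and their implementation in Python's standard library, the heapq module.
The heapq module provides an implementation of the heap queue algorithm, also known as the priority queue algorithm.
A heap is a binary tree in which every parent node's value is less than or equal to the value of each of its children (a min-heap).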
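The invariant is usually checked on a flat list: heapq stores the tree in a list where the children of index k sit at 2k + 1 and 2k + 2. The following is a minimal sketch of such a check; `is_min_heap` is a hypothetical helper written for illustration, not part of heapq.

```python
import heapq

def is_min_heap(items):
    """Return True if every parent is <= each of its children (0-based layout)."""
    for child in range(1, len(items)):
        parent = (child - 1) // 2
        if items[parent] > items[child]:
            return False
    return True

data = [13, 22, 4, 6, 8, 45, 67, 7]
before = is_min_heap(data)   # False: the root 13 is larger than its child 4
heapq.heapify(data)
after = is_min_heap(data)    # True: heapify establishes the invariant
print(before, after)
```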
heapq.heapify(x)
Transform the list x into a heap, in place, in linear time.
import heapq
a = [[13, 'asdf'], [22, 'asdf'], [4, 'asdf'], [6, 'asdf'], [8, 'asdf'], [45, 'asdf'], [67, 'asdf'], [7, 'asdf']]
heapq.heapify(a)
-->
a
[[4, 'asdf'], [6, 'asdf'], [13, 'asdf'], [7, 'asdf'], [8, 'asdf'], [45, 'asdf'], [67, 'asdf'], [22, 'asdf']]
heapq.heappush(heap, item)
Push a new item onto the heap in O(log n) time, maintaining the heap invariant.
import heapq
a = [[13, 'asdf'], [22, 'asdf'], [4, 'asdf'], [6, 'asdf'], [8, 'asdf'], [45, 'asdf'], [67, 'asdf'], [7, 'asdf']]
heapq.heapify(a)
heapq.heappush(a, [1, 'etrfg'])
-->
a
[[1, 'etrfg'], [4, 'asdf'], [13, 'asdf'], [6, 'asdf'], [8, 'asdf'], [45, 'asdf'], [67, 'asdf'], [22, 'asdf'], [7, 'asdf']]
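Entries of the form [priority, payload] compare by priority first and fall back to comparing payloads on a tie. One common pattern, sketched here under the assumption that equal priorities should come out in insertion order, adds a running counter as a tie-breaker. The names `push_task` and `counter` are illustrative only.

```python
import heapq
import itertools

heap = []
counter = itertools.count()  # increasing sequence number used as a tie-breaker

def push_task(priority, task):
    # Entries compare by priority, then by insertion order.
    # The task itself is never compared, so it may be any object.
    heapq.heappush(heap, (priority, next(counter), task))

push_task(2, {"name": "write"})
push_task(1, {"name": "read"})
push_task(2, {"name": "review"})

order = [heapq.heappop(heap)[2]["name"] for _ in range(3)]
print(order)  # ['read', 'write', 'review']
```

Without the counter, two entries with the same priority would compare the dict payloads, which raises a TypeError.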
heapq.heappop(heap)
Pop and return the smallest item from the heap, maintaining the heap invariant.
import heapq
a = [[13, 'asdf'], [22, 'asdf'], [4, 'asdf'], [6, 'asdf'], [8, 'asdf'], [45, 'asdf'], [67, 'asdf'], [7, 'asdf']]
heapq.heapify(a)
heapq.heappush(a, [1, 'etrfg'])
b = heapq.heappop(a)
-->
b
[1, 'etrfg']
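Because each pop returns the current minimum, popping until the heap is empty yields the items in ascending order. A small sketch of this idea follows; `heapsort` is an illustrative name, not a heapq function.

```python
import heapq

def heapsort(iterable):
    """Sort by heapifying a copy and popping the minimum repeatedly."""
    heap = list(iterable)          # copy, so the caller's data is untouched
    heapq.heapify(heap)            # O(n)
    return [heapq.heappop(heap) for _ in range(len(heap))]  # n pops, O(log n) each

print(heapsort([13, 22, 4, 6, 8, 45, 67, 7]))  # [4, 6, 7, 8, 13, 22, 45, 67]
```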
heapq.heappushpop(heap, item)
Push item onto the heap, then pop and return the smallest item from the heap.
import heapq
a = [[13, 'asdf'], [22, 'asdf'], [4, 'asdf'], [6, 'asdf'], [8, 'asdf'], [45, 'asdf'], [67, 'asdf'], [7, 'asdf']]
heapq.heapify(a)
b = heapq.heappushpop(a, [9, 'etrfg'])
-->
b
[4, 'asdf']
a
[[6, 'asdf'], [7, 'asdf'], [13, 'asdf'], [9, 'etrfg'], [8, 'asdf'], [45, 'asdf'], [67, 'asdf'], [22, 'asdf']]
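A typical use of heappushpop is keeping only the k largest values of a stream in a heap of fixed size k. The sketch below assumes plain comparable values; `top_k` is a hypothetical helper name.

```python
import heapq

def top_k(stream, k):
    """Return the k largest values of stream, largest first."""
    heap = []
    for value in stream:
        if len(heap) < k:
            heapq.heappush(heap, value)
        else:
            # Push the new value, then drop the smallest of the k + 1 candidates.
            heapq.heappushpop(heap, value)
    return sorted(heap, reverse=True)

print(top_k([13, 22, 4, 6, 8, 45, 67, 7], 3))  # [67, 45, 22]
```

The heap never grows beyond k items, so memory stays bounded regardless of the stream length.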
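heapq.heapreplace(heap, item)
Pop and return the smallest item (the top of the heap) first, then push the new item. Because the pop happens first, the returned value may be larger than the item just pushed.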
import heapq
a = [[13, 'asdf'], [22, 'asdf'], [4, 'asdf'], [6, 'asdf'], [8, 'asdf'], [45, 'asdf'], [67, 'asdf'], [7, 'asdf']]
heapq.heapify(a)
b = heapq.heapreplace(a, [1, 'etrfg'])
-->
b
[4, 'asdf']
a
[[1, 'etrfg'], [6, 'asdf'], [13, 'asdf'], [7, 'asdf'], [8, 'asdf'], [45, 'asdf'], [67, 'asdf'], [22, 'asdf']]
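The difference between heappushpop and heapreplace shows up when the new item is smaller than the current root. A short side-by-side sketch with illustrative data:

```python
import heapq

h1 = [4, 6, 13]   # already a valid heap
h2 = [4, 6, 13]

# Push first, then pop: the new item 1 is the minimum, so it comes straight back.
r1 = heapq.heappushpop(h1, 1)

# Pop first, then push: the old root 4 is returned and 1 stays in the heap.
r2 = heapq.heapreplace(h2, 1)

print(r1, h1)  # 1 [4, 6, 13]
print(r2, h2)  # 4 [1, 6, 13]
```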
heapq.merge(*iterables)
Merge multiple sorted inputs into a single sorted iterator. Each input must already be sorted; a heapified list is generally not fully sorted, so sort it first.
import heapq
a = [[13, 'asdf'], [22, 'asdf'], [4, 'asdf'], [6, 'asdf'], [8, 'asdf'], [45, 'asdf'], [67, 'asdf'], [7, 'asdf']]
a.sort()
b = heapq.merge(a, a, a)
-->
list(b)
[[4, 'asdf'], [4, 'asdf'], [4, 'asdf'], [6, 'asdf'], [6, 'asdf'], [6, 'asdf'], [7, 'asdf'], [7, 'asdf'], [7, 'asdf'], [8, 'asdf'], [8, 'asdf'], [8, 'asdf'], [13, 'asdf'], [13, 'asdf'], ...]
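heapq.merge assumes each input is already sorted from smallest to largest and returns a lazy iterator rather than a list. A brief sketch with illustrative data, including the optional key argument:

```python
import heapq

evens = [0, 2, 4, 6]
odds = [1, 3, 5]

# Both inputs are sorted, so the merged stream is sorted as well.
merged = list(heapq.merge(evens, odds))
print(merged)  # [0, 1, 2, 3, 4, 5, 6]

# With key=, the inputs must be sorted according to that key.
by_len = list(heapq.merge(["a", "ccc"], ["bb", "dddd"], key=len))
print(by_len)  # ['a', 'bb', 'ccc', 'dddd']
```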
heapq.nlargest(n, iterable)
Return the n largest elements, in descending order.
import heapq
a = [[13, 'asdf'], [22, 'asdf'], [4, 'asdf'], [6, 'asdf'], [8, 'asdf'], [45, 'asdf'], [67, 'asdf'], [7, 'asdf']]
heapq.heapify(a)
b = heapq.nlargest(3, a)
-->
b
[[67, 'asdf'], [45, 'asdf'], [22, 'asdf']]
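nlargest and nsmallest also accept an optional key function, which avoids having to put the sort field first in every entry. A minimal sketch with made-up records:

```python
import heapq

scores = [("ann", 13), ("bob", 22), ("cat", 4), ("dan", 67)]

# Rank by the second field of each pair instead of the name.
top2 = heapq.nlargest(2, scores, key=lambda pair: pair[1])
print(top2)  # [('dan', 67), ('bob', 22)]
```

The input does not need to be heapified first; any iterable works.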
heapq.nsmallest(n, iterable)
Return the n smallest elements, in ascending order.
import heapq
a = [[13, 'asdf'], [22, 'asdf'], [4, 'asdf'], [6, 'asdf'], [8, 'asdf'], [45, 'asdf'], [67, 'asdf'], [7, 'asdf']]
heapq.heapify(a)
b = heapq.nsmallest(3, a)
-->
b
[[4, 'asdf'], [6, 'asdf'], [7, 'asdf']]
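nsmallest returns the smallest items in sorted order, which is generally not the same as the first n slots of the heap list: only index 0 is guaranteed to hold the minimum. A quick check using plain integers in place of the [number, 'asdf'] entries:

```python
import heapq

a = [13, 22, 4, 6, 8, 45, 67, 7]
heapq.heapify(a)

first_three = a[:3]                # heap layout order, not sorted order
smallest = heapq.nsmallest(3, a)   # truly the three smallest, ascending

print(first_three)  # [4, 6, 13]
print(smallest)     # [4, 6, 7]
```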
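A heap is built bottom-up. Start from the last element and walk toward the front of the array. For an element at position index, its parent is at (index - 1) // 2; if the previous element's parent, (index - 2) // 2, is the same node (equivalently, index is even, with 0-based indexing), the two elements are siblings. Take the minimum of the parent and its children and move it into the parent position, which completes that local heap. If the parent's old value was swapped down into a child position, it must keep sifting down inside that child's subtree until it is no larger than its own children; otherwise the swap can break a lower heap that was built earlier. Repeating this for every parent back to the start of the array completes the heap. A single sift-down can cost O(log n), but most nodes sit near the bottom and move only a short distance, so the total time is O(n).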
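The bottom-up construction can be sketched as follows. This is an illustrative textbook version, not the code heapq uses internally; `sift_down` and `build_heap` are hypothetical names.

```python
def sift_down(items, start, end):
    """Move items[start] down until it is <= both of its children."""
    parent = start
    while True:
        child = 2 * parent + 1            # left child
        if child >= end:
            break                         # parent is a leaf
        if child + 1 < end and items[child + 1] < items[child]:
            child += 1                    # right child is the smaller one
        if items[parent] <= items[child]:
            break                         # local heap already valid
        items[parent], items[child] = items[child], items[parent]
        parent = child                    # keep sifting in the lower subtree

def build_heap(items):
    """Heapify in place by sifting down every parent, last parent first."""
    n = len(items)
    for parent in range(n // 2 - 1, -1, -1):
        sift_down(items, parent, n)

data = [9, 1, 8, 2, 3]
build_heap(data)
print(data)  # [1, 2, 8, 9, 3]
```

In this input, 9 first swaps with 1 at the root and then has to continue down past 2, which is why a single swap per parent is not enough.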