# 数据结构（五）：哈夫曼树（Huffman Tree）

### 哈夫曼树的构造

1. 对字符集合按照字符频率进行升序排序，并构建一颗空树；
2. 遍历字符集合，将每一个字符添加到树中，添加规则为： 【1】若树为空，则作为根节点； 【2】若字符频率不大于根节点频率，则字符作为根节点的左兄弟，形成一个新的根节点，频率值为左、右子节点之和； 【3】若字符频率大于根节点频率，则字符作为根节点的右兄弟，形成一个新的根节点，频率值为左右子节点之和。
##### 构造示例

~

step 1:

```# synchronise sort the valueArr and contentArr
def insertionSort(valueArr, contentArr):
for i in range(1, len(valueArr)):  # iteration times
tmpValue = valueArr[i]
tmpContent = contentArr[i]
while i > 0 and tmpValue < valueArr[i - 1]:
valueArr[i] = valueArr[i - 1]
contentArr[i] = contentArr[i - 1]
i = i - 1
valueArr[i] = tmpValue
contentArr[i] = tmpContent```

step 2:

```# tree node definition
class Node(object):
def __init__(self, value, content=None, lchild=None, rchild=None):
self.lchild = lchild
self.rchild = rchild
self.value = value
self.content = content```

```# tree definition
class Tree(object):
def __init__(self, root=None):
self.root = root
self.codeMap = {}

# merge two nodes and return one root node
def acceptNewNode(self, value, content):
if not self.root:
self.root = Node(value, content)
else:
newNode = Node(value, content)
newRoot = Node(self.root.value + value)
lchild, rchild = (self.root, newNode) if self.root.value < value else (newNode, self.root)
newRoot.lchild, newRoot.rchild = lchild, rchild
self.root = newRoot```

，频率为

，频率为

，频率为

... ... ...

，频率为

### 哈夫曼树编解码

##### 构造哈希表

```# initialize the huffman tree code map
def initializeCodeMap(node, byteArr, codeMap):
if node.lchild:
byteArr.append('0')
initializeCodeMap(node.lchild, byteArr, codeMap)
byteArr.append('1')
initializeCodeMap(node.rchild, byteArr, codeMap)
byteArr.pop() if len(byteArr) > 0 else None  # in case only the root node left
else:
codeMap[node.content] = ''.join(byteArr)
byteArr.pop()```

##### 编码与解码

```# tree definition
class Tree(object):

# encode
def encode(self, chars):
bytes = ''
for i in chars:  # get the mapped bytes
bytes += self.codeMap.get(i.upper(), '###')
return bytes

# decode
def decode(self, bytes):
chars = ''
tmpNode = self.root
for i in bytes:
if i == '0':
tmpNode = tmpNode.lchild
elif i == '1':
tmpNode = tmpNode.rchild
if not tmpNode.lchild:
chars += tmpNode.content
tmpNode = self.root
return chars```

### 代码附录

```# tree node definition
class Node(object):
def __init__(self, value, content=None, lchild=None, rchild=None):
self.lchild = lchild
self.rchild = rchild
self.value = value
self.content = content

# tree definition
class Tree(object):
def __init__(self, root=None):
self.root = root
self.codeMap = {}

# initialize the huffman tree code map
def initializeCodeMap(self):
initializeCodeMap(self.root, [], self.codeMap)

# encode
def encode(self, chars):
bytes = ''
for i in chars:  # get the mapped bytes
bytes += self.codeMap.get(i.upper(), '###')
return bytes

# decode
def decode(self, bytes):
chars = ''
tmpNode = self.root
for i in bytes:
if i == '0':
tmpNode = tmpNode.lchild
elif i == '1':
tmpNode = tmpNode.rchild
if not tmpNode.lchild:
chars += tmpNode.content
tmpNode = self.root
return chars

# merge two nodes and return one root node
def acceptNewNode(self, value, content):
if not self.root:
self.root = Node(value, content)
else:
newNode = Node(value, content)
newRoot = Node(self.root.value + value)
lchild, rchild = (self.root, newNode) if self.root.value < value else (newNode, self.root)
newRoot.lchild, newRoot.rchild = lchild, rchild
self.root = newRoot

# initialize the huffman tree code map
def initializeCodeMap(node, byteArr, codeMap):
if node.lchild:
byteArr.append('0')
initializeCodeMap(node.lchild, byteArr, codeMap)
byteArr.append('1')
initializeCodeMap(node.rchild, byteArr, codeMap)
byteArr.pop() if len(byteArr) > 0 else None  # in case only the root node left
else:
codeMap[node.content] = ''.join(byteArr)
byteArr.pop()

# construct the huffman tree
def createHuffmanTree(valueArr, contentArr):
insertionSort(valueArr, contentArr)  # synchronise sort the valueArr and contentArr
hfTree = Tree()
for i in range(len(valueArr)):  # construct the huffman tree
hfTree.acceptNewNode(valueArr[i], contentArr[i])
hfTree.initializeCodeMap() # initialize the huffman tree code map
return hfTree

# synchronise sort the valueArr and contentArr
def insertionSort(valueArr, contentArr):
for i in range(1, len(valueArr)):  # iteration times
tmpValue = valueArr[i]
tmpContent = contentArr[i]
while i > 0 and tmpValue < valueArr[i - 1]:
valueArr[i] = valueArr[i - 1]
contentArr[i] = contentArr[i - 1]
i = i - 1
valueArr[i] = tmpValue
contentArr[i] = tmpContent```

```if __name__ == '__main__':
valueArr = [5, 3, 4, 0, 2, 1, 8, 6, 9, 7]
contentArr = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
huTree = createHuffmanTree(valueArr, contentArr)
chars = 'hIebAhcHdfGc'
bytes = huTree.encode(chars)
print(chars.lower(),'encode =',bytes)

chars = huTree.decode(bytes)
print(bytes,'decode =',chars.lower())```

```hiebahchdfgc encode = 11100111111111111110111101110111110111011111110011111110110111110
11100111111111111110111101110111110111011111110011111110110111110 decode = hiebahchdfgc```

75 篇文章13 人订阅

0 条评论

## 相关文章

2046

### Java核心数据结构(List,Map,Set)使用技巧与优化

JDK提供了一组主要的数据结构实现，如List、Map、Set等常用数据结构。这些数据都继承自 java.util.Collection 接口，并位于 java...

3143

### Java开发者易犯错误Top10

Arrays.asList()将返回一个数组内部是私有静态类的ArrayList，这不是java.util.ArrayList类，java.util.Array...

1214

### 【Java提高二十】集合指定初始容量&asList缺陷&subList缺陷

【Java提高二十】集合指定初始容量 &asList缺陷&subList缺陷 集合指定初始容量 集合是我们在Java编程中使用非常广泛的，它就像大海，海纳百川，...

3817

761

### 堆结构的优秀实现类----PriorityQueue优先队列

2466

LinkedHashMap 继承自 HashMap，在 HashMap 基础上，通过维护一条双向链表，解决了 HashMap 不能随时保持遍历顺序和插入顺序一致...

1763

864

### 数据结构-线性表|顺序表|链表(下)

1 1 1 哈喽。各位小伙伴好久不见，热心的小编赶在开学季又来给大家送上满满的干货了。祝大家开心快乐！ 继上两次咱们聊了顺序表、单链表、静态链表等知识。那么热爱...

2907

2755