# Python Algorithms - C5 Traversal

Python算法设计篇(5) Chapter 5: Traversal

You are in a narrow hallway. This continues for several metres and ends in a doorway. Halfway along the passage you can see an archway where some steps lead downwards. Will you go forwards to the door (turn to 5), or creep down the steps (turn to 344)? ——Steve Jackson, Citadel of Chaos

Traversal就是遍历，主要是对图的遍历，也就是遍历图中的每个节点。对一个节点的遍历有两个阶段，首先是发现(discover)，然后是访问(visit)。遍历的重要性自然不必说，图中有几个算法和遍历没有关系？！

[算法导论对于发现和访问区别的非常明显，对图的算法讲解地特别好，在遍历节点的时候给节点标注它的发现节点时间d[v]和结束访问时间f[v]，然后由这些时间的一些规律得到了不少实用的定理，本节后面介绍了部分内容，感兴趣不妨阅读下算法导论原书]

Let’s look at the following related problem. Show that you can order the nodes in a connected graph, V1, V2, … Vn, so that for any i = 1…n, the subgraph over V1, … , Vi is connected. If we can show this and we can figure out how to do the ordering, we can go through all the nodes in a connected component and know when they’re all used up.

How do we do this? Thinking inductively, we need to get from i -1 to i. We know that the subgraph over the i -1 first nodes is connected. What next? Well, because there are paths between any pair of nodes, consider a node u in the first i -1 nodes and a node v in the remainder. On the path from u to v, consider the last node that is in the component we’ve built so far, as well as the first node outside it. Let’s call them x and y. Clearly there must be an edge between them, so adding y to the nodes of our growing component keeps it connected, and we’ve shown what we set out to show.

```# Walking Through a Connected Component of a Graph Represented Using Adjacency Sets
def walk(G, s, S=set()):                        # Walk the graph from node s
P, Q = dict(), set()                        # Predecessors + "to do" queue
P[s] = None                                 # s has no predecessor
Q.add(s)                                    # We plan on starting with s
while Q:                                    # Still nodes to visit
u = Q.pop()                             # Pick one, arbitrarily
for v in G[u].difference(P, S):         # New nodes?
Q.add(v)                            # We plan to visit them!
P[v] = u                            # Remember where we came from
return P                                    # The traversal tree```

```def some_graph():
a, b, c, d, e, f, g, h = range(8)
N = [
[b, c, d, e, f],    # a
[c, e],             # b
[d],                # c
[e],                # d
[f],                # e
[c, g, h],          # f
[f, h],             # g
[f, g]              # h
]
return N

G = some_graph()
for i in range(len(G)): G[i] = set(G[i])
print list(walk(G,0)) #[0, 1, 2, 3, 4, 5, 6, 7]```

```def components(G):                              # The connected components
comp = []
seen = set()                                # Nodes we've already seen
for u in G:                                 # Try every starting point
if u in seen: continue                  # Seen? Ignore it
C = walk(G, u)                          # Traverse component
seen.update(C)                          # Add keys of C to seen
comp.append(C)                          # Collect the components
return comp```

```G = {
0: set([1, 2]),
1: set([0, 2]),
2: set([0, 1]),
3: set([4, 5]),
4: set([3, 5]),
5: set([3, 4])
}

print [list(sorted(C)) for C in components(G)]  #[[0, 1, 2], [3, 4, 5]]```

[接下来作者作为扩展介绍了欧拉回路和哈密顿回路：前者是经过图中的所有边一次，然后回到起点；后者是经过图中的所有顶点一次，然后回到起点。网上资料甚多，感兴趣自行了解]

[注：具体的转换方式我还未明白，下面是作者给出的构造说明]

Here the “keep one hand on the wall” strategy will work nicely. One way of seeing why it works is to observe that the maze really has only one inner wall (or, to put it another way, if you put wallpaper inside it, you could use one continuous strip). Look at the outer square. As long as you’re not allowed to create cycles, any obstacles you draw have to be connected to the it in exactly one place, and this doesn’t create any problems for the left-hand rule. Following this traversal strategy, you’ll discover all nodes and walk every passage twice (once in either direction).

```def rec_dfs(G, s, S=None):
if S is None: S = set()                     # Initialize the history
for u in G[s]:                              # Explore neighbors
if u in S: continue                     # Already visited: Skip
rec_dfs(G, u, S)                        # New: Explore recursively
return S # For testing

G = some_graph()
for i in range(len(G)): G[i] = set(G[i])
print list(rec_dfs(G, 0))   #[0, 1, 2, 3, 4, 5, 6, 7]```

```def iter_dfs(G, s):
S, Q = set(), []                            # Visited-set and queue
Q.append(s)                                 # We plan on visiting s
while Q:                                    # Planned nodes left?
u = Q.pop()                             # Get one
if u in S: continue                     # Already visited? Skip it
S.add(u)                                # We've visited it now
Q.extend(G[u])                          # Schedule all neighbors
yield u                                 # Report u as visited

G = some_graph()
for i in range(len(G)): G[i] = set(G[i])
print list(iter_dfs(G, 0))  #[0, 5, 7, 6, 2, 3, 4, 1]```

```def traverse(G, s, qtype=set):
S, Q = set(), qtype()
while Q:
u = Q.pop()
if u in S: continue
for v in G[u]:
yield u```

```class stack(list):

G = some_graph()
print list(traverse(G, 0, stack)) #[0, 5, 7, 6, 2, 3, 4, 1]```

```#Depth-First Search with Timestamps
def dfs(G, s, d, f, S=None, t=0):
if S is None: S = set()                     # Initialize the history
d[s] = t; t += 1                            # Set discover time
for u in G[s]:                              # Explore neighbors
if u in S: continue                     # Already visited. Skip
t = dfs(G, u, d, f, S, t)               # Recurse; update timestamp
f[s] = t; t += 1                            # Set finish time
return t                                    # Return timestamp```

(1)白色表示该边是一条树边；

(2)灰色表示该边是一条反向边；

(3)黑色表示该边是一条正向边或者交叉边。

```#Topological Sorting Based on Depth-First Search
def dfs_topsort(G):
S, res = set(), []                          # History and result
def recurse(u):                             # Traversal subroutine
if u in S: return                       # Ignore visited nodes
for v in G[u]:
recurse(v)                          # Recurse through neighbors
res.append(u)                           # Finished with u: Append it
for u in G:
recurse(u)                              # Cover entire graph
res.reverse()                               # It's all backward so far
return res

G = {'a': set('bf'), 'b': set('cdf'), 'c': set('d'), 'd': set('ef'), 'e': set('f'), 'f': set()}
print dfs_topsort(G)```

[接下来作者介绍了一个Iterative Deepening Depth-First Search，没看懂，貌似和BFS类似]

One way of visualizing BFS and DFS is as browsing the Web. DFS is what you get if you keep following links and then use the Back button once you’re done with a page. The backtracking is a bit like an “undo.” BFS is more like opening every link in a new window (or tab) behind those you already have and then closing the windows as you finish with each page.

BFS的代码很好实现，主要是使用队列

```#Breadth-First Search
from collections import deque

def bfs(G, s):
P, Q = {s: None}, deque([s])                # Parents and FIFO queue
while Q:
u = Q.popleft()                         # Constant-time for deque
for v in G[u]:
if v in P: continue                 # Already has parent
P[v] = u                            # Reached from u: u is parent
Q.append(v)
return P

G = some_graph()
print bfs(G, 0)```

Python的list可以很好地充当stack，但是充当queue则性能很差，函数`bfs`中使用的是`collections`模块中的`deque`，即双端队列(`double-ended queue`)，它一般是使用链表来实现的，这个类有`extend``append``pop`等方法都是作用于队列右端的，而方法`extendleft``appendleft``popleft`等方法都是作用于队列左端的，它的内部实现是非常高效的。

Internally, the deque is implemented as a doubly linked list of blocks, each of which is an array of individual elements. Although asymptotically equivalent to using a linked list of individual elements, this reduces overhead and makes it more efficient in practice. For example, the expression d[k] would require traversing the first k elements of the deque d if it were a plain list. If each block contains b elements, you would only have to traverse k//b blocks.

1.对原图G运行DFS，得到每个节点的完成时间f[v]；

2.得到原图的转置图GT；

3.对GT运行DFS，主循环按照节点的f[v]降序进行访问；

4.输出深度优先森林中的每棵树，也就是一个强连通分支。

```def tr(G):                                      # Transpose (rev. edges of) G
GT = {}
for u in G: GT[u] = set()                   # Get all the nodes in there
for u in G:
for v in G[u]:
return GT

def scc(G):
GT = tr(G)                                  # Get the transposed graph
sccs, seen = [], set()
for u in dfs_topsort(G):                    # DFS starting points
if u in seen: continue                  # Ignore covered nodes
C = walk(GT, u, seen)                   # Don't go "backward" (seen)
seen.update(C)                          # We've now seen C
sccs.append(C)                          # Another SCC found
return sccs

from string import ascii_lowercase
def parse_graph(s):
# print zip(ascii_lowercase, s.split("/"))
# [('a', 'bc'), ('b', 'die'), ('c', 'd'), ('d', 'ah'), ('e', 'f'), ('f', 'g'), ('g', 'eh'), ('h', 'i'), ('i', 'h')]
G = {}
for u, line in zip(ascii_lowercase, s.split("/")):
G[u] = set(line)
return G

G = parse_graph('bc/die/d/ah/f/g/eh/i/h')
print list(map(list, scc(G)))
#[['a', 'c', 'b', 'd'], ['e', 'g', 'f'], ['i', 'h']]```

[最后作者提到了一点如何进行更加高效的搜索，也就是通过分支限界来实现对搜索树的剪枝，具体使用可以看下这个问题顶点覆盖问题Vertext Cover Problem]

In Kosaraju’s algorithm, we find starting nodes for the final traversal by descending finish times from an initial DFS, and we perform the traversal in the transposed graph (that is, with all edges reversed). Why couldn’t we just use ascending finish times in the original graph?

Try finding a simple example where this would give the wrong answer. (You can do it with a really small graph.)

0 条评论

• ### Head First Stanford NLP (2)

(深入浅出Stanford NLP 进阶篇) 本文接着介绍Stanford NLP工具的使用方法。

• ### Art of Android Development Reading Notes 7

《Android开发艺术探索》读书笔记 (7) 第7章 Android动画深入分析

• ### Problem: Matrix Chain Problem

矩阵链乘问题是最典型的动态规划问题，本文介绍如何用动规算法解决这个问题，要理解下面的内容请先阅读这篇动态规划的总结。

• ### 一文搞定 Redis 复制（全会的举个手看看）

Step 2：从节点只是保存了 slaveof 命令中主节点的信息，并没有立即发起复制。

• ### kubernetes从懵圈到熟练 – 集群伸缩原理

阿里云K8S集群的一个重要特性，是集群的节点可以动态的增加或减少。有了这个特性，集群才能在计算资源不足的情况下扩容新的节点，同时也可以在资源利用率降低的时候，释...

• ### 数据结构之二叉树解析

可是，排序有快速排序，归并排序，查找有二分法，甚至直接遍历查找，我干啥要使用二叉树呢？

• ### 算法面试能过几关：咱也不知道，咱也不敢问

微信公众号“程序员小灰”的作者，具有多年软件行业从业经验，先后在京东金融、摩拜科技从事研发工作，对算法有一定的兴趣和经验。

• ### 图表示学习起源: 从Word2vec到DeepWalk

本文发表在知乎专栏<435算法研究所>,介绍的是2014年的一篇文章《DeepWalk: Online Learning of Social Represent...

• ### 红黑树(二):删除操作

构造上，节点会有三种可能：其一是删除节点的孩子都为Nil，其二是删除节点的一个孩子节点为Nil，其三是删除节点的两个孩子节点都不为Nil。而如果两个节点都不为N...