算法细节系列（17）：有向环检测&&拓扑排序

用户1147447

发布于 2019-05-26 09:44:47

6730

发布于 2019-05-26 09:44:47

文章被收录于专栏：机器学习入门

算法细节系列（17）：有向环检测&&拓扑排序

详细代码可以fork下Github上leetcode项目，不定期更新。

题目均摘自leetcode：

Leetcode 207: Course Schedule

Problem:

There are a total of n courses you have to take, labeled from 0 to n - 1. Some courses may have prerequisites, for example to take course 0 you have to first take course 1, which is expressed as a pair: 0,1 Given the total number of courses and a list of prerequisite pairs, is it possible for you to finish all courses?

Example :

2, [1,0] There are a total of 2 courses to take. To take course 1 you should have finished course 0. So it is possible. 2, [1,0,0,1] There are a total of 2 courses to take. To take course 1 you should have finished course 0, and to take course 0 you should also have finished course 1. So it is impossible.

Note:

The input prerequisites is a graph represented by a list of edges, not adjacency matrices. Read more about how a graph is represented.
You may assume that there are no duplicate edges in the input prerequisites.

有向环的检测问题是拓扑排序的基础问题，可以采用两种思路DFS&&BFS，DFS的想法很简单，先简单说说。

思路：（DFS）

有向环的特点很明显，一个顶点连接下一个顶点，如果存在从某个顶点出发，在中间的某个顶点的有向边又一次指向了该顶点，那么它必然是有向环。所以采用DFS的做法，无非就是不断的深度遍历下去，如果遍历到重复的顶点，存在有向边。如果遍历完所有顶点均未发现，那就不存在。

所以该问题，首先得先把相互依赖的课程关系转换成邻接表。代码如下：

public boolean canFinish(int numCourses, int[][] prerequisites) {
        List<Integer> graph[] = new ArrayList[numCourses];
        for (int i = 0; i < numCourses; i++){
            graph[i] = new ArrayList<>();
        }
        boolean[] visited = new boolean[numCourses];
        for (int[] pre : prerequisites){
            graph[pre[0]].add(pre[1]);
        }

        for (int i = 0; i < numCourses; i++){
            if (dfs(graph, visited, i)){
                return false;
            }
        }
        return true;
    }

    private boolean dfs(List<Integer> graph[], boolean[] visited,int start){
        if (visited[start]) return true;
        visited[start] = true;
        for (int i = 0; i < graph[start].size(); i++){
            if(dfs(graph, visited, graph[start].get(i))){
                return true;
            }
        }
        visited[start] = false;
        return false;
    }

这是一个简单版本的有向环检测，但运行速度不够快，原因在于每次访问一个新的顶点时，都会深度遍历所有结点，其实就是没有记录先前的遍历的状态。看图，来说明上述算法：

如果仔细观察visited这个路径记录的话，你会发现它在dfs函数结尾处做了一个状态还原，想过这是为什么？

这是典型的无状态记录递归方法，而因为在一条DFS调用链上，我们得利用重复访问结点这个性质来检测有向环，所以把它带入到了DFS的参数列表中，比如我们DFS(V)时，紧接着DFS(W)，在DFS(X)，此时若没有有向环，那么函数必然返回false。而递归函数在逐层返回时，必须同时把先前visit置为true的点，全部restore成false，否则当从顶点h进行DFS时，会让检测出错。

该问题有没有可优化的地方？明显存在，就拿上图来说，如果刚开始DFS(V)，递归结束时，我们知道DFS(W)的答案一定是没有环，所以当开始DFS(H)时，到DFS(W)就可以直接告诉H，从顶点W出发的顶点往下一定无环。这样就省去了一次DFS(W)的遍历时间，纵观整个图结构，性能是相当客观的。

优化代码如下：

public boolean canFinish(int numCourses, int[][] prerequisites) {
        List<Integer> graph[] = new ArrayList[numCourses];
        for (int i = 0; i < numCourses; i++){
            graph[i] = new ArrayList<>();
        }
        boolean[] visited = new boolean[numCourses];
        boolean[] onStack = new boolean[numCourses];
        for (int[] pre : prerequisites){
            graph[pre[0]].add(pre[1]);
        }

        isCycle = false;
        for (int i = 0; i < numCourses; i++){
            if (!visited[i]){
                dfs(graph, i, visited, onStack);
            }
        }
        return !hasCycle();
    }

    private void dfs(List<Integer> g[], int v,boolean[] visited, boolean[] onStack){
        visited[v] = true;
        onStack[v] = true;
        for (int w : g[v]){
            if (this.hasCycle()) return;
            if (!visited[w]) dfs(g, w, visited, onStack);
            else if (onStack[w]){
                isCycle = true;
            }
        }
        onStack[v] = false;
    }

    boolean isCycle = false;
    private boolean hasCycle(){
        return isCycle;
    }

嘿，这两个boolean数组分工就明确了很多，visited真的是一个全局变量，只要是DFS链上访问过的顶点，它都会被记录下来，那么先前的W就不会被访问两次。那如何确保不存在有向环呢，用onStack去检测，它在递归返回时，会还原现场，当然得还原，否则就出现了第一个版本提到的问题，检测出错。一旦检测出有向环，整个函数返回。

思路：（BFS）

首先说明一点，并不是因为BFS而去BFS，而是检测有向环的性质符合了BFS，才使用了BFS的手段。你别跟我说，BFS跟DFS一样，不断遍历所有相连的顶点，如果某个顶点被遍历了两次，就存在有向环。这种思路是对的，但为什么不直接DFS，而转去BFS？说明这不是使用BFS真正的目的，你也没理解BFS和DFS之间的区别到底是什么。

继续看图，来说明思路的来源。如下图所示：

刚才DFS的想法是把那个有向环给单独拎出来，DFS算法能够找到v->w的有向环。现在逆向思考一下，我们把顶点w去掉，那么剩下的图一定不是个有向环。所以我们能否发现某些性质，可以检测出有向环不存在的情况？

突破口：循环依赖的性质

在循环依赖中，每个顶点的入度不可能为0。

所以，我们可以删除一些不影响检测的无效信息，如上图，我们是否可以把指向顶点v但入度为0的顶点直接删除？直观上来看，的确可以直接删了，而且很重要的一点，删除后，不影响有向环的检测。

所以总结一句话，就是该算法的核心思想：

在有向图中，不存在有向环的情况下，我们可以从入度为0的顶点开始，依次删除所有顶点，且最终一定能被删完。反之，出现循环依赖的情况下，你找不到入度为0的起始点，你自然无从下手，顶点一定会被保留到最后。

然后，我们再来看看DFS的思想，它的想法是直接从顶点中找出有向环，而此想法则是，直接从顶点中删除非有向环，剩下的一定是有向环。一个操作有向环，一个操作非有向环，刚好是个逆向思维。

而我们知道，找到了入度为0的顶点，删除该顶点，牵连它的顶点入度都要减一，所以这自然而然就用BFS解决来得方便，完事。代码如下：

public boolean canFinish(int numCourses, int[][] prerequisites) {
        int[] indegree = new int[numCourses];
        int[][] matrix = new int[numCourses][numCourses];

        for (int[] pre : prerequisites){
            int prepr = pre[1];
            int ready = pre[0];
            indegree[ready]++;
            matrix[prepr][ready] = 1;
        }

        Queue<Integer> queue = new LinkedList<>();
        for (int i = 0; i < numCourses; i++){
            if (indegree[i] == 0){
                queue.offer(i);
            }
        }

        int count = 0;
        while (!queue.isEmpty()){
            int v = queue.poll();
            count ++;
            for (int i = 0; i < numCourses; i++){
                if (matrix[v][i] == 1){
                    matrix[v][i] = 0;
                    indegree[i]--;
                    if (indegree[i] == 0) queue.offer(i);
                }
            }
        }
        return count == numCourses;
    }

相比较于无状态的DFS，它的速度要快一点，因为你想想，如果存在有向环，那么有向环还带这一些杂七杂八的顶点在这样的一个大集合中。DFS不管三七二十一，那些杂七杂八的顶点也得访问一遍，即时那里不存在有向环，而BFS，删除了那些外围的点之后，有向环牵连的那些顶点都不会被访问，所以自然少了些消耗。除非DFS的运气很好，一眼看准了有向环，直接检测出来，但多少受限于数据输入分布。

Leetcode 210: Course Schedule II

Problem:

There are a total of n courses you have to take, labeled from 0 to n - 1. Some courses may have prerequisites, for example to take course 0 you have to first take course 1, which is expressed as a pair: 0,1 Given the total number of courses and a list of prerequisite pairs, return the ordering of courses you should take to finish all courses. There may be multiple correct orders, you just need to return one of them. If it is impossible to finish all courses, return an empty array.

Example :

2, [1,0] There are a total of 2 courses to take. To take course 1 you should have finished course 0. So it is possible. 2, [1,0,0,1] There are a total of 2 courses to take. To take course 1 you should have finished course 0, and to take course 0 you should also have finished course 1. So it is impossible.

Note:

The input prerequisites is a graph represented by a list of edges, not adjacency matrices. Read more about how a graph is represented.
You may assume that there are no duplicate edges in the input prerequisites.

理解了有向环的检测，拓扑排序就很容易实现了。因为不管是DFS还是BFS方法，它们的访问顺序本身可以认为是有序的，DFS的遍历返回的逆序为拓扑排序，而BFS的遍历顺序即为拓扑排序，所以我们只要在它们的遍历路径上添加适当的记录路径的代码即可。

DFS拓扑排序：

    public int[] findOrder(int numCourses, int[][] prerequisites) {
        List<Integer> graph[] = new ArrayList[numCourses];
        for (int i = 0; i < numCourses; i++) {
            graph[i] = new ArrayList<>();
        }
        boolean[] visited = new boolean[numCourses];
        boolean[] onStack = new boolean[numCourses];
        for (int[] pre : prerequisites) {
            graph[pre[1]].add(pre[0]);
        }

        isCycle = false;
        ans = new LinkedList<>();
        for (int i = 0; i < numCourses; i++) {
            if (!visited[i]) {
                dfs(graph, i, visited, onStack);
            }
        }

        if (hasCycle()) return new int[]{};
        int[] res = new int[ans.size()];
        for (int i = 0; i < ans.size(); i++){
            res[i] = ans.get(i);
        }
        return res;
    }

    List<Integer> ans = new LinkedList<>();
    private void dfs(List<Integer> g[], int v, boolean[] visited, boolean[] onStack) {
        visited[v] = true;
        onStack[v] = true;
        for (int w : g[v]) {
            if (this.hasCycle())
                return;
            if (!visited[w])
                dfs(g, w, visited, onStack);
            else if (onStack[w]) {
                isCycle = true;
            }
        }
        onStack[v] = false;
        ans.add(0,v);
    }

    boolean isCycle = false;
    private boolean hasCycle() {
        return isCycle;
    }

BFS拓扑排序：

public int[] findOrder(int numCourses, int[][] prerequisites) {
        int[] ans = new int[numCourses];
        int[] indegree = new int[numCourses];
        int[][] matrix = new int[numCourses][numCourses];

        for (int[] pre : prerequisites){
            int prepr = pre[1];
            int ready = pre[0];
            indegree[ready]++;
            matrix[prepr][ready] = 1;
        }

        Queue<Integer> queue = new LinkedList<>();
        for (int i = 0; i < numCourses; i++){
            if (indegree[i] == 0){
                queue.offer(i);
            }
        }

        int count = 0;
        while (!queue.isEmpty()){
            int v = queue.poll();
            ans[count ++] = v;
            for (int i = 0; i < numCourses; i++){
                if (matrix[v][i] == 1){
                    indegree[i]--;
                    if (indegree[i] == 0) queue.offer(i);
                }
            }
        }
        return count == numCourses ? ans : new int[0];
    }

本文参与腾讯云自媒体同步曝光计划，分享自作者个人站点/博客。

原始发表：2017年05月12日，如有侵权请联系 cloudcommunity@tencent.com 删除

编程算法