专栏首页机器学习入门算法细节系列(20):Word Ladder系列

算法细节系列(20):Word Ladder系列

版权声明:本文为博主原创文章,未经博主允许不得转载。 https://blog.csdn.net/u014688145/article/details/72469345

算法细节系列(20):Word Ladder系列

详细代码可以fork下Github上leetcode项目,不定期更新。

题目摘自leetcode: 1. Leetcode 127: Word Ladder 2. Leetcode 126: Word Ladder II

Leetcode 127: Word Ladder

Problem:

Given two words (beginWord and endWord), and a dictionary’s word list, find the length of shortest transformation sequence from beginWord to endWord, such that:

  • Only one letter can be changed at a time.
  • Each transformed word must exist in the word list. Note that beginWord is not a transformed word.

Example:

Given: beginWord = “hit” endWord = “cog” wordList = [“hot”,”dot”,”dog”,”lot”,”log”,”cog”] As one shortest transformation is “hit” -> “hot” -> “dot” -> “dog” -> “cog”, return its length 5.

Note:

  • Return 0 if there is no such transformation sequence.
  • All words have the same length.
  • All words contain only lowercase alphabetic characters.
  • You may assume no duplicates in the word list.
  • You may assume beginWord and endWord are non-empty and are not the same.

这道题其实不难,但要想到这种解法却要费一番周折,如果对最短路径搜索熟悉的话,相信你一眼就能看出答案了,并且我们要论证一点,为什么最短路径算法对这道题来说是正确解法。

我的思路: DFS,把所有编辑距离为1的单词连接在一块,构建一个MAP(邻接矩阵)。这样之后,我们就可以从beginWord开始DFS搜索了,中间需要状态记录。代码如下:

public int ladderLength(String beginWord, String endWord, List<String> wordList) {
        Map<String, List<String>> map = new HashMap<>();
        map.put(beginWord, new ArrayList<>());
        for (String word : wordList){
            map.put(word, new ArrayList<>());
        }
        for (String key : map.keySet()){
            List<String> container = map.get(key);
            for (String word : wordList){
                if (oneDiff(key, word)){
                    container.add(word);
                }
            }
            map.put(key, container);
        }
        int ans = helper(map, beginWord, endWord, new HashSet<>());
        return ans >= 1 << 30 ? 0 : ans;
    }

    private int helper(Map<String, List<String>> map,String beginWord, String endWord, Set<String> visited){
        if (visited.contains(beginWord)) return 1<<30;
        visited.add(beginWord);
        for (String find : map.get(beginWord)){
            if (find.equals(endWord)){
                visited.remove(beginWord);
                return 2;
            }
        }
        int min = Integer.MAX_VALUE;
        for (String find : map.get(beginWord)){
            int x = 1 + helper(map, find, endWord, visited);
            min = Math.min(min, x);
        }
        visited.remove(beginWord);
        return min;
    }


    private boolean oneDiff(String a, String b){
        if (a.equals(b)) return false;
        char[] aa = a.toCharArray();
        char[] bb = b.toCharArray();

        int oneDiff = 0;
        for (int i = 0; i < aa.length; i++){
            if (aa[i] != bb[i]){
                oneDiff ++;
                if (oneDiff >= 2) return false;
            }
        }
        return true;
    }

代码没有多大问题,典型的DFS+状态回溯,遍历搜索每一条到达endWord的路径,找寻最短路径。但很可惜TLE了,直观上来看是因为为了拿到到endWord的最短路径,我们需要遍历每一条到endWord的路径,这是递归求解的一个特点。但实际情况,我们可以省去某些点的遍历。

就那这个问题来说,如从beginWord开始搜索,如

beginWord = "hit"
endWord = "cog"
wordList = ["hot","dot","dog","lot","log","cog"]
wordList中编辑距离为1的单词有:
a. hot
此时BFS搜索与"hot"最近距离的单词,有:
a. dot
b. lot
再BFS搜索"dot"时,有:
a. cog
所以我们只需要BFS三次就能得到正确答案,而DFS中,需要DFS至少三次。

上述过程就是经典的Dijkstra算法,代码如下:

public int ladderLength(String beginWord, String endWord, List<String> wordList) {
        List<String> reached = new ArrayList<>();
        reached.add(beginWord);
        Set<String> wordSet = new HashSet<>(wordList);
        if(!wordSet.contains(endWord)) return 0;
        wordSet.add(endWord);

        int distance = 1;
        while (!reached.contains(endWord)){ //到达该目的地
            List<String> toAdd = new ArrayList<>();
            for (String each : reached){
                for (int i = 0; i < each.length(); i++){
                    char[] chars = each.toCharArray();
                    for (char c = 'a';  c <= 'z'; c++){
                        chars[i] = c;
                        String wd = new String(chars);
                        if (wordSet.contains(wd)){
                            toAdd.add(wd);
                            wordSet.remove(wd); //记录访问状态
                        }
                    }
                }
            }
            distance ++;
            if (toAdd.size() == 0) return 0; //没有编辑距离为1的单词
            reached = toAdd;
        }
        return distance;
    }

Leetcode 126: Word Ladder II

Problem:

Given two words (beginWord and endWord), and a dictionary’s word list, find all shortest transformation sequence(s) from beginWord to endWord, such that:

  • Only one letter can be changed at a time
  • Each transformed word must exist in the word list. Note that beginWord is not a transformed word.

Example:

Given: beginWord = “hit” endWord = “cog” wordList = [“hot”,”dot”,”dog”,”lot”,”log”,”cog”] Return [ [“hit”,”hot”,”dot”,”dog”,”cog”], [“hit”,”hot”,”lot”,”log”,”cog”] ]

Note:

  • Return an empty list if there is no such transformation sequence.
  • All words have the same length.
  • All words contain only lowercase alphabetic characters.
  • You may assume no duplicates in the word list.
  • You may assume beginWord and endWord are non-empty and are not the same.

这道题的思路让我对DFS和BFS有了一些基本理解,但还不够深刻,咋说呢,我没想到BFS和DFS还可以分工合作,BFS用来快速求出最小distance,而DFS则用来遍历所有路径,两种遍历方法各有长处,综合起来就能解决该问题了,所以我写了一个版本,代码如下:

public List<List<String>> findLadders(String beginWord, String endWord, List<String> wordList) {
        Map<String, List<String>> map = new HashMap<>();
        map.put(beginWord, new ArrayList<>());
        for (String word : wordList){
            map.put(word, new ArrayList<>());
        }
        for (String key : map.keySet()){
            List<String> container = map.get(key);
            for (String word : wordList){
                if (oneDiff(key, word)){
                    container.add(word);
                }
            }
            map.put(key, container);
        }
        int distance = bfs(beginWord, endWord, wordList);
        List<List<String>> ans = new ArrayList<>();
        dfs(map, beginWord, endWord, ans, new ArrayList<>(), distance);
        return ans;
    }

    private void dfs(Map<String, List<String>> map,String beginWord, String endWord, List<List<String>> ans, List<String> path, int distance){
        path.add(beginWord);
        if (distance == 0){
            path.remove(path.size()-1); 
            return;
        }
        if (beginWord.equals(endWord)){
            ans.add(new ArrayList<>(path));
            path.remove(path.size()-1);
            return;
        }
        for (String find : map.get(beginWord)){
            dfs(map, find, endWord, ans, path, distance-1);
        }
        path.remove(path.size()-1);
    }



    private int bfs(String beginWord, String endWord, List<String> wordList) {
        List<String> reached = new ArrayList<>();
        reached.add(beginWord);
        Set<String> wordSet = new HashSet<>(wordList);
        if(!wordSet.contains(endWord)) return 0;
        wordSet.add(endWord);

        int distance = 1;
        while (!reached.contains(endWord)){ //达到该目的地
            List<String> toAdd = new ArrayList<>();
            for (String each : reached){
                for (int i = 0; i < each.length(); i++){
                    char[] chars = each.toCharArray();
                    for (char c = 'a';  c <= 'z'; c++){
                        chars[i] = c;
                        String wd = new String(chars);
                        if (wordSet.contains(wd)){
                            toAdd.add(wd);
                            wordSet.remove(wd);
                        }
                    }
                }
            }
            distance ++;
            if (toAdd.size() == 0) return 0;
            reached = toAdd;
        }
        return distance;
    }

    private boolean oneDiff(String a, String b){
        if (a.equals(b)) return false;
        char[] aa = a.toCharArray();
        char[] bb = b.toCharArray();

        int oneDiff = 0;
        for (int i = 0; i < aa.length; i++){
            if (aa[i] != bb[i]){
                oneDiff ++;
                if (oneDiff >= 2) return false;
            }
        }
        return true;
    }

思路相当清楚了,以为能够AC,结果发现TLE了,说明该题对时间的要求很高,从上述代码我们也能发现一些基本问题,如BFS遍历时可以构建MAP,而不用单独构建MAP,非常耗时。其次,最关键的问题在于DFS,此版本的DFS没有进行剪枝处理,剪枝能省去很多时间,所以我还需要对BFS进行改进。

思路: 首先,我们来看看上述代码构建图的一个模型,如下图所示:

很明显,如果我们对BFS没有做任何限制,我们拿到的邻接表一定是上述探头斯,而此时如果用DFS进行搜索时,如从“hot”开始,它会搜索:

一条可能的搜索路径:
hot ---> dot ---> dog ---> cog
但与此同时DFS还会搜索路径:
hot ---> dot ---> tot ---> hot
上述路径很明显不需要DFS,但因为边的相连,使得这种没必要的搜索也将继续。

所以一个优化点就在于,好马不吃回头草,存在环路的回头草绝对不是达到endWord的最短路径。很遗憾,邻接表无法表示这种非环的图,所以想法就是用一个Map<String,Integer>来记录到达每个单词的最短路径,一旦map中有该单词,就不再更新最短路径(避免环路搜索)

所以BFS代码如下:

private int bfs(String beginWord, String endWord, Set<String> wordDict, Map<String, Integer> distanceMap,
            Map<String, List<String>> map) {
        if (!wordDict.contains(endWord))
            return 0;
        map.put(beginWord, new ArrayList<>());
        for (String word : wordDict) {
            map.put(word, new ArrayList<>());
        }

        Queue<String> queue = new LinkedList<>();
        queue.offer(beginWord);
        distanceMap.put(beginWord, 1);

        while (!queue.isEmpty()) {
            int count = queue.size();
            boolean foundEnd = false;
            // 这种循环遍历很有意思,看作一个整体
            for (int i = 0; i < count; i++) {
                String cur = queue.poll();
                int curDistance = distanceMap.get(cur);
                List<String> neighbors = getNeighbors(cur, wordDict);
                if (neighbors.size() == 0)
                    return 0;
                for (String neighbor : neighbors) {
                    map.get(cur).add(neighbor);
                    //存在环的情况,不去更新最短路径
                    if (!distanceMap.containsKey(neighbor)) {
                        distanceMap.put(neighbor, curDistance + 1);
                        if (endWord.equals(neighbor)) {
                            foundEnd = true;
                        } else {
                            queue.offer(neighbor);
                        }
                    }
                }
            }
            //一旦抵到了endWord,我们就放弃建立后续的图
            if (foundEnd)
                break;
        }
        return distanceMap.get(endWord);
    }

上述代码在BFS时,与endWord无关的那些结点都丢弃掉了,且解决了有环路的情况。图结构如下所示:

这样在DFS构建路径时,它的速度就比原先要快得多。在BFS中还需要注意一个函数【getNeighbors()】,刚开始我写的这版程序也超时了,苦思许久都找不到原因,后来才发现是getNeighbors的玄机,它在建立邻接表时,一定要使用【HashSet】的搜索方法,而不要用原生的【List】的搜索方法。

所以完整代码如下:

    public List<List<String>> findLadders(String beginWord, String endWord, List<String> wordList) {
        Map<String, List<String>> map = new HashMap<>();
        Map<String, Integer> distanceMap = new HashMap<>();
        Set<String> wordDict = new HashSet<>(wordList);
        wordDict.add(beginWord);
        int distance = bfs(beginWord, endWord, wordDict, distanceMap, map);
        List<List<String>> ans = new ArrayList<>();
        if (distance == 0)
            return ans;
        dfs(map, beginWord, endWord, ans, new ArrayList<>(), distance, distanceMap);
        return ans;
    }

    private void dfs(Map<String, List<String>> map, String beginWord, String endWord, List<List<String>> ans,
            List<String> path, int distance, Map<String, Integer> distanceMap) {
        path.add(beginWord);
        if (distance == 0) {
            path.remove(path.size() - 1);
            return;
        }
        if (beginWord.equals(endWord)) {
            ans.add(new ArrayList<>(path));
            path.remove(path.size() - 1);
            return;
        }
        for (String find : map.get(beginWord)) {
            if (!distanceMap.containsKey(find))
                continue;
            if (distanceMap.get(beginWord) + 1 == distanceMap.get(find))
                dfs(map, find, endWord, ans, path, distance - 1, distanceMap);
        }
        path.remove(path.size() - 1);
    }

    private int bfs(String beginWord, String endWord, Set<String> wordDict, Map<String, Integer> distanceMap,
            Map<String, List<String>> map) {
        if (!wordDict.contains(endWord))
            return 0;
        map.put(beginWord, new ArrayList<>());
        for (String word : wordDict) {
            map.put(word, new ArrayList<>());
        }

        Queue<String> queue = new LinkedList<>();
        queue.offer(beginWord);
        distanceMap.put(beginWord, 1);

        while (!queue.isEmpty()) {
            int count = queue.size();
            boolean foundEnd = false;
            for (int i = 0; i < count; i++) {
                String cur = queue.poll();
                int curDistance = distanceMap.get(cur);
                List<String> neighbors = getNeighbors(cur, wordDict);
                if (neighbors.size() == 0)
                    return 0;
                for (String neighbor : neighbors) {
                    map.get(cur).add(neighbor);
                    if (!distanceMap.containsKey(neighbor)) {
                        distanceMap.put(neighbor, curDistance + 1);
                        if (endWord.equals(neighbor)) {
                            foundEnd = true;
                        } else {
                            queue.offer(neighbor);
                        }
                    }
                }
            }
            if (foundEnd)
                break;
        }
        return distanceMap.get(endWord);
    }

    private List<String> getNeighbors(String word, Set<String> wordList) {
        List<String> ans = new ArrayList<>();
        for (int i = 0; i < word.length(); i++) {
            char[] cc = word.toCharArray();
            for (char c = 'a'; c <= 'z'; c++) {
                cc[i] = c;
                String newWord = new String(cc);
                if (wordList.contains(newWord)) {
                    if (newWord.equals(word))
                        continue;
                    ans.add(newWord);
                }
            }
        }
        return ans;
    }

DFS是一个典型的回溯+剪枝的递归方法,凡是函数返回的地方,我们都需要进行状态还原,注意再注意。

本文参与腾讯云自媒体分享计划,欢迎正在阅读的你也加入,一起分享。

我来说两句

0 条评论
登录 后参与评论

相关文章

  • 算法细节系列(23):回溯

    版权声明:本文为博主原创文章,未经博主允许不得转载。 https://blog.csdn.n...

    用户1147447
  • Special Binary String

    Example 1: Input: S = “11011000” Output: “11100100” Explanation: The str...

    用户1147447
  • 1035. Spell checker

    用户1147447
  • 微信公众号H5支付遇到的那些坑

    简史 官方文档说的很清楚,商户已有H5商城网站,用户通过消息或扫描二维码在微信内打开网页时,可以调用微信支付完成下单购买的流程。 当然,最近微信支付平台也加入了...

    小柒2012
  • 微信公众号H5支付遇到的那些坑

    官方文档说的很清楚,商户已有H5商城网站,用户通过消息或扫描二维码在微信内打开网页时,可以调用微信支付完成下单购买的流程。

    小柒2012
  • 老板看了我的代码,直呼“666”,说涨工资!

    如何更规范化编写Java 代码的重要性想必毋需多言,其中最重要的几点当属提高代码性能、使代码远离Bug、令代码更优雅。

    程序员小强
  • 这样规范写代码,同事直呼“666”

    Java团长
  • 当我遵循了这 16 条规范写代码,同事只对我说了三个字: 666

    Many of the happiest people are those who own the least. But are we really so ha...

    良月柒
  • 如何更规范化编写 Java 代码

    如何更规范化编写 Java 代码的重要性想必毋需多言,其中最重要的几点当属提高代码性能、使代码远离 Bug、令代码更优雅。

    淡定的蜗牛
  • Java-String.intern的深入研究

    When---什么时候需要了解String的intern方法: 面试的时候(蜜汁尴尬)!虽然不想承认,不过面试的时候经常碰到这种高逼格的问题来考察我们是否真正理...

    SecondWorld

扫码关注云+社区

领取腾讯云代金券