字符串高阶经典题

原创

王脸小

修改于 2019-11-07 17:03:53

7780

修改于 2019-11-07 17:03:53

文章被收录于专栏：王漂亮王漂亮

答的好不如问得好

是不是都是小写？

有没有为空的单词，单词是不是一样长？单词有没有重复的？同一个单词可以多次使用吗？

有可能为空吗？为空返回什么？

所有解的题返回的顺序有要求吗？

注意

dfs，回溯注意设置visited，不然死循环。

单词系列

127. 单词接龙

Word Ladder

Given two words (beginWord and endWord), and a dictionary's word list, find the length of shortest transformation sequence from beginWord to endWord, such that:

Only one letter can be changed at a time.
Each transformed word must exist in the word list. Note that beginWord is not a transformed word.

Example 1:

Input:
beginWord = "hit",
endWord = "cog",
wordList = ["hot","dot","dog","lot","log","cog"]

Output: 5

Explanation: As one shortest transformation is "hit" -> "hot" -> "dot" -> "dog" -> "cog",
return its length 5.

Example 2:

Input:
beginWord = "hit"
endWord = "cog"
wordList = ["hot","dot","dog","lot","log"]

Output: 0

Explanation: The endWord "cog" is not in wordList, therefore no possible transformation.

Solution

思路：hit作为key "h*t", "*it", "hi*"的value，key可达value

注意：用队列popleft，先进先出才是minstep；设置visited不然会无限循环

class Solution:
    def ladderLength(self, beginWord: str, endWord: str, wordList: List[str]) -> int:
        di = defaultdict(list)
        
        for word in wordList:
            for i in range(len(word)):
                k = word[:i] + "_" + word[i+1:]
                di[k] += [word]
                
        visited = set()
        queue = deque([(beginWord, 1)])
        
        while queue:
            word, step = queue.popleft()
            
            if word not in visited: visited.add(word)
            if word == endWord: return step
            
            for i in range(len(word)):
                k = word[:i] + "_" + word[i+1:]
                for w in di[k]:
                    if w not in visited:
                        queue.append((w, step+1))
        
        return 0

10.27 BFS很明显，但不知道这个单词怎么构造图，所以看了思路抄了构造的代码。

126. 单词接龙 II

Word Ladder II

Given two words (beginWordandendWord), and a dictionary's word list, find all shortest transformation sequence(s) frombeginWordtoendWord.

Example 1:

Input:
beginWord = "hit",
endWord = "cog",
wordList = ["hot","dot","dog","lot","log","cog"]

Output:
[
  ["hit","hot","dot","dog","cog"],
  ["hit","hot","lot","log","cog"]
]

Solution

自己写的，但是要先BFS找到minstep，再回溯，超过minstep就跳出， find minstep用上面的Function

from collections import defaultdict, deque
class Solution(object):
    def findLadders(self, beginWord, endWord, wordList):
        di = defaultdict(list)
        
        for word in wordList:
            for i in range(len(word)):
                k = word[:i] + "_" + word[i+1:]
                di[k] += [word]
        
        # 找到最小步数
        max_step = self.findMinStep(beginWord, endWord, wordList)
        if not max_step: return []

        self.visited = set()
        self.res = []
        
        # 回溯找出所有解
        def backtrack(track, cur, step, max_step):
            if cur == endWord:
                self.res.append(track.copy())

            if step >= max_step: return

            for i in range(len(cur)):
                k = cur[:i] + "_" + cur[i+1:]
                for w in di[k]:
                    if w not in self.visited:
                        self.visited.add(w)
                        backtrack(track + [w], w, step+1, max_step)
                        self.visited.pop()
            
        backtrack([beginWord], beginWord, 1, max_step)
        return self.res

139. 单词拆分

Word Break

Given a non-empty string s and a dictionary wordDict containing a list of non-empty words, determine if s can be segmented into a space-separated sequence of one or more dictionary words.

Note:

The same word in the dictionary may be reused multiple times in the segmentation.

Example 2:

Input: s = "applepenapple", wordDict = ["apple", "pen"]
Output: true
Explanation: Return true because "applepenapple" can be segmented as "apple pen apple".
             Note that you are allowed to reuse a dictionary word.

Example 3:

Input: s = "catsandog", wordDict = ["cats", "dog", "sand", "and", "cat"]
Output: false

Solution

s[l:r]in wordDict and dp[l]

注意dp要在两项都满足的情况下赋值，之前写错了一直debug

dp[r] = dp[l] and s[l:r] in wordDict，这是不对的！dp[r]会多次被赋值，如果已经为True了会被覆盖掉

class Solution:
    def wordBreak(self, s: str, wordDict) -> bool:
        if not s: return False
        dp = [True] + [False]*len(s)
        
        for l in range(len(s)):
            for r in range(l+1, len(s)+1):
                if s[l:r] in wordDict and dp[l]:
                    dp[r] = True
                    
        return dp[-1]

Given a non-empty string s and a dictionary wordDict containing a list of non-empty words, add spaces in s to construct a sentence where each word is a valid dictionary word. Return all such possible sentences.

Example 1:

Input:
s = "catsanddog"
wordDict = ["cat", "cats", "and", "sand", "dog"]
Output:
[
  "cats and dog",
  "cat sand dog"
]

Solution

11.6 暴力回溯超时

优化：带记忆的回溯

class Solution:
    def wordBreak(self, s: str, wordDict):
        
        if not s: return True
        res = []
        dp = [False] * len(s)
        
        def backtrack(track, start):
            if start == len(s):
                res.append(" ".join(track))
            
            for end in range(start+1, len(s)+1):
                if s[start:end] in wordDict:
                    backtrack(track+[s[start:end]], end)
                    
        backtrack([], 0)
        
        return res

472. 连接词- H

Concatenated Words

Given a list of words (without duplicates), please write a program that returns all concatenated words in the given list of words.

A concatenated word is defined as a string that is comprised entirely of at least two shorter words in the given array.

Example:

Input: ["cat","cats","catsdogcats","dog","dogcatsdog","hippopotamuses","rat","ratcatdogcat"]

Output: ["catsdogcats","dogcatsdog","ratcatdogcat"]

Explanation: "catsdogcats" can be concatenated by "cats", "dog" and "cats"; 
 "dogcatsdog" can be concatenated by "dog", "cats" and "dog"; 
"ratcatdogcat" can be concatenated by "rat", "cat", "dog" and "cat".

Solution

1. 根据长度排序words：words.sort(key = lambda x:len(x))

2. 从短到长check是否在trie(初始为空)里面：如果不在就insert_trie；如果在添加word到res

3. 判断在的逻辑就是dfs回溯

from collections import defaultdict
class Solution(object):
    def findAllConcatenatedWordsInADict(self, words):
        words.sort(key = lambda x:len(x))
        trie = defaultdict(list)
        res = []
        
        for word in words:
            if not len(word): continue   #有为空的词
            
            if self.checkWord(word, trie):
                res.append(word)
            else:
                self.insert(word, trie)
        return res
        
            
    def checkWord(self, word, trie):
        if not len(word): return True
        cur = trie
        for i in range(len(word)):
            if word[i] not in cur:
                return False
            cur = cur[word[i]]
            if "end" in cur: 
                if self.checkWord(word[i+1:], trie):
                    return True
        return False

    def insert(self, word, trie):
        cur = trie
        for ch in word:
            if ch not in cur:
                cur[ch] = defaultdict(list)
            cur = cur[ch]
        cur["end"] == 1

11.7 想到了trie但是不知道怎么search，看了思路，自己写的

79. 单词搜索

Word Search

Given a 2D board and a word, find if the word exists in the grid.

The word can be constructed from letters of sequentially adjacent cell, where "adjacent" cells are those horizontally or vertically neighboring. The same letter cell may not be used more than once.

Example:

board =
[
  ['A','B','C','E'],
  ['S','F','C','S'],
  ['A','D','E','E']
]

Given word = "ABCCED", return true.
Given word = "SEE", return true.
Given word = "ABCB", return false.

Solution

注意：used

class Solution:
    def exist(self, board, word: str) -> bool:
        if not word: return False
        row = len(board)
        col = len(board[0]) if row else 0
        
        if not row or not col : return False
        
        # 回溯 + dfs
        def dfs(idx, i, j):
            if idx == len(word): return True
            
            for x,y in [(1,0),(-1,0),(0,1),(0,-1)]:
                if 0<=i+x<row and 0<=j+y<col and not used[i+x][j+y] and board[i+x][j+y] == word[idx]:
                    used[i+x][j+y] = True
                    if dfs(idx+1, i+x, j+y): return True
                    used[i+x][j+y] = False
            
        for i in range(row):
            for j in range(col):
                if board[i][j] == word[0]:
                    used = [ [ False for _ in range(col)] for _ in range(row) ]
                    used[i][j] = True
                    if dfs(1, i,j): return True
                    
        return False

10.28 DFS + BackTrack和DFS，自己想出来的，但是好几次才过，不少坑。

212. 单词搜索 II

Word Search II

Given a 2D board and a list of words from the dictionary, find all words in the board.

Each word must be constructed from letters of sequentially adjacent cell, where "adjacent" cells are those horizontally or vertically neighboring. The same letter cell may not be used more than once in a word.

Example:

Input: 
board = [
  ['o','a','a','n'],
  ['e','t','a','e'],
  ['i','h','k','r'],
  ['i','f','l','v']
]
words = ["oath","pea","eat","rain"]

Output: ["eat","oath"]

Solution

Trie+dfs

from collections import defaultdict
class TrieNode():
    def __init__(self):
        self.nodes = defaultdict(TrieNode)
        self.isword = False

class Trie():
    def __init__(self):
        self.root = TrieNode()
        
    def insert(self, word):
        curr = self.root
        for ch in word:
            curr = curr.nodes[ch]
        curr.isword = True
        
class Solution:
    def findWords(self, board, words):
        if not words: return []
        
        row = len(board)
        col = len(board[0]) if row else 0
        res = set()
        
        def dfs(i, j, track, di):
            if di.isword:
                res.add(track)
            
            for x,y in [(1,0),(-1,0),(0,1),(0,-1)]:
                if 0<=i+x<row and 0<=j+y<col and not used[i+x][j+y] and board[i+x][j+y] in di.nodes:
                    used[i+x][j+y] = True
                    dfs(i+x, j+y, track+board[i+x][j+y], di.nodes[board[i+x][j+y]])
                    used[i+x][j+y] = False

        trie = Trie()
        for word in words:
            trie.insert(word)
        
        root = trie.root
        
        for i in range(row):
            for j in range(col):
                if board[i][j] in root.nodes:
                    used = [ [ False for _ in range(col)] for _ in range(row) ]
                    used[i][j] = True
                    dfs(i, j, str(board[i][j]), root.nodes[board[i][j]])

        return res

10.26 看了思路(Trie+dfs)自己写的，有case没过，还得再琢磨琢磨

11.6 没过是因为没有回溯后used=False.....

正则匹配问题

10. 正则表达式匹配 - H

Regular Expression Matching

Given an input string (s) and a pattern (p), implement regular expression matching with support for '.' and '*'.

'.' Matches any single character.
'*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

Example 2:

Input:
s = "aa"
p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".

Example 3:

Input:
s = "ab"
p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".

Example 5:

Input:
s = "mississippi"
p = "mis*is*p*."
Output: false

Solution

如果发现"*"，只考虑匹配0次及一次的情况，多次通过递归解决

如果匹配0次，跳过该字符和"*"

如果匹配1次，s[0]==p[0]，移动text

Reference

Solution - 递归

class Solution:
    def isMatch(self, s: str, p: str) -> bool:
        
        if not p: return not s
        
        first_match = s and p[0] in [s[0], "."]
        
        if len(p)>=2 and p[1] == "*":
            return (first_match and self.isMatch(s[1:], p)) or self.isMatch(s, p[2:])
        else:
            return first_match and self.isMatch(s[1:], p[1:])

Solution - DP

class Solution:
    def isMatch(self, s: str, p: str) -> bool:
        if not p: return not s
    
        dp = [[False for _ in range(len(p)+1)] for _ in range(len(s)+1)]
        dp[0][0] = True
        
        for i in range(0, len(s)+1):
            for j in range(1, len(p)+1):
                if p[j-1] == "*":
                    dp[i][j] = (i>0 and p[j-2] in [s[i-1], "."] and dp[i-1][j]) or dp[i][j-2]
                else:
                    dp[i][j] = i>0 and p[j-1] in [s[i-1], "."] and dp[i-1][j-1]
                
        return dp[-1][-1]

44. 通配符匹配 - H

Wildcard Matching

Given an input string (s) and a pattern (p), implement wildcard pattern matching with support for '?' and '*'.

'?' Matches any single character.
'*' Matches any sequence of characters (including the empty sequence).

The matching should cover the entire input string (not partial).

Example 3:

Input:
s = "cb"
p = "?a"
Output: false
Explanation: '?' matches 'c', but the second letter is 'a', which does not match 'b'.

Example 4:

Input:
s = "adceb"
p = "*a*b"
Output: true
Explanation: The first '*' matches the empty sequence, while the second '*' matches the substring "dce".

Solution

class Solution:
    def isMatch(self, s: str, p: str) -> bool:
        if not p: return not s
        
        dp = [[False for _ in range(len(p)+1)] for _ in range(len(s)+1)]
        
        dp[0][0] = True

        for i in range(1, len(p)+1):
            dp[0][i] = p[i-1] == "*" and dp[0][i-1]

        for i in range(1,len(s)+1):
            for j in range(1,len(p)+1):
                dp[i][j] = (dp[i-1][j-1] and p[j-1] in (s[i-1], "?")) 
                        or ((dp[i-1][j] or dp[i][j-1]) and p[j-1] == "*")
            
        return dp[-1][-1]

怕是做了n个小时吧：DP方程是自己写的但是漏了一条；DP的len+1比较好做

dp[i][j]表示s到i位置,p到j位置是否匹配!

初始化:

dp[0][0]:什么都没有,所以为true
第一行dp[0][j],换句话说,s为空,与p匹配,所以只要p开始为*才为true
第一列dp[i][0],当然全部为False

动态方程:
如果(s[i] == p[j] || p[j] == "?") && dp[i-1][j-1] ,有dp[i][j] = true
如果p[j] == "*" && (dp[i-1][j] = true || dp[i][j-1] = true) 有dp[i][j] = true

原创声明：本文系作者授权腾讯云开发者社区发表，未经许可，不得转载。

如有侵权，请联系 cloudcommunity@tencent.com 删除。

编程算法

原创声明：本文系作者授权腾讯云开发者社区发表，未经许可，不得转载。

如有侵权，请联系 cloudcommunity@tencent.com 删除。

编程算法

登录后参与评论

0 条评论

热度