【题目】
给定一个段落 (paragraph) 和一个禁用单词列表 (banned)。返回出现次数最多,同时不在禁用列表中的单词。题目保证至少有一个词不在禁用列表中,而且答案唯一。
禁用列表中的单词用小写字母表示,不含标点符号。段落中的单词不区分大小写。答案都是小写字母。
示例:
输入:
paragraph = "Bob hit a ball, the hit BALL flew far after it was hit."
banned = ["hit"]
输出: "ball"
解释:
"hit" 出现了次,但它是一个禁用的单词。
"ball" 出现了次,是段落里出现次数最多的,且不在禁用列表中的单词。
注意,所有这些单词在段落里不区分大小写,标点符号需要忽略(即使是紧挨着单词也忽略, 比如 "ball,"),
"hit"不是最终的答案,虽然它出现次数更多,但它在禁用单词列表中。
说明:
【思路】
首先得到所有的单词,再对单词进行计数,接着循环遍历计数字典,得到不在banned数组中且计数最大的单词。
【代码】
python版本
class Solution(object):
def mostCommonWord(self, paragraph, banned):
"""
:type paragraph: str
:type banned: List[str]
:rtype: str
"""
# 得到每个单词计数
words = []
start =
for i in range(len(paragraph)):
num = ord(paragraph[i])
if <= num <= + or <= num <= +:
continue
else:
if i - start > :
words.append(paragraph[start: i].lower())
start = i+
if len(paragraph) - start > :
words.append(paragraph[start:].lower())
count = {}
for w in words:
count[w] = count.get(w, ) +
# 按照计数大小排序
count_ls = list(count.items())
count_ls = sorted(count_ls, key=lambda x: x[], reverse=True)
# 得到非禁词的最大计数词(按照题意,保证有)
banned_st = set(banned)
for ci in count_ls:
if ci[] not in banned:
return ci[]
C++版本
class Solution {
public:
string mostCommonWord(string paragraph, vector<string>& banned) {
// 转换为单词
vector<string> words;
string tmp;
for(int i=; i<paragraph.size(); i++){
if(paragraph[i] >= && paragraph[i] <= +)
tmp += (paragraph[i] + );
else{
if(paragraph[i] >= && paragraph[i] <= +)
tmp += paragraph[i];
else{
if(tmp != "")
words.push_back(tmp);
tmp = "";
}
}
}
if(tmp != "")
words.push_back(tmp);
// 单词计数
map<string, int> d;
for(auto w: words){
d[w]++;
}
// 取得最大值
set<string> banned_set(banned.begin(), banned.end());
string res = "";
for(map<string, int>::iterator it = d.begin(); it != d.end(); it++){
if((res == "" || it->second > d[res]) && banned_set.find(it->first) == banned_set.end())
res = it->first;
}
return res;
}
};