(No, "Python regex, how to delete all matches from a string" does not solve my problem.)
Suppose I have this list:
names = ['your name', 'the name', 'his name', 'her name', 'their name', 'employer name', "employer's name", "father's name",
"mother's name", "maiden name", "son's name", "daughter's name", "brother's name", "sister's name"]
And suppose I have a piece of text like this:
text = "What is your name? Well, uh it's John Smith. Thanks for asking. Anyway, I'd doing well."
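How can I use a regex to find every element of the names list in the text and replace the block of text (of length 50) that immediately follows the element with "[name]"? My output would then be: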
text = "What is your name [name] Anyway, I'd doing well."
So far I have the code below, but it only replaces the element itself with "[name]", not the text that follows the element.
import re

def my_replace3(match):
    match = match.group()
    return " [name] "

def no_name(text):
    names = ['your name', 'the name', 'his name', 'her name', 'their name', 'employer name', "employer's name", "father's name",
             "mother's name", "maiden name", "son's name", "daughter's name", "brother's name", "sister's name"]
    regex = re.compile(r'\b(' + '|'.join(names) + r')\b', re.IGNORECASE)
    text = re.sub(regex, my_replace3, text)
    return text
I am not much of a regex expert, so your help would be greatly appreciated.
Posted on 2019-10-04 05:42:17
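If you want to replace the 50 characters after the match, add .{50} to the regex.
Then use a backreference in the replacement string to copy the matched name phrase into the replacement.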
import re

def no_name(text):
    names = ['your name', 'the name', 'his name', 'her name', 'their name', 'employer name', "employer's name", "father's name",
             "mother's name", "maiden name", "son's name", "daughter's name", "brother's name", "sister's name"]
    # Group 1 captures the name phrase; .{50} consumes the 50 characters after it.
    regex = re.compile(r'\b(' + '|'.join(map(re.escape, names)) + r')\b.{50}', re.IGNORECASE)
    # \1 puts the captured phrase back, followed by the placeholder.
    text = re.sub(regex, r'\1 [name]', text)
    return text
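One caveat with .{50}: it needs exactly 50 characters after the phrase, and by default "." does not match a newline, so a phrase near the end of the text or just before a line break is left untouched. A possible variant, sketched here as an assumption about what is wanted rather than as part of the answer above, uses .{0,50} together with re.DOTALL. The name no_name_lenient and the shortened NAMES list are hypothetical and used only for illustration.

```python
import re

# Hypothetical, shortened list for illustration; the full list is in the question.
NAMES = ["your name", "his name", "mother's name"]

# .{0,50} consumes up to 50 characters, so a phrase close to the end of the
# text is still handled; re.DOTALL lets "." match line breaks as well.
PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, NAMES)) + r")\b.{0,50}",
    re.IGNORECASE | re.DOTALL,
)

def no_name_lenient(text):
    # \1 copies the matched phrase back; the characters that followed are dropped.
    return PATTERN.sub(r"\1 [name]", text)

# Only 6 characters follow the phrase here, so .{50} would not match at all.
print(no_name_lenient("What is your name? Bob."))
```

Whether a shorter trailing block should be replaced at all depends on the data, so treat the quantifier as a choice to make, not a fix that is always right.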
You should also use re.escape() when inserting strings that are meant to be matched literally into a regex, in case any of them contains regex operators.
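To see why escaping matters, here is a small sketch with a made-up phrase that contains parentheses. The phrase "name (first)" is hypothetical and is not in the question's list; none of the phrases in that list contain regex operators, so for that list escaping is a precaution.

```python
import re

# Hypothetical entry containing regex metacharacters (not from the question).
phrase = "name (first)"
text = "Enter your name (first) here"

# Without escaping, the parentheses form a group, so the pattern
# looks for the literal text "name first" and finds nothing.
unescaped = re.compile(phrase)

# With re.escape(), the parentheses are matched literally.
escaped = re.compile(re.escape(phrase))

print(unescaped.search(text) is None)    # True: no match without escaping
print(escaped.search(text) is not None)  # True: literal match succeeds
```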
https://stackoverflow.com/questions/58227193