Based on the answers I received to this question, I edited the regex below.
My string contains a mix of year and month terms. I need to detect both with a regex.
String1 = " I have total exp of 10-11 years. This includes 15yearsin SAS and 5 years in python. I also have 8 months of exp in R programming."
import re
pat= re.compile(r'\d{1,3}(?:\W+\d{1,3})?\W+(?:plus\s*)?(?:year|month|Year|Month)s?\b', re.X)
experience = re.findall(pat,String1 )
print(experience)
['10-11 years', '5 years', '8 months']
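But I also want the terms that are written without spaces, i.e. 15years (I am reading these from free-flowing text).
Can anyone help me get the right regex?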
Posted on 2019-02-22 12:17:11
You can use
r'\b\d{1,2}(?:\D+\d{1,2})?\D+(?:year|month)s?\b'
See the output: ['10-11 years', '15 years in SAS and 5 years', '8 months'].
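Details
\b - a word boundary
\d{1,2} - one or two digits
(?:\D+\d{1,2})? - an optional sequence of
    \D+ - one or more characters other than digits
    \d{1,2} - one or two digits
\D+ - one or more non-digit characters
(?:year|month) - either year or month
s? - an optional s
\b - a word boundary.

import re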
String1 = " I have total exp of 10-11 years. This includes 15 years in SAS and 5 years in python. I also have 8 months of exp in R programming."
reg = r'\b\d{1,2}(?:\D+\d{1,2})?\D+(?:year|month)s?\b'
print(re.findall(reg, String1))
# => ['10-11 years', '15 years in SAS and 5 years', '8 months']
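The breakdown above can also be kept next to the pattern itself by compiling it with re.VERBOSE, which ignores whitespace and allows # comments inside the pattern. This is only a sketch of the same regex in a different layout; the names pattern and text are illustrative and not from the original answer.

```python
import re

# Same pattern as above, laid out with re.VERBOSE so each piece carries a comment.
pattern = re.compile(r"""
    \b \d{1,2}           # one or two digits, starting at a word boundary
    (?: \D+ \d{1,2} )?   # optional: non-digits followed by one or two more digits
    \D+                  # one or more non-digit characters
    (?: year | month )   # the unit, either year or month
    s?                   # an optional plural s
    \b                   # a word boundary
""", re.VERBOSE)

text = (" I have total exp of 10-11 years. This includes 15 years in SAS and "
        "5 years in python. I also have 8 months of exp in R programming.")

print(pattern.findall(text))
# => ['10-11 years', '15 years in SAS and 5 years', '8 months']
```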
Note: if you plan to get ['10-11 years', '15 years', '5 years', '8 months'], replace \D+ with \W+ (one or more characters other than letters, digits, and underscores) and use
r'\b\d{1,2}(?:\W+\d{1,2})?\W+(?:year|month)s?\b'
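As a rough check, the \W+ pattern can be run against both the spaced sample and the question's original text, where 15yearsin has no spaces. The second pattern, called loose here, is an assumption of mine rather than part of the answer: it swaps the last \W+ for \W* and drops the trailing \b so that a number glued to its unit still matches. The variable names are illustrative.

```python
import re

# Sample with spaces (from the answer) and the glued form from the question.
text_spaced = (" I have total exp of 10-11 years. This includes 15 years in SAS and "
               "5 years in python. I also have 8 months of exp in R programming.")
text_glued = (" I have total exp of 10-11 years. This includes 15yearsin SAS and "
              "5 years in python. I also have 8 months of exp in R programming.")

# Pattern from the note: requires at least one non-word character before the unit.
strict = re.compile(r'\b\d{1,2}(?:\W+\d{1,2})?\W+(?:year|month)s?\b')

# Hypothetical variant: separator is optional and no trailing word boundary,
# so "15yearsin" yields "15years".
loose = re.compile(r'\b\d{1,2}(?:\W+\d{1,2})?\W*(?:year|month)s?', re.IGNORECASE)

print(strict.findall(text_spaced))  # ['10-11 years', '15 years', '5 years', '8 months']
print(strict.findall(text_glued))   # ['10-11 years', '5 years', '8 months']
print(loose.findall(text_glued))    # ['10-11 years', '15years', '5 years', '8 months']
```

The trade-off is that, without the trailing \b, the loose pattern also accepts longer words that merely begin with the unit (for example, it would return "5 year" from "5 yearly"), so it may need tightening depending on the text.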
https://stackoverflow.com/questions/54826338