Is there a way to use the following (undocumented) re.Scanner
to find everything inside double quotes, so that such a match is classified as a string?
scanner = re.Scanner([
    (r"[-10-9]+", lambda scanner, token: ("INTEGER", int(token))),
    (r"[A-Za-z]+", lambda scanner, token: ("NAME", str(token))),
    (r"[:true::false:]+", lambda scanner, token: ("BOOL", token)),
    (r"[:error:]+", lambda scanner, token: ("ERROR", token)),
    (r'.', lambda scanner, token: None),
])
Posted on 2014-03-30 00:22:58
You can simply add a string regex to the scanner's rule list, placed before the catch-all '.' rule, like this:
>>> import re
>>> scanner = re.Scanner([
...     (r"[-10-9]+", lambda scanner, token: ("INTEGER", int(token))),
...     (r"[A-Za-z]+", lambda scanner, token: ("NAME", str(token))),
...     (r"[:true::false:]+", lambda scanner, token: ("BOOL", token)),
...     (r"[:error:]+", lambda scanner, token: ("ERROR", token)),
...     (r'".*?"', lambda scanner, token: ("STRING", token)),  # added STRING regex
...     (r'.', lambda scanner, token: None),
... ])
Now you can test it:
>>> i = '"string"' # simulated input
>>> t = '"this is a very long string with whitespace"' # another simulated input
>>> scanner.scan(i)
([('STRING', '"string"')], '') # ([(token_label, match)], remainder_of_string)
>>> scanner.scan(t)
([('STRING', '"this is a very long string with whitespace"')], '')
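If the input mixes strings with other tokens, or the strings may contain escaped quotes, a slightly stricter rule can help. The following is only a sketch under a few assumptions that are not in the original post: the STRING rule is moved to the front, its pattern is changed to accept backslash escapes, the surrounding quotes are stripped from the token value, and the INTEGER pattern is tightened to `-?[0-9]+`. The BOOL and ERROR rules are left out to keep the example short.

```python
import re

# Rules are tried in order at each position, so STRING is listed first
# and the catch-all '.' rule stays last.
scanner = re.Scanner([
    # Assumed pattern: a double-quoted string that may contain
    # backslash escapes such as \" (non-capturing group on purpose).
    (r'"(?:\\.|[^"\\])*"', lambda s, tok: ("STRING", tok[1:-1])),
    # Assumed tightening of the original integer pattern.
    (r"-?[0-9]+", lambda s, tok: ("INTEGER", int(tok))),
    (r"[A-Za-z]+", lambda s, tok: ("NAME", tok)),
    # Anything else (spaces, punctuation) is skipped.
    (r".", lambda s, tok: None),
])

# Simulated input: a name, an integer, and two quoted strings,
# the second one containing escaped quotes.
text = 'x 42 "hello world" "say \\"hi\\""'
tokens, remainder = scanner.scan(text)

for kind, value in tokens:
    print(kind, repr(value))
print("remainder:", repr(remainder))
```

The escapes are kept as written in the token value (the backslashes are not decoded); decoding them would be a separate step.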
https://stackoverflow.com/questions/22738680