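Given the sample text below, can I use a regular expression to match every line of every address, and also add markers so I know where one address ends and the next begins? At the moment I know how to match each complete address. I could then run a second regular expression to pick out the individual lines, but can both steps be done in one pass?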
Address:
Address 1 line 1,
Address 1 line 2,
Address 1 line 3
Address:
Address 2 line 1,
Address 2 line 2,
Address 2 line 3,
Address 2 line 4
Address:
Address 3 line 1,
Address 3 line 2
Posted on 2016-03-08 21:19:49
Here is a Pattern with the DOTALL flag enabled. It uses the "Address:" string as a delimiter and matches across multiple lines:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// for test
String addresses = "Address:" + System.getProperty("line.separator")
    + "Address 1 line 1," + System.getProperty("line.separator")
    + "Address 1 line 2," + System.getProperty("line.separator")
    // the separator here keeps "Address:" on its own line, as in the question
    + "Address 1 line 3" + System.getProperty("line.separator")
    + "Address:" + System.getProperty("line.separator")
    + "Address 2 line 1," + System.getProperty("line.separator")
    + "Address 2 line 2," + System.getProperty("line.separator")
    + "Address 2 line 3";
//                           | look behind for "Address:"
//                           |            | any 1+ character,
//                           |            | reluctantly quantified
//                           |            |  | lookahead for "Address:"
//                           |            |  | or end of input
//                           |            |  |                | dot can mean
//                           |            |  |                | line separator
Pattern p = Pattern.compile("(?<=Address:).+?(?=Address:|$)", Pattern.DOTALL);
Matcher m = p.matcher(addresses);
// iterating matches within given string, and printing
while (m.find()) {
    System.out.printf("Found: %s%n%n", m.group());
}
Output
Found:
Address 1 line 1,
Address 1 line 2,
Address 1 line 3
Found:
Address 2 line 1,
Address 2 line 2,
Address 2 line 3
Note
To exclude the line separator that follows the "Address:" marker from the match, you can use this refined pattern:
Pattern p = Pattern.compile("(?<=Address:"
    + System.getProperty("line.separator") + ").+?(?=Address:"
    + System.getProperty("line.separator") + "|$)",
    Pattern.DOTALL
);
Posted on 2016-03-08 21:24:26
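To get every line and the address boundaries in a single pass, as the question asks, one possible alternative to the DOTALL pattern is to match line by line with the MULTILINE flag and let an alternation say which kind of line was found. This is only a sketch under assumptions: the class name AddressLines and the method parse are hypothetical, every delimiter is assumed to be a line containing exactly "Address:", and the trailing comma of each address line is dropped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not taken from either answer.
class AddressLines {
    // Group 1 is set when the line is the "Address:" delimiter.
    // Group 2 is set for any other non-empty line, without its trailing comma.
    private static final Pattern LINE =
            Pattern.compile("^(Address:)$|^(.+?),?$", Pattern.MULTILINE);

    static List<List<String>> parse(String text) {
        List<List<String>> addresses = new ArrayList<>();
        Matcher m = LINE.matcher(text);
        while (m.find()) {
            if (m.group(1) != null) {
                // Delimiter line: the previous address ends, a new one starts.
                addresses.add(new ArrayList<>());
            } else if (!addresses.isEmpty()) {
                // Ordinary line: append it to the address currently being read.
                addresses.get(addresses.size() - 1).add(m.group(2));
            }
        }
        return addresses;
    }

    public static void main(String[] args) {
        String text = "Address:\nAddress 1 line 1,\nAddress 1 line 2,\nAddress 1 line 3\n"
                + "Address:\nAddress 2 line 1,\nAddress 2 line 2";
        List<List<String>> addresses = parse(text);
        for (int i = 0; i < addresses.size(); i++) {
            System.out.println("Address " + (i + 1) + " has " + addresses.get(i).size() + " lines");
        }
    }
}
```

Checking which group is non-null is what serves as the marker between addresses here; lines that appear before the first "Address:" are ignored.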
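If you want a regular expression...
If the number of lines in an address is bounded (four in the example), you can capture them with:
Address:\s*?(?:\n(.*),)?(?:\n(.*),)?(?:\n(.*),)?(?:\n(.*),)?(?:\n(.*))
Here the text Address: marks the start of a block. Each optional group captures one comma-terminated line, and the final, required group captures the last line, which has no trailing comma. As written there are four optional groups, so the pattern accepts up to five lines.
(You will need the global flag.)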
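Java has no global flag; calling Matcher.find() in a loop plays that role. The following sketch applies the fixed-group pattern that way and collects only the groups that actually matched. The class name AddressGroups and the method parse are hypothetical, and the input is assumed to use "\n" line endings, since the pattern matches "\n" literally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper built around the pattern from this answer.
class AddressGroups {
    private static final Pattern BLOCK = Pattern.compile(
            "Address:\\s*?(?:\\n(.*),)?(?:\\n(.*),)?(?:\\n(.*),)?(?:\\n(.*),)?(?:\\n(.*))");

    static List<List<String>> parse(String text) {
        List<List<String>> addresses = new ArrayList<>();
        Matcher m = BLOCK.matcher(text);
        // Each successful find() is one address block.
        while (m.find()) {
            List<String> lines = new ArrayList<>();
            for (int g = 1; g <= m.groupCount(); g++) {
                // Optional groups that did not take part in the match are null.
                if (m.group(g) != null) {
                    lines.add(m.group(g));
                }
            }
            addresses.add(lines);
        }
        return addresses;
    }

    public static void main(String[] args) {
        String text = "Address:\nAddress 1 line 1,\nAddress 1 line 2,\nAddress 1 line 3\n"
                + "Address:\nAddress 2 line 1,\nAddress 2 line 2";
        for (List<String> address : parse(text)) {
            System.out.println(address.size() + " lines, first: " + address.get(0));
        }
    }
}
```

Each match is one address, so the boundary between addresses is simply the boundary between matches. The trade-off is the hard upper limit on lines per address that the repeated groups impose.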
https://stackoverflow.com/questions/35868417