
PEG grammar ordered choice fails

Stack Overflow user
Asked 2020-06-22 23:37:30
Answers: 1 | Views: 115 | Followers: 0 | Votes: 1

I have written a PEG grammar for a toy DSL using the Python package Arpeggio:

from arpeggio.cleanpeg import ParserPEG

grammar = r"""
    root    = block* EOF
    block   = header (item1+ / item2+)
    header  = "block"
    item1   = number name comment?
    item2   = number name list comment?
    number  = r"\d+"
    name    = r"\w+"
    list    = r"\[.*\]"
    comment = r"\/\/.*"
"""

doc = """
block
  5 alpha []        //
  3 beta [a, b, c]  // this is an item2

block
  6 foo
  1 bar  // This is an item1
  4 baz  // more stuff
"""

parser = ParserPEG(grammar, 'root', debug=True)
parse_tree = parser.parse(doc)
print('Tree:', parse_tree)

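This gives a strange result when parsing the test document: the parser fails to match item1 properly in the ordered choice, yet it wrongly reports (on the line marked xxxx) that the choice matched, without ever testing item2, which would have matched.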

>> Matching rule root=Sequence at position 0 => * block   5
   >> Matching rule ZeroOrMore in root at position 0 => * block   5
      >> Matching rule block=Sequence in root at position 0 => * block   5
         ?? Try match rule header=StrMatch(block) in block at position 1 =>  *block   5 
         ++ Match 'block' at 1 => ' *block*   5 '
         >> Matching rule OrderedChoice in block at position 6 =>  block*   5 alpha
            >> Matching rule OneOrMore in block at position 6 =>  block*   5 alpha
               >> Matching rule item1=Sequence in block at position 6 =>  block*   5 alpha
                  ?? Try match rule number=RegExMatch(\d+) in item1 at position 9 =>  block   *5 alpha []
                  ++ Match '5' at 9 => ' block   *5* alpha []'
                  ?? Try match rule name=RegExMatch(\w+) in item1 at position 11 => block   5 *alpha []  
                  ++ Match 'alpha' at 11 => 'block   5 *alpha* []  '
                  >> Matching rule Optional in item1 at position 16 =>    5 alpha* []       
                     ?? Try match rule comment=RegExMatch(\/\/.*) in item1 at position 17 =>   5 alpha *[]        
                     -- NoMatch at 17
                  <<- Not matched rule Optional in item1 at position 16 =>    5 alpha* []       
               <<+ Matched rule item1=Sequence in item1 at position 16 =>    5 alpha* []       
               >> Matching rule item1=Sequence in block at position 16 =>    5 alpha* []       
                  ?? Try match rule number=RegExMatch(\d+) in item1 at position 17 =>   5 alpha *[]        
                  -- NoMatch at 17
               <<- Not matched rule item1=Sequence in item1 at position 16 =>    5 alpha* []
xxxx-->     <<+ Matched rule OneOrMore in block at position 16 =>    5 alpha* []       
         <<+ Matched rule OrderedChoice in block at position 16 =>    5 alpha* []       
      <<+ Matched rule block=Sequence in block at position 16 =>    5 alpha* []       
      >> Matching rule block=Sequence in root at position 16 =>    5 alpha* []       
         ?? Try match rule header=StrMatch(block) in block at position 17 =>   5 alpha *[]        
         -- No match 'block' at 17 => '  5 alpha *[]   *     '
      <<- Not matched rule block=Sequence in block at position 16 =>    5 alpha* []       
   <<+ Matched rule ZeroOrMore in root at position 16 =>    5 alpha* []       
   ?? Try match rule EOF in root at position 17 =>   5 alpha *[]        
   !! EOF not matched.
<<- Not matched rule root=Sequence in root at position 0 => * block   5

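As a result, the parser never consumes the input that item2 would have consumed had it actually been tried after item1 stopped matching.

Is this a bug in the parser package, or a mistake in my grammar?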

Note that reversing the order of the choice:

    block   = header (item2+ / item1+)

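parses the sample document correctly. But an anomaly that is easy to work around in a toy problem may be much harder to work around in a real grammar. An item is unambiguously either an item1 or an item2, so the order in which they are checked should not matter, and the code that parses them should behave consistently.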


1 Answer

Stack Overflow user

Accepted answer

Posted 2020-06-23 10:39:48

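In PEG, the order of the expressions in an OrderedChoice matters. When the parser tries item1+, matching at least one item1 is enough for that alternative to succeed, and the whole ordered choice is then considered successful, so item2+ is never tried.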
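The behavior described above can be reproduced without Arpeggio. The following is a minimal sketch of PEG-style combinators using only the standard library; the helper names (`regex`, `seq`, `optional`, `one_or_more`, `choice`) are hypothetical and are not Arpeggio's API or implementation. It assumes whitespace is skipped before each terminal, and it models only the item rules from the question.

```python
import re

def regex(pattern):
    """Terminal: skip leading whitespace, then match the pattern at pos."""
    compiled = re.compile(r"\s*(?:" + pattern + ")")
    def parse(text, pos):
        m = compiled.match(text, pos)
        return m.end() if m else None
    return parse

def seq(*parts):
    """Sequence: every part must match, one after the other."""
    def parse(text, pos):
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def optional(part):
    """Optional: always succeeds, consuming input only if part matches."""
    def parse(text, pos):
        end = part(text, pos)
        return pos if end is None else end
    return parse

def one_or_more(part):
    """OneOrMore: succeeds as soon as part has matched at least once."""
    def parse(text, pos):
        end = part(text, pos)
        if end is None:
            return None
        while end is not None:
            pos, end = end, part(text, end)
        return pos
    return parse

def choice(*alternatives):
    """Ordered choice: commit to the first alternative that succeeds."""
    def parse(text, pos):
        for alt in alternatives:
            end = alt(text, pos)
            if end is not None:
                return end  # later alternatives are never tried
        return None
    return parse

# Hypothetical stand-ins for the rules in the question's grammar.
number = regex(r"\d+")
name = regex(r"\w+")
list_ = regex(r"\[.*\]")
comment = regex(r"//.*")
item1 = seq(number, name, optional(comment))
item2 = seq(number, name, list_, optional(comment))

item1_first = choice(one_or_more(item1), one_or_more(item2))
item2_first = choice(one_or_more(item2), one_or_more(item1))

line = "5 alpha []"
print(item1_first(line, 0), len(line))  # stops before "[]"
print(item2_first(line, 0), len(line))  # consumes the whole line
```

With `item1` first, the choice succeeds after `5 alpha` and leaves `[]` unconsumed, which is the same point at which the trace in the question reports a match and then fails on `EOF`.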

As a general rule, in an ordered choice always put the more specific alternatives first and the more general ones last.
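To illustrate this rule in isolation, here is a small standard-library sketch. The helpers `regex` and `choice` are hypothetical names written for this example, not part of any parser package, and the two patterns are simplified stand-ins for a "general" and a "specific" alternative.

```python
import re

def regex(pattern):
    """Terminal: return the end position of a match at pos, or None."""
    compiled = re.compile(pattern)
    def parse(text, pos):
        m = compiled.match(text, pos)
        return m.end() if m else None
    return parse

def choice(*alternatives):
    """Ordered choice: the first alternative that matches wins."""
    def parse(text, pos):
        for alt in alternatives:
            end = alt(text, pos)
            if end is not None:
                return end
        return None
    return parse

general = regex(r"\w+")            # a bare name
specific = regex(r"\w+ \[.*\]")    # a name followed by a list

general_first = choice(general, specific)
specific_first = choice(specific, general)

line = "alpha [a, b]"
print(general_first(line, 0) == len(line))   # False: stops after "alpha"
print(specific_first(line, 0) == len(line))  # True: whole line consumed
```

Because the general alternative matches a prefix of everything the specific one matches, placing it first means the specific alternative can never be selected.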

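Update: the Wikipedia article on parsing expression grammars has a good explanation in its section "Ambiguity detection and influence of rule order on language that is matched".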

Votes: 2
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/62525111
