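I'm trying to build a language interpreter in Python that uses regular expressions. The language is a basic context-free language that formalizes the possible commands sent to a drilling robot. Its syntax consists of three main groups of instructions for moving the robot: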
1. ROTATE LEFT, ROTATE RIGHT, GO N UNITS (N is any natural number)
2. DRILL
3. REPEAT N TIMES { ... }
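The goal is to take a file of instructions (possibly several per line) and, for each instruction, execute a specific method on the robot. If there is any syntax error, I have to report the line it occurs on. An example of such a file: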
GO 4 UNITS ROTATE LEFT ROTATE LEFT
GO 5 UNITS DRILL GO 2 UNITS
REPEAT 4 TIMES { GO 5 UNITS ROTATE LEFT } GO 1 UNITS
REPEAT 2 TIMES {
GO 5 UNITS
ROTATE RIGHT DRILL
REPEAT 2 TIMES { GO 1 UNITS DRILL } ROTATE LEFT
}
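My plan was to match the possible instructions line by line with regex and execute the corresponding method to modify the robot's state. That turned out to be relatively simple, but the problem arises when I try to handle the REPEAT N TIMES { ... } instruction, because it can span multiple lines and can be nested recursively. I could match REPEAT N TIMES { and then look for a }, but since the instruction can be multi-line and recursive, this could get very messy very fast.
I would really appreciate the simplest way to solve this using regex.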
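For illustration, the line-by-line matching described above might look like the following minimal sketch. It covers only the three non-nested instructions and deliberately leaves out REPEAT. The names `INSTRUCTION` and `scan_line` and the tuple representation are assumptions made for this example.

```python
import re

# Hypothetical pattern: one alternative per non-nested instruction.
# The trailing \b rejects run-together words such as "DRILLGO".
INSTRUCTION = re.compile(
    r"\s*(?:(ROTATE)\s+(LEFT|RIGHT)|(GO)\s+(\d+)\s+UNITS|(DRILL))\b"
)

def scan_line(line, lineno):
    """Return the flat instructions found on one line.

    Raises ValueError naming the line number if part of the line
    does not match any known instruction.
    """
    found = []
    line = line.rstrip()
    pos = 0
    while pos < len(line):
        m = INSTRUCTION.match(line, pos)
        if m is None:
            rest = line[pos:].strip()
            raise ValueError("line %d: cannot parse %r" % (lineno, rest))
        if m.group(1):
            found.append(("ROTATE", m.group(2)))
        elif m.group(3):
            found.append(("GO", int(m.group(4))))
        else:
            found.append(("DRILL",))
        pos = m.end()
    return found
```

Calling `scan_line("GO 5 UNITS DRILL GO 2 UNITS", 2)` yields one tuple per instruction, while a line containing an unknown word raises an error that includes the line number passed in.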
Posted on 2022-08-23 20:44:35
This solves the basic text parsing problem. It will not handle malformed programs. "Don't do that," as we say.
body = """\
GO 4 UNITS ROTATE LEFT ROTATE LEFT
GO 5 UNITS DRILL GO 2 UNITS
REPEAT 4 TIMES { GO 5 UNITS ROTATE LEFT } GO 1 UNITS
REPEAT 2 TIMES {
GO 5 UNITS
ROTATE RIGHT DRILL
REPEAT 2 TIMES { GO 1 UNITS DRILL } ROTATE LEFT
}""".split()
def handle_go( u ):
    print( "going %d units" % int(u))

def handle_rotate( direc ):
    print( "rotating %s" % direc.lower() )

def handle_drill():
    print( "drilling" )

def find_matching_brace(body):
    # The opening '{' has already been consumed, so the first '}' seen
    # at nesting level 0 is the one that closes the current block.
    nest = 0
    for i, n in enumerate(body):
        if n == '{':
            nest += 1
        if n == '}':
            if not nest:
                return i
            nest -= 1

def process(body):
    # body is a list of whitespace-separated tokens, consumed from the front.
    while body:
        verb = body.pop(0)
        if verb == "GO":
            handle_go( body.pop(0) )
            assert body.pop(0) == 'UNITS'
        elif verb == "ROTATE":
            handle_rotate(body.pop(0))
        elif verb == "DRILL":
            handle_drill()
        elif verb == "REPEAT":
            count = body.pop(0)
            assert body.pop(0)=="TIMES"
            assert body.pop(0)=="{"
            closing = find_matching_brace(body)
            newbody = body[0:closing]
            print("repeat", count, closing, newbody)
            # Run a fresh copy of the block each time, since process() empties its list.
            for _ in range(int(count)):
                process(newbody[:])
            # Skip past the block and its closing brace.
            body = body[closing+1:]

process(body)
Output:
going 4 units
rotating left
rotating left
going 5 units
drilling
going 2 units
repeat 4 5 ['GO', '5', 'UNITS', 'ROTATE', 'LEFT']
going 5 units
rotating left
going 5 units
rotating left
going 5 units
rotating left
going 5 units
rotating left
going 1 units
repeat 2 17 ['GO', '5', 'UNITS', 'ROTATE', 'RIGHT', 'DRILL', 'REPEAT', '2', 'TIMES', '{', 'GO', '1', 'UNITS', 'DRILL', '}', 'ROTATE', 'LEFT']
going 5 units
rotating right
drilling
repeat 2 4 ['GO', '1', 'UNITS', 'DRILL']
going 1 units
drilling
going 1 units
drilling
rotating left
going 5 units
rotating right
drilling
repeat 2 4 ['GO', '1', 'UNITS', 'DRILL']
going 1 units
drilling
going 1 units
drilling
rotating left
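To report the line of a syntax error, which the code above does not attempt, one option is to tokenize with a regex that remembers line numbers and then parse recursively. The following is a minimal sketch under that assumption, not a definitive implementation. `TOKEN`, `tokenize`, `take`, `parse`, `flatten` and `RobotSyntaxError` are all hypothetical names introduced here, and the instruction tuples are only an illustrative representation.

```python
import re

# Braces are always their own token; anything else is a run of
# non-space, non-brace characters.
TOKEN = re.compile(r"[{}]|[^\s{}]+")

class RobotSyntaxError(Exception):
    """Hypothetical error type whose message starts with the line number."""

def tokenize(text):
    """Yield (token, line_number) pairs; line numbers start at 1."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in TOKEN.finditer(line):
            yield m.group(), lineno

def take(tokens, pos, pattern, what):
    """Return the token at pos if it fully matches pattern, else raise."""
    if pos >= len(tokens):
        raise RobotSyntaxError(
            "line %d: expected %s but input ended" % (tokens[-1][1], what))
    word, line = tokens[pos]
    if not re.fullmatch(pattern, word):
        raise RobotSyntaxError(
            "line %d: expected %s, got %r" % (line, what, word))
    return word

def parse(tokens, pos=0, depth=0):
    """Parse tokens from pos; return (instructions, next_pos)."""
    program = []
    while pos < len(tokens):
        word, line = tokens[pos]
        if word == "}":
            if depth == 0:
                raise RobotSyntaxError("line %d: unexpected '}'" % line)
            return program, pos + 1  # end of the current REPEAT block
        if word == "DRILL":
            program.append(("DRILL",))
            pos += 1
        elif word == "ROTATE":
            side = take(tokens, pos + 1, r"LEFT|RIGHT", "LEFT or RIGHT")
            program.append(("ROTATE", side))
            pos += 2
        elif word == "GO":
            n = int(take(tokens, pos + 1, r"\d+", "a number"))
            take(tokens, pos + 2, r"UNITS", "UNITS")
            program.append(("GO", n))
            pos += 3
        elif word == "REPEAT":
            n = int(take(tokens, pos + 1, r"\d+", "a number"))
            take(tokens, pos + 2, r"TIMES", "TIMES")
            take(tokens, pos + 3, r"\{", "'{'")
            # Recurse for the block; the call returns just past its '}'.
            block, pos = parse(tokens, pos + 4, depth + 1)
            program.append(("REPEAT", n, block))
        else:
            raise RobotSyntaxError(
                "line %d: unknown instruction %r" % (line, word))
    if depth:
        raise RobotSyntaxError("line %d: missing '}'" % tokens[-1][1])
    return program, pos

def flatten(program):
    """Expand REPEAT blocks into the flat sequence the robot would execute."""
    steps = []
    for instr in program:
        if instr[0] == "REPEAT":
            for _ in range(instr[1]):
                steps.extend(flatten(instr[2]))
        else:
            steps.append(instr)
    return steps

source = "GO 1 UNITS\nREPEAT 2 TIMES {\n  DRILL\n}\nROTATE UP\n"
try:
    program, _ = parse(list(tokenize(source)))
except RobotSyntaxError as err:
    print(err)  # prints: line 5: expected LEFT or RIGHT, got 'UP'
```

Because the whole file is parsed before anything runs, a malformed program is rejected before the robot moves. The trade-off is that regex is used only for splitting into tokens; the nesting itself is handled by recursion, since a single regular expression cannot match arbitrarily nested braces.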
https://stackoverflow.com/questions/73464753