Left-factoring an expression grammar for a PEG

Stack Overflow user
Asked 2012-03-18 14:25:18
1 answer, 505 views, 0 followers, 4 votes

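I'm trying to write a grammar for a language that allows the following expressions:

  1. Function calls of the form f args (note: no parentheses!)
  2. Additions of the form a + b (and more complex expressions, but that's not the point here)

For example: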

f 42       => f(42)
42 + b     => (42 + b)
f 42 + b   => f(42 + b)

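The grammar is unambiguous (every expression can be parsed in exactly one way), but I don't know how to write it as a PEG, because both productions can start with the same token, id. This is my faulty PEG. How can I rewrite it to make it valid?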

expression ::= call / addition

call ::= id addition*

addition ::= unary
           ( ('+' unary)
           / ('-' unary) )*

unary ::= primary
        / '(' ( ('+' unary)
              / ('-' unary)
              / expression)
          ')'

primary ::= number / id

number ::= [1-9]+

id ::= [a-z]+

Now, when this grammar tries to parse the input "a + b", it parses "a" as a function call with zero arguments and then chokes on "+ b".
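
To make the failure concrete, here is a minimal hand-written recognizer that follows the PEG above, with parentheses and unary signs left out for brevity. It is only a sketch, not the Boost.Spirit implementation mentioned in this question. The names `Recognizer`, `consumed_original` and `consumed_rewritten` are invented for illustration, and the rewritten rule `expression ::= id addition+ / addition` is an assumption of mine (one possible way to make the call alternative fail and backtrack when no argument follows), not the grammar proposed in the answer.

```cpp
#include <cstddef>
#include <string>

// PEG-style recognizer: every rule returns true and advances `pos` on a match,
// or restores `pos` and returns false on failure.
struct Recognizer {
    std::string in;
    std::size_t pos = 0;

    void skip_spaces() { while (pos < in.size() && in[pos] == ' ') ++pos; }

    // Matches one or more characters in [lo, hi], after optional spaces.
    bool run(char lo, char hi) {
        const std::size_t save = pos;
        skip_spaces();
        const std::size_t start = pos;
        while (pos < in.size() && in[pos] >= lo && in[pos] <= hi) ++pos;
        if (pos == start) { pos = save; return false; }
        return true;
    }
    bool id()      { return run('a', 'z'); }     // id ::= [a-z]+
    bool number()  { return run('1', '9'); }     // number ::= [1-9]+
    bool primary() { return number() || id(); }  // primary ::= number / id

    bool sign() {
        const std::size_t save = pos;
        skip_spaces();
        if (pos < in.size() && (in[pos] == '+' || in[pos] == '-')) { ++pos; return true; }
        pos = save;
        return false;
    }
    // addition ::= primary (('+' / '-') primary)*
    bool addition() {
        if (!primary()) return false;
        for (;;) {
            const std::size_t save = pos;
            if (!(sign() && primary())) { pos = save; break; }
        }
        return true;
    }
    // call ::= id followed by at least `min_args` additions
    bool call(int min_args) {
        const std::size_t save = pos;
        if (!id()) return false;
        int count = 0;
        while (addition()) ++count;
        if (count < min_args) { pos = save; return false; }
        return true;
    }
};

// The question's grammar: expression ::= call / addition, call ::= id addition*
// Returns the number of characters consumed (0 if nothing matched).
std::size_t consumed_original(const std::string& input) {
    Recognizer r;
    r.in = input;
    return (r.call(0) || r.addition()) ? r.pos : 0;
}

// Hypothetical rewrite: expression ::= id addition+ / addition
std::size_t consumed_rewritten(const std::string& input) {
    Recognizer r;
    r.in = input;
    return (r.call(1) || r.addition()) ? r.pos : 0;
}
```

With this sketch, `consumed_original("a + b")` stops after the single character `a`, because the ordered choice commits to the zero-argument call. `consumed_rewritten("a + b")` consumes the whole input, because the call alternative fails and the parser backtracks into `addition`.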

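I've uploaded a C++ / Boost.Spirit.Qi implementation of the grammar in case anyone wants to play with it.

(Note that unary disambiguates unary operations from additions: to call a function with a negative number as its argument, you need parentheses, e.g. f (-1).)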

1 Answer

Stack Overflow user

Accepted answer

Posted on 2012-03-22 00:20:48

As suggested in chat, you could start with something like this:

expression = addition | simple;

addition = simple >>
    (  ('+' > expression)
     | ('-' > expression)
    );

simple = '(' > expression > ')' | call | unary | number;

call = id >> *expression;

unary = qi::char_("-+") > expression;

// terminals
id = qi::lexeme[+qi::char_("a-z")];
number = qi::double_;

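Since then, I've implemented this in C++ with an AST representation, so you can pretty-print the result and get a feel for how this grammar actually builds the expression tree.

All source code is on GitHub: https://gist.github.com/2152518. There are two versions (scroll down to "Alternative" to read more).

The grammar: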

template <typename Iterator>
struct mini_grammar : qi::grammar<Iterator, expression_t(), qi::space_type> 
{
    qi::rule<Iterator, std::string(),  qi::space_type> id;
    qi::rule<Iterator, expression_t(), qi::space_type> addition, expression, simple;
    qi::rule<Iterator, number_t(),     qi::space_type> number;
    qi::rule<Iterator, call_t(),       qi::space_type> call;
    qi::rule<Iterator, unary_t(),      qi::space_type> unary;

    mini_grammar() : mini_grammar::base_type(expression) 
    {
        expression = addition | simple;

        addition = simple [ qi::_val = qi::_1 ] >> 
           +(  
               (qi::char_("+-") > simple) [ phx::bind(&append_term, qi::_val, qi::_1, qi::_2) ] 
            );

        simple = '(' > expression > ')' | call | unary | number;

        call = id >> *expression;

        unary = qi::char_("-+") > expression;

        // terminals
        id = qi::lexeme[+qi::char_("a-z")];
        number = qi::double_;
    }
};

The corresponding AST structures are defined using the very powerful Boost Variant:

struct addition_t;
struct call_t;
struct unary_t;
typedef double number_t;

typedef boost::variant<
    number_t,
    boost::recursive_wrapper<call_t>,
    boost::recursive_wrapper<unary_t>,
    boost::recursive_wrapper<addition_t>
    > expression_t;

struct addition_t
{
    expression_t lhs;
    char binop;
    expression_t rhs;
};

struct call_t
{
    std::string id;
    std::vector<expression_t> args;
};

struct unary_t
{
    char unop;
    expression_t operand;
};

BOOST_FUSION_ADAPT_STRUCT(addition_t, (expression_t, lhs)(char,binop)(expression_t, rhs));
BOOST_FUSION_ADAPT_STRUCT(call_t,     (std::string, id)(std::vector<expression_t>, args));
BOOST_FUSION_ADAPT_STRUCT(unary_t,    (char, unop)(expression_t, operand));

In the full code, I've also overloaded operator<< for these structures.
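
For readers without Boost at hand, the same tree shape can be sketched with C++17's `std::variant`. This is an illustrative equivalent, not the code from the gist: the names `Expr`, `Printer` and `to_string` are invented here, `std::shared_ptr` stands in for `boost::recursive_wrapper`, and the printer uses a visitor instead of `operator<<` overloads.

```cpp
#include <cstddef>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <variant>
#include <vector>

struct Addition;
struct Call;
struct Unary;
using Number = double;

// shared_ptr provides the indirection that boost::recursive_wrapper gives in the Boost version.
using Expr = std::variant<Number,
                          std::shared_ptr<Call>,
                          std::shared_ptr<Unary>,
                          std::shared_ptr<Addition>>;

struct Addition { Expr lhs; char binop; Expr rhs; };
struct Call     { std::string id; std::vector<Expr> args; };
struct Unary    { char unop; Expr operand; };

// Visitor with one overload per alternative of Expr.
struct Printer {
    std::ostream& os;

    void operator()(Number n) const { os << n; }
    void operator()(const std::shared_ptr<Call>& c) const {
        os << c->id << "(";
        for (std::size_t i = 0; i < c->args.size(); ++i) {
            if (i != 0) os << ", ";
            std::visit(*this, c->args[i]);
        }
        os << ")";
    }
    void operator()(const std::shared_ptr<Unary>& u) const {
        os << "(" << u->unop << ' ';
        std::visit(*this, u->operand);
        os << ")";
    }
    void operator()(const std::shared_ptr<Addition>& a) const {
        os << "(";
        std::visit(*this, a->lhs);
        os << ' ' << a->binop << ' ';
        std::visit(*this, a->rhs);
        os << ")";
    }
};

// Renders an expression tree as text.
std::string to_string(const Expr& e) {
    std::ostringstream os;
    std::visit(Printer{os}, e);
    return os.str();
}
```

Building a `Call` named `f` whose single argument is an `Addition` of two numbers and passing it to `to_string` shows the nesting directly: the addition appears parenthesized inside the call's argument list.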

Full demo

//#define BOOST_SPIRIT_DEBUG
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/adapted.hpp>
#include <boost/optional.hpp>

namespace qi = boost::spirit::qi;
namespace phx= boost::phoenix;

struct addition_t;
struct call_t;
struct unary_t;
typedef double number_t;

typedef boost::variant<
    number_t,
    boost::recursive_wrapper<call_t>,
    boost::recursive_wrapper<unary_t>,
    boost::recursive_wrapper<addition_t>
    > expression_t;

struct addition_t
{
    expression_t lhs;
    char binop;
    expression_t rhs;

    friend std::ostream& operator<<(std::ostream& os, const addition_t& a) 
        { return os << "(" << a.lhs << ' ' << a.binop << ' ' << a.rhs << ")"; }
};

struct call_t
{
    std::string id;
    std::vector<expression_t> args;

    friend std::ostream& operator<<(std::ostream& os, const call_t& a)
        { os << a.id << "("; for (auto& e : a.args) os << e << ", "; return os << ")"; }
};

struct unary_t
{
    char unop;
    expression_t operand;

    friend std::ostream& operator<<(std::ostream& os, const unary_t& a)
        { return os << "(" << a.unop << ' ' << a.operand << ")"; }
};

BOOST_FUSION_ADAPT_STRUCT(addition_t, (expression_t, lhs)(char,binop)(expression_t, rhs));
BOOST_FUSION_ADAPT_STRUCT(call_t,     (std::string, id)(std::vector<expression_t>, args));
BOOST_FUSION_ADAPT_STRUCT(unary_t,    (char, unop)(expression_t, operand));

void append_term(expression_t& lhs, char op, expression_t operand)
{
    lhs = addition_t { lhs, op, operand };
}

template <typename Iterator>
struct mini_grammar : qi::grammar<Iterator, expression_t(), qi::space_type> 
{
    qi::rule<Iterator, std::string(),  qi::space_type> id;
    qi::rule<Iterator, expression_t(), qi::space_type> addition, expression, simple;
    qi::rule<Iterator, number_t(),     qi::space_type> number;
    qi::rule<Iterator, call_t(),       qi::space_type> call;
    qi::rule<Iterator, unary_t(),      qi::space_type> unary;

    mini_grammar() : mini_grammar::base_type(expression) 
    {
        expression = addition | simple;

        addition = simple [ qi::_val = qi::_1 ] >> 
           +(  
               (qi::char_("+-") > simple) [ phx::bind(&append_term, qi::_val, qi::_1, qi::_2) ] 
            );

        simple = '(' > expression > ')' | call | unary | number;

        call = id >> *expression;

        unary = qi::char_("-+") > expression;

        // terminals
        id = qi::lexeme[+qi::char_("a-z")];
        number = qi::double_;

        BOOST_SPIRIT_DEBUG_NODE(expression);
        BOOST_SPIRIT_DEBUG_NODE(call);
        BOOST_SPIRIT_DEBUG_NODE(addition);
        BOOST_SPIRIT_DEBUG_NODE(simple);
        BOOST_SPIRIT_DEBUG_NODE(unary);
        BOOST_SPIRIT_DEBUG_NODE(id);
        BOOST_SPIRIT_DEBUG_NODE(number);
    }
};

std::string read_input(std::istream& stream) {
    return std::string(
        std::istreambuf_iterator<char>(stream),
        std::istreambuf_iterator<char>());
}

int main() {
    std::cin.unsetf(std::ios::skipws);
    std::string const code = read_input(std::cin);
    auto begin = code.begin();
    auto end = code.end();

    try {
        mini_grammar<decltype(end)> grammar;
        qi::space_type space;

        std::vector<expression_t> script;
        bool ok = qi::phrase_parse(begin, end, *(grammar > ';'), space, script);

        if (begin!=end)
            std::cerr << "Unparsed: '" << std::string(begin,end) << "'\n";

        std::cout << std::boolalpha << "Success: " << ok << "\n";

        if (ok)
        {
            for (auto& expr : script)
                std::cout << "AST: " << expr << '\n';
        }
    }
    catch (qi::expectation_failure<decltype(end)> const& ex) {
        std::cout << "Failure; parsing stopped after \""
                  << std::string(ex.first, ex.last) << "\"\n";
    }
}

Alternative:

I have an alternative version that builds addition_t iteratively rather than recursively, so to speak:

struct term_t
{
    char binop;
    expression_t rhs;
};

struct addition_t
{
    expression_t lhs;
    std::vector<term_t> terms;
};

This removes the need to use Phoenix to build the expression:

    addition = simple >> +term;

    term = qi::char_("+-") > simple;
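
A flat list of terms still carries the left-to-right grouping: it is recovered when the list is folded. The sketch below illustrates this under a simplifying assumption, namely that every operand is a plain `double` rather than a full `expression_t`. The names `Term`, `Addition`, `evaluate` and `render` are invented for this example and are not part of the gist.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Simplified stand-ins for term_t and addition_t, with numeric operands only.
struct Term     { char binop; double rhs; };
struct Addition { double lhs; std::vector<Term> terms; };

// Folds the terms from left to right, so 10 - 3 + 2 is computed as (10 - 3) + 2.
double evaluate(const Addition& a) {
    double acc = a.lhs;
    for (const Term& t : a.terms)
        acc = (t.binop == '+') ? acc + t.rhs : acc - t.rhs;
    return acc;
}

// Rebuilds the nested, left-associative form as text.
std::string render(const Addition& a) {
    std::ostringstream first;
    first << a.lhs;
    std::string out = first.str();
    for (const Term& t : a.terms) {
        std::ostringstream step;
        step << "(" << out << ' ' << t.binop << ' ' << t.rhs << ")";
        out = step.str();
    }
    return out;
}
```

The fold visits the terms in input order, so each new term wraps everything parsed before it. That is the same grouping the recursive version builds by repeatedly replacing the left-hand side.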

3 votes

Page content originally from Stack Overflow. Original link: https://stackoverflow.com/questions/9759093
