Boost Spirit 2.x: how to deal with keywords and identifiers?
Good day.
I've used Boost Spirit Classic in the past and am now trying to move to the newer Boost Spirit 2.x. Could someone point me to how to deal with keywords? Say I want to distinguish between "foo" and "int", where "foo" is an identifier and "int" is a keyword. I want to protect my grammar from incorrectly parsing input such as "intfoo".
Okay, I have:
struct my_keywords : boost::spirit::qi::symbols<char, std::string> {
my_keywords() {
add
("void")
("string")
("float")
("int")
("bool")
//TODO: add others
;
}
} keywords_table_;
And the ident rule is declared as:
boost::spirit::qi::rule<Iterator, std::string(), ascii::space_type> ident;
ident = raw[lexeme[((alpha | char_('_')) >> *(alnum | char_('_'))) - keywords_table_]];
And, say, some rule:
boost::spirit::qi::rule<Iterator, ident_decl_node(), ascii::space_type> ident_decl;
ident_decl = (lit("void") | "float" | "string" | "bool") >> ident; // lit() is needed: two bare string literals cannot be combined with |
How do I write this correctly, stating that "void", "float", etc. are keywords? Thanks in advance.
Hmm, just declare your rule to be:
//The > operator says that your keyword MUST be followed by an ident,
//not just that it may be. (If I understood Spirit right, with the >> operator
//a failure simply makes the parser try other alternatives, which may or may
//not be what you want.)
ident_decl = keywords_table_ > ident;
Expanding on your example, you should end up with something like this:
struct my_keywords : boost::spirit::qi::symbols<char, int> {
my_keywords() {
add
("void", TYPE_VOID)
("string", TYPE_STRING)
("float", TYPE_FLOAT)
("int", TYPE_INT)
("bool", TYPE_BOOL)
//TODO: add others
;
}
} keywords_table_;
//...
class ident_decl_node
{
//this lets BOOST_FUSION_ADAPT_STRUCT access your private members
template < typename, int>
friend struct boost::fusion::extension::struct_member;
//newer versions of Spirit use:
//friend struct boost::fusion::extension::access::struct_member;
int type;
std::string ident;
};
BOOST_FUSION_ADAPT_STRUCT(
ident_decl_node,
(int, type)
(std::string, ident)
)
//...
struct MyErrorHandler
{
template <typename, typename, typename, typename>
struct result { typedef void type; };
template <typename Iterator>
void operator()(Iterator first, Iterator last, Iterator error_pos, std::string const& what) const
{
using boost::phoenix::construct;
std::string error_msg = "Error! Expecting ";
error_msg += what; // what failed?
error_msg += " here: \"";
error_msg += std::string(error_pos, last); // iterators to error-pos, end
error_msg += "\"";
//put a breakpoint here if you have no console for std::cout, or replace
//this line with something else.
std::cout << error_msg;
}
};
//...
using boost::spirit::qi::grammar;
using boost::spirit::ascii::space_type;
typedef std::vector<boost::variant<ident_decl_node, some_other_node> > ScriptNodes;
template <typename Iterator>
struct NodeGrammar: public grammar<Iterator, ScriptNodes(), space_type>
{
NodeGrammar() : NodeGrammar::base_type(start)
{
using namespace boost::spirit::arg_names; //edit1 (provides _1 ... _4 in this Spirit version)
//I had problems when I didn't add the eps rule (which does nothing), so you
//might want to keep it
start %= ident_decl | some_other_node_decl >> eps;
ident_decl %= keyword_table_ > ident;
//I'm not sure the %= operator will work correctly on this; you might have to do
//the push_back manually, but I think it should work
ident %= raw[lexeme[((alpha | char_('_')) >> *(alnum | char_('_'))) - keyword_table_]];
on_error<fail>(start, error_handler(_1, _2, _3, _4)); //edit1
}
my_keywords keyword_table_;
boost::spirit::qi::rule<Iterator, ScriptNodes(), ascii::space_type> start;
boost::spirit::qi::rule<Iterator, ident_decl_node(), ascii::space_type> ident_decl;
boost::spirit::qi::rule<Iterator, some_other_node(), ascii::space_type> some_other_node_decl;
boost::spirit::qi::rule<Iterator, std::string(), ascii::space_type> ident;
boost::phoenix::function<MyErrorHandler> error_handler; //edit1
};
Also, I don't know which version you use, but I used the one in Boost 1.40, and there seems to be a bug when operator %= is followed by only one argument (the parser does not parse such a rule correctly). For example:
ident_decl %= ident;
Do this instead:
ident_decl %= ident > eps;
This should be equivalent.
Hope this helps.