Boost spirit is too greedy

2023-02-18 05:21

I'm torn between deep admiration for boost::spirit and eternal frustration at not understanding it ;)

I have a problem with string rules that are too greedy, so the input doesn't match. Below is a minimal example that fails to parse because the txt rule eats up end.

More information about what I'd like to do: the goal is to parse some pseudo-SQL while skipping whitespace. In a statement like

select foo.id, bar.id from foo, baz

I need to treat from as a special keyword. The rule is something like

"select" >> txt % ',' >> "from" >> txt % ','

but it obviously doesn't work, as it sees bar.id from foo as one item.

#include <boost/spirit/include/qi.hpp>
#include <iostream>
namespace qi = boost::spirit::qi;
int main(int, char**) {
    auto txt = +(qi::char_("a-zA-Z_"));
    auto rule = qi::lit("Hello") >> txt % ',' >> "end";
    std::string str = "HelloFoo,Moo,Bazend";
    std::string::iterator begin = str.begin();
    if (qi::parse(begin, str.end(), rule))
        std::cout << "Match !" << std::endl;
    else
        std::cout << "No match :'(" << std::endl;
}

Here's my version, with changes marked:

#include <boost/spirit/include/qi.hpp>
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
namespace qi = boost::spirit::qi;
int main(int, char**) {
  auto txt = qi::lexeme[+(qi::char_("a-zA-Z_"))];     // CHANGE: avoid eating spaces
  auto rule = qi::lit("Hello") >> txt % ',' >> "end";
  std::string str = "Hello Foo, Moo, Baz end";        // CHANGE: re-introduce spaces
  std::string::iterator begin = str.begin();
  if (qi::phrase_parse(begin, str.end(), rule, qi::ascii::space)) {          // CHANGE: use phrase_parse with a skipper
    std::cout << "Match !" << std::endl << "Remainder (should be empty): '"; // CHANGE: show if we parsed the whole string and not just a prefix
    std::copy(begin, str.end(), std::ostream_iterator<char>(std::cout));
    std::cout << "'" << std::endl;
  }
  else {
    std::cout << "No match :'(" << std::endl;
  }
}

This compiles and runs with GCC 4.4.3 and Boost 1.4something; output:

Match !
Remainder (should be empty): ''

By using lexeme, you turn off space skipping inside txt only, so that txt matches up to a word boundary and no further. This yields the desired result: because "Baz" is not followed by a comma, and txt doesn't eat spaces, we never accidentally consume "end".

Anyway, I'm not 100% sure this is what you're looking for -- in particular, is str missing spaces as an illustrative example, or are you somehow forced to use this (spaceless) format?

Side note: if you want to make sure that you've parsed the entire string, add a check to see if begin == str.end(). As stated, your code will report a match even if only a non-empty prefix of str was parsed.

Update: Added suffix printing.
