How do I get yacc/bison and/or lex/flex to restart scanning after something like token substitution?
Is there a way to force bison and/or flex to restart scanning after I replace some token with something else?
My particular example is replacement of a specific word/string. If I want the word hello to be replaced by echo hello, how can I get flex or bison to replace hello and then start parsing again (to pick up two words instead of just one)? So it would be like:
- Get token WORD (which is a string type)
- If it is hello, replace the token value with echo hello
- Restart parsing the entire input (which is now echo hello)
- Get token WORD (echo)
- Get token WORD (hello)
I've seen very tempting functions like yyrestart(), but I don't really understand what that function in particular accomplishes. Any help is greatly appreciated, thanks!
Update 4/23/2010
One kind of hack-and-slash solution I've ended up using: for each word that comes through, I check an "alias" array. If the word has an alias, I replace the value of the word (using, for example, strcpy($1, aliasval)) and set an aliasfound flag.
Once the entire line of input has been parsed once, if the aliasfound flag is true, I use yy_scan_string() to switch the buffer state to the input with expanded aliases, and call YYACCEPT.
So then it jumps out to the main function and I call yyparse() again, with the buffer still pointing to my string. This continues until no aliases are found. Once all of my grammar actions are complete, I call yyrestart(stdin) to go back to "normal" mode.
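The reparse loop described above can be modelled outside flex/bison as plain string rewriting, which makes its termination behaviour easier to see. This is only a sketch: the alias table, MAX_PASSES, lookup_alias(), expand_once() and expand_all() are all hypothetical names, and in the real program each pass would be a yyparse() run over a yy_scan_string() buffer. Note that with the alias from the question (hello expanding to echo hello) a later pass always finds hello again, so the loop never reaches a fixed point; the sketch caps the number of passes for that reason.

```c
#include <string.h>

#define MAX_PASSES 16   /* assumed cap; stops runaway expansion */

/* Hypothetical alias table, for illustration only. */
struct alias { const char *name; const char *value; };
static const struct alias aliases[] = {
    { "ll",    "ls -l" },
    { "lla",   "ll -a" },        /* nested: expands to another alias */
    { "hello", "echo hello" },   /* self-referential, as in the question */
};

/* Return the alias value for word, or NULL if word is not an alias. */
const char *lookup_alias(const char *word)
{
    for (size_t i = 0; i < sizeof aliases / sizeof aliases[0]; i++)
        if (strcmp(aliases[i].name, word) == 0)
            return aliases[i].value;
    return NULL;
}

/* One pass: copy in to out, replacing every aliased word by its value.
   Returns 1 if something was replaced, 0 if not, -1 if out is too small. */
int expand_once(const char *in, char *out, size_t outsz)
{
    size_t used = 0;
    int found = 0;

    if (outsz == 0)
        return -1;
    out[0] = '\0';
    while (*in) {
        size_t len = strcspn(in, " \t\n");   /* length of the next word */
        if (len == 0) {                      /* whitespace: copy it through */
            if (used + 1 >= outsz)
                return -1;
            out[used++] = *in++;
            out[used] = '\0';
            continue;
        }
        char word[128];
        if (len >= sizeof word)
            return -1;
        memcpy(word, in, len);
        word[len] = '\0';
        const char *rep = lookup_alias(word);
        const char *src = rep ? rep : word;
        size_t n = strlen(src);
        if (used + n >= outsz)
            return -1;
        memcpy(out + used, src, n + 1);      /* includes the terminating NUL */
        used += n;
        if (rep)
            found = 1;                       /* plays the role of aliasfound */
        in += len;
    }
    return found;
}

/* Repeat passes until one finds no alias. Returns 0 on success, -1 if the
   text does not fit or the pass cap is hit (e.g. a self-referential alias). */
int expand_all(const char *line, char *out, size_t outsz)
{
    char a[512], b[512];

    if (strlen(line) >= sizeof a)
        return -1;
    strcpy(a, line);
    for (int pass = 0; pass < MAX_PASSES; pass++) {
        int r = expand_once(a, b, sizeof b);
        if (r < 0)
            return -1;
        strcpy(a, b);                        /* b always fits: same size as a */
        if (r == 0) {
            if (strlen(a) >= outsz)
                return -1;
            strcpy(out, a);
            return 0;
        }
    }
    return -1;
}
```

With this table, expand_all("lla /tmp", out, sizeof out) needs two rewriting passes plus a final pass that finds nothing, while expand_all("hello", ...) gives up at the pass cap.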
If anyone knows how I can effectively expand my words with their alias values, inject them into stdin (or some other method), and basically expand all aliases (even nested ones) as I go, that would be awesome. I was playing around with yypush_buffer_state() and yypop_buffer_state(), along with yy_switch_to_buffer(), but I couldn't get "inline" substitution with continued parsing working...
It seems to me that the place to fix this is the lexer. I would suggest using flex, which supports a state machine (called "Start Conditions" in the flex documentation). You change states using BEGIN, and the states need to be declared in the definitions section (with %s or %x).
So, for example, you could have rules like
<INITIAL>hello BEGIN(in_echo); yyless(0); return (WORD_ECHO);
<in_echo>hello BEGIN(0); return (WORD_HELLO);
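yyless(n) keeps only the first n characters of yytext and returns the rest of the match to the input, so yyless(0) puts the entire matched text back into the stream, where it is scanned again, this time in the in_echo state.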
I haven't tried this out myself, but I think this is the structure of the solution you want.
Adding an "answer" based on what I ended up doing, so that I can mark this question as answered. It is the approach described in the 4/23/2010 update to the question above.