How do I get Bison/YACC to not recognize a command until it parses the whole string?
I have some bison grammar:
input: /* empty */
     | input command
     ;
command:
      builtin
    | external
    ;
builtin:
      CD { printf("Changing to home directory...\n"); }
    | CD WORD { printf("Changing to directory %s\n", $2); }
    ;
I'm wondering how I get Bison to not accept (YYACCEPT?) something as a command until it reads ALL of the input. That way I can have all these rules below that use recursion or whatever to build things up, resulting in either a valid command or something that's not going to work.
One simple test I'm doing with the code above is just entering "cd mydir mydir". Bison parses CD and WORD and goes "hey! this is a command, put it to the top!". Then the next token it finds is just WORD, which has no rule, and then it reports an error.
I want it to read the whole line and realize CD WORD WORD is not a rule, and then report an error. I think I'm missing something obvious and would greatly appreciate any help - thanks!
Also - I've tried using input command NEWLINE or something similar, but it still pushes CD WORD to the top as a command and then parses the extra WORD separately.
Sometimes I deal with these cases by flattening my grammars.
In your case, it might make sense to add tokens to your lexer for newline and command separators (;) so you can put them explicitly in your Bison grammar. The parser will then expect a full line of input before accepting it as a command.
sep: NEWLINE | SEMICOLON
;
command: CD sep
| CD WORD sep
;
Or, for an arbitrary list of arguments like a real shell:
args:
/* empty */
| args WORD
;
command:
CD args sep
;
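With the arbitrary-argument version the grammar will happily accept "cd mydir mydir", so the arity check has to move into the action for CD args sep. Here is a minimal C sketch of what those actions might call. ArgList, args_append, check_cd and MAX_ARGS are names I made up for illustration; they are not part of Bison or of the grammar above.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_ARGS 8   /* arbitrary limit chosen for this sketch */

/* Hypothetical semantic value for the `args` nonterminal. */
typedef struct {
    const char *words[MAX_ARGS];
    int count;
} ArgList;

/* What the action for `args: args WORD` might call; returns 0 when full. */
int args_append(ArgList *args, const char *word) {
    assert(args != NULL && word != NULL);
    if (args->count >= MAX_ARGS)
        return 0;
    args->words[args->count++] = word;
    return 1;
}

/* What the action for `command: CD args sep` might call.
   Writes a message into buf; returns 1 if the command is valid, 0 if not. */
int check_cd(const ArgList *args, char *buf, size_t size) {
    assert(args != NULL && buf != NULL);
    if (args->count == 0) {
        snprintf(buf, size, "Changing to home directory...");
        return 1;
    }
    if (args->count == 1) {
        snprintf(buf, size, "Changing to directory %s", args->words[0]);
        return 1;
    }
    snprintf(buf, size, "cd: too many arguments (%d)", args->count);
    return 0;
}
```

Because the check runs only when the whole CD args sep rule is reduced, the error is reported for the complete line rather than for a stray WORD.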
Instead of calling actions directly, just build yourself an Abstract Syntax Tree first. Then, depending on the result and your preference, you either execute part of it or nothing. If there is a parsing error during tree building, you may want to use the %destructor directive to tell bison how to do the cleanup.
That actually is a proper way of doing it as you get full control over the contents and logic and you let bison just take care of parsing.
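A rough C sketch of the idea, under some assumptions: Command, command_new, command_free and run_if_valid are hypothetical helpers, and the <cmd> type tag in the comment is a made-up %union member. The only Bison facts relied on are that yyparse returns 0 on success and that %destructor takes a code block using $$.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical AST node for one parsed command. */
typedef struct Command {
    char *name;   /* e.g. "cd" */
    char *arg;    /* target directory, or NULL for the home directory */
} Command;

/* Copy a string onto the heap; returns NULL for NULL input or on failure. */
static char *copy_string(const char *s) {
    if (s == NULL)
        return NULL;
    char *copy = malloc(strlen(s) + 1);
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

/* What an action such as `CD WORD { $$ = command_new("cd", $2); }` might call. */
Command *command_new(const char *name, const char *arg) {
    assert(name != NULL);
    Command *cmd = malloc(sizeof *cmd);
    if (cmd == NULL)
        return NULL;
    cmd->name = copy_string(name);
    cmd->arg = copy_string(arg);
    return cmd;
}

/* Cleanup that a `%destructor { command_free($$); } <cmd>` line could invoke. */
void command_free(Command *cmd) {
    if (cmd == NULL)
        return;
    free(cmd->name);
    free(cmd->arg);
    free(cmd);
}

/* Run the tree only if parsing succeeded (yyparse returns 0 on success).
   "Running" here just writes a message into buf. Returns 1 if it ran. */
int run_if_valid(Command *cmd, int parse_status, char *buf, size_t size) {
    int ran = 0;
    if (cmd != NULL && parse_status == 0) {
        if (cmd->arg != NULL)
            snprintf(buf, size, "Changing to directory %s", cmd->arg);
        else
            snprintf(buf, size, "Changing to home directory...");
        ran = 1;
    }
    command_free(cmd);   /* the tree is released in both cases */
    return ran;
}
```

The actions only allocate nodes, so nothing visible happens until the caller has seen the result of the whole parse.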
Usually, things aren't done the way you describe.
With Bison/Yacc/Lex, one usually carefully designs the syntax to do exactly what is needed. Lex matches its regular expressions greedily, and a Bison/Yacc parser reduces a rule as soon as the lookahead token allows it, so the grammar itself has to say where a command ends.
So, how about this instead.
Since you are parsing whole lines at a time, I think we can use this fact to our advantage and revise the syntax.
input : /* empty */
| line
command-break : command-break semi-colon
| semi-colon
line : commands new-line
commands : commands command-break command
| commands command-break command command-break
| command
| command command-break
...
Where new-line and semi-colon are defined in your lex source as something like \n, \t. This should give you the UNIX-style syntax for commands that you are looking for. All sorts of things are possible, and it is a little bloated allowing for multiple semicolons and doesn't take white-space into consideration, but you should get the idea.
Lex and Yacc are powerful tools, and I find them quite enjoyable - at least, when you aren't on a deadline.
Couldn't you just change your rule match actions to append to a list of actions you want to perform if the whole thing works? Then after the entire input has been processed you decide if you want to do what was in that list of actions based on if you saw any parse errors.
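Something like this minimal C sketch, assuming the rule actions call a helper instead of printf. defer_action, finish_input and the fixed-size buffers are made up for illustration; the error count could come from a counter you bump in yyerror.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_PENDING 16   /* arbitrary limits for this sketch */
#define MSG_LEN 128

static char pending[MAX_PENDING][MSG_LEN];
static int pending_count = 0;

/* Called from a rule action in place of printf; returns 0 if the list is full. */
int defer_action(const char *message) {
    assert(message != NULL);
    if (pending_count >= MAX_PENDING)
        return 0;
    snprintf(pending[pending_count++], MSG_LEN, "%s", message);
    return 1;
}

/* Called once the whole input has been read. Performs the queued actions only
   if no parse error was seen, then empties the list.
   Returns the number of actions performed. */
int finish_input(int error_count, FILE *out) {
    assert(out != NULL);
    int performed = 0;
    if (error_count == 0) {
        for (int i = 0; i < pending_count; i++) {
            fprintf(out, "%s\n", pending[i]);
            performed++;
        }
    }
    pending_count = 0;
    return performed;
}
```

With "cd mydir mydir", the CD WORD action would queue its message, the stray WORD would raise the error count, and finish_input would then throw the queued message away.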