Left Recursion in Grammar Results in Conflicts
Throughout a Bison grammar I am using right recursion, and I have read that left recursion is better because it doesn't have to build the whole stack first.
However, when I try to switch to left recursion on any of them, I always end up with a lot of conflicts, and I am not seeing why.
Can anyone show me a generic example of where using left recursion instead of right causes a conflict (when the right recursion doesn't cause a conflict). Then what needs to be done when switching to left to correct such a conflict. I think a fundamental exampl开发者_如何学Goe will help me more than just a correction of my own grammar.
Edit:
But I guess I should include a particular example anyways, since my understand is a little less than complete :-) Changing 'list separator command' to 'command separator list' resolves the conflict.State 9 conflicts: 3 shift/reduce
Grammar
0 $accept: input $end
1 input: error NEWLINE
2 | input NEWLINE
3 | input list NEWLINE
4 | /* empty */
5 list: command
6 | command separator
7 | list separator command
8 separator: SEMI
9 | L_OR
10 | L_AND
11 command: variable_assignment
12 | external_w_redir
13 | external_w_redir AMP
14 | pipeline
15 | pipeline AMP
...
state 9
5 list: command .
6 | command . separator
SEMI shift, and go to state 18
L_AND shift, and go to state 19
L_OR shift, and go to state 20
SEMI [reduce using rule 5 (list)]
L_AND [reduce using rule 5 (list)]
L_OR [reduce using rule 5 (list)]
$default reduce using rule 5 (list)
separator go to state 22
EDIT:
I have to retract my original answer. Your left-recursive grammar does not seem to be ambiguous, as I first thought. I think I got confused by the extra rule which is used to make a final separator optional.
Here is a simplified version of your original, right-recursive, grammar:
list: COMMAND
| COMMAND SEPARATOR
| COMMAND SEPARATOR list
;
This grammar matches (if I'm not even more confused than I think, which is of course a possibility) the inputs C, CS, CSC, CSCS, CSCSC, CSCSCS, etc. That is, a sequence of SEPARATOR-separated COMMANDs, with an optional SEPARATOR at the end.
And this is the simplified version of your left-recursive grammar, which gives a shift/reduce conflict in Bison:
list: COMMAND
| COMMAND SEPARATOR
| list SEPARATOR COMMAND
;
If I expanded things correctly, this grammar matches the inputs C, CS, CSC, CSSC, CSCSC, CSSCSC, etc. It is not ambiguous, but it is not equivalent to your left-recursive grammar. The list of COMMANDs can not have a SEPARATOR at the end, and the SEPARATORs between the COMMANDS can be doubled.
I think the reason for the shift/reduce conflict is that Bison only looks one token ahead when it decides if to shift or reduce, and with the double separators that are introduced in the second grammar, it can sometimes get confused.
If it is important that the final separator is optional, and that the grammar must be left-recursive, I would suggest re-writing it:
list: separatedlist
| separatedlist SEPARATOR
;
separatedlist: COMMAND
| separatedlist SEPARATOR COMMAND
;
But I wouldn't worry about left or right, unless your lists are really long, or you intend to run your parser on very limited hardware. I don't think it matters much, on modern desktop computers.
The confusion/errors seem to come from the productions for list. Your left recursive version is:
list: command
| command separator
| list separator command
Your right recursive version is:
list: command
| command separator
| command separator list
These two rule sets are actually quite different, matching different things. The first one matches one or more commands with separators between them, and an optional extra separator after each command. The second is similar, except it only allows the extra optional separator after the LAST command.
Assuming what you really want is a left-recursive version of the latter, what you want is
list_no_trailer: command
| list_no_trailer separator command
list: list_no_trailer
| list_no_trailer separator
The conflict in your version come from the fact that after parsing a 'command' and seeing a separator on the lookahed, it doesn;t know whether to pull that in as the optional separator after the command, or to reduce the command under the assumption that its a list separator and there will be another command right after it. It would need 2 characters of lookahead for that, so your grammar is LR(2), but not LR(1)
精彩评论