开发者

Grammar ambiguity: why? (problem is: "(a)" vs "(a-z)")

So I am trying to implement a pretty simple grammar for one-line statements:

# Grammar

   c         : Character c        [a-z0-9-]
  (v)        : Vowel              (= [a,e,u,i,o])
  (c)        : Consonant
  (?)        : Any character (incl. number)
  (l)        : Any alpha char     (= [a-z])
  (n)        : Any integer        (= [0-9])
  (c1-c2)    : Range from char c1 to char c2
  (c1,c2,c3) : List including chars c1, c2 and c3

  Examples:
  h(v)(c)no(l)(l)jj-k(n)
  h(v)(c)no(l)(l)(a)(a)(n)
  h(e-g)allo
  h(e,f,g)allo
  h(x,y,z)uul
  h(x,y,z)(x,y,z)(x,y,z)(x,y,z)uul

I am using the Happy parser generator (http://www.haskell.org/happy/) but for some reason there seems to be some ambiguity problem.

The error message is: "shift/reduce conflicts: 1"

I think the ambiguity is with these two lines:

  | lBracket char rBracket              { (\c -> case c of
                                                 'v' -> TVowel
                                                 'c' -> TConsonant
                                                 'l' -> TLetter
                                                 'n' -> TNumber) $2 }
  | lBracket char hyphen char rBracket  { TRange $2 $4              }

An example case is: "(a)" vs "(a-z)"

The lexer would give the following for the two cases:

(a)   : [CLBracket, CChar 'a', CRBracket]
(a-z) : [CLBracket, CChar 'a', CHyphen, CChar 'z', CRBracket]

What I don't understand is how this can be ambiguous with an LL[2] parser.

In case it helps here is the entire Happy grammar definition:

{

module XHappyParser where

import Data.Char
import Prelude   hiding (lex)
import XLexer
import XString

}

%name parse
%tokentype { Character  }
%error     { parseError }

%token
    lBracket                  { CLBracket   }
    rBracket                  { CRBracket   }
    hyphen                    { CHyphen     }
    question                  { CQuestion   }
    comma                     { CComma      }
    char                      { CChar $$    }

%%

xstring : tokens                            { XString (reverse $1)       }
开发者_开发百科
tokens : token                              { [$1]                       }
       | tokens token                       { $2 : $1                    }

token : char                                { TLiteral $1                }
      | hyphen                              { TLiteral '-'               }
      | lBracket char rBracket              { (\c -> case c of
                                                     'v' -> TVowel
                                                     'c' -> TConsonant
                                                     'l' -> TLetter
                                                     'n' -> TNumber) $2 }
      | lBracket question rBracket          { TAny                      }
      | lBracket char hyphen char rBracket  { TRange $2 $4              }
      | lBracket listitems rBracket         { TList $2                  }

listitems : char                            { [$1]                      }
          | listitems comma char            { $1 ++ [$3]                }

{

parseError :: [Character] -> a
parseError _ = error "parse error"

}

Thank you!


Here's the ambiguity:

token : [...]
      | lBracket char rBracket
      | [...] 
      | lBracket listitems rBracket

listitems : char
          | [...]

Your parser could accept (v) as both TString [TVowel] and TString [TList ['v']], not to mention the missing characters in that case expression.

One possible way of solving it would be to modify your grammar so lists are at least two items, or have some different notation for vowels, consonants, etc.


The problem seems to be:

| lBracket char rBracket
...
| lBracket listitems rBracket

or in cleaner syntax:

(c)

Can be a TVowel, TConsonant, TLetter, TNumber (as you know) or a singleton TList.

As the happy manual says, shift reduce usually isn't an issue. You can us precedence to force behavior/remove the warning if you'd like.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜